hh.se Publications
Publications (10 of 48)
Pirasteh, P., Nowaczyk, S., Pashami, S., Löwenadler, M., Thunberg, K., Ydreskog, H. & Berck, P. (2019). Interactive feature extraction for diagnostic trouble codes in predictive maintenance: A case study from automotive domain. In: Proceedings of the Workshop on Interactive Data Mining. Paper presented at WSDM 2019: The 12th ACM International Conference on Web Search and Data Mining, Melbourne, VIC, Australia, 11-15 February, 2019. New York, NY: Association for Computing Machinery (ACM), Article ID 4.
2019 (English). In: Proceedings of the Workshop on Interactive Data Mining, New York, NY: Association for Computing Machinery (ACM), 2019, article id 4. Conference paper, Published paper (Refereed)
Abstract [en]

Predicting future maintenance needs of equipment can be addressed in a variety of ways. Methods based on machine learning approaches provide an interesting platform for mining large data sets to find patterns that might correlate with a given fault. In this paper, we approach predictive maintenance as a classification problem and use Random Forest to separate data readouts within a particular time window into those corresponding to faulty and non-faulty component categories. We utilize diagnostic trouble codes (DTCs) as an example of event-based data, and propose four categories of features that can be derived from DTCs as a predictive maintenance framework. We test the approach using large-scale data from a fleet of heavy duty trucks, and show that DTCs can be used within our framework as indicators of imminent failures in different components.

Place, publisher, year, edition, pages
New York, NY: Association for Computing Machinery (ACM), 2019
Keywords
Predictive maintenance, failure detection, diagnostic trouble codes, feature extraction
National Category
Signal Processing
Identifiers
urn:nbn:se:hh:diva-40184 (URN); 10.1145/3304079.3310288 (DOI); 978-1-4503-6296-2 (ISBN)
Conference
WSDM 2019: The 12th ACM International Conference on Web Search and Data Mining, Melbourne, VIC, Australia, 11-15 February, 2019
Available from: 2019-07-07. Created: 2019-07-07. Last updated: 2019-08-02. Bibliographically approved.
Ashfaq, A. & Nowaczyk, S. (2019). Machine learning in healthcare - a system’s perspective. In: B. Aditya Prakash, Anil Vullikanti, Shweta Bansal, Adam Sadelik (Ed.), Proceedings of the ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK). Paper presented at 25th ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK '19), Anchorage, Alaska, United States, August 5, 2019 (pp. 14-17). Arlington
2019 (English). In: Proceedings of the ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK) / [ed] B. Aditya Prakash, Anil Vullikanti, Shweta Bansal, Adam Sadelik, Arlington, 2019, p. 14-17. Conference paper, Published paper (Refereed)
Abstract [en]

A consequence of the fragmented and siloed healthcare landscape is that patient care (and data) is split across a multitude of different facilities and computer systems, and enabling interoperability between these systems is hard. The lack of interoperability not only hinders continuity of care and burdens providers, but also hinders effective application of Machine Learning (ML) algorithms. Thus, most current ML algorithms, designed to understand patient care and facilitate clinical decision-support, are trained on limited datasets. This approach is analogous to the Newtonian paradigm of Reductionism in which a system is broken down into elementary components and a description of the whole is formed by understanding those components individually. A key limitation of the reductionist approach is that it ignores the component-component interactions and dynamics within the system which are often of prime significance in understanding the overall behaviour of complex adaptive systems (CAS). Healthcare is a CAS.

Though the application of ML to health data has shown incremental improvements for clinical decision support, ML has a much broader potential to restructure care delivery as a whole and maximize care value. However, this potential remains largely untapped, primarily due to functional limitations of Electronic Health Records (EHR) and the inability to see the healthcare system as a whole. This viewpoint (i) articulates healthcare as a complex system which has a biological and an organizational perspective, (ii) motivates, with examples, the need for a system's approach when addressing healthcare challenges via ML, and (iii) emphasizes the need to unleash EHR functionality, while duly respecting all ethical and legal concerns, to reap the full benefits of ML.

Place, publisher, year, edition, pages
Arlington, 2019
Keywords
Machine learning, Healthcare complexity, System's thinking, Electronic health records
National Category
Other Medical Engineering
Identifiers
urn:nbn:se:hh:diva-40395 (URN)
Conference
25th ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK '19), Anchorage, Alaska, United States, August 5, 2019
Available from: 2019-08-14. Created: 2019-08-14. Last updated: 2019-08-14. Bibliographically approved.
Ashfaq, A., Pinheiro Sant'Anna, A., Lingman, M. & Nowaczyk, S. (2019). Readmission prediction using deep learning on electronic health records. Journal of Biomedical Informatics, 97, Article ID 103256.
2019 (English). In: Journal of Biomedical Informatics, ISSN 1532-0464, E-ISSN 1532-0480, Vol. 97, article id 103256. Article in journal (Refereed) Published
Abstract [en]

Unscheduled 30-day readmissions are a hallmark of Congestive Heart Failure (CHF) patients that pose significant health risks and escalate care cost. In order to reduce readmissions and curb the cost of care, it is important to initiate targeted intervention programs for patients at risk of readmission. This requires identifying high-risk patients at the time of discharge from hospital. Here, using real data from over 7,500 CHF patients hospitalized between 2012 and 2016 in Sweden, we built and tested a deep learning framework to predict 30-day unscheduled readmission. We present a cost-sensitive formulation of Long Short-Term Memory (LSTM) neural network using expert features and contextual embedding of clinical concepts. This study targets key elements of an Electronic Health Record (EHR) driven prediction model in a single framework: using both expert and machine derived features, incorporating sequential patterns and addressing the class imbalance problem. We show that the model with all key elements achieves a higher discrimination ability (AUC 0.77) compared to the rest. Additionally, we present a simple financial analysis to estimate annual savings if targeted interventions are offered to high risk patients. © 2019 The Authors
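
As an illustration of the cost-sensitive idea mentioned in the abstract above, the following minimal Python sketch weights the two classes differently in a binary cross-entropy loss, so that errors on the rarer class cost more. It is not the authors' formulation: the function name `weighted_bce` and the weights `w_pos` and `w_neg` are assumptions chosen for illustration.

```python
import math

def weighted_bce(y_true, p_pred, w_pos=1.0, w_neg=1.0, eps=1e-12):
    """Mean binary cross-entropy with a separate weight per class (hypothetical helper)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip probabilities to avoid log(0)
        if y == 1:
            total += -w_pos * math.log(p)        # penalty for a missed positive
        else:
            total += -w_neg * math.log(1.0 - p)  # penalty for a false alarm
    return total / len(y_true)
```

Raising `w_pos` above `w_neg` makes a missed readmission more expensive than a false alarm, which is one simple way to address class imbalance.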

Place, publisher, year, edition, pages
Maryland Heights, MO: Academic Press, 2019
Keywords
Electronic health records, Readmission prediction, Long short-term memory networks, Contextual embeddings
National Category
Health Care Service and Management, Health Policy and Services and Health Economy
Identifiers
urn:nbn:se:hh:diva-39297 (URN); 10.1016/j.jbi.2019.103256 (DOI); 31351136 (PubMedID); 2-s2.0-85069858722 (Scopus ID)
Projects
HiCube - behovsmotiverad hälsoinnovation
Funder
European Regional Development Fund (ERDF)
Note

Funding: The authors thank the European Regional Development Fund (ERDF), Health Technology Center and CAISR at Halmstad University and Hallands Hospital for financing the research work under the project HiCube - behovsmotiverad hälsoinnovation.

Available from: 2019-04-30. Created: 2019-04-30. Last updated: 2019-09-10. Bibliographically approved.
Mashad Nemati, H., Pinheiro Sant'Anna, A., Nowaczyk, S., Jürgensen, J. H. & Hilber, P. (2019). Reliability Evaluation of Power Cables Considering the Restoration Characteristic. International Journal of Electrical Power & Energy Systems, 105, 622-631
2019 (English). In: International Journal of Electrical Power & Energy Systems, ISSN 0142-0615, E-ISSN 1879-3517, Vol. 105, p. 622-631. Article in journal (Refereed) Published
Abstract [en]

In this paper, a Weibull parametric proportional hazard model (PHM) is used to estimate the failure rate of every individual cable based on its age and a set of explanatory factors. The required information for the proposed method is obtained by exploiting available historical cable inventory and failure data. This data-driven method does not require any additional measurements on the cables, and allows the cables to be ranked for maintenance prioritization and repair actions.

Furthermore, the results of reliability analysis of power cables are compared when the cables are considered as repairable or non-repairable components. The paper demonstrates that methods that estimate the time to the first failure (for non-repairable components) lead to incorrect conclusions about the reliability of repairable power cables.

The proposed method is used to evaluate the failure rate of each individual Paper Insulated Lead Cover (PILC) underground cable in a distribution grid in the south of Sweden. © 2018 Elsevier Ltd
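
For reference, a common textbook form of a Weibull proportional hazard model is sketched below. The symbols are assumptions, not taken from the paper: $t$ is the cable age, $\beta$ and $\eta$ are the Weibull shape and scale parameters, $\mathbf{x}$ is the vector of explanatory factors, and $\boldsymbol{\gamma}$ holds their coefficients. The paper's own parameterization may differ.

```latex
h(t \mid \mathbf{x}) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\left(\boldsymbol{\gamma}^{\top}\mathbf{x}\right)
```

In this form the Weibull term captures how the baseline failure rate changes with age, and the exponential term scales it up or down for each individual cable.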

Place, publisher, year, edition, pages
London: Elsevier, 2019
Keywords
Power cable, historical data, reliability, proportional hazard model, preventive maintenance
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hh:diva-35470 (URN); 10.1016/j.ijepes.2018.08.047 (DOI); 000449447200055 (); 2-s2.0-85053080255 (Scopus ID)
Available from: 2017-11-24. Created: 2017-11-24. Last updated: 2019-03-19. Bibliographically approved.
Bouguelia, M.-R., Nowaczyk, S., Santosh, K. C. & Verikas, A. (2018). Agreeing to disagree: active learning with noisy labels without crowdsourcing. International Journal of Machine Learning and Cybernetics, 9(8), 1307-1319
2018 (English). In: International Journal of Machine Learning and Cybernetics, ISSN 1868-8071, E-ISSN 1868-808X, Vol. 9, no 8, p. 1307-1319. Article in journal (Refereed) Published
Abstract [en]

We propose a new active learning method for classification, which handles label noise without relying on multiple oracles (i.e., crowdsourcing). We propose a strategy that selects (for labeling) instances with a high influence on the learned model. An instance x is said to have a high influence on the model h, if training h on x (with label y = h(x)) would result in a model that greatly disagrees with h on labeling other instances. Then, we propose another strategy that selects (for labeling) instances that are highly influenced by changes in the learned model. An instance x is said to be highly influenced, if training h with a set of instances would result in a committee of models that agree on a common label for x but disagree with h(x). We compare the two strategies and we show, on different publicly available datasets, that selecting instances according to the first strategy while eliminating noisy labels according to the second strategy, greatly improves the accuracy compared to several benchmarking methods, even when a significant amount of instances are mislabeled. © Springer-Verlag Berlin Heidelberg 2017
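
The first selection strategy in the abstract above (an instance x has high influence if training h on x with label y = h(x) yields a model that disagrees with h on other instances) can be sketched roughly as follows. This is a toy illustration only: the 1-nearest-neighbour classifier on one-dimensional data, the function names `predict_1nn` and `influence`, and the example data are all assumptions, not the authors' implementation.

```python
def predict_1nn(train, x):
    """Label of the training point nearest to x; train is a list of (value, label)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def influence(train, pool, x):
    """Fraction of the other pool instances whose label changes after adding (x, h(x))."""
    others = [z for z in pool if z != x]
    before = [predict_1nn(train, z) for z in others]
    retrained = train + [(x, predict_1nn(train, x))]  # self-label x with y = h(x)
    after = [predict_1nn(retrained, z) for z in others]
    return sum(b != a for b, a in zip(before, after)) / len(others)

# Hypothetical labelled set and unlabelled pool
train = [(0.0, "A"), (10.0, "B")]
pool = [1.0, 4.0, 6.0, 9.0]
scores = {x: influence(train, pool, x) for x in pool}
```

In this toy setting the instances near the decision boundary (4.0 and 6.0) receive the highest influence scores, so they would be queried for labelling first.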

Place, publisher, year, edition, pages
Heidelberg: Springer, 2018
Keywords
Active learning, Classification, Label noise, Mislabeling, Interactive learning, Machine learning, Data mining
National Category
Signal Processing; Computer Systems; Computer Sciences
Identifiers
urn:nbn:se:hh:diva-33365 (URN); 10.1007/s13042-017-0645-0 (DOI)
Available from: 2017-02-27. Created: 2017-02-27. Last updated: 2018-07-23. Bibliographically approved.
Bouguelia, M.-R., Nowaczyk, S. & Payberah, A. H. (2018). An adaptive algorithm for anomaly and novelty detection in evolving data streams. Data mining and knowledge discovery, 32(6), 1597-1633
2018 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 32, no 6, p. 1597-1633. Article in journal (Refereed) Published
Abstract [en]

In the era of big data, considerable research focus is being put on designing efficient algorithms capable of learning and extracting high-level knowledge from ubiquitous data streams in an online fashion. While most existing algorithms assume that data samples are drawn from a stationary distribution, several complex environments deal with data streams that are subject to change over time. Taking this aspect into consideration is an important step towards building truly aware and intelligent systems. In this paper, we propose GNG-A, an adaptive method for incremental unsupervised learning from evolving data streams experiencing various types of change. The proposed method maintains a continuously updated network (graph) of neurons by extending the Growing Neural Gas algorithm with three complementary mechanisms, allowing it to closely track both gradual and sudden changes in the data distribution. First, an adaptation mechanism handles local changes where the distribution is only non-stationary in some regions of the feature space. Second, an adaptive forgetting mechanism identifies and removes neurons that become irrelevant due to the evolving nature of the stream. Finally, a probabilistic evolution mechanism creates new neurons when there is a need to represent data in new regions of the feature space. The proposed method is demonstrated for anomaly and novelty detection in non-stationary environments. Results show that the method handles different data distributions and efficiently reacts to various types of change. © 2018 The Author(s)

Place, publisher, year, edition, pages
New York: Springer, 2018
Keywords
Data stream, Growing neural gas, Change detection, Non-stationary environments, Anomaly and novelty detection
National Category
Signal Processing
Identifiers
urn:nbn:se:hh:diva-36752 (URN); 10.1007/s10618-018-0571-0 (DOI); 2-s2.0-85046792304 (Scopus ID)
Projects
BIDAF
Available from: 2018-05-13. Created: 2018-05-13. Last updated: 2018-09-20. Bibliographically approved.
Pashami, S., Holst, A., Bae, J. & Nowaczyk, S. (2018). Causal discovery using clusters from observational data. Paper presented at FAIM'18 Workshop on CausalML, Stockholm, Sweden, July 15, 2018.
2018 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Many methods have been proposed over the years for distinguishing causes from effects using observational data only, and new ones are continuously being developed – deducing causal relationships is difficult enough that we do not hope to ever get the perfect one. Instead, we progress by creating powerful heuristics, capable of capturing more and more of the hints that are present in real data.

One type of such hints, surprisingly rarely addressed explicitly by existing methods, is inhomogeneities in the data. Clusters are a very typical occurrence that should be taken into account, and exploited, in the process of identifying causes and effects. In this paper, we discuss the potential benefits, and explore the hints that clusters in the data can provide for causal discovery. We propose a new method, and show, using both artificial and real data, that accounting for clusters in the data leads to more accurate learning of causal structures.

National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:hh:diva-39216 (URN)
Conference
FAIM'18 Workshop on CausalML, Stockholm, Sweden, July 15, 2018
Available from: 2019-04-09. Created: 2019-04-09. Last updated: 2019-04-11. Bibliographically approved.
Vaiciukynas, E., Uličný, M., Pashami, S. & Nowaczyk, S. (2018). Learning Low-Dimensional Representation of Bivariate Histogram Data. IEEE transactions on intelligent transportation systems (Print), 19(11), 3723-3735
2018 (English). In: IEEE transactions on intelligent transportation systems (Print), ISSN 1524-9050, E-ISSN 1558-0016, Vol. 19, no 11, p. 3723-3735. Article in journal (Refereed) Published
Abstract [en]

With an increasing amount of data in intelligent transportation systems, methods are needed to automatically extract general representations that accurately predict not only known tasks but also similar tasks that can emerge in the future. Creation of low-dimensional representations can be unsupervised or can exploit various labels in multi-task learning (when goal tasks are known) or transfer learning (when they are not) settings. Finding a general, low-dimensional representation suitable for multiple tasks is an important step toward knowledge discovery in aware intelligent transportation systems. This paper evaluates several approaches mapping high-dimensional sensor data from Volvo trucks into a low-dimensional representation that is useful for prediction. Original data are bivariate histograms, with two types (turbocharger and engine) considered. Low-dimensional representations were evaluated in a supervised fashion by mean equal error rate (EER) using a random forest classifier on a set of 27 1-vs-Rest detection tasks. Results from unsupervised learning experiments indicate that using an autoencoder to create an intermediate representation, followed by t-distributed stochastic neighbor embedding, is the most effective way to create a low-dimensional representation of the original bivariate histogram. Individually, t-distributed stochastic neighbor embedding offered the best results for 2-D or 3-D and the classical autoencoder for 6-D or 10-D representations. Using multi-task learning, combining unsupervised and supervised objectives on all 27 available tasks, resulted in 10-D representations with a significantly lower EER compared to the original 400-D data. In the transfer learning setting, with topmost diverse tasks used for representation learning, 10-D representations achieved EER comparable to the original representation.
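
The equal error rate (EER) used as the evaluation metric in the abstract above is the error rate at the decision threshold where the false positive rate equals the false negative rate. A minimal sketch of how it could be approximated for one detection task is given below. The function name `equal_error_rate`, the threshold sweep over observed scores, and the example data are assumptions for illustration, not the authors' evaluation code.

```python
def equal_error_rate(scores, labels):
    """Approximate EER: error rate at the threshold where FPR and FNR are closest."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    best_gap, best_eer = None, None
    for thr in sorted(set(scores)):
        # An instance is predicted positive when its score is at least thr
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        fpr, fnr = fp / n_neg, fn / n_pos
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2
    return best_eer

# Hypothetical classifier scores and ground-truth labels for one task
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
eer = equal_error_rate(scores, labels)
```

A mean EER over several 1-vs-Rest tasks would then be the average of this value across the tasks.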

Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2018
Keywords
Task analysis, Histograms, Engines, Intelligent transportation systems, Maintenance engineering, Machine learning, Feature extraction
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:hh:diva-38252 (URN); 10.1109/TITS.2018.2865103 (DOI); 2-s2.0-85053294183 (Scopus ID)
Available from: 2018-11-04. Created: 2018-11-04. Last updated: 2018-11-20. Bibliographically approved.
Bouguelia, M.-R., Karlsson, A., Pashami, S., Nowaczyk, S. & Holst, A. (2018). Mode tracking using multiple data streams. Information Fusion, 43, 33-46
2018 (English). In: Information Fusion, ISSN 1566-2535, E-ISSN 1872-6305, Vol. 43, p. 33-46. Article in journal (Refereed) Published
Abstract [en]

Most existing work in information fusion focuses on combining information with well-defined meaning towards a concrete, pre-specified goal. In contradistinction, we instead aim for autonomous discovery of high-level knowledge from ubiquitous data streams. This paper introduces a method for recognition and tracking of hidden conceptual modes, which are essential to fully understand the operation of complex environments. We consider a scenario of analyzing usage of a fleet of city buses, where the objective is to automatically discover and track modes such as highway route, heavy traffic, or aggressive driver, based on available on-board signals. The method we propose is based on aggregating the data over time, since the high-level modes are only apparent in the longer perspective. We search through different features and subsets of the data, and identify those that lead to good clusterings, interpreting those clusters as initial, rough models of the prospective modes. We utilize Bayesian tracking in order to continuously improve the parameters of those models, based on the new data, while at the same time following how the modes evolve over time. Experiments with artificial data of varying degrees of complexity, as well as on real-world datasets, prove the effectiveness of the proposed method in accurately discovering the modes and in identifying which one best explains the current observations from multiple data streams. © 2017 Elsevier B.V. All rights reserved.

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2018
Keywords
Mode tracking, Clustering, Data streams, Time series, Knowledge discovery
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-35729 (URN); 10.1016/j.inffus.2017.11.011 (DOI); 2-s2.0-85037072003 (Scopus ID)
Projects
BIDAF
Available from: 2017-12-01. Created: 2017-12-01. Last updated: 2019-04-12. Bibliographically approved.
Nowaczyk, S., Pinheiro Sant'Anna, A., Calikus, E. & Fan, Y. (2018). Monitoring equipment operation through model and event discovery. In: Hujun Yin, David Camacho, Paulo Novais & Antonio J. Tallón-Ballesteros (Ed.), Intelligent Data Engineering and Automated Learning – IDEAL 2018: 19th International Conference, Madrid, Spain, November 21–23, 2018, Proceedings, Part II. Paper presented at Intelligent Data Engineering and Automated Learning – IDEAL 2018, 19th International Conference, Madrid, Spain, November 21–23, 2018 (pp. 41-53). Cham: Springer, 11315
2018 (English). In: Intelligent Data Engineering and Automated Learning – IDEAL 2018: 19th International Conference, Madrid, Spain, November 21–23, 2018, Proceedings, Part II / [ed] Hujun Yin, David Camacho, Paulo Novais & Antonio J. Tallón-Ballesteros, Cham: Springer, 2018, Vol. 11315, p. 41-53. Conference paper, Published paper (Refereed)
Abstract [en]

Monitoring the operation of complex systems in real-time is becoming both required and enabled by current IoT solutions. Predicting faults and optimising productivity requires autonomous methods that work without extensive human supervision. One way to automatically detect deviating operation is to identify groups of peers, or similar systems, and evaluate how well each individual conforms with the group. We propose a monitoring approach that can construct knowledge more autonomously and relies on human experts to a lesser degree: without requiring the designer to think of all possible faults beforehand; able to do the best possible with signals that are already available, without the need for dedicated new sensors; scaling up to “one more system and component” and multiple variants; and finally, one that will adapt to changes over time and remain relevant throughout the lifetime of the system. © Springer Nature Switzerland AG 2018.

Place, publisher, year, edition, pages
Cham: Springer, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11315
Keywords
Artificial intelligence, Computer science, Computers, Event discoveries, Human expert, Human supervision, Monitoring approach, Monitoring equipment, Multiple variants, Real time, Scaling-up, Real time systems
National Category
Embedded Systems
Identifiers
urn:nbn:se:hh:diva-38732 (URN); 10.1007/978-3-030-03496-2_6 (DOI); 2-s2.0-85057087564 (Scopus ID); 9783030034955 (ISBN); 978-3-030-03496-2 (ISBN)
Conference
Intelligent Data Engineering and Automated Learning – IDEAL 2018, 19th International Conference, Madrid, Spain, November 21–23, 2018
Available from: 2019-01-08. Created: 2019-01-08. Last updated: 2019-01-08. Bibliographically approved.
Projects
iMedA: Improving MEDication Adherence through Person Centered Care and Adaptive Interventions [2017-04617_Vinnova]; Halmstad University
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-7796-5201