Publications (10 of 49)
Calikus, E., Nowaczyk, S., Pinheiro Sant'Anna, A., Gadd, H. & Werner, S. (2019). A data-driven approach for discovering heat load patterns in district heating. Applied Energy, 252, Article ID 113409.
2019 (English). In: Applied Energy, ISSN 0306-2619, E-ISSN 1872-9118, Vol. 252, article id 113409. Article in journal (Refereed), Published
Abstract [en]

Understanding the heat usage of customers is crucial for effective district heating operations and management. Unfortunately, existing knowledge about customers and their heat load behaviors is quite scarce. Most previous studies are limited to small-scale analyses that are not representative enough to understand the behavior of the overall network. In this work, we propose a data-driven approach that enables large-scale automatic analysis of heat load patterns in district heating networks without requiring prior knowledge. Our method clusters the customer profiles into different groups, extracts their representative patterns, and detects unusual customers whose profiles deviate significantly from the rest of their group. Using our approach, we present the first large-scale, comprehensive analysis of heat load patterns by conducting a case study on many buildings in six different customer categories connected to two district heating networks in the south of Sweden. The 1222 buildings had a total floor space of 3.4 million square meters and used 1540 TJ of heat during 2016. The results show that the proposed method has high potential to be deployed and used in practice to analyze and understand customers’ heat-use habits. © 2019 Calikus et al. Published by Elsevier Ltd.
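The cluster-then-detect pipeline summarized in this abstract can be sketched roughly as follows. This is an illustrative sketch only, with assumed names, toy data, and an assumed deviation threshold, not the authors' code:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
profiles = rng.random((50, 24))                    # 50 buildings x 24 hourly loads (toy data)
profiles /= profiles.sum(axis=1, keepdims=True)    # normalize each profile to unit sum

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(profiles)
centroids = km.cluster_centers_                    # representative heat load patterns

# Flag buildings whose profile deviates strongly from their own group
dist = np.linalg.norm(profiles - centroids[km.labels_], axis=1)
threshold = dist.mean() + 2 * dist.std()
outliers = np.flatnonzero(dist > threshold)        # unusually behaving buildings
```

The number of clusters and the two-sigma cut-off are placeholders; the paper's method selects groupings automatically rather than fixing them in advance.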

Place, publisher, year, edition, pages
Oxford: Elsevier, 2019
Keywords
District heating, Energy efficiency, Heat load patterns, Clustering, Abnormal heat use
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hh:diva-40907 (URN) 10.1016/j.apenergy.2019.113409 (DOI) 2-s2.0-85066961984 (Scopus ID)
Funder
Knowledge Foundation, 20160103
Available from: 2019-11-12 Created: 2019-11-12 Last updated: 2019-11-13
Pirasteh, P., Nowaczyk, S., Pashami, S., Löwenadler, M., Thunberg, K., Ydreskog, H. & Berck, P. (2019). Interactive feature extraction for diagnostic trouble codes in predictive maintenance: A case study from automotive domain. In: Proceedings of the Workshop on Interactive Data Mining. Paper presented at WSDM 2019: The 12th ACM International Conference on Web Search and Data Mining, Melbourne, VIC, Australia, 11-15 February, 2019. New York, NY: Association for Computing Machinery (ACM), Article ID 4.
2019 (English). In: Proceedings of the Workshop on Interactive Data Mining, New York, NY: Association for Computing Machinery (ACM), 2019, article id 4. Conference paper, Published paper (Refereed)
Abstract [en]

Predicting future maintenance needs of equipment can be addressed in a variety of ways. Methods based on machine learning approaches provide an interesting platform for mining large data sets to find patterns that might correlate with a given fault. In this paper, we approach predictive maintenance as a classification problem and use Random Forest to separate data readouts within a particular time window into those corresponding to faulty and non-faulty component categories. We utilize diagnostic trouble codes (DTCs) as an example of event-based data, and propose four categories of features that can be derived from DTCs as a predictive maintenance framework. We test the approach using large-scale data from a fleet of heavy duty trucks, and show that DTCs can be used within our framework as indicators of imminent failures in different components.
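One of the simplest feature categories derivable from event-based DTC data, per-readout code counts fed to a Random Forest, can be sketched as below. The code values, labels, and shapes are toy assumptions for illustration, not the paper's data or implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy event data: each readout lists the DTCs observed in one time window
readouts = [["P0101", "P0420"], ["P0420"], [], ["P0101", "P0101"]]
labels = [1, 1, 0, 0]                              # 1 = component later failed

# Count-based feature vector: occurrences of each known code per readout
codes = sorted({c for r in readouts for c in r})
X = np.array([[r.count(c) for c in codes] for r in readouts])

# Separate faulty from non-faulty readouts with a Random Forest
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)
```

The paper proposes three further feature categories beyond raw counts; this sketch shows only the general shape of turning DTC events into a classification problem.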

Place, publisher, year, edition, pages
New York, NY: Association for Computing Machinery (ACM), 2019
Keywords
Predictive maintenance, failure detection, diagnostic trouble codes, feature extraction
National Category
Signal Processing
Identifiers
urn:nbn:se:hh:diva-40184 (URN) 10.1145/3304079.3310288 (DOI) 978-1-4503-6296-2 (ISBN)
Conference
WSDM 2019: The 12th ACM International Conference on Web Search and Data Mining, Melbourne, VIC, Australia, 11-15 February, 2019
Available from: 2019-07-07 Created: 2019-07-07 Last updated: 2019-08-02. Bibliographically approved
Ashfaq, A. & Nowaczyk, S. (2019). Machine learning in healthcare - a system’s perspective. In: B. Aditya Prakash, Anil Vullikanti, Shweta Bansal, Adam Sadelik (Eds.), Proceedings of the ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK). Paper presented at 25th ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK '19), Anchorage, Alaska, United States, August 5, 2019 (pp. 14-17). Arlington
2019 (English). In: Proceedings of the ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK) / [ed] B. Aditya Prakash, Anil Vullikanti, Shweta Bansal, Adam Sadelik, Arlington, 2019, p. 14-17. Conference paper, Published paper (Refereed)
Abstract [en]

A consequence of the fragmented and siloed healthcare landscape is that patient care (and data) is split across a multitude of different facilities and computer systems, and enabling interoperability between these systems is hard. The lack of interoperability not only hinders continuity of care and burdens providers, but also hinders effective application of Machine Learning (ML) algorithms. Thus, most current ML algorithms, designed to understand patient care and facilitate clinical decision support, are trained on limited datasets. This approach is analogous to the Newtonian paradigm of Reductionism, in which a system is broken down into elementary components and a description of the whole is formed by understanding those components individually. A key limitation of the reductionist approach is that it ignores the component-component interactions and dynamics within the system, which are often of prime significance in understanding the overall behaviour of complex adaptive systems (CAS). Healthcare is a CAS.

Though the application of ML to health data has shown incremental improvements for clinical decision support, ML has a much broader potential to restructure care delivery as a whole and maximize care value. However, this potential remains largely untapped, primarily due to functional limitations of Electronic Health Records (EHR) and the inability to see the healthcare system as a whole. This viewpoint (i) articulates healthcare as a complex system with both a biological and an organizational perspective, (ii) motivates, with examples, the need for a systems approach when addressing healthcare challenges via ML, and (iii) emphasizes the need to unleash EHR functionality - while duly respecting all ethical and legal concerns - to reap the full benefits of ML.

Place, publisher, year, edition, pages
Arlington, 2019
Keywords
Machine learning, Healthcare complexity, System's thinking, Electronic health records
National Category
Other Medical Engineering
Identifiers
urn:nbn:se:hh:diva-40395 (URN)
Conference
25th ACM SIGKDD Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK '19), Anchorage, Alaska, United States, August 5, 2019
Available from: 2019-08-14 Created: 2019-08-14 Last updated: 2019-08-14. Bibliographically approved
Ashfaq, A., Pinheiro Sant'Anna, A., Lingman, M. & Nowaczyk, S. (2019). Readmission prediction using deep learning on electronic health records. Journal of Biomedical Informatics, 97, Article ID 103256.
2019 (English). In: Journal of Biomedical Informatics, ISSN 1532-0464, E-ISSN 1532-0480, Vol. 97, article id 103256. Article in journal (Refereed), Published
Abstract [en]

Unscheduled 30-day readmissions are a hallmark of Congestive Heart Failure (CHF) patients that pose significant health risks and escalate care costs. In order to reduce readmissions and curb the cost of care, it is important to initiate targeted intervention programs for patients at risk of readmission. This requires identifying high-risk patients at the time of discharge from hospital. Here, using real data from over 7,500 CHF patients hospitalized between 2012 and 2016 in Sweden, we built and tested a deep learning framework to predict 30-day unscheduled readmission. We present a cost-sensitive formulation of a Long Short-Term Memory (LSTM) neural network using expert features and contextual embedding of clinical concepts. This study targets key elements of an Electronic Health Record (EHR) driven prediction model in a single framework: using both expert and machine-derived features, incorporating sequential patterns, and addressing the class imbalance problem. We show that the model with all key elements achieves a higher discrimination ability (AUC 0.77) compared to the rest. Additionally, we present a simple financial analysis to estimate annual savings if targeted interventions are offered to high-risk patients. © 2019 The Authors
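The class-imbalance element of the framework can be approximated by weighting the minority (readmitted) class in the training loss. The sketch below only illustrates that weighting step with scikit-learn's balanced class weights on toy labels; the paper's actual model is a cost-sensitive LSTM, which is not reproduced here:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels: 10% of discharges are followed by a 30-day readmission
y = np.array([0] * 90 + [1] * 10)

# "balanced" weight for class c is n_samples / (n_classes * count(c)),
# so the rare readmitted class receives the larger loss weight
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
```

Such per-class weights would typically be plugged into the loss function of whatever classifier is trained, so that misclassifying a readmission costs more than misclassifying a non-readmission.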

Place, publisher, year, edition, pages
Maryland Heights, MO: Academic Press, 2019
Keywords
Electronic health records, Readmission prediction, Long short-term memory networks, Contextual embeddings
National Category
Health Care Service and Management, Health Policy and Services and Health Economy
Identifiers
urn:nbn:se:hh:diva-39297 (URN) 10.1016/j.jbi.2019.103256 (DOI) 31351136 (PubMedID) 2-s2.0-85069858722 (Scopus ID)
Projects
HiCube - behovsmotiverad hälsoinnovation
Funder
European Regional Development Fund (ERDF)
Note

Funding: The authors thank the European Regional Development Fund (ERDF), Health Technology Center and CAISR at Halmstad University and Hallands Hospital for financing the research work under the project HiCube - behovsmotiverad hälsoinnovation.

Available from: 2019-04-30 Created: 2019-04-30 Last updated: 2019-09-10. Bibliographically approved
Mashad Nemati, H., Pinheiro Sant'Anna, A., Nowaczyk, S., Jürgensen, J. H. & Hilber, P. (2019). Reliability Evaluation of Power Cables Considering the Restoration Characteristic. International Journal of Electrical Power & Energy Systems, 105, 622-631
2019 (English). In: International Journal of Electrical Power & Energy Systems, ISSN 0142-0615, E-ISSN 1879-3517, Vol. 105, p. 622-631. Article in journal (Refereed), Published
Abstract [en]

In this paper, a Weibull parametric proportional hazard model (PHM) is used to estimate the failure rate of each individual cable based on its age and a set of explanatory factors. The required information for the proposed method is obtained by exploiting available historical cable inventory and failure data. This data-driven method does not require any additional measurements on the cables, and it allows the cables to be ranked for maintenance prioritization and repair actions.

Furthermore, the results of the reliability analysis of power cables are compared when the cables are treated as repairable or non-repairable components. The paper demonstrates that methods which estimate the time to first failure (for non-repairable components) lead to incorrect conclusions about the reliability of repairable power cables.

The proposed method is used to evaluate the failure rate of each individual Paper Insulated Lead Cover (PILC) underground cable in a distribution grid in the south of Sweden. © 2018 Elsevier Ltd
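The core of a Weibull proportional hazard model is a baseline Weibull hazard scaled by exponentiated covariate effects, h(t|x) = (k/λ)(t/λ)^(k-1)·exp(β·x). The sketch below is a minimal numerical illustration with made-up parameter values, not estimates from the paper's cable data:

```python
import numpy as np

def weibull_phm_hazard(t, shape, scale, beta, x):
    """Weibull PHM hazard: baseline (shape/scale)*(t/scale)**(shape-1)
    multiplied by exp(beta . x) for explanatory factors x."""
    baseline = (shape / scale) * (t / scale) ** (shape - 1)
    return baseline * np.exp(np.dot(beta, x))

# Rank two same-age cables by estimated failure rate (illustrative values)
beta = np.array([0.8, -0.3])                       # covariate effects
h_a = weibull_phm_hazard(30.0, 2.0, 40.0, beta, np.array([1.0, 0.0]))
h_b = weibull_phm_hazard(30.0, 2.0, 40.0, beta, np.array([0.0, 1.0]))
```

Ranking cables by this hazard value at their current age is the kind of maintenance prioritization the abstract describes; fitting shape, scale, and beta from inventory and failure records is the estimation step not shown here.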

Place, publisher, year, edition, pages
London: Elsevier, 2019
Keywords
Power cable, historical data, reliability, proportional hazard model, preventive maintenance
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hh:diva-35470 (URN) 10.1016/j.ijepes.2018.08.047 (DOI) 000449447200055 () 2-s2.0-85053080255 (Scopus ID)
Available from: 2017-11-24 Created: 2017-11-24 Last updated: 2019-03-19. Bibliographically approved
Bouguelia, M.-R., Nowaczyk, S., Santosh, K. C. & Verikas, A. (2018). Agreeing to disagree: active learning with noisy labels without crowdsourcing. International Journal of Machine Learning and Cybernetics, 9(8), 1307-1319
2018 (English). In: International Journal of Machine Learning and Cybernetics, ISSN 1868-8071, E-ISSN 1868-808X, Vol. 9, no 8, p. 1307-1319. Article in journal (Refereed), Published
Abstract [en]

We propose a new active learning method for classification, which handles label noise without relying on multiple oracles (i.e., crowdsourcing). We propose a strategy that selects (for labeling) instances with a high influence on the learned model. An instance x is said to have a high influence on the model h if training h on x (with label y = h(x)) would result in a model that greatly disagrees with h on labeling other instances. Then, we propose another strategy that selects (for labeling) instances that are highly influenced by changes in the learned model. An instance x is said to be highly influenced if training h with a set of instances would result in a committee of models that agree on a common label for x but disagree with h(x). We compare the two strategies and show, on different publicly available datasets, that selecting instances according to the first strategy while eliminating noisy labels according to the second strategy greatly improves the accuracy compared to several benchmark methods, even when a significant number of instances are mislabeled. © Springer-Verlag Berlin Heidelberg 2017

Place, publisher, year, edition, pages
Heidelberg: Springer, 2018
Keywords
Active learning, Classification, Label noise, Mislabeling, Interactive learning, Machine learning, Data mining
National Category
Signal Processing; Computer Systems; Computer Sciences
Identifiers
urn:nbn:se:hh:diva-33365 (URN) 10.1007/s13042-017-0645-0 (DOI)
Available from: 2017-02-27 Created: 2017-02-27 Last updated: 2018-07-23. Bibliographically approved
Bouguelia, M.-R., Nowaczyk, S. & Payberah, A. H. (2018). An adaptive algorithm for anomaly and novelty detection in evolving data streams. Data Mining and Knowledge Discovery, 32(6), 1597-1633
2018 (English). In: Data Mining and Knowledge Discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 32, no 6, p. 1597-1633. Article in journal (Refereed), Published
Abstract [en]

In the era of big data, considerable research focus is being put on designing efficient algorithms capable of learning and extracting high-level knowledge from ubiquitous data streams in an online fashion. While most existing algorithms assume that data samples are drawn from a stationary distribution, several complex environments deal with data streams that are subject to change over time. Taking this aspect into consideration is an important step towards building truly aware and intelligent systems. In this paper, we propose GNG-A, an adaptive method for incremental unsupervised learning from evolving data streams experiencing various types of change. The proposed method maintains a continuously updated network (graph) of neurons by extending the Growing Neural Gas algorithm with three complementary mechanisms, allowing it to closely track both gradual and sudden changes in the data distribution. First, an adaptation mechanism handles local changes where the distribution is only non-stationary in some regions of the feature space. Second, an adaptive forgetting mechanism identifies and removes neurons that become irrelevant due to the evolving nature of the stream. Finally, a probabilistic evolution mechanism creates new neurons when there is a need to represent data in new regions of the feature space. The proposed method is demonstrated for anomaly and novelty detection in non-stationary environments. Results show that the method handles different data distributions and efficiently reacts to various types of change. © 2018 The Author(s)
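A drastically simplified sketch of the adaptation and evolution ideas behind such a method (not GNG-A itself; names, the fixed radius, and the toy stream are all assumptions) is: keep a set of neurons, move the nearest neuron toward each sample, and flag a sample as novel while creating a neuron there when no neuron is close enough:

```python
import numpy as np

def process_stream(stream, radius=1.0, lr=0.1):
    neurons = [stream[0].copy()]                   # start with one neuron
    novel = []
    for i, x in enumerate(stream[1:], start=1):
        d = [np.linalg.norm(x - n) for n in neurons]
        j = int(np.argmin(d))
        if d[j] > radius:                          # no neuron covers this region
            novel.append(i)
            neurons.append(x.copy())               # evolution: create a neuron
        else:
            neurons[j] += lr * (x - neurons[j])    # adaptation: move the winner
    return neurons, novel

rng = np.random.default_rng(0)
stream = np.vstack([rng.normal(0, 0.1, (50, 2)),
                    rng.normal(5, 0.1, (50, 2))])  # sudden distribution shift
neurons, novel = process_stream(stream)
```

GNG-A additionally maintains edges between neurons, forgets irrelevant neurons adaptively, and makes neuron creation probabilistic rather than threshold-based; this sketch only shows why a sudden shift is immediately flagged as novelty.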

Place, publisher, year, edition, pages
New York: Springer, 2018
Keywords
Data stream, Growing neural gas, Change detection, Non-stationary environments, Anomaly and novelty detection
National Category
Signal Processing
Identifiers
urn:nbn:se:hh:diva-36752 (URN) 10.1007/s10618-018-0571-0 (DOI) 2-s2.0-85046792304 (Scopus ID)
Projects
BIDAF
Available from: 2018-05-13 Created: 2018-05-13 Last updated: 2018-09-20. Bibliographically approved
Pashami, S., Holst, A., Bae, J. & Nowaczyk, S. (2018). Causal discovery using clusters from observational data. Paper presented at FAIM'18 Workshop on CausalML, Stockholm, Sweden, July 15, 2018.
2018 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Many methods have been proposed over the years for distinguishing causes from effects using observational data only, and new ones are continuously being developed – deducing causal relationships is difficult enough that we do not hope to ever get the perfect one. Instead, we progress by creating powerful heuristics, capable of capturing more and more of the hints that are present in real data.

One such hint, surprisingly rarely addressed explicitly by existing methods, is inhomogeneity in the data. Clusters are a very typical occurrence that should be taken into account, and exploited, in the process of identifying causes and effects. In this paper, we discuss the potential benefits, and explore the hints that clusters in the data can provide, for causal discovery. We propose a new method and show, using both artificial and real data, that accounting for clusters in the data leads to more accurate learning of causal structures.

National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:hh:diva-39216 (URN)
Conference
FAIM'18 Workshop on CausalML, Stockholm, Sweden, July 15, 2018
Available from: 2019-04-09 Created: 2019-04-09 Last updated: 2019-04-11. Bibliographically approved
Vaiciukynas, E., Uličný, M., Pashami, S. & Nowaczyk, S. (2018). Learning Low-Dimensional Representation of Bivariate Histogram Data. IEEE Transactions on Intelligent Transportation Systems, 19(11), 3723-3735
2018 (English). In: IEEE Transactions on Intelligent Transportation Systems, ISSN 1524-9050, E-ISSN 1558-0016, Vol. 19, no 11, p. 3723-3735. Article in journal (Refereed), Published
Abstract [en]

With an increasing amount of data in intelligent transportation systems, methods are needed to automatically extract general representations that accurately predict not only known tasks but also similar tasks that can emerge in the future. Creation of low-dimensional representations can be unsupervised or can exploit various labels in multi-task learning (when goal tasks are known) or transfer learning (when they are not) settings. Finding a general, low-dimensional representation suitable for multiple tasks is an important step toward knowledge discovery in aware intelligent transportation systems. This paper evaluates several approaches mapping high-dimensional sensor data from Volvo trucks into a low-dimensional representation that is useful for prediction. The original data are bivariate histograms, with two types (turbocharger and engine) considered. Low-dimensional representations were evaluated in a supervised fashion by mean equal error rate (EER) using a random forest classifier on a set of 27 one-vs-rest detection tasks. Results from unsupervised learning experiments indicate that using an autoencoder to create an intermediate representation, followed by t-distributed stochastic neighbor embedding, is the most effective way to create a low-dimensional representation of the original bivariate histogram. Individually, t-distributed stochastic neighbor embedding offered the best results for 2-D or 3-D representations, and a classical autoencoder for 6-D or 10-D representations. Using multi-task learning, combining unsupervised and supervised objectives on all 27 available tasks, resulted in 10-D representations with a significantly lower EER compared to the original 400-D data. In a transfer learning setting, with the topmost diverse tasks used for representation learning, 10-D representations achieved an EER comparable to the original representation.
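The final unsupervised step described above, embedding high-dimensional histogram vectors with t-distributed stochastic neighbor embedding, can be sketched as below. The data are toy random histograms (not truck data), and the paper's intermediate autoencoder is omitted:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
hists = rng.random((30, 400))                      # 30 samples of flattened 20x20 histograms
hists /= hists.sum(axis=1, keepdims=True)          # each histogram sums to one

# Map the 400-D histogram vectors to a 2-D representation with t-SNE
emb = TSNE(n_components=2, perplexity=5, init="random",
           random_state=0).fit_transform(hists)
```

In the paper's best unsupervised pipeline, an autoencoder would first compress the 400-D vectors to an intermediate representation, and t-SNE would then be applied to that intermediate code rather than to the raw histograms.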

Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2018
Keywords
Task analysis, Histograms, Engines, Intelligent transportation systems, Maintenance engineering, Machine learning, Feature extraction
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:hh:diva-38252 (URN) 10.1109/TITS.2018.2865103 (DOI) 2-s2.0-85053294183 (Scopus ID)
Available from: 2018-11-04 Created: 2018-11-04 Last updated: 2018-11-20. Bibliographically approved
Bouguelia, M.-R., Karlsson, A., Pashami, S., Nowaczyk, S. & Holst, A. (2018). Mode tracking using multiple data streams. Information Fusion, 43, 33-46
2018 (English). In: Information Fusion, ISSN 1566-2535, E-ISSN 1872-6305, Vol. 43, p. 33-46. Article in journal (Refereed), Published
Abstract [en]

Most existing work in information fusion focuses on combining information with well-defined meaning towards a concrete, pre-specified goal. In contradistinction, we instead aim for autonomous discovery of high-level knowledge from ubiquitous data streams. This paper introduces a method for recognition and tracking of hidden conceptual modes, which are essential to fully understand the operation of complex environments. We consider a scenario of analyzing usage of a fleet of city buses, where the objective is to automatically discover and track modes such as highway route, heavy traffic, or aggressive driver, based on available on-board signals. The method we propose is based on aggregating the data over time, since the high-level modes are only apparent in the longer perspective. We search through different features and subsets of the data, and identify those that lead to good clusterings, interpreting those clusters as initial, rough models of the prospective modes. We utilize Bayesian tracking in order to continuously improve the parameters of those models, based on the new data, while at the same time following how the modes evolve over time. Experiments with artificial data of varying degrees of complexity, as well as on real-world datasets, prove the effectiveness of the proposed method in accurately discovering the modes and in identifying which one best explains the current observations from multiple data streams. © 2017 Elsevier B.V. All rights reserved.
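The aggregation step the abstract describes, where high-level modes only become apparent after summarizing raw on-board signals over longer time windows, can be sketched as follows. The signal, window size, and two-mode setup are toy assumptions for illustration, not the bus fleet data or the paper's Bayesian tracking:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy speed signal: a city-driving stretch followed by a highway stretch
signal = np.concatenate([rng.normal(30, 2, 600),
                         rng.normal(90, 2, 600)])

# Aggregate over fixed windows; modes show up in window-level statistics
win = signal.reshape(-1, 60)                       # 60-sample windows
feats = np.column_stack([win.mean(axis=1), win.std(axis=1)])

# Initial rough mode models from a clustering of the aggregated features
modes = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
```

In the paper, such clusters only serve as initial rough mode models; their parameters are then refined continuously with Bayesian tracking as new stream data arrives.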

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2018
Keywords
Mode tracking, Clustering, Data streams, Time series, Knowledge discovery
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-35729 (URN) 10.1016/j.inffus.2017.11.011 (DOI) 2-s2.0-85037072003 (Scopus ID)
Projects
BIDAF
Available from: 2017-12-01 Created: 2017-12-01 Last updated: 2019-04-12. Bibliographically approved
Projects
iMedA: Improving MEDication Adherence through Person Centered Care and Adaptive Interventions [2017-04617_Vinnova]; Halmstad University
Identifiers
ORCID iD: orcid.org/0000-0002-7796-5201
