Feature extraction from unequal length heterogeneous EHR time series via dynamic time warping and tensor decomposition
Department of Biostatistics, Domus Medica, University of Oslo, Oslo, Norway. ORCID iD: 0000-0003-0501-5909
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. ORCID iD: 0000-0001-8413-963x
Department of Biostatistics, Domus Medica, University of Oslo, Oslo, Norway.
2021 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 35, p. 1760-1784. Article in journal (Refereed). Published
Abstract [en]

Electronic Health Records (EHR) data is routinely generated patient data that can provide useful information for analytical tasks such as disease detection and clinical event prediction. However, temporal EHR data such as physiological vital signs and lab test results are particularly challenging to analyse. Temporal EHR features typically have different sampling frequencies: heart rate, for example, is measured almost continuously, whereas blood tests are taken only a few times during a patient’s entire stay. Patients also differ in their length of stay. Existing approaches for temporal EHR sequence extraction either ignore the temporal pattern within features, or use a predefined window to select a section of the sequences without taking into account all the information. We propose a novel approach to tackle the issue of irregularly sampled, unequal length EHR time series using dynamic time warping (DTW) and tensor decomposition. We use DTW to compute the pairwise distances between patients in the cohort for each temporal feature, and stack the resulting distance matrices into a tensor. We then decompose the tensor to learn the latent structure, which is then used for patient representation. Finally, we use the patient representation for in-hospital mortality prediction. We illustrate our method on two cohorts from the MIMIC-III database: the sepsis and the acute kidney failure cohorts. We show that our method produces outstanding classification performance in terms of AUROC, AUPRC and accuracy compared with the baseline methods: LSTM and DTW-KNN. We conclude with a detailed analysis of feature importance to support the interpretability of our method. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature.
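
The pipeline described in the abstract (per-feature pairwise DTW distances, stacked into a patients × patients × features tensor, then decomposed into a patient representation) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' implementation: the function names, the toy `patients` data, the absolute-difference local cost, and in particular the decomposition step, which uses a truncated SVD of the mode-1 unfolding as a simple stand-in because the abstract does not say which tensor decomposition the paper uses.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def distance_tensor(patients):
    """Stack one symmetric DTW distance matrix per feature into a (P, P, F) tensor.

    `patients` is a list of dicts mapping feature name -> 1-D sequence;
    sequences may have different lengths across patients and features.
    """
    features = sorted(patients[0])
    p = len(patients)
    tensor = np.zeros((p, p, len(features)))
    for k, name in enumerate(features):
        for i in range(p):
            for j in range(i + 1, p):
                d = dtw_distance(patients[i][name], patients[j][name])
                tensor[i, j, k] = tensor[j, i, k] = d
    return tensor


def patient_embedding(tensor, rank):
    """Stand-in decomposition (assumption): truncated SVD of the mode-1 unfolding."""
    p = tensor.shape[0]
    unfolded = tensor.reshape(p, -1)  # one row per patient
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank] * s[:rank]  # one rank-dimensional vector per patient


# Hypothetical toy cohort: three patients, two features, unequal lengths.
patients = [
    {"heart_rate": [80, 82, 85, 90, 88], "lactate": [1.0, 1.4]},
    {"heart_rate": [78, 81, 91], "lactate": [1.1, 1.2, 1.5]},
    {"heart_rate": [120, 125, 130, 128], "lactate": [3.9]},
]

tensor = distance_tensor(patients)
embedding = patient_embedding(tensor, rank=2)
print(tensor.shape, embedding.shape)
```

In a setting like the one the abstract describes, the rows of `embedding` would be passed to a classifier for in-hospital mortality prediction; that step is omitted here.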

Place, publisher, year, edition, pages
New York, NY: Springer, 2021. Vol. 35, p. 1760-1784
Keywords [en]
Electronic health records, Dynamic time warping, Tensor decomposition, Patient similarity
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-43757
DOI: 10.1007/s10618-020-00724-6
ISI: 000604530000001
Scopus ID: 2-s2.0-85098651998
OAI: oai:DiVA.org:hh-43757
DiVA, id: diva2:1514479
Available from: 2021-01-05 Created: 2021-01-05 Last updated: 2021-08-04 Bibliographically approved
By author/editor
Zhang, Chi; Fanaee Tork, Hadi