Dealing with missing data in data analysis is inevitable. Although powerful imputation methods that address this problem exist, there is still much room for improvement. In this study, we examined single imputation based on deep autoencoders, motivated by the apparent success of deep learning in efficiently extracting useful dataset features. We developed a consistent framework for both training and imputation. Moreover, we benchmarked the results against state-of-the-art imputation methods on datasets of different sizes and characteristics. The work was not limited to datasets with a single variable type; we also imputed missing data in datasets with multi-type variables, e.g., a combination of binary, categorical, and continuous attributes. To evaluate the imputation methods, we randomly corrupted the complete data, with varying degrees of corruption, and then compared the imputed and original values. In all experiments, the developed autoencoder obtained the smallest error for all ranges of initial data corruption. © 2019 Elsevier B.V.
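The evaluation protocol described in the abstract above (corrupt complete data at random, impute, then compare imputed and original values) can be sketched as follows. This is only an illustration under stated assumptions: column-mean imputation stands in for the autoencoder, which the abstract does not specify in detail, and the function names, the toy data, and the choice of RMSE over the removed entries are all hypothetical.

```python
import math
import random


def corrupt(rows, rate, rng):
    """Return a copy of rows with each entry replaced by None with probability rate."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def mean_impute(rows):
    """Fill each missing entry with the mean of the observed values in its column.

    Assumption: this simple baseline is a stand-in, not the autoencoder from the abstract.
    """
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def masked_rmse(original, corrupted, imputed):
    """Root mean squared error over the entries that were removed, and only those."""
    squared = [
        (o - p) ** 2
        for orow, crow, prow in zip(original, corrupted, imputed)
        for o, c, p in zip(orow, crow, prow)
        if c is None
    ]
    return math.sqrt(sum(squared) / len(squared)) if squared else 0.0


# Toy complete dataset with two continuous attributes (hypothetical).
rng = random.Random(0)
complete = [[rng.gauss(0, 1), rng.gauss(5, 2)] for _ in range(200)]

# Repeat the corrupt / impute / compare cycle for several corruption levels.
for rate in (0.1, 0.3, 0.5):
    corrupted = corrupt(complete, rate, rng)
    imputed = mean_impute(corrupted)
    print(rate, round(masked_rmse(complete, corrupted, imputed), 3))
```

Scoring only the removed entries keeps the error from being diluted by values the imputer never had to guess. Any other imputer with the same input and output shape could replace `mean_impute` in the loop.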
Generalization performance in recurrent neural networks is enhanced by cascading several networks. By discretizing abstractions induced in one network, other networks can operate on a coarse symbolic level with increased performance on sparse and structural prediction tasks. The level of systematicity exhibited by the cascade of recurrent networks is assessed on the basis of three language domains.
Recent studies have shown how synthetic data generation methods can be applied to electronic health records (EHRs) to obtain synthetic versions that do not violate privacy rules. This growing body of research has resulted in the emergence of numerous methods for evaluating the quality of generated data, with new publications often introducing novel evaluation methods. This work presents a detailed review of synthetic EHRs, focusing on the various evaluation methods used to assess the quality of the generated EHRs. We discuss the existing evaluation methods, offering insights into their use as well as providing an interpretation of the evaluation metrics from the perspectives of achieving fidelity, utility and privacy. Furthermore, we highlight the key factors influencing the selection of evaluation methods, such as the type of data (e.g., categorical, continuous, or discrete) and the mode of application (e.g., patient level, cohort level, and feature level). To assess the effectiveness of current evaluation measures, we conduct a series of experiments to shed light on the potential limitations of these measures. The findings from these experiments reveal notable shortcomings, including the need for meticulous application of methods to the data to reduce inconsistent evaluations, the qualitative nature of some assessments subject to individual judgment, the need for clinical validations, and the absence of techniques to evaluate temporal dependencies within the data. This highlights the need to place greater emphasis on evaluation measures, their application, and the development of comprehensive evaluation frameworks, all of which are crucial for advancing this field. © 2024 The Author(s)
We considered the case of monitoring a large fleet where heterogeneity in the operational behavior among its constituent units (i.e., systems or machines) is non-negligible, and no labeled data is available. Each unit in the fleet, referred to as a target, is tracked by its sub-fleet. A conformal sub-fleet (CSF) is a set of units that act as a proxy for the normal operational behavior of a target unit by relying on the Mondrian conformal anomaly detection framework. Two approaches, the k-nearest neighbors and conformal clustering, were investigated for constructing such a sub-fleet by formulating a stability criterion. Moreover, it is important to discover the sub-sequence of events that describes an anomalous behavior in a target unit. Hence, we proposed to extract such sub-sequences for further investigation without pre-specifying their length. We refer to such a sub-sequence as a conformal anomaly sequence (CAS). Furthermore, different nonconformity measures were evaluated for their efficiency, i.e., their ability to detect anomalous behavior in a target unit, based on the length of the observed CAS and the S-criterion value. The CSF approach was evaluated in the context of monitoring district heating substations. Anomalous behavior sub-sequences were corroborated by a domain expert, leading to the conclusion that the proposed approach has the potential to be useful for both diagnostic and knowledge extraction purposes, especially in domains where labeled data is not available or hard to obtain. © 2021
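The conformal anomaly detection step that the abstract above builds on can be illustrated with a short sketch. It is not the authors' CSF method: it assumes a k-nearest-neighbor distance as the nonconformity measure, computes a plain (not Mondrian, i.e. not category-conditional) conformal p-value, and uses hypothetical names, toy two-dimensional observations, and an arbitrary significance level.

```python
import math


def knn_score(point, reference, k):
    """Nonconformity score: mean Euclidean distance to the k nearest reference points."""
    dists = sorted(math.dist(point, r) for r in reference)
    return sum(dists[:k]) / k


def conformal_p_value(test_score, calibration_scores):
    """Share of scores (calibration plus the test score itself) at least as large as the test score."""
    n_ge = sum(1 for s in calibration_scores if s >= test_score)
    return (n_ge + 1) / (len(calibration_scores) + 1)


# Hypothetical observations from the units that form a target's sub-fleet.
sub_fleet = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]

# Leave-one-out calibration: score each sub-fleet observation against the others.
calibration = [
    knn_score(p, sub_fleet[:i] + sub_fleet[i + 1:], k=2)
    for i, p in enumerate(sub_fleet)
]

# Flag a target observation when its p-value falls below the significance level.
epsilon = 0.2  # arbitrary level chosen for this toy example
for observation in [(0.05, 0.0), (3.0, 3.0)]:
    p = conformal_p_value(knn_score(observation, sub_fleet, k=2), calibration)
    print(observation, round(p, 3), "anomalous" if p < epsilon else "normal")
```

With n calibration scores the smallest attainable p-value is 1/(n+1), so the significance level has to be chosen with the sub-fleet size in mind. A Mondrian variant would compute the p-value separately within each category of observations.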
Three-dimensional human pose and shape estimation aims to compute a full 3D human mesh from a single image. The contamination of features caused by occlusion usually degrades its performance significantly. Recent work in this field has typically addressed the occlusion problem implicitly. By contrast, in this paper, we address it explicitly using a simple yet effective de-occlusion multi-task learning network. Our key insight is that the features used for mesh parameter regression should be noise-free. Thus, in the feature space, our method disentangles the occludee, which represents the noise-free human feature, from the occluder. Specifically, a spatial regularization and an attention mechanism are imposed in the backbone of our network to disentangle the features into different channels. Furthermore, two segmentation tasks are proposed to supervise the de-occlusion process. The final mesh model is regressed from the disentangled occlusion-aware features. Experiments on both occlusion and non-occlusion datasets are conducted, and the results show that our method is superior to the state-of-the-art methods on two occlusion datasets, while achieving competitive performance on a non-occlusion dataset. We also demonstrate that the proposed de-occlusion strategy is the main factor in improving the robustness against occlusion. The code is available at https://github.com/qihangran/De-occlusion_MTL_HMR. © 2023
The incremental learning paradigm in machine learning has consistently been a focus of academic research. It is similar to the way in which biological systems learn, and reduces energy consumption by avoiding excessive retraining. Existing studies utilize the powerful feature extraction capabilities of pre-trained models to address incremental learning, but there remains a problem of insufficient utilization of neural network feature knowledge. To address this issue, this paper proposes a novel method called Pre-trained Model Knowledge Distillation (PMKD), which combines knowledge distillation of neural network representations and replay. This paper designs a loss function based on centered kernel alignment to transfer representation knowledge from the pre-trained model to the incremental model layer by layer. Additionally, the use of a memory buffer for Dark Experience Replay helps the model retain past knowledge better. Experiments show that PMKD achieved superior performance on various datasets and different buffer sizes. Compared with other methods, PMKD achieved the best class-incremental learning accuracy. The open-source code is published at https://github.com/TianSongS/PMKD-IL. © 2023 The Author(s)
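Centered kernel alignment (CKA), on which the loss in the abstract above is based, can be sketched in its linear form. The abstract does not give the exact loss, so the layer-wise `1 - CKA` average below is an assumed form rather than the PMKD loss, and all function names and the toy activations are hypothetical. Matrices are lists of rows, one row per example and one column per feature.

```python
import math


def center_columns(m):
    """Subtract each column's mean so that every feature has zero mean."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for two matrices with the same number of rows."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            total += sum(x * y for x, y in zip(col_a, col_b)) ** 2
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered features.

    Undefined (division by zero) if either input has only constant columns.
    """
    x, y = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(y, x)
    denominator = math.sqrt(cross_frobenius_sq(x, x) * cross_frobenius_sq(y, y))
    return numerator / denominator


def cka_distillation_loss(teacher_layers, student_layers):
    """Assumed layer-wise loss: mean of (1 - CKA) over paired teacher/student layers."""
    pairs = list(zip(teacher_layers, student_layers))
    return sum(1.0 - linear_cka(t, s) for t, s in pairs) / len(pairs)


# Toy activations for four examples at one layer (hypothetical values).
teacher = [[1.0, 2.0], [2.0, 0.0], [4.0, 1.0], [0.0, 3.0]]
student = [[0.9, 2.2, 0.1], [2.1, 0.1, 0.0], [3.8, 1.0, 0.2], [0.2, 2.9, 0.1]]
print(round(linear_cka(teacher, student), 3))
print(round(cka_distillation_loss([teacher], [student]), 3))
```

Because CKA compares example-by-example similarity structure, the teacher and student layers may have different widths, which is one reason it suits layer-wise representation transfer.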
We present a stochastic learning algorithm for neural networks. The algorithm makes no assumptions about the transfer functions of individual neurons and does not depend on the functional form of the performance measure. The algorithm uses a random step of varying size to adapt weights. The average size of the step decreases during learning. The large steps enable the algorithm to jump over local maxima or minima, while the small ones ensure convergence in a local area. We investigate the convergence properties of the proposed algorithm and test it on four supervised and unsupervised learning problems. We found that the algorithm outperformed several known algorithms on both generated and real data.
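The idea of random weight steps whose average size shrinks during learning can be sketched as below. The abstract does not state the step distribution, the decay schedule, or the acceptance rule, so the Gaussian steps, the geometric decay, and the keep-only-improvements rule used here are all assumptions, as are the function and parameter names.

```python
import random


def random_step_search(loss, weights, steps=2000, start_scale=1.0, end_scale=1e-3, seed=0):
    """Minimize a black-box loss with random steps whose scale decays geometrically.

    The loss is only evaluated, never differentiated, so no assumption is made
    about its functional form.
    """
    rng = random.Random(seed)
    best = list(weights)
    best_loss = loss(best)
    for t in range(steps):
        # Step scale moves from start_scale to end_scale over the run (assumed schedule).
        scale = start_scale * (end_scale / start_scale) ** (t / max(steps - 1, 1))
        candidate = [w + rng.gauss(0.0, scale) for w in best]
        candidate_loss = loss(candidate)
        # Assumed acceptance rule: keep the step only if it lowers the loss.
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss


def quadratic(w):
    """Toy performance measure with its minimum at (3, -2)."""
    return (w[0] - 3.0) ** 2 + (w[1] + 2.0) ** 2


final_weights, final_loss = random_step_search(quadratic, [0.0, 0.0])
print(final_weights, final_loss)
```

Early large steps let the search leave the neighborhood of its starting point, while late small steps refine the solution locally. With the greedy rule assumed here the returned loss can never exceed the initial one.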
The standard machine learning assumption that training and test data are drawn from the same probability distribution does not hold in many real-world applications due to the inability to reproduce testing conditions at training time. Existing unsupervised domain adaptation (UDA) methods address this problem by learning a domain-invariant feature space that performs well on available source domain(s) (labeled training data) and the specific target domain (unlabeled test data). In contrast, instead of simply adapting to domains, this paper aims for an approach that learns to adapt effectively to new unlabeled domains. To do so, we leverage meta-learning to optimize a neural network such that adapting its parameters to any domain using unlabeled data yields good generalization on that domain. The experimental evaluation shows that the proposed approach outperforms standard approaches even when only a small amount of unlabeled test data is used for adaptation, demonstrating the benefit of meta-learning prior knowledge from various domains to solve UDA problems.