Wisdom of the contexts: active ensemble learning for contextual anomaly detection
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-6249-4144
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-7796-5201
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-2859-6155
Halmstad University, School of Information Technology.
2022 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 36, p. 2410-2458. Article in journal (Refereed). Published
Abstract [en]

In contextual anomaly detection, an object is only considered anomalous within a specific context. Most existing methods use a single context based on a set of user-specified contextual features. However, identifying the right context can be very challenging in practice, especially in datasets with a large number of attributes. Furthermore, in real-world systems, there might be multiple anomalies that occur in different contexts and, therefore, require a combination of several "useful" contexts to unveil them. In this work, we propose a novel approach, called WisCon (Wisdom of the Contexts), to effectively detect complex contextual anomalies in situations where the true contextual and behavioral attributes are unknown. Our method constructs an ensemble of multiple contexts, with varying importance scores, based on the assumption that not all useful contexts are equally so. We estimate the importance of each context using an active learning approach with a novel query strategy. Experiments show that WisCon significantly outperforms existing baselines in different categories (i.e., active classifiers, unsupervised contextual, and non-contextual anomaly detectors) on 18 datasets. Furthermore, the results support our initial hypothesis that there is no single perfect context that successfully uncovers all kinds of contextual anomalies, and leveraging the "wisdom" of multiple contexts is necessary. © 2022, The Author(s).
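The abstract does not spell out how each context is scored or how context importance is estimated, so the following is only a minimal sketch of the general idea (several contexts, each weighted by an importance score derived from a few analyst labels). It is not the WisCon algorithm. The toy data, the nearest-neighbour scoring rule, the `importance` heuristic, and all function names are assumptions made for illustration.

```python
from statistics import mean

def context_scores(rows, context_idx, behavior_idx, k=2):
    """Score each row by how far its behavioral value lies from the mean
    behavioral value of its k nearest neighbours in the chosen context."""
    scores = []
    for i, row in enumerate(rows):
        others = [r for j, r in enumerate(rows) if j != i]
        # Neighbours are the rows closest in the contextual attribute.
        others.sort(key=lambda r: abs(r[context_idx] - row[context_idx]))
        reference = mean(r[behavior_idx] for r in others[:k])
        scores.append(abs(row[behavior_idx] - reference))
    top = max(scores) or 1.0
    return [s / top for s in scores]  # scale so the largest score is 1

def importance(scores, labels):
    """Hypothetical importance heuristic (an assumption, not from the paper):
    mean score of labelled anomalies minus mean score of labelled normals,
    floored at zero."""
    anomalous = [scores[i] for i, y in labels.items() if y == 1]
    normal = [scores[i] for i, y in labels.items() if y == 0]
    if not anomalous or not normal:
        return 0.0
    return max(0.0, mean(anomalous) - mean(normal))

def ensemble_scores(per_context, weights):
    """Importance-weighted average of the per-context score lists."""
    total = sum(weights)
    if total == 0:  # no informative labels yet: fall back to equal weights
        weights, total = [1.0] * len(weights), float(len(weights))
    n = len(per_context[0])
    return [sum(w * s[i] for w, s in zip(weights, per_context)) / total
            for i in range(n)]

# Toy rows (made up): (outdoor temperature, hour slot, energy use).
rows = [(0, 5, 10), (1, 1, 11), (2, 6, 12), (10, 2, 30),
        (11, 7, 31), (12, 3, 12), (13, 4, 33)]

# Two candidate contexts for the behavioral attribute "energy use" (column 2).
per_context = [context_scores(rows, c, behavior_idx=2) for c in (0, 1)]

# Feedback an analyst might give: row 5 is anomalous, rows 1 and 6 are normal.
labels = {5: 1, 1: 0, 6: 0}
weights = [importance(s, labels) for s in per_context]
combined = ensemble_scores(per_context, weights)
most_anomalous = max(range(len(rows)), key=lambda i: combined[i])
```

In this toy setup the temperature context separates the labelled anomaly from the labelled normal rows more clearly than the hour context, so it receives the larger weight. An active learning loop would additionally choose which rows to send to the analyst; that query step is omitted here.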

Place, publisher, year, edition, pages
New York: Springer-Verlag New York, 2022. Vol. 36, p. 2410-2458
Keywords [en]
Anomaly detection, Active learning, Contextual anomaly detection, Ensemble learning
National Category
Computer Sciences
Research subject
Smart Cities and Communities
Identifiers
URN: urn:nbn:se:hh:diva-46401; DOI: 10.1007/s10618-022-00868-7; ISI: 000864233400001; Scopus ID: 2-s2.0-85139454448; OAI: oai:DiVA.org:hh-46401; DiVA, id: diva2:1639850
Funder
Knowledge Foundation, 20160103
Note

As manuscript in thesis

Available from: 2022-02-22 Created: 2022-02-22 Last updated: 2023-01-12 Bibliographically approved
In thesis
1. Together We Learn More: Algorithms and Applications for User-Centric Anomaly Detection
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Anomaly detection is the problem of identifying data points or patterns that do not conform to normal behavior. Anomalies in data often correspond to important and actionable information such as fraud in financial applications, faults in production units, intrusions in computer systems, and serious diseases in patient records. One of the fundamental challenges of anomaly detection is that the exact notion of an anomaly is subjective and varies greatly across applications and domains. This makes it difficult to distinguish anomalies that match the end user's expectations from other observations. As a result, anomaly detectors produce many false alarms that do not correspond to semantically meaningful anomalies for the analyst.

Humans can help bridge this gap between detected anomalies and "anomalies-of-interest" in different ways: by giving clues about which features are more likely to reveal interesting anomalies, or by providing feedback that separates them from irrelevant ones. However, it is not realistic to expect a human to provide feedback easily unless the algorithm explains why it classifies a certain sample as an anomaly. Interpretability of results is crucial for an analyst to be able to investigate a candidate anomaly and decide whether it is actually interesting or not.

In this thesis, we take a step forward to improve the practical use of anomaly detection in real life by leveraging human-algorithm collaboration. This thesis and the appended papers study the problem of formulating and implementing algorithms for user-centric anomaly detection: a setting in which people analyze, interpret, and learn from the detector's results, as well as provide domain knowledge or feedback. Throughout this thesis, we describe a number of diverse approaches, each addressing different challenges and needs of user-centric anomaly detection in the real world, and combine these methods into a coherent framework. By conducting different studies, this thesis finds that a comprehensive approach incorporating human knowledge and providing interpretable results can lead to more effective and practical anomaly detection and more successful real-world applications. The major contributions that result from the studies included in this work and led to the above conclusion can be summarized into five categories: (1) exploring different data representations that are suitable for anomaly detection based on data characteristics and domain knowledge, (2) discovering patterns and groups in data that describe normal behavior in the current application, (3) implementing a generic and extensible framework enabling use-case-specific detectors suitable for different scenarios, (4) incorporating domain knowledge and expert feedback into anomaly detection, and (5) producing interpretable detection results that support end-users in understanding and validating the anomalies.

Place, publisher, year, edition, pages
Halmstad University Press, 2022. p. 211
Series
Halmstad University Dissertations ; 9
Keywords
data mining, machine learning, anomaly detection
National Category
Computer Sciences
Research subject
Smart Cities and Communities
Identifiers
urn:nbn:se:hh:diva-46404 (URN); 978-91-88749-87-1 (ISBN); 978-91-88749-88-8 (ISBN)
Public defence
2022-03-22, Visionen (Halda), Kristian IV:s väg 3, Halmstad, 13:00 (English)
Opponent
Supervisors
Available from: 2022-02-25 Created: 2022-02-22 Last updated: 2022-02-25 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records

Calikus, Ece; Nowaczyk, Sławomir; Bouguelia, Mohamed-Rafik; Dikmen, Onur
