Corrupted Contextual Bandits with Action Order Constraints
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. ORCID iD: 0000-0002-7453-9186
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. ORCID iD: 0000-0002-7796-5201
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. ORCID iD: 0000-0003-1145-4297
(English) Manuscript (preprint) (Other academic)
Abstract [en]

We consider a variant of the novel contextual bandit problem with corrupted context, which we call the contextual bandit problem with corrupted context and action correlation, where actions exhibit a relationship structure that can be exploited to guide the exploration of viable next decisions. Our setting is primarily motivated by adaptive mobile health interventions and related applications, where users might transition through different stages requiring more targeted action selection approaches. In such settings, keeping users engaged is paramount for the success of interventions, and it is therefore vital to provide relevant recommendations in a timely manner. The context provided by users might not be informative at every decision point, in which case standard contextual approaches to action selection will incur high regret. We propose a meta-algorithm using a referee that dynamically combines the policies of a contextual bandit and a multi-armed bandit, similar to previous work, as well as a simple correlation mechanism that captures action-to-action transition probabilities, allowing for more efficient exploration of time-correlated actions. We empirically evaluate the performance of this algorithm on a simulation where the sequence of best actions is determined by a hidden state that evolves in a Markovian manner. We show that the proposed meta-algorithm achieves lower regret in situations where the performance of both policies varies such that one is strictly superior to the other for a given time period. To demonstrate that our setting has practical applicability, we evaluate our method on several real-world data sets, showing better empirical performance compared to a set of simple algorithms.
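The two ingredients named in the abstract, a referee over two base policies and an action-to-action transition estimate, can be illustrated with a minimal sketch. The full text is not available in this record, so nothing below is the authors' algorithm: the class names `TransitionModel` and `Referee`, the Laplace-smoothed counts, and the greedy running-mean rule are all assumptions chosen only to make the idea concrete.

```python
# Illustrative sketch only; all names and update rules here are hypothetical.


class TransitionModel:
    """Laplace-smoothed estimate of P(next action | previous action)."""

    def __init__(self, n_actions, prior=1.0):
        # Assumption: start every transition with the same pseudo-count.
        self.counts = [[prior] * n_actions for _ in range(n_actions)]

    def update(self, prev_action, next_action):
        # Record one observed transition between consecutive actions.
        self.counts[prev_action][next_action] += 1.0

    def probs(self, prev_action):
        # Normalise the count row into a probability distribution.
        row = self.counts[prev_action]
        total = sum(row)
        return [c / total for c in row]


class Referee:
    """Picks one of several base policies by running mean reward."""

    def __init__(self, n_policies=2):
        self.sums = [0.0] * n_policies
        self.pulls = [0] * n_policies

    def choose(self):
        # Try each policy once, then follow the best running mean (greedy).
        for index, n in enumerate(self.pulls):
            if n == 0:
                return index
        means = [s / n for s, n in zip(self.sums, self.pulls)]
        return means.index(max(means))

    def update(self, policy, reward):
        self.sums[policy] += reward
        self.pulls[policy] += 1


# Usage: three actions, four observed transitions.
model = TransitionModel(n_actions=3)
for prev, nxt in [(0, 1), (0, 1), (0, 2), (1, 2)]:
    model.update(prev, nxt)
p = model.probs(0)  # row counts [1, 3, 2] normalised over a total of 6

# Policy 0 stands in for the contextual bandit, policy 1 for the multi-armed bandit.
referee = Referee()
referee.update(0, 0.0)
referee.update(1, 1.0)
chosen = referee.choose()
```

In this toy version the referee is purely greedy; a practical referee would itself need to explore, since the abstract stresses that which policy is superior changes over time.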

Keywords [en]
Contextual Bandit, Sequential Decision Making, Action Sequence, Nonstationarity
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-43530
OAI: oai:DiVA.org:hh-43530
DiVA, id: diva2:1504087
Part of project
iMedA: Improving MEDication Adherence through Person Centered Care and Adaptive Interventions, Vinnova
Note

As manuscript in thesis

Available from: 2020-11-26 Created: 2020-11-26 Last updated: 2021-04-07 Bibliographically approved
In thesis
1. Data-driven personalized healthcare: Towards personalized interventions via reinforcement learning for Mobile Health
2021 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Medical and technological advancement in the last century has led to an unprecedented increase in the populace's quality of life and lifespan. As a result, an ever-increasing number of people live with chronic health conditions that require long-term treatment, resulting in increased healthcare costs and managerial burden to the healthcare provider. This increase in complexity can lead to ineffective decision-making and reduce care quality for the individual while increasing costs. One promising direction to tackle these issues is the active involvement of the patient in managing their care. Particularly for chronic diseases, where ongoing support is often required, patients must understand their illness and be empowered to manage their care. With the advent of smart devices such as smartphones, it is easier than ever to provide personalised digital interventions to patients, help them manage their treatment in their daily lives, and raise awareness about their illness. If such new approaches are to succeed, scalability is necessary, and solutions are needed that can act autonomously without costly human intervention. Furthermore, solutions should exhibit adaptability to the changing circumstances of an individual patient's health, needs and goals. Through the ongoing digitisation of healthcare, we are presented with the unique opportunity to develop cost-effective and scalable solutions through Artificial Intelligence (AI).

This thesis presents work that we conducted as part of the project Improving Medication Adherence through Person-Centered Care and Adaptive Interventions (iMedA), which aims to provide personalised adaptive interventions to hypertensive patients, supporting them in managing their medication regimen. The focus lies on inadequate medication adherence (MA), a pervasive issue where patients do not take their medication as instructed by their physician. The selection of individuals for intervention through secondary database analysis on Electronic Health Records (EHRs) was a key challenge and is addressed through in-depth analysis of common adherence measures, development of prediction models for MA and discussions on limitations of such approaches for analysing MA. Furthermore, providing personalised adaptive interventions is framed in the contextual bandit setting and addresses the challenge of delivering relevant interventions in environments where contextual information is significantly corrupted.

The contributions of the thesis can be summarised as follows: (1) highlighting the issues encountered in measuring MA through secondary database analysis and providing recommendations to address these issues, (2) investigating machine learning models developed using EHRs for MA prediction and extraction of common refilling patterns through EHRs, and (3) formally defining a novel contextual bandit setting with context uncertainty commonly encountered in Mobile Health and developing an algorithm designed for such environments.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2021. p. 55
Series
Halmstad University Dissertations ; 79
Keywords
Information Driven Care, Electronic Health Records, Machine Learning, Reinforcement Learning
National Category
Health Care Service and Management, Health Policy and Services and Health Economy; Signal Processing
Identifiers
urn:nbn:se:hh:diva-44091 (URN), 9789188749666 (ISBN), 9789188749673 (ISBN)
Presentation
2021-04-19, Wigforss, Visionen, Kristian IV:s väg 3, Halmstad, 14:00 (English)
Available from: 2021-04-08 Created: 2021-04-01 Last updated: 2022-03-11 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Galozy, Alexander; Nowaczyk, Sławomir; Ohlsson, Mattias
