Learning to Learn without Forgetting using Attention
Vettoruzzo, Anna. Halmstad University, School of Information Technology. ORCID iD: 0000-0003-0185-5038
Vanschoren, Joaquin. Eindhoven University of Technology, Eindhoven, Netherlands. ORCID iD: 0000-0001-7044-9805
Bouguelia, Mohamed-Rafik. Halmstad University, School of Information Technology. ORCID iD: 0000-0002-2859-6155
Rögnvaldsson, Thorsteinn. Halmstad University, School of Information Technology. ORCID iD: 0000-0001-5163-2997
2024 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Continual learning (CL) refers to the ability to continually learn over time by accommodating new knowledge while retaining previously learned experience. While this concept is inherent in human learning, current machine learning methods are highly prone to overwrite previously learned patterns and thus forget past experience. Instead, model parameters should be updated selectively and carefully, avoiding unnecessary forgetting while optimally leveraging previously learned patterns to accelerate future learning. Since hand-crafting effective update mechanisms is difficult, we propose meta-learning a transformer-based optimizer to enhance CL. This meta-learned optimizer uses attention to learn the complex relationships between model parameters across a stream of tasks, and is designed to generate effective weight updates for the current task while preventing catastrophic forgetting on previously encountered tasks. Evaluations on benchmark datasets like SplitMNIST, RotatedMNIST, and SplitCIFAR-100 affirm the efficacy of the proposed approach in terms of both forward and backward transfer, even on small sets of labeled data, highlighting the advantages of integrating a meta-learned optimizer within the continual learning framework. 
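The core idea in the abstract, an optimizer that applies self-attention over per-parameter features (current value, gradient) so that each proposed update can depend on relationships between parameters, can be illustrated with a toy sketch. All names, shapes, and the random initialization below are illustrative assumptions, not the authors' implementation; in the paper the optimizer's weights are meta-learned across a stream of tasks rather than sampled at random.

```python
# Toy sketch of an attention-based learned optimizer (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding width of the toy optimizer

# Randomly initialized optimizer weights; in the paper's setting these
# would be meta-learned across a task stream.
W_embed = rng.normal(scale=0.1, size=(2, d))   # (param, grad) -> embedding
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
w_out = rng.normal(scale=0.1, size=(d,))       # embedding -> scalar update

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_update(params, grads):
    """Propose one update per parameter via self-attention over all parameters."""
    x = np.stack([params, grads], axis=-1) @ W_embed   # (P, d) token per parameter
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    attn = softmax(q @ k.T / np.sqrt(d))               # (P, P) parameter-to-parameter
    return (attn @ v) @ w_out                          # (P,) proposed updates

params = rng.normal(size=5)   # flattened model parameters
grads = rng.normal(size=5)    # their gradients on the current task
update = attention_update(params, grads)
print(update.shape)           # one proposed update per parameter: (5,)
```

Because the attention matrix couples every parameter to every other, the update for one weight can be modulated by the state of the rest of the network, which is what lets a meta-learned update rule trade off progress on the current task against forgetting on earlier ones.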

Place, publisher, year, edition, pages
2024. p. 1-16
Keywords [en]
Continual learning, Meta-learning, Few-shot learning, Catastrophic forgetting
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-55120
OAI: oai:DiVA.org:hh-55120
DiVA id: diva2:1922853
Conference
3rd Conference on Lifelong Learning Agents (CoLLAs), Pisa, Italy, July 29 - August 1, 2024
Available from: 2024-12-19. Created: 2024-12-19. Last updated: 2025-10-01. Bibliographically approved.
In thesis
1. Advancing Meta-Learning for Enhanced Generalization Across Diverse Tasks
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Meta-learning, or learning to learn, is a rapidly evolving area in machine learning that aims to enhance the adaptability and efficiency of learning algorithms. Inspired by the human ability to learn new concepts from limited examples and quickly adapt to unforeseen situations, meta-learning leverages prior experience to prepare models for fast adaptation to new tasks. Unlike traditional machine learning systems, where models are trained for specific tasks, meta-learning frameworks enable models to acquire generalized knowledge during training and efficiently learn new tasks during inference. This ability to generalize from past experiences to new tasks makes meta-learning a key focus in advancing artificial intelligence, offering the potential to create more flexible and efficient AI systems capable of performing well with minimal data.
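The train-then-adapt structure described above, acquiring generalized knowledge across many tasks so a new task can be learned from little data, can be sketched with a minimal first-order, MAML-style loop on a toy regression family. The task family, learning rates, and single-step adaptation below are illustrative assumptions for exposition, not the thesis's actual methods.

```python
# Minimal first-order meta-learning loop (FOMAML-style sketch, illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def task():
    """Sample a simple regression task: y = a * x, with a varying per task."""
    a = rng.uniform(-2, 2)
    x = rng.uniform(-1, 1, size=20)
    return x, a * x

def loss_grad(w, x, y):
    pred = w * x
    return np.mean((pred - y) ** 2), np.mean(2 * (pred - y) * x)

w_meta, inner_lr, outer_lr = 0.0, 0.1, 0.05
for _ in range(200):                      # outer loop: iterate over tasks
    x, y = task()
    _, g = loss_grad(w_meta, x, y)
    w_task = w_meta - inner_lr * g        # inner loop: one adaptation step
    _, g_outer = loss_grad(w_task, x, y)  # first-order meta-gradient
    w_meta -= outer_lr * g_outer          # update the shared initialization

# At inference: adapt the meta-learned initialization to a brand-new task.
x_new, y_new = task()
loss_before, g = loss_grad(w_meta, x_new, y_new)
w_adapted = w_meta - inner_lr * g         # fast adaptation with few examples
loss_after, _ = loss_grad(w_adapted, x_new, y_new)
print(loss_after <= loss_before)          # True: one step already helps
```

The outer loop never commits the model to any single task; it shapes an initialization from which the inner loop's few gradient steps make rapid progress, which is the "generalized knowledge during training, efficient learning during inference" split described above.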

In this thesis, we begin by formally defining the meta-learning framework, establishing clear terminology, and synthesizing existing work in a comprehensive survey paper. Building on this foundation, we demonstrate how meta-learning can be integrated into various fields to enhance model performance and extend capabilities to few-shot learning scenarios. We show how meta-learning can significantly improve the accuracy and efficiency of transferring knowledge across domains in domain adaptation. In scenarios involving a multimodal distribution of tasks, we develop methods that efficiently learn from and adapt to a wide variety of tasks drawn from different modes within the distribution, ensuring effective adaptation across diverse domains. Our work on personalized federated learning highlights meta-learning's potential to tailor federated learning processes to individual user needs while maintaining privacy and data security. Additionally, we address the challenges of continual learning by developing models that continuously integrate new information without forgetting previously acquired knowledge. For time series data analysis, we present meta-learning strategies that automatically learn optimal augmentation techniques, enhancing model predictions and offering robust solutions for real-world applications. Lastly, our pioneering research on unsupervised meta-learning via in-context learning explores innovative approaches for constructing tasks and learning effectively from unlabeled data.

Overall, the contributions of this thesis emphasize the potential of meta-learning techniques to improve performance across diverse research areas and demonstrate how advancements in one area can benefit the field as a whole.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2025. p. 46
Series
Halmstad University Dissertations ; 127
Keywords
Meta-learning, Few-shot learning, Domain adaptation, Federated learning, Continual learning, Unsupervised learning, In-context learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-55147 (URN)
978-91-89587-71-7 (ISBN)
978-91-89587-70-0 (ISBN)
Public defence
2025-02-03, 13:00, S1022, Kristian IV:s väg 3, 301 18 Halmstad (English)
Available from: 2025-01-08. Created: 2025-01-07. Last updated: 2025-10-01. Bibliographically approved.

Open Access in DiVA

fulltext (4237 kB), 107 downloads
File name: FULLTEXT01.pdf
File size: 4237 kB
Checksum (SHA-512): 48481cfb613f6d2d06c042797f5aa92a2ca12a2ba19a393a74446ba0e1ef8f4d9b1d90af796baf8180b2e83df1b0f5b6fbb559609476602423d1cf99468776dd
Type: fulltext. Mimetype: application/pdf

Authority records

Vettoruzzo, Anna; Bouguelia, Mohamed-Rafik; Rögnvaldsson, Thorsteinn

Search in DiVA

By author/editor
Vettoruzzo, Anna; Vanschoren, Joaquin; Bouguelia, Mohamed-Rafik; Rögnvaldsson, Thorsteinn
By organisation
School of Information Technology
Computer Sciences

Total: 107 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 304 hits