Rolling The Dice For Better Deep Learning Performance: A Study Of Randomness Techniques In Deep Neural Networks
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-6040-2269
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-7796-5201
Halmstad University, School of Information Technology. ORCID iD: 0000-0003-3272-4145
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-0051-0954
2024 (English) In: Information Sciences, ISSN 0020-0255, E-ISSN 1872-6291, Vol. 667, p. 1-17, article id 120500. Article in journal (Refereed). Published.
Abstract [en]

This paper presents a comprehensive empirical investigation into the interactions between various randomness techniques in Deep Neural Networks (DNNs) and how they contribute to network performance. It is well-established that injecting randomness into the training process of DNNs, through various approaches at different stages, is often beneficial for reducing overfitting and improving generalization. However, the interactions between randomness techniques such as weight noise, dropout, and many others remain poorly understood. Consequently, it is challenging to determine which methods can be effectively combined to optimize DNN performance. To address this issue, we categorize the existing randomness techniques into four key types: data, model, optimization, and learning. We use this classification to identify gaps in the current coverage of potential mechanisms for the introduction of noise, leading us to propose two new techniques: adding noise to the loss function and random masking of the gradient updates.
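
For illustration, here is a minimal PyTorch-style sketch of the two proposed techniques, additive noise on the loss and random masking of gradient updates. The function names, noise scale, and keep-probability below are illustrative assumptions, not the authors' implementation (the complete code is in the paper's GitHub repository).

import torch

# Sketch only: hypothetical helpers, not the authors' exact implementation.
def noisy_loss(loss: torch.Tensor, sigma: float = 0.01) -> torch.Tensor:
    """Add zero-mean Gaussian noise to the scalar loss before backpropagation."""
    return loss + sigma * torch.randn_like(loss)

def mask_gradients(model: torch.nn.Module, keep_prob: float = 0.9) -> None:
    """Randomly zero a fraction of gradient entries before the optimizer step."""
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                mask = (torch.rand_like(p.grad) < keep_prob).to(p.grad.dtype)
                p.grad.mul_(mask)

# One training step using both techniques (assumed model/criterion/optimizer):
#   loss = criterion(model(x), y)
#   noisy_loss(loss, sigma=0.01).backward()
#   mask_gradients(model, keep_prob=0.9)
#   optimizer.step(); optimizer.zero_grad()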

In our empirical study, we employ a Particle Swarm Optimizer (PSO) to explore the space of possible configurations to answer where and how much randomness should be injected to maximize DNN performance. We assess the impact of various types and levels of randomness for DNN architectures applied to standard computer vision benchmarks: MNIST, FASHION-MNIST, CIFAR10, and CIFAR100. Across more than 30,000 evaluated configurations, we perform a detailed examination of the interactions between randomness techniques and their combined impact on DNN performance. Our findings reveal that randomness in data augmentation and in weight initialization are the main contributors to performance improvement. Additionally, correlation analysis demonstrates that different optimizers, such as Adam and Gradient Descent with Momentum, prefer distinct types of randomization during the training process. A GitHub repository with the complete implementation and generated dataset is available. © 2024 The Author(s)
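
As a rough illustration of this search, the sketch below runs a minimal global-best PSO over a vector of randomness "dials". The specific dials, bounds, PSO coefficients, and the train_and_evaluate objective are assumptions made for the example, not the paper's actual configuration space or settings.

import numpy as np

def pso(objective, bounds, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO minimizing `objective` over a box-constrained vector."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))   # particle positions
    v = np.zeros_like(x)                                    # particle velocities
    pbest, pbest_f = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, pbest_f.min()

# Hypothetical randomness dials: [dropout rate, weight-noise std, augmentation
# strength, loss-noise sigma, gradient keep-probability].
bounds = np.array([[0.0, 0.8], [0.0, 0.1], [0.0, 1.0], [0.0, 0.05], [0.5, 1.0]])

def train_and_evaluate(cfg):
    """Placeholder objective: train a DNN with this randomness configuration
    and return its validation error (lower is better)."""
    raise NotImplementedError

# best_cfg, best_err = pso(train_and_evaluate, bounds)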

Place, publisher, year, edition, pages
Philadelphia, PA: Elsevier, 2024. Vol. 667, p. 1-17, article id 120500
Keywords [en]
Neural Networks, Randomized Neural Networks, Convolutional Neural Network, hyperparameter optimization, Particle swarm optimization
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:hh:diva-52467, DOI: 10.1016/j.ins.2024.120500, ISI: 001224296500001, Scopus ID: 2-s2.0-85188777216, OAI: oai:DiVA.org:hh-52467, DiVA id: diva2:1831059
Available from: 2024-01-24. Created: 2024-01-24. Last updated: 2024-06-11. Bibliographically approved.
In thesis
1. Evolving intelligence: Overcoming challenges for Evolutionary Deep Learning
2024 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Deep Learning (DL) has achieved remarkable results in both academic and industrial fields over the last few years. However, DL models are often hard to design and require proper selection of features and tuning of hyper-parameters to achieve high performance. These selections are tedious for human experts and require substantial time and resources, a difficulty that has encouraged a growing number of researchers to use Evolutionary Computation (EC) algorithms to optimize Deep Neural Networks (DNNs); this research branch is called Evolutionary Deep Learning (EDL).

This thesis is a two-fold exploration within the domains of EDL and, more broadly, Evolutionary Machine Learning (EML). The first goal is to make EDL/EML algorithms more practical by reducing the high computational cost associated with EC methods. In particular, we have proposed methods to alleviate the computational burden using approximate models. We show that surrogate models can speed up EC methods by a factor of three without compromising the quality of the final solutions. Our surrogate-assisted approach allows EC methods to scale better to both expensive learning algorithms and large datasets with over 100K instances. Our second objective is to leverage EC methods to advance our understanding of DNN design. We identify a knowledge gap in DL algorithms and introduce an EC algorithm designed specifically to optimize this uncharted aspect of DL design. Our analytical focus is on revealing new concepts and acquiring novel insights. In our study of randomness techniques in DNNs, we offer insights into the design and training of more robust and generalizable neural networks. In another study, we also propose a novel survival regression loss function discovered through evolutionary search.
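
Below is a minimal sketch of the surrogate-assisted idea, assuming a random-forest regressor as the cheap surrogate and a screening step that forwards only the most promising candidates to the expensive fitness evaluation. The model choice, screening fraction, and helper names are illustrative, not the thesis' exact design.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def surrogate_screen(candidates, archive_x, archive_y, expensive_fitness, keep=0.3):
    """Fit a cheap surrogate on already-evaluated points, then run the expensive
    fitness function only on the most promising fraction of new candidates."""
    surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
    surrogate.fit(np.asarray(archive_x), np.asarray(archive_y))
    predicted = surrogate.predict(np.asarray(candidates))
    n_keep = max(1, int(keep * len(candidates)))
    chosen = np.argsort(predicted)[:n_keep]   # lower predicted fitness = better
    return [(candidates[i], expensive_fitness(candidates[i])) for i in chosen]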

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2024. p. 32
Series
Halmstad University Dissertations ; 109
Keywords
neural networks, evolutionary deep learning, evolutionary machine learning, feature selection, hyperparameter optimization, evolutionary computation, particle swarm optimization, genetic algorithm
National Category
Computer Systems Signal Processing
Identifiers
urn:nbn:se:hh:diva-52469 (URN), 978-91-89587-31-1 (ISBN), 978-91-89587-32-8 (ISBN)
Public defence
2024-02-16, Wigforss, Kristian IV:s väg 3, Halmstad, 08:00 (English)
Available from: 2024-01-24. Created: 2024-01-24. Last updated: 2024-03-07.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | Scopus

Authority records

Altarabichi, Mohammed Ghaith; Nowaczyk, Sławomir; Pashami, Sepideh; Sheikholharam Mashhadi, Peyman

Search in DiVA

By author/editor
Altarabichi, Mohammed Ghaith; Nowaczyk, Sławomir; Pashami, Sepideh; Sheikholharam Mashhadi, Peyman; Handl, Julia
By organisation
School of Information Technology
In the same journal
Information Sciences
On the subject
Computer Systems
