2024 (English). In: Information Sciences, ISSN 0020-0255, E-ISSN 1872-6291, Vol. 667, p. 1-17, article id 120500. Article in journal (Refereed), Published
Abstract [en]
This paper presents a comprehensive empirical investigation into the interactions between various randomness techniques in Deep Neural Networks (DNNs) and how they contribute to network performance. It is well established that injecting randomness into the training process of DNNs, through various approaches at different stages, is often beneficial for reducing overfitting and improving generalization. However, the interactions between randomness techniques such as weight noise and dropout, among many others, remain poorly understood. Consequently, it is challenging to determine which methods can be effectively combined to optimize DNN performance. To address this issue, we categorize the existing randomness techniques into four key types: data, model, optimization, and learning. We use this classification to identify gaps in the current coverage of potential mechanisms for introducing noise, which leads us to propose two new techniques: adding noise to the loss function and random masking of the gradient updates.
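A minimal sketch, assuming PyTorch and illustrative hyperparameter values (neither of which is specified in this record), of how the two proposed mechanisms could be realized. The multiplicative form of the loss noise is one possible reading of "noise in the loss function", not the paper's confirmed formulation; the authors' repository should be consulted for the exact implementation.

```python
# Illustrative sketch only (not the authors' code). Shows two noise-injection
# points named in the abstract: noise applied to the loss and random masking of
# gradient updates. Hyperparameter values below are assumptions for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))

loss_noise_std = 0.1   # assumed strength of the loss noise
grad_keep_prob = 0.9   # assumed probability of keeping each gradient entry

for step in range(50):
    optimizer.zero_grad()
    per_sample = F.cross_entropy(model(x), y, reduction="none")
    # Loss noise (one possible formulation): multiplicative perturbation of the
    # per-sample losses, which randomly reweights their gradient contributions.
    noise = 1.0 + loss_noise_std * torch.randn_like(per_sample)
    loss = (per_sample * noise).mean()
    loss.backward()
    # Random masking of gradient updates: zero out a random subset of each
    # parameter's gradient entries before the optimizer step.
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                mask = (torch.rand_like(p.grad) < grad_keep_prob).to(p.grad.dtype)
                p.grad.mul_(mask)
    optimizer.step()
```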
In our empirical study, we employ a Particle Swarm Optimizer (PSO) to explore the space of possible configurations and determine where, and how much, randomness should be injected to maximize DNN performance. We assess the impact of various types and levels of randomness for DNN architectures applied to standard computer vision benchmarks: MNIST, FASHION-MNIST, CIFAR10, and CIFAR100. Across more than 30,000 evaluated configurations, we perform a detailed examination of the interactions between randomness techniques and their combined impact on DNN performance. Our findings reveal that randomness in data augmentation and in weight initialization are the main contributors to performance improvement. Additionally, correlation analysis demonstrates that different optimizers, such as Adam and Gradient Descent with Momentum, prefer distinct types of randomization during the training process. A GitHub repository with the complete implementation and generated dataset is available. © 2024 The Author(s)
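To illustrate how a PSO can search over randomness levels, here is a small, self-contained global-best PSO sketch in NumPy. The search dimensions, bounds, PSO coefficients, and the toy objective (a stand-in for "train the network with these randomness settings and return its validation error") are all assumptions for demonstration and do not describe the paper's actual search space or settings.

```python
# Minimal global-best PSO sketch (illustrative, not the paper's setup). Each
# particle is a vector of randomness levels, e.g. [dropout_rate,
# weight_noise_std, loss_noise_std, grad_mask_prob]; the objective below is a
# placeholder for training/evaluating a DNN with those settings.
import numpy as np

rng = np.random.default_rng(0)
dim, n_particles, iters = 4, 20, 30
lower, upper = np.zeros(dim), np.array([0.8, 0.5, 0.5, 0.5])  # assumed bounds


def objective(config):
    # Placeholder cost: a smooth toy function so the sketch runs stand-alone.
    target = np.array([0.3, 0.05, 0.1, 0.1])
    return float(np.sum((config - target) ** 2))


# Standard PSO update with inertia, cognitive, and social terms (assumed coefficients).
w, c1, c2 = 0.7, 1.5, 1.5
pos = rng.uniform(lower, upper, size=(n_particles, dim))
vel = np.zeros_like(pos)
pbest, pbest_cost = pos.copy(), np.array([objective(p) for p in pos])
gbest = pbest[np.argmin(pbest_cost)].copy()

for _ in range(iters):
    r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lower, upper)
    cost = np.array([objective(p) for p in pos])
    improved = cost < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], cost[improved]
    gbest = pbest[np.argmin(pbest_cost)].copy()

print("best randomness configuration found:", gbest)
```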
Place, publisher, year, edition, pages
Philadelphia, PA: Elsevier, 2024
Keywords
Neural networks, Randomized neural networks, Convolutional neural network, Hyperparameter optimization, Particle swarm optimization
National Category
Computer Systems
Identifiers
urn:nbn:se:hh:diva-52467 (URN)
10.1016/j.ins.2024.120500 (DOI)
001224296500001 (ISI)
2-s2.0-85188777216 (Scopus ID)
Available from: 2024-01-24 Created: 2024-01-24 Last updated: 2024-06-11 Bibliographically approved