This study examines Convolutional Neural Networks (CNNs) in detail, including their layer types and architectural designs. It describes the creation of a dataset for Swedish Sign Language (SSL), consisting of 47,320 images, and the use of augmentation techniques to improve model training. The project uses hand tracking to locate the sign to be translated. The models included a fine-tuned MobileNet and a custom model; the fine-tuned MobileNet achieved the highest test accuracy, 97%. The research also evaluates the applicability of image recognition models on low-power devices, using a Raspberry Pi 4 Model B for practical experimentation. Together, these experiments provide insight into the efficacy of CNNs and their potential for deployment on low-power platforms.
Datasets are available at: https://huggingface.co/datasets/Bachelor2024/SwedishSignLanguageAlphabet
Code and models: https://github.com/gooligang/SSLAlphabetTranslator