Fusing Various Audio Feature Sets for Detection of Parkinson’s Disease from Sustained Voice and Speech Recordings
2016 (English)
In: Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349, Vol. 9811, p. 328-337
Article in journal (Refereed) Published
Abstract [en]
The aim of this study is to analyse voice and speech recordings for the task of Parkinson’s disease detection. The voice modality corresponds to a sustained phonation of /a/ and the speech modality to a short sentence in the Lithuanian language. Diverse information is extracted from the recordings by 22 well-known audio feature sets. A random forest is used as the learner, both for individual feature sets and for decision-level fusion. Essentia descriptors were found to be the best individual feature set, achieving an equal error rate of 16.3 % for voice and 13.3 % for speech. Fusion of feature sets and modalities improved detection, achieving an equal error rate of 10.8 %. Variable importance in the fusion revealed the speech modality to be more important than voice. © Springer International Publishing Switzerland 2016
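The equal error rate (EER) reported above is the operating point at which the false-positive rate (healthy subjects flagged as patients) equals the false-negative rate (patients missed). As a minimal sketch, and not the authors' evaluation code, it can be approximated from detector scores by sweeping a threshold. The function name `equal_error_rate`, the label convention (1 = Parkinson’s disease, 0 = healthy control), and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(labels, scores):
    """Approximate the EER by sweeping a decision threshold over the scores.

    labels: 1 for the positive class (assumed here: patient), 0 for negative.
    scores: higher values mean "more likely positive".
    Returns the mean of FPR and FNR at the threshold where they are closest.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")

    best_gap, eer = None, None
    # Candidate thresholds: every observed score, plus one above the maximum.
    for t in sorted(set(scores)) + [max(scores) + 1.0]:
        fnr = sum(s < t for s in positives) / len(positives)   # positives rejected
        fpr = sum(s >= t for s in negatives) / len(negatives)  # negatives accepted
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (fpr + fnr) / 2
    return eer


# Toy data, invented for illustration (not taken from the study).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
toy_eer = equal_error_rate(toy_labels, toy_scores)
```

With finite data the two error rates rarely coincide exactly, so this sketch takes their mean at the closest crossing. Other conventions, such as interpolating along the ROC curve, can give slightly different values.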
Place, publisher, year, edition, pages
Heidelberg: Springer Berlin/Heidelberg, 2016. Vol. 9811, p. 328-337
Keywords [en]
Parkinson’s disease, Audio signal processing, OpenSMILE, Essentia, MPEG-7, jAudio, YAAFE, Random forest, Information fusion
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:hh:diva-31872
DOI: 10.1007/978-3-319-43958-7_39
ISI: 000389335600039
Scopus ID: 2-s2.0-84984851988
OAI: oai:DiVA.org:hh-31872
DiVA, id: diva2:955931
Conference
18th International Conference, SPECOM 2016, Budapest, Hungary, August 23-27, 2016
Note
Funding: Research Council of Lithuania (No. MIP-075/2015)
2016-08-27 2016-08-27 2018-01-10 Bibliographically approved