Exploring similarity-based classification of larynx disorders from human voice
Department of Electrical & Control Equipment, Kaunas University of Technology, Kaunas, Lithuania.
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab). ORCID iD: 0000-0003-2185-8973
Department of Electrical & Control Equipment, Kaunas University of Technology, Kaunas, Lithuania.
Department of Electrical & Control Equipment, Kaunas University of Technology, Kaunas, Lithuania.
2012 (English). In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 54, no 5, p. 601-610. Article in journal (Refereed). Published
Abstract [en]

This paper investigates the identification of laryngeal disorders from cepstral parameters of the human voice. Mel-frequency cepstral coefficients (MFCCs), extracted from audio recordings of a patient's voice, are further approximated using various strategies (sampling, averaging, and clustering by Gaussian mixture model (GMM)). The effectiveness of similarity-based classification techniques in categorizing such pre-processed data into normal voice, nodular, and diffuse vocal fold lesion classes is explored, and schemes for combining the binary decisions of support vector machines (SVMs) are evaluated. The widely used RBF kernel was compared to several custom kernels: (i) a sequence kernel, defined over a pair of matrices rather than a pair of vectors, which calculates the kernelized principal angle (KPA) between subspaces; (ii) a simple supervector kernel using only the means of a patient's GMM; (iii) two distance kernels, specifically tailored to exploit the covariance matrices of the GMM, which use as similarity metrics the Monte Carlo sampling approximation of the Kullback-Leibler divergence (KL-MCS) and the Kullback-Leibler divergence combined with the Earth mover's distance (KL-EMD). The sequence kernel and the distance kernels both outperformed the popular RBF kernel, but the difference is statistically significant only for the distance kernels. When tested on voice recordings collected from 410 subjects (130 normal voice, 140 diffuse, and 140 nodular vocal fold lesions), the KL-MCS kernel, using GMM with full covariance matrices, and the KL-EMD kernel, using GMM with diagonal covariance matrices, provided the best overall performance. In most cases, SVM reached higher accuracy than least squares SVM, except for common binary classification using distance kernels. The results indicate that combining features modeled with GMM and kernel methods that exploit this information is an interesting fusion of generative (probabilistic) and discriminative (hyperplane) models for similarity-based classification. (C) 2011 Elsevier B.V. All rights reserved.
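
As an illustration of the KL-MCS idea mentioned in the abstract, the following is a minimal sketch of a distance kernel built from a Monte Carlo estimate of the Kullback-Leibler divergence between two full-covariance GMMs. The abstract does not give the paper's exact formulation, so everything here is an assumption: the symmetrization, the exponential form exp(-gamma * D), and the names `gmm_logpdf`, `gmm_sample`, `kl_mc`, `kl_kernel`, `gamma`, and `n_samples` are illustrative only. Each GMM is represented as a tuple `(weights, means, covs)` of NumPy arrays.

```python
import numpy as np

def gmm_logpdf(x, weights, means, covs):
    """Log-density of a full-covariance GMM at each row of x, shape (n, d)."""
    n, d = x.shape
    log_terms = np.empty((len(weights), n))
    for k, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every sample to component k.
        maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
        log_terms[k] = np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    # Log-sum-exp over components for numerical stability.
    m = log_terms.max(axis=0)
    return m + np.log(np.exp(log_terms - m).sum(axis=0))

def gmm_sample(n, weights, means, covs, rng):
    """Draw n samples: pick a component per sample, then sample its Gaussian."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return np.stack([rng.multivariate_normal(means[k], covs[k]) for k in comps])

def kl_mc(p, q, n_samples, seed):
    """Monte Carlo estimate of KL(p || q) using samples drawn from p."""
    rng = np.random.default_rng(seed)
    x = gmm_sample(n_samples, *p, rng)
    return float(np.mean(gmm_logpdf(x, *p) - gmm_logpdf(x, *q)))

def kl_kernel(p, q, gamma=1.0, n_samples=2000, seed=0):
    """Assumed kernel form: exp(-gamma * symmetrized KL estimate)."""
    d = kl_mc(p, q, n_samples, seed) + kl_mc(q, p, n_samples, seed)
    # Clamp at zero: a finite-sample estimate can dip slightly below zero.
    return float(np.exp(-gamma * max(d, 0.0)))
```

A Gram matrix of such values over all pairs of patients could then be passed to an SVM that accepts precomputed kernels. Note that a kernel of this exponential form is not guaranteed to be positive definite, so in practice the Gram matrix may need checking or regularization.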

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2012. Vol. 54, no 5, p. 601-610
Keywords [en]
Laryngeal disorder, Pathological voice, Mel-frequency cepstral coefficients, Sequence kernel, Kullback–Leibler divergence, Earth mover’s distance, GMM, SVM
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hh:diva-15973
DOI: 10.1016/j.specom.2011.04.004
ISI: 000302756600002
Scopus ID: 2-s2.0-84858446124
OAI: oai:DiVA.org:hh-15973
DiVA, id: diva2:436825
Available from: 2011-09-07. Created: 2011-08-25. Last updated: 2018-01-12. Bibliographically approved

Authority records

Verikas, Antanas
