Synergy of lip motion and acoustic features in biometric speech and speaker recognition
Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
2007 (English). In: I.E.E.E. transactions on computers (Print), ISSN 0018-9340, E-ISSN 1557-9956, Vol. 56, no 9, 1169-1175 p. Article in journal (Refereed), Published
Abstract [en]

This paper presents the scheme and evaluation of a robust audio-visual digit-and-speaker-recognition system using lip motion and speech biometrics. Moreover, a liveness verification barrier based on a person's lip movement is added to the system to guard against advanced spoofing attempts such as replayed videos. The acoustic and visual features are integrated at the feature level and evaluated first by a support vector machine for digit and speaker identification and, then, by a Gaussian mixture model for speaker verification. Based on approximately 300 different personal identities, this paper represents, to our knowledge, the first extensive study investigating the added value of lip motion features for speaker and speech-recognition applications. Digit recognition and person-identification and verification experiments are conducted on the publicly available XM2VTS database showing favorable results (speaker verification is 98 percent, speaker identification is 100 percent, and digit identification is 83 percent to 100 percent).

Place, publisher, year, edition, pages
New York: IEEE Press, 2007. Vol. 56, no 9, 1169-1175 p.
Keyword [en]
GMM, SVM, Speech recognition, biometrics, Lip motion, Lip reading, Motion estimation, Normal image flow, normal image velocity, Speaker recognition
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:hh:diva-2058
DOI: 10.1109/TC.2007.1074
ISI: 000248208300003
Scopus ID: 2-s2.0-34548205797
Local ID: 2082/2453
OAI: oai:DiVA.org:hh-2058
DiVA: diva2:239276
Note

©2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Available from: 2008-10-17. Created: 2008-10-17. Last updated: 2012-09-19. Bibliographically approved.

Open Access in DiVA

fulltext (646 kB), 420 downloads
File information
File name: FULLTEXT01.pdf
File size: 646 kB
Checksum (SHA-512): 6ae27fed6be51af50bf1729fa78771399793c4d38c873926c723e055501d57a0c14a44c4f6ebbc8b2c40d420acbb8e26f4576ccc24ee6db5b45133d543708363
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Faraj, Maycel; Bigun, Josef
By organisation
Halmstad Embedded and Intelligent Systems Research (EIS)
Total: 420 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 180 hits