Lip-motion and speech biometrics in person recognition
2006 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]
Biometric identification techniques are frequently used to improve security, e.g. in financial transactions, computer networks and secure critical locations. The purpose of biometric authentication systems is to verify an individual by her biological characteristics, including those that generate characteristic behaviour. It is not only fingerprints that are used for authentication; our lips, eyes, speech, signatures and even facial temperature are now being used to identify us. This presumably increases security since these traits are harder to copy, steal or lose.
This thesis attempts to present an effective scheme to extract discriminative features based on a novel motion estimation algorithm for lip movement. Motion is defined as the distribution of apparent velocities in the changes of brightness patterns in an image. The velocity components of a lip sequence are computed by the well-known 3D structure tensor using 1D processing, in 2D manifolds. Since the velocities are computed without extracting the speaker's lip contours, more robust visual features can be obtained. The velocity estimation is performed in rectangular lip regions, which affords increased computational efficiency.
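As a rough illustration of the structure-tensor idea mentioned above, the following minimal sketch estimates a single velocity for a rectangular block of grey-level frames. It uses the generic textbook formulation (the eigenvector of the 3D structure tensor with the smallest eigenvalue is taken as proportional to (vx, vy, 1)), not the thesis's 1D processing in 2D manifolds. The function name `structure_tensor_velocity` and the block layout `(T, H, W)` are assumptions made for this example.

```python
import numpy as np

def structure_tensor_velocity(volume):
    """Estimate one (vx, vy) velocity for a (T, H, W) grey-level block.

    Generic 3D structure tensor sketch; hypothetical helper, not the
    algorithm proposed in the thesis.
    """
    # Spatiotemporal gradients by finite differences (axis order: t, y, x).
    It, Iy, Ix = np.gradient(np.asarray(volume, dtype=float))

    # Keep only interior samples, where np.gradient uses central differences.
    inner = (slice(1, -1),) * 3
    g = np.stack([Ix[inner].ravel(), Iy[inner].ravel(), It[inner].ravel()])

    # 3x3 structure tensor: sum of gradient outer products over the block.
    J = g @ g.T

    # eigh returns eigenvalues in ascending order, so column 0 is the
    # direction along which brightness varies least.
    _, vecs = np.linalg.eigh(J)
    e = vecs[:, 0]

    # A vanishing temporal component means no finite velocity can be read off.
    if abs(e[2]) < 1e-12:
        return None
    return e[0] / e[2], e[1] / e[2]
```

Calling the function on a block cropped around the mouth region would return one velocity pair per block, or `None` when no finite velocity can be read from the tensor. No lip contour is needed, which matches the motivation given in the abstract, though how the thesis turns such velocities into features is not shown here.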
To investigate the usefulness of the proposed motion features we implement a person authentication system based on lip-movement information with (and without) speech information. It yields a speaker verification rate of 98% with lip and speech information. Comparisons are made with an alternative motion estimation technique and a description of our proposed feature fusion technique is given. Besides its value in authentication, the technique can be used naturally to evaluate the liveness of a speaking person, i.e. to determine whether the biometric data is being captured from a legitimate, live user who is physically present at the point of acquisition, as it can be used in a text-prompted dialog.
Place, publisher, year, edition, pages
Göteborg: Department of Signals and Systems, Chalmers University of Technology, 2006, p. 51
Series
Technical report R, ISSN 1403-266X ; 2006:20
Keywords [en]
Audio-visual recognition, biometrics, Biometric recognition, Speaker verification, Speaker authentication, Person identification, Lip movements, Motion, Structure tensor, Orientation, Optical flow, Hidden Markov Model, Gaussian Markov Model, Lip-motion
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:hh:diva-1977
Local ID: 2082/2372
OAI: oai:DiVA.org:hh-1977
DiVA, id: diva2:239195
Presentation
(English)
2008-09-29 2008-09-29 2025-02-07 Bibliographically approved