FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping
Berge Consulting, Gothenburg, Sweden.
Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR). ORCID iD: 0000-0002-5712-6777
Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR). ORCID iD: 0000-0002-1400-346X
Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR). ORCID iD: 0000-0002-1043-8773
2023 (English). In: Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023, Piscataway: IEEE, 2023, p. 3443-3452. Conference paper, Published paper (Refereed)
Abstract [en]

In this work, we present a new single-stage method for subject-agnostic face swapping and identity transfer, named FaceDancer. We have two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR). The AFFA module is embedded in the decoder and adaptively learns to fuse attribute features and features conditioned on identity information without requiring any additional facial segmentation process. In IFSR, we leverage the intermediate features in an identity encoder to preserve important attributes such as head pose, facial expression, lighting, and occlusion in the target face, while still transferring the identity of the source face with high fidelity. We conduct extensive quantitative and qualitative experiments on various datasets and show that the proposed FaceDancer outperforms other state-of-the-art networks in terms of identity transfer, while having significantly better pose preservation than most of the previous methods. © 2023 IEEE.
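The fusion idea described for AFFA can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that attribute features and identity-conditioned features are fused adaptively, so the sigmoid gate, the function name `gated_fusion`, and the flat-list feature representation are all assumptions made for illustration. In a real decoder the gate logits would come from learned layers; here they are passed in directly.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(attr_feats, id_feats, mask_logits):
    """Blend two feature vectors element-wise with a soft gate.

    attr_feats  : attribute features (hypothetical, e.g. pose, lighting)
    id_feats    : identity-conditioned features (hypothetical)
    mask_logits : per-element gate logits; assumed to be produced by a
                  learned module in practice, supplied by hand here
    """
    if not (len(attr_feats) == len(id_feats) == len(mask_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for a, h, logit in zip(attr_feats, id_feats, mask_logits):
        m = sigmoid(logit)
        # m near 1 keeps the attribute feature, m near 0 keeps the
        # identity-conditioned feature.
        fused.append(m * a + (1.0 - m) * h)
    return fused


# A zero logit gives an even blend of the two inputs.
even_blend = gated_fusion([1.0], [3.0], [0.0])
```

Because the gate is soft and computed per element, a model built this way could keep occluding objects or background from the attribute branch while taking face regions from the identity branch, without an explicit segmentation mask.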

Place, publisher, year, edition, pages
Piscataway: IEEE, 2023. p. 3443-3452
Keywords [en]
Algorithms, Biometrics, and algorithms (including transfer, low-shot, semi-, self-, and un-supervised learning), body pose, face, formulations, gesture, Machine learning architectures
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:hh:diva-48618; DOI: 10.1109/WACV56688.2023.00345; ISI: 000971500203054; Scopus ID: 2-s2.0-85149000603; ISBN: 9781665493468 (print); OAI: oai:DiVA.org:hh-48618; DiVA, id: diva2:1710964
Conference
23rd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2023, Waikoloa, Hawaii, USA, 3-7 January 2023
Available from: 2022-11-15; Created: 2022-11-15; Last updated: 2025-03-18. Bibliographically approved
In thesis
1. Anonymizing Faces without Destroying Information
2024 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Anonymization is a broad term, meaning that personal data, or rather data that identifies a person, is redacted or obscured. In the context of video and image data, the most palpable information is the face. Faces barely change compared to other aspects of a person, such as clothes, and we as people already have a strong sense for recognizing faces. Computers are also adroit at recognizing faces, with facial recognition models being exceptionally powerful at identifying and comparing faces. Therefore, it is generally considered important to obscure the faces in video and images when aiming to keep the data anonymized. Traditionally, this is simply done through blurring or masking. But this destroys useful information such as eye gaze, pose, expression and the fact that it is a face. This is a particular issue, as today our society is data-driven in many aspects. One obvious such aspect is autonomous driving and driver monitoring, where necessary algorithms such as object detectors rely on deep learning to function. Due to the data hunger of deep learning in conjunction with society’s call for privacy and integrity through regulations such as the General Data Protection Regulation (GDPR), anonymization that preserves useful information becomes important.

This Thesis investigates the potential and possible limitations of anonymizing faces without destroying the aforementioned useful information. The base approach to achieve this is through face swapping and face manipulation, where the current research focuses on changing the face (or identity) while keeping the original attribute information, all while being incorporated and consistent in an image and/or video. Specifically, this Thesis will demonstrate how target-oriented and subject-agnostic face swapping methodologies can be utilized for realistic anonymization that preserves attributes. Through this, this Thesis points out several approaches that are: 1) controllable, meaning that the proposed models do not naively change the identity; the kind and magnitude of the identity change are adjustable, and thus tunable to guarantee anonymization. 2) subject-agnostic, meaning that the models can handle any identity. 3) fast, meaning that the models are able to run efficiently, thus having the potential of running in real-time. The end product consists of an anonymizer that achieves state-of-the-art performance on identity transfer, pose retention and expression retention while providing realism.

Apart from identity manipulation, the Thesis demonstrates potential security issues, specifically reconstruction attacks, where a bad-actor model learns convolutional traces/patterns in the anonymized images in such a way that it is able to completely reconstruct the original identity. The bad-actor network is able to do this with simple black-box access to the anonymization model by constructing a pair-wise dataset of unanonymized and anonymized faces. To alleviate this issue, different defense measures that disrupt the traces in the anonymized image were investigated. The main takeaway is that an output that qualitatively looks convincing at hiding an identity does not necessarily do so, which makes robust quantitative evaluations important.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2024. p. 50
Series
Halmstad University Dissertations ; 111
Keywords
Anonymization, Data Privacy, Generative AI, Reconstruction Attacks, Deep Fakes, Facial Recognition, Identity Tracking, Biometrics
National Category
Signal Processing
Identifiers
urn:nbn:se:hh:diva-52892 (URN); 978-91-89587-36-6 (ISBN); 978-91-89587-35-9 (ISBN)
Presentation
2024-04-10, S1078, Halmstad University, Kristian IV:s väg 3, Halmstad, 10:00 (English)
Available from: 2024-03-18; Created: 2024-03-18; Last updated: 2024-03-18. Bibliographically approved
2. Non-Reversible and Attribute Preserving Face De-Identification
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

De-identification, also known as anonymization, is a broad term that refers to the process of redacting or obscuring personal data, or data that identifies an individual. In the context of video and image data de-identification, the most tangible personal information is the face. Faces are considered biometric data and thus change little compared to other aspects of an individual, such as clothing and hairstyle. Humans possess a strong innate ability to recognize faces. Computers are also adept at recognizing faces, and face recognition models are exceptionally powerful at identifying and comparing faces. Consequently, it is widely recognized as crucial to obscure the faces in video and images to ensure the integrity of de-identified data. Conventionally, this has been achieved through blurring or masking techniques. However, these methods destroy data characteristics and thus compromise critical attribute information such as eye gaze, pose, expression and the fact that it is a face. This is a particular problem because our society is data-driven in many ways, and this information is useful for a plethora of functions such as traffic safety. One obvious example is autonomous driving and driver monitoring, where necessary algorithms such as object detectors rely on deep learning to function. Due to the data hunger of deep learning, combined with society's demand for privacy and integrity through regulations such as the General Data Protection Regulation (GDPR), face de-identification that preserves useful information becomes significantly important.

This Thesis investigates the potential and possible limitations of de-identifying faces while preserving the aforementioned useful attribute information. The Thesis is especially focused on the sustainability perspective of de-identification, where the preservation of both integrity and utility of data is important. The baseline approach to achieve this uses methods from the face swapping and face manipulation literature, where the current research focuses on changing the face (or identity) with generative models while keeping the original attribute information as intact as possible, all while being integrated and consistent in an image and/or video. Specifically, this Thesis will demonstrate how generative target-oriented and subject-agnostic face manipulation models, which aim to anonymize facial identities by transforming original faces to resemble specific targets, can be used for realistic de-identification that preserves attributes.

While this Thesis will demonstrate and introduce novel de-identification capabilities, it also addresses and highlights potential vulnerabilities and security issues that arise from naively applying generative target-oriented de-identification models. First, since state-of-the-art face representation models typically restrict the face representation embeddings to a hyper-sphere, maximizing the privacy may lead to trivial identity retrieval matching. Second, transferable adversarial attacks, where adversarial perturbations generated by surrogate identity encoders cause identity leakage in the victim de-identification system. Third, reconstruction attacks, where bad actor models are able to learn and extract enough information from subtle cues left by the de-identification model to consistently reconstruct the original identity.
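The first vulnerability can be illustrated with a toy sketch. It rests on one possible reading of the abstract, stated here as an assumption: if embeddings lie on a unit hyper-sphere and "maximal privacy" means pushing the anonymized embedding as far as possible from the original, the result is the antipodal point, which an attacker can undo by flipping the sign. The gallery, the names, and the helper functions below are hypothetical and are not taken from the thesis.

```python
import math


def normalize(v):
    """Project a vector onto the unit hyper-sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def retrieve(query, gallery):
    """Return the gallery identity most similar to the query embedding."""
    return max(gallery, key=lambda name: cosine(query, gallery[name]))


# Hypothetical 3-D identity embeddings; real face encoders use far more
# dimensions.
gallery = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [-0.3, 0.8, 0.1],
    "carol": [0.1, -0.2, 0.9],
}

original = gallery["alice"]

# Assumed "maximally private" output: the point on the sphere farthest
# from the original, i.e. its antipode (cosine similarity of -1).
anonymized = [-x for x in normalize(original)]

# An attacker who knows the objective negates the embedding and runs an
# ordinary nearest-neighbour search.
recovered = retrieve([-x for x in anonymized], gallery)
```

In this toy setting the lowest possible similarity is as identifying as the highest one, which is one way to see why the type and magnitude of the identity change would need to be controlled rather than simply maximized.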

Through this, this Thesis points out several approaches that are: 1) Controllable, meaning that the proposed models do not naively change the identity. This means that the type and magnitude of identity change are adjustable, and thus tunable to ensure anonymization. 2) Subject-agnostic, meaning that the models can handle any identity or face. 3) Fast, meaning that the models are able to run efficiently, thus having the potential of running in real-time. 4) Non-reversible: this Thesis introduces a novel diffusion-based method to make generative target-oriented models robust against reconstruction attacks. The end product consists of a hybrid generative target-oriented and diffusion de-identification pipeline that achieves state-of-the-art performance on privacy protection as measured by identity retrieval, pose retention, expression retention, gaze retention, and visual fidelity while being robust against reconstruction attacks.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2025. p. 79
Series
Halmstad University Dissertations ; 130
Keywords
Anonymization, Data Privacy, Generative AI, Reconstruction Attacks, Deep Fakes, Facial Recognition, Identity Tracking, Biometrics
National Category
Signal Processing
Identifiers
urn:nbn:se:hh:diva-55652 (URN); 978-91-89587-77-9 (ISBN); 978-91-89587-76-2 (ISBN)
Public defence
2025-04-17, S3030, Kristian IV:s väg 3, Halmstad, 10:00 (English)
Available from: 2025-03-19; Created: 2025-03-18; Last updated: 2025-03-19. Bibliographically approved

Open Access in DiVA

fulltext (14309 kB), 477 downloads
File information
File name: FULLTEXT01.pdf
File size: 14309 kB
Checksum (SHA-512): 796528ab05c30bac9602fb495d64477a91e16561e6e97809bf9f8e9e08a1464126fcabc66cde45c10e412b9767f9d050310c3c204b2a7711828dc60caf453377
Type: fulltext
Mimetype: application/pdf


Authority records

Aksoy, Eren; Alonso-Fernandez, Fernando; Englund, Cristofer
