LMVD: A large-scale multimodal vlog dataset for depression detection in the wild
Xi'an Institute of Posts and Telecommunications, Xi'an, China.
Xi'an Institute of Posts and Telecommunications, Xi'an, China.
Xi'an Institute of Posts and Telecommunications, Xi'an, China.
Xi'an Institute of Posts and Telecommunications, Xi'an, China.
2026 (English). In: Information Fusion, ISSN 1566-2535, E-ISSN 1872-6305, Vol. 126, no B, p. 1-11, article id 103632. Article in journal (Refereed), in press.
Abstract [en]

Depression profoundly impacts multiple dimensions of an individual's life, including personal and social functioning, academic achievement, occupational productivity, and overall quality of life. With recent advancements in affective computing, deep learning technologies have been increasingly adopted to identify patterns indicative of depression. However, due to concerns over participant privacy, data in this domain remain scarce, posing significant challenges for the development of robust discriminative models for depression detection. To address this limitation, we build a Large-scale Multimodal Vlog Dataset (LMVD) for depression recognition in real-world settings. The LMVD dataset comprises 1,823 video samples, totaling approximately 214 h of content, collected from 1,475 participants across four major multimedia platforms: Sina Weibo, Bilibili, TikTok, and YouTube. In addition, we introduce a novel architecture, MDDformer, specifically designed to capture non-verbal behavioral cues associated with depressive states. Extensive experimental evaluations conducted on LMVD demonstrate the superior performance of MDDformer in depression detection tasks. We anticipate that LMVD will become a valuable benchmark resource for the research community, facilitating progress in multimodal, real-world depression recognition. The dataset and source code will be made publicly available at: https://github.com/helang818/LMVD. © 2025 Elsevier B.V., All rights reserved.

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2026. Vol. 126, no B, p. 1-11, article id 103632
Keywords [en]
Deep Learning, Depression Detection, Multimodal, Transformer, Vlog, Behavioral Research, Data Privacy, Human Computer Interaction, Interactive Computer Systems, Large Datasets, Learning Systems, Multimedia Systems, Academic Achievements, Large-scales, Multi-modal, Multiple Dimensions, Overall Quality, Quality Of Life
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-57350
DOI: 10.1016/j.inffus.2025.103632
Scopus ID: 2-s2.0-105014021546
OAI: oai:DiVA.org:hh-57350
DiVA, id: diva2:1998973
Available from: 2025-09-18. Created: 2025-09-18. Last updated: 2025-10-01. Bibliographically approved.

Authority records

Tiwari, Prayag

Search in DiVA

By author/editor
Jiang, Jiewei
Zhang, Shiqing
Tiwari, Prayag
By organisation
School of Information Technology