The Detection of Depression Using Multimodal Models Based on Text and Voice Quality Features


Abstract:

The article presents a proof of concept that depression can be diagnosed automatically from audio recordings of individuals' voices. The DAIC-WOZ database was used as the data source. Audio and textual data were preprocessed and converted into a set of optimized parameters for two models. Deep learning models were used to detect depression from the transcripts of the audio recordings and from voice quality features. We built a word-level text analysis model using Natural Language Processing (NLP) techniques, and a voice quality analysis model along the tense-to-breathy dimension. The text analysis model achieved its best performance with an F1-score of 0.8 (0.42) for non-depressed (depressed) individuals, while the voice quality model scored 0.76 (0.38). The result is two models intended for implementation in a system for the diagnosis of depression.
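The abstract reports F1-scores separately for the non-depressed and depressed classes. As a minimal sketch of how such per-class scores are computed, the Python snippet below derives F1 for each label from true and predicted labels. The function name `per_class_f1` and the toy label lists are illustrative assumptions; they are not taken from the paper, and the numbers they produce are unrelated to the reported results.

```python
def per_class_f1(y_true, y_pred, label):
    """Return the F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    if tp == 0:
        return 0.0  # avoids division by zero when the class is never hit
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 0 = non-depressed, 1 = depressed.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]

f1_non_depressed = per_class_f1(y_true, y_pred, label=0)
f1_depressed = per_class_f1(y_true, y_pred, label=1)
print(f"non-depressed F1: {f1_non_depressed:.2f}, depressed F1: {f1_depressed:.2f}")
```

Reporting the two classes separately matters when the classes are imbalanced, since a model can score well on the majority class while missing many minority-class cases.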
Date of Conference: 26-29 January 2021
Date Added to IEEE Xplore: 09 April 2021
Conference Location: St. Petersburg, Moscow, Russia
