
MSP-AVATAR corpus:

A motion capture database of spontaneous improvisations

The MSP-AVATAR corpus is a motion capture database that explores the role of discourse functions in nonverbal human interactions. The database comprises three sessions of recordings of spontaneous dyadic interactions between six actors, recruited from the UT Dallas art department. The scenarios are designed to elicit different types of discourse-related gestures from the actors.

[Figures: dyadic setting; high-quality recordings]

The MSP-AVATAR corpus is being recorded as part of our NSF project "EAGER: Investigating the Role of Discourse Context in Speech-Driven Facial Animations" (NSF IIS: 1352950), which studies the benefits of using discourse and dialog contextual information to generate believable, human-like behaviors for conversational agents (CAs).

Generating a CA requires a careful analysis of human gestures and speech during human interactions. The MSP-AVATAR corpus is a rich resource for this purpose, since it includes spontaneous interactions targeting several discourse functions. We expect to investigate the effect of context on nonverbal human interaction.

The recordings include audio, video, and motion capture data from the actors. The motion capture data cover the upper-body skeleton and the facial area. The categories of discourse functions are carefully chosen. The contexts considered are contrast, confirmation-negation, question, uncertainty, order, suggest, warn, inform, large-small (reference to size), and deictic.
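
For illustration only, the sketch below shows one way the discourse-function labels and recorded modalities described above could be organized when indexing the recordings. The category and modality names follow the corpus description; the field names, session/recording structure, and actor identifiers are assumptions, not part of the official corpus release.

    # Hypothetical Python sketch for indexing MSP-AVATAR recordings.
    # Label and modality names come from the corpus description; the
    # rest of the structure is illustrative only.
    from dataclasses import dataclass

    DISCOURSE_FUNCTIONS = [
        "contrast", "confirmation-negation", "question", "uncertainty",
        "order", "suggest", "warn", "inform", "large-small", "deictic",
    ]

    MODALITIES = ["audio", "video", "mocap_upper_body", "mocap_face"]

    @dataclass
    class Recording:
        session: int              # one of the three dyadic sessions
        actors: tuple             # pair of actor identifiers (assumed naming)
        scenario: str             # scenario shown on the prompt slide
        discourse_function: str   # one of DISCOURSE_FUNCTIONS

        def __post_init__(self):
            if self.discourse_function not in DISCOURSE_FUNCTIONS:
                raise ValueError(f"Unknown discourse function: {self.discourse_function}")

    # Example: index a recording from the first session eliciting deictic gestures.
    example = Recording(session=1, actors=("A1", "A2"),
                        scenario="giving directions", discourse_function="deictic")
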

Subjects are presented with a slide that describes a scenario and some typical gestures assumed to be associated with the corresponding context. They are told to behave naturally and to use their body language to convey their meaning, incorporating the presented gestures or any other gestures that feel natural. For further information on the corpus, please read:

  1. Najmeh Sadoughi, Yang Liu, and Carlos Busso, "MSP-AVATAR corpus: Motion capture recordings to study the role of discourse functions in the design of intelligent virtual agents," in 1st International Workshop on Understanding Human Activities through 3D Sensors (UHA3DS 2015), Ljubljana, Slovenia, May 2015.

We are currently cleaning the motion capture data for the analysis. We plan to share this corpus with the research community in the future.

Some of our Publications using this Corpus:

  1. Najmeh Sadoughi, Yang Liu, and Carlos Busso, "MSP-AVATAR corpus: Motion capture recordings to study the role of discourse functions in the design of intelligent virtual agents," in 1st International Workshop on Understanding Human Activities through 3D Sensors (UHA3DS 2015), Ljubljana, Slovenia, May 2015.
  2. Najmeh Sadoughi and Carlos Busso, "Speech-driven animation with meaningful behaviors," Speech Communication, vol. 110, pp. 90-100, July 2019.
  3. Najmeh Sadoughi and Carlos Busso, "Retrieving target gestures toward speech driven animation with meaningful behaviors," in International Conference on Multimodal Interaction (ICMI 2015), Seattle, WA, USA, November 2015, pp. 115-122.

This material is based upon work supported by the National Science Foundation under Grant IIS-1352950. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

(c) Copyright. All rights reserved.