Welcome to the Multimodal Signal Processing (MSP) Laboratory
On this website, you will find an overview of the exciting activities happening at the Multimodal Signal Processing (MSP) Laboratory. It also introduces the faculty and students involved in the lab.
We developed an audiovisual large vocabulary ASR (LVASR) system that performs as well as or better than an audio-only ASR, even when the visual features are not very discriminative.
We created a probabilistic gaze-region map using upsampling CNNs, starting from the head pose of the driver.
An elegant approach to derive relative labels from sentence-level annotations: QA-based labels provide reliable information to train preference learning models.
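As an illustration of the general idea only (the names and the margin rule below are hypothetical, not the lab's exact QA formulation), relative labels can be derived from sentence-level scores by keeping only pairs whose score gap exceeds a margin, so ambiguous pairs do not become noisy training labels:

```python
# Illustrative sketch (hypothetical names): derive pairwise preference
# labels from sentence-level emotion-attribute scores. A pair (i, j) is
# kept only when the score gap exceeds a margin, discarding ambiguous
# comparisons instead of turning them into unreliable labels.

def preference_pairs(scores, margin=0.5):
    """Return (i, j) pairs meaning sentence i ranks above sentence j."""
    pairs = []
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if i != j and scores[i] - scores[j] > margin:
                pairs.append((i, j))
    return pairs

# Example: arousal scores on a 1-7 scale for four sentences.
arousal = [2.0, 5.5, 3.1, 5.0]
print(preference_pairs(arousal, margin=1.0))
# → [(1, 0), (1, 2), (2, 0), (3, 0), (3, 2)]
```

The resulting pairs can feed a standard preference learning objective such as RankSVM or a ranking loss in a neural model.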
We built a gaze detection algorithm that does not need calibration or cooperation from the user. Learn more in our paper.
We noticed that higher regularization is needed to detect valence. Here is our study to understand the underlying reasons.
Lip motion is not perfectly synchronized with speech (e.g., anticipatory movements). We compensate for this phase difference with our new AliNN framework.
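To make the alignment problem concrete: AliNN learns this audio-visual synchronization with a neural network, whereas the sketch below is only a simple non-learned baseline that assumes a single constant lag and estimates it by cross-correlating the two feature streams (all names here are illustrative):

```python
# Baseline sketch only: assume one constant audio-visual lag and find it
# by maximizing the circular cross-correlation between the streams.
# AliNN instead learns a frame-level alignment; this is not that model.
import numpy as np

def estimate_lag(audio_feat, visual_feat, max_lag=10):
    """Return the shift (in frames) that best aligns visual to audio."""
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        shifted = np.roll(visual_feat, lag)
        corr = np.dot(audio_feat, shifted)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

# Synthetic check: the visual stream leads the audio by 3 frames,
# mimicking anticipatory lip movements.
rng = np.random.default_rng(0)
audio = rng.standard_normal(200)
visual = np.roll(audio, -3)
print(estimate_lag(audio, visual))  # → 3
```

A constant-lag assumption is clearly too rigid for real speech, which is exactly the motivation for learning the alignment instead.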
There was no large database for speech emotion recognition, until now! Learn more about the MSP-Podcast database in our paper.
This study improves temporal modeling for audiovisual speech activity detection.
Our solution for recording head pose data in real driving recordings. Read more in our paper.
Can we train deep models for speech emotion recognition? Read about our experience in our paper.
If we know that the data is emotional, can we say anything about the reliability of a speaker ID system? Read more in our paper.
We use lip motion features to improve our supervised audiovisual speech activity detection. Read the details in our paper.
Do you believe that disagreement between evaluators is noise? We believe it is valuable information for speech emotion recognition. Read more in our paper.
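One common way this idea appears in practice (shown here as a generic illustration, not necessarily the paper's exact method) is to keep the full distribution of annotator votes as a soft training target instead of collapsing them to a majority-vote label:

```python
# Illustrative only: turn annotator votes into a soft-label distribution
# so inter-evaluator disagreement survives as training information,
# rather than being discarded by a majority vote.
from collections import Counter

EMOTIONS = ["angry", "happy", "neutral", "sad"]  # hypothetical label set

def soft_label(votes):
    """Map a list of annotator votes to a probability distribution."""
    counts = Counter(votes)
    total = len(votes)
    return [counts[e] / total for e in EMOTIONS]

# Five evaluators disagree; a hard label would erase the minority view.
print(soft_label(["happy", "happy", "neutral", "happy", "neutral"]))
# → [0.0, 0.6, 0.4, 0.0]
```

A model trained with cross-entropy against such soft targets learns that this sample is ambiguous between happy and neutral, information a hard label cannot carry.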
Where is the driver looking now? We address this question using a probabilistic framework. Read more in our paper.
About the MSP laboratory
The MSP laboratory is dedicated to advancing technology in the area of human-centered
multimodal signal processing. We look at theoretical problems with practical
applications. Our goal is to develop methods, algorithms, and models to recognize
and synthesize human verbal and non-verbal communication behaviors to improve
human-machine interaction.
Our current research includes:
- Affective computing
- Speech, video and multimodal processing
- Multimodal human-machine interfaces
- Analysis and modeling of verbal and non-verbal interaction
- Human interaction analysis and modeling
- Multimodal speaker identification
- Meeting analysis and intelligent meeting spaces
- Machine learning methods for multimodal processing
The MSP lab was established by Prof. Carlos Busso
in August 2009. He is also the director of the group.
The MSP lab is part of the Erik
Jonsson School of Engineering and Computer Science at The University of Texas at Dallas.
© Copyright. All rights reserved.