Publications - Multimodal Signal Processing (MSP) Laboratory

Publications at MSP lab

Journal Articles

Candice M. Monson, Ariella P. Lenton-Brym, Alexis Collins, Jeanine Lane, Carlos Busso, Jessica Ouyang, Skye Fitzpatrick, and Janice R. Kuo, "Using machine learning to increase access to and engagement with trauma-focused interventions for posttraumatic stress disorder," British Journal of Clinical Psychology, 2024. [soon cited][soon pdf] [bib]
Luz Martinez-Lucas, Wei-Cheng Lin, and Carlos Busso, "Analyzing continuous-time and sentence-level annotations for speech emotion recognition," IEEE Transactions on Affective Computing, vol. to appear, 2024. [pdf] [cited] [bib]
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso, "Selective acoustic feature enhancement for speech emotion recognition with noisy speech," IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 32, pp. 917-929, 2024. [soon cited] [pdf] [bib]
Wei-Cheng Lin and Carlos Busso, "Deep temporal clustering features for speech emotion recognition," Speech Communication, vol. 157, pp. 103027, February 2024. [soon cited] [pdf] [bib]
Andrea Vidal and Carlos Busso, "Multimodal attention for lip synthesis using conditional generative adversarial networks," Speech Communication, vol. 153, pp. 102959, September 2023. [pdf] [cited] [bib]
Wei-Cheng Lin and Carlos Busso, "Chunk-level speech emotion recognition: A general framework of sequence-to-one dynamic temporal modeling," IEEE Transactions on Affective Computing, vol. 14, no. 2, pp. 1215-1227, April-June 2023. [pdf] [cited] [bib]
John Harvill, Seong-Gyun Leem, Mohammed Abdelwahab, Reza Lotfian, and Carlos Busso, "Quantifying emotional similarity in speech," IEEE Transactions on Affective Computing, vol. 14, no. 2, pp. 1376-1390, April-June 2023. [pdf] [cited] [bib]
Yuning Qiu, Teruhisa Misu, Carlos Busso, "Unsupervised scalable multimodal driving anomaly detection," IEEE Transactions on Intelligent Vehicles, vol. 8, no. 4, pp. 3154-3165, April 2023. [pdf] [cited] [bib]
Sumit Jha, Naofal Al-Dhahir, and Carlos Busso, "Driver visual attention estimation using head pose and eye appearance information," IEEE Open Journal of Intelligent Transportation Systems, vol. 4, pp. 216-231, March 2023. [pdf] [cited] [bib]
Wei-Cheng Lin and Carlos Busso, "Sequential modeling by leveraging non-uniform distribution of speech emotion," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1087-1099, February 2023. [pdf] [cited] [bib]
Sumit Jha and Carlos Busso, "Estimation of driver's gaze region from head position and orientation using probabilistic confidence regions," IEEE Transactions on Intelligent Vehicles, vol. 8, no. 1, pp. 59-72, January 2023. (arXiv:2012.12754) [pdf] [ArXiv 2012.12754] [cited] [bib]
Kayla Caughlin, Elvis Duran-Sierra, Shuna Cheng, Rodrigo Cuenca, Beena Ahmed, Jim Ji, Mathias Martinez, Moustafa Al-Khalil, Hussain Al-Enazi, Yi-Shing Lisa Cheng, John Wright, Javier A. Jo, and Carlos Busso, "Aligning small datasets using domain adversarial learning: Applications in automated in vivo oral cancer diagnosis," IEEE Journal of Biomedical and Health Informatics, vol. 27, no. 1, pp. 457-468, January 2023. [pdf] [cited] [bib]
Lucas Goncalves and Carlos Busso, "Robust audiovisual emotion recognition: Aligning modalities, capturing temporal information, and handling missing features," IEEE Transactions on Affective Computing, vol. 13, no. 4, pp. 2156-2170, October-December 2022. [pdf] [cited] [bib]
Kusha Sridhar and Carlos Busso, "Unsupervised personalization of an emotion recognition system: The unique properties of the externalization of valence in speech," IEEE Transactions on Affective Computing, vol. 13, no. 4, pp. 1959-1972, October-December 2022. [pdf] [ArXiv 2201.07876] [cited] [bib]
Sumit Jha, Mohamed F. Marzban, Tiancheng Hu, Mohamed H. Mahmoud, Naofal Al-Dhahir, and Carlos Busso, "The multimodal driver monitoring database: A naturalistic corpus to study driver attention," IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 8, pp. 10736-10752, August 2022. [pdf] [ArXiv 2101.04639] [cited] [bib]
Tiancheng Hu, Sumit Jha, and Carlos Busso, "Temporal head pose estimation from point cloud in naturalistic driving conditions," IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 7, pp. 8063-8076, July 2022. [pdf] [cited] [bib]
Andrea Vidal, Sumit Jha, Shayne Hassler, Theodore Price, and Carlos Busso, "Face detection and grimace scale prediction of white furred mice," Machine Learning with Applications, vol. 8, pp. 100312, June 2022. [pdf] [cited] [bib] [data]
Najmeh Sadoughi and Carlos Busso, "Speech-driven expressive talking lips with conditional sequential generative adversarial networks," IEEE Transactions on Affective Computing, vol. 12, no. 4, pp. 1031-1044, October-December 2021. [pdf] [cited] [ArXiv 1806.00154] [bib]
Reza Lotfian and Carlos Busso, "Over-sampling emotional speech data based on subjective evaluations provided by multiple individuals," IEEE Transactions on Affective Computing, vol. 12, no. 4, pp. 870-882, October-December 2021. [pdf] [cited] [bib]
Chi-Chun Lee, Kusha Sridhar, Jeng-Lin Li, Wei-Cheng Lin, Bo-Hao Su, and Carlos Busso, "Deep representation learning for affective speech signal analysis and processing: Preventing unwanted signal disparities," IEEE Signal Processing Magazine, vol. 38, no. 6, pp. 22-38, November 2021. [pdf] [cited] [bib]
Elvis Duran-Sierra, Shuna Cheng, Rodrigo Cuenca, Beena Ahmed, Jim Ji, Vladislav V. Yakovlev, Mathias Martinez, Moustafa Al-Khalil, Hussain Al-Enazi, Y. S. Lisa Cheng, John Wright, Carlos Busso and Javier A. Jo, "Machine-learning assisted discrimination of precancerous and cancerous from healthy oral tissue based on multispectral autofluorescence lifetime imaging endoscopy," Cancers, vol. 13, no. 19, pp. 1-16, September 2021. [pdf] [cited] [bib]
Kazi Nazmul Haque, Rajib Rana, Jiajun Liu, John H. L. Hansen, Nicholas Cummins, Carlos Busso, and Bjorn W Schuller, "Guided generative adversarial neural network for representation learning and audio generation using fewer labelled audio data," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2575-2590, July 2021. [pdf] [cited] [bib]
Daniel Fulford, Jasmine Mote, Rachel Gonzalez, Samuel Abplanalp, Yuting Zhang, Jarrod Luckenbaugh, Jukka-Pekka Onnela, Carlos Busso, and David Gard, "Smartphone sensing of social interactions in people with and without schizophrenia," Journal of Psychiatric Research, vol. 137, pp. 613-620, May 2021. [pdf] [cited] [bib]
Srinivas Parthasarathy and Carlos Busso, "Predicting emotionally salient regions using qualitative agreement of deep neural network regressors," IEEE Transactions on Affective Computing, vol. 12, no. 2, pp. 402-416, April-June 2021. [pdf] [cited] [bib]
Georgios N. Yannakakis, Roddy Cowie, and Carlos Busso, "The ordinal nature of emotions: An emerging approach," IEEE Transactions on Affective Computing, vol. 12, no. 1, pp. 16-35, January-March 2021. [pdf] [cited] [bib]
Fei Tao and Carlos Busso, "End-to-end audiovisual speech recognition system with multi-task learning," IEEE Transactions on Multimedia, vol. 23, pp. 1-11, January 2021. [pdf] [cited] [bib]
Srinivas Parthasarathy and Carlos Busso, "Semi-supervised speech emotion recognition with ladder networks," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2697-2709, September 2020. [pdf] [cited] [ArXiv 1905.02921] [bib]
Reza Lotfian and Carlos Busso, "Building naturalistic emotionally balanced speech corpus by retrieving emotional speech from existing podcast recordings," IEEE Transactions on Affective Computing, vol. 10, no. 4, pp. 471-483, October-December 2019. [pdf] [cited] [bib]
Andrea Vidal, Jorge F. Silva, and Carlos Busso, "Discriminative features for texture retrieval using wavelet packets," IEEE Access, vol. 7, no. 1, pp. 148882-148896, December 2019. [pdf] [cited] [bib]
Reza Lotfian and Carlos Busso, "Lexical dependent emotion detection using synthetic speech reference," IEEE Access, vol. 7, no. 1, pp. 22071-22085, December 2019. [pdf] [cited] [bib]
Fei Tao and Carlos Busso, "End-to-end audiovisual speech activity detection with bimodal recurrent neural models," Speech Communication, vol. 113, pp. 25-35, October 2019. [pdf] [cited] [ArXiv 1809.04553] [bib]
Najmeh Sadoughi and Carlos Busso, "Speech-driven animation with meaningful behaviors," Speech Communication, vol. 110, pp. 90-100, July 2019. [pdf] [cited] [ArXiv 1708.01640] [bib]
Reza Lotfian and Carlos Busso, "Curriculum learning for speech emotion recognition from crowdsourced labels," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 4, pp. 815-826, April 2019. [pdf] [cited] [ArXiv 1805.10339] [bib]
Mohammed Abdelwahab and Carlos Busso, "Domain adversarial for acoustic emotion recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 12, pp. 2423-2435, December 2018. [pdf] [cited] [ArXiv 1804.07690] [bib]
Fei Tao and Carlos Busso, "Gating neural network for large vocabulary audiovisual speech recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 7, pp. 1286-1298, July 2018. [pdf] [cited] [bib]
Nanxiang Li and Carlos Busso, "Calibration free, user independent gaze estimation with tensor analysis," Image and Vision Computing, vol. 74, pp. 10-20, June 2018. [pdf] [cited] [bib]
Najmeh Sadoughi, Yang Liu, and Carlos Busso, "Meaningful head movements driven by emotional synthetic speech," Speech Communication, vol. 95, pp. 87-99, December 2017. [pdf] [cited] [bib]
John H.L. Hansen, Carlos Busso, Yang Zheng, and Amardeep Sathyanarayana, "Driver Modeling for Detection & Assessment of Distraction: Examples from the UTDrive Testbed," IEEE Signal Processing Magazine, vol. 34, no. 4, pp. 130-142, July 2017. [pdf] [cited] [bib]
Soroosh Mariooryad and Carlos Busso, "The cost of dichotomizing continuous labels for binary classification problems: Deriving a Bayesian-optimal classifier," IEEE Transactions on Affective Computing, vol. 8, no. 1, pp. 67-80, January-March 2017. [pdf] [cited] [bib]
Carlos Busso, Srinivas Parthasarathy, Alec Burmania, Mohammed AbdelWahab, Najmeh Sadoughi, and Emily Mower Provost, "MSP-IMPROV: An acted corpus of dyadic interactions to study emotion perception," IEEE Transactions on Affective Computing, vol. 8, no. 1, pp. 119-130, January-March 2017. [pdf] [cited] [bib]
Srinivas Parthasarathy, Roddy Cowie, and Carlos Busso, "Using agreement on direction of change to build rank-based emotion classifiers," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 11, pp. 2108-2121, November 2016. [pdf] [cited] [bib]
Nanxiang Li and Carlos Busso, "Detecting drivers' mirror-checking actions and its application to maneuver and secondary task recognition," IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 4, pp. 980-992, April 2016. [pdf] [cited] [bib]
Alec Burmania, Srinivas Parthasarathy, and Carlos Busso, "Increasing the reliability of crowdsourcing evaluations using online quality assessment," IEEE Transactions on Affective Computing, vol. 7, no. 4, pp. 374-388, October-December 2016. [pdf] [cited] [bib]
Soroosh Mariooryad and Carlos Busso, "Facial expression recognition in the presence of speech using blind lexical compensation," IEEE Transactions on Affective Computing, vol. 7, no. 4, pp. 346-359, October-December 2016. [pdf] [cited] [bib]
Florian Eyben, Klaus Scherer, Bjorn Schuller, Johan Sundberg, Elisabeth Andre, Carlos Busso, Laurence Devillers, Julien Epps, Petri Laukka, Shrikanth Narayanan, Khiet Truong, "The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing," IEEE Transactions on Affective Computing, vol. 7, no. 2, pp. 190-202, April-June 2016. [pdf] [cited] [bib]
Angeliki Metallinou, Zhaojun Yang, Chi-Chun Lee, Carlos Busso, Sharon Carnicke, and Shrikanth S. Narayanan, "The USC CreativeIT database of multimodal dyadic interactions: From speech and full body motion capture to continuous emotional annotations," Journal of Language Resources and Evaluation, vol. 50, no. 3, pp. 497-521, September 2016. [pdf] [cited] [bib] Data available upon request [link-Data]
Emily Mower Provost, Yuan Shangguan, and Carlos Busso, "UMEME: University of Michigan emotional McGurk effect data set," IEEE Transactions on Affective Computing, vol. 6, no. 4, pp. 395-409, October-December 2015. [pdf] [cited] [bib]
Christian Poellabauer, Nikhil Yadav, Louis Daudet, Sandy Schneider, Carlos Busso, and Patrick Flynn, "Challenges in concussion detection using vocal acoustic biomarkers," IEEE Access, vol. 3, pp. 1143-1160, August 2015. [pdf] [cited] [bib]
Soroosh Mariooryad and Carlos Busso, "Correcting time-continuous emotional labels by modeling the reaction lag of evaluators," IEEE Transactions on Affective Computing, vol. 6, no. 2, pp. 97-108, April-June 2015. [pdf] [cited] [bib] Special Issue Best of ACII
Nanxiang Li and Carlos Busso, "Predicting perceived visual and cognitive distractions of drivers with multimodal features," IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 1, pp. 51-65, February 2015. [pdf] [cited] [bib]
Soroosh Mariooryad and Carlos Busso, "Compensating for speaker or lexical variabilities in speech for emotion recognition," Speech Communication, vol. 57, pp. 1-12, February 2014. [pdf] [cited] [bib]
Juan Pablo Arias, Carlos Busso, and Nestor Becerra Yoma, "Shape-based modeling of the fundamental frequency contour for emotion detection in speech," Computer Speech and Language, vol. 28, no. 1, pp. 278-294, January 2014. [pdf] [cited] [bib]
Carlos Busso, Soroosh Mariooryad, Angeliki Metallinou, and Shrikanth S. Narayanan, "Iterative feature normalization scheme for automatic emotion detection from speech," IEEE Transactions on Affective Computing, vol. 4, no. 4, pp. 386-397, October-December 2013. [pdf] [cited] [bib]
Soroosh Mariooryad and Carlos Busso, "Exploring cross-modality affective reactions for audiovisual emotion recognition," IEEE Transactions on Affective Computing, vol. 4, no. 2, pp. 183-196, April-June 2013. [pdf] [cited] [bib]
Nanxiang Li, Jinesh J. Jain, and Carlos Busso, "Modeling of driver behavior in real world scenarios using multiple noninvasive sensors," IEEE Transactions on Multimedia, vol. 15, no. 5, pp. 1213-1225, August 2013. [pdf] [cited] [bib]
Soroosh Mariooryad and Carlos Busso, "Generating human-like behaviors using joint, speech-driven models for conversational agents," IEEE Transactions on Audio, Speech and Language Processing, vol. 20, no. 8, pp. 2329-2340, October 2012. [pdf] [cited] [bib]
Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Emotion recognition using a hierarchical binary decision tree approach," Speech Communication, vol. 53, no. 9-10, pp. 1162-1171, November-December 2011. [pdf] [cited] [bib]
Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Analysis of emotionally salient aspects of fundamental frequency for emotion detection," IEEE Transactions on Audio, Speech and Language Processing, vol. 17, no. 4, pp. 582-596, May 2009. [pdf] [cited] [bib]
Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower, Samuel Kim, Jeannette Chang, Sungbok Lee, and Shrikanth S. Narayanan, "IEMOCAP: Interactive emotional dyadic motion capture database," Journal of Language Resources and Evaluation, vol. 42, no. 4, pp. 335-359, December 2008. [pdf] [cited] [bib]
Carlos Busso and Shrikanth S. Narayanan, "Interrelation between speech and facial gestures in emotional utterances: a single subject study," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2331-2347, November 2007. [pdf] [cited] [bib]
Carlos Busso, Zhigang Deng, Michael Grimm, Ulrich Neumann, and Shrikanth S. Narayanan, "Rigid head motion in expressive speech animation: Analysis and synthesis," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 3, pp. 1075-1086, March 2007. [pdf] [cited] [bib]
Nestor Becerra Yoma, Carlos Molina, Jorge Silva, and Carlos Busso, "Modeling, estimating, and compensating low-bit rate coding distortion in speech recognition," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 1, pp. 246-255, January 2006. [pdf] [cited] [bib]
Carlos Busso, Zhigang Deng, Ulrich Neumann, and Shrikanth S. Narayanan, "Natural head motion synthesis driven by acoustic prosodic features," Computer Animation and Virtual Worlds, vol. 16, no. 3-4, pp. 283-290, July 2005. [pdf] [cited] [bib] [slides]
Nestor Becerra Yoma, Carlos Busso, and Ismael Soto, "Packet-loss modelling in IP networks with state-duration constraints," Communications, IEE Proceedings, vol. 152, no. 1, pp. 1-5, Feb 2005. [pdf] [cited] [bib]
Nestor Becerra Yoma, Juan Hood, and Carlos Busso, "A real-time protocol for the internet based on the least mean square algorithm," Transactions on Multimedia, IEEE, vol. 6, no. 1, pp. 174-184, Feb 2004. [pdf] [cited] [bib]
Nestor Becerra Yoma, Jorge Silva, Carlos Busso, and Ivan Brito, "Compensating additive noise and CS-CELP distortion in speech recognition using stochastic weighted Viterbi algorithm," Electronics Letters, IEE, vol. 39, no. 4, pp. 409-411, Feb 2003. [pdf] [cited] [bib]

Book chapters

Catherine Pelachaud, Carlos Busso, and Dirk Heylen, "Multimodal behavior modeling for socially interactive agents," in Handbook of Socially Interactive Agents: 20 Years of Research on Intelligent Virtual Agents, Embodied Conversational Agents, and Social Robotics, B. Lugrin, C. Pelachaud, and D. Traum, Eds., vol. to appear. ACM Book, Human-Centered Computing, 2021. [pdf] [cited] [bib]
Sumit Jha and Carlos Busso, "Head pose as an indicator of drivers' visual attention," in Vehicles, Drivers, and Safety, H. Abut, J.H.L. Hansen, G. Schmidt, and K. Takeda, Eds., Intelligent Vehicles and Transportation: Volume 2, pp. 113-132. De Gruyter, May 2020. [link-to-pdf] [cited] [bib]
Nanxiang Li and Carlos Busso, "Driver mirror-checking action detection," in DSP for In-Vehicle Systems and Safety, H. Abut, J.H.L. Hansen, G. Schmidt, K. Takeda, and H. Ko, Eds., Intelligent Vehicles and Transportation. De Gruyter, July 2017. [link-to-pdf] [cited] [bib]
Najmeh Sadoughi and Carlos Busso, "Head motion generation," in Handbook of Human Motion, B. Muller, S.I. Wolf, G.-P. Brueggemann, Z. Deng, A. McIntosh, F. Miller, and W. Scott Selbie, Eds., pp. 1-25. Springer International Publishing, January 2017. [link-to-pdf] [soon cited] [bib]
Chi-Chun Lee, Jangwon Kim, Angeliki Metallinou, Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Speech in Affective Computing," in The Oxford Handbook of Affective Computing, R. Calvo, S. D'Mello, J. Gratch, and A. Kappas, Eds., pp. 170-183. Oxford University Press, New York, NY, USA, December 2014. [link-to-pdf] [cited] [bib]
Nanxiang Li and Carlos Busso, "Using perceptual evaluation to quantify cognitive and visual driver distractions," in Smart Mobile In-Vehicle Systems - Next Generation Advancements, G. Schmidt, H. Abut, K. Takeda, and J. H. L. Hansen, Eds., pp. 183-207. Springer, New York, NY, USA, January 2014. [link-to-pdf] [cited] [bib]
Carlos Busso, Murtaza Bulut, and Shrikanth S. Narayanan, "Toward effective automatic recognition systems of emotion in speech," in Social emotions in nature and artifact: emotions in human and human-computer interaction, S. Marsella and J. Gratch, Eds., pp. 110-127. Oxford University Press, New York, NY, USA, November 2013. [link-to-pdf] [cited] [bib]
Carlos Busso and Jinesh J. Jain, "Advances in multimodal tracking of driver distraction," in DSP for In-Vehicle Systems & Safety, J. Hansen, P. Boyraz, K. Takeda, and H. Abut, Eds., p. In Press. Springer, New York, NY, USA, 2012. [link-to-pdf] [cited] [bib]
Carlos Busso, Murtaza Bulut, Sungbok Lee, and Shrikanth S. Narayanan, "Fundamental frequency analysis for speech emotion processing," in The Role of Prosody in Affective Speech, Sylvie Hancil, Ed., pp. 309-337. Peter Lang Publishing Group, Berlin, Germany, 2009. [link-to-pdf] [cited] [bib]
Carlos Busso, Zhigang Deng, Ulrich Neumann, and Shrikanth S. Narayanan, "Learning expressive human-like head motion sequences from speech," in Data-Driven 3D Facial Animations, Zhigang Deng and Ulrich Neumann, Eds. Surrey, United Kingdom: Springer-Verlag London Ltd, 2007, pp. 113-131. [pdf] [cited] [bib]

Conference Proceedings

Kayla R. Caughlin, Rodrigo Cuenca Martinez, Gabriel P. Tortorelli, Kathleen E. Higgins, Ronald Faram, Javier A. Jo, and Carlos Busso, "Understanding bias in multispectral autofluorescence lifetime imaging: Are models sensitive to oral location?," in IEEE Engineering in Medicine and Biology Society (EMBS 2024), Orlando, FL, USA, July 2024. [soon cited][soon pdf] [bib]
Kayla R. Caughlin, Rodrigo Cuenca Martinez, Gabriel P. Tortorelli, Yi-Shing L. Cheng, Rashmi Hegde, Celeste Abraham, Jacqueline M. Plemons, Ying S. Wang, Victoria Woo, Javier A. Jo, and Carlos Busso, "Beyond dysplasia: Uncovering structure in oral potentially malignant diseases with unsupervised contrastive learning," in IEEE Engineering in Medicine and Biology Society (EMBS 2024), Orlando, FL, USA, July 2024. [soon cited][soon pdf] [bib]
Kun Zhou, Berrak Sisman, Carlos Busso, Bin Ma, and Haizhou Li, "Mixed-EVC: Mixed emotion synthesis and control in voice conversion," in The Speaker and Language Recognition Workshop (Odyssey 2024), Quebec, Canada, June 2024. [soon cited][soon pdf] [bib]
Lucas Goncalves, Ali N. Salman, Abinay Reddy Naini, Laureano Moro-Velazquez, Thomas Thebaud, Paola Garcia, Najim Dehak, Berrak Sisman, and Carlos Busso, "Odyssey 2024 - speech emotion recognition challenge: Dataset, baseline framework, and results," in The Speaker and Language Recognition Workshop (Odyssey 2024), Quebec, Canada, June 2024. [soon cited] [soon pdf] [bib]
Susmitha Gogineni and Carlos Busso, "Driver head pose estimation with multimodal temporal fusion of color and depth modeling networks," in IEEE Intelligent Vehicles Symposium (IV 2024), Jeju Island, Korea, June 2024. [soon cited] [soon pdf] [bib]
Karen Rosero, Ali N. Salman, Berrak Sisman, Rami Hallac, and Carlos Busso, "Data augmentation techniques for enhanced facial landmarks detection in patients with repaired cleft lip and palate," in IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024), Istanbul, Turkey, May 2024. [soon cited][soon pdf] [bib]
Luz Martinez-Lucas and Carlos Busso, "Dynamic speech emotion recognition using a conditional neural process," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), Seoul, Republic of Korea, April 2024, vol. To appear. [soon cited] [pdf] [bib]
Abinay Reddy Naini, Mary A. Kohler, Elizabeth Richerson, Donita Robinson, and Carlos Busso, "Generalization of self-supervised learning-based representations for cross-domain speech emotion recognition," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), Seoul, Republic of Korea, April 2024, vol. To appear. [pdf] [cited] [bib]
Ismail Rasim Ulgen, Zongyang Du, Carlos Busso, Berrak Sisman, "Revealing emotional clusters in speaker embeddings: A contrastive learning strategy for speech emotion recognition," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), Seoul, Republic of Korea, April 2024, vol. To appear. [soon cited] [pdf] [bib]
Abinay Reddy Naini, Shruthi Subramanium, Seong-Gyun Leem, and Carlos Busso, "Combining relative and absolute learning formulations to predict emotional attributes from speech," in IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023), Taipei, Taiwan, December 2023. [soon cited] [pdf] [bib] [poster]
Wei-Cheng Lin, Lucas Goncalves, and Carlos Busso, "Enhancing resilience to missing data in audio-text emotion recognition with multi-scale chunk regularization," in ACM International Conference on Multimodal Interaction (ICMI 2023), Paris, France, October 2023, vol. To appear. [pdf] [cited] [bib] [slides]
Isaac Brooks, Susmitha Gogineni, Sumit Jha, Soumitry J. Ray, Rajesh Narasimha, Naofal Al-Dhahir, and Carlos Busso, "MSP-DISK: Naturalistic and diverse in-vehicle database for joint pose and seat belt detection," in IEEE International Conference on Intelligent Transportation Systems (ITSC 2023), Bilbao, Spain, September 2023. [soon cited] [pdf] [bib] [slides]
Luz Martinez-Lucas, Ali N. Salman, Seong-Gyun Leem, Shreya G. Upadhyay, Chi-Chun Lee, and Carlos Busso, "Analyzing the effect of affective priming on emotional annotations," in International Conference on Affective Computing and Intelligent Interaction (ACII 2023), Cambridge, MA, USA, September 2023. [pdf] [cited] [bib] [slides]
Shreya G. Upadhyay, Woan-Shiuan Chien, Bo-Hao Su, Lucas Goncalves, Ya-Tse Wu, Ali N. Salman, Carlos Busso, and Chi-Chun Lee, "An intelligent infrastructure toward large scale naturalistic affective speech corpora collection," in International Conference on Affective Computing and Intelligent Interaction (ACII 2023), Cambridge, MA, USA, September 2023. [pdf] [cited] [bib]
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso, "Computation and memory efficient noise adaptation of Wav2Vec2.0 for noisy speech emotion recognition with skip connection adapters," in Interspeech 2023, Dublin, Ireland, August 2023, pp. 1888-1892. [pdf] [cited] [bib] [poster]
Abinay Reddy Naini, Ali N. Salman, and Carlos Busso, "Preference learning labels by anchoring on consecutive annotations," in Interspeech 2023, Dublin, Ireland, August 2023, pp. 1898-1902. [soon cited] [pdf] [bib] [poster]
Nicolás Grágeda, Carlos Busso, Eduardo Alvarado, Rodrigo Mahu, and Néstor Becerra Yoma, "Distant speech emotion recognition in an indoor human-robot interaction scenario," in Interspeech 2023, Dublin, Ireland, August 2023, pp. 3657-3661. [soon cited] [pdf] [bib]
Huang-Cheng Chou, Lucas Goncalves, Seong-Gyun Leem, Chi-Chun Lee, Carlos Busso, "The importance of calibration: Rethinking confidence and performance of speech multi-label emotion classifiers," in Interspeech 2023, Dublin, Ireland, August 2023, pp. 641-645. [soon cited] [pdf] [bib] [slides]
Sumit Jha, Isaac Brooks, Soumitry J. Ray, Rajesh Narasimha, Naofal Al-Dhahir and Carlos Busso, "Seatbelt segmentation using synthetic images," in IEEE Intelligent Vehicles Symposium (IV 2023), Anchorage, AK, USA, June 2023, vol. To appear. [pdf] [cited] [bib] [poster]
Yuning Qiu, Teruhisa Misu, and Carlos Busso, "Example-based query to identify cause of driving anomaly with few labeled samples," in IEEE Intelligent Vehicles Symposium (IV 2023), Anchorage, AK, USA, June 2023, vol. To appear. [soon cited] [pdf] [bib] [poster]
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso, "Adapting a self-supervised speech representation for noisy speech emotion recognition by using contrastive teacher-student learning," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes Island, Greece, June 2023, pp. 1-5. [pdf] [cited] [bib] [poster] [slides]
Wei-Cheng Lin and Carlos Busso, "Role of lexical boundary information in chunk-level segmentation for speech emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes Island, Greece, June 2023, pp. 1-5. [pdf] [cited] [bib] [slides] [poster]
Lucas Goncalves and C. Busso, "Learning cross-modal audiovisual representations with ladder networks for emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes Island, Greece, June 2023, pp. 1-5. [pdf] [cited] [bib] [slides] [poster]
Abinay Reddy Naini, Mary A. Kohler, and Carlos Busso, "Unsupervised domain adaptation for preference learning based speech emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes Island, Greece, June 2023, pp. 1-5. [pdf] [cited] [bib] [slides]
Shreya G. Upadhyay, Luz Martinez-Lucas, Bo-Hao Su, Wei-Cheng Lin, Woan-Shiuan Chien, Ya-Tse Wu, William Katz, Carlos Busso, Chi-Chun Lee, "Phonetic anchor-based transfer learning to facilitate unsupervised cross-lingual speech emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes Island, Greece, June 2023, pp. 1-5. [pdf] [cited] [bib]
Ali N. Salman and Carlos Busso, "Privacy preserving personalization for video facial expression recognition using federated learning," in ACM International Conference on Multimodal Interaction (ICMI 2022), Bangalore, India, November 2022. [pdf] [cited] [bib] [slides]
Woan-Shiuan Chien, Shreya Upadhyay, Wei-Cheng Lin, Ya-Tse Wu, Bo-Hao Su, Carlos Busso, Chi-Chun Lee, "Monologue versus conversation: Differences in emotion perception and acoustic expressivity," in International Conference on Affective Computing and Intelligent Interaction (ACII 2022), Nara, Japan, October 2022. [pdf] [cited] [bib]
Yuning Qiu, Teruhisa Misu, Carlos Busso, "Driving anomaly detection using contrastive multiview coding to interpret cause of anomaly," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 2022. [soon cited] [pdf] [bib]
Lucas Goncalves and Carlos Busso, "Improving speech emotion recognition using self-supervised learning with domain-specific audiovisual tasks," in Interspeech 2022, Incheon, South Korea, September 2022. [pdf] [cited] [bib] [Youtube]
Huang-Cheng Chou, Chi-Chun Lee, Carlos Busso, "Exploiting co-occurrence frequency of emotions in perceptual evaluations to train a speech emotion classifier," in Interspeech 2022, Incheon, South Korea, September 2022. [pdf] [cited] [bib]
Lucas Goncalves and Carlos Busso, "AuxFormer: Robust approach to audiovisual emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore, May 2022. [pdf] [cited] [bib]
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso, "Not all features are equal: Selection of robust features for speech emotion recognition in noisy environments," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore, May 2022. [pdf] [cited] [bib]
Yuning Qiu, Carlos Busso, Teruhisa Misu, and Kumar Akash, "Incorporating gaze behavior using joint embedding with scene context for driver takeover detection," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore, May 2022. [pdf] [cited] [bib]
Huang-Cheng Chou, Wei-Cheng Lin, Chi-Chun Lee, Carlos Busso, "Exploiting annotators' typed description of emotion perception to maximize utilization of ratings for speech emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore, May 2022. [pdf] [cited] [bib]
Kayla Caughlin, Elvis Duran-Sierra, Shuna Cheng, Rodrigo Cuenca, Beena Ahmed, Jim Ji, Vladislav V. Yakovlev, Mathias Martinez, Moustafa Al-Khalil, Hussain Al-Enazi, Javier A. Jo, and Carlos Busso, "End-to-end neural network for feature extraction and cancer diagnosis of in vivo fluorescence lifetime images of oral lesions," in International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2021), Guadalajara, Mexico, October-November 2021. [pdf] [cited] [bib]
Kusha Sridhar, Wei-Cheng Lin and C. Busso, "Generative approach using soft-labels to learn uncertainty in predicting emotional attributes," in International Conference on Affective Computing and Intelligent Interaction (ACII 2021), Nara, Japan, September-October 2021. [pdf] [cited] [bib] [slides]
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard and Carlos Busso, "Separation of emotional and reconstruction embeddings on ladder network to improve speech emotion recognition robustness in noisy conditions," in Interspeech 2021, Brno, Czech Republic, August-September 2021. [pdf] [cited] [bib] [slides]
Jarrod Luckenbaugh, Samuel Abplanalp, Rachel Gonzalez, Daniel Fulford, David Gard and Carlos Busso, "Voice activity detection with teacher-student domain emulation," in Interspeech 2021, Brno, Czech Republic, August-September 2021. [pdf] [cited] [bib] [slides]
Wei-Cheng Lin, Kusha Sridhar, and Carlos Busso, "DeepEmoCluster: A semi-supervised framework for latent cluster representation of speech emotions," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), Toronto, ON, Canada, June 2021, pp. 7263-7267. [pdf] [cited] [bib] [slides]
Andrea Vidal, Ali Salman, Wei-Cheng Lin, and Carlos Busso, "MSP-face corpus: A natural audiovisual emotional database," in ACM International Conference on Multimodal Interaction (ICMI 2020), Utrecht, The Netherlands, October 2020, pp. 397-405. [pdf] [cited] [bib] [slides]
Wei-Cheng Lin and Carlos Busso, "An Efficient Temporal Modeling Approach for Speech Emotion Recognition by Mapping Varied Duration Sentences into Fixed Number of Chunks," in Interspeech 2020, Shanghai, China, October 2020, pp. 2322-2326. [pdf] [cited] [bib] [slides]
Kusha Sridhar and Carlos Busso, "Ensemble of students taught by probabilistic teachers to improve speech emotion recognition," in Interspeech 2020, Shanghai, China, October 2020, pp. 516-520. [pdf] [cited] [bib] [slides] [Youtube]
Luz Martinez-Lucas, Mohammed Abdelwahab, and Carlos Busso, "The MSP-conversation corpus," in Interspeech 2020, Shanghai, China, October 2020, pp. 1823-1827. [pdf] [cited] [bib] [slides]
Ali N. Salman and Carlos Busso, "Style extractor for facial expression recognition in the presence of speech," in IEEE International Conference on Image Processing (ICIP 2020), Abu Dhabi, United Arab Emirates (UAE), October 2020, pp. 1806-1810. [pdf] [cited] [bib] [slides]
Yuning Qiu, Teruhisa Misu and Carlos Busso, "Use of triplet loss function to improve driving anomaly detection using conditional generative adversarial network," in Intelligent Transportation Systems Conference (ITSC 2020), Rhodes, Greece, September 2020, pp. 1-7. [pdf] [cited] [bib] [slides]
Tiancheng Hu, Sumit Jha, and Carlos Busso, "Robust driver head pose estimation in naturalistic conditions from point-cloud data," in IEEE Intelligent Vehicles Symposium (IV2020), Las Vegas, NV, USA, October 2020. [pdf] [cited] [bib] [slides]
Ali N. Salman and Carlos Busso, "Dynamic versus static facial expressions in the presence of speech," in IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), Buenos Aires, Argentina, May 2020. [pdf] [cited] [bib] [slides]
Kusha Sridhar and Carlos Busso, "Modeling uncertainty in predicting emotional attributes from spontaneous speech," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020), Barcelona, Spain, May 2020, pp. 8384-8388. [pdf] [cited] [bib] [slides]
Yuning Qiu, Teruhisa Misu and Carlos Busso, "Analysis of the relationship between physiological signals and vehicle maneuvers during a naturalistic driving study," in Intelligent Transportation Systems Conference (ITSC 2019), Auckland, New Zealand, October 2019, pp. 3230-3235. [pdf] [cited] [bib] [slides]
Yuning Qiu, Teruhisa Misu and Carlos Busso, "Driving anomaly detection with conditional generative adversarial network using physiological and CAN-bus data," in ACM International Conference on Multimodal Interaction (ICMI 2019), Suzhou, Jiangsu, China, October 2019, pp. 164-173. [pdf] [cited] [bib] [slides]
Kusha Sridhar and Carlos Busso, "Speech emotion recognition with a reject option," in Interspeech 2019, Graz, Austria, September 2019, pp. 3272-3276. [pdf] [cited] [bib] [poster]
Mohammed Abdelwahab and Carlos Busso, "Active learning for speech emotion recognition using deep neural network," in International Conference on Affective Computing and Intelligent Interaction (ACII 2019), Cambridge, UK, September 2019, pp. 441-447. [pdf] [cited] [bib] [slides]
Michelle Bancroft, Reza Lotfian, John Hansen, and Carlos Busso, "Exploring the Intersection Between Speaker Verification and Emotion Recognition," in International Workshop on Social & Emotion AI for Industry (SEAIxI), Cambridge, UK, September 2019, pp. 337-342. [pdf] [cited] [bib] [slides]
John Harvill, Mohammed Abdelwahab, Reza Lotfian, and Carlos Busso, "Retrieving speech samples with similar emotional content using a triplet loss function," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), Brighton, UK, May 2019, pp. 3792-3796. [pdf] [cited] [bib] [poster]
Sumit Jha and Carlos Busso, "Estimation of gaze region using two dimensional probabilistic maps constructed using convolutional neural networks," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), Brighton, UK, May 2019, pp. 3792-3796. [pdf] [cited] [bib] [poster]
Sumit Jha and Carlos Busso, "Probabilistic estimation of the gaze region of the driver using dense classification," in IEEE International Conference on Intelligent Transportation (ITSC 2018), Maui, HI, USA, November 2018, pp. 697-702. [pdf] [cited] [bib] [slides]
Srinivas Parthasarathy and Carlos Busso, "Ladder networks for emotion recognition: Using unsupervised auxiliary tasks to improve predictions of emotional attributes," in Interspeech 2018, Hyderabad, India, September 2018, pp. 3698-3702. [pdf] [cited] [ArXiv 1804.10816] [bib] [poster]
Fei Tao and Carlos Busso, "Audiovisual speech activity detection with advanced long short-term memory," in Interspeech 2018, Hyderabad, India, September 2018, pp. 1244-1248. [pdf] [cited] [bib] [poster]
Kusha Sridhar, Srinivas Parthasarathy and Carlos Busso, "Role of regularization in the prediction of valence from speech," in Interspeech 2018, Hyderabad, India, September 2018, pp. 941-945. [pdf] [cited] [bib] [slides]
Srinivas Parthasarathy and Carlos Busso, "Preference-learning with qualitative agreement for sentence level emotional annotations," in Interspeech 2018, Hyderabad, India, September 2018, pp. 252-256. [pdf] [cited] [bib] [poster]
Reza Lotfian and Carlos Busso, "Predicting categorical emotions by jointly learning primary and secondary emotions through multitask learning," in Interspeech 2018, Hyderabad, India, September 2018, pp. 951-955. [pdf] [cited] [bib] [slides]
Fei Tao and Carlos Busso, "Aligning audiovisual features for audiovisual speech recognition," in IEEE International Conference on Multimedia and Expo (ICME 2018), San Diego, CA, USA, July 2018, pp. 1-6. [pdf] [cited] [bib] [slides]
Sumit Jha and Carlos Busso, "Fi-Cap: Robust framework to benchmark head pose estimation in challenging environments," in IEEE International Conference on Multimedia and Expo (ICME 2018), San Diego, CA, USA, July 2018, pp. 1-6. [pdf] [cited] [bib] [poster]
Najmeh Sadoughi and Carlos Busso, "Expressive speech-driven lip movements with multitask learning," in IEEE Conference on Automatic Face and Gesture Recognition (FG 2018), Xi'an, China, May 2018, pp. 409-415. [pdf] [cited] [bib] [poster]
Mohammed Abdelwahab and Carlos Busso, "Study of dense network approaches for speech emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), Calgary, AB, Canada, April 2018, pp. 5084-5088. [pdf] [cited] [bib] [poster] [teaser]
Najmeh Sadoughi and Carlos Busso, "Novel realizations of speech-driven head movements with generative adversarial networks," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), Calgary, AB, Canada, April 2018, pp. 6169-6173. [pdf] [cited] [bib] [poster]
Marigona Bokshi, Fei Tao, Carlos Busso, and John H. L. Hansen, "Assessment and classification of singing quality based on audio-visual features," in IEEE Visual Communications and Image Processing (VCIP 2017), St. Petersburg, FL, USA, December 2017, pp. 1-4. [soon cited] [pdf] [bib] [poster]
Reza Lotfian and Carlos Busso, "Formulating emotion perception as a probabilistic model with application to categorical emotion classification," in International Conference on Affective Computing and Intelligent Interaction (ACII 2017), San Antonio, TX, USA, October 2017, pp. 415-420. [pdf] [cited] [bib] [slides]
Srinivas Parthasarathy and Carlos Busso, "Predicting speaker recognition reliability by considering emotional content," in International Conference on Affective Computing and Intelligent Interaction (ACII 2017), San Antonio, TX, USA, October 2017, pp. 434-436. [pdf] [cited] [bib] [poster]
Georgios N. Yannakakis, Roddy Cowie, and Carlos Busso, "The ordinal nature of emotions," in International Conference on Affective Computing and Intelligent Interaction (ACII 2017), San Antonio, TX, USA, October 2017, pp. 248-255. [pdf] [cited] [bib] [slides]
Best Paper at ACII 2017!
Sumit Jha and Carlos Busso, "Probabilistic estimation of the driver's gaze from head orientation and position," in IEEE International Conference on Intelligent Transportation (ITSC), Yokohama, Japan, October 2017, pp. 1630-1635. [pdf] [cited] [bib] [slides]
Sumit Jha and Carlos Busso, "Challenges in head pose estimation of drivers in naturalistic recordings using existing tools," in IEEE International Conference on Intelligent Transportation (ITSC), Yokohama, Japan, October 2017, pp. 1624-1629. [pdf] [cited] [bib] [slides]
Najmeh Sadoughi and Carlos Busso, "Joint learning of speech-driven facial motion with bidirectional long-short term memory," in International Conference on Intelligent Virtual Agents (IVA 2017), J. Beskow, C. Peters, G. Castellano, C. O'Sullivan, I. Leite, S. Kopp, Eds., vol. 10498 of Lecture Notes in Computer Science, pp. 389-402. Springer Berlin Heidelberg, Stockholm, Sweden, August 2017. [pdf] [cited] [bib] [slides]
Fei Tao and Carlos Busso, "Bimodal recurrent neural network for audiovisual voice activity detection," in Interspeech 2017, Stockholm, Sweden, August 2017, pp. 1938-1942. [pdf] [cited] [bib] [poster]
Srinivas Parthasarathy and Carlos Busso, "Jointly predicting arousal, valence and dominance with multi-task learning," in Interspeech 2017, Stockholm, Sweden, August 2017, pp. 1103-1107. [pdf] [cited] [bib] [slides]
Nominated for Best Student Paper at Interspeech 2017!
Alec Burmania and Carlos Busso, "A stepwise analysis of aggregated crowdsourced labels describing multimodal emotional behaviors," in Interspeech 2017, Stockholm, Sweden, August 2017, pp. 152-157. [pdf] [cited] [bib] [slides]
Mohammed Abdelwahab and Carlos Busso, "Incremental adaptation using active learning for acoustic emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), New Orleans, LA, USA, March 2017, pp. 5160-5164. [pdf] [cited] [bib] [poster]
Srinivas Parthasarathy, Reza Lotfian, and Carlos Busso, "Ranking emotional attributes with deep neural networks," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), New Orleans, LA, USA, March 2017, pp. 4995-4999. [pdf] [cited] [bib] [slides]
Mohammed Abdelwahab and Carlos Busso, "Ensemble feature selection for domain adaptation in speech emotion recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), New Orleans, LA, USA, March 2017, pp. 5000-5004. [pdf] [cited] [bib] [slides]
Srinivas Parthasarathy, Chunlei Zhang, John H.L. Hansen, and Carlos Busso, "A study of speaker verification performance with expressive speech," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), New Orleans, LA, USA, March 2017, pp. 5540-5544. [pdf] [cited] [bib] [poster]
Sumit Jha and Carlos Busso, "Analyzing the Relationship Between Head Pose and Gaze to Model Driver Visual Attention," in IEEE Conference on Intelligent Transportation Systems (ITSC 2016), Rio de Janeiro, Brazil, November 2016, pp. 2157-2162. [pdf] [cited] [bib] [slides]
Najmeh Sadoughi and Carlos Busso, "Head motion generation with synthetic speech: a data driven approach," in Interspeech 2016, San Francisco, CA, USA, September 2016, pp. 52-56. [pdf] [cited] [bib] [slides]
Nominated for Best Student Paper at Interspeech 2016!
Fei Tao, Louis Daudet, Christian Poellabauer, Sandra L. Schneider, and Carlos Busso, "A portable automatic PA-TA-KA syllable detection system to derive biomarkers for neurological disorders," in Interspeech 2016, San Francisco, CA, USA, September 2016, pp. 362-366. [pdf] [cited] [bib] [poster]
Fei Tao, John H.L. Hansen, and Carlos Busso, "Improving boundary estimation in audiovisual speech activity detection using Bayesian information criterion," in Interspeech 2016, San Francisco, CA, USA, September 2016, pp. 2130-2134. [pdf] [cited] [bib] [slides]
Reza Lotfian and Carlos Busso, "Retrieving categorical emotions using a probabilistic framework to define preference learning samples," in Interspeech 2016, San Francisco, CA, USA, September 2016, pp. 490-494. [pdf] [cited] [bib] [slides]
Srinivas Parthasarathy and Carlos Busso, "Defining emotionally salient regions using qualitative agreement method," in Interspeech 2016, San Francisco, CA, USA, September 2016, pp. 3598-3602. [pdf] [cited] [bib] [slides]
Reza Lotfian and Carlos Busso, "Practical considerations on the use of preference learning for ranking emotional speech," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, March 2016, pp. 5205-5209. [pdf] [cited] [bib] [slides]
Alec Burmania, Mohammed Abdelwahab, and Carlos Busso, "Tradeoff between quality and quantity of emotional annotations to characterize expressive behaviors," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, March 2016, pp. 5190-5194. [pdf] [cited] [bib] [slides]
Anil Jakkam and Carlos Busso, "A multimodal analysis of synchrony during dyadic interaction using a metric based on sequential pattern mining," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, March 2016, pp. 6085-6089. [pdf] [cited] [bib] [slides]
Taufiq Hasan, Mohammed Abdelwahab, Srinivas Parthasarathy, Carlos Busso, and Yang Liu, "Automatic composition of broadcast news summaries using rank classifiers trained with acoustic and lexical features," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, March 2016, pp. 6080-6084. [pdf] [cited] [bib] [slides]
Najmeh Sadoughi and Carlos Busso, "Retrieving target gestures toward speech driven animation with meaningful behaviors," in International Conference on Multimodal Interaction (ICMI 2015), Seattle, WA, USA, November 2015, pp. 115-122. [pdf] [cited] [bib] [slides]
Asif Iqbal, Carlos Busso, and Nicholas Gans, "Adjacent vehicle collision warning system using image sensor and inertial measurement unit," in International Conference on Multimodal Interaction (ICMI 2015), Seattle, WA, USA, November 2015, pp. 291-298. [pdf] [cited] [bib] [poster]
Fei Tao, John H.L. Hansen, and Carlos Busso, "An unsupervised visual-only voice activity detection approach using temporal orofacial features," in Interspeech 2015, Dresden, Germany, September 2015, pp. 2302-2306. [pdf] [cited] [bib] [slides]
Najmeh Sadoughi, Yang Liu, and Carlos Busso, "MSP-AVATAR corpus: Motion capture recordings to study the role of discourse functions in the design of intelligent virtual agents," in 1st International Workshop on Understanding Human Activities through 3D Sensors (UHA3DS 2015), Ljubljana, Slovenia, May 2015, pp. 1-6. [pdf] [cited] [bib] [slides]
Mohammed Abdelwahab and Carlos Busso, "Supervised domain adaptation for emotion recognition from speech," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2015), Brisbane, Australia, April 2015, pp. 5058-5062. [pdf] [cited] [bib] [poster]
Reza Lotfian and Carlos Busso, "Emotion recognition using synthetic speech as neutral reference," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2015), Brisbane, Australia, April 2015, pp. 4759-4763. [pdf] [cited] [bib] [poster]
Mohammed Abdelwahab and Carlos Busso, "Evaluation of syllable rate estimation in expressive speech and its contribution to emotion recognition," in IEEE Spoken Language Technology Workshop (SLT), South Lake Tahoe, CA, USA, December 2014, pp. 472-477. [pdf] [cited] [bib] [poster]
Najmeh Sadoughi, Yang Liu, and Carlos Busso, "Speech-driven animation constrained by appropriate discourse functions," in International Conference on Multimodal Interaction (ICMI 2014), Istanbul, Turkey, November 2014, pp. 148-155. [pdf] [cited] [bib] [poster]
Nanxiang Li and Carlos Busso, "User-independent gaze estimation by exploiting similarity measures in the eye pair appearance eigenspace," in International Conference on Multimodal Interaction (ICMI 2014), Istanbul, Turkey, November 2014, pp. 335-338. [pdf] [cited] [bib] [slides]
Fei Tao and Carlos Busso, "Lipreading approach for isolated digits recognition under whisper and neutral speech," in Interspeech 2014, Singapore, September 2014, pp. 1154-1158. [pdf] [cited] [bib] [poster]
Soroosh Mariooryad, Reza Lotfian, and Carlos Busso, "Building a naturalistic emotional speech corpus by retrieving expressive behaviors from existing speech corpora," in Interspeech 2014, Singapore, September 2014, pp. 238-242. [pdf] [cited] [bib] [poster]
Nanxiang Li and Carlos Busso, "Evaluating the robustness of an appearance-based gaze estimation method for multimodal interfaces," in International Conference on Multimodal Interaction (ICMI 2013), Sydney, Australia, December 2013, pp. 91-98. [pdf] [cited] [bib] [poster]
Nanxiang Li and Carlos Busso, "Driver mirror-checking action detection using multi-modal signals," in The 6th Biennial Workshop on Digital Signal Processing for In-Vehicle Systems, Seoul, Korea, September-October 2013, pp. 101-108. [pdf] [cited] [bib] [slides]
Nanxiang Li, Amardeep Sathyanarayana, Carlos Busso, and John H.L. Hansen, "Rear-end collision prevention using mobile devices," in The 6th Biennial Workshop on Digital Signal Processing for In-Vehicle Systems, Seoul, Korea, September-October 2013, pp. 36-43. [pdf] [cited] [bib] [poster]
Soroosh Mariooryad and Carlos Busso, "Analysis and compensation of the reaction lag of evaluators in continuous emotional annotations," in Affective Computing and Intelligent Interaction (ACII 2013), Geneva, Switzerland, September 2013, pp. 85-90. [pdf] [cited] [bib] [slides]
Nominated for Best Student Paper at ACII 2013!
Juan Pablo Arias, Carlos Busso, and Nestor Becerra Yoma, "Energy and F0 contour modeling with functional data analysis for emotional speech detection," in Interspeech 2013, Lyon, France, August 2013, pp. 2871-2875. [pdf] [cited] [bib] [poster]
Nanxiang Li and Carlos Busso, "Analysis of facial features of drivers under cognitive and visual distractions," in IEEE International Conference on Multimedia and Expo (ICME 2013), San Jose, CA, USA, July 2013, pp. 1-6. [pdf] [cited] [bib] [slides]
Tam Tran, Soroosh Mariooryad, and Carlos Busso, "Audiovisual corpus to analyze whisper speech," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013), Vancouver, BC, Canada, May 2013, pp. 8101-8105. [pdf] [cited] [bib] [poster]
Soroosh Mariooryad and Carlos Busso, "Feature and model level compensation of lexical content for facial emotion recognition," in IEEE International Conference on Automatic Face and Gesture Recognition (FG 2013), Shanghai, China, April 2013, pp. 1-6. [pdf] [cited] [bib] [slides]
Carlos Busso and Tauhidur Rahman, "Unveiling the Acoustic Properties that Describe the Valence Dimension," in Interspeech 2012, Portland, OR, USA, September 2012, pp. 1179-1182. [pdf] [cited] [bib] [poster]
Soroosh Mariooryad and Carlos Busso, "Factorizing speaker, lexical and emotional variabilities observed in facial expressions," in IEEE International Conference on Image Processing (ICIP 2012), Orlando, FL, USA, September-October 2012, pp. 2605-2608. [pdf] [cited] [bib] [poster]
David Tick, Tauhidur Rahman, Carlos Busso, and Nicholas Gans, "Indoor robotic terrain classification via angular velocity based hierarchical classifier selection," in IEEE International Conference on Robotics and Automation (ICRA 2012), St. Paul, MN, USA, May 2012, pp. 3594-3600. [pdf] [cited] [bib] [slides]
Tauhidur Rahman and Carlos Busso, "A personalized emotion recognition system using an unsupervised feature adaptation scheme," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan, March 2012, pp. 5117-5120. [pdf] [cited] [bib] [poster]
Jinesh J. Jain and Carlos Busso, "Assessment of driver's distraction using perceptual evaluations, self assessments and multimodal feature analysis," in 5th Biennial Workshop on DSP for In-Vehicle Systems, Kiel, Germany, September 2011. [pdf] [cited] [bib] [slides]
Tauhidur Rahman, Soroosh Mariooryad, Shalini Keshavamurthy, Gang Liu, John H.L. Hansen, and Carlos Busso, "Detecting sleepiness by fusing classifiers trained with novel acoustic features," in 12th Annual Conference of the International Speech Communication Association (Interspeech-2011), Florence, Italy, August 2011, pp. 3285-3288. [pdf] [cited] [bib] [slides]
Xing Fan, Carlos Busso, and John H.L. Hansen, "Audio-visual isolated digit recognition for whispered speech," in European Signal Processing Conference (EUSIPCO-2011), Barcelona, Spain, August-September 2011, pp. 1500-1503. [pdf] [cited] [bib] [slides]
Jinesh J. Jain and Carlos Busso, "Analysis of driver behaviors during common tasks using frontal video camera and CAN-Bus information," in IEEE International Conference on Multimedia and Expo (ICME 2011), Barcelona, Spain, July 2011. [pdf] [cited] [bib] [poster] [slides] [Youtube]
Hewlett Packard Best Paper Award at ICME2011!
Carlos Busso, Angeliki Metallinou, and Shrikanth S. Narayanan, "Iterative feature normalization for emotional speech detection," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, May 2011, pp. 5692-5695. [pdf] [cited] [bib] [poster]
Angeliki Metallinou, Chi-Chun Lee, Carlos Busso, Sharon Carnicke, and Shrikanth S. Narayanan, "The USC CreativeIT database: A multimodal database of theatrical improvisation," in Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality (MMC 2010), Valletta, Malta, May 2010. [pdf] [cited] [bib]
Angeliki Metallinou, Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Visual emotion recognition using compact facial representations and viseme information," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, TX, USA, March 2010, pp. 2474-2477. [pdf] [cited] [bib] [poster]
Emily Mower, Angeliki Metallinou, Chi-Chun Lee, Abe Kazemzadeh, Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Interpreting ambiguous emotional expressions," in International Conference on Affective Computing and Intelligent Interaction (ACII 2009), Amsterdam, The Netherlands, September 2009. [pdf] [cited] [bib]
Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Emotion recognition using a hierarchical binary decision tree approach," in Interspeech 2009, Brighton, UK, September 2009, pp. 320-323. [pdf] [cited] [bib] [slides]
Chi-Chun Lee, Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Modeling mutual influence of interlocutor emotion states in dyadic spoken interactions," in Interspeech 2009, Brighton, UK, September 2009, pp. 1983-1986. [pdf] [cited] [bib] [poster]
Carlos Busso and Shrikanth S. Narayanan, "The expression and perception of emotions: Comparing assessments of self versus others," in Interspeech 2008 - Eurospeech, Brisbane, Australia, September 2008, pp. 257-260. [pdf] [cited] [bib] [poster]
Carlos Busso and Shrikanth S. Narayanan, "Scripted dialogs versus improvisation: Lessons learned about emotional elicitation techniques from the IEMOCAP database," in Interspeech 2008 - Eurospeech, Brisbane, Australia, September 2008, pp. 1670-1673. [pdf] [cited] [bib] [poster]
Carlos Busso and Shrikanth S. Narayanan, "Recording audio-visual emotional databases from actors: a closer look," in Second International Workshop on Emotion: Corpora for Research on Emotion and Affect, International conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, May 2008, pp. 17-22. [pdf] [cited] [bib] [slides]
Carlos Busso and Shrikanth S. Narayanan, "Joint analysis of the emotional fingerprint in the face and speech: A single subject study," in International Workshop on Multimedia Signal Processing (MMSP 2007), Chania, Crete, Greece, October 2007, pp. 43-47. [pdf] [cited] [bib] [poster]
Viktor Rozgic, Carlos Busso, Panayiotis G. Georgiou, and Shrikanth S. Narayanan, "Multimodal meeting monitoring: Improvements on speaker tracking and segmentation through a modified mixture particle filter," in International Workshop on Multimedia Signal Processing (MMSP 2007), Chania, Crete, Greece, October 2007, pp. 60-65. [pdf] [cited] [bib]
Carlos Busso, Sungbok Lee, and Shrikanth S. Narayanan, "Using neutral speech models for emotional speech analysis," in Interspeech 2007 - Eurospeech, Antwerp, Belgium, August 2007, pp. 2225-2228. [pdf] [cited] [bib] [poster]
Carlos Busso, Panayiotis G. Georgiou, and Shrikanth S. Narayanan, "Real-time monitoring of participants interaction in a meeting using audio-visual sensors," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), vol. 2, Honolulu, HI, USA, April 2007, pp. 685-688. [pdf] [cited] [bib] [slides]
Carlos Busso and Shrikanth S. Narayanan, "Interplay between linguistic and affective goals in facial expression during emotional utterances," in 7th International Seminar on Speech Production (ISSP 2006), Ubatuba-SP, Brazil, December 2006, pp. 549-556. [pdf] [cited] [bib] [poster]
Murtaza Bulut, Carlos Busso, Serdar Yildirim, Abe Kazemzadeh, Chul Min Lee, Sungbok Lee, and Shrikanth S. Narayanan, "Investigating the role of phoneme-level modifications in emotional speech resynthesis," in 9th European Conference on Speech Communication and Technology (Interspeech-2005 - Eurospeech), Lisbon, Portugal, September 2005, pp. 801-804. [pdf] [cited] [bib]
Carlos Busso, Sergi Hernanz, Chi-Wei Chu, Soon-il Kwon, Sungbok Lee, Panayiotis G. Georgiou, Isaac Cohen, and Shrikanth S. Narayanan, "Smart Room: Participant and speaker localization and identification," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), vol. 2, Philadelphia, PA, USA, March 2005, pp. 1117-1120. [pdf] [cited] [bib] [poster]
Nestor Becerra Yoma, Carlos Busso, Juan Inzunza, and Fernando Huenupan, "Packet-loss modeling with state duration constraints and VoIP based on perceptual quality maximization," in 10th International Conference on Speech and Computer (SPECOM 2005), Patras, Greece, October 2005, pp. 757-760. [pdf] [soon cited] [bib]
Carlos Busso, Zhigang Deng, Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Sungbok Lee, Ulrich Neumann, and Shrikanth S. Narayanan, "Analysis of emotion recognition using facial expressions, speech and multimodal information," in Sixth International Conference on Multimodal Interfaces ICMI 2004. State College, PA: ACM Press, October 2004, pp. 205-211. [pdf] [cited] [bib] [poster] [slides]
ICMI Ten-Year Technical Impact Award!
Zhigang Deng, Carlos Busso, Shrikanth S. Narayanan, and Ulrich Neumann, "Audio-based head motion synthesis for avatar-based telepresence systems," in ACM SIGMM 2004 Workshop on Effective Telepresence (ETP 2004). New York, NY: ACM Press, October 2004, pp. 24-30. [pdf] [cited] [bib]
Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, and Shrikanth S. Narayanan, "Emotion recognition based on phoneme classes," in 8th International Conference on Spoken Language Processing (ICSLP 04), Jeju Island, Korea, October 2004, pp. 889-892. [pdf] [cited] [bib]
Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, and Shrikanth S. Narayanan, "An acoustic study of emotions expressed in speech," in 8th International Conference on Spoken Language Processing (ICSLP 04), Jeju Island, Korea, October 2004, pp. 2193-2196. [pdf] [cited] [bib]
Nestor Becerra Yoma, Juan Hood, and Carlos Busso, "An UDP-based real time protocol for the Internet," in International Telecommunications Symposium (ITS 2002), Natal, Brazil, September 2002. [pdf] [soon cited] [bib]

Abstracts

Karen Rosero, Ali N. Salman, Carlos Busso, and Rami Hallac, "A tailored machine learning approach for cleft lip symmetry analysis," in The American Cleft Palate Craniofacial Association (ACPA 2024), Denver, CO, April 2024. [soon pdf] [soon cited] [bib]
Rodrigo Cuenca, Michael J. Serafino, Gabriel P. Tortorelli, Ronald C. Faram, Kathleen E. Higgins, Sharukh S. Khajotia, Y.S. Lisa Cheng, Jacqueline M. Plemons, Victoria Woo, Carlos Busso, Kayla R. Caughlin, and Javier A. Jo, "Dual-excitation multispectral autofluorescence lifetime endoscopy for clinical label-free metabolic imaging of oral lesions," in Imaging, Therapeutics, and Advanced Technology in Head and Neck Surgery and Otolaryngology 2023, San Francisco, CA, USA, January-February 2023, vol. SPIE PC12354, pp. 12354-8. [soon pdf] [soon cited] [bib]
Elvis Duran, Shuna Cheng, Rodrigo Cuenca, Beena Ahmed, Jim Ji, Vladislav V. Yakovlev, Mathias Martinez, Moustafa Al-Khalil, Hussain Al-Enazi, Y.S. Lisa Cheng, John Wright, Carlos Busso, Javier Jo, "Machine-learning driven automated discrimination of precancerous and cancerous from benign oral lesions using multispectral autofluorescence imaging endoscopy," in Imaging, Therapeutics, and Advanced Technology in Head and Neck Surgery and Otolaryngology 2022, San Francisco, CA, USA, March 2022, vol. SPIE PC11935, p. PC1193507. [soon cited] [pdf] [bib]
Elvis Duran, Shuna Cheng, Rodrigo Cuenca, Beena Ahmed, Jim Ji, Vladislav V. Yakovlev, Mathias Martinez, Moustafa Al-Khalil, Hussain Al-Enazi, Y.S. Lisa Cheng, John Wright, Carlos Busso, Javier Jo, "Computer-aided detection system for automated discrimination of precancerous and cancerous from healthy oral tissue based on multispectral autofluorescence lifetime imaging endoscopy," in Imaging, Therapeutics, and Advanced Technology in Head and Neck Surgery and Otolaryngology 2022, San Francisco, CA, USA, March 2022, vol. SPIE PC11935, p. PC1193506. [soon cited] [pdf] [bib]
Sumit Jha and Carlos Busso, "Analysis of head pose as an indicator of drivers' visual attention," in 7th Biennial Workshop on DSP for In-Vehicle Systems and Safety, Berkeley, CA, USA, October 2015. [pdf] [slides] [cited] [bib]
Serdar Yildirim, Murtaza Bulut, Carlos Busso, Chul Min Lee, Abe Kazemzadeh, Sungbok Lee, and Shrikanth S. Narayanan, "Study of acoustic correlates associate with emotional speech," J. Acoust. Soc. Am., vol. 116, p. 2481, 2004. [pdf] [cited] [bib]
Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Carlos Busso, Abe Kazemzadeh, Sungbok Lee, and Shrikanth S. Narayanan, "Effects of emotion on different phoneme classes," J. Acoust. Soc. Am., vol. 116, p. 2481, 2004. [pdf] [bib]
Murtaza Bulut, Serdar Yildirim, Sungbok Lee, Chul Min Lee, Carlos Busso, Abe Kazemzadeh, and Shrikanth S. Narayanan, "Emotion to emotion speech conversion in phoneme level," J. Acoust. Soc. Am., vol. 116, p. 2481, 2004. [pdf] [bib]

ArXiv Papers

Lucas Goncalves, Seong-Gyun Leem, Wei-Cheng Lin, Berrak Sisman, and Carlos Busso, "Versatile audiovisual learning for handling single and multi modalities in emotion regression and classification tasks," ArXiv e-prints (arXiv:2305.07216), pp. 1-14, May 2023. [pdf] [cited] [bib]
Kun Zhou, Berrak Sisman, Carlos Busso, and Haizhou Li, "Mixed emotion modelling for emotional voice conversion," ArXiv e-prints 2210.13756v2, pp. 1-5, October 2022. [pdf] [cited] [bib]
Yuning Qiu, Teruhisa Misu and Carlos Busso, "Driving anomaly detection using conditional generative adversarial network," ArXiv e-prints 2203.08289, pp. 1-15, March 2022. [pdf] [cited] [bib]
Vidhyasaharan Sethu, Emily Mower Provost, Julien Epps, Carlos Busso, Nicholas Cummins, and Shrikanth Narayanan, "The ambiguous world of emotion representation," ArXiv e-prints 1909.00360, pp. 1-19, September 2019. [pdf] [cited] [bib]
Shih-Fu Chang, Alex Hauptmann, Louis-Philippe Morency, Sameer Antani, Dick Bulterman, Carlos Busso, Joyce Chai, Julia Hirschberg, Ramesh Jain, Ketan Mayer-Patel, Reuven Meth, Raymond Mooney, Klara Nahrstedt, Shri Narayanan, Prem Natarajan, Sharon Oviatt, Balakrishnan Prabhakaran, Arnold Smeulders, Hari Sundaram, Zhengyou Zhang, Michelle Zhou, "Report of 2017 NSF workshop on multimedia challenges, opportunities and research roadmaps," ArXiv e-prints (arXiv:1908.02308), pp. 1-150, August 2019. [pdf] [cited] [bib]

Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.