Abstract
Tracing 20 years of progress in making machines hear our emotions based on speech signal properties.
Index Terms
- Speech emotion recognition: two decades in a nutshell, benchmarks, and ongoing trends