DOI: 10.1145/2856767.2856785
Research Article · Public Access

AutoManner: An Automated Interface for Making Public Speakers Aware of Their Mannerisms

Published: 07 March 2016

ABSTRACT

Many individuals exhibit unconscious body movements, called mannerisms, while speaking. These repetitive movements often distract the audience when they are not relevant to the verbal content. We present an intelligent interface that automatically extracts human gestures using the Microsoft Kinect and makes speakers aware of their mannerisms. We use a sparsity-based algorithm, Shift Invariant Sparse Coding, to automatically extract recurring patterns of body movement. These patterns are displayed in an interface with a subtle question-and-answer-based feedback scheme that draws attention to the speaker's body language. A formal evaluation with 27 participants shows that users became more aware of their body language after using the system. In addition, when independent observers annotated the accuracy of every extracted pattern, the patterns extracted by our algorithm were significantly (p < 0.001) more accurate than random selections. This is strong evidence that the algorithm extracts human-interpretable body movement patterns. An interactive demo of AutoManner is available at http://tinyurl.com/AutoManner.
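To make the extraction step concrete, the following is a minimal, hypothetical sketch of the inference half of shift-invariant sparse coding on a single 1-D time series (e.g., one joint trajectory from the Kinect skeleton). It is not the authors' implementation: the function name sisc_ista and all parameters are illustrative assumptions, the dictionary atoms here are fixed and random rather than learned, and real skeletal data is multi-dimensional. The full method also alternates this inference step with a dictionary-update step that learns the atoms themselves, which this sketch omits.

```python
# Illustrative sketch only -- not the paper's implementation.
# Model: x ~ sum_k conv(a_k, d_k), with sparse activation maps a_k.
# Inference via ISTA (proximal gradient descent with soft-thresholding).
import numpy as np
from scipy.signal import fftconvolve

def sisc_ista(x, atoms, lam=0.1, n_iter=200):
    """Infer sparse activations for fixed atoms (hypothetical helper).

    x     : 1-D signal of length T (e.g., a joint-angle time series)
    atoms : (K, L) array of K temporal patterns, each of length L
    lam   : L1 sparsity weight
    """
    T = len(x)
    K, L = atoms.shape
    acts = np.zeros((K, T - L + 1))                        # one activation map per atom
    step = 1.0 / (K * L * (atoms ** 2).sum(axis=1).max())  # conservative step size
    for _ in range(n_iter):
        recon = sum(fftconvolve(acts[k], atoms[k]) for k in range(K))
        resid = recon - x                                  # grad of 0.5*||recon - x||^2
        for k in range(K):
            # Correlation with atom k == convolution with the reversed atom.
            acts[k] -= step * fftconvolve(resid, atoms[k][::-1], mode="valid")
        # Soft-thresholding: the proximal operator of lam * ||acts||_1.
        acts = np.sign(acts) * np.maximum(np.abs(acts) - step * lam, 0.0)
    return acts

# Toy usage: a signal built from shifted copies of atom 0, plus noise.
rng = np.random.default_rng(0)
atoms = rng.standard_normal((3, 20))
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)
spikes = (rng.random(481) < 0.02).astype(float)
x = fftconvolve(spikes, atoms[0]) + 0.01 * rng.standard_normal(500)
acts = sisc_ista(x, atoms, lam=0.05)
print(np.flatnonzero(acts[0] > 0.5))                       # recovered shift positions
```

Peaks in the recovered activation maps mark where each pattern recurs in the recording; thresholding and grouping those peaks is one plausible way to surface a candidate mannerism for display in such an interface.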


Published in

IUI '16: Proceedings of the 21st International Conference on Intelligent User Interfaces
March 2016, 446 pages
ISBN: 9781450341370
DOI: 10.1145/2856767
Copyright © 2016 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

IUI '16 paper acceptance rate: 49 of 194 submissions (25%). Overall IUI acceptance rate: 746 of 2,811 submissions (27%).
