research-article

Gaze correction for home video conferencing

Published: 01 November 2012

Abstract

Effective communication using current video conferencing systems is severely hindered by the lack of eye contact caused by the disparity between the locations of the subject and the camera. While this problem has been partially solved for high-end expensive video conferencing systems, it has not been convincingly solved for consumer-level setups. We present a gaze correction approach based on a single Kinect sensor that preserves both the integrity and expressiveness of the face as well as the fidelity of the scene as a whole, producing nearly artifact-free imagery. Our method is suitable for mainstream home video conferencing: it uses inexpensive consumer hardware, achieves real-time performance and requires just a simple and short setup. Our approach is based on the observation that for our application it is sufficient to synthesize only the corrected face. Thus we render a gaze-corrected 3D model of the scene and, with the aid of a face tracker, transfer the gaze-corrected facial portion in a seamless manner onto the original image.
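
The last step of the abstract, transferring the corrected facial portion onto the original image, can be illustrated with a minimal sketch. The abstract does not say how the seamless transfer is computed, so this example substitutes plain feathered alpha blending over an elliptical face region. It is not the authors' technique. The function names, the elliptical region, and the `feather` parameter are all hypothetical, and the 3D rendering and face tracking steps are omitted (the two input images are assumed to exist already).

```python
import math


def feathered_ellipse_mask(width, height, cx, cy, rx, ry, feather=0.25):
    """Build a height x width blend mask for an elliptical face region.

    Weights are 1.0 well inside the ellipse, 0.0 on and outside its
    boundary, and fall off linearly across the outer `feather` fraction.
    (Hypothetical helper; the ellipse stands in for a tracked face region.)
    """
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Normalized elliptical distance: 0 at the center, 1 on the boundary.
            d = math.sqrt(((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2)
            weight = (1.0 - d) / feather
            row.append(min(1.0, max(0.0, weight)))
        mask.append(row)
    return mask


def composite(original, corrected, mask):
    """Per-pixel alpha blend: corrected face inside the mask, original elsewhere."""
    return [
        [m * c + (1.0 - m) * o for o, c, m in zip(o_row, c_row, m_row)]
        for o_row, c_row, m_row in zip(original, corrected, mask)
    ]


# Toy grayscale images: the "original" frame is black, the "corrected" render is white.
size = 9
original = [[0.0] * size for _ in range(size)]
corrected = [[1.0] * size for _ in range(size)]
mask = feathered_ellipse_mask(size, size, cx=4, cy=4, rx=4, ry=4, feather=0.5)
result = composite(original, corrected, mask)
```

In this toy setup the center of the region takes the corrected image, the area outside the ellipse keeps the original, and pixels near the boundary are a mix of both. Pixels outside the face region are never modified, which reflects the abstract's point that only the face needs to be synthesized.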



    • Published in

      ACM Transactions on Graphics, Volume 31, Issue 6
      November 2012
      794 pages
      ISSN: 0730-0301
      EISSN: 1557-7368
      DOI: 10.1145/2366145

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

