Dynamic 3D avatar creation from hand-held video input

Authors: Alexandru Eugen Ichim (EPFL), Sofien Bouaziz (EPFL), Mark Pauly (EPFL)

ACM Transactions on Graphics, Volume 34, Issue 4, Article No. 45, pp. 1–14. https://doi.org/10.1145/2766974

Published: 27 July 2015

Abstract

We present a complete pipeline for creating fully rigged, personalized 3D facial avatars from hand-held video. Our system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading. Fine-scale details such as wrinkles are captured separately in normal maps and ambient occlusion maps. From this user- and expression-specific data, we learn a regressor for on-the-fly detail synthesis during animation to enhance the perceptual realism of the avatars. Our system demonstrates that the use of appropriate reconstruction priors yields compelling face rigs even with a minimalistic acquisition system and limited user assistance. This facilitates a range of new applications in computer animation and consumer-level online communication based on personalized avatars. We present realtime application demos to validate our method.
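
The blendshape rig mentioned in the abstract can be illustrated with a small sketch. It assumes the common delta-blendshape formulation, in which an expression is the neutral face plus a weighted sum of per-target offsets; the abstract does not spell out the exact formulation used in the paper, so this is not the authors' implementation. The function name, the mesh representation, and the two-vertex toy shapes are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]
Mesh = List[Vertex]


def evaluate_blendshapes(
    neutral: Mesh, targets: Sequence[Mesh], weights: Sequence[float]
) -> Mesh:
    """Return neutral + sum_i w_i * (target_i - neutral), vertex by vertex.

    Assumption: a delta-blendshape model in which every target shares the
    vertex count and ordering of the neutral mesh.
    """
    if len(targets) != len(weights):
        raise ValueError("need exactly one weight per blendshape target")
    for target in targets:
        if len(target) != len(neutral):
            raise ValueError("targets must share the neutral mesh topology")

    result: Mesh = []
    for v_idx, (bx, by, bz) in enumerate(neutral):
        x, y, z = bx, by, bz
        for target, w in zip(targets, weights):
            tx, ty, tz = target[v_idx]
            # Add the weighted offset of this target from the neutral pose.
            x += w * (tx - bx)
            y += w * (ty - by)
            z += w * (tz - bz)
        result.append((x, y, z))
    return result


# Hypothetical toy data: a two-vertex "face" with two expression targets.
NEUTRAL: Mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
SMILE: Mesh = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
JAW_OPEN: Mesh = [(0.0, 0.0, 0.0), (1.0, -2.0, 0.0)]

if __name__ == "__main__":
    print(evaluate_blendshapes(NEUTRAL, [SMILE, JAW_OPEN], [0.5, 0.25]))
```

With all weights at zero the function returns the neutral mesh unchanged. Animating such a rig amounts to varying the weight vector over time, which is why a tracker only needs to estimate a small number of weights per frame rather than every vertex position.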

Supplemental Material

- a45.mp4 (mp4, 20.4 MB)
- a45-ichim.zip (zip, 375.8 MB): supplemental files

Index Terms

Dynamic 3D avatar creation from hand-held video input
1. Computing methodologies
  1. Computer graphics
    1. Animation

Published in

ACM Transactions on Graphics, Volume 34, Issue 4
August 2015
1307 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2809654

Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 27 July 2015
Published in TOG Volume 34, Issue 4

Author Tags
3D avatar creation
blendshapes
face animation
rigging
Qualifiers
- research-article

Article Metrics
- Total citations: 183
- Total downloads: 1,875
- Downloads (last 12 months): 147
- Downloads (last 6 weeks): 13