Abstract
We present a complete pipeline for creating fully rigged, personalized 3D facial avatars from hand-held video. Our system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading. Fine-scale details such as wrinkles are captured separately in normal maps and ambient occlusion maps. From this user- and expression-specific data, we learn a regressor for on-the-fly detail synthesis during animation to enhance the perceptual realism of the avatars. Our system demonstrates that the use of appropriate reconstruction priors yields compelling face rigs even with a minimalistic acquisition system and limited user assistance. This facilitates a range of new applications in computer animation and consumer-level online communication based on personalized avatars. We present realtime application demos to validate our method.
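As an illustration of the blendshape representation the abstract refers to, the sketch below evaluates a rig in the common delta formulation, where each vertex is the neutral position plus a weighted sum of per-target offsets. This is an assumption made for illustration only: the abstract does not state the exact parameterization, and the function name, mesh layout, and toy data are all hypothetical.

```python
from typing import List, Sequence

Vertex = List[float]   # [x, y, z]
Mesh = List[Vertex]    # one entry per vertex


def evaluate_blendshapes(neutral: Mesh,
                         targets: Sequence[Mesh],
                         weights: Sequence[float]) -> Mesh:
    """Return neutral + sum_i w_i * (target_i - neutral), per vertex.

    Assumed delta-blendshape formulation; not taken from the paper.
    """
    if len(targets) != len(weights):
        raise ValueError("need exactly one weight per blendshape target")
    # Copy the neutral mesh so the input is left untouched.
    result = [list(v) for v in neutral]
    for target, w in zip(targets, weights):
        if len(target) != len(neutral):
            raise ValueError("target and neutral must share vertex count")
        for vi, (tv, nv) in enumerate(zip(target, neutral)):
            for c in range(3):
                result[vi][c] += w * (tv[c] - nv[c])
    return result


# Hypothetical two-vertex "face" with two expression targets.
neutral = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
smile = [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
jaw_open = [[0.0, 0.0, 0.0], [1.0, -2.0, 0.0]]

posed = evaluate_blendshapes(neutral, [smile, jaw_open], [0.5, 0.25])
```

In a personalized rig of this kind, the targets would be adapted to the user and the weights estimated per frame by tracking; here both are fixed by hand purely to show the evaluation step.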
Index Terms
- Dynamic 3D avatar creation from hand-held video input