research-article

High-quality streamable free-viewpoint video

Authors:
Alvaro Collet, Ming Chuang, Pat Sweeney, Don Gillett, Dennis Evseev, David Calabrese, Hugues Hoppe, Adam Kirk, Steve Sullivan

Microsoft Corporation

ACM Transactions on Graphics, Volume 34, Issue 4, Article No. 69, pp. 1–13. https://doi.org/10.1145/2766945

Published: 27 July 2015

Abstract

We present the first end-to-end solution to create high-quality free-viewpoint video encoded as a compact data stream. Our system records performances using a dense set of RGB and IR video cameras, generates dynamic textured surfaces, and compresses these to a streamable 3D video format. Four technical advances contribute to high fidelity and robustness: multimodal multi-view stereo fusing RGB, IR, and silhouette information; adaptive meshing guided by automatic detection of perceptually salient areas; mesh tracking to create temporally coherent subsequences; and encoding of tracked textured meshes as an MPEG video stream. Quantitative experiments demonstrate geometric accuracy, texture fidelity, and encoding efficiency. We release several datasets with calibrated inputs and processed results to foster future research.

Supplemental Material

a69.mp4 (mp4, 32.8 MB)
a69-collet.zip (zip, 640.8 MB): supplemental files

Index Terms

High-quality streamable free-viewpoint video
1. Computing methodologies
  1. Computer graphics

Published in

ACM Transactions on Graphics Volume 34, Issue 4
August 2015
1307 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2809654

Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
3D video
MPEG
geometry compression
mesh tracking
multi-view stereo
surface reconstruction