Abstract
This article presents a novel approach to 3D mesh labeling using deep Convolutional Neural Networks (CNNs). Many previous methods for 3D mesh labeling achieve impressive performance by using predefined geometric features. However, such low-level features are heuristically designed for specific meshes, and they often generalize poorly to other types of meshes. To address this problem, we propose to use CNNs to learn a robust mesh representation that can adapt to various 3D meshes. In our approach, CNNs are first trained in a supervised manner on a large pool of classical geometric features. During training, these low-level features are nonlinearly combined and hierarchically compressed to generate a compact and effective representation for each triangle on the mesh. Based on the trained CNNs and the mesh representations, a label vector is initialized for each triangle to indicate its probabilities of belonging to various object parts. Finally, a graph-based mesh-labeling algorithm optimizes the triangle labels by taking label consistency into account. Experimental results on several public benchmarks show that the proposed approach is robust across various 3D meshes and outperforms state-of-the-art approaches, as well as classic learning algorithms, in recognizing mesh labels.
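The final stage described above, refining per-triangle label probabilities so that neighboring triangles tend to agree, can be illustrated with a small sketch. The abstract does not specify the optimizer or the energy, so everything here is an assumption for illustration: the function name `smooth_labels`, the `smoothness` weight, the Potts-style pairwise penalty, and the use of iterated conditional modes (a simple greedy stand-in, not the article's graph-based algorithm). The probabilities are made-up toy values, not CNN outputs.

```python
import math


def smooth_labels(probs, adjacency, smoothness=1.0, max_iters=10):
    """Greedy label refinement (iterated conditional modes) on a triangle graph.

    probs[t][k]  : assumed probability that triangle t belongs to part k
    adjacency[t] : indices of the triangles sharing an edge with triangle t
    smoothness   : hypothetical weight of the penalty for disagreeing neighbors
    """
    eps = 1e-9  # guards against log(0)
    # Unary cost: negative log-probability of each label for each triangle.
    unary = [[-math.log(max(p, eps)) for p in row] for row in probs]
    # Start from the most probable label of every triangle.
    labels = [min(range(len(row)), key=row.__getitem__) for row in unary]

    for _ in range(max_iters):
        changed = False
        for t, costs in enumerate(unary):
            best_label, best_cost = labels[t], float("inf")
            for k, cost in enumerate(costs):
                # Potts-style penalty: one unit per neighbor with a different label.
                disagreements = sum(1 for n in adjacency[t] if labels[n] != k)
                total = cost + smoothness * disagreements
                if total < best_cost:
                    best_label, best_cost = k, total
            if best_label != labels[t]:
                labels[t] = best_label
                changed = True
        if not changed:  # converged: no triangle changed its label
            break
    return labels


# Toy strip of four triangles with two candidate parts (values are illustrative).
probs = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.9, 0.1]]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = smooth_labels(probs, adjacency, smoothness=1.0)  # -> [0, 0, 0, 0]
```

In this toy case the third triangle weakly prefers part 1 on its own, but both of its neighbors are confidently part 0, so the consistency term flips it. Setting `smoothness=0.0` disables the pairwise term and returns the per-triangle argmax `[0, 0, 1, 0]`.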
Supplemental Material
Supplemental movie, appendix, image, and software files are available for 3D Mesh Labeling via Deep Convolutional Neural Networks.