ABSTRACT
We propose a method that integrates two widely available data sources, building footprints from 2D maps and street-level images, to derive two kinds of information that are generally difficult to acquire: building heights and building facade masks in images. Building footprints are elevated in world coordinates and projected onto images. Building heights are estimated by scoring the projected footprints according to how well they align with building features in the images. Footprints with estimated heights can then be converted to simple 3D building models, which are projected back onto the images to identify buildings. Accurate camera projections are critical to this procedure, but camera position errors inherited from external sensors are common and adversely affect the results. We derive a solution that precisely locates cameras on maps using correspondences between image features and building footprints. Experiments on real-world datasets show the promise of our method.
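The elevate-project-score idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a level pinhole camera with known position, heading, and focal length, and it assumes a roofline detector (hypothetical, not shown) has already reported the image rows where the roof edge appears. The function names `project_points` and `estimate_height`, the absolute-difference score, and all parameter values are illustrative assumptions.

```python
import numpy as np

def project_points(points_world, cam_pos, yaw, focal, cx, cy):
    """Project Nx3 world points (x east, y north, z up) to pixels.

    Assumes a level pinhole camera whose optical axis points along `yaw`
    (radians, counter-clockwise from east). Returns (pixels, depths).
    """
    d = np.asarray(points_world, float) - np.asarray(cam_pos, float)
    forward = np.array([np.cos(yaw), np.sin(yaw), 0.0])
    right = np.array([np.sin(yaw), -np.cos(yaw), 0.0])
    up = np.array([0.0, 0.0, 1.0])
    depth = d @ forward
    u = cx + focal * (d @ right) / depth   # image columns grow to the right
    v = cy - focal * (d @ up) / depth      # image rows grow downward
    return np.stack([u, v], axis=1), depth

def estimate_height(edge_xy, observed_rows, candidates,
                    cam_pos, yaw, focal, cx, cy):
    """Return the candidate height whose projected roof edge best matches
    the observed roofline rows (assumed score: mean absolute row error)."""
    edge_xy = np.asarray(edge_xy, float)
    best_h, best_err = None, np.inf
    for h in candidates:
        # Elevate the footprint edge to height h in world coordinates.
        roof = np.column_stack([edge_xy, np.full(len(edge_xy), h)])
        pix, depth = project_points(roof, cam_pos, yaw, focal, cx, cy)
        if np.any(depth <= 0):             # skip points behind the camera
            continue
        err = np.mean(np.abs(pix[:, 1] - observed_rows))
        if err < best_err:
            best_h, best_err = float(h), err
    return best_h

# Illustrative values: a facade edge 20 m in front of a camera 2.5 m high.
cam = dict(cam_pos=[0.0, 0.0, 2.5], yaw=0.0, focal=1000.0, cx=960.0, cy=540.0)
edge = [[20.0, -5.0], [20.0, 5.0]]
rows = np.array([65.0, 65.0])              # hypothetical detector output
height = estimate_height(edge, rows, np.arange(3.0, 30.5, 0.5), **cam)
```

Because the projected roofline depends directly on the camera position, an error in `cam_pos` shifts every candidate's score, which is why the abstract treats camera localization as a prerequisite.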
Combining maps and street level images for building height and facade estimation