Abstract
The multitude of cameras constantly present nowadays redefines the meaning of capturing an event and the meaning of sharing this event with others. The images are frequently uploaded to a common platform, and the image navigation challenge naturally arises. We introduce RingIt: a spectral technique for recovering the spatial order of a set of still images capturing an event taken by a group of people situated around the event. We assume a nearly instantaneous event, such as an interesting moment in a performance captured by the digital cameras and smartphones of the surrounding crowd. The ordering method extracts the K-nearest neighbors (KNN) of each image from a rough all-pairs dissimilarity estimate. The KNN dissimilarities are refined to form a sparse weighted Laplacian, and a spectral analysis then yields a ring angle for each image. The spatial order is recovered by sorting the obtained ring angles. The ordering of the unorganized set of images allows for a sequential display of the captured object. We demonstrate our technique on a number of sets capturing momentary events, where the images were acquired with low-quality consumer cameras by a group of people positioned around the event.
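The pipeline described above (KNN graph, sparse weighted Laplacian, spectral ring angles, sort) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian edge weights, the `sigma` heuristic, the dense eigensolver, and the function name `ring_order` are all assumptions made for illustration, and the paper's refinement of the KNN dissimilarities is not reproduced here.

```python
import numpy as np

def ring_order(dissim, k=4, sigma=None):
    """Order items around a ring from an all-pairs dissimilarity matrix.

    Illustrative sketch only: Gaussian weights and the median-based sigma
    are assumptions, not taken from the paper.
    """
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]

    # K nearest neighbours of each item, excluding the item itself.
    masked = d + np.diag(np.full(n, np.inf))
    nn = np.argsort(masked, axis=1)[:, :k]

    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    if sigma is None:
        sigma = np.median(d[rows, cols])  # assumed bandwidth heuristic

    # Sparse (mostly zero) symmetric weight matrix over KNN edges.
    w = np.zeros((n, n))
    w[rows, cols] = np.exp(-(d[rows, cols] / sigma) ** 2)
    w = np.maximum(w, w.T)

    # Weighted graph Laplacian L = D - W.
    lap = np.diag(w.sum(axis=1)) - w

    # For a connected graph the first eigenvector is constant; the next
    # two give a planar embedding in which a ring appears as a closed loop.
    _, vecs = np.linalg.eigh(lap)
    angles = np.arctan2(vecs[:, 2], vecs[:, 1])

    # Sorting the ring angles yields the spatial order (up to rotation
    # and direction of traversal).
    return np.argsort(angles), angles
```

The returned order is only defined up to a cyclic shift and a reversal, since a ring has no canonical starting point or direction. For large image sets a sparse eigensolver would be the more natural choice than the dense `np.linalg.eigh` used here.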
Supplemental Material
Supplemental movie, appendix, image, and software files for RingIt: Ring-Ordering Casual Photos of a Temporal Event