Abstract
In this article, we present a novel dictionary learning framework designed for compression and sampling of light fields and light field videos. Unlike previous methods, where a single dictionary with one-dimensional atoms is learned, we propose to train a Multidimensional Dictionary Ensemble (MDE). We show that learning an ensemble in the native dimensionality of the data promotes sparsity, hence increasing the compression ratio and sampling efficiency. To make maximum use of correlations within the light field data sets, we also introduce a novel nonlocal pre-clustering approach that constructs an Aggregate MDE (AMDE). The pre-clustering not only improves the image quality but also reduces the training time by an order of magnitude in most cases. The decoding algorithm supports efficient local reconstruction of the compressed data, which enables real-time playback of high-resolution light field videos. Moreover, we discuss the application of AMDE to compressed sensing. We present a theoretical analysis that indicates the conditions required for exact recovery of point-sampled light fields that are sparse under AMDE. The analysis provides guidelines for designing efficient compressive light field cameras. We use various synthetic and natural light field and light field video data sets to demonstrate the utility of our approach in comparison with state-of-the-art learning-based dictionaries, as well as established analytical dictionaries.
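The abstract does not spell out the encoding procedure, so the following is only a minimal sketch of the general idea of coding a data point in its native dimensionality with an ensemble of dictionaries. It assumes 2D patches, an ensemble whose members are pairs of orthonormal matrices (one per dimension), and hard thresholding to a fixed number of coefficients; random orthonormal matrices stand in for trained dictionaries, and every function name here is hypothetical rather than taken from the article or its software.

```python
import numpy as np


def random_orthonormal(n, rng):
    """Random n x n orthonormal matrix (a stand-in for a trained dictionary)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def sparse_code_2d(patch, u1, u2, k):
    """Transform a 2D patch with the pair (u1, u2) and keep k coefficients."""
    coeffs = u1.T @ patch @ u2  # coefficients in the separable 2D basis
    sparse = np.zeros_like(coeffs)
    if k <= 0:
        return sparse
    k = min(k, coeffs.size)
    # Flat (row-major) indices of the k largest-magnitude coefficients.
    keep = np.argpartition(np.abs(coeffs).ravel(), -k)[-k:]
    sparse.flat[keep] = coeffs.flat[keep]
    return sparse


def reconstruct_2d(coeffs, u1, u2):
    """Invert the separable transform."""
    return u1 @ coeffs @ u2.T


def encode_with_ensemble(patch, ensemble, k):
    """Try every dictionary pair; return (index, coeffs, error) of the best."""
    best = None
    for i, (u1, u2) in enumerate(ensemble):
        coeffs = sparse_code_2d(patch, u1, u2, k)
        error = np.linalg.norm(patch - reconstruct_2d(coeffs, u1, u2))
        if best is None or error < best[2]:
            best = (i, coeffs, error)
    return best


def demo(seed=0, size=8):
    """Encode a patch that is exactly 4-sparse under dictionary pair 1."""
    rng = np.random.default_rng(seed)
    ensemble = [
        (random_orthonormal(size, rng), random_orthonormal(size, rng))
        for _ in range(3)
    ]
    true_coeffs = np.zeros((size, size))
    true_coeffs[0, 0] = 3.0
    true_coeffs[1, 3] = -2.0
    true_coeffs[4, 2] = 1.5
    true_coeffs[6, 7] = 1.0
    u1, u2 = ensemble[1]
    patch = reconstruct_2d(true_coeffs, u1, u2)
    return encode_with_ensemble(patch, ensemble, k=4)


index, coeffs, error = demo()
print(index, np.count_nonzero(coeffs), error)
```

Under these assumptions, each patch is stored as the index of the chosen dictionary pair plus a handful of nonzero coefficients, and a single patch can be decoded on its own with `reconstruct_2d`, which is the kind of local reconstruction the abstract refers to. Light fields and light field videos have more dimensions than two, so a real implementation would apply one dictionary per dimension rather than a pair.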
Supplemental Material
Supplemental movie, appendix, image, and software files for "A Unified Framework for Compression and Compressed Sensing of Light Fields and Light Field Videos"