skip to main content
article
Free Access

Variable Sparsity Kernel Learning

Published:01 February 2011Publication History
Skip Abstract Section

Abstract

This paper presents novel algorithms and applications for a particular class of mixed-norm regularization based Multiple Kernel Learning (MKL) formulations. The formulations assume that the given kernels are grouped and employ l1 norm regularization for promoting sparsity within RKHS norms of each group and ls, s≥2 norm regularization for promoting non-sparse combinations across groups. Various sparsity levels in combining the kernels can be achieved by varying the grouping of kernels---hence we name the formulations as Variable Sparsity Kernel Learning (VSKL) formulations. While previous attempts have a non-convex formulation, here we present a convex formulation which admits efficient Mirror-Descent (MD) based solving techniques. The proposed MD based algorithm optimizes over product of simplices and has a computational complexity of O(m2ntot log nmax2) where m is no. training data points, nmax,ntot are the maximum no. kernels in any group, total no. kernels respectively and ε is the error in approximating the objective. A detailed proof of convergence of the algorithm is also presented. Experimental results show that the VSKL formulations are well-suited for multi-modal learning tasks like object categorization. Results also show that the MD based algorithm outperforms state-of-the-art MKL solvers in terms of computational efficiency.

References

  1. F. Bach, G. R. G. Lanckriet, and M. I. Jordan. Multiple kernel learning, conic duality, and the SMO algorithm. In Proceedings of the International Conference on Machine Learning, 2004. Google ScholarGoogle Scholar
  2. A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 31:167-175, 2003. Google ScholarGoogle Scholar
  3. A. Ben-Tal and A. Nemirovski. Lectures on modern convex optimization: Analysis, algorithms and engineering applications. MPS/ SIAM Series on Optimization, 1, 2001. Google ScholarGoogle Scholar
  4. A. Ben-Tal, T. Margalit, and A. Nemirovski. The ordered subsets mirror descent optimization method with applications to tomography. SIAM Journal of Optimization, 12(1):79-108, 2001. Google ScholarGoogle Scholar
  5. A. C. Berg, T. L. Berg, and J. Malik. Shape matching and object recognition using low distortion correspondences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005. Google ScholarGoogle Scholar
  6. D. P. Bertsekas. Nonlinear Programming. Athena Scientific, second edition, 1999.Google ScholarGoogle Scholar
  7. C. Blake and C. Merz. Uci repository of machine learning databases, 1998. URL http://www.ics.uci.edu/~mlearn/MLRepository.html.Google ScholarGoogle Scholar
  8. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. Google ScholarGoogle Scholar
  9. T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 2006. Google ScholarGoogle Scholar
  10. A. D'Aspermont. Smooth optimization with approximate gradient. SIAM Journal on Optimization, 19(3):1171-1183, 2008. Google ScholarGoogle Scholar
  11. L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Generative-Model Based Vision, 2004. Google ScholarGoogle Scholar
  12. G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset. Technical Report CNS-TR-2007-001, California Institute of Technology, 2007.Google ScholarGoogle Scholar
  13. M. Kloft, U. Brefeld, S. Sonnenburg, and A. Zien. Non-sparse regularization and efficient training with multiple kernels. Technical Report UCB/EECS-2010-21, University of California, Berkeley, 2010.Google ScholarGoogle Scholar
  14. G.R.G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M.I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27-72, 2004. Google ScholarGoogle Scholar
  15. S. Lazebnik, C. Schmid, and J. Ponce. Beyond bag of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006. Google ScholarGoogle Scholar
  16. C. A. Micchelli and M. Pontil. Learning the kernel function via regularization. Journal of Machine Learning Research, 6:1099-1125, 2005. Google ScholarGoogle Scholar
  17. J. S. Nath, G. Dinesh, S. Raman, C. Bhattacharyya, A. Ben-Tal, and K. R. Ramakrishnan. On the algorithmics and applications of a mixed-norm regularization based kernel learning formulation. In Advances in Neural Information Processing Systems (NIPS), 2009.Google ScholarGoogle Scholar
  18. M-E. Nilsback and A. Zisserman. A visual vocabulary for flower classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 1447-1454, 2006. Google ScholarGoogle Scholar
  19. M-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In Proceedings of the Sixth Indian Conference on Computer Vision, Graphics & Image Processing, 2008. Google ScholarGoogle Scholar
  20. A. Quattoni, X. Carreras, M. Collins, and T. Darrell. An efficient projection for l 1,¿ regularization. In Proceedings of the International Conference on Machine Learning, 2009. Google ScholarGoogle Scholar
  21. A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet. Simplemkl. Journal of Machine Learning Research, 9:2491-2521, 2008.Google ScholarGoogle Scholar
  22. R. T. Rockafellar. Minimax theorems and conjugate saddle-functions. Mathematica Scandinava, 14:151-173, 1964.Google ScholarGoogle Scholar
  23. R. T. Rockafellar. Convex Analysis. Princeton University Press, 1970.Google ScholarGoogle Scholar
  24. E. Shechtman and M. Irani. Matching local self-similarities across images and videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007.Google ScholarGoogle Scholar
  25. M. Sion. On general minimax theorem. Pacific Journal of Mathematics, 1958.Google ScholarGoogle Scholar
  26. S. Sonnenburg, G. Ratsch, C. Schafer, and B. Scholkopf. Large scale multiple kernel learning. Journal of Machine Learning Research, 7:1531-1565, 2006. Google ScholarGoogle Scholar
  27. M. Szafranski, Y. Grandvalet, and A. Rakotomamonjy. Composite kernel learning. In Proceedings of the International Conference on Machine Learning, 2008. Google ScholarGoogle Scholar
  28. P. Tseng. Convergence of a block coordinate descent method for nondifferentiable minimisation. Journal of Optimization Theory and Applications, 2001. Google ScholarGoogle Scholar
  29. V. Vapnik. Statistical Learning Theory. Wiley-Interscience, 1998.Google ScholarGoogle Scholar
  30. M. Varma and D. Ray. Learning the discriminative power invariance trade-off. In Proceedings of the International Conference on Computer Vision, 2007.Google ScholarGoogle Scholar
  31. A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In Proceedings of the International Conference on Computer Vision, 2009.Google ScholarGoogle Scholar
  32. H. Zhang, A. Berg, M. Maire, and J. Malik. Svm-knn: Discriminative nearest neighbor classication for visual category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2126-2136, 2006. Google ScholarGoogle Scholar

Index Terms

  1. Variable Sparsity Kernel Learning

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image The Journal of Machine Learning Research
            The Journal of Machine Learning Research  Volume 12, Issue
            2/1/2011
            3426 pages
            ISSN:1532-4435
            EISSN:1533-7928
            Issue’s Table of Contents

            Publisher

            JMLR.org

            Publication History

            • Published: 1 February 2011
            Published in jmlr Volume 12, Issue

            Qualifiers

            • article

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader