
Limitations of Learning via Embeddings in Euclidean Half Spaces

Published: 01 March 2003

Abstract

The notion of embedding a class of dichotomies in a class of linear half spaces is central to the support vector machines paradigm. We examine the question of determining the minimal Euclidean dimension and the maximal margin that can be obtained when the embedded class has a finite VC dimension. We show that an overwhelming majority of the family of finite concept classes of any constant VC dimension cannot be embedded in low-dimensional half spaces. (In fact, we show that the Euclidean dimension must be almost as high as the size of the instance space.) We strengthen this result even further by showing that an overwhelming majority of the family of finite concept classes of any constant VC dimension cannot be embedded in half spaces (of arbitrarily high Euclidean dimension) with a large margin. (In fact, the margin cannot be substantially larger than the margin achieved by the trivial embedding.) Furthermore, these bounds are robust in the sense that allowing each image half space to err on a small fraction of the instances does not significantly weaken these dimension and margin bounds. Our results indicate that any universal learning machine that transforms data into Euclidean space and then applies linear (or large-margin) classification cannot enjoy any meaningful generalization guarantees based on either VC dimension or margin considerations. This failure of generalization bounds applies even to classes for which straightforward empirical risk minimization does enjoy meaningful generalization guarantees.
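
The "trivial embedding" mentioned above admits a short self-contained description. The following is our illustrative sketch (the symbols $e_i$ and $w_c$ are our notation, not the paper's) of why every finite class over an instance space of size $n$ embeds with margin $1/\sqrt{n}$: map the $i$-th instance to the $i$-th standard basis vector, and realize a dichotomy $c : \{x_1,\dots,x_n\} \to \{-1,+1\}$ by the unit-norm weight vector whose entries are the labels,

\[
  x_i \mapsto e_i \in \mathbb{R}^n, \qquad
  w_c = \tfrac{1}{\sqrt{n}}\bigl(c(x_1),\dots,c(x_n)\bigr), \qquad
  c(x_i)\,\langle w_c, e_i \rangle = \tfrac{1}{\sqrt{n}} \quad \text{for all } i.
\]

The abstract's second result says that, for most concept classes of constant VC dimension, no embedding achieves a margin substantially larger than this $1/\sqrt{n}$ baseline.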

