Abstract
The notion of embedding a class of dichotomies in a class of linear half spaces is central to the support vector machines paradigm. We examine the question of determining the minimal Euclidean dimension and the maximal margin that can be obtained when the embedded class has a finite VC dimension. We show that an overwhelming majority of the family of finite concept classes of any constant VC dimension cannot be embedded in low-dimensional half spaces. (In fact, we show that the Euclidean dimension must be almost as high as the size of the instance space.) We strengthen this result even further by showing that an overwhelming majority of the family of finite concept classes of any constant VC dimension cannot be embedded in half spaces (of arbitrarily high Euclidean dimension) with a large margin. (In fact, the margin cannot be substantially larger than the margin achieved by the trivial embedding.) Furthermore, these bounds are robust in the sense that allowing each image half space to err on a small fraction of the instances does not imply a significant weakening of these dimension and margin bounds. Our results indicate that any universal learning machine, which transforms data into the Euclidean space and then applies linear (or large margin) classification, cannot enjoy any meaningful generalization guarantees that are based on either VC dimension or margin considerations. This failure of generalization bounds applies even to classes for which straightforward empirical risk minimization does enjoy meaningful generalization guarantees.
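For concreteness, the two notions the abstract bounds can be stated as follows. This is a standard formulation of half-space embeddings and margins (in the spirit of Forster, Schmitt, and Simon); the symbols below are illustrative and not taken from the paper itself.

```latex
% An embedding of a concept class C over a finite instance space X
% into half spaces of R^d is a map phi : X -> R^d together with,
% for every concept c in C, a weight vector w_c in R^d such that
% the half space {z : <w_c, z> > 0} reproduces the labeling of c:
\[
  \operatorname{sign}\langle w_c, \varphi(x)\rangle \;=\; c(x)
  \qquad \text{for all } x \in X,\ c \in C.
\]
% The margin of such an embedding is the smallest normalized distance
% of any image point phi(x) to the hyperplane separating it:
\[
  \operatorname{margin}(\varphi) \;=\;
  \min_{c \in C}\;\min_{x \in X}\;
  \frac{\lvert\langle w_c, \varphi(x)\rangle\rvert}
       {\lVert w_c\rVert\,\lVert \varphi(x)\rVert}.
\]
```

The paper's dimension bound concerns the smallest \(d\) for which such a \(\varphi\) exists; the margin bound concerns the largest achievable \(\operatorname{margin}(\varphi)\) over all dimensions \(d\).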
Index Terms
- Limitations of learning via embeddings in euclidean half spaces
Recommendations
Limitations of Learning via Embeddings in Euclidean Half-Spaces
COLT '01/EuroCOLT '01: Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory

This paper considers the embeddability of general concept classes in Euclidean half spaces. By embedding in half spaces we refer to a mapping from some concept class to half spaces so that the labeling given to points in the instance space is retained. ...
Estimating the Optimal Margins of Embeddings in Euclidean Half Spaces

COLT '01/EuroCOLT '01: Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory

Concept classes can canonically be represented by matrices with entries 1 and -1. We use the singular value decomposition of this matrix to determine the optimal margins of embeddings of the concept classes of singletons and of half intervals in ...