Perceptrons: expanded edition
February 1988
Publisher: MIT Press, 55 Hayward St., Cambridge, MA, United States
ISBN: 978-0-262-63111-2
Published: 26 February 1988
Pages: 292
Abstract

No abstract available.

Cited By

  1. Beniamini G The approximate degree of bipartite perfect matching Proceedings of the 37th Computational Complexity Conference, (1-26)
  2. Kong Y and Saeedi E (2019). The investigation of neural networks performance in side-channel attacks, Artificial Intelligence Review, 52:1, (607-623), Online publication date: 1-Jun-2019.
  3. Hashemi M and Hall M Identifying the Responsible Group for Extreme Acts of Violence Through Pattern Recognition HCI in Business, Government, and Organizations, (594-605)
  4. Teran E, Wang Z and Jiménez D Perceptron learning for reuse prediction The 49th Annual IEEE/ACM International Symposium on Microarchitecture, (1-12)
  5. Harrington K (2016). A circuit basis for morphogenesis, Theoretical Computer Science, 633:C, (28-36), Online publication date: 20-Jun-2016.
  6. Pagh R Locality-sensitive hashing without false negatives Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, (1-9)
  7. Yuksel K and Adali S Prototyping input controller for touch-less interaction with ubiquitous environments Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services, (635-640)
  8. Chen Z and Fu B Approximating multilinear monomial coefficients and maximum multilinear monomials in multivariate polynomials Proceedings of the 4th international conference on Combinatorial optimization and applications - Volume Part I, (309-323)
  9. Sug H (2009). Empirical determination of sample sizes for multi-layer perceptrons by simple RBF networks, WSEAS Transactions on Computers, 8:9, (1504-1513), Online publication date: 1-Sep-2009.
  10. Sug H A pilot sampling method for multi-layer perceptrons Proceedings of the WSEAES 13th international conference on Computers, (629-633)
  11. Latino C, Moreno-Armendáriz M and Hagan M Realizing general MLP networks with minimal FPGA resources Proceedings of the 2009 international joint conference on Neural Networks, (702-709)
  12. Levy S and Gayler R Vector Symbolic Architectures Proceedings of the 2008 conference on Artificial General Intelligence 2008: Proceedings of the First AGI Conference, (414-418)
  13. Sherstov A The pattern matrix method for lower bounds on quantum communication Proceedings of the fortieth annual ACM symposium on Theory of computing, (85-94)
  14. Klivans A and Sherstov A (2007). Unconditional lower bounds for learning intersections of halfspaces, Machine Learning, 69:2-3, (97-114), Online publication date: 1-Dec-2007.
  15. Sherstov A Separating AC0 from depth-2 majority circuits Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, (294-301)
  16. Gori M and Sperduti A (2005). 2005 Special Issue, Neural Networks, 18:8, (1064-1079), Online publication date: 1-Oct-2005.
  17. Case J Automata theory Encyclopedia of Computer Science, (112-117)
  18. Albrecht A and Wong C (2001). Combining the Perceptron Algorithm with Logarithmic Simulated Annealing, Neural Processing Letters, 14:1, (75-83), Online publication date: 1-Aug-2001.
  19. Beals R, Buhrman H, Cleve R, Mosca M and de Wolf R (2001). Quantum lower bounds by polynomials, Journal of the ACM, 48:4, (778-797), Online publication date: 1-Jul-2001.
  20. Gat Y (2001). A Learning Generalization Bound with an Application to Sparse-Representation Classifiers, Machine Learning, 42:3, (233-239), Online publication date: 1-Mar-2001.
  21. Kumar R (2001). A neural net compiler system for hierarchical organization, ACM SIGPLAN Notices, 36:2, (26-36), Online publication date: 1-Feb-2001.
  22. Balkenius C (1999). Dynamics of a Classical Conditioning Model, Autonomous Robots, 7:1, (41-56), Online publication date: 1-Jul-1999.
  23. Nayak A and Wu F The quantum query complexity of approximating the median and related statistics Proceedings of the thirty-first annual ACM symposium on Theory of Computing, (384-393)
  24. Linial N and Sasson O Non-expansive hashing Proceedings of the twenty-eighth annual ACM symposium on Theory of Computing, (509-518)
  25. Glover C, Rao N and Oblow E Hybrid pattern recognition system capable of self-modification Proceedings of the second international conference on Information and knowledge management, (239-244)
  26. Kearns M Efficient noise-tolerant learning from statistical queries Proceedings of the twenty-fifth annual ACM symposium on Theory of Computing, (392-401)
  27. Regier T Learning perceptually-grounded semantics in the L0 project Proceedings of the 29th annual meeting on Association for Computational Linguistics, (138-145)
  28. Montanvert A, Meer P and Rosenfeld A (1991). Hierarchical Image Analysis Using Irregular Tessellations, IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:4, (307-316), Online publication date: 1-Apr-1991.
  29. Yao A Circuits and local computation Proceedings of the twenty-first annual ACM symposium on Theory of computing, (186-196)
  30. Brown D Kraft storage and access for list implementations (Extended Abstract) Proceedings of the twelfth annual ACM symposium on Theory of computing, (100-107)
Contributors
  • Massachusetts Institute of Technology
  • MIT Media Lab

Reviews

Stephen P. Smith

This book is a reprint of the classic 1969 treatise on perceptrons, containing the 1972 handwritten alterations of that text. This expanded edition adds two sections written in 1988: a 9-page prologue and a 34-page epilogue. With the advent of new neural net and connectionist research [1], another look at this classic work is in order.

The book contains results on the fundamental mathematical properties of linear threshold functions, or perceptrons. Let R designate an arbitrary set of points and let Φ = {φ₁, φ₂, ...} be a family of predicates defined on R. A function ψ is a linear threshold function with respect to Φ if there exists a number θ and a set of numbers α(φ), one for each φ in Φ, such that

ψ(X) = ⌈ Σ_{φ∈Φ} α(φ)φ(X) > θ ⌉,

where X ⊆ R and ⌈P⌉ = 1 if predicate P is true and ⌈P⌉ = 0 otherwise. Let L(Φ) denote all the functions that can be defined in this way; these are the perceptron-computable functions of Φ. Minsky and Papert ask the important question: what characterizes the set of patterns over R (which we normally think of as a two-dimensional retina) that perceptrons can recognize? The work in this book answers many important facets of that question. First, the question is meaningless without restrictions on the set of predicates Φ. An example restriction studied by the authors is k-order-restricted perceptrons, in which each member of Φ must depend on no more than k points of R. The order of a threshold function ψ is defined to be the smallest integer k such that we can find a set of k-order-restricted predicates Φ with ψ ∈ L(Φ).

The book is divided into three main sections. Section 1, with four chapters, treats the basic algebraic theory of perceptrons; Section 2, of five chapters, treats their geometric theory; and the final three chapters (Section 3) treat the learning aspects of perceptrons. The algebraic portion serves as basic background. In it, the authors prove the group-invariance theorem, which states that any ψ that is invariant under a group of transformations of R can be represented as a sum in which the coefficients of all group-equivalent predicates have the same numeric value. This theorem is used to show (among other things) that ψ_PARITY(X) = ⌈|X| is an odd number⌉ is of order |R|, where |X| is the cardinality of the set X (a small illustrative sketch of this predicate appears below).

In the second main section, the authors begin by showing that the predicate "connected" is not of finite order. More generally, they show that the only topologically invariant predicates of finite order are functions of the Euler number. They then prove that a number of interesting geometric predicates (circular, rectangular, etc.) are of low order. They prove a stratification theorem that allows parallel perceptrons to simulate sequential decision making. They also study another limitation on the class of predicates Φ, namely diameter-limited perceptrons. Chapter 9 looks at computing geometric predicates using serial devices such as Turing machines.

In the section on learning, the authors begin by examining measures of the complexity of perceptron computation. They show the important fact that some predicates have coefficients that can grow faster than exponentially with |R|.
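To make the threshold-function definition and the parity example above concrete, here is a minimal Python sketch (my own illustration, not from the book or the review). It realizes the parity predicate using one mask predicate φ_S(X) = ⌈S ⊆ X⌉ per subset S of R and coefficients α(S) = -(-2)^|S|, a well-known algebraic construction; note that the masks include S = R itself, consistent with the order-|R| result, and that |α(S)| = 2^|S|, which hints at the coefficient-growth phenomenon just mentioned (though this particular construction grows only exponentially).

```python
from itertools import chain, combinations

def all_subsets(points):
    """All masks S of the retina R; the predicate phi_S(X) = [S subset of X] depends on |S| points."""
    pts = list(points)
    return chain.from_iterable(combinations(pts, k) for k in range(len(pts) + 1))

def parity_coefficients(R):
    """alpha(S) = -(-2)**|S|, so that the sum over S subset of X is +1 when |X| is odd, -1 when even."""
    return {frozenset(S): -((-2) ** len(S)) for S in all_subsets(R)}

def psi(X, alpha, theta=0):
    """psi(X) = [ sum_phi alpha(phi) * phi(X) > theta ], with phi_S(X) = [S subset of X]."""
    total = sum(a for S, a in alpha.items() if S <= X)
    return int(total > theta)

R = frozenset(range(5))                      # a tiny five-point "retina"
alpha = parity_coefficients(R)
for X in (frozenset(), frozenset({0, 2}), frozenset({1, 3, 4})):
    assert psi(X, alpha) == len(X) % 2       # agrees with "|X| is odd"
```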
They then prove Rosenblatt's perceptron convergence theorem, which states that the simple perceptron reinforcement learning scheme converges to a correct solution whenever such a solution exists (a minimal sketch of this learning rule appears below). The authors conclude by describing other learning schemes and popular pattern recognition devices of the period.

To understand the importance of the work in this excellent book, one must know some of the history of neural net research in the late 1950s and early 1960s. The authors give a simple overview in their last chapter. The perceptron convergence theorem was an example of the dominant paradigm, which asked questions such as: if a given predicate is in some L(Φ), what learning procedure will find coefficients to represent it? This leads to a view of recognition and learning as finding decision surfaces in n-dimensional coefficient spaces (which can lead to a complete statistical theory; see Duda and Hart [1]). It was the authors' insight to ask what characterizes L(Φ) in general and to use geometric predicates and k-order perceptrons to do so. This insight shifted the field from analyzing perceptrons and their learning procedures to characterizing the classes of problems computable by perceptrons, a move that leads to results similar to today's theory of computational complexity.

Many current neural network researchers say that Minsky and Papert's work is not relevant to their research. In the epilogue, the authors address this claim in detail. They take as an example the work in Rumelhart et al. [2] and illustrate how their own research bears on the experimental work done there. They show that their stratification theorem is relevant to the symmetry predicate found experimentally. They make the important point that the generalized delta rule (and backpropagation) is nothing more or less than hill climbing and thus has all of its power and all of its problems. In Minsky and Papert's view, the current work on neural nets reflects little theoretical advance. Much of it has the flavor of the old neural network research: my net can do toy problem X in Y learning trials. It was precisely to answer the "so what" questions this raises that the work described in this book was performed. Minsky has been quoted as saying that the problem with Perceptrons was that it was too thorough: it contained all the mathematically "easy" results. A new researcher in the field has no new theorems to prove and thus no motivation to continue using these analytical techniques. It is a challenge to neural net researchers to provide as detailed and exacting an analysis of their networks as Minsky and Papert have done for perceptrons.

The writing style of the book is pleasant. Even with its highly detailed formal mathematical results, the presentation is still informal, making it much easier to read than it would otherwise be. In this style, the authors describe in simple terms what they are trying to achieve, present the mathematical details with theorems and lemmas, and then follow up with a restatement of what was achieved. Reading this book is quite different from reading mathematical texts presented in a more "formal" style.

I have some minor quibbles. The 1972 handwritten alterations are both interesting and annoying. Pages 231 and 232 are printed in reverse order. The scene analysis discussion, as the authors point out in one of the handwritten notes, is dated, as is some of the other work in chapters 12 and 13.
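For readers who have not seen it, here is a minimal sketch (my own, not from the book or the review, with an arbitrarily chosen toy problem) of the error-correction rule to which the convergence theorem applies: whenever a pattern is misclassified, the weight vector is nudged toward (or away from) that pattern, and the theorem guarantees termination when a separating solution exists.

```python
def train_perceptron(samples, max_epochs=100):
    """Rosenblatt-style error-correction rule: on each mistake, add label * x to
    the weights.  The perceptron convergence theorem guarantees termination
    when the samples are linearly separable."""
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:                       # label is -1 or +1
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if label * activation <= 0:                # misclassified (or on the boundary)
                w = [wi + label * xi for wi, xi in zip(w, x)]
                b += label
                mistakes += 1
        if mistakes == 0:                              # a separating surface was found
            break
    return w, b

# A linearly separable toy problem: the two-input OR function.
data = [((0, 0), -1), ((0, 1), +1), ((1, 0), +1), ((1, 1), +1)]
w, b = train_perceptron(data)
```

The sketch also underscores the review's point: the rule finds coefficients when they exist, but says nothing about whether a given predicate lies in L(Φ) at all, which is precisely the question the book takes up.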
I recommend this book highly since, as a pure mathematical text, its style and scope are interesting in and of themselves. Further, neural net researchers should be required to read this book (along with a classic pattern recognition text, such as Duda and Hart [1]), even if they limit themselves to the prologue, the introduction, and the epilogue.
