Learning structured prediction models: a large margin approach
Publisher: Stanford University, 408 Panama Mall, Suite 217, Stanford, CA, United States
ISBN: 978-0-496-13523-3
Order Number: AAI3153077
Pages: 215
Abstract

This thesis presents a novel statistical estimation framework for structured models based on the large margin principle underlying support vector machines. We consider standard probabilistic models, such as Markov networks (undirected graphical models) and context-free grammars, as well as less conventional combinatorial models, such as weighted graph-cuts and matchings. Our framework results in several efficient learning formulations for complex prediction tasks. Fundamentally, we rely on the expressive power of convex optimization problems to compactly capture inference or solution optimality in structured models. Directly embedding this structure within the learning formulation produces compact convex problems for efficient estimation of very complex and diverse models. For some of these models, alternative estimation methods are intractable. We analyze the theoretical generalization properties of our approach and derive a novel margin-based bound for structured prediction. To scale up to very large training datasets, we develop problem-specific optimization algorithms that exploit efficient dynamic programming subroutines. We describe experimental applications to a diverse range of tasks, including handwriting recognition, 3D terrain classification, disulfide connectivity prediction, hypertext categorization, natural language parsing, email organization, and image segmentation. These empirical evaluations show significant improvements over state-of-the-art methods and promise wide practical use for our framework.

Cited By

  1. Coelho M, Borges C and Neto R (2016). A dual method for solving the nonlinear structured prediction problem, Pattern Recognition Letters, 75:C, (55-62), Online publication date: 1-May-2016.
  2. Pfeiffer J, Moreno S, La Fond T, Neville J and Gallagher B Attributed graph models Proceedings of the 23rd international conference on World wide web, (831-842)
  3. Zhang X, Saha A and Vishwanathan S Accelerated training of max-margin Markov networks with kernels Proceedings of the 22nd international conference on Algorithmic learning theory, (292-307)
  4. Yu X and Lam W Accelerated training of maximum margin Markov models for sequence labeling Proceedings of the 23rd International Conference on Computational Linguistics: Posters, (1408-1416)
  5. Hazan T and Urtasun R A Primal-Dual message-passing algorithm for approximated large scale structured prediction Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 1, (838-846)
  6. Jiang X, Dong B and Sweeney L Temporal Maximum Margin Markov Network Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, (587-600)
  7. Tran D and Forsyth D Improved human parsing with a full relational model Proceedings of the 11th European conference on Computer vision: Part IV, (227-240)
  8. Jiang X, Dong B and Sweeney L Temporal maximum margin Markov network Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I, (587-600)
  9. Himmelsbach M, Luettel T and Wuensche H Real-time object classification in 3D point clouds using point feature histograms Proceedings of the 2009 IEEE/RSJ international conference on Intelligent robots and systems, (994-1000)
  10. Jain B and Obermayer K (2009). Structure Spaces, The Journal of Machine Learning Research, 10, (2667-2714), Online publication date: 1-Dec-2009.
  11. Whiteson S and Whiteson D (2009). Machine learning for event selection in high energy physics, Engineering Applications of Artificial Intelligence, 22:8, (1203-1217), Online publication date: 1-Dec-2009.
  12. Lopez A (2008). Statistical machine translation, ACM Computing Surveys (CSUR), 40:3, (1-49), Online publication date: 1-Aug-2008.
  13. Sarawagi S and Gupta R Accurate max-margin training for structured output spaces Proceedings of the 25th international conference on Machine learning, (888-895)
  14. Sarawagi S (2008). Information Extraction, Foundations and Trends in Databases, 1:3, (261-377), Online publication date: 1-Mar-2008.
  15. Agarwal A and Chakrabarti S Learning random walks to rank nodes in graphs Proceedings of the 24th international conference on Machine learning, (9-16)
  16. Bordes A, Bottou L, Gallinari P and Weston J Solving multiclass support vector machines with LaRank Proceedings of the 24th international conference on Machine learning, (89-96)
  17. Deshpande A and Sarawagi S Probabilistic graphical models and their role in databases Proceedings of the 33rd international conference on Very large data bases, (1435-1436)
  18. Triebel R, Schmidt R, Mozos Ó and Burgard W Instance-based AMN classification for improved object recognition in 2D and 3D laser range data Proceedings of the 20th international joint conference on Artificial intelligence, (2225-2230)
  19. Lacoste-Julien S, Taskar B, Klein D and Jordan M Word alignment via quadratic assignment Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, (112-119)
  20. Taskar B, Lacoste-Julien S and Jordan M (2006). Structured Prediction, Dual Extragradient and Bregman Projections, The Journal of Machine Learning Research, 7, (1627-1653), Online publication date: 1-Dec-2006.
  21. Joachims T Structured output prediction with support vector machines Proceedings of the 2006 joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition, (1-7)
  22. Taskar B, Chatalbashev V, Koller D and Guestrin C Learning structured prediction models Proceedings of the 22nd international conference on Machine learning, (896-903)
  23. Okanohara D and Tsujii J Assigning polarity scores to reviews using machine learning techniques Proceedings of the Second international joint conference on Natural Language Processing, (314-325)
Contributors
  • Stanford University
  • University of Washington
