Abstract
Whereas people learn many different types of knowledge from diverse experiences over many years, and become better learners over time, most current machine learning systems are much narrower, learning just a single function or data model based on statistical analysis of a single data set. We suggest that people learn better than computers precisely because of this difference, and that a key direction for machine learning research is to develop software architectures that enable intelligent agents to also learn many types of knowledge, continuously over many years, and to become better learners over time. In this paper we define this never-ending learning paradigm for machine learning more precisely, and we present one case study: the Never-Ending Language Learner (NELL), which achieves a number of the desired properties of a never-ending learner. NELL has been learning to read the Web 24 hours a day since January 2010, and so far has acquired a knowledge base with 120 million diverse, confidence-weighted beliefs (e.g., servedWith(tea,biscuits)), while learning thousands of interrelated functions that continually improve its reading competence over time. NELL has also learned to reason over its knowledge base to infer new beliefs it has not yet read from those it has, and NELL is inventing new relational predicates to extend the ontology it uses to represent beliefs. We describe the design of NELL, present experimental results illustrating its behavior, and discuss both its successes and shortcomings as a case study in never-ending learning. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on Twitter at @CMUNELL.