ABSTRACT
Learning node representations for networks has attracted much attention recently due to its effectiveness in a variety of applications. This paper focuses on learning node representations for heterogeneous star networks, which have a center node type linked to multiple attribute node types through different types of edges. In heterogeneous star networks, we observe that the order in which different types of edges are used during training significantly affects learning performance. We therefore study learning curricula for node representation learning in heterogeneous star networks, i.e., learning an optimal sequence of edge types for the node representation learning process. We formulate the problem as a Markov decision process, in which an action selects a specific type of edge for training or terminates the training process, and a state is the sequence of edge types selected so far. The reward is the performance on external tasks that use the node representations as features, and the goal is to take a series of actions that maximizes the cumulative reward. We propose an approach based on deep reinforcement learning for this problem. It leverages LSTM models to encode states and to estimate the expected cumulative reward of each state-action pair, which measures the long-term benefit of each action at each state. Experimental results on real-world heterogeneous star networks demonstrate the effectiveness and efficiency of our approach over competitive baselines.
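The MDP formulation above can be sketched in a minimal form. This is a hypothetical illustration, not the authors' implementation: the paper's LSTM state encoder is replaced here by a plain Q-table keyed on the edge-type sequence, and the edge types and reward function are invented placeholders (the real reward would come from retraining embeddings and evaluating an external task).

```python
import random

# Illustrative placeholders (assumptions, not from the paper):
EDGE_TYPES = ["author-paper", "paper-venue", "paper-keyword"]
TERMINATE = "STOP"
ACTIONS = EDGE_TYPES + [TERMINATE]
MAX_STEPS = 4  # cap on curriculum length for this toy example

def external_task_reward(state):
    """Stand-in for the true reward: performance of an external task
    using node representations trained on the edge-type sequence
    `state`. Here it simply favors curricula with diverse edge types."""
    return len(set(state)) / max(len(state), 1)

def epsilon_greedy(q, state, eps=0.2):
    """Pick a random action with probability eps, else the greedy one."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))

def train_curriculum(episodes=300, alpha=0.5, gamma=1.0):
    """Tabular Q-learning over edge-type sequences.

    State  : tuple of edge types selected so far.
    Action : select one more edge type, or TERMINATE.
    """
    q = {}
    for _ in range(episodes):
        state = ()  # empty curriculum at the start of each episode
        while True:
            action = epsilon_greedy(q, state)
            if action == TERMINATE or len(state) >= MAX_STEPS:
                break
            next_state = state + (action,)
            reward = external_task_reward(next_state)
            best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
            key = (state, action)
            q[key] = q.get(key, 0.0) + alpha * (
                reward + gamma * best_next - q.get(key, 0.0)
            )
            state = next_state
    return q

def greedy_curriculum(q):
    """Roll out the learned policy greedily to read off a curriculum."""
    state = ()
    while len(state) < MAX_STEPS:
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        if action == TERMINATE:
            break
        state = state + (action,)
    return list(state)
```

The table lookup `q[(state, action)]` is exactly where the paper instead uses an LSTM: encoding the variable-length edge-type sequence lets the estimator generalize across states rather than memorize each sequence separately.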