ABSTRACT
Learning node representations for networks has attracted much attention recently due to its effectiveness in a variety of applications. This paper focuses on learning node representations for heterogeneous star networks, which have a center node type linked to multiple attribute node types through different types of edges. In heterogeneous star networks, we observe that the order in which different types of edges are used during training significantly affects learning performance. We therefore study learning curricula for node representation learning in heterogeneous star networks, i.e., learning an optimal sequence of edge types for the node representation learning process. We formulate the problem as a Markov decision process, in which an action selects a specific type of edge for training or terminates the training process, and a state is the sequence of edge types selected so far. The reward is the performance on external tasks that use the node representations as features, and the goal is to take a series of actions that maximizes the cumulative reward. We propose an approach based on deep reinforcement learning for this problem. It leverages LSTM models to encode states and to estimate the expected cumulative reward of each state-action pair, which measures the long-term benefit of each action at each state. Experimental results on real-world heterogeneous star networks demonstrate the effectiveness and efficiency of our approach over competitive baselines.
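The MDP formulation above can be sketched in a minimal form. This is a hypothetical illustration, not the authors' implementation: the paper's LSTM state encoder is replaced here by a plain Q-table keyed on the edge-type sequence, and the edge types and reward function are invented placeholders (the real reward would come from retraining embeddings and evaluating an external task).

```python
import random

# Illustrative placeholders (assumptions, not from the paper):
EDGE_TYPES = ["author-paper", "paper-venue", "paper-keyword"]
TERMINATE = "STOP"
ACTIONS = EDGE_TYPES + [TERMINATE]
MAX_STEPS = 4  # cap on curriculum length for this toy example

def external_task_reward(state):
    """Stand-in for the true reward: performance of an external task
    using node representations trained on the edge-type sequence
    `state`. Here it simply favors curricula with diverse edge types."""
    return len(set(state)) / max(len(state), 1)

def epsilon_greedy(q, state, eps=0.2):
    """Pick a random action with probability eps, else the greedy one."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))

def train_curriculum(episodes=300, alpha=0.5, gamma=1.0):
    """Tabular Q-learning over edge-type sequences.

    State  : tuple of edge types selected so far.
    Action : select one more edge type, or TERMINATE.
    """
    q = {}
    for _ in range(episodes):
        state = ()  # empty curriculum at the start of each episode
        while True:
            action = epsilon_greedy(q, state)
            if action == TERMINATE or len(state) >= MAX_STEPS:
                break
            next_state = state + (action,)
            reward = external_task_reward(next_state)
            best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
            key = (state, action)
            q[key] = q.get(key, 0.0) + alpha * (
                reward + gamma * best_next - q.get(key, 0.0)
            )
            state = next_state
    return q

def greedy_curriculum(q):
    """Roll out the learned policy greedily to read off a curriculum."""
    state = ()
    while len(state) < MAX_STEPS:
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        if action == TERMINATE:
            break
        state = state + (action,)
    return list(state)
```

The table lookup `q[(state, action)]` is exactly where the paper instead uses an LSTM: encoding the variable-length edge-type sequence lets the estimator generalize across states rather than memorize each sequence separately.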