XiaoIce Band: A Melody and Arrangement Generation Framework for Pop Music

Authors:
Hongyuan Zhu

University of Science and Technology of China, AI and Research Microsoft, Hefei, China

Qi Liu

University of Science and Technology of China, Hefei, China

Nicholas Jing Yuan

AI and Research Microsoft, Suzhou, China

Chuan Qin

University of Science and Technology of China, Hefei, China

Jiawei Li

AI and Research Microsoft, Soochow University, Suzhou, China

Kun Zhang

University of Science and Technology of China, Hefei, China

Guang Zhou

AI and Research Microsoft, Suzhou, China

Furu Wei

AI and Research Microsoft, Beijing, China

Yuanchun Xu

AI and Research Microsoft, Beijing, China

Enhong Chen

University of Science and Technology of China, Hefei, China


KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2018, Pages 2837–2846, https://doi.org/10.1145/3219819.3220105

Published: 19 July 2018


ABSTRACT

With the development of knowledge of music composition and the recent increase in demand, a growing number of companies and research institutes have begun to study automatic music generation. However, previous models have limitations when applied to song generation, which requires both a melody and an arrangement. In addition, many factors critical to the quality of a song, such as chord progression and rhythm patterns, are not well addressed. In particular, how to ensure the harmony of multi-track music remains underexplored. To this end, we present a focused study on pop music generation that takes into account both the influence of chord and rhythm on melody generation and the harmony of the music arrangement. We propose an end-to-end melody and arrangement generation framework, called XiaoIce Band, which generates a melody track together with several accompanying tracks played by several types of instruments. Specifically, we devise a Chord based Rhythm and Melody Cross-Generation Model (CRMCG) to generate melody with chord progressions. Then, we propose a Multi-Instrument Co-Arrangement Model (MICA) that uses multi-task learning for multi-track music arrangement. Finally, we conduct extensive experiments on a real-world dataset, and the results demonstrate the effectiveness of XiaoIce Band.
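
As a rough illustration of the multi-task idea mentioned in the abstract, the sketch below combines one prediction loss per instrument track into a single joint objective. It is a minimal sketch only: the abstract does not state MICA's actual loss or architecture, and the track names, the three-note vocabulary, the logits, and the helper functions `cross_entropy` and `joint_loss` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    return -math.log(softmax(logits)[target])


def joint_loss(track_logits, track_targets, weights=None):
    """Weighted sum of per-track losses, treating each track as one task."""
    if weights is None:
        weights = {name: 1.0 for name in track_logits}
    return sum(
        weights[name] * cross_entropy(track_logits[name], track_targets[name])
        for name in track_logits
    )


# Hypothetical tracks with toy scores over a 3-note vocabulary (assumed data).
logits = {
    "piano": [2.0, 0.5, 0.1],
    "bass": [0.1, 0.2, 3.0],
    "drum": [1.0, 1.0, 1.0],
}
targets = {"piano": 0, "bass": 2, "drum": 1}

loss = joint_loss(logits, targets)
print(f"joint loss over {len(logits)} tracks: {loss:.4f}")
```

In a real multi-task model the per-track predictions would typically come from networks that share some parameters, so minimizing one summed objective lets the tracks influence each other during training. The uniform weights here are a placeholder choice.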

Supplemental Material

zhu_pop_music.mp4 (mp4, 332.4 MB)


Published in
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN: 9781450355520
DOI: 10.1145/3219819
General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
harmony evaluation
melody and arrangement generation
multi-task joint learning
music generation
Qualifiers
- research-article
Acceptance Rates
KDD '18 Paper Acceptance Rate: 107 of 983 submissions, 11%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
