DOI: 10.1145/3240508.3240541

Multi-modal Preference Modeling for Product Search

Published: 15 October 2018

ABSTRACT

The visual preferences of users for products have been largely ignored by existing product search methods. In this work, we propose a multi-modal personalized product search method, which aims to retrieve products that are not only relevant to the submitted textual query but also match the user's preferences in both the textual and visual modalities. To achieve this goal, we first leverage the also_view and buy_after_viewing products to construct visual and textual latent spaces, which are expected to preserve the visual similarity and the semantic similarity of products, respectively. We then propose a translation-based search model (TranSearch) to 1) learn a multi-modal latent space on top of the pre-trained visual and textual latent spaces, and 2) map users, queries and products into this space for direct matching. TranSearch is trained with a comparative learning strategy, so that the multi-modal latent space is oriented toward personalized ranking during training. Experiments conducted on real-world datasets validate the effectiveness of our method: it outperforms the state-of-the-art method by a large margin.
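To make the model description above concrete, the sketch below shows, in PyTorch, one way the translation-based matching and the comparative (pairwise) training objective could be realized. It is a minimal illustration under assumptions: the class and parameter names (TranSearchSketch, joint_dim, the tanh projections) and the BPR-style log-sigmoid loss are illustrative, not the authors' implementation, and the pre-trained visual and textual representations from the also_view/buy_after_viewing spaces are stood in for by raw feature tensors.

```python
# Hypothetical sketch of a TranSearch-style model (not the authors' code).
# Assumptions: pre-trained textual/visual product features are given as
# dense tensors; the "translation" treats user + query as a point that
# should land near relevant products in the joint multi-modal space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TranSearchSketch(nn.Module):
    def __init__(self, n_users, text_dim=512, visual_dim=4096, joint_dim=128):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, joint_dim)     # per-user preference vector
        self.text_proj = nn.Linear(text_dim, joint_dim)      # textual latent space -> joint space
        self.visual_proj = nn.Linear(visual_dim, joint_dim)  # visual latent space -> joint space
        self.query_proj = nn.Linear(text_dim, joint_dim)     # query representation -> joint space
        self.fuse = nn.Linear(2 * joint_dim, joint_dim)      # fuse the two product modalities

    def embed_product(self, text_feat, visual_feat):
        t = torch.tanh(self.text_proj(text_feat))
        v = torch.tanh(self.visual_proj(visual_feat))
        return torch.tanh(self.fuse(torch.cat([t, v], dim=-1)))

    def score(self, user_ids, query_feat, text_feat, visual_feat):
        # Translation step: user vector + query vector should land close to
        # the embedding of a relevant product in the joint space.
        target = self.user_emb(user_ids) + torch.tanh(self.query_proj(query_feat))
        product = self.embed_product(text_feat, visual_feat)
        return -((target - product) ** 2).sum(dim=-1)  # higher = better match

def pairwise_loss(model, user_ids, query_feat, pos, neg):
    # Comparative objective: a purchased product should outscore a sampled
    # negative for the same (user, query) pair (BPR-style log-sigmoid loss).
    s_pos = model.score(user_ids, query_feat, *pos)
    s_neg = model.score(user_ids, query_feat, *neg)
    return -F.logsigmoid(s_pos - s_neg).mean()

if __name__ == "__main__":
    # Smoke test with random features standing in for pre-trained embeddings.
    model = TranSearchSketch(n_users=1000)
    users = torch.randint(0, 1000, (32,))
    query = torch.randn(32, 512)
    pos = (torch.randn(32, 512), torch.randn(32, 4096))
    neg = (torch.randn(32, 512), torch.randn(32, 4096))
    pairwise_loss(model, users, query, pos, neg).backward()
```

Under this reading, ranking at inference time reduces to a nearest-neighbor search: embed every product once, compute the user-plus-query target vector, and return the products with the smallest distance in the joint space.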



Published in

MM '18: Proceedings of the 26th ACM International Conference on Multimedia
October 2018
2167 pages
ISBN: 9781450356657
DOI: 10.1145/3240508

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Qualifiers

          • research-article

          Acceptance Rates

MM '18 Paper Acceptance Rate: 209 of 757 submissions, 28%. Overall Acceptance Rate: 995 of 4,171 submissions, 24%.

