ABSTRACT
Existing product search methods have largely ignored users' visual preferences for products. In this work, we propose a multi-modal personalized product search method that retrieves products which are not only relevant to the submitted textual query but also match the user's preferences in both the textual and visual modalities. To this end, we first leverage the also_view and buy_after_viewing products to construct visual and textual latent spaces, which are expected to preserve the visual similarity and semantic similarity of products, respectively. We then propose a translation-based search model (TranSearch) to 1) learn a multi-modal latent space based on the pre-trained visual and textual latent spaces; and 2) map users, queries, and products into this space for direct matching. The TranSearch model is trained with a comparative learning strategy, so that the multi-modal latent space is oriented toward personalized ranking during training. Experiments on real-world datasets validate the effectiveness of our method. The results demonstrate that our method outperforms the state-of-the-art method by a large margin.
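The matching and training idea described above can be illustrated with a minimal sketch. The abstract does not give the model's equations, so everything here is an assumption made for illustration: fusing the two modalities by concatenation, scoring a product by its Euclidean distance to the translated point user + query, and using a margin-based pairwise loss as one possible form of comparative learning. The function names `fuse`, `distance`, `pairwise_hinge_loss`, and `rank`, and all vectors, are hypothetical.

```python
import math


def fuse(visual, textual):
    """Combine pre-trained visual and textual product vectors.

    Plain concatenation is an assumption; the abstract only says the
    multi-modal space is learned from the two pre-trained spaces.
    """
    return list(visual) + list(textual)


def distance(user, query, product):
    """Euclidean distance between the translated point (user + query) and a product."""
    return math.sqrt(sum((u + q - p) ** 2 for u, q, p in zip(user, query, product)))


def pairwise_hinge_loss(user, query, pos, neg, margin=1.0):
    """Assumed comparative objective: zero once the positive product is
    closer to user + query than the negative one by at least `margin`."""
    return max(0.0, margin + distance(user, query, pos) - distance(user, query, neg))


def rank(user, query, products):
    """Return product ids ordered from closest to farthest from user + query."""
    return sorted(products, key=lambda pid: distance(user, query, products[pid]))


# Toy 4-dimensional example (all numbers are made up).
user = [0.5, 0.0, 0.0, 0.5]
query = [0.5, 1.0, 0.0, 0.0]
products = {
    "p_other": fuse([0.0, 0.0], [1.0, 1.0]),
    "p_match": fuse([1.0, 1.0], [0.0, 0.5]),
}

ranking = rank(user, query, products)
loss_ok = pairwise_hinge_loss(user, query, products["p_match"], products["p_other"])
loss_bad = pairwise_hinge_loss(user, query, products["p_other"], products["p_match"])
print(ranking, loss_ok, loss_bad)
```

In this toy setting the loss is zero when the better-matching product is treated as the positive example and positive when the pair is swapped, which is the behavior a ranking-oriented pairwise objective is meant to have. A real system would learn the user, query, and product vectors by gradient descent rather than fixing them by hand.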
Index Terms
- Multi-modal Preference Modeling for Product Search