Abstract
Popular sites like Houzz, Pinterest, and LikeThatDecor have communities of users helping each other answer questions about products in images. In this paper we learn an embedding for visual search in interior design. Our embedding contains two different domains of product images: products cropped from internet scenes, and products in their iconic form. With such a multi-domain embedding, we demonstrate several applications of visual search, including identifying products in scenes and finding stylistically similar products. To obtain the embedding, we train a convolutional neural network on pairs of images. We explore several training architectures, including re-purposing object classifiers, using siamese networks, and using multitask learning. We evaluate our search quantitatively and qualitatively and demonstrate high-quality results for search across multiple visual domains, enabling new applications in interior design.
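To make the idea of training on image pairs concrete, the following is a minimal sketch of the contrastive loss commonly paired with siamese networks, followed by a nearest-neighbor lookup in the resulting embedding. The abstract does not state which loss or distance is used, so the loss form, the `margin` value, the function names, and the toy 2-D vectors are all assumptions made for illustration, not the authors' method.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed formulation).

    same=True  : the pair shows the same product; any distance is penalized.
    same=False : the pair shows different products; only distances smaller
                 than the margin are penalized.
    """
    d = euclidean(a, b)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def nearest(query, catalog):
    """Return the name of the catalog embedding closest to the query."""
    return min(catalog, key=lambda name: euclidean(query, catalog[name]))

# Toy 2-D embeddings (hypothetical values, for illustration only).
# In practice each vector would be the CNN output for one image.
catalog = {"chair": [0.0, 1.0], "lamp": [1.0, 0.0]}  # iconic product images
scene_crop = [0.1, 0.9]                              # product cropped from a scene
print(nearest(scene_crop, catalog))
```

Minimizing such a loss pulls a scene crop and its iconic product image together while pushing mismatched pairs at least `margin` apart, which is what lets a plain nearest-neighbor query search across both image domains.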