DOI: 10.1145/3308558.3313427
research-article

A Multi-modal Neural Embeddings Approach for Detecting Mobile Counterfeit Apps

Published: 13 May 2019

ABSTRACT

Counterfeit apps impersonate existing popular apps in an attempt to mislead users. Many counterfeits can be identified once installed; however, even a tech-savvy user may struggle to detect them before installation. In this paper, we propose a novel approach that combines content embeddings and style embeddings generated from pre-trained convolutional neural networks to detect counterfeit apps. We present an analysis of approximately 1.2 million apps from the Google Play Store and identify a set of potential counterfeits for the top 10,000 apps. Under conservative assumptions, we found 2,040 potential counterfeits that contain malware in a set of 49,608 apps that showed high similarity to one of the top 10,000 popular apps in the Google Play Store. We also find 1,565 potential counterfeits requesting at least five more dangerous permissions than the original app and 1,407 potential counterfeits with at least five extra third-party advertisement libraries.
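
The idea of pairing a content embedding with a style embedding can be illustrated with a minimal sketch. The abstract does not specify the network, the layers, or how the two embeddings are combined, so everything below is an assumption made for illustration: the feature maps are random arrays standing in for activations of a pre-trained CNN, the style embedding is taken to be a flattened Gram matrix of channel activations (a common definition in the neural style transfer literature), and the function names and the weighting parameter `alpha` are hypothetical.

```python
import numpy as np

def content_embedding(feature_map):
    """Flatten a (C, H, W) feature map into a unit-length vector.
    Spatial layout is preserved, so this reflects what appears where."""
    v = feature_map.reshape(-1).astype(np.float64)
    return v / (np.linalg.norm(v) + 1e-12)

def style_embedding(feature_map):
    """Flattened, unit-length Gram matrix of channel activations.
    Channel co-occurrence is kept; spatial layout is discarded."""
    c, h, w = feature_map.shape
    f = feature_map.reshape(c, h * w).astype(np.float64)
    gram = f @ f.T / (h * w)
    v = gram.reshape(-1)
    return v / (np.linalg.norm(v) + 1e-12)

def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - float(np.dot(a, b))

def combined_distance(fm_a, fm_b, alpha=0.5):
    """Weighted sum of content and style distances.
    alpha is a hypothetical mixing weight, not a value from the paper."""
    d_content = cosine_distance(content_embedding(fm_a), content_embedding(fm_b))
    d_style = cosine_distance(style_embedding(fm_a), style_embedding(fm_b))
    return alpha * d_content + (1.0 - alpha) * d_style

# Stand-in for CNN activations of an app icon: 8 channels on a 4x4 grid.
rng = np.random.default_rng(0)
original = np.abs(rng.standard_normal((8, 4, 4)))

# A "rearranged" icon: same activations, spatial positions shuffled.
perm = rng.permutation(16)
rearranged = original.reshape(8, 16)[:, perm].reshape(8, 4, 4)

d_content = cosine_distance(content_embedding(original), content_embedding(rearranged))
d_style = cosine_distance(style_embedding(original), style_embedding(rearranged))
d_same = combined_distance(original, original)
```

In this toy setup the shuffled feature map moves away from the original in content space while its Gram-based style embedding stays essentially unchanged, which suggests why the two signals can complement each other when looking for apps that imitate the look of a popular app without copying its icon exactly.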


  • Published in

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN:9781450366748
    DOI:10.1145/3308558

    Copyright © 2019 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
