Abstract
As popular search engines face the sometimes conflicting interests of protecting privacy while retaining query logs for a variety of uses, numerous technical measures have been suggested to both enhance privacy and preserve at least a portion of the utility of query logs. This article seeks to assess seven of these techniques against three sets of criteria: (1) how well the technique protects privacy, (2) how well the technique preserves the utility of the query logs, and (3) how well the technique might be implemented as a user control. A user control is defined as a mechanism that allows individual Internet users to choose to have the technique applied to their own query logs.
Index Terms
- A survey of query log privacy-enhancing techniques from a policy perspective