research-article

A survey of query log privacy-enhancing techniques from a policy perspective

Published: 27 October 2008

Abstract

As popular search engines face the sometimes conflicting interests of protecting privacy while retaining query logs for a variety of uses, numerous technical measures have been suggested to both enhance privacy and preserve at least a portion of the utility of query logs. This article seeks to assess seven of these techniques against three sets of criteria: (1) how well the technique protects privacy, (2) how well the technique preserves the utility of the query logs, and (3) how well the technique might be implemented as a user control. A user control is defined as a mechanism that allows individual Internet users to choose to have the technique applied to their own query logs.

Published in

ACM Transactions on the Web, Volume 2, Issue 4
October 2008
118 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/1409220

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 27 October 2008
• Accepted: 1 August 2008
• Revised: 1 June 2008
• Received: 1 December 2007

Qualifiers

• research-article
• Research
• Refereed
