Abstract
We describe an approach for extracting semantics for tags, unstructured text labels assigned to resources on the Web, based on each tag's usage patterns. In particular, we focus on the problem of extracting place semantics for tags that are assigned to photos on Flickr, a popular photo-sharing Web site that supports location (latitude/longitude) metadata for photos. We propose the adaptation of two baseline methods, inspired by well-known burst-analysis techniques, for the task; we also describe two novel methods, TagMaps and scale-structure identification. We evaluate the methods on a subset of Flickr data. We show that our scale-structure identification method outperforms existing techniques and that a hybrid approach generates further improvements (achieving 85% precision at 81% recall). The approach and methods described in this work can be used in other domains, such as geo-annotated Web pages, where text terms can be extracted and associated with usage patterns.
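To make the idea of a tag's spatial "usage pattern" concrete, the following is a minimal Python sketch. It is not the TagMaps or scale-structure identification method, and it is not either of the burst-analysis baselines; the abstract does not give their details. It only illustrates the underlying intuition that a place tag tends to be concentrated in one area. The function name `top_cell_share`, the fixed one-degree grid, the 0.5 threshold, and the sample coordinates are all assumptions made up for illustration.

```python
from collections import Counter
from math import floor


def top_cell_share(points, cell_deg=1.0):
    """Fraction of a tag's photos that fall in its single busiest grid cell.

    `points` is a list of (latitude, longitude) pairs. The fixed grid of
    `cell_deg` degrees is an assumption of this sketch.
    """
    if not points:
        return 0.0
    cells = Counter(
        (floor(lat / cell_deg), floor(lon / cell_deg)) for lat, lon in points
    )
    return cells.most_common(1)[0][1] / len(points)


# Hypothetical (latitude, longitude) pairs, invented for illustration only.
usage = {
    "goldengatebridge": [
        (37.81, -122.48), (37.82, -122.47), (37.80, -122.48), (37.83, -122.47),
    ],
    "sunset": [
        (37.80, -122.40), (48.85, 2.35), (-33.86, 151.21), (35.68, 139.69),
    ],
}

THRESHOLD = 0.5  # arbitrary cut-off chosen for this sketch

# A tag is treated as "place-like" when most of its photos share one cell.
place_like = {
    tag: top_cell_share(pts) >= THRESHOLD for tag, pts in usage.items()
}
```

With this toy data, the concentrated tag is flagged as place-like and the scattered one is not. A single fixed grid size is a strong simplification, since the result depends on the cell size chosen.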
Index Terms
- Methods for extracting place semantics from Flickr tags