skip to main content
10.1145/1367497.1367629acmconferencesArticle/Chapter ViewAbstractPublication PageswwwConference Proceedingsconference-collections
research-article

Substructure similarity measurement in chinese recipes

Authors Info & Claims
Published:21 April 2008Publication History

ABSTRACT

Improving the precision of information retrieval has been a challenging issue on Chinese Web. As exemplified by Chinese recipes on the Web, it is not easy/natural for people to use keywords (e.g. recipe names) to search recipes, since the names can be literally so abstract that they do not bear much, if any, information on the underlying ingredients or cooking methods. In this paper, we investigate the underlying features of Chinese recipes, and based on workflow-like cooking procedures, we model recipes as graphs. We further propose a novel similarity measurement based on the frequent patterns, and devise an effective filtering algorithm to prune unrelated data so as to support efficient on-line searching. Benefiting from the characteristics of graphs, frequent common patterns can be mined from a cooking graph database. So in our prototype system called RecipeView, we extend the subgraph mining algorithm FSG to cooking graphs and combine it with our proposed similarity measurement, resulting in an approach that well caters for specific users' needs. Our initial experimental studies show that the filtering algorithm can efficiently prune unrelated cooking graphs without affecting the retrieval performance and the similarity measurement gets a relatively higher precision/recall against its counterparts

References

  1. Bunke, H., and Shearer, K. A graph distance metric based on the maximal common subgraph. Pattern Recogn. Lett. 19, 3-4 (1998), 255--259. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Cook, D. J., and Holder, L. B. Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research 1 (1994), 231--255.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Conte, D., Guidobaldi, C., and Sansone, C. A comparison of three maximum common subgraph algorithms on a large database of labeled graphs. In Proc. of the 4th IAPR International Workshop on Graph Based Representations in Pattern Recognition (GbRPR), York, UK, 2003, pp. 589--607. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Djoko, S., Cook, D. J., and Holder, L. B. An empirical study of domain knowledge and its benefits to substructure discovery. IEEE Transactions on Knowledge and Data Engineering 9, 4 (1997), 575--586. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Government News. http://www.cq.xinhua.org/food/200801/15/content_12221389.htmGoogle ScholarGoogle Scholar
  6. Homepage ChemIDPlus. http://chem.sis.nlm.nih.gov/chemidplus/.Google ScholarGoogle Scholar
  7. Homepage Simpack. http://www.ifi.unizh.ch/ddis/simpack.html.Google ScholarGoogle Scholar
  8. Inokuchi, A., Washio, T., and Motoda, H. An apriori-based algorithm for mining frequent substructures from graph data. In Proc. of the 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), London, UK, 2000, pp. 13--23. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Karakoc, E., Cherkasov, A., and Sahinalp, S. C. Novel approaches for small biomolecule classification and structural similarity search. SIGKDD Explor. Newsl. 9, 1 (2007), 14--21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Kuramochi, M., and Karypis, G. Frequent subgraph discovery. In Proc. of the IEEE International Conference on Data Mining (ICDM), San Jose, USA, 2001, pp. 313--320. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Li, Y., Meng, X., Wang, L., and Li, Q. RecipeCrawler: collecting recipe data from www incrementally. In Proc. of the 7th International Conference on Web-Age Information Management (WAIM), Hong Kong, China, 2006, pp. 263--274. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Wang, L., Li, Q. A personalized recipe database system with user-centered adaptation and tutoring support. In ACM SIGMOD Ph.D. workshop on Innovative database research (IDAR), 2007.Google ScholarGoogle Scholar
  13. Messmer, B. T., and Bunke, H. A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Trans. Pattern Anal. Mach. Intell. 20, 5 (1998), 493--504. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Salton, G., Wong, A., and Yang, C. S. A vector space model for automatic indexing. Commun. ACM 18, 11 (1975), 613--620. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Sanfeliu, A., and Fu, K. S. A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on Systems, Man and Cybernetics 13, 5 (1983), 353--362.Google ScholarGoogle Scholar
  16. Shasha, D., Wang, J. T. L., and Giugno, R. Algorithmics and applications of tree and graph searching. In Proc. of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), New York, USA, 2002, pp. 39--52. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Ukkonen, E. Approximate string-matching with q-grams and maximal matches. Theor. Comput. Sci. 92, 1 (1992), 191--211. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Ullmann, J. R. An algorithm for subgraph isomorphism. J. ACM 23, 1 (1976), 31--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Wang, L. CookRecipe - towards a versatile and fully-fledged recipe analysis and learning system. Ph.D. thesis, Department of Computer Science, City University of Hong Kong, Hong Kong (Jan. 2008).Google ScholarGoogle Scholar
  20. Yan, X., and Han, J. gSpan: Graph-based substructure pattern mining. In Proc. of the IEEE International Conference on Data Mining (ICDM), Washington DC, USA, 2002, p. 721. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Yan, X., Yu, P. S., and Han, J. Graph indexing based on discriminative frequent structure analysis. ACM Trans. Database Syst. 30, 4 (2005), 960--993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Yan, X., Yu, P. S., and Han, J. Substructure similarity search in graph databases. In Proc. of the ACM SIGMOD International Conference on Management of Data (SIGMOD), New York, USA, 2005, pp. 766--777. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Yang, R., Kalnis, P., and Tung, A. K. H. Similarity evaluation on tree-structured data. In Proc. of the ACM SIGMOD International Conference on Management of Data (SIGMOD), New York, USA, 2005, pp. 754--765. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Substructure similarity measurement in chinese recipes

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            WWW '08: Proceedings of the 17th international conference on World Wide Web
            April 2008
            1326 pages
            ISBN:9781605580852
            DOI:10.1145/1367497

            Copyright © 2008 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 21 April 2008

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            Overall Acceptance Rate1,899of8,196submissions,23%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader