Experiences with selecting search engines using metasearch

Authors:
Daniel Dreilinger
Massachusetts Institute of Technology, Cambridge

Adele E. Howe
Colorado State Univ., Fort Collins


ACM Transactions on Information Systems, Volume 15, Issue 3, pp. 195–222
https://doi.org/10.1145/256163.256164

Published: 01 July 1997

Abstract

Search engines are among the most useful and high-profile resources on the Internet. The problem of finding information on the Internet has been replaced with the problem of knowing where search engines are, what they are designed to retrieve, and how to use them. This article describes and evaluates SavvySearch, a metasearch engine designed to intelligently select and interface with multiple remote search engines. The primary metasearch issue examined is the importance of carefully selecting and ranking remote search engines for user queries. We studied the efficacy of SavvySearch's incrementally acquired metaindex approach to selecting search engines by analyzing the effect of time and experience on performance. We also compared the metaindex approach to the simpler categorical approach and showed how much experience is required to surpass the simple scheme.

Index Terms

Experiences with selecting search engines using metasearch
1. Information systems
  1. Information retrieval
  2. Information storage systems

Reviews

Reviewer: Donald Harris Kraft

A solid background in the concept of Web search engines and metasearch engines, and some experiments on SavvySearch, a metasearch engine designed by the authors, are provided in this paper. It includes easy-to-follow definitions of concepts necessary to an understanding of information retrieval, Web search engines, and metasearch engines. It is nice to see that concepts such as search engines (designed to aid in finding Web sites, given the exponentially growing number of sites) and metasearch engines (designed to aid in deciding which search engines to use, given the rapid growth in search engines), which have been known for years by library and information scientists, are being rediscovered by computer scientists.

The authors note that a metasearch engine must have a dispatch mechanism to determine which search engines to employ, an interface agent to adapt a user query into a query suitable for each search engine employed, and a display mechanism by which to return the search results to the user. The paper provides a good literature search of available metasearch engines along with their Web site URLs.

The authors explain how SavvySearch uses the keywords in the user's query to rank potential search engines that will eventually rank Web sites deemed relevant to the query. They note that the top search engines can be made to search in parallel. In order to rank search engines, they keep track of term frequencies at the sites searched by each search engine, and they keep track of the frequencies of success and failure of each search engine in terms of finding relevant sites for specific terms. The ranking of the search engines is accomplished by a complex formula based on concepts analogous to ranking via term weights in standard document retrieval. The ranking includes considerations of concurrency, expected network load, and local CPU load.
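The ranking idea described above can be illustrated with a small sketch. It is not the authors' formula, which the review calls complex and does not reproduce. It assumes a term-by-engine table that gains one point for each success and loses one for each failure, and an idf-like factor borrowed from document retrieval. The class name `MetaIndex`, its methods, and the engine names are hypothetical.

```python
import math
from collections import defaultdict


class MetaIndex:
    """Toy term-by-engine table of accumulated success and failure evidence."""

    def __init__(self, engines):
        self.engines = list(engines)
        # weights[term][engine] is a running score; positive means past success.
        self.weights = defaultdict(lambda: defaultdict(float))

    def record(self, terms, engine, success):
        """Credit or debit one engine for every term of a finished query."""
        if engine not in self.engines:
            raise ValueError(f"unknown engine: {engine}")
        delta = 1.0 if success else -1.0  # assumed update rule
        for term in terms:
            self.weights[term][engine] += delta

    def rank(self, terms):
        """Order engines by summed term weights, best first."""
        n = len(self.engines)
        scores = {engine: 0.0 for engine in self.engines}
        for term in terms:
            row = self.weights.get(term)
            if not row:
                continue  # an unseen term carries no evidence
            good = sum(1 for w in row.values() if w > 0)
            # idf-like factor (assumption): a term that many engines
            # already handle well discriminates less between engines.
            idf = math.log((n + 1) / (good + 1)) + 1.0
            for engine, w in row.items():
                scores[engine] += w * idf
        return sorted(self.engines, key=lambda e: scores[e], reverse=True)


index = MetaIndex(["alpha", "beta", "gamma"])
index.record(["python", "tutorial"], "alpha", success=True)
index.record(["python"], "beta", success=False)
index.record(["recipes"], "gamma", success=True)
ranking = index.rank(["python", "tutorial"])
```

In this toy run, `alpha` ranks first because it succeeded on both query terms, `gamma` stays neutral because it has no evidence for them, and `beta` ranks last because of its recorded failure. A dispatcher could then send the query to the top few engines in parallel.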
One nice feature of the search engine ranking mechanism is the inclusion of thresholds for response times, leading to penalties for slow searches. The paper provides the results of a series of experiments with SavvySearch. A pilot study looked at how well search engines were being selected. The authors used a large set of queries (at least 2,500). They varied the ordering of the search engines and the selection of the first group of search engines to be employed. Results indicate that their approach is viable, that users like the basic approach, that users follow more links found at the beginning of a search, and that past query success can be used to improve future searches.

Further experiments looked at SavvySearch enhancements, such as penalties for lack of results and frequent updating of the meta-index, which is the data structure for information about search engine successes and failures and for term frequencies. Results were mixed, but, in general, SavvySearch's approach is a good one. The bottom line is that SavvySearch has garnered increased interest and use. It takes some experience for the system to learn enough about what is out there to improve on categorical searches done by other means. The approach is especially effective at figuring out where not to search. The authors continue to search for more efficient ways to use the Web to find relevant information.
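The response-time thresholds and the penalties for lack of results mentioned in the review might look roughly like the sketch below. The review gives neither the threshold values nor the shape of the penalty, so the constants, the quadratic growth, and all names here (`EngineStats`, `response_penalty`, `hit_penalty`, `adjusted_score`) are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical tuning constants; the source does not state the real values.
SLOW_THRESHOLD_S = 5.0   # average responses at or below this are not penalized
SLOW_LIMIT_S = 30.0      # average responses this slow receive the full penalty
MIN_HITS = 1.0           # engines averaging at least this many hits are not penalized


@dataclass
class EngineStats:
    """Recent behaviour of one remote search engine (illustrative fields)."""
    avg_response_s: float  # mean response time over recent queries, in seconds
    avg_hits: float        # mean number of results returned over recent queries


def response_penalty(stats):
    """Zero up to the threshold, then quadratic growth, capped at 1.0."""
    if stats.avg_response_s <= SLOW_THRESHOLD_S:
        return 0.0
    over = (stats.avg_response_s - SLOW_THRESHOLD_S) / (SLOW_LIMIT_S - SLOW_THRESHOLD_S)
    return min(1.0, over ** 2)


def hit_penalty(stats):
    """Zero at or above MIN_HITS, growing quadratically as hits fall to zero."""
    if stats.avg_hits >= MIN_HITS:
        return 0.0
    shortfall = (MIN_HITS - stats.avg_hits) / MIN_HITS
    return shortfall ** 2


def adjusted_score(base_score, stats):
    """Subtract both penalties from a query-based score for the engine."""
    return base_score - response_penalty(stats) - hit_penalty(stats)
```

With these assumed constants, an engine averaging 17.5 seconds per response and 0.5 hits per query loses 0.25 for slowness and 0.25 for sparse results, so a base score of 2.0 becomes 1.5. Using a threshold means engines that respond quickly enough are not distinguished by speed at all; only the slow ones are pushed down the ranking.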

Published in

ACM Transactions on Information Systems, Volume 15, Issue 3
July 1997
126 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/256163

Copyright © 1997 ACM

Publisher
Association for Computing Machinery
New York, NY, United States

Publication History
- Published: 1 July 1997

Author Tags
WWW
information retrieval
machine learning
search engine
Qualifiers
- article
Article Metrics
- Total Citations: 121
- Total Downloads: 2,874
- Downloads (Last 12 months): 68
- Downloads (Last 6 weeks): 9