DOI: 10.5555/963600.963704

A multithreaded Java framework for information extraction in the context of enterprise application integration

Published: 24 September 2003

ABSTRACT

In this paper, we present a new multithreaded framework for information extraction with Java in heterogeneous enterprise application environments. It frees the developer from the error-prone task of low-level thread programming. We demonstrate the framework by extracting product prices from web sites, although it is useful for numerous other purposes as well. Its strong points are performance, continuous feedback, and adherence to maximum response times. The description of the framework uses UML modeling techniques to visualize multithreading. Moreover, we address the problems Java poses when stopping running threads.
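The abstract names three concerns that a short sketch can make concrete: parallel extraction, a maximum response time, and stopping threads that are still running. The following is not the paper's framework, whose design is not reproduced on this page. It is a minimal illustration that assumes the `java.util.concurrent` API (added to Java after this paper was published) and cooperative cancellation through interruption. The class name `PriceExtractionSketch`, the price pattern, and the fetcher map are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceExtractionSketch {

    // Hypothetical pattern: a dollar sign followed by digits and two decimals.
    private static final Pattern PRICE = Pattern.compile("\\$(\\d+\\.\\d{2})");

    /** Returns the first price found in the page text, or null if none matches. */
    static String extractPrice(String page) {
        Matcher m = PRICE.matcher(page);
        return m.find() ? m.group(1) : null;
    }

    /**
     * Runs one fetch-and-extract task per source in parallel and waits at most
     * timeoutMillis. Sources that miss the deadline, or whose fetch throws,
     * map to null in the result.
     */
    static Map<String, String> extractAll(Map<String, Callable<String>> fetchers,
                                          long timeoutMillis) throws InterruptedException {
        List<String> names = new ArrayList<>(fetchers.keySet());
        List<Callable<String>> tasks = new ArrayList<>();
        for (String name : names) {
            Callable<String> fetch = fetchers.get(name);
            tasks.add(() -> extractPrice(fetch.call()));
        }
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            // invokeAll returns futures in task order and cancels (interrupts)
            // every task that has not completed when the timeout expires.
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            Map<String, String> prices = new LinkedHashMap<>();
            for (int i = 0; i < names.size(); i++) {
                Future<String> future = futures.get(i);
                String price = null;
                if (!future.isCancelled()) {
                    try {
                        price = future.get();
                    } catch (ExecutionException e) {
                        // The fetch itself failed; report no price for this source.
                    }
                }
                prices.put(names.get(i), price);
            }
            return prices;
        } finally {
            pool.shutdownNow(); // interrupt any worker that is still running
        }
    }
}
```

A caller would pass one `Callable<String>` per source that returns the page text, for example `fetchers.put("shopA", () -> download(urlA))` with a hypothetical `download` helper. Interruption only stops a task that cooperates, that is, one that blocks in an interruptible call or checks `Thread.interrupted()`. A task that ignores interrupts keeps running after the deadline, even though its result is discarded.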

Published in

ISICT '03: Proceedings of the 1st international symposium on Information and communication technologies
September 2003
614 pages

Publisher

Trinity College Dublin
