
A comparative analysis of methodologies for database schema integration

Published: 11 December 1986

Abstract

One of the fundamental principles of the database approach is that a database allows a nonredundant, unified representation of all data managed in an organization. This is achieved only when methodologies are available to support integration across organizational and application boundaries.

Methodologies for database design usually perform the design activity by separately producing several schemas, representing parts of the application, which are subsequently merged. Database schema integration is the activity of integrating the schemas of existing or proposed databases into a global, unified schema.

The aim of the paper is to provide first a unifying framework for the problem of schema integration, then a comparative review of the work done thus far in this area. Such a framework, with the associated analysis of the existing approaches, provides a basis for identifying strengths and weaknesses of individual methodologies, as well as general guidelines for future improvements and extensions.

Reviews

Csaba Joseph Egyhazy

Schema integration, as defined by the authors, occurs in two contexts: (1) view integration (in database design), which produces a global conceptual description of a proposed database; and (2) database integration (in distributed database management), which produces the global schema of a collection of databases. To convey the complexity of schema integration, the authors begin by identifying some of the causes of schema diversity. They then compare 12 established schema integration methodologies along a number of dimensions. The issue of the diversity of data models is resolved by adopting a uniform treatment of concepts, based primarily on the entity-relationship model. The problem of conforming schemas then amounts to resolving type, dependency, key, and behavioral conflicts. Resolving these conflicts leads to schema transformations, an activity usually undertaken by designers in close collaboration with users. However, as the authors note, most of the schema transformations are geared toward removing redundancy, as opposed to simplification or logical optimization. One of the most disturbing revelations of the paper, which I found particularly noteworthy, is the absence of specialized languages or data structures for automating at least some of the four major schema integration activities common to all 12 methodologies identified in the paper. Additionally, only a few of these methodologies provide explicit tools or procedures for carrying out conflict resolution beyond renaming, redundancy elimination, and generalization. The more difficult conflicts, such as those involving integrity constraints and language and data structure incompatibilities, remain, for the most part, unresolved.
Finally, as noted by the authors: “none [of these methodologies] provide an analysis or proof of the completeness of the schema transformation operations from the standpoint of being able to resolve any type of conflict that can arise.” This leads to the conclusion that none of the methodologies is based on any established mathematical theory; they merely define a consensus schema, possibly changing some user views in the process. This approach, first suggested over ten years ago, should be challenged by the slowly growing influence of applied database logic among schema integration researchers.
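As a rough illustration of the two simplest resolution steps the review mentions, renaming and redundancy elimination, the following sketch merges two toy views into one schema. It is not taken from the paper or from any of the 12 methodologies: the `Schema` representation (entity name mapped to a set of attribute names), the functions `rename` and `integrate`, and the sample views are all hypothetical, and type, key, dependency, and behavioral conflicts are deliberately ignored.

```python
from typing import Dict, Set

# Hypothetical representation: entity name -> set of attribute names.
Schema = Dict[str, Set[str]]


def rename(schema: Schema, synonyms: Dict[str, str]) -> Schema:
    """Resolve naming conflicts by mapping entity names to agreed names."""
    renamed: Schema = {}
    for entity, attrs in schema.items():
        target = synonyms.get(entity, entity)
        renamed.setdefault(target, set()).update(attrs)
    return renamed


def integrate(a: Schema, b: Schema, synonyms: Dict[str, str]) -> Schema:
    """Merge two views after renaming; set union drops duplicate attributes."""
    merged: Schema = {}
    for schema in (rename(a, synonyms), rename(b, synonyms)):
        for entity, attrs in schema.items():
            merged.setdefault(entity, set()).update(attrs)
    return merged


# Two assumed user views that name the same concept differently.
view_sales: Schema = {
    "Customer": {"name", "address"},
    "Order": {"order_no", "date"},
}
view_billing: Schema = {
    "Client": {"name", "credit_limit"},
    "Invoice": {"invoice_no"},
}

# The designer supplies the correspondence "Client" means "Customer".
global_schema = integrate(view_sales, view_billing, {"Client": "Customer"})
```

In this sketch the correspondence between `Client` and `Customer` has to be supplied by hand, which mirrors the review's point that conflict resolution is usually carried out by designers rather than automated.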

  • Published in

    ACM Computing Surveys  Volume 18, Issue 4
    Dec. 1986
    74 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/27633

    Copyright © 1986 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States
