ABSTRACT
Office applications such as OpenOffice and Microsoft Office are used to edit the majority of today's business documents: office documents. Version control systems usually treat office documents as binary objects, which severely hinders collaborative work. Since XML has become a de facto standard for office applications, we focus on versioning office documents with structured XML version control approaches. This enables state-of-the-art version control for office documents.

A basic prerequisite for XML version control is a diff algorithm that detects structural changes between XML documents. In this paper, we evaluate state-of-the-art XML diff algorithms with respect to their suitability for OpenOffice XML documents and the future OASIS office document standard. It turns out that, due to the specific XML office format, a careful examination of the diff algorithm characteristics is necessary. We therefore identify the features that an XML diff approach needs in order to handle office documents. We have implemented a first OpenOffice versioning API that can be used in version control systems as a replacement for the line-based or binary diffs currently in use.
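To make the notion of a structural diff concrete, the following is a minimal sketch in Python. It is not one of the algorithms evaluated in the paper and not the OpenOffice versioning API mentioned above. It is a naive, position-by-position comparison of two XML trees using the standard library, and the function name `diff_trees` and the shape of the change tuples are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def diff_trees(old, new, path="/"):
    """Naive positional comparison of two elements (illustrative only).

    Returns a list of (operation, location, detail) tuples. Children are
    matched by position, so moved subtrees are not detected, and tail
    text is ignored.
    """
    here = f"{path}{old.tag}"
    if old.tag != new.tag:
        # Different element names: report the whole subtree as replaced.
        return [("replace", here, new.tag)]
    changes = []
    if old.attrib != new.attrib:
        changes.append(("attrib", here, dict(new.attrib)))
    if (old.text or "").strip() != (new.text or "").strip():
        changes.append(("text", here, (new.text or "").strip()))
    old_kids, new_kids = list(old), list(new)
    # Recurse into children that exist at the same position in both trees.
    for i, (o, n) in enumerate(zip(old_kids, new_kids)):
        changes.extend(diff_trees(o, n, f"{here}[{i}]/"))
    # Children present only in the old tree were deleted.
    for i in range(len(new_kids), len(old_kids)):
        changes.append(("delete", f"{here}[{i}]/{old_kids[i].tag}", None))
    # Children present only in the new tree were inserted.
    for i in range(len(old_kids), len(new_kids)):
        changes.append(("insert", f"{here}[{i}]/{new_kids[i].tag}", None))
    return changes


if __name__ == "__main__":
    old = ET.fromstring("<doc><p>Hello</p><p>World</p></doc>")
    new = ET.fromstring("<doc><p>Hello</p><p>There</p><p>New</p></doc>")
    for change in diff_trees(old, new):
        print(change)
```

Unlike a line-based or binary diff, the result is expressed in terms of elements, attributes, and text nodes. A positional comparison like this one breaks down as soon as a paragraph is inserted in the middle of a document, which is one reason the characteristics of the diff algorithm need careful examination.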
Index Terms
- Towards XML version control of office documents