ABSTRACT
Portable Document Format (PDF) is a page-oriented, graphically rich format based on PostScript semantics, and it is also the format interpreted by the Adobe Acrobat viewers. Although each of the pages in a PDF document is an independent graphic object, this property does not necessarily extend to the components (headings, diagrams, paragraphs, etc.) within a page. This, in turn, makes the manipulation and extraction of graphic objects on a PDF page a very difficult and uncertain process. The work described here investigates the advantages of a model wherein PDF pages are created from assemblies of COGs (Component Object Graphics), each with a clearly defined graphic state. The relative positioning of COGs on a PDF page is determined by appropriate 'spacer' objects, and a traversal of the tree of COGs and spacers determines the rendering order. The enhanced revisability of PDF documents within the COG model is discussed, together with the application of the model in those contexts which require easy revisability coupled with the ability to maintain and amend PDF document structure.
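The idea of a tree of COGs and spacers whose traversal fixes the rendering order can be illustrated with a minimal sketch. The abstract does not specify the data structures, so everything here is an assumption made for illustration: the class names `COG`, `Spacer` and `Group`, their fields, the depth-first traversal, and the convention that a spacer simply offsets the current position.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class COG:
    """A self-contained graphic block (hypothetical fields)."""
    name: str
    content: str = ""  # stands in for the block's own drawing operators


@dataclass
class Spacer:
    """Moves the current position without drawing anything (assumed semantics)."""
    dx: float = 0.0
    dy: float = 0.0


@dataclass
class Group:
    """Interior tree node holding COGs, spacers, or nested groups."""
    children: List[Union["Group", COG, Spacer]] = field(default_factory=list)


Node = Union[Group, COG, Spacer]


def render_order(root: Node,
                 origin: Tuple[float, float] = (0.0, 0.0)
                 ) -> List[Tuple[str, float, float]]:
    """Walk the tree depth-first, left to right.

    Returns (name, x, y) for each COG in the order it would be drawn.
    """
    placed: List[Tuple[str, float, float]] = []
    x, y = origin

    def visit(node: Node) -> None:
        nonlocal x, y
        if isinstance(node, COG):
            placed.append((node.name, x, y))
        elif isinstance(node, Spacer):
            x += node.dx
            y += node.dy
        else:
            for child in node.children:
                visit(child)

    visit(root)
    return placed


if __name__ == "__main__":
    page = Group([
        Spacer(dx=72, dy=720),   # move to the top-left of the text area
        COG("heading"),
        Spacer(dy=-36),
        Group([COG("paragraph"), Spacer(dy=-140), COG("diagram")]),
    ])
    for name, x, y in render_order(page):
        print(f"{name}: ({x}, {y})")
```

In this sketch each COG carries no position of its own; its placement follows entirely from the spacers that precede it in the traversal, so replacing or resizing one block only requires adjusting the neighbouring spacers.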
Index Terms
- Creating reusable well-structured PDF as a sequence of component object graphic (COG) elements