PDF/A Support in the PDFlib Product Family

PDFlib GmbH introduced PDF/A functionality in its products in 2006. PDFlib products were the first with support for XMP extension schemas. All products in the latest version 9 of the PDFlib product family support all flavors of PDF/A-1, PDF/A-2 and PDF/A-3. The product family provides application developers with a toolkit for the following PDF/A-related operations:

create PDF/A from scratch, e.g. based on text from a database

convert raster images (e.g. scans) to PDF/A

process existing PDF/A documents, e.g. merge or split

work with ICC profiles and device-independent color to deal with all color management issues

create PDF/A level A with structure information (Tagged PDF), also in combination with PDF/UA

attach XMP metadata to the generated documents, including XMP extension schemas

query the glyph names in fonts to derive suitable ActualText for symbolic fonts

attach PDF/A documents to PDF/A-2 documents, or files of arbitrary type to PDF/A-3 documents

All of these operations can be implemented with simple PDFlib function calls. Sample code for a variety of programming languages and development environments is provided with the PDFlib distribution. Additional programming techniques for PDF/A are available in the PDFlib Cookbook. To facilitate font embedding as required by PDF/A, the Japanese Resource Kit for PDFlib includes common Japanese fonts. The license for these fonts permits embedding and is included in the software license.

Creating PDF/A with PDFlib

Creating PDF/A-conforming output with PDFlib is achieved by the following means:

PDFlib automatically takes care of several formal settings for PDF/A, such as PDF version number and required XMP identification entries.

The PDFlib client program must explicitly use certain function calls and options (e.g. for font embedding).

The PDFlib client program must refrain from using certain other function calls and option settings (e.g. encryption).

If the PDFlib client program obeys these rules, valid PDF/A output is guaranteed. If PDFlib detects a violation of the PDF/A creation rules, it throws an exception which must be handled by the application. No PDF output is created in case of an error, so there is no risk of creating non-conforming output. Details of required and prohibited operations are discussed in the PDFlib documentation.
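The rules above might translate into client code along the following lines. This is a minimal sketch, not taken from the PDFlib distribution. It assumes the object-oriented Python binding, where `p` is a PDFlib object. The function name, the `PDF/A-2b` flavor, the page size and the font name are illustrative choices, and the option lists should be checked against the PDFlib API reference for the version in use.

```python
def create_pdfa_document(p, outfile, text,
                         flavor="PDF/A-2b", fontname="NotoSerif-Regular"):
    """Write a one-page PDF/A document using the PDFlib object `p`.

    `fontname` is a placeholder: it must name a font file that PDFlib
    can locate and that may legally be embedded.
    """
    # The pdfa option asks PDFlib to apply the formal PDF/A settings
    # (PDF version number, XMP identification entries) on its own.
    if p.begin_document(outfile, "pdfa=" + flavor) == -1:
        raise RuntimeError("cannot create output: " + p.get_errmsg())

    # Set an sRGB output intent so that device-dependent colors
    # (including the default black text color) are well defined.
    if p.load_iccprofile("sRGB", "usage=outputintent") == -1:
        raise RuntimeError("cannot set output intent: " + p.get_errmsg())

    p.begin_page_ext(595, 842, "")  # A4 page size in points

    # Explicitly required by the client: fonts must be embedded.
    font = p.load_font(fontname, "unicode", "embedding")
    if font == -1:
        raise RuntimeError("cannot load font: " + p.get_errmsg())

    p.fit_textline(text, 50, 700, "font=%d fontsize=12" % font)
    p.end_page_ext("")
    p.end_document("")
```

In real use, `p` would be an instance of the PDFlib class from the installed binding, and the call would sit inside an exception handler, since PDFlib reports PDF/A rule violations by throwing an exception.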

Processing PDF/A with PDFlib

Additional rules apply when importing pages from existing PDF/A-conforming documents. When dealing with existing PDF/A documents, PDFlib+PDI carefully examines the PDF/A properties of all input and output documents to make sure that the output still conforms to PDF/A. For additional control the output intent of an imported document can be copied to the output PDF, effectively cloning the PDF/A color properties of an existing document. Similarly, XMP metadata from imported documents can be cloned or merged.
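Importing pages and cloning the output intent could look roughly like this. The sketch again assumes the object-oriented Python binding with `p` as a PDFlib object and a PDFlib+PDI license. The function name and parameters are hypothetical, the caller must pass a `flavor` that is compatible with the input document, and XMP cloning is left out. The call order follows the usual PDI pattern, but the exact scopes and options should be verified in the PDFlib API reference.

```python
def copy_pdfa_pages(p, infile, outfile, flavor):
    """Copy all pages of an existing PDF/A document into a new one.

    Returns the number of pages copied.
    """
    if p.begin_document(outfile, "pdfa=" + flavor) == -1:
        raise RuntimeError("cannot create output: " + p.get_errmsg())

    indoc = p.open_pdi_document(infile, "")
    if indoc == -1:
        raise RuntimeError("cannot open input: " + p.get_errmsg())

    # Clone the PDF/A color properties of the input document instead of
    # loading a separate ICC profile as output intent.
    p.process_pdi(indoc, -1, "action=copyoutputintent")

    # pCOS query for the number of pages in the input document.
    page_count = int(p.pcos_get_number(indoc, "length:pages"))

    for pageno in range(1, page_count + 1):
        page = p.open_pdi_page(indoc, pageno, "")
        if page == -1:
            raise RuntimeError("cannot open page: " + p.get_errmsg())

        # Dummy page size; "adjustpage" resizes the output page to the
        # dimensions of the imported page.
        p.begin_page_ext(10, 10, "")
        p.fit_pdi_page(page, 0, 0, "adjustpage")
        p.close_pdi_page(page)
        p.end_page_ext("")

    p.close_pdi_document(indoc)
    p.end_document("")
    return page_count
```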

Creating PDF/A level A with PDFlib

PDF/A conformance level A can be regarded as level B plus Tagged PDF. PDFlib’s support for PDF/A level A is based on its features for producing Tagged PDF. Each content item can be placed at a particular location in the document’s structure tree. Content items which are not relevant for the document structure (e.g. headers and footers, pagination) can be tagged as artifacts, which means that they are ignored when the document is read aloud by software or converted to some other format. Alternative text can be attached to images and vector graphics. PDFlib automatically tags tables and artifacts, which saves the developer considerable time. PDFlib checks the supplied tags to make sure that the structure element nesting and attributes conform to ISO 32000-1. For example, heading or list tags must be properly nested.
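To make the idea of a nesting check concrete, here is a small library-independent illustration. It is not PDFlib's implementation and does not call PDFlib. It models only the list-related standard structure types of ISO 32000-1 (an `L` element contains `LI` items and an optional `Caption`; an `LI` contains `Lbl` and `LBody`). The tuple representation and all names are assumptions made for this sketch.

```python
# Allowed direct children for the list-related structure types.
# Tags that are not listed here are not checked in this sketch.
ALLOWED_CHILDREN = {
    "L": {"LI", "Caption"},
    "LI": {"Lbl", "LBody"},
}


def find_nesting_errors(element, path=""):
    """Return messages for list elements with disallowed direct children.

    `element` is a (tag, children) tuple, where `children` is a list of
    elements of the same shape.
    """
    tag, children = element
    here = path + "/" + tag
    allowed = ALLOWED_CHILDREN.get(tag)
    errors = []
    for child in children:
        child_tag = child[0]
        if allowed is not None and child_tag not in allowed:
            errors.append(child_tag + " not allowed directly inside " + here)
        # Descend so that problems deeper in the tree are reported too.
        errors.extend(find_nesting_errors(child, here))
    return errors


# A properly nested list, and a list with a paragraph as direct child.
good = ("Document", [("L", [("LI", [("Lbl", []), ("LBody", [])])])])
bad = ("Document", [("L", [("P", [])])])
```

For the `good` tree the function returns an empty list. For the `bad` tree it reports the `P` element, since a paragraph inside a list belongs in the `LBody` of an `LI`.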

Integrated support for PDF/UA makes it easy to create PDF output which is both accessible and archivable. Note that you need detailed knowledge about the document’s logical structure in order to create Tagged PDF. PDFlib takes care of the PDF-related details, but it cannot infer the document structure from its contents.