PDF/A Support in the PDFlib Product Family
PDFlib GmbH introduced PDF/A functionality in its products in 2006. PDFlib products were the first with support for XMP extension schemas. All products in the PDFlib product family support all flavors of PDF/A-1, PDF/A-2 and PDF/A-3 (PDF/A-4 support is in development). The product family provides application developers with a toolkit for the following PDF/A-related operations:
- create PDF/A from scratch, e.g. based on text from a database
- convert raster images (e.g. scans) to PDF/A
- process existing PDF/A documents, e.g. merge or split
- work with ICC profiles and device-independent color to deal with all color management issues
- create PDF/A level A with structure information (Tagged PDF), also in combination with PDF/UA
- assemble Tagged PDF/A from existing tagged pages
- attach XMP metadata to the generated documents, including XMP extension schemas
- attach PDF/A documents to PDF/A-2 documents, or files of arbitrary type to PDF/A-3 documents
All of these operations can be implemented with simple PDFlib calls. Sample code for a variety of programming languages and development environments is provided with the PDFlib distribution. Additional programming techniques for PDF/A are available in the PDFlib Cookbook.
Creating PDF/A with PDFlib
Creating PDF/A-conforming output with PDFlib is achieved by the following means:
- PDFlib automatically takes care of several formal settings for PDF/A, such as PDF version number and required XMP identification entries.
- The PDFlib application program must explicitly use certain function calls and options (e.g. for font embedding).
- The PDFlib application program must refrain from using certain other function calls and option settings (e.g. encryption).
If the PDFlib application program obeys these rules, valid PDF/A output is guaranteed. If PDFlib detects a violation of the PDF/A creation rules, it throws an exception which must be handled by the application. No PDF output is created in case of an exception, so there is no risk of creating non-conforming output. Details of required and prohibited operations are discussed in the PDFlib documentation.
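The rules above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not an excerpt from the PDFlib distribution: it assumes PDFlib 9 or later with its Python binding installed, the import path may differ between versions, and the helper names pdfa_optlist and create_pdfa as well as the font name passed by the caller are hypothetical. The font must be available to PDFlib so that it can be embedded.

```python
# Assumption: PDFlib 9 or later with its Python binding; the import path
# below may differ between PDFlib versions.
try:
    from PDFlib.PDFlib import PDFlib, PDFlibException
except ImportError:
    # Keeps the option helper usable when the binding is not installed.
    PDFlib = None
    PDFlibException = Exception

# Flavor keywords for the "pdfa" option of begin_document()
# (assumed from the PDFlib 9 documentation).
PDFA_FLAVORS = {
    "PDF/A-1a:2005", "PDF/A-1b:2005",
    "PDF/A-2a", "PDF/A-2b", "PDF/A-2u",
    "PDF/A-3a", "PDF/A-3b", "PDF/A-3u",
}


def pdfa_optlist(flavor):
    """Hypothetical helper: build the begin_document() option list."""
    if flavor not in PDFA_FLAVORS:
        raise ValueError("unknown PDF/A flavor: " + flavor)
    return "pdfa=" + flavor


def create_pdfa(outfile, fontname, flavor="PDF/A-2b"):
    """Hypothetical helper: write a one-page PDF/A document.

    A PDFlibException propagates to the caller if a PDF/A rule is violated.
    """
    if PDFlib is None:
        raise RuntimeError("PDFlib Python binding is not installed")
    p = PDFlib()
    try:
        # Report all errors as exceptions instead of return codes.
        p.set_option("errorpolicy=exception")
        # Requesting a PDF/A flavor lets PDFlib handle the formal settings.
        p.begin_document(outfile, pdfa_optlist(flavor))
        # Required explicitly: an output intent (sRGB is built into PDFlib).
        p.load_iccprofile("sRGB", "usage=outputintent")
        p.begin_page_ext(595, 842, "")  # A4 page size in points
        # Required explicitly: the font must be embedded.
        font = p.load_font(fontname, "unicode", "embedding")
        p.setfont(font, 12)
        p.fit_textline("Hello, PDF/A", 50, 750, "")
        p.end_page_ext("")
        p.end_document("")
    finally:
        p.delete()
```

A caller would wrap create_pdfa in a try block and handle PDFlibException, for example by logging the message and skipping the document.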
Processing PDF/A with PDFlib+PDI
Additional rules apply when importing pages from existing PDF/A-conforming documents. When dealing with existing PDF/A documents, PDFlib+PDI carefully examines the PDF/A properties of all input and output documents to make sure that the output still conforms to PDF/A. For additional control the output intent of an imported document can be copied to the output PDF, effectively cloning the PDF/A color properties of an existing document. Similarly, XMP metadata from imported documents can be cloned or merged.
Creating PDF/A level A with PDFlib
PDF/A conformance level A can be regarded as level B plus Tagged PDF. PDFlib’s support for PDF/A level A is based on its features for producing Tagged PDF. Each content item can be placed at a particular location in the document’s structure tree. Content items which are not relevant for the document structure (e.g. headers and footers, pagination) can be tagged as Artifacts, which means that they will be ignored when the document is read aloud by software or converted to some other format. Alternative text can be attached to images and vector graphics. PDFlib automatically tags tables and Artifacts, which saves the developer considerable time. PDFlib checks the supplied tags to make sure that the structure element nesting and attributes conform to ISO 32000. For example, heading or list tags must be properly nested.
Integrated support for PDF/UA makes it easy to create PDF output which is both accessible and archivable. Note that you need detailed knowledge about the document’s logical structure in order to create Tagged PDF. PDFlib takes care of the PDF-related details, but it cannot infer the document structure from its contents.
PDF/A-conforming signatures with PLOP DS
PDFlib PLOP DS is a toolkit for applying digital signatures to PDF documents according to the PAdES signature standards, which are required for signatures under the European eIDAS regulation. PLOP DS applies signatures to PDF/A documents such that the signed output also conforms to PDF/A.