PDF 2.0 (ISO 32000-2): New Capabilities

PDF 2.0 brings several new features and enhancements, notably for Tagged PDF and print production.

Tagged PDF

Several changes and extensions for Tagged PDF have been introduced in PDF 2.0, mainly triggered by the extensive discussions around the PDF/UA standard:

The relationship and nesting rules of standard structure elements are now specified explicitly in tabular format. In ISO 32000-1 (PDF 1.7) the tag nesting rules were incomplete (PDF/UA fills some of these gaps) and spread over a large amount of text. In the process of formulating the new tag nesting rules some redundancy within the existing tag set has been eliminated. For example, the Art, Index, Quote, BibEntry and other elements are no longer encouraged in PDF 2.0.

Tag namespaces akin to XML namespaces can be used to group custom tag names and map them to standard structure element types. The namespace for MathML is already predefined in PDF 2.0.
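The mapping of custom tags to standard types can be pictured as a lookup that is followed until a known namespace is reached. The following is only a minimal sketch: the custom namespace URI, the tag names and the dictionary layout are hypothetical, and the URI used for the PDF 2.0 standard structure namespace is an assumption that should be checked against the specification.

```python
# Assumed URI of the PDF 2.0 standard structure namespace (verify against ISO 32000-2).
PDF2_NS = "http://iso.org/pdf2/ssn"
# W3C namespace URI for MathML.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
KNOWN_NAMESPACES = {PDF2_NS, MATHML_NS}

# Hypothetical custom namespace with a role map: custom tag -> (target tag, target namespace).
CUSTOM_NS = "http://example.com/ns/report"
ROLE_MAPS = {
    CUSTOM_NS: {
        "Chapter": ("Sect", PDF2_NS),
        "Callout": ("Aside", PDF2_NS),
        "Formula": ("math", MATHML_NS),
    }
}


def resolve_tag(tag, namespace, role_maps=ROLE_MAPS, max_hops=16):
    """Follow role-map entries until the tag lives in a known namespace."""
    hops = 0
    while namespace not in KNOWN_NAMESPACES:
        if hops >= max_hops:
            raise ValueError("role map chain is too long or circular")
        mapping = role_maps.get(namespace, {})
        if tag not in mapping:
            raise KeyError(f"no role mapping for {tag!r} in {namespace!r}")
        tag, namespace = mapping[tag]
        hops += 1
    return tag, namespace


print(resolve_tag("Callout", CUSTOM_NS))
```

The hop limit guards against circular mappings; a real processor would read the namespace and role-map dictionaries from the structure tree instead of Python literals.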

The GoTo and GoToR actions have been extended to allow links to a specific structure element instead of to a fixed location on a page. If the target structure element is located in an external document, it is identified via its ID attribute (similar to named destinations).

New structure element types DocumentFragment for parts of a document, Aside for sidebars and similar content, and Ref for references.

New attributes for the type of List elements and the short name of table header cells.

New structure element Artifact allows irrelevant content to be included in the structure tree although it is ignored. This is useful for maintaining correct ordering of Artifacts and real content, e.g. for line numbers. Tagging with the structure element Artifact with inclusion in the structure tree is more powerful than merely marking content as Artifact, and should not be confused with this older method.

New artifact subtypes for page numbers and Bates numbering.

Structure elements can carry pronunciation hints to facilitate automatic pronunciation of uncommon terms. This feature is based on an XML-based W3C format called Pronunciation Lexicon Specification.

Print Production

The following PDF 2.0 features are aimed at print production. They have been included in PDF 2.0 at the request of stakeholders in the graphic arts industry:

As a generalization of the output intent ICC profiles introduced with PDF/X and PDF/A which apply to the whole document, page-level output intents describe the target device on which a particular page will be rendered. This feature is useful for documents where individual pages are intended to be printed on different output devices, e.g. a collection of mailings where the cover page is printed in black and white only, while the inner pages are printed in color. Page-level output intents came up during discussions around PDF/VT, but are not yet part of that standard. Another use case is merging of documents with different output intents where now the appropriate output intent can be maintained for each page.
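The mailing example can be sketched as a simple lookup. This is an illustration only: PDF dictionaries are modeled as plain Python dicts, the profile names are made up, and the rule that a page-level OutputIntents array is used in place of the document-level one is an assumption about how a consumer would resolve the two levels.

```python
def effective_output_intents(page, catalog):
    """Return the page-level output intents if present, else the document-level ones."""
    page_level = page.get("OutputIntents")
    if page_level:
        return page_level
    return catalog.get("OutputIntents", [])


# Hypothetical document: color output intent for the whole document,
# grayscale output intent for the cover page only.
catalog = {"OutputIntents": [{"S": "GTS_PDFX", "OutputConditionIdentifier": "ExampleColorPress"}]}
cover_page = {"OutputIntents": [{"S": "GTS_PDFX", "OutputConditionIdentifier": "ExampleGrayPrinter"}]}
inner_page = {}

for name, page in (("cover", cover_page), ("inner", inner_page)):
    intent = effective_output_intents(page, catalog)[0]
    print(name, "->", intent["OutputConditionIdentifier"])
```

The same lookup covers the merging use case: each page merged from a different source document can keep the output intent it was prepared for.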

Spot colors can be described accurately by attaching spectral data to a document’s output intent dictionary, using the new entry SpectralData. Spot color characterization data must be supplied in the XML-based format CxF/X-4 according to ISO 17972-4:2015 (freely available information about CxF is provided by its inventor, X-Rite). The CxF format characterizes a spot color by its spectral characteristics: the spectral reflectance values measured from a printed characterization chart, together with the ink opacity. Opacity properties of a spot color ink are important for proofing spot colors printed on top of other page content. The International Color Consortium provides a good description and a PDF sample (still based on PDF 1.7).
Spectral data for spot colors is already supported by a few third-party vendors in the graphic arts industry, but currently not in any Adobe applications.

The MixingHints entry in the output intent dictionary describes the order in which inks are applied during the printing process. This so-called printing order or ink laydown order and the optional solidities values are important for creating accurate proofs if the spot colors are unavailable.

Black Point Compensation for CIE-based color conversion according to ISO 18619:2015 can be requested by the new graphics state parameter UseBlackPtComp. ICC profiles specify only methods for dealing with the white point, but not the black point. Black Point Compensation is important for maintaining details in the dark areas of an image, also called the shadows. Black Point Compensation has been available as an optional feature in Adobe Photoshop and other applications for quite a while. The new PDF 2.0 feature offers a similar choice also for PDF documents.
ISO 18619 traces back to the BPC algorithm published by Adobe in 2006 in the paper Adobe Systems’ Implementation of Black Point Compensation. Adobe’s algorithm applies only to the relative colorimetric rendering intent. The algorithm has subsequently been generalized to the perceptual and saturation rendering intents based on suggestions in the ICC White Paper 40 Black-point compensation: theory and application. The latter document also provides a good explanation of BPC, including the nice metaphor of adding fog to an image or removing it. By definition, Black Point Compensation does not apply to the absolute colorimetric rendering intent, which aims at exact reproduction of all colors within the device’s gamut.
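The core idea of Black Point Compensation can be sketched as a per-channel linear scaling in XYZ that maps the source black point to the destination black point while leaving the white point unchanged. The sketch below is a simplification under stated assumptions: the black point values are made up, the function name make_bpc is hypothetical, and the way ISO 18619 estimates the black points of the profiles is not covered.

```python
# D50 white point in XYZ, the ICC profile connection space illuminant.
D50_WHITE = (0.9642, 1.0, 0.8249)


def make_bpc(src_black, dst_black, white=D50_WHITE):
    """Build a function v -> a*v + b per channel with a*src_black + b = dst_black
    and a*white + b = white."""
    scale = [(w - d) / (w - s) for w, s, d in zip(white, src_black, dst_black)]
    offset = [w * (1.0 - a) for w, a in zip(white, scale)]

    def apply(xyz):
        return tuple(a * v + b for a, v, b in zip(scale, xyz, offset))

    return apply


# Made-up black points: the destination device cannot print as dark as the source.
src_black = (0.0034, 0.0035, 0.0029)
dst_black = (0.0300, 0.0311, 0.0257)
bpc = make_bpc(src_black, dst_black)

print(bpc(src_black))   # lands on the destination black point
print(bpc(D50_WHITE))   # white stays where it is
```

Because the destination black is lighter than the source black in this example, the scale factor is below 1: the shadows are compressed into the printable range instead of being clipped, which corresponds to "adding fog" in the metaphor.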

Other Features

The new concept of an unencrypted wrapper document is targeted at situations where a document uses encryption technology outside of ISO 32000-2 (a custom security handler), which means that a viewer may not be able to successfully decrypt the contents. In this situation an unencrypted wrapper document can be provided which describes the cryptographic requirements of the encrypted payload. Using the information in the unencrypted wrapper, the PDF viewer can decide whether it will be possible to decrypt the payload. If the required security handler is not available, the contents of the unencrypted wrapper are presented to the user to explain the situation.

The L (length) key for inline images solves a long-standing parsing problem related to the rarely used concept of inline image data.
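The parsing problem is that, without a length, a parser has to scan the binary image data for the EI operator, and the same bytes may occur inside the data. The sketch below contrasts the two approaches. It is a simplified illustration: the function name is hypothetical, only single-token dictionary values are handled, and accepting the spelled-out key Length next to L is an assumption.

```python
import re

ID_OPERATOR = re.compile(rb"\sID\s")
EI_SCAN = re.compile(rb"\sEI(?=\s|$)")
EI_AFTER_DATA = re.compile(rb"\s*EI(?=\s|$)")


def read_inline_image(stream, pos=0):
    """Parse one inline image starting at 'BI'; return (params, data, next_pos)."""
    if stream[pos:pos + 2] != b"BI":
        raise ValueError("expected BI operator")
    id_match = ID_OPERATOR.search(stream, pos + 2)
    if id_match is None:
        raise ValueError("missing ID operator")
    # Simplification: the dictionary is a flat list of /Key value token pairs.
    tokens = stream[pos + 2:id_match.start()].split()
    params = {tokens[i].lstrip(b"/").decode(): tokens[i + 1].decode()
              for i in range(0, len(tokens) - 1, 2)}
    data_start = id_match.end()
    length = params.get("L", params.get("Length"))
    if length is not None:
        # With a length the data can be skipped without inspecting it.
        data_end = data_start + int(length)
    else:
        # Legacy fallback: guess the end by scanning for EI, which can misfire.
        ei = EI_SCAN.search(stream, data_start)
        if ei is None:
            raise ValueError("missing EI operator")
        data_end = ei.start()
    tail = EI_AFTER_DATA.match(stream, data_end)
    if tail is None:
        raise ValueError("EI operator not found after image data")
    return params, stream[data_start:data_end], tail.end()


# Six bytes of 2x3 grayscale data that happen to contain " EI ".
pixels = b"\x00 EI \xff"
with_length = b"BI /W 2 /H 3 /BPC 8 /CS /G /L 6 ID " + pixels + b" EI Q"
without_length = b"BI /W 2 /H 3 /BPC 8 /CS /G ID " + pixels + b" EI Q"

print(read_inline_image(with_length)[1] == pixels)     # True
print(read_inline_image(without_length)[1] == pixels)  # False: data was cut short
```

In the second call the scan stops at the EI bytes inside the pixel data, so the image is truncated and the remaining bytes would be misread as content stream operators.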

Signature seed values which apply constraints when the document is digitally signed. Some of the new values support creation of explicit policy signatures according to PAdES-EPES (ETSI TS 102 778 part 3).

New values in the document requirements dictionary which stipulates the minimum required functionality which must be provided by the PDF viewer.

Several extensions in PDF 2.0 have been defined for interactive features (in addition to the geospatial and 3D features introduced with Acrobat 9):

Transparency and blend mode can explicitly be specified for annotations.

3D views may contain 3D measurements.

3D cross sections may display parts of 3D artwork with transparent render mode or as if cutting through a solid object.

A new value for tab order allows stepping through the annotations on a page, based on the order of annotations or widgets.

Several algorithm descriptions have been improved in PDF 2.0. This was required to avoid ambiguities where different implementations could render PDF documents differently as far as certain constructs were concerned:

rendering details, e.g. the representation of a path which consists only of a single point;

special cases involving the calculations for transparent objects;

separation and overprint simulation, also called output preview;

more explicit descriptions of the details of the encryption schemes;

the convoluted description of Tagged PDF has been completely rewritten and is presented in a more readable way.