XMP in PDF/A

PDF/A requires XMP metadata to identify a PDF document as conforming to the PDF/A standard. XMP support in PDF/A-1 is based on the XMP 2004 specification: all properties from the predefined schemas in XMP 2004 can be used directly in conforming documents. XMP support in PDF/A-2 and PDF/A-3 is based on the XMP 2005 specification.
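As a minimal sketch of how this identification can be read from an XMP packet, the following Python snippet uses only the standard library. It assumes the PDF/A identification schema with the namespace URI http://www.aiim.org/pdfa/ns/id/ and its properties pdfaid:part and pdfaid:conformance, which are not listed in this article. The function name read_pdfa_id and the sample packet are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: namespace of the PDF/A identification schema (not listed in this article).
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"

# Hypothetical XMP packet, reduced to the PDF/A identification properties.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <pdfaid:part>1</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_pdfa_id(xmp_text):
    """Return (part, conformance) from an XMP packet, or None if not present."""
    root = ET.fromstring(xmp_text)
    part_tag = f"{{{PDFAID_NS}}}part"
    conf_tag = f"{{{PDFAID_NS}}}conformance"
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # RDF/XML allows simple properties as child elements or as attributes.
        part = desc.findtext(part_tag) or desc.get(part_tag)
        conf = desc.findtext(conf_tag) or desc.get(conf_tag)
        if part:
            return part.strip(), (conf or "").strip()
    return None


print(read_pdfa_id(SAMPLE_XMP))
```

The sketch only reads the claimed conformance level; it does not check whether the document actually conforms.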

The table below summarizes the names, namespace URIs and preferred namespace prefixes for the predefined XMP schemas as defined in XMP 2004. The names and descriptions of all properties in these predefined schemas can be found in the XMP 2004 specification (unfortunately, the official version of this document is no longer available online).

schema name and description | namespace URI | preferred namespace prefix
Adobe PDF schema | "http://ns.adobe.com/pdf/1.3/" | pdf
Dublin Core schema | "http://purl.org/dc/elements/1.1/" | dc
EXIF schema for EXIF-specific properties | "http://ns.adobe.com/exif/1.0/" | exif
EXIF schema for TIFF properties | "http://ns.adobe.com/tiff/1.0/" | tiff
Photoshop schema | "http://ns.adobe.com/photoshop/1.0/" | photoshop
XMP Basic Job Ticket schema | "http://ns.adobe.com/xap/1.0/bj/" | xmpBJ
XMP Basic schema | "http://ns.adobe.com/xap/1.0/" | xmp
XMP Media Management schema | "http://ns.adobe.com/xap/1.0/mm/" | xmpMM
XMP Paged-Text schema | "http://ns.adobe.com/xap/1.0/t/pg/" | xmpTPg
XMP Rights Management schema | "http://ns.adobe.com/xap/1.0/rights/" | xmpRights
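The predefined schemas can be used as a simple lookup when inspecting XMP. The following Python sketch lists the namespaces of top-level properties in an XMP packet that do not belong to a predefined schema and would therefore be candidates for an extension schema description. The function names, the sample packet and the machine namespace are hypothetical. A real checker would also have to exempt the namespaces which PDF/A itself defines.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Preferred prefix -> namespace URI of the predefined XMP 2004 schemas.
PREDEFINED_SCHEMAS = {
    "pdf": "http://ns.adobe.com/pdf/1.3/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "exif": "http://ns.adobe.com/exif/1.0/",
    "tiff": "http://ns.adobe.com/tiff/1.0/",
    "photoshop": "http://ns.adobe.com/photoshop/1.0/",
    "xmpBJ": "http://ns.adobe.com/xap/1.0/bj/",  # trailing slash as in the XMP specification
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "xmpMM": "http://ns.adobe.com/xap/1.0/mm/",
    "xmpTPg": "http://ns.adobe.com/xap/1.0/t/pg/",
    "xmpRights": "http://ns.adobe.com/xap/1.0/rights/",
}

# Hypothetical packet mixing a Dublin Core property with a custom one.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:machine="http://www.example.com/ns/machine/">
   <dc:format>application/pdf</dc:format>
   <machine:serialNumber>4711</machine:serialNumber>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def namespace_of(tag):
    """Extract the namespace URI from an ElementTree tag of the form '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def non_predefined_namespaces(xmp_text):
    """Return the sorted namespace URIs of properties outside the predefined schemas."""
    root = ET.fromstring(xmp_text)
    predefined = set(PREDEFINED_SCHEMAS.values())
    found = set()
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        for prop in desc:  # direct children are the properties (element form only)
            ns = namespace_of(prop.tag)
            if ns not in predefined:
                found.add(ns)
    return sorted(found)


print(non_predefined_namespaces(SAMPLE_XMP))
```

The sketch only looks at properties written as child elements of rdf:Description; properties written in attribute form are ignored.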

XMP extension schemas for PDF/A

While the predefined XMP schemas above cover many general metadata requirements, company- or industry-specific requirements can only be met with custom XMP schemas. For this purpose PDF/A supports so-called extension schemas. An extension schema is a collection of metadata properties for a specific application scenario. To make sure that extension schemas can still be interpreted correctly in the future, PDF/A requires that a description of every extension schema used in a document is embedded in the XMP. This description is stored using the so-called extension schema container schema and contains the name and description of all properties as well as their XMP data types. It must be provided in a formal way using the XMP schemas and properties detailed in PDF/A. The namespaces for extension schema descriptions are summarized in the table below.

schema name and description | namespace URI | required namespace prefix
PDF/A extension schema container schema | "http://www.aiim.org/pdfa/ns/extension/" | pdfaExtension
PDF/A field type schema | "http://www.aiim.org/pdfa/ns/field#" | pdfaField
PDF/A property value type | "http://www.aiim.org/pdfa/ns/property#" | pdfaProperty
PDF/A schema value type | "http://www.aiim.org/pdfa/ns/schema#" | pdfaSchema
PDF/A ValueType value type | "http://www.aiim.org/pdfa/ns/type#" | pdfaType
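To illustrate how these namespaces work together, the following Python sketch embeds a small extension schema description and extracts the described properties with their XMP data types. The layout of the description (a bag of schemas, each with a sequence of properties) follows our understanding of the PDF/A rules and should be checked against the standard and TechNote 0009. The machine schema, its namespace URI, its properties and the function name described_properties are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "pdfaExtension": "http://www.aiim.org/pdfa/ns/extension/",
    "pdfaSchema": "http://www.aiim.org/pdfa/ns/schema#",
    "pdfaProperty": "http://www.aiim.org/pdfa/ns/property#",
}

# Hypothetical extension schema description for a "machine" schema.
SAMPLE_DESCRIPTION = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
    xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
    xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">
   <pdfaExtension:schemas>
    <rdf:Bag>
     <rdf:li rdf:parseType="Resource">
      <pdfaSchema:schema>Machine schema</pdfaSchema:schema>
      <pdfaSchema:namespaceURI>http://www.example.com/ns/machine/</pdfaSchema:namespaceURI>
      <pdfaSchema:prefix>machine</pdfaSchema:prefix>
      <pdfaSchema:property>
       <rdf:Seq>
        <rdf:li rdf:parseType="Resource">
         <pdfaProperty:name>serialNumber</pdfaProperty:name>
         <pdfaProperty:valueType>Text</pdfaProperty:valueType>
         <pdfaProperty:category>external</pdfaProperty:category>
         <pdfaProperty:description>Serial number of the machine</pdfaProperty:description>
        </rdf:li>
        <rdf:li rdf:parseType="Resource">
         <pdfaProperty:name>buildDate</pdfaProperty:name>
         <pdfaProperty:valueType>Date</pdfaProperty:valueType>
         <pdfaProperty:category>external</pdfaProperty:category>
         <pdfaProperty:description>Date of manufacture</pdfaProperty:description>
        </rdf:li>
       </rdf:Seq>
      </pdfaSchema:property>
     </rdf:li>
    </rdf:Bag>
   </pdfaExtension:schemas>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def described_properties(xmp_text):
    """Map each described schema namespace URI to {property name: value type}."""
    root = ET.fromstring(xmp_text)
    result = {}
    for schema in root.findall(".//pdfaExtension:schemas/rdf:Bag/rdf:li", NS):
        uri = schema.findtext("pdfaSchema:namespaceURI", default="", namespaces=NS)
        props = {}
        for prop in schema.findall("pdfaSchema:property/rdf:Seq/rdf:li", NS):
            name = prop.findtext("pdfaProperty:name", default="", namespaces=NS)
            value_type = prop.findtext("pdfaProperty:valueType", default="", namespaces=NS)
            props[name] = value_type
        result[uri] = props
    return result


print(described_properties(SAMPLE_DESCRIPTION))
```

The pdfaType and pdfaField namespaces only come into play when a schema defines custom value types, which this sample does not.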

Sample extension schemas for PDF/A

We provide several sample extension schemas which may serve as a starting point for creating your own PDF/A-conforming extension schemas. They include a human-readable description of the custom XMP schemas and properties, as well as the corresponding machine-readable description according to the rules set forth by PDF/A. The XMP files include the schema description required for PDF/A as well as a sample dataset, i.e. the actual metadata which makes use of the extension schema:

  • Machine extension schema 1: a simple schema with a few properties describing hypothetical machinery.
  • Machine extension schema 2: similar to the above, but as an additional feature this schema contains a custom XMP property value type »ArticleNumber«. The fields comprising this structured type must also be included in the schema description for PDF/A.
  • Engineering archive: this schema describes engineering documents which have been archived as PDF/A based on scanned paper documents. The metadata contains details describing the documents (language and reference number) and the scanning process (scan date and name of operator).

Some of these XMP extension schemas for PDF/A are also included in the PDFlib Cookbook, along with PDFlib code that creates PDF/A-conforming output containing the XMP extension schema.

More detailed technical information

The following TechNotes published by the PDF/A Competence Center of the PDF Association deal with XMP metadata in PDF/A-1:

  • Technical Note TN0008: Predefined XMP Properties in PDF/A-1
    This TechNote discusses the XMP properties which can be used in PDF/A-1 without using an extension schema.
  • Technical Note TN0009: XMP Extension Schemas in PDF/A-1
    This TechNote explains the construction of XMP extension schemas and includes details about XMP syntax requirements. It also contains a full example of an XMP extension schema.

These TechNotes are highly recommended reading for designers and implementors of PDF/A solutions. PDF/A support in PDFlib GmbH products is implemented according to the ISO 19005 standard as well as the recommendations in TechNotes 0008 and 0009.

Validating XMP extension schemas for PDF/A

We recommend the RDF validator on the W3C Web site. Since XMP is serialized as a subset of RDF/XML, it must conform to the RDF syntax rules, which the W3C validator checks.
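Before submitting XMP to a validator, a quick local check can catch basic syntax problems. The following Python sketch only tests whether the packet is well-formed XML and whether it contains an rdf:RDF element whose children are all rdf:Description elements, which is our reading of the XMP syntax rules. It is not a substitute for RDF or PDF/A validation, and the function name check_rdf_wrapper is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def check_rdf_wrapper(xmp_text):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xmp_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    rdf_tag = f"{{{RDF_NS}}}RDF"
    # rdf:RDF is either the root itself or a direct child of x:xmpmeta.
    rdf = root if root.tag == rdf_tag else root.find(rdf_tag)
    if rdf is None:
        return ["no rdf:RDF element found"]

    problems = []
    for child in rdf:
        # Assumption: XMP only allows rdf:Description directly inside rdf:RDF.
        if child.tag != f"{{{RDF_NS}}}Description":
            problems.append(f"unexpected element inside rdf:RDF: {child.tag}")
    return problems


GOOD = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""/>
 </rdf:RDF>
</x:xmpmeta>"""

print(check_rdf_wrapper(GOOD))
print(check_rdf_wrapper("<rdf:RDF>"))
```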

PDFlib was the first product worldwide with support for XMP extension schemas for PDF/A. PDFlib GmbH offers a free validation service which checks XMP metadata for compliance with the PDF/A-1/2/3 standards. The XMP can be supplied as simple text, or embedded in a PDF or PDF/A document (regardless of the conformance status of the document).