XMP Overview

XMP Metadata

The term metadata literally means »data about data«. Metadata has been described as the business card of a particular digital document. Metadata often comprises a set of properties, each of which has a specific meaning in the Extensible Metadata Platform (XMP).

Extensible Metadata Platform (XMP)

Extensible Metadata Platform (XMP) is an XML-based format modeled after W3C’s RDF (Resource Description Framework), which forms the foundation of the Semantic Web initiative. XMP was standardized in 2012 as ISO 16684-1:2012 (revised in 2019).

XMP metadata travels with the file and can be embedded in many common file formats, including PDF, TIFF, and JPEG. Metadata properties are grouped in schemas. Each schema is identified by a unique namespace URI and holds an arbitrary number of properties. Although namespace URIs often look just like familiar Web addresses, they do not identify a particular Web page. In fact, namespace URIs are not required to point to any resource at all; they are simply unique identifiers for some entity used in XMP.
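Because the metadata is embedded as a self-delimiting packet, it can often be located without understanding the host file format. The following Python sketch scans raw bytes for the `<?xpacket begin=...?>` and `<?xpacket end=...?>` processing instructions that wrap an XMP packet. The function name `find_xmp_packets` and the sample bytes are hypothetical, and the approach assumes that the packet is stored uncompressed and unencrypted, which is not guaranteed for every file.

```python
import re

# XMP packets are wrapped in <?xpacket begin=...?> ... <?xpacket end=...?>
# processing instructions, so a plain byte scan can find them.
PACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)

def find_xmp_packets(data: bytes) -> list[bytes]:
    """Return the body of every uncompressed XMP packet found in data."""
    return [m.group(1).strip() for m in PACKET_RE.finditer(data)]

# Hypothetical stand-in for file contents: binary data around one packet.
sample = (
    b"\x89BINARY-HEADER"
    b'<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>'
    b'<?xpacket end="w"?>'
    b"BINARY-TRAILER"
)

packets = find_xmp_packets(sample)
```

A format-aware library is the more reliable choice in practice; the byte scan only illustrates why the packet can survive inside otherwise unrelated binary data.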

The XMP specification includes more than a dozen predefined schemas with hundreds of properties for common document and image characteristics. The most widely used predefined XMP schema is called the Dublin Core, or dc. It includes general properties such as Title, Creator, Subject, and Description. In addition to predefined schemas, custom schemas can be defined to cover company- or industry-specific metadata requirements.
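To illustrate how properties are grouped in schemas by namespace URI, here is a minimal Python sketch that parses a small XMP packet and lists the property names per schema. The sample packet, the custom `acme` schema with its `http://example.com/ns/acme/1.0/` URI, and the function name `properties_by_schema` are assumptions made up for this example; only the Dublin Core and RDF namespace URIs are the real ones.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical packet: two Dublin Core properties plus one property
# from an invented company-specific schema.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:acme="http://example.com/ns/acme/1.0/">
   <dc:format>application/pdf</dc:format>
   <dc:title>
    <rdf:Alt><rdf:li xml:lang="x-default">Annual Report</rdf:li></rdf:Alt>
   </dc:title>
   <acme:ProjectCode>P-1234</acme:ProjectCode>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

def properties_by_schema(xmp: str) -> dict[str, list[str]]:
    """Group property names by the namespace URI of their schema.

    Only properties written as child elements are handled; properties
    written as attributes of rdf:Description are ignored in this sketch.
    """
    root = ET.fromstring(xmp)
    schemas = defaultdict(list)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        for prop in desc:
            # ElementTree expands a qualified name to "{namespace-uri}local".
            uri, local = prop.tag[1:].split("}", 1)
            schemas[uri].append(local)
    return dict(schemas)

schemas = properties_by_schema(SAMPLE_XMP)
```

Note that the code never fetches the namespace URIs; it only compares them as strings, which matches their role as plain identifiers.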

The Dublin Core has been standardized as ISO 15836 (published in 2003, revised in 2009): »Information and documentation - The Dublin Core metadata element set«.

XMP is implemented in all Adobe publishing products and supported by dozens of independent software vendors and user groups. Adobe Bridge, part of the Creative Suite, deals with XMP metadata in various file formats.

PDF and XMP

XMP for PDF documents was introduced with Acrobat 5 and PDF 1.4 in 2001. The predecessor of XMP in PDF consisted of simple key/value pairs, so-called document info entries, which served as the sole carrier of metadata prior to the introduction of XMP. While document info entries are still supported in Acrobat and PDF, XMP metadata is a much more powerful concept and allows metadata to survive format conversions, e.g. from scanned TIFF to PDF.
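The relationship between the two mechanisms can be sketched as a simple renaming of keys. The Python example below maps document info keys to XMP property names using the correspondence commonly cited for PDF (for instance in the PDF/A context); treat the table as an assumption to be checked against the relevant standard. The function name `info_to_xmp` and the sample values, including the custom `Department` key, are hypothetical.

```python
# Assumed correspondence between document info keys and XMP properties.
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
    "CreationDate": "xmp:CreateDate",
    "ModDate": "xmp:ModifyDate",
}

def info_to_xmp(info: dict[str, str]) -> dict[str, str]:
    """Rename known document info keys; skip keys without a listed counterpart."""
    return {INFO_TO_XMP[k]: v for k, v in info.items() if k in INFO_TO_XMP}

# Hypothetical document info entries, including one custom key.
doc_info = {"Title": "Annual Report", "Author": "Jane Doe", "Department": "Sales"}
xmp_props = info_to_xmp(doc_info)
```

The sketch copies values unchanged. In real XMP the value types differ from plain strings: dc:title is a language alternative, dc:creator is an ordered list of names, and dates use a different notation than PDF date strings, so an actual conversion has to transform the values as well as the keys.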

XMP mandated by ISO standards for PDF

There are various ISO standards which specify PDF subsets for certain application domains, such as the graphic arts industry, archiving, or engineering. Except for the prepress standards PDF/X-1 and PDF/X-3, which were introduced as early as 2001 and 2002, all ISO standards for PDF include the use of XMP metadata. XMP is mandatory in most of them, with ISO 32000 being the exception. This includes PDF/A, PDF/UA, PDF/E, PDF/X-4/5, and ISO 32000-1 (PDF 1.7). ISO 32000-2:2017 »Document management - Portable Document Format - PDF 2.0« deprecates old-style document info entries (with the exception of the CreationDate and ModDate entries) in favor of XMP metadata.

Summary of XMP support in PDFlib products

PDFlib products include the following kinds of support for XMP in PDF (for details please refer to the product-specific pages):

XMP Resources

Adobe's main XMP page:
http://www.adobe.com/products/xmp.html

XMP 2012 specification and Adobe’s XMP developer's page with XMP Toolkit:
http://www.adobe.com/devnet/xmp.html

XMP standard ISO 16684-1 (first published in 2012, revised in 2019): »Extensible metadata platform (XMP) specification - Part 1: Data model, serialization and core properties«