Bavaria Report*

The Bavaria report* provides a comparative review of PDF/A validators. It assesses the output of PDF/A validation tools for conforming as well as for non-conforming documents. Our goal is to provide guidelines for users and to encourage vendors of PDF/A validation tools to optimize their products. We hope that increased accuracy of PDF/A validation will give users more precise means for checking their documents.

| Test suite | number of documents | total size | total number of pages | total number of messages produced by 7 validators |
|---|---|---|---|---|
| Isartor (included in Bavaria) | 204 | 8 MB | 10203 (all Isartor test files contain a single page, except for one test with 10,000 pages) | 2759 |
| Bavaria (in addition to Isartor) | 85 | 50 MB | 1830 | 22816 |

*   In case you wonder about the name of this report: PDFlib GmbH’s offices are located in Munich. From our offices you can see the Bavaria statue which is close to the area of the famous Oktoberfest.


Evolution of PDF/A validation

The ISO 19005-1 standard for PDF/A was published in 2005. Early PDF/A adopters had a hard time with validation. Acrobat 7.0 introduced PDF/A validation in its Preflight plugin, but it predated the final standard and implemented only an earlier draft version which differed in important details from the final standard. Acrobat 7.07 incorporated modifications which reflect the final standard. As a consequence, documents created earlier were no longer considered valid. Acrobat 8 increased the breadth and depth of PDF/A validation checks and uncovered many standard violations which had not been noticed earlier. However, it did not thoroughly check XMP metadata and predated the TechNotes published by the PDF/A Competence Center. These TechNotes brought much needed clarifications in many areas. The TechNotes and the Isartor test suite (which was published in summer 2008) contributed significantly to a consolidation process regarding PDF/A validation. Acrobat 9 implements the majority of those clarifications, with a few important corrections and additions in Acrobat 9.1.

While we used Acrobat to outline the history of PDF/A validation, several third-party validators have had a similar history of increasingly strict checks. This emphasis on more and stricter test criteria made users wonder whether PDF/A validation will ever stabilize or whether they'll be forced to constantly adjust their documents to the latest validation technology.

After conducting the Bavaria tests we can certainly say that the status of PDF/A validation is considerably better than in the early days, and the Isartor test suite and PDF/A TechNotes have significantly influenced the quality of validation. While the Bavaria report reveals shortcomings in the current product generation, we hope to contribute to further improvements and to help vendors of validation tools to enhance the accuracy of their products. This will ultimately increase the acceptance of PDF/A as a reliable standard for long-term document preservation.


Tested Validators

| Vendor | Product | Tested Version | Platform |
|---|---|---|---|
| Adobe | Acrobat | 9.0 | Windows |
| Adobe | Acrobat | 9.1.0 | Windows |
| Adobe | LiveCycle PDF Generator | 8.2.1.1 | Windows Server |
| Apago | PDF Appraiser | 2.0 alpha | Windows |
| Callas Software GmbH | pdfaPilot Server | 1.2.077 | Windows |
| intarsys consulting GmbH | PDF/A Live! | 4.0.7 | Windows |
| PDF Tools AG | 3Heights PDF Validator Shell | 1.8.32.1 | Windows |
| Seal Systems AG | PDF Longlife Suite/ PDF Checker | 2.1.1.2 Beta | Windows |
| Solid Documents, LLC | Solid Framework | 5.1.168 | Windows |

We would like to thank all involved software vendors for their kind cooperation in conducting this test.


The Bavaria Report

The »Bavaria Report on PDF/A Validation Accuracy« is provided as a PDF document which contains an introduction and a summary plus assessment of the test results for each tested validator. Attached to the PDF document are (for each product) an HTML report with the validation results for all test documents plus a detailed XML document with all messages produced by the validator. The messages in the XML are classified according to whether or not we believe they are appropriate.
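As a rough illustration of how the per-product XML attachments might be summarized, the following minimal sketch tallies messages by their classification. The element and attribute names (`results`, `document`, `message`, `classification`) and the sample messages are assumptions made for this example only; the actual XML files attached to the report may use a different structure.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the element names, attribute names, and
# messages below are invented for illustration and are not taken from
# the actual Bavaria XML files.
SAMPLE_XML = """
<results validator="ExampleValidator">
  <document name="a.pdf">
    <message classification="appropriate">Font is not embedded</message>
    <message classification="inappropriate">Unexpected XMP property</message>
  </document>
  <document name="b.pdf">
    <message classification="appropriate">Output intent is missing</message>
  </document>
</results>
"""


def tally_classifications(xml_text: str) -> Counter:
    """Count validator messages per classification value."""
    root = ET.fromstring(xml_text.strip())
    return Counter(
        # Messages without the (assumed) attribute are counted separately.
        msg.get("classification", "unclassified")
        for msg in root.iter("message")
    )


counts = tally_classifications(SAMPLE_XML)
for label, number in sorted(counts.items()):
    print(f"{label}: {number}")
```

A tally like this gives a quick per-validator overview before reading the individual messages; adapting it to the real files would require replacing the assumed names with those actually used in the XML.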

2009-05-04: Added PDF/A validation results for Adobe LiveCycle PDF Generator. We gratefully acknowledge the assistance of FileAffairs GmbH who ran the actual tests and provided the XML files containing the PDF/A validation output produced by LiveCycle.

Bavaria Report on PDF/A Validation Accuracy


Bavaria test suite for PDF/A-1

For the validation tests we used the documents of the Isartor test suite, which comprises more than 200 non-conforming test documents. In addition, the Bavaria test suite contains both conforming and non-conforming PDF/A documents from a variety of sources, created with a variety of PDF generation products. The Bavaria package includes a descriptive file called bavaria.xml with comments on interesting validation aspects of the test documents.

The Bavaria test suite for PDF/A-1 (zip archive, 50 MB)


PDFlib’s motivation

PDFlib GmbH is interested in PDF/A validation tools for several reasons:

As a member of the PDF/A Competence Center we are involved in a lot of technical and non-technical discussions regarding PDF/A conformance. The availability of reliable PDF/A validation tools is a crucial factor for the success of PDF/A. As co-authors of the Isartor test suite for PDF/A-1 we want to assess the effect of this test suite on validation tools.

We use PDF/A validators to cross-check our own PDF/A output. As part of our internal QA process we regularly run PDF/A documents created with our own software through PDF/A validators to make sure that we don’t miss any aspect of the standard. This process requires reliable validation tools, so we had to analyze a lot of documents and messages in order to make sure that our PDF output fully conforms to the standard.

We often receive customer inquiries regarding PDF/A validation tools. Although we do not provide direct recommendations, we want to offer a solid basis for selecting a suitable tool.

Some of our products (e.g. PDFlib+PDI 7, PLOP 4, PLOP DS 4) offer PDF/A-aware processing. This means that a processing step maintains PDF/A conformance: if the input conforms to PDF/A it is guaranteed that the output also conforms to PDF/A. While our products offer PDF/A-compliant features such as XMP metadata injection, digital signatures, or page recombination, they do not include full-blown PDF/A validation. Instead, they assume conforming input. Our customers need reliable PDF/A validation tools in order to check whether the input actually conforms.

Therefore it is in the best interest of PDFlib GmbH and its customers to provide a comprehensive and balanced review of PDF/A validation tools.


Disclosure

PDFlib GmbH is one of the founding members of the PDF/A Competence Center and was actively involved in the design and implementation of the Isartor test suite for PDF/A-1. PDFlib GmbH offers products for generating and processing PDF/A documents, but we do not offer any tools for validating PDF/A or converting PDF documents to PDF/A.