----- File attachment 'PDFlib-use-cases-E.pdf':
PDFlib Use Cases
Here we have collected a number of use cases which can be implemented with recent
products of PDFlib GmbH. Depending on the purpose of the installation a standalone
implementation or an integration into a web server environment can be considered.
Personalization of PDF documents sold online. More and more commercially valuable
information (such as market analyses or standards documents) is sold in PDF form.
To keep purchased documents from being circulated illegally, they can be enhanced
with additional security:
> A personal password for each customer can protect the documents; printing or
modification of the document can be restricted as required. For
these measures PLOP is sufficient.
> If further protection is required, the document can be personalized for each
customer. The document can be imported with PDFlib+PDI and, for example, watermarked
with the name of the customer. The text can be visible or hidden, and
helps to identify the customer if required.
Personalized trade show guide. On the web site of a trade show, visitors can choose
which product groups at the show are relevant to them. From this selection a personalized
trade show guide is generated, which contains relevant information such as booth
number, contact data, and similar details from the exhibitor database.
PDFlib is sufficient for implementing this project. If existing PDF pages, such as
product catalogs or exhibitor information, are to be integrated, PDFlib+PDI is
required.
Extensive online documentation. A leading electronics manufacturer stores the wiring
diagrams of its radios as PDF documents. A mouse click on one of the components shows
the data sheet of that component. PDFlib+PDI produces the wiring diagram
as an extensively linked PDF document, dynamically combining the pages
according to the required components.
Evaluation of used cars. The customer enters the details of a second-hand car
(such as model, year of manufacture, and mileage) into a web front end. Based on
an appraisal database, the value of the car is calculated individually. PDFlib combines
the result into a report document, which may contain pictures in addition to
text-based information. This is a paid service, so the invoice is also generated
as a PDF document and delivered together with the report via the Internet.
Construction financing. Customer-specific data are collected by the sales representative
of a financial service provider and sent via the Internet to the company server. Based
on the customer details and current product offers, the company server generates an individual
offer for the customer. PDFlib formats the data and produces a PDF document
which can be forwarded to the sales representative or the customer.
Personalized application form for the insurance and financial sector. The customer sends
his details for an insurance or bank account application via fax or letter to the bank.
There a clerk enters this information into a database. A PDF template with blocks
makes it possible to issue the completed form: PDFlib Personalization Server (PPS) fills
the blocks in the PDF template with the information from the database, and produces a
completed document.
PDFlib GmbH www.pdflib.com 12/2006
Data sheets for technical products. A magnet manufacturer produces its magnet
data sheets dynamically as PDF, embedding tables and diagrams from existing
graphics files. Even though implementing this workflow requires development effort,
the system is still more efficient than producing the data sheets manually. Because
of the large number of products and the short turnaround cycles of the data sheets, the
integration effort paid for itself quickly.
Enhancing documents for online publications. The Unicode consortium prepares extensive
tables to document the Unicode standard. The artwork for printing the Unicode
book is delivered as PDF, but those documents are not suitable for Web publishing. Because
of this the Unicode consortium employs the products PDFlib+PDI and TET to produce
enhanced PDF documents.
> Automatically embed web links: First TET searches the documents for URLs. Based on
the text strings found and their coordinates, PDFlib embeds links into the original
documents imported by PDFlib+PDI.
> TET generates bookmarks by searching for the headings.
> The resulting PDF documents are protected by encryption and password.
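The link-embedding step can be illustrated with a small sketch. It does not use TET or PDFlib; it assumes that a text extraction step has already delivered text runs with page coordinates, and it only computes the rectangles where links would be placed. The names TextRun, LinkRect, and find_link_rects are hypothetical, and the horizontal position is estimated under the simplifying assumption that all glyphs in a run have the same width.

```python
import re
from typing import List, NamedTuple


class TextRun(NamedTuple):
    """A piece of extracted text with its bounding box (hypothetical input format)."""
    text: str
    x: float       # lower left corner
    y: float
    width: float
    height: float


class LinkRect(NamedTuple):
    """Target URL plus the rectangle a link annotation would cover."""
    url: str
    llx: float
    lly: float
    urx: float
    ury: float


# Deliberately simple pattern: http(s) URLs and bare "www." host names.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"]+")


def find_link_rects(runs: List[TextRun]) -> List[LinkRect]:
    links = []
    for run in runs:
        if not run.text:
            continue
        # Assumption: uniform glyph width within a run.
        char_width = run.width / len(run.text)
        for match in URL_PATTERN.finditer(run.text):
            # Drop punctuation that belongs to the sentence, not the URL.
            url = match.group().rstrip(".,;:)")
            llx = run.x + match.start() * char_width
            urx = llx + len(url) * char_width
            links.append(LinkRect(url, llx, run.y, urx, run.y + run.height))
    return links


if __name__ == "__main__":
    sample = [TextRun("see www.pdflib.com.", 100.0, 700.0, 190.0, 10.0)]
    for link in find_link_rects(sample):
        print(link)
```

In a real workflow the rectangles would come from per-glyph coordinates reported by the extraction tool rather than from the uniform-width estimate used here.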
For print publishing the PDF files are prepared by PDFlib+PDI:
> Right and left pages are moved to adjust the borders.
> Headers are added for the book.
The resulting PDF documents were used to print the Unicode book (ISBN 0321480910)
and to generate the online documentation for the Unicode standard.
Watermarks created with PDFlib+PDI protect confidential material. Mayor David Miller
of the City of Toronto, Canada's largest City and 5th largest in North America, was
concerned about media leaks of confidential information. Councillors would be given
special printed material at in camera sessions of the City government. Unfortunately,
the press would often know about these documents before the councillors.
Something needed to be done to make paper documents more secure. The City Clerk
Uli Watkiss turned to the IT Division for help. They had tightened up the physical process
of distributing the material, but could they do more?
The IT Division came up with the idea of virtual watermarks: a solution
that puts the city wordmark and logo, as well as the councillor's name, in greyscale
into the background of the paper document. Each confidential document would now be
owned by a particular person. This might not stop leaks, but the psychological
impact would be immense.
»We needed an industrial strength solution,« says Michael Sutton, Senior Technical
Support Specialist. The city had 65 councillors and staff who would need confidential
documents of up to 200 pages each. This would be a lot of variable output for the city. The city
was using IBM's InfoPrint Manager for AIX for print management, and Heidelberg Digimasters
for print output. What it lacked was a templating/watermarking technology
in between.
»That naturally led us to PDFlib,« states Michael Sutton. Using PDFlib, they were
very quickly able to assemble and layer confidential scanned documents with Adobe
Form Builder backgrounds, adding the greyscale names of the councillors. PDFlib
easily handled transparency with multiple-layer PDFs.
If the pages of the document are created with PDFlib, the watermark can be integrated
easily into the process. For existing documents PDFlib+PDI can be used:
it imports the existing PDF pages and places them one by one into the new document.
In the process a watermark is placed on top of the original. The watermark can consist of
text, pictures, or even another PDF. To preserve readability even with large watermarks,
the watermark can be made transparent. If the recipients must be prevented from changing
the watermark, the document can be protected by encryption.
Later, the IT department used the PDFlib Virtual File System (PVF) to improve performance
further: most of the compositional work is done in memory. The resulting PDF is downsampled
to PostScript Level 2, which is easily handled by the Heidelbergs. »The result is
outstanding, based on the quality and fidelity of the PDF created by PDFlib, and its features,«
says Michael Sutton.
By making PDFlib a templating print driver under the IPM technology they even surprised
IBM. With Windows Script Host and JavaScript they made this an easy-to-use
drag-and-drop document option on Windows workstations.
Developers can use PDFlib from the programming language that best fits
their experience and the demands of the project. If the developer chooses a
language such as Java, which is available on several platforms, the application can also run
under a different operating system.
»Our future plans are to keep refining the PDFlib process,« says Michael Sutton, »and
eventually phase out the use of offset colour letterhead and documents at the City. We
expect PDFlib will save us millions of dollars as a result. This is a great integration package
for enterprise printing. If you are a large shop, looking for the same ease of assembly
for PDF elements, that we see in the traditional AFP or big print iron world, this is
the package. It allows you to use PDF, with all the richness of features, such as fonts,
graphics and color support.«
----- End file attachment 'PDFlib-use-cases-E.pdf'
----- File attachment 'Use-cases-PDFA-with-PDFlib-products.pdf':
Use Cases:
PDF/A with PDFlib Products
This document discusses various scenarios where PDF/A application problems can be
solved with PDFlib GmbH products. The required minimum product versions are mentioned
for each use case. The PDFlib Cookbook at www.pdflib.com/pdflib-cookbook/ contains
working code samples for the PDF/A use cases discussed in this document. For more
details on PDF/A support in our products please refer to the whitepaper »Creating PDF/A
with PDFlib Products«. The PDF/A implementation in PDFlib products conforms to the
TechNotes published by the PDF/A Competence Center (see www.pdfa.org).
PDFlib 7: Dynamically generate PDF/A. All scenarios where products from the
PDFlib family are used to dynamically create PDF documents can easily be adjusted
to create PDF/A output. As an example, creating digital invoices as PDF/A documents
with PDFlib is a snap. You don’t have to worry about color space issues, XMP
handling, and other subtleties related to PDF/A requirements. Simply create PDF/A
documents according to the instructions in the PDFlib documentation, and PDFlib
will take care of all other PDF/A-related aspects. If you fail to comply with one of
the rules, PDFlib will issue an error instead of creating an output document. This
guarantees that all generated output documents conform to the PDF/A standard.
PDFlib 7: Convert images to PDF/A and add custom XMP metadata. PDFlib can be
used to convert TIFF, JPEG, and other image formats to PDF/A documents. PDFlib
provides a simple conversion method which can be customized for local requirements.
For example, PDFlib’s support for XMP extension schemas allows you to
add custom metadata to the generated documents in a way which conforms to the
PDF/A standard. For more details on XMP please refer to our whitepaper »XMP
Metadata Support in PDFlib Products«.
PDFlib 7: Add transparent stamps to PDF/A. Many projects require the use of
transparent stamps which are printed across the actual page contents in order to
visualize some status information related to the document contents, such as »Confidential«,
»Prerelease«, etc. However, the PDF/A standard does not allow the use of
PDF’s transparency feature. Using opaque (non-transparent) text for stamps
means that the page contents will be obscured by the stamp text (or vice versa).
As a practical solution for this problem PDFlib supports the creation of transparent
stamps in a way which conforms to the PDF/A standard. Similar to the well-known
printing technique of raster cells, the transparency effect is achieved by
spreading tiny little dots across the stamp text (technically, a PDF pattern colorspace
is created which is based on a bitmap mask, where the bitmap contains a raster
cell pattern for the desired tint percentage). This way the stamp text appears
transparent and does not obscure the underlying page contents, while the document
still complies with PDF/A requirements.
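The raster cell idea can be illustrated independently of PDFlib. The following sketch builds a 1-bit mask in which the share of set pixels approximates the desired tint percentage, so that the page contents remain visible through the unset pixels. The 4x4 ordered-dither (Bayer) matrix used here is an assumption chosen for illustration; the source does not state which cell pattern PDFlib actually uses, and the function names are hypothetical.

```python
from typing import List

# 4x4 Bayer threshold matrix (ordered dithering); an illustrative choice,
# not necessarily the raster cell pattern used by PDFlib.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def raster_mask(width: int, height: int, tint_percent: float) -> List[List[int]]:
    """Return a bitmap mask: 1 = paint the stamp color, 0 = leave the page visible."""
    threshold = tint_percent / 100 * 16
    return [
        [1 if BAYER_4X4[y % 4][x % 4] < threshold else 0 for x in range(width)]
        for y in range(height)
    ]


def coverage(mask: List[List[int]]) -> float:
    """Fraction of painted pixels in the mask."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


if __name__ == "__main__":
    mask = raster_mask(8, 8, 25)
    for row in mask:
        print("".join("#" if bit else "." for bit in row))
    print("coverage:", coverage(mask))
```

Used as the mask of a pattern color, such a bitmap lets a stamp appear tinted without relying on PDF's transparency feature, which is the effect described above.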
Implemented with PDFlib, this technique can be used to convert, for example, scanned
pages to PDF/A while applying a transparent stamp as part of the TIFF-to-PDF/A
conversion process. Implemented with PDFlib+PDI, it provides a simple way of
adding transparent stamps to existing PDF/A documents (note that this process
does not preserve interactive elements such as bookmarks or form fields, though).
PDFlib PLOP 3.1: Insert custom XMP metadata in PDF/A documents. In this scenario
the user already has PDF/A documents, but wants to add custom XMP metadata
to a number of existing documents. The source or creation process does not really
matter; the PDF/A documents could e.g. be created from scanned paper input.
PDFlib GmbH
2008 - 07
www.pdflib.com
1
For custom XMP metadata to be used with PDF/A, the standard requires embedding
of a machine-readable description of the extension schema. This must be
done according to the rules set forth by the standard, and is implemented in
PDFlib PLOP 3.1. PLOP can therefore be used as a post-processor for inserting predefined
or custom XMP metadata properties in existing PDF/A documents. Since
PLOP is both XMP- and PDF/A-aware the operation will maintain the standard conformance
status of the input document. In order to guarantee standard conformance
PLOP validates all user-supplied XMP metadata according to PDF/A rules.
PDFlib PLOP DS 3: Digitally sign PDF/A documents. PDF/A documents may include
digital signatures. However, PDF/A establishes several rules which affect signatures
in the documents. Signatures must be applied carefully in order to maintain
the PDF/A status of the signed document. PLOP DS 3 implements a PDF/A-conforming
signature process and can therefore be used to apply signatures to existing
PDF/A-1a or PDF/A-1b documents without affecting the PDF/A conformance
status.
PDFlib 7: Create PDF/A-1a with Tagged PDF. PDF/A-1a is superior to PDF/A-1b since
it ensures reliable text retrieval as well as content repurposing and accessibility by
means of structure information (Tagged PDF). PDFlib supports both of these
aspects and can be used to create PDF/A-1a output to meet strict accessibility requirements
for the archived documents. Note that structure information must be
available for creating Tagged PDF. Since PDFlib cannot create Tagged PDF from unstructured
input the application must supply suitable structure information when
creating PDF/A documents.
pCOS 2: Read PDF/A status information. Acrobat 8 does not provide any simple
representation of the PDF/A conformance level of a document. The status will be
displayed when launching the Preflight plugin or digging through the advanced
XMP metadata panel. PDFlib GmbH provides a custom XMP panel which will display
PDF standards conformance information in an Acrobat panel. This panel provides
a simple representation of the conformance status for the PDF/A-1, PDF/X-4,
PDF/X-5, and PDF/E-1 standards. The required panel description can be downloaded
from www.pdflib.com/developer/xmp-metadata/.
While the panel solves the identification problem for interactive use, the product
PDFlib pCOS can be used to programmatically determine the PDF/A (or PDF/X)
conformance status of a document (without any validation). The pCOS API can be
included in custom applications, and the pCOS command-line tool can be integrated
in various workflow scenarios. Since the pCOS programming interface is integrated
in our other products it can be used from PDFlib, TET, and PLOP in a similar
way.
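To show where the conformance information lives, here is a minimal sketch which reads the PDF/A identification properties (pdfaid:part and pdfaid:conformance) from an XMP packet that has already been extracted from a document. It uses only the Python standard library and is not the pCOS API; the function name pdfa_status and the sample packet are made up for illustration. Like the status query described above, it reports what the document claims and performs no validation.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"

# Illustrative packet without the surrounding xpacket wrapper.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <pdfaid:part>1</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def pdfa_status(xmp_packet: str) -> Optional[str]:
    """Return the claimed conformance level, e.g. 'PDF/A-1b', or None if absent."""
    root = ET.fromstring(xmp_packet)
    found = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        for name in ("part", "conformance"):
            key = f"{{{PDFAID_NS}}}{name}"
            # XMP allows simple properties as attributes or as child elements.
            if key in desc.attrib:
                found[name] = desc.attrib[key]
            child = desc.find(key)
            if child is not None and child.text:
                found[name] = child.text.strip()
    if "part" not in found:
        return None
    return f"PDF/A-{found['part']}{found.get('conformance', '').lower()}"


if __name__ == "__main__":
    print(pdfa_status(SAMPLE_XMP))
```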
PDFlib GmbH
Franziska-Bilek-Weg 9
80339 München, Germany
phone +49 • 89 • 452 33 84-0
info@pdflib.com
www.pdflib.com
----- End file attachment 'Use-cases-PDFA-with-PDFlib-products.pdf'
----- File attachment 'Whitepaper-XMP-metadata-in-PDFlib-products.pdf':
Whitepaper:
XMP Metadata Support in
PDFlib Products
The importance of metadata. The term metadata literally means »data about
data«. Metadata has been described as the business card of a particular digital document.
Metadata often comprises a set of properties, where each property has specific
meaning in the context of the document. Some examples of common metadata
properties:
> The author of a PDF document.
> The date a PDF document was created or a JPEG image was taken with a camera.
> The name of the photographer who took an image.
> The serial number of a personalized document.
> The stockkeeping unit (SKU) of the item described in a document.
> The year of manufacture of the engineering product described in a document.
> The reference number of a document in a legal case.
As an increasing number of publishing, documentation, translation, and other
workflows are implemented in a completely digital manner, metadata plays a crucial
role for handling digital documents during their lifetime.
Adobe’s Extensible Metadata Platform (XMP). As Adobe recognized the need for a
common metadata format which can be used across applications and file formats,
they designed the Extensible Metadata Platform (XMP). This is an XML-based format
modelled after W3C’s RDF (Resource Description Framework) which forms the
foundation of the semantic Web initiative. Adobe makes the XMP specification
freely available, and offers an open-source XMP toolkit for software developers.
XMP metadata travels with the file, and can be embedded in many common file
formats including PDF, TIFF, and JPEG. Metadata properties are grouped in schemas.
Each schema is identified by a unique namespace URI and holds an arbitrary
number of properties.
The XMP specification includes more than a dozen predefined schemas with
hundreds of properties for common document and image characteristics. The
most widely used predefined XMP schema is called the Dublin Core, or dc. It includes
general properties such as Title, Creator, Subject, and Description. In addition
to predefined schemas custom schemas can be defined to cover company- or industry-specific
metadata requirements.
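The grouping of properties into schemas identified by namespace URIs can be made concrete with a short sketch. It builds a minimal XMP structure carrying two Dublin Core properties using only the Python standard library; it is not a PDFlib call, the function name build_xmp is hypothetical, and the xpacket wrapper which surrounds XMP when embedded in a file is omitted for brevity.

```python
import xml.etree.ElementTree as ET
from typing import List

# Each schema is identified by its namespace URI; the prefix is only a shorthand.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix: str, local: str) -> str:
    """Build a namespace-qualified tag name in ElementTree notation."""
    return f"{{{NS[prefix]}}}{local}"


def build_xmp(title: str, creators: List[str]) -> str:
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # dc:title is a language alternative: one entry per language.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # dc:creator is an ordered list of names.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for name in creators:
        ET.SubElement(seq, q("rdf", "li")).text = name

    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    print(build_xmp("Magnet data sheet", ["A. Author"]))
```

A custom schema would follow the same pattern with its own namespace URI and prefix in place of the Dublin Core entry.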
XMP for PDF documents was introduced with Acrobat 5 and PDF 1.4 in 2001.
The predecessor of XMP in PDF was formed by simple key/value pairs, so-called
document info entries, which served as the sole carrier of metadata prior to the introduction
of XMP. While document info entries are still supported in Acrobat and
PDF, XMP metadata is a much more powerful concept and allows metadata to survive
format conversions, e.g. from scanned TIFF to PDF.
XMP is implemented in all Adobe publishing products and supported by dozens
of independent software vendors and user groups. Adobe Bridge, part of the Creative
Suite, deals with XMP metadata in various file formats. XMP metadata can be
displayed and edited in the File Info/Document Properties panel in Acrobat (File,
Properties..., Additional metadata...), Photoshop, InDesign, and other Adobe applications.
While the File Info panel groups metadata properties according to the predefined
XMP schemas, custom panels can be defined to tailor metadata display
and editable fields according to the requirements of various application domains.
PDFlib GmbH
2008 - 08
www.pdflib.com
1
XMP for verticals. XMP is increasingly used by industry groups to cover their
metadata requirements. Some examples:
> The AdsML consortium creates specifications and processes for the exchange of
advertising information and content.
> The International Press Telecommunications Council (IPTC) is an industry
group established by news organizations. It develops industry standards for the
interchange of news data. It published the »IPTC Core« schema for XMP which is
widely used for transferring metadata for images and other news items.
> The DICOM standard for exchanging medical images supports the use of PDF
and specifies a custom XMP schema for storing patient data, study description,
equipment details, and other metadata.
> The Publishing Requirements for Industry Standard Metadata (PRISM) defines a
metadata vocabulary for processing magazine, news, catalog, book, and journal
content.
XMP mandated by ISO standards. There are several existing and planned ISO
standards which specify PDF subsets for certain application domains, such as the
graphic arts industry, archiving, or engineering. Except for the prepress standards
PDF/X-1 and X-3, which were introduced in 2001 and 2002, all ISO standards
for PDF include the use of XMP metadata (which is mandatory in most cases, the
exception being ISO 32000):
> PDF/A-1 in ISO 19005-1 (published in 2005): »Electronic document file format for
long-term preservation – Use of PDF 1.4«. PDF/A-1 requires XMP for identifying
conforming files and supports custom metadata through XMP extension schemas.
A formal description of all extension schemas must be embedded in PDF/A
to maximize the future use of custom metadata. PDF/A-1 allows the use of document
info entries, but requires synchronization between common PDF document
info entries and certain predefined XMP properties to allow pure XMP-based
workflows. The standard defines this »crosswalk« between document
info entries and XMP properties. XMP support in PDF/A-1 is based on the XMP
2004 specification.
> PDF/E in ISO 24517-1 (published in 2008): »Engineering document format using
PDF – Use of PDF 1.6«. XMP support in PDF/E is almost identical to PDF/A-1, except
that it is based on the newer XMP 2005 specification.
> PDF/X-4 in ISO 15930-7 (published in 2008): »Complete exchange of printing
data (PDF/X-4) and partial exchange of printing data with external profile reference
(PDF/X-4p) using PDF 1.6«. Similar to PDF/A-1, XMP is required to express
standards conformance in PDF/X-4. Document info entries may be used in
PDF/X-4, but must be synchronized with corresponding XMP entries. XMP extension
schemas for custom metadata are allowed. However, unlike PDF/A-1
these can be used without embedding a formal description. XMP support in
PDF/X-4 is based on the XMP 2005 specification.
> PDF/X-2 in ISO 15930-5 (published in 2003) and PDF/X-5 in ISO 15930-8 (published
in 2008): »Partial exchange of printing data using PDF 1.6 (PDF/X-5)«.
PDF/X-2 and X-5 documents reference other PDF/X documents, where the target
of such a reference is identified by using various XMP entries. This makes XMP a
crucial component of PDF/X-2 and X-5.
> ISO 32000 (published in 2008): »Document management – Portable document
format – PDF 1.7«. ISO 32000 is the standardized version of PDF 1.7. The technical
content is identical to PDF 1.7 (the file format of Acrobat 8) which fully supports
XMP metadata.
The Dublin Core, one of the most common predefined XMP metadata schemas, has
been standardized as ISO 15836 (published in 2003): »Information and documentation
— The Dublin Core metadata element set«.
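The »crosswalk« mentioned for PDF/A-1 above pairs each common document info entry with one XMP property. The sketch below lists these pairs and checks two sets of values for mismatches. It assumes that the XMP values have already been flattened to plain strings in a dictionary; the function name unsynchronized is hypothetical. Date entries are skipped because document info entries and XMP use different date formats, and the conversion is left out here.

```python
from typing import Dict, List

# Document info key -> XMP property, following the PDF/A-1 crosswalk.
CROSSWALK = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
    "CreationDate": "xmp:CreateDate",
    "ModDate": "xmp:ModifyDate",
}
DATE_KEYS = {"CreationDate", "ModDate"}


def unsynchronized(info: Dict[str, str], xmp: Dict[str, str]) -> List[str]:
    """Return the document info keys whose XMP counterpart is missing or differs."""
    problems = []
    for key, value in info.items():
        prop = CROSSWALK.get(key)
        if prop is None or key in DATE_KEYS:
            # Custom entries have no counterpart; dates need format
            # conversion, which this sketch does not attempt.
            continue
        if xmp.get(prop) != value:
            problems.append(key)
    return problems


if __name__ == "__main__":
    info = {"Title": "Annual report", "Author": "Jane Doe", "Department": "Sales"}
    xmp = {"dc:title": "Annual report", "dc:creator": "John Doe"}
    print(unsynchronized(info, xmp))
```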
XMP support in the PDFlib product suite. Simple XMP support was introduced
in the PDFlib product family in 2004. With PDF/A-1 support in PDFlib 7 (released
in 2006) the XMP features were expanded to match the requirements of
PDF/A-1. In particular, automatic synchronization of document info entries to XMP
properties (as specified in the PDF/A-1 crosswalk) was implemented, as well as automatic
creation of several internal XMP properties required for PDF/A-1. As a result,
PDFlib users can generate XMP for PDF/A-1 without having to struggle with
the internals of the XMP format. Advanced users can directly feed all of the predefined
XMP metadata schemas to PDFlib for inclusion in the generated PDF documents.
Since PDFlib is available on all relevant operating systems and does not require
any third-party products, it brings XMP support to all platforms.
On top of this, PDFlib 7.0.3 adds support for XMP extension schemas according
to PDF/A-1. Users can embed extension schema descriptions for custom metadata
as required by PDF/A-1. Since PDFlib fully validates user-supplied XMP extension
schemas for internal consistency and standards conformance, the output is guaranteed
to conform to the PDF/A-1 standard.
This feature makes PDFlib 7.0.3 the first product worldwide to support XMP extension
schemas for PDF/A-1. As a result of PDFlib GmbH’s participation in the
PDF/A Competence Center, all PDF/A activities are closely coordinated with other
vendors of PDF/A software to ensure the highest possible degree of standards conformance
and adherence to industry practices.
Since XMP validation is active even when no PDF/A output is created, all XMP
users benefit from the improved XMP support in PDFlib 7.0.3.
More details on XMP in PDF/A, plus an online validator for XMP extension schemas
can be found on www.pdflib.com.
Injecting XMP in PDF with PDFlib PLOP 3.1. In addition to various other features
including encryption, decryption, optimization, and digital signature, PDFlib PLOP
can insert XMP metadata in existing PDF documents. This function comes in handy
in situations where existing PDF documents do not contain all required metadata
properties. It is especially useful in PDF/A workflows since XMP support in PLOP is
PDF/A-aware. For example, custom XMP with extension schemas can be injected in
PDF/A documents from workflows which do not support extension schemas.
Extracting XMP from PDF with PDFlib pCOS. The pCOS interface is PDFlib GmbH’s
method for retrieving any kind of information from PDF documents. It is available
as a stand-alone product, and also integrated in all other products. pCOS offers a
simple programming method for extracting XMP metadata from PDF documents.
XMP metadata is normalized to Unicode so that users don’t have to worry about
encoding issues.
XMP retrieval works regardless of compression, encryption, and PDF object
structure. While the XMP package mechanism defined by Adobe allows easy inclusion
and retrieval of XMP data packages in various file formats, the PDF format exhibits
several subtleties which complicate the issue. For example, PDF documents
may contain several update sections which cause multiple instances of an XMP
stream to be present in the file, where only one of these instances is relevant. A
simple text search for the XMP block will likely retrieve the wrong instance; only
software which carefully follows the PDF object structure will retrieve the correct
XMP metadata block in all cases. This is the reason why Adobe’s free XMP Toolkit
does not fully support XMP retrieval from PDF, while it does support XMP in other
file formats such as TIFF and JPEG.
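The pitfall of a plain text search can be demonstrated with a few lines of Python. The byte string below only imitates a file with one update section; it is not a valid PDF, and the function name naive_xmp_search is made up for this sketch. The search finds the first XMP packet, which here is the outdated one. Picking the last match instead is no general remedy either; only following the PDF object structure identifies the relevant metadata stream reliably.

```python
from typing import Optional

XPACKET_BEGIN = b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>'
XPACKET_END = b'<?xpacket end="w"?>'


def make_packet(title: bytes) -> bytes:
    """Build a toy XMP packet; the content is reduced to a bare title."""
    return XPACKET_BEGIN + b"<x:xmpmeta>" + title + b"</x:xmpmeta>" + XPACKET_END


# Imitation of an original file body followed by one update section.
SIMULATED_FILE = (
    b"%PDF-1.4\n"
    b"1 0 obj << /Type /Metadata /Subtype /XML >> stream\n"
    + make_packet(b"old title")
    + b"\nendstream endobj\n%%EOF\n"
    b"1 0 obj << /Type /Metadata /Subtype /XML >> stream\n"
    + make_packet(b"new title")
    + b"\nendstream endobj\n%%EOF\n"
)


def naive_xmp_search(data: bytes) -> Optional[bytes]:
    """Return the first XMP packet found by scanning the raw bytes."""
    start = data.find(b"<?xpacket begin")
    if start < 0:
        return None
    end = data.find(b"<?xpacket end", start)
    if end < 0:
        return None
    close = data.find(b"?>", end)
    if close < 0:
        return None
    return data[start:close + 2]


if __name__ == "__main__":
    packet = naive_xmp_search(SIMULATED_FILE)
    print(b"old title" in packet, b"new title" in packet)
```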
Searching for XMP metadata with PDFlib TET PDF IFilter. TET PDF IFilter is the latest
product released by PDFlib GmbH. It implements Microsoft’s IFilter interface
and can be used with various Microsoft and third-party desktop and enterprise
search products, such as Windows Desktop Search (WDS), Office SharePoint Server
(MOSS), Indexing Server, or SQL Server. XMP support in TET PDF IFilter makes it
very easy to leverage XMP metadata in environments where Microsoft search solutions
are deployed.
The advanced metadata implementation in TET PDF IFilter supports the Windows
property system for metadata. In addition to page contents it indexes XMP
metadata as well as standard or custom document info entries. Metadata indexing
can be configured on several levels:
> Document info entries and common XMP properties are mapped to standard
Windows properties, e.g. Title, Subject, Author.
> TET PDF IFilter adds useful PDF-specific pseudo-properties, e.g. page size, PDF/A
conformance level, font lists.
> All relevant predefined XMP properties can be searched, e.g. dc:rights,
xmpRights:UsageTerms, xmp:CreatorTool.
> Custom (user-defined) XMP properties can be searched, e.g. company-specific
classification items.
> In addition to document metadata, XMP attached to images can also be indexed,
e.g. the name of the photographer of an image or copyright information.
TET PDF IFilter optionally integrates metadata in the indexed raw text. As a result,
even full-text search engines without metadata support (e.g. SQL Server) can
search for metadata.
Workflow scenarios which benefit from XMP-based document search. XMP metadata
handling can be integrated in diverse scenarios which require searching digital
documents. Two typical examples are described below.
Publishing: creative professionals use Adobe and other publishing software to
create documents and metadata interactively. They assign keywords, author name,
copyright information and other common XMP properties to documents. They can
use Adobe Bridge to search or group documents according to the assigned metadata
properties, and are focused on common XMP schemas such as Dublin Core
and IPTC.
Technical documentation: a large number of documents is created manually or
automatically, and collected in departmental or company-wide collections. These
document collections are accessed via common Windows retrieval tools, such as
Microsoft Office SharePoint Server (MOSS) on server systems, Windows Desktop
Search (WDS) on workstations, or other retrieval products. After attaching TET PDF
IFilter to these products users can search for documents based on XMP metadata
properties, the actual page contents, or even image properties. While predefined
XMP schemas cover the basic requirements, customized XMP schemas can be used
in the queries to cover company-specific requirements.
PDFlib GmbH
Franziska-Bilek-Weg 9
80339 München, Germany
phone +49 • 89 • 452 33 84-0
info@pdflib.com
www.pdflib.com
----- End file attachment 'Whitepaper-XMP-metadata-in-PDFlib-products.pdf'