PDFlib TET – Text Extraction Toolkit


Many Ways to Use TET

TET is available as a programming library (component) for various development environments and as a command-line tool for batch operations. The two offer similar features but suit different deployment tasks, and both can create TETML, TET’s XML-based output format.

The TET Programming Library is used...

...for integration into desktop or server applications. Examples showing how to use the library with each supported language binding are included in the TET package.

The TET Command-line Tool is suited...

...for batch processing PDF documents. It doesn’t require any programming, but offers command-line options which can be used to integrate it into complex workflows.

TETML Output is suited...

...for XML-based workflows and developers who are familiar with the wide range of XML processing tools and languages, e.g. XSLT.
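As a rough illustration of such an XML-based workflow, the sketch below collects the words on each page from a TETML-like document using only the Python standard library. The sample XML, its namespace URI, and the element and attribute names (Page, number, Word, Text) are simplified assumptions made for this example; the actual TETML structure should be taken from the schema in the TET documentation. The helper functions local_name and words_per_page are hypothetical and not part of TET.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for TETML output. The namespace URI and the
# element/attribute names are assumptions for illustration only.
SAMPLE_TETML = """
<TET xmlns="urn:example:tetml-sketch">
  <Document filename="sample.pdf">
    <Pages>
      <Page number="1">
        <Para>
          <Word><Text>Hello</Text></Word>
          <Word><Text>TETML</Text></Word>
        </Para>
      </Page>
      <Page number="2">
        <Para>
          <Word><Text>Second</Text></Word>
          <Word><Text>page</Text></Word>
        </Para>
      </Page>
    </Pages>
  </Document>
</TET>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def words_per_page(xml_text):
    """Return a dict mapping page number to the list of words on that page."""
    root = ET.fromstring(xml_text)
    pages = {}
    for elem in root.iter():
        if local_name(elem.tag) != "Page":
            continue
        number = int(elem.get("number"))
        # Gather the text of every Text element below this page,
        # in document order.
        pages[number] = [
            child.text
            for child in elem.iter()
            if local_name(child.tag) == "Text" and child.text
        ]
    return pages


if __name__ == "__main__":
    for number, words in words_per_page(SAMPLE_TETML).items():
        print(number, " ".join(words))
```

Matching on local names rather than a fixed namespace keeps the sketch independent of the namespace URI, which is one of the details assumed here.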

TET Connectors are suited...

...for integrating TET in various common software packages, e.g. databases and search engines.


Supported Development Environments

PDFlib TET is everywhere – it runs on practically all computing platforms. We offer 32-bit and 64-bit packages for all common flavors of Windows, Mac OS, Linux and Unix, as well as for IBM eServer iSeries and zSeries systems.

The TET core is written in highly optimized C code for maximum performance and small overhead. Through a simple API (Application Programming Interface), TET functionality is accessible from a variety of development environments:

COM for use with VB, ASP, Borland Delphi, etc.

C and C++

Java, including servlets and Java Application Server

.NET for use with C#, VB.NET, ASP.NET, etc.

Perl

PHP

Python

RPG (IBM i5/iSeries)


Fully functional evaluation versions, including documentation and samples, are available.