PDFlib TET - Text Extraction Toolkit

Many Ways to use TET

TET is available as a programming library for various development environments, and as a command-line tool for batch operations. The two offer similar features but suit different deployment scenarios, and both can create TETML, TET’s XML-based output format.

The TET Programming Library is used...

...for integration into desktop or server applications. Examples of using the library with each supported language binding are included in the TET package.

The TET Command-line Tool is suited...

...for batch processing PDF documents. It doesn’t require any programming, but offers command-line options which can be used to integrate it into complex workflows.

TETML Output is suited...

...for XML-based workflows and developers who are familiar with the wide range of XML processing tools and languages, e.g. XSLT.

TET Connectors are suited...

...for integrating TET in various common software packages, e.g. databases and search engines.
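
As a rough illustration of the XML-based workflow mentioned for TETML, the following Python sketch collects the words on each page from a small TETML-like fragment using only the standard library. The fragment, its namespace URI, and the element and attribute names (Page, number, Text) are assumptions made for this example; the real TETML schema is richer, so check the TET documentation before relying on any of these names.

```python
import xml.etree.ElementTree as ET

# Hypothetical TETML-like fragment for illustration only.
# The namespace URI and element layout are assumptions, not the real schema.
SAMPLE = """<TET xmlns="http://example.com/tetml-sketch">
  <Document>
    <Pages>
      <Page number="1">
        <Para><Word><Text>Hello</Text></Word><Word><Text>world</Text></Word></Para>
      </Page>
      <Page number="2">
        <Para><Word><Text>Second</Text></Word><Word><Text>page</Text></Word></Para>
      </Page>
    </Pages>
  </Document>
</TET>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def words_by_page(xml_text):
    """Return a dict mapping page number to the list of word texts on that page."""
    root = ET.fromstring(xml_text)
    pages = {}
    for page in root.iter():
        if local_name(page.tag) != "Page":
            continue
        number = int(page.get("number"))
        # Gather the text of every Text element nested anywhere below this page.
        pages[number] = [
            el.text
            for el in page.iter()
            if local_name(el.tag) == "Text" and el.text
        ]
    return pages


if __name__ == "__main__":
    for number, words in sorted(words_by_page(SAMPLE).items()):
        print(number, " ".join(words))
```

Matching on local names keeps the sketch independent of the exact namespace URI, which may differ between TETML versions. An XSLT stylesheet could do the same job for workflows that are already XML-centric.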

Supported Development Environments

PDFlib TET is everywhere - it runs on practically all computing platforms. We offer 32-bit and 64-bit packages for all common flavors of Windows, OS X/macOS, Linux and Unix, as well as for IBM i5/iSeries and zSeries systems. TET is also available for mobile systems including iOS and Android.

The TET core is written in highly optimized C and C++ code for maximum performance and small overhead. TET functionality is accessible through a simple API (Application Programming Interface) from a variety of development environments:

COM for use with VB, ASP, etc.

C and C++

Java, including servlets and Java Application Server

.NET for use with C#, VB.NET, ASP.NET, etc.

Objective-C (OS X and iOS)

Perl

PHP

Python

REALbasic/Xojo

RPG (IBM i5/iSeries)

Ruby, including Ruby on Rails

Fully functional evaluation versions, including documentation and samples, are available for download.