
TET PDF IFilter extracts text and metadata from PDF documents and makes them available to search and retrieval software on Windows. This allows PDF documents to be searched on the local desktop, a corporate server, or the Web. TET PDF IFilter is based on the patented PDFlib Text Extraction Toolkit (TET), a developer product for reliably extracting text from PDF documents.
TET PDF IFilter is a robust implementation of Microsoft’s IFilter indexing interface. It works with all search and retrieval products which support the IFilter interface, e.g. SharePoint and SQL Server. Such products use format-specific filter programs – called IFilters – for particular file formats, e.g. HTML. TET PDF IFilter is such a program, aimed at PDF documents. The user interface for searching the documents may be the Windows Explorer, a Web or database frontend, a query script, or a custom application. As an alternative to interactive searches, queries can also be submitted programmatically without any user interface.
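As a rough illustration of a programmatic query, the sketch below asks Windows Search for indexed PDF files that contain a given phrase. It is not taken from the TET PDF IFilter documentation. It assumes that Windows Search is the host product, that PDF files in the indexed locations have been processed by an installed PDF IFilter, and that the pywin32 package is available. The helper names build_pdf_query and run_query are hypothetical.

```python
# Assumption: Windows Search is the indexing host and exposes its OLE DB provider.
CONNECTION_STRING = (
    "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
)


def build_pdf_query(phrase: str, max_rows: int = 20) -> str:
    """Build a Windows Search SQL query for indexed PDFs containing `phrase`."""
    # Drop double quotes (they delimit the phrase) and double any single
    # quotes so the phrase stays inside the SQL string literal.
    safe = phrase.replace('"', "").replace("'", "''")
    return (
        "SELECT TOP " + str(int(max_rows)) + " System.ItemPathDisplay "
        "FROM SYSTEMINDEX "
        "WHERE CONTAINS('\"" + safe + "\"') "
        "AND System.FileExtension = '.pdf'"
    )


def run_query(sql: str) -> list[str]:
    """Run the query through ADO and return the matching file paths."""
    # Imported lazily: pywin32 is only available on Windows.
    import win32com.client

    connection = win32com.client.Dispatch("ADODB.Connection")
    connection.Open(CONNECTION_STRING)
    try:
        recordset = win32com.client.Dispatch("ADODB.Recordset")
        recordset.Open(sql, connection)
        paths = []
        try:
            while not recordset.EOF:
                paths.append(recordset.Fields.Item("System.ItemPathDisplay").Value)
                recordset.MoveNext()
        finally:
            recordset.Close()
        return paths
    finally:
        connection.Close()


if __name__ == "__main__":
    for path in run_query(build_pdf_query("text extraction")):
        print(path)
```

The query is built separately from its execution so that it can be inspected or logged before it is sent to the index. Other host products, such as SQL Server or SharePoint, have their own query interfaces and would need a different client.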
PDFlib TET, the basis of TET PDF IFilter, was first released in 2002 and has been used by customers worldwide in server and desktop environments. As an alternative to extracting PDF page contents and metadata as raw text, TET can supply the document contents in XML format. TET is also available as a free plugin for Adobe Acrobat; this plugin allows interactive testing and evaluation of TET’s superior text extraction.
TET PDF IFilter offers the following advantages:
Indexes not only page content, but also custom metadata, bookmarks, and PDF attachments
Extracts text even from PDFs where Acrobat fails
Indexes XMP image metadata
Performance: thread-safe, fast and robust, 32- and 64-bit
Lean stand-alone product
Automatic language/script detection
Actively supported by a dedicated team
TET PDF IFilter is available in fully thread-safe native 32- and 64-bit versions. You can implement enterprise PDF search solutions with TET PDF IFilter and the following products:
Microsoft Office SharePoint Server (MOSS)
Microsoft Search Server 2008 and the free Search Server 2008 Express
Microsoft SQL Server
Microsoft Exchange Server
TET PDF IFilter can be used with all other Microsoft and third-party products which support the IFilter interface.
TET PDF IFilter can also be used to implement desktop PDF search, e.g. with the following products:
Windows Desktop Search (WDS): integrated in Windows Vista; also available as a free add-on for Windows XP
Windows Indexing Service
TET PDF IFilter is freely available for non-commercial desktop use, which provides a convenient basis for testing and evaluation.
PDFlib TET PDF IFilter is available for download as a beta version.