TET Cookbook


Text Extraction

Process the text contents of PDF documents
text_extractor: Simple text extractor.
concordance: Create a list of all unique words in the document.
back_of_the_book_index: Create a sorted list of all words in the document along with the page numbers where the words occur.
glyphinfo: Print text plus coordinates, font name, font size and more.
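
The word-list topics above can be illustrated without any PDF library. The following is a minimal sketch, not the Cookbook's own sample code: it assumes the text of each page has already been extracted into a list of strings (one per page), and the function names simply mirror the topic names for readability. The word pattern and the lowercasing are illustrative choices, not something the Cookbook prescribes.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of letters, optionally joined by
# internal apostrophes or hyphens (e.g. "don't", "back-of-the-book").
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def words(text):
    """Split text into lowercase words."""
    return [w.lower() for w in WORD_RE.findall(text)]


def concordance(pages):
    """Return a sorted list of all unique words across all pages."""
    return sorted({w for page in pages for w in words(page)})


def back_of_the_book_index(pages):
    """Map each word to the sorted 1-based page numbers where it occurs."""
    index = defaultdict(set)
    for page_number, page in enumerate(pages, start=1):
        for w in words(page):
            index[w].add(page_number)
    # Build a plain dict whose keys are in alphabetical order.
    return {w: sorted(index[w]) for w in sorted(index)}


if __name__ == "__main__":
    # Hypothetical page texts standing in for extracted PDF content.
    sample_pages = ["The quick fox", "the lazy dog and the fox"]
    print(concordance(sample_pages))
    for word, page_numbers in back_of_the_book_index(sample_pages).items():
        print(word, ", ".join(str(n) for n in page_numbers))
```

Collecting page numbers in a set avoids duplicate entries when a word appears several times on the same page.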