List of words in the document along with the number of occurrences: the 138 and 100 TET 87 of 61 to 57 PDF 54 text 51 in 50 for 50 a 46 are 39 as 36 is 34 PDFlib 31 or 31 be 30 with 26 can 23 image 22 on 21 page 20 Unicode 19 color 17 e 17 document 17 The 17 images 16 which 16 g 16 characters 16 products 16 documents 15 other 15 all 15 information 13 TETML 13 Text 12 such 12 used 12 Glyph 12 extracts 11 metadata 11 from 11 available 11 contents 11 extracted 11 output 11 character 11 font 10 word 10 may 10 common 9 not 9 GmbH 9 size 9 form 9 many 8 glyph 7 formats 7 contains 7 processing 7 various 7 parts 7 pdflib 7 com 7 TIFF 7 XMP 7 more 7 software 7 correctly 7 Extraction 6 well 6 into 6 PDFs 6 forms 6 www 6 details 6 separate 6 applications 6 It 6 F1 6 width 6 it 6 often 6 by 6 small 6 s 6 single 6 but 6 datasheet 5 Image 5 optionally 5 pCOS 5 interface 5 search 5 including 5 scripts 5 Arabic 5 languages 5 proper 5 This 5 extraction 5 In 5 library 5 only 5 connector 5 must 5 two 5 ligatures 5 Other 5 extract 5 support 5 process 5 use 5 spot 5 fragmented 5 programming 5 position 4 an 4 content 4 shadow 4 All 4 Acrobat 4 required 4 corresponding 4 includes 4 words 4 instances 4 order 4 example 4 spaces 4 DeviceN 4 pages 4 JPEG 4 combined 4 since 4 XML 4 detects 4 removes 4 that 4 bold 4 result 4 create 4 For 4 top 4 this 4 initial 4 colors 4 channels 4 command-line 4 offer 4 deployment 4 server 4 C 4 XML-based 3 called 3 columns 3 redundant 3 elements 3 etc 3 engine 3 based 3 supports 3 versions 3 up 3 systems 3 world 3 implements 3 Hebrew 3 logical 3 encoding 3 Ligatures 3 appropriate 3 mapping 3 identified 3 problems 3 packages 3 Word 3 Detection 3 several 3 patented 3 hyphenated 3 over 3 multiple 3 each 3 lists 3 included 3 Separation 3 Images 3 no 3 about 3 see 3 replace 3 one 3 emit 3 requirements 3 contain 3 While 3 bookmarks 3 file 3 processed 3 annotations 3 variety 3 also 3 even 3 XSLT 3 following 3 i 3 paragraph 3 these 3 environments 3 Search 3 
Microsoft 3 product 3 Windows 3 Challenges 3 different 3 shadowed 3 where 3 using 3 algorithm 3 if 3 placed 3 delivers 3 Drop 3 If 3 pieces 3 damaged 3 CMYK 3 any 3 fragments 3 thousands 3 development 3 tool 3 batch 3 suitable 3 suited 3 Cookbook 3 Products 3 We 3 IBM 3 licenses 3 Toolkit 2 What 2 makes 2 plus 2 Raster 2 converts 2 format 2 resource 2 advanced 2 analysis 2 algorithms 2 boundaries 2 grouping 2 table 2 Using 2 integrated 2 retrieve 2 interactive 2 their 2 headings 2 PDI 2 addition 2 placing 2 flavors 2 ISO 2 require 2 password 2 Damaged 2 World 2 processes 2 some 2 Latin 2 reordering 2 right-to-left 2 bidirectional 2 normalization 2 Japanese 2 regardless 2 vertical 2 without 2 Since 2 encoded 2 glyphs 2 sequence 2 avoid 2 InDesign 2 TeX 2 Analysis 2 Table 2 analyzed 2 determine 2 span 2 rows 2 precise 2 headers 2 footers 2 Color 2 alternate 2 space 2 geometric 2 larger 2 facilitate 2 fidelity 2 guaranteed 2 conversion 2 quality 2 Tagged 2 irrelevant 2 Artifact 2 Interface 2 info 2 preserve 2 remove 2 wide 2 variants 2 standard 2 meet 2 Web 2 than 2 most 2 domains 2 custom 2 fields 2 standards 2 individual 2 represents 2 tools 2 stylesheets 2 filters 2 Box 2 connectors 2 They 2 Server 2 IFilter 2 lines 2 hyphen 2 combines 2 treated 2 they 2 same 2 multiply 2 still 2 programs 2 first 2 both 2 combine 2 combinations 2 extracting 2 photographs 2 caps 2 remainder 2 tasks 2 provide 2 values 2 case 2 does 2 usable 2 unusable 2 garbage 2 runs 2 right 2 left 2 left-to-right 2 additional 2 so 2 heavily 2 cannot 2 displayed 2 subset 2 pixel 2 channel 2 plain 2 consist 2 Adobe 2 Both 2 offers 2 options 2 integration 2 workflows 2 samples 2 core 2 free 2 macOS 2 System 2 iOS 2 code 2 performance 2 NET 2 worldwide 2 our 2 Support 2 will 2 have 2 response 2 times 2 site 2 reliably 1 strings 1 detailed 1 determining 1 identifying 1 structures 1 removing 1 items 1 arbitrary 1 objects 1 With 1 Implement 1 indexer 1 Repurpose 1 Convert 1 Process 1 splitting 1 requires 
1 Check 1 whether 1 particular 1 location 1 empty 1 barcode 1 stamp 1 Features 1 Accepted 1 Input 1 input 1 DC 1 Protected 1 do 1 opening 1 repaired 1 Writing 1 Systems 1 writing 1 special 1 Greek 1 Cyrillic 1 presentation 1 Simplified 1 Traditional 1 Chinese 1 Korean 1 horizontal 1 Indic 1 supported 1 usually 1 normalizes 1 method 1 multi-character 1 decomposed 1 Glyphs 1 mapped 1 configurable 1 replacement 1 misinterpretation 1 workarounds 1 specific 1 creation 1 generated 1 mainframe 1 Content 1 Determine 1 Combine 1 dehyphenation 1 Remove 1 duplicate 1 artificially 1 bolded 1 Recombine 1 paragraphs 1 reading 1 Correctly 1 scattered 1 Page 1 Layout 1 List 1 Tables 1 detected 1 cells 1 improves 1 ordering 1 cell 1 Bulleted 1 numbered 1 Geometry 1 provides 1 metrics 1 widths 1 direction 1 Specific 1 areas 1 excluded 1 ignore 1 margins 1 analyzes 1 description 1 returns 1 identify 1 highlighted 1 Optionally 1 simpler 1 JBIG2 1 files 1 Precise 1 angles 1 reported 1 Fragmented 1 repurposing 1 downsampling 1 occurs 1 ensures 1 highest 1 possible 1 Ignore 1 Artifacts 1 especially 1 UA 1 tagged 1 ignores 1 querying 1 Postprocessing 1 postprocessing 1 steps 1 improve 1 Foldings 1 punctuation 1 Decompositions 1 equivalent 1 narrow 1 superscript 1 respective 1 counterparts 1 converted 1 NFC 1 database 1 Document 1 Domains 1 places 1 deal 1 situations 1 relevant 1 predefined 1 entries 1 level 1 attachments 1 portfolios 1 recursively 1 comments 1 general 1 properties 1 queried 1 count 1 conformance 1 like 1 A 1 X 1 Metadata 1 ways 1 programmatically 1 Contents 1 flavor 1 fonts 1 colorspaces 1 analyze 1 JavaScript 1 ICC 1 profiles 1 intents 1 apply 1 convert 1 Sample 1 distribution 1 fragment 1 shows 1 llx 1 lly 1 urx 1 ury 1 P 1 D 1 F 1 l 1 b 1 include 1 tables 1 placement 1 along 1 Connectors 1 make 1 Lucene 1 Engine 1 Solr 1 Apache 1 TIKA 1 toolkit 1 Oracle 1 MediaWiki 1 retrieval 1 Dehyphenation 1 complete 1 important 1 ensure 1 searches 1 full 1 successful 1 although 1 
present 1 Dashes 1 hyphens 1 separately 1 removed 1 Shadow 1 artifical 1 Digital 1 effect 1 achieved 1 offset 1 between 1 Similarly 1 simulated 1 overprinting 1 As 1 once 1 detection 1 identifies 1 excess 1 copies 1 extra 1 hit 1 hits 1 would 1 found 1 duplicated 1 Accented 1 Characters 1 accents 1 diacritical 1 marks 1 close 1 Some 1 typesetting 1 base 1 accent 1 letter 1 then 1 dieresis 1 situation 1 composite 1 fi 1 fl 1 ffi 1 less 1 Th 1 sp 1 ct 1 st 1 others 1 When 1 digital 1 separated 1 constituent 1 allow 1 keeps 1 dash 1 Inttrroduccttiion 1 Introduction 1 Midi-Pyr 1 en 1 ees 1 Midi-Pyrénées 1 rst 1 Caps 1 large 1 at 1 beginning 1 aligns 1 line 1 drops 1 down 1 emphasize 1 start 1 properly 1 drop 1 cap 1 S 1 tellen 1 Stellen 1 Mapping 1 foundation 1 every 1 assigned 1 value 1 complicates 1 supporting 1 assign 1 worst 1 enough 1 cascaded 1 takes 1 problematic 1 deliver 1 while 1 Bidirectional 1 encode 1 simply 1 container 1 script 1 inserts 1 numbers 1 names 1 Western 1 interpreted 1 directions 1 hence 1 term 1 poses 1 challenges 1 four 1 contextual 1 These 1 shaped 1 normalized 1 isolated 1 reorders 1 visual 1 mixture 1 Documents 1 get 1 because 1 transmission 1 errors 1 repair 1 mode 1 recovers 1 kinds 1 Sometimes 1 Even 1 extreme 1 cases 1 Spaces 1 Compression 1 data 1 combination 1 eleven 1 nine 1 compression 1 balances 1 characteristics 1 capabilities 1 Regardless 1 internal 1 structure 1 Spot 1 Colors 1 Technically 1 known 1 Device-dependent 1 CIE-based 1 Special 1 DeviceGray 1 CalGray 1 Indexed 1 DeviceRGB 1 CalRGB 1 Pattern 1 DeviceCMYK 1 Lab 1 ICCBased 1 creates 1 intended 1 need 1 superior 1 accept 1 Cyan 1 Magenta 1 missing 1 added 1 created 1 However 1 able 1 handle 1 restricted 1 instructed 1 grayscale 1 Merging 1 broken 1 producing 1 appears 1 actually 1 Office 1 produce 1 hundreds 1 segments 1 varying 1 transparency 1 flattening 1 merges 1 rectangular 1 grid 1 Only 1 merging 1 reasonably 1 repurposed 1 Photoshop 1 displays 1 Channels 1 window 
1 Double-clicking 1 icons 1 reveals 1 Although 1 reusable 1 bottom 1 Many 1 Ways 1 operations 1 similar 1 features 1 scenarios 1 component 1 desktop 1 Examples 1 package 1 doesn 1 t 1 integrate 1 complex 1 developers 1 who 1 familiar 1 range 1 integrating 1 databases 1 engines 1 collection 1 examples 1 demonstrate 1 Several 1 show 1 how 1 enhance 1 add 1 links 1 Family 1 family 1 comprises 1 described 1 Share- 1 Point 1 SQL 1 Plugin 1 utility 1 evaluate 1 interactively 1 Supported 1 Development 1 Environments 1 everywhere 1 practically 1 computing 1 platforms 1 Linux 1 Unix 1 Z 1 mobile 1 Android 1 written 1 highly 1 optimized 1 maximum 1 overhead 1 Via 1 simple 1 API 1 Application 1 Programming 1 functionality 1 accessible 1 Java 1 Core 1 Objective-C 1 Perl 1 PHP 1 Python 1 RPG 1 Ruby 1 Benefits 1 Software 1 Rock-solid 1 Tens 1 programmers 1 working 1 meets 1 robust 1 unattended 1 Speed 1 Simplicity 1 incredibly 1 fast 1 per 1 second 1 straightforward 1 easy 1 learn 1 Our 1 international 1 customers 1 Professional 1 there 1 problem 1 we 1 try 1 help 1 commercial 1 business-critical 1 By 1 adding 1 access 1 latest 1 should 1 arise 1 Licensing 1 licensing 1 source 1 contracts 1 extended 1 technical 1 short 1 updates 1 About 1 completely 1 focused 1 technology 1 Customers 1 company 1 closely 1 follows 1 market 1 trends 1 distributed 1 major 1 markets 1 North 1 America 1 Europe 1 Japan 1 Contact 1 Fully 1 functional 1 evaluation 1 documentation 1 please 1 contact 1 Franziska-Bilek-Weg 1 München 1 Germany 1 phone 1 sales 1 Total unique words: 959
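A listing like the one above can be approximated with a few lines of Python. This is only a sketch under assumptions, not the tool that produced the list: the tokenization rule (runs of letters or digits, optionally joined by hyphens, case-sensitive) is inferred from entries such as "e 17", "g 16", "doesn 1 t 1" and "command-line 4", and the names `WORD_RE`, `word_frequencies` and `format_report` are hypothetical. Words are ordered by descending count, with ties kept in order of first appearance.

```python
import re
from collections import Counter

# Assumed tokenization: letters/digits, optionally joined by inner hyphens.
# Other punctuation splits words, so "e.g." yields "e" and "g".
WORD_RE = re.compile(r"[^\W_]+(?:-[^\W_]+)*")


def word_frequencies(text):
    """Return (word, count) pairs, most frequent first (case-sensitive)."""
    counts = Counter(WORD_RE.findall(text))
    # most_common() sorts by count; equal counts keep first-seen order.
    return counts.most_common()


def format_report(text):
    """Render the pairs in the same flat 'word count' style as the listing."""
    pairs = word_frequencies(text)
    listing = " ".join(f"{word} {count}" for word, count in pairs)
    return (
        "List of words in the document along with the number of occurrences: "
        f"{listing} Total unique words: {len(pairs)}"
    )


if __name__ == "__main__":
    sample = "The page, e.g. the command-line page."
    print(format_report(sample))
```

The input text would first have to be extracted from the PDF by some other means; that step is deliberately left out here.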