jPDFText Source Code Samples

The following Java samples use jPDFText to extract text content from PDF documents. More samples are available in the jPDFText knowledge base.

ExtractAllText.java – Simple program that extracts the entire text of a document as a single String and then saves it to a file.

ExtractTextByPage.java – Program that extracts the text for each page in a document and writes it to a file.

GetWordList.java – Program that gets all the words from a PDF document and echoes them to the console.

GetLinesAndPositions.java – Program that gets all the lines from a PDF document and echoes them to the console. This program can be used to extract data from structured reports such as invoices and statements.

GetWordsAndPositions.java – Extracts all the words in the document with their position information and echoes them to the console.
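
As a rough illustration of why position information is useful for structured reports such as invoices, the sketch below filters lines by the region of the page they fall in. It does not use the jPDFText API: the PositionedLine class, the linesInRegion method, and the coordinates are all hypothetical stand-ins for whatever line and position data the samples above produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InvoiceLineFilter {

    // Hypothetical stand-in for a line of text plus its position on the page.
    // This is NOT a jPDFText type; it only models "text with coordinates".
    static final class PositionedLine {
        final String text;
        final double x;
        final double y;

        PositionedLine(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Returns the text of every line whose (x, y) position lies inside
    // the rectangle [minX, maxX] x [minY, maxY], in input order.
    static List<String> linesInRegion(List<PositionedLine> lines,
                                      double minX, double minY,
                                      double maxX, double maxY) {
        List<String> result = new ArrayList<>();
        for (PositionedLine line : lines) {
            boolean insideX = line.x >= minX && line.x <= maxX;
            boolean insideY = line.y >= minY && line.y <= maxY;
            if (insideX && insideY) {
                result.add(line.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for lines extracted from an invoice.
        List<PositionedLine> lines = Arrays.asList(
            new PositionedLine("Invoice No. 1001", 400, 50),
            new PositionedLine("Widget", 50, 300),
            new PositionedLine("Total: 42.00", 400, 700));

        // Keep only the lines in the lower-right area, where this
        // hypothetical invoice layout prints its total.
        System.out.println(linesInRegion(lines, 350, 650, 600, 750));
    }
}
```

When a report always prints a given field in the same area of the page, a region filter of this kind can pick that field out without parsing the surrounding text.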