com.qoppa.ocr
Class TessJNI
java.lang.Object
com.qoppa.ocr.TessJNI
public class TessJNI
- extends Object
This class provides a native interface to the Tesseract OCR engine.
- Author:
- Qoppa Software
TessJNI
public TessJNI()
performOCR
public String performOCR(String language,
BufferedImage image)
throws OCRException
- Performs OCR on an image and returns the result as an hOCR string. This method calls the Tesseract OCR
engine to perform character recognition on the image. The results are in hOCR format, a standard format
for OCR results that includes the recognized text as well as its location and size.
- Parameters:
language - The language to use in performing the OCR.
image - The image to process.
- Returns:
- The OCR results, in hOCR format.
- Throws:
OCRException
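Because the return value is an hOCR string, callers typically need to pull the recognized words and their bounding boxes out of it. The following is a minimal sketch of that step using only the JDK. It assumes the returned markup follows the Tesseract convention of `ocrx_word` spans whose `title` attribute starts with `bbox x0 y0 x1 y1`; the class `HocrWords`, the sample markup, and the `"eng"` language code in the comment are illustrative assumptions, not part of this API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of com.qoppa.ocr.
final class HocrWords {

    /** One recognized word and its bounding box, in pixels of the OCR'd image. */
    static final class Word {
        final String text;
        final int x0, y0, x1, y1;

        Word(String text, int x0, int y0, int x1, int y1) {
            this.text = text;
            this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
        }

        @Override
        public String toString() {
            return text + " [" + x0 + "," + y0 + " - " + x1 + "," + y1 + "]";
        }
    }

    // Assumed word markup: <span class='ocrx_word' ... title='bbox x0 y0 x1 y1; ...'>text</span>
    private static final Pattern WORD = Pattern.compile(
        "<span[^>]*class=['\"]ocrx_word['\"][^>]*title=['\"]bbox (\\d+) (\\d+) (\\d+) (\\d+)[^'\"]*['\"][^>]*>(.*?)</span>",
        Pattern.DOTALL);

    /** Extracts every word span found in the given hOCR string. */
    static List<Word> parse(String hocr) {
        List<Word> words = new ArrayList<>();
        Matcher m = WORD.matcher(hocr);
        while (m.find()) {
            // Drop any inline tags (for example <strong>) nested inside the word span.
            String text = m.group(5).replaceAll("<[^>]+>", "").trim();
            words.add(new Word(text,
                Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4))));
        }
        return words;
    }

    public static void main(String[] args) {
        // In real use the string would come from the method documented above, e.g.
        //   String hocr = new TessJNI().performOCR("eng", image);
        // ("eng" is an assumed language code). A hand-written sample is used here instead.
        String hocr =
            "<span class='ocr_line' title='bbox 36 92 210 116'>"
            + "<span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span> "
            + "<span class='ocrx_word' title='bbox 110 92 210 116; x_wconf 91'>world</span>"
            + "</span>";
        for (Word w : parse(hocr)) {
            System.out.println(w);
        }
    }
}
```

A regular expression keeps the sketch short, but it does not decode HTML entities or handle unusual attribute ordering; an HTML or XML parser would be a more robust choice for production use.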
performOCR
public String performOCR(String language,
PDFPage pdfPage,
int dpi)
throws PDFException,
OCRException
- Performs OCR on a PDF page and returns the result as an hOCR string. This method converts the PDF page to an image and then
calls the Tesseract OCR engine to perform character recognition on the image. The results are in hOCR format,
a standard format for OCR results that includes the recognized text as well as its location and size.
- Parameters:
language - The language to use in performing the OCR.
pdfPage - The PDF page to process.
dpi - The resolution, in dots per inch, at which the PDF page is converted to an image.
- Returns:
- The OCR results, in hOCR format.
- Throws:
OCRException
PDFException
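Since the page is converted to an image before recognition, the bounding boxes in the hOCR result are presumably expressed in pixels of that image rather than in PDF units. The sketch below shows the unit conversion under that assumption, treating `dpi` as the rendering resolution and using the standard 72 points per inch; the class `HocrToPdfUnits` and its method are hypothetical helpers, not part of this API.

```java
// Hypothetical helper, not part of com.qoppa.ocr.
final class HocrToPdfUnits {

    /** PDF user space uses 72 points per inch by default. */
    static final double POINTS_PER_INCH = 72.0;

    /**
     * Converts a length in image pixels to PDF points.
     * Assumes the image was rendered from the page at the given dpi.
     */
    static double pixelsToPoints(int pixels, int dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return pixels * POINTS_PER_INCH / dpi;
    }

    public static void main(String[] args) {
        int dpi = 300; // the value assumed to have been passed to performOCR
        // 300 pixels at 300 dpi is one inch, which is 72 points.
        System.out.println(pixelsToPoints(300, dpi));
        // Left edge of a word whose hOCR bbox starts at x = 36 pixels.
        System.out.println(pixelsToPoints(36, dpi));
    }
}
```

Depending on the coordinate system of whatever consumes the result, a vertical flip may also be needed, because hOCR measures from the top-left corner of the image.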