OCR PDF

PDF Studio can run OCR on a document using any of the available OCR languages. OCR adds text to scanned documents or images so that the document can be searched or marked up like any other text document. PDF Studio 11 also introduces the ability to run OCR with two languages at once. For more information on running OCR with two languages, see OCR Preferences.

What is OCR?

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed or printed text into machine-encoded searchable text data.

From Existing Document

Text can be added to an existing document using OCR:

  1. Launch PDF Studio and open the PDF document that you wish to add searchable text to.
  2. Go to Document > OCR – Create Searchable PDF from the menu.
  3. From the Language drop-down, select the language you wish to use.
  4. Select the Page Range and Resolution that you wish to use.
  5. Click “OK” to begin the OCR process.
  6. A progress dialog shows the page currently being processed. Once it completes, click “OK” to close the dialog.
  7. Your document is now ready to be searched, edited, or marked up with highlight, underline, cross-out, or caret annotations.
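
To make the Page Range setting in step 4 more concrete, the sketch below shows one way a range such as "1-3,5" could be expanded into individual page numbers. It is only an illustration: the comma-and-hyphen syntax and the function name parse_page_range are assumptions made for this example, not PDF Studio's documented input format or code.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a range such as "1-3,5" into sorted 1-based page numbers.

    The "1-3,5" syntax is an assumption for this sketch, not a
    documented PDF Studio format.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty pieces such as a trailing comma
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject ranges that are reversed or fall outside the document.
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

For a 10-page document, parse_page_range("1-3,5", 10) selects pages 1, 2, 3 and 5, while a range that runs past the last page raises ValueError instead of being silently truncated.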

When Scanning a Document

OCR can add text to a document while it is being scanned with PDF Studio:

  1. Launch PDF Studio and start the scanning tool by either clicking the Scanner button on the toolbar or going to File > Create PDF > From Scanner.
  2. In the scanning dialog, select the option to OCR the document after scanning.
  3. From the Language drop-down, select the language you wish to use.
  4. After choosing your scanning and OCR settings, click “Scan” to begin scanning the document.
  5. Once scanning completes, the OCR process begins and a progress dialog shows the page currently being processed. Once it completes, click “OK” to close the dialog.
  6. Your document is now ready to be searched, edited, or marked up with highlight, underline, cross-out, or caret annotations.

Available OCR Languages

The following language dictionary files can be downloaded directly from within the PDF Studio OCR functions. Using the language file that matches the document's language will improve the accuracy of OCR results. See Tips on Improving OCR Results for additional information.

  • Afrikaans
  • Albanian – shqip
  • Arabic – العربية
  • Azerbaijani – azərbaycan
  • Basque – euskara
  • Belarusian – беларуская
  • Bengali – বাংলা
  • Bulgarian – български
  • Catalan – català
  • Cherokee
  • Chinese (Simplified) – 中文(简体)
  • Chinese (Traditional) – 中文(繁體)
  • Croatian – hrvatski
  • Czech – čeština
  • Danish – Dansk
  • Danish (Fraktur) – Dansk (Fraktur)
  • Dutch – Nederlands
  • English
  • Estonian – eesti
  • Finnish – suomi
  • French – français
  • Galician – galego
  • German – Deutsch
  • Greek – Ελληνικά
  • Hebrew – עברית
  • Hindi – हिन्दी
  • Hungarian – magyar
  • Icelandic – íslenska
  • Indonesian – Bahasa Indonesia
  • Italian – italiano
  • Italian (old) – italiano vecchio
  • Japanese – 日本語
  • Kannada – ಕನ್ನಡ
  • Korean – 한국어
  • Latvian – latviešu
  • Lithuanian – lietuvių
  • Macedonian – македонски
  • Malay – Bahasa Melayu
  • Malayalam – മലയാളം
  • Maltese – Malti
  • Math / Equations
  • Norwegian – norsk
  • Polish – polski
  • Portuguese – português
  • Romanian – română
  • Russian – русский
  • Serbian – српски
  • Slovakian – slovenčina
  • Slovakian (Fraktur) – slovenčina (Fraktur)
  • Slovenian – slovenščina
  • Spanish – español
  • Spanish (Old) – español (Antiguo)
  • Swahili – Kiswahili
  • Swedish – svenska
  • Tagalog
  • Tamil – தமிழ்
  • Telugu – తెలుగు
  • Thai – ไทย
  • Turkish – Türkçe
  • Ukrainian – українська
  • Vietnamese – Tiếng Việt