OCR PDF
Add a searchable text layer to scanned or image-based PDF documents using Optical Character Recognition.
Overview
The OCR PDF tool applies Optical Character Recognition to scanned or image-based PDF pages, creating an invisible text layer behind the page images. This makes the document fully searchable, selectable, and accessible without altering its visual appearance. OCR is essential for digitising paper documents and meeting accessibility requirements.
How to Use
- Navigate to OCR PDF from the Security & Optimization menu.
- Upload your file using drag-and-drop, Browse Files, or cloud storage (Dropbox / Google Drive).
- Select the document language for optimal recognition accuracy.
- Click OCR to process.
- Download the result when processing completes.
Options
| Option | Description |
|---|---|
| Language(s) | Select one or more languages that appear in the document. Hold Ctrl (or Cmd on Mac) to select multiple. Choosing the correct language improves recognition accuracy for language-specific characters and dictionaries. Defaults to English. |
Supported Languages
OCR supports the 60+ languages listed below, all of which ship with bundled recognition data. For multilingual documents, select all relevant languages from the dropdown.
Common
- English
- French
- German
- Spanish
- Italian
- Portuguese
- Dutch
- Chinese (Simplified & Traditional)
- Japanese
- Korean
- Arabic
- Hebrew
- Turkish
- Polish
European
- Albanian, Basque, Bosnian, Breton, Catalan, Corsican, Croatian, Czech, Danish, Estonian, Faroese, Finnish, Frisian, Galician, Hungarian, Icelandic, Irish, Latvian, Lithuanian, Luxembourgish, Maltese, Norwegian, Occitan, Romanian, Scottish Gaelic, Serbian (Latin), Slovak, Slovenian, Swedish, Welsh
Asian
- Cebuano, Filipino, Indonesian, Javanese, Malay, Sundanese, Urdu
Middle Eastern & African
- Afrikaans, Azerbaijani, Pashto, Persian, Sindhi, Swahili, Uyghur, Uzbek, Yoruba
Other
- Esperanto, Haitian Creole, Latin, Maori, Quechua, Tatar, Tongan, Yiddish
Languages without bundled recognition data (for example Russian, Greek, Thai, Vietnamese, Ukrainian, Bulgarian, and most Indic languages) are not offered. Contact support if you need one of these.
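If you script around the tool, you can check the availability rule above before uploading. This is a minimal sketch, not the tool's own logic: `BUNDLED` is a hypothetical set holding only a few of the names listed above, and `split_by_availability` is an illustrative helper. The tool's internal language identifiers are not documented here.

```python
# Hypothetical subset of the languages listed above; not the tool's internal list.
BUNDLED = {"English", "French", "German", "Spanish", "Japanese", "Welsh"}


def split_by_availability(requested):
    """Return (available, unavailable) language names, preserving input order."""
    available = [name for name in requested if name in BUNDLED]
    unavailable = [name for name in requested if name not in BUNDLED]
    return available, unavailable


ok, missing = split_by_availability(["English", "Russian", "French"])
print(ok)       # ['English', 'French']
print(missing)  # ['Russian']
```

Any name that lands in the second list is one to raise with support instead of expecting it in the dropdown.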
Tips & Notes
Higher-quality scans produce better OCR results. Scan at 300 DPI or higher, and make sure the pages are straight, because skewed text lowers recognition accuracy.
For multilingual documents, select every language that appears in the document so the OCR engine can recognise each one. Do not add languages that are absent: selecting only the languages actually present gives the best accuracy.
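To check whether an existing scan reaches the 300 DPI target, divide its pixel dimensions by the physical page size in inches. The helpers below are an illustrative sketch. The names `effective_dpi` and `meets_target` are hypothetical and not part of the tool.

```python
def effective_dpi(pixels: int, inches: float) -> float:
    """Scan resolution along one axis: pixels divided by physical length in inches."""
    if inches <= 0:
        raise ValueError("inches must be positive")
    return pixels / inches


def meets_target(width_px, height_px, width_in, height_in, target_dpi=300):
    """True when both axes reach the target resolution."""
    return (effective_dpi(width_px, width_in) >= target_dpi
            and effective_dpi(height_px, height_in) >= target_dpi)


# US Letter (8.5 x 11 in) scanned to 2550 x 3300 pixels is exactly 300 DPI.
print(effective_dpi(2550, 8.5))           # 300.0
print(meets_target(2550, 3300, 8.5, 11))  # True
print(meets_target(1275, 1650, 8.5, 11))  # False (150 DPI)
```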
Pages that already contain a text layer are skipped to avoid duplicating text. Only image-based pages are processed.
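The skip rule can be pictured as a simple filter over the pages. The sketch below only illustrates the behaviour described here. The `Page` class and its `has_text_layer` flag are hypothetical, and how the tool actually detects an existing text layer is not documented.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    has_text_layer: bool  # hypothetical flag; the tool's detection method is not documented


def pages_to_ocr(pages):
    """Return the page numbers that would be sent to OCR (image-only pages)."""
    return [p.number for p in pages if not p.has_text_layer]


doc = [Page(1, True), Page(2, False), Page(3, False), Page(4, True)]
print(pages_to_ocr(doc))  # [2, 3]
```

In a mixed document like this one, pages 1 and 4 keep their original text untouched and only pages 2 and 3 gain a new text layer.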
Related tools: Extract Text · PDF to Word