OCR
✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.
OCR as a task
Software
OCRopus
- document OCR (designed for large-scale digitization projects such as Google Books and the Internet Archive)
- multifont, multilanguage
- https://en.wikipedia.org/wiki/OCRopus
Tesseract
- document OCR
- https://opensource.google.com/projects/tesseract
- https://en.wikipedia.org/wiki/Tesseract_(software)
CuneiForm
keras-ocr
EasyOCR
ABBYY (FineReader)
Google Docs OCR
- online-only
Rossum
- paid, online-only?
- https://rossum.ai/lp/ocr-software/
Amazon Rekognition
- more for scene text?(verify)
- paid, online-only
Amazon Textract
- more for documents?(verify)
- paid, online-only
Transym
- more for documents?(verify)
- paid, online-only
- https://transym.com/
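Of the engines listed above, Tesseract is easy to script because it ships a command-line tool. The following is a minimal sketch, assuming the `tesseract` binary is installed and on the PATH; the function names `build_tesseract_cmd` and `ocr_to_text` are made up for illustration, and only the basic `tesseract IMAGE OUTPUTBASE -l LANG [hocr]` invocation is used.

```python
import shutil
import subprocess


def build_tesseract_cmd(image, out_base, lang="eng", hocr=False):
    """Build the argument list for the tesseract command-line tool.

    Basic form: tesseract IMAGE OUTPUTBASE -l LANG [CONFIGFILE]
    Using "stdout" as OUTPUTBASE prints the result instead of writing a file.
    """
    cmd = ["tesseract", str(image), str(out_base), "-l", lang]
    if hocr:
        # The 'hocr' config file switches the output from plain text to hOCR.
        cmd.append("hocr")
    return cmd


def ocr_to_text(image, lang="eng"):
    """Run tesseract on one image and return the recognized text.

    Returns None when the tesseract binary cannot be found (hypothetical
    convention for this sketch, so callers can fall back to something else).
    """
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_cmd(image, "stdout", lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

Something like `ocr_to_text("scan.png")` would then return the page text, where `scan.png` is a placeholder file name.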
Integrated features / online APIs (i.e. not easy to automate)
- Acrobat
- Google Keep
- Google Drive ('open with' converts)
- OneNote
- IBM Datacap
Convenience tools / wrappers
- OCR from a screen capture; more of a convenience tool
- works quite well, and fairly quickly, on text rendered from fonts, even in a photographic context, but degrades quickly on more creative (stylized) text
Document managers
Apache Tika
- geared at content analysis and indexing (also metadata/document structure parser)
- uses tesseract for OCR
- https://tika.apache.org/
Aleph
Output formats
hOCR
An HTML-based format that stores the position of detected words and text fragments, and optionally detected style, layout, and other information, embedded as attributes in HTML or XHTML markup.
https://en.wikipedia.org/wiki/HOCR
https://pypi.org/project/hocr-spec/
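To make the hOCR description concrete: words are typically elements with class `ocrx_word`, and their position sits in the `title` attribute as `bbox x0 y0 x1 y1`. Below is a minimal sketch using only the standard library to pull out words and their boxes. The sample markup and the names `HocrWordParser`, `parse_bbox` and `extract_words` are invented for illustration; real output may carry more properties and nesting.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title string, or None if absent."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) for every element with class 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0    # > 0 while inside a word element
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1  # nested markup inside a word, e.g. <strong>
            return
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []
            self._depth = 1

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)

    def handle_endtag(self, tag):
        if not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:  # the word element itself just closed
            self.words.append(("".join(self._text).strip(), self._bbox))


def extract_words(hocr_text):
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


# Hand-written sample in the shape hOCR producers commonly emit.
SAMPLE = """
<div class='ocr_page' title='bbox 0 0 640 480'>
 <span class='ocr_line' title='bbox 36 92 220 116'>
  <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
  <span class='ocrx_word' title='bbox 110 92 220 116; x_wconf 91'>world</span>
 </span>
</div>
"""

words = extract_words(SAMPLE)
```

Here `words` ends up as a list of `(text, (x0, y0, x1, y1))` tuples, one per word, in document order.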
ALTO
https://en.wikipedia.org/wiki/ALTO_(XML)
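For comparison with hOCR, ALTO is plain XML in which each word is a `String` element carrying `CONTENT`, `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes. A rough sketch of reading those with the standard library follows; the sample document is hand-written, `alto_strings` is a made-up helper name, and the namespace is matched loosely since it differs between ALTO versions.

```python
import xml.etree.ElementTree as ET


def alto_strings(xml_text):
    """Return (content, hpos, vpos, width, height) for each ALTO String element."""
    root = ET.fromstring(xml_text)
    found = []
    for el in root.iter():
        # Tags look like '{namespace}String'; compare the local name only,
        # so the sketch does not depend on one specific ALTO version.
        if el.tag.rsplit("}", 1)[-1] != "String":
            continue
        # Assumption for this sketch: missing coordinates are treated as 0.
        box = tuple(float(el.get(k, "0")) for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
        found.append((el.get("CONTENT", ""),) + box)
    return found


# Hand-written sample, heavily trimmed compared to real ALTO files.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="Hello" HPOS="36" VPOS="92" WIDTH="60" HEIGHT="24"/>
    <SP/>
    <String CONTENT="world" HPOS="110" VPOS="92" WIDTH="110" HEIGHT="24"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""

strings = alto_strings(SAMPLE_ALTO)
```

Note that ALTO stores a position plus width and height, whereas hOCR's `bbox` stores two corner points, so converting between the two needs a small amount of arithmetic.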
PAGE XML
https://en.wikipedia.org/wiki/PAGE_(XML)
abbyyXML
https://support.abbyy.com/hc/en-us/articles/360017336699-ABBYY-FineReader-Engine-XML-Export