✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.
OCR as a task
Software
OCRopus
- document OCR (used in Google Books, Internet Archive)
- multifont, multilanguage
- https://en.wikipedia.org/wiki/OCRopus
Tesseract
- document OCR
- https://opensource.google.com/projects/tesseract
- https://en.wikipedia.org/wiki/Tesseract_(software)
CuneiForm
- https://en.wikipedia.org/wiki/CuneiForm_(software)
keras-ocr
- https://keras-ocr.readthedocs.io/en/latest/
EasyOCR
- https://github.com/JaidedAI/EasyOCR
ABBYY (FineReader)
- paid
- https://pdf.abbyy.com/
Google Docs OCR
- online-only
Rossum
- paid, online-only?
- https://rossum.ai/lp/ocr-software/
Amazon Rekognition
- more for scene text?(verify)
- paid, online-only
Amazon Textract
- more for documents?(verify)
- paid, online-only
Transym
- more for documents?(verify)
- paid, online-only
- https://transym.com/
Integrated features / online APIs (i.e. not easy to automate)
- Acrobat
- Google Keep
- Google Drive ('open with' converts)
- OneNote
- IBM Datacap[1]
Convenience tools / wrappers
PowerToys Text Extractor
- OCRs text from a screen capture; more of a convenience tool
- works quite well, and fairly quickly, on text rendered from ordinary fonts, even in a photographic context, though it degrades quickly on more creative text
Lios
Document managers with OCR
Output formats
hOCR
An HTML-based format that stores the positions of detected words and text fragments,
and optionally detected style, layout, and other information, embedded in otherwise ordinary HTML or XHTML.
https://en.wikipedia.org/wiki/HOCR
https://pypi.org/project/hocr-spec/
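To make the "position of detected words" part concrete, here is a minimal sketch of pulling words and bounding boxes out of hOCR using only the Python standard library. It assumes the common convention of word spans carrying class ocrx_word, with a title attribute holding a bbox and an optional x_wconf confidence. The sample fragment is hand-written for illustration and not output from any particular engine, and the names hocr_words, HocrWords and parse_title are made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical hOCR fragment, hand-written for illustration only.
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 800 600'>
  <span class='ocr_line' title='bbox 10 20 300 50'>
    <span class='ocrx_word' title='bbox 10 20 120 50; x_wconf 96'>Hello</span>
    <span class='ocrx_word' title='bbox 130 20 300 50; x_wconf 91'>world</span>
  </span>
</div>
"""


def parse_title(title):
    """Split an hOCR title attribute into {property_name: [values, ...]}."""
    props = {}
    for part in title.split(';'):
        fields = part.split()
        if fields:
            props[fields[0]] = fields[1:]
    return props


class HocrWords(HTMLParser):
    """Collects (text, bbox, confidence) for every element with class ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # title properties of the word we are inside, if any
        self._text = []
        self._depth = 0       # nesting depth inside the current word element

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            self._depth += 1  # nested markup inside a word
            return
        attrs = dict(attrs)
        if 'ocrx_word' in (attrs.get('class') or '').split():
            self._current = parse_title(attrs.get('title') or '')
            self._text = []
            self._depth = 1

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # bbox is "x0 y0 x1 y1" in image coordinates
            bbox = tuple(int(v) for v in self._current.get('bbox', []))
            conf = self._current.get('x_wconf')
            self.words.append((''.join(self._text).strip(),
                               bbox,
                               float(conf[0]) if conf else None))
            self._current = None


def hocr_words(hocr_text):
    parser = HocrWords()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


if __name__ == '__main__':
    for text, bbox, conf in hocr_words(SAMPLE_HOCR):
        print(text, bbox, conf)
```

A forgiving HTML parser is used rather than an XML one because hOCR files are not guaranteed to be well-formed XHTML. For real work, a proper HTML library is probably the more comfortable choice.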
ALTO
https://en.wikipedia.org/wiki/ALTO_(XML)
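For comparison, a rough sketch of reading words from ALTO, which is plain XML: each recognized word is a String element with CONTENT, HPOS, VPOS, WIDTH and HEIGHT attributes. The sample document is hand-written and heavily trimmed for illustration, the function name alto_words is made up for this example, and the code ignores the namespace version so that it does not care which ALTO version a file declares.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ALTO document, hand-written for illustration only.
SAMPLE_ALTO = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page ID="p1" WIDTH="800" HEIGHT="600">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine ID="l1" HPOS="10" VPOS="20" WIDTH="290" HEIGHT="30">
            <String CONTENT="Hello" HPOS="10" VPOS="20" WIDTH="110" HEIGHT="30"/>
            <SP HPOS="120" VPOS="20" WIDTH="10"/>
            <String CONTENT="world" HPOS="130" VPOS="20" WIDTH="170" HEIGHT="30"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit('}', 1)[-1]


def alto_words(alto_text):
    """Return [(text, (x0, y0, x1, y1)), ...] for every String element.

    Coordinates are in whatever measurement unit the file declares
    (often pixels); this sketch does not convert them.
    """
    root = ET.fromstring(alto_text)
    words = []
    for el in root.iter():
        if not isinstance(el.tag, str) or local_name(el.tag) != 'String':
            continue
        x = float(el.get('HPOS', 0))
        y = float(el.get('VPOS', 0))
        w = float(el.get('WIDTH', 0))
        h = float(el.get('HEIGHT', 0))
        # ALTO stores position plus size; convert to corner coordinates
        words.append((el.get('CONTENT', ''), (x, y, x + w, y + h)))
    return words


if __name__ == '__main__':
    for text, box in alto_words(SAMPLE_ALTO):
        print(text, box)
```

Note the difference from hOCR's bbox: ALTO gives top-left position plus width and height rather than two corners, so the conversion above is needed if you want to compare the two.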
PAGE XML
https://en.wikipedia.org/wiki/PAGE_(XML)
abbyyXML
https://support.abbyy.com/hc/en-us/articles/360017336699-ABBYY-FineReader-Engine-XML-Export