Recognizers¶
Text recognition backends for extracting text from detected blocks.
Supported Recognizers¶
Cloud VLM APIs¶
- OpenAI: GPT-4 Vision, GPT-4o
- Gemini: Gemini 2.5 Flash, Gemini 2.0 Flash
- OpenRouter: Access to multiple VLMs
Local Models¶
- PaddleOCR-VL: PaddleOCR-VL-0.9B (NaViT + ERNIE-4.5-0.3B)
- 0.9B parameters
- 109 languages support
- No API costs
- Requires GPU (recommended)
Note
Detailed recognizer API documentation coming soon. See Basic Usage for usage examples.