HTRflow
HTRflow is an open-source tool for handwritten text recognition (HTR) and optical character recognition (OCR), developed by the AI lab at the National Archives of Sweden (Riksarkivet).
Key features
- Flexibility: Customize the HTR/OCR process for different kinds of materials.
- Compatibility: HTRflow supports all models trained by the AI lab - and more!
- YAML pipelines: HTRflow YAML pipelines are easy to create, modify and share.
- Export: Export results as ALTO XML, PAGE XML, plain text or JSON.
- Evaluation: Compare results from different pipelines with ground truth.
Installation
Install HTRflow with pip:
    pip install htrflow
For more details, see the Installation guide.
Getting started
Ready to build your own pipeline for your documents? Head over to the Quickstart guide to get started with HTRflow.
The guide walks you through setting up your first pipeline, using pre-trained models, and running HTR/OCR tasks. With the HTRflow CLI, you can quickly set up pipelines using pipeline.yaml files as your "blueprints".