Quickstart¶
Data¶
Load the dataset from Hugging Face:
from datasets import load_dataset
dataset = load_dataset("Riksarkivet/Trolldomkomission")["train"]
images = dataset["image"]
Volume¶
Segment Images¶
from htrflow_core.models.ultralytics.yolo import YOLO
seg_model = YOLO('ultralyticsplus/yolov8s')
res = seg_model(vol.images())  # vol.segments() also works here, since at this point the segments are the images themselves
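The comment above notes that vol.segments() initially points to the images. A toy model may help make that concrete. This is only a sketch and not htrflow_core's implementation: ToyNode and ToyVolume are hypothetical classes, and the assumption is that a volume behaves like a tree whose leaves are the current segments, with an update attaching new regions beneath those leaves.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for an image or a region cut from it."""
    label: str
    children: list["ToyNode"] = field(default_factory=list)

    def leaves(self) -> list["ToyNode"]:
        # A node without children is itself a leaf.
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


class ToyVolume:
    """Hypothetical volume: one root node per input image."""

    def __init__(self, image_names: list[str]):
        self.pages = [ToyNode(name) for name in image_names]

    def segments(self) -> list[ToyNode]:
        # The segments are the leaves of every page tree.
        return [leaf for page in self.pages for leaf in page.leaves()]

    def update(self, results: list[list[str]]) -> None:
        # results[i] holds the region labels found in the i-th current leaf.
        for leaf, regions in zip(self.segments(), results):
            leaf.children = [ToyNode(region) for region in regions]


toy = ToyVolume(["page_a", "page_b"])
before = [s.label for s in toy.segments()]  # leaves are the images themselves
toy.update([["a_line1", "a_line2"], ["b_line1"]])
after = [s.label for s in toy.segments()]   # leaves are now the detected regions
print(before, after)
```

In this toy model, running a model on the segments before any update is the same as running it on the images, and each update moves the leaves one level deeper.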
Update Volume¶
HTR¶
from htrflow_core.models.huggingface.trocr import TrOCR
rec_model = TrOCR()
res = rec_model(vol.segments())
vol.update(res)
Serialize¶
The output is saved at outputs/.xml. Because the two demo images share the same name, only one output file is written.
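The name collision described above can be reproduced in a few lines. This is a sketch, not htrflow_core's serializer: output_path and unique_output_paths are hypothetical helpers, and the rule "one XML file per image stem inside outputs/" is an assumption made for illustration.

```python
from collections import Counter
from pathlib import Path


def output_path(image_path: str, out_dir: str = "outputs") -> Path:
    """Hypothetical naming rule: one XML file per image stem."""
    return Path(out_dir) / f"{Path(image_path).stem}.xml"


def unique_output_paths(image_paths: list[str], out_dir: str = "outputs") -> list[Path]:
    """Append a running index to repeated stems so no file is overwritten."""
    totals = Counter(Path(p).stem for p in image_paths)
    seen: Counter = Counter()
    paths = []
    for p in image_paths:
        stem = Path(p).stem
        if totals[stem] > 1:
            seen[stem] += 1
            name = f"{stem}_{seen[stem]}.xml"
        else:
            name = f"{stem}.xml"
        paths.append(Path(out_dir) / name)
    return paths


# Two images in different folders but with the same file name (hypothetical paths).
demo = ["a/demo_image.jpg", "b/demo_image.jpg"]
naive = {output_path(p) for p in demo}
print(len(naive))  # both images map to the same output file
print([p.name for p in unique_output_paths(demo)])
```

If each image needs its own file, giving the inputs distinct names before serializing avoids the overwrite. The index suffix shown here is just one possible scheme.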