Lightweight notebook and scripts for detecting and grouping characters (digits/letters) on Scania images using YOLOv8.
- hotspot_detection_scania.ipynb: main notebook covering data conversion, training, prediction, grouping, visualization, and XML generation.
- new_dataset/: sample dataset structure (images + labels + data.yaml).
- Python 3.8+
- macOS (tested) or Linux
Run:
pip install -r requirements.txt
# or
pip install ultralytics opencv-python matplotlib tqdm pyyaml
(If you prefer to use the notebook cell, it already contains !pip install ultralytics opencv-python matplotlib.)
- Open hotspot_detection_scania.ipynb in Jupyter/VS Code.
- Edit configuration placeholders in the notebook cells (paths such as image folder, xml folder, model paths, dataset folder).
- Run the cells in order:
- Install dependencies
- Convert Pascal VOC XMLs to YOLO labels (if needed)
- Organize dataset into train/val and create data.yaml
- Train YOLOv8
- Visualize training metrics
- Run predictions and group characters
- Generate XML annotations from predictions
new_dataset/
- images/
- train/
- val/
- labels/
- train/
- val/
- data.yaml
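A layout like the one above can be produced with a small helper. The following is a minimal sketch, not the notebook's own code: split_dataset, write_data_yaml, val_fraction, and seed are hypothetical names, and it assumes each image has a label file with the same stem. The data.yaml it writes uses the path/train/val/names keys that Ultralytics dataset files use.

```python
import random
import shutil
from pathlib import Path


def split_dataset(image_dir, label_dir, out_dir, val_fraction=0.2, seed=0):
    """Copy image/label pairs into images/{train,val} and labels/{train,val}."""
    image_dir, label_dir, out_dir = Path(image_dir), Path(label_dir), Path(out_dir)
    # Keep only images that have a matching YOLO label file.
    images = sorted(
        p for p in image_dir.iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        and (label_dir / f"{p.stem}.txt").exists()
    )
    random.Random(seed).shuffle(images)  # fixed seed keeps the split reproducible
    n_val = int(len(images) * val_fraction)
    splits = {"val": images[:n_val], "train": images[n_val:]}
    for split, files in splits.items():
        (out_dir / "images" / split).mkdir(parents=True, exist_ok=True)
        (out_dir / "labels" / split).mkdir(parents=True, exist_ok=True)
        for img in files:
            shutil.copy2(img, out_dir / "images" / split / img.name)
            shutil.copy2(label_dir / f"{img.stem}.txt",
                         out_dir / "labels" / split / f"{img.stem}.txt")
    return {name: len(files) for name, files in splits.items()}


def write_data_yaml(out_dir, class_names):
    """Write a minimal data.yaml (assumed class names contain no quotes)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = [
        f"path: {out_dir.resolve()}",
        "train: images/train",
        "val: images/val",
        "names:",
    ]
    lines += [f"  {i}: '{name}'" for i, name in enumerate(class_names)]
    yaml_path = out_dir / "data.yaml"
    yaml_path.write_text("\n".join(lines) + "\n")
    return yaml_path
```

Images without a label file are skipped rather than copied, which may or may not match what the notebook does with unlabeled images.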
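YOLO label files are .txt files with one line per object: the class ID followed by the normalized bounding box (x_center, y_center, width, height), each value in the range 0 to 1.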
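Converting a Pascal VOC box to a YOLO label comes down to taking the box center and size and dividing by the image dimensions. The sketch below assumes standard VOC XML with size and bndbox elements; voc_box_to_yolo, voc_xml_to_yolo_lines, and class_to_id are illustrative names, not the notebook's functions.

```python
import xml.etree.ElementTree as ET


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (xc, yc, w, h)."""
    xc = (xmin + xmax) / 2.0 / img_w
    yc = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h


def voc_xml_to_yolo_lines(xml_text, class_to_id):
    """Return one YOLO label line per <object> in a Pascal VOC annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in class_to_id:
            continue  # skip classes that are not in the mapping
        box = obj.find("bndbox")
        coords = [float(box.findtext(tag))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        xc, yc, w, h = voc_box_to_yolo(*coords, img_w, img_h)
        lines.append(f"{class_to_id[name]} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```

For example, a box with corners (50, 20) and (150, 80) in a 200x100 image, mapped to class ID 0, becomes the line 0 0.500000 0.500000 0.500000 0.600000.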
The notebook uses the Ultralytics YOLO API (YOLOv8). Example training cell:
from ultralytics import YOLO
model = YOLO('yolov8n.pt')
model.train(data='new_dataset/data.yaml', epochs=100, imgsz=640, batch=20, device='mps')
Adjust device, epochs, batch, and imgsz as required. device='mps' targets Apple Silicon; use device=0 for a CUDA GPU or device='cpu' otherwise.
Prediction cells show how detected character boxes are collected, sorted, grouped (left-to-right/top-to-bottom) and merged into multi-character boxes. There are two grouping variants in the notebook: a simple y/x sort method and an improved xyxy gap-based method.
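To illustrate the idea behind a gap-based grouping, here is a minimal sketch that works on plain (x1, y1, x2, y2) tuples. It is not the notebook's implementation: group_boxes, max_gap, and min_y_overlap are hypothetical names, and the default thresholds are placeholders that would need tuning.

```python
def group_boxes(boxes, max_gap=15, min_y_overlap=0.5):
    """Group character boxes (x1, y1, x2, y2) into merged multi-character boxes.

    max_gap: largest horizontal gap in pixels between neighbouring characters.
    min_y_overlap: required vertical overlap, as a fraction of the shorter box.
    Both thresholds are assumed values for illustration.
    """
    groups = []  # each group is a list of boxes on the same text line
    for box in sorted(boxes, key=lambda b: b[0]):  # process left to right
        x1, y1, x2, y2 = box
        target = None
        for group in groups:
            lx1, ly1, lx2, ly2 = group[-1]  # most recently added box
            gap = x1 - lx2  # negative when the boxes overlap horizontally
            overlap = min(y2, ly2) - max(y1, ly1)
            shorter = min(y2 - y1, ly2 - ly1)
            if gap <= max_gap and shorter > 0 and overlap / shorter >= min_y_overlap:
                target = group
                break
        if target is None:
            groups.append([box])
        else:
            target.append(box)
    # Merge each group into a single enclosing box.
    merged = [
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups
    ]
    # Approximate reading order: top to bottom, then left to right.
    return sorted(merged, key=lambda m: (m[1], m[0]))
```

With Ultralytics results, the input tuples would typically come from the xyxy coordinates of each detected box. Comparing against the last box in each group keeps the check cheap, at the cost of missing groups whose characters vary a lot in height.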
The notebook includes functions to convert grouped detections into Pascal VOC XML files for downstream use.
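As a rough illustration of that step, the sketch below builds a Pascal VOC annotation with the standard library. boxes_to_voc_xml is a hypothetical helper, not the notebook's function, and it assumes each grouped detection is a (label, x1, y1, x2, y2) tuple in pixel coordinates.

```python
import xml.etree.ElementTree as ET


def boxes_to_voc_xml(filename, img_w, img_h, boxes, depth=3):
    """Build a Pascal VOC annotation string from (label, x1, y1, x2, y2) tuples."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(img_w)
    ET.SubElement(size, "height").text = str(img_h)
    ET.SubElement(size, "depth").text = str(depth)
    for label, x1, y1, x2, y2 in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bnd = ET.SubElement(obj, "bndbox")
        # Round to integer pixels and clamp to the image bounds.
        ET.SubElement(bnd, "xmin").text = str(max(0, int(round(x1))))
        ET.SubElement(bnd, "ymin").text = str(max(0, int(round(y1))))
        ET.SubElement(bnd, "xmax").text = str(min(img_w, int(round(x2))))
        ET.SubElement(bnd, "ymax").text = str(min(img_h, int(round(y2))))
    return ET.tostring(root, encoding="unicode")
```

The returned string can be written next to the image, for example with Path("image_name.xml").write_text(xml_string).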
- Replace placeholder paths in the notebook before running.
- Ensure model paths point to existing weights (best.pt).
- The grouping thresholds may need tuning per image set.