API Docling first-pass auto-map package
This package is the in-API home for the S5 exam-template/first-pass/v1 extraction pipeline copied from /home/kcar/dev/docling-exam-spike.
auto_map(pdf_bytes) returns the editable first-pass template.json shape consumed by downstream exam-marker mapping. The pipeline treats margins as constraining inputs: document-level left/right margins and per-page top/bottom margins are derived before template assembly, and part/question bands and furniture/figure boxes are then constrained by those margins.
dsync Redis env wiring
The OCR path uses dsync.py for docling-serve GPU locking, page caching, and retries. Configure it through the environment variables below; only the variable names are listed here:
DOCLING_SERVE
DOCLING_REDIS_URL
DOCLING_REDIS_HOST
DOCLING_REDIS_PORT
DOCLING_REDIS_PASSWORD
DOCLING_REDIS_DB
If Redis is unavailable, dsync falls back to no cache/lock and logs that state. Do not put secret values in this file.
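The following is a minimal sketch of how the Redis variables could be resolved, with a `None` result standing in for the "no cache/lock" fallback. The `redis_settings` helper, the precedence of DOCLING_REDIS_URL over the host/port variables, and the defaults (port 6379, db 0, the standard Redis defaults) are assumptions for illustration and are not taken from dsync.py.

```python
import os
from typing import Mapping, Optional


def redis_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Resolve Redis connection settings from env-var names only.

    Returns None when nothing is configured, which a caller could treat
    as "run without cache/lock". Precedence and defaults are assumed.
    """
    url = env.get("DOCLING_REDIS_URL")
    if url:
        # Assumption: a full URL overrides the individual host/port variables.
        return {"url": url}

    host = env.get("DOCLING_REDIS_HOST")
    if not host:
        return None

    return {
        "host": host,
        "port": int(env.get("DOCLING_REDIS_PORT", "6379")),
        "password": env.get("DOCLING_REDIS_PASSWORD"),  # may be None
        "db": int(env.get("DOCLING_REDIS_DB", "0")),
    }


# Example with an explicit mapping, so no real environment is touched.
settings = redis_settings({"DOCLING_REDIS_HOST": "localhost"})
```

Passing the mapping explicitly keeps the helper testable without real secrets in the process environment.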