Implements the seed_exam_corpus.py skeleton TODOs against the real APIs and fills the public exam corpus from official board sources.

Loader (run/initialization/seed_exam_corpus.py):
- _resolve_source_bytes: local path | url: fetch with on-disk cache + PDF validation
- upload_file: real StorageAdmin.upload_file, skip-if-exists+sha256 unless --force
- upsert_specification/upsert_paper: real upserts on spec_code/exam_code.
  Fix: QP/MS/INSERT/ER role -> eb_exams.type_code; doc_type set to 'pdf' (doc_type is CHECK-constrained to file formats; the skeleton wrote the role there).
- copy_user_test_subset: copy a QP subset into a test user's cc.users exam space + files rows
- first_sweep: auto_map + the /auto-map row mapper over seeded QPs -> system-owned exam_templates + questions/response_areas/boundaries/layout (idempotent)
- identity discovery via institute_memberships.profile_id

Manifest (run/initialization/manifests/):
- exam-corpus.yaml: 505 papers / 18 specs / AQA+Edexcel+OCR, every source URL HEAD-verified.
  AQA sciences GCSE 8461/8462/8463/8464 + AS/A-level 7401-7408, sessions JUN18-JUN24, QP+MS+ER, F+H.
- generate_corpus_manifest.py: regenerates + re-verifies all URLs from official hosts.

seed_curriculum.py: deprecation banner -> superseded by seed_exam_corpus.py; storage_loc standardised on cc.examboards.

Verified on dev .94: full 505-paper seed (eb_specifications=18, eb_exams=505, QP=211), idempotent re-runs, first-sweep + user-subset, 6/6 buckets provisioned.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
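
Illustrative sketch (not the code in this commit): the loader behaviour named above, "fetch with on-disk cache + PDF validation" and "skip-if-exists+sha256 unless --force", could look roughly like the following. Every name here (resolve_source_bytes, looks_like_pdf, should_upload, the fetch callable, the cache file naming) is hypothetical and chosen for illustration; the real _resolve_source_bytes and StorageAdmin.upload_file are not shown in this message and may differ. The sketch also assumes a source is a URL when it starts with http:// or https://.

```python
import hashlib
from pathlib import Path
from typing import Callable, Optional

PDF_MAGIC = b"%PDF-"  # every PDF file begins with this header


def looks_like_pdf(data: bytes) -> bool:
    """Cheap validation: check only the leading PDF header bytes."""
    return data.startswith(PDF_MAGIC)


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def resolve_source_bytes(
    source: str,
    cache_dir: Path,
    fetch: Callable[[str], bytes],
) -> bytes:
    """Return PDF bytes for a local path or an http(s) URL.

    URL downloads are cached on disk, keyed by the SHA-256 of the URL
    (an assumption of this sketch), so re-runs do not re-download.
    `fetch` is injected so the network layer stays swappable.
    """
    if source.startswith(("http://", "https://")):
        cache_file = cache_dir / (sha256_hex(source.encode("utf-8")) + ".pdf")
        if cache_file.exists():
            data = cache_file.read_bytes()
        else:
            data = fetch(source)
            # Validate before caching so a bad download is never stored.
            if not looks_like_pdf(data):
                raise ValueError(f"not a PDF: {source}")
            cache_dir.mkdir(parents=True, exist_ok=True)
            cache_file.write_bytes(data)
    else:
        data = Path(source).read_bytes()

    if not looks_like_pdf(data):
        raise ValueError(f"not a PDF: {source}")
    return data


def should_upload(
    local: bytes,
    remote_sha256: Optional[str],
    force: bool = False,
) -> bool:
    """Decide whether to upload.

    Upload when forced, when no remote object exists (remote_sha256 is
    None), or when the remote digest differs from the local one.
    """
    if force or remote_sha256 is None:
        return True
    return sha256_hex(local) != remote_sha256
```

With checks of this kind, a second run over an unchanged manifest would neither re-download nor re-upload anything, which is consistent with the idempotent re-runs reported above.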
Description
FastAPI + Python 3.12 backend for Classroom Copilot — auth, document processing, transcription sessions, LLM integration, Supabase-backed
Languages
Python 98.9%
Shell 0.8%
Jupyter Notebook 0.3%