Commit Graph

1 Commit

Author SHA1 Message Date
Levi Neuwirth eb7fef30df Pin Hugging Face model revisions for downloader and embed pipeline
- Add tools/model-checksums.sha256 with SHA-256 hashes for the five
  Xenova/all-MiniLM-L6-v2 files served from static/models/.
  download-model.sh already verified against this file when it was
  present, but the file itself was missing, so downloads went
  unverified. Now every fetch is checked against the committed hashes
  and fails closed on mismatch.
- Pin embed.py's SentenceTransformer load to a specific HF commit
  (c9745ed1d9f207416be6d2e6f8de32d1f16199bf of
  sentence-transformers/all-MiniLM-L6-v2), so a future upstream model
  update can no longer silently change embedding semantics across
  builds. Bump the pin deliberately, after validating the new
  revision, and re-run a full embed pass to refresh the semantic and
  similar-links data.
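
The fail-closed check described in the first bullet could look roughly like the sketch below. It is written in Python for illustration only; the real check lives in download-model.sh, which is not shown here. The manifest layout (`<hex digest>  <relative path>` per line, as produced by `sha256sum`) and the helper names `sha256_of`, `parse_manifest`, and `verify_models` are assumptions, not taken from the repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse sha256sum-style lines ("<hex>  <name>") into {name: hex}.

    Assumed format: blank lines and lines starting with '#' are skipped.
    """
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_models(manifest: Path, model_dir: Path) -> None:
    """Raise on any missing or mismatched file, so callers fail closed."""
    expected = parse_manifest(manifest.read_text())
    if not expected:
        raise RuntimeError(f"no checksums found in {manifest}")
    for name, want in expected.items():
        target = model_dir / name
        if not target.is_file():
            raise RuntimeError(f"missing model file: {target}")
        got = sha256_of(target)
        if got != want:
            raise RuntimeError(
                f"checksum mismatch for {name}: got {got}, want {want}"
            )
```

A caller would run something like `verify_models(Path("tools/model-checksums.sha256"), Path("static/models"))` after each download and treat any exception as a hard failure. An empty manifest is rejected as well, so a truncated checksum file cannot silently disable verification.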
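
For the second bullet, a pinned load might be sketched as follows. The model ID and commit hash come from this message. The constant names, `is_full_commit_sha`, and `load_pinned_model` are hypothetical, and the actual code in embed.py may be structured differently. The sketch assumes a sentence-transformers release whose `SentenceTransformer` constructor accepts a `revision` argument.

```python
MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"
# Commit hash named in this commit message; bump only after validation.
MODEL_REVISION = "c9745ed1d9f207416be6d2e6f8de32d1f16199bf"


def is_full_commit_sha(revision: str) -> bool:
    """True only for a 40-character lowercase hex string.

    Branch names and tags such as "main" can move, so they do not count
    as a pin.
    """
    return len(revision) == 40 and all(
        c in "0123456789abcdef" for c in revision
    )


def load_pinned_model():
    """Load the model at the pinned revision (hypothetical helper)."""
    if not is_full_commit_sha(MODEL_REVISION):
        raise ValueError(f"revision is not a full commit SHA: {MODEL_REVISION!r}")
    # Imported lazily so the constants can be inspected without the
    # dependency installed.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(MODEL_ID, revision=MODEL_REVISION)
```

Rejecting anything that is not a full commit SHA keeps a later edit from quietly replacing the pin with a moving reference such as a branch name.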

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:08:14 -04:00