Commit Graph

18 Commits

Author SHA1 Message Date
Levi Neuwirth 77e31efdae Add link archive system: snapshots, backlinks, link-rot
Preserve external works the site cites against link rot, host them at
permanent /archive/<slug>/ URLs in site chrome, and treat them as
first-class citizens of the backlinks and similar-pages indexes.
Curated, not crawled: the author adds one line to archive/manifest.yaml
and the build fetches, hashes, snapshots, and indexes the work.

* archive/manifest.yaml + tools/archive.py (fetch / refresh / wayback /
  check / gc) — PDFs downloaded directly, HTML pages snapshotted with a
  vendored monolith (tools/bin/monolith @ 2.10.1) into a single
  self-contained file with the archive CSP and a noarchive robots meta
  injected. Per-entry PROVENANCE.json committed; gitignored .txt
  sidecars regenerated from the artifact's SHA-256.
* build/Archive.hs + build/ArchiveIndex.hs + build/Filters/Archive.hs
  — Hakyll rules for /archive/ and /archive/<slug>/, a body Pandoc
  filter that appends an archive affordance to live citations and
  flips dead ones to the local copy, following the asymmetric
  hysteresis of archive.py check (a link is marked rotted only after
  3 failed checks spanning >= 14 days; a single ok check recovers it).
* build/Backlinks.hs — keeps archived external URLs through pass 1 and
  canonicalises them to /archive/<slug>/ in pass 2, producing a
  "Referenced by" section grouped by the fragment each citation
  targets. build/Stats.hs gains a "Link archive" telemetry block on
  /build/ (count, total size, median age, by-status / by-quality /
  by-visibility, orphans).
* Integrity: archive.py fetch and build/Archive.hs (via sha256sum)
  both re-hash every committed artifact, so a tampered file halts the
  build even when cabal is invoked directly or no .venv is present. refresh
  refuses to replace an uncommitted prior snapshot and rolls back
  atomically on any exit path. removed.yaml is honoured by fetch,
  wayback, and check using canonical-form (tracking-stripped,
  arXiv-canonicalised) comparison.
* visibility: private keeps an entry in-repo but undeployed.
  nginx/archive.conf emits X-Robots-Tag: noindex, noarchive for raw
  artifacts that cannot carry meta directives.
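
The asymmetric hysteresis described for archive.py check could be
sketched roughly as follows. This is not the code in tools/archive.py:
the LinkState record, the record_check function, and the choice to count
consecutive failures are assumptions made for illustration. Only the
thresholds (3 fails, >= 14 days, one ok recovers) come from the message
above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Thresholds from the commit message; all names below are hypothetical.
ROT_MIN_FAILS = 3
ROT_MIN_SPAN = timedelta(days=14)


@dataclass
class LinkState:
    status: str = "ok"                  # "ok" or "rotted"
    fails: int = 0                      # failed checks in the current streak (assumed consecutive)
    first_fail: Optional[date] = None   # date of the first failure in the streak


def record_check(state: LinkState, ok: bool, today: date) -> LinkState:
    """Fold one check result into the link's state."""
    if ok:
        # Recovery is cheap: a single ok resets everything.
        return LinkState()
    first = state.first_fail or today
    fails = state.fails + 1
    # Rot is expensive: enough failures AND enough elapsed time.
    rotted = state.status == "rotted" or (
        fails >= ROT_MIN_FAILS and today - first >= ROT_MIN_SPAN
    )
    return LinkState("rotted" if rotted else "ok", fails, first)
```

Under these assumptions, three failures within nine days leave a link
live, a further failure on day 14 flips it to the local copy, and the
next successful check flips it back.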

The full design, phase plan (1-5), and three refinement passes live
in ARCHIVE.md.
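
As a rough illustration of the integrity check described in the list
above, re-hashing a committed artifact against its recorded digest might
look like the sketch below. The PROVENANCE.json field names ("artifact",
"sha256") and the verify_entry helper are assumptions; the actual schema
is documented in ARCHIVE.md, not here.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large PDFs are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_entry(entry_dir: Path) -> None:
    """Halt if the artifact no longer matches its recorded digest."""
    prov = json.loads((entry_dir / "PROVENANCE.json").read_text())
    artifact = entry_dir / prov["artifact"]   # hypothetical field name
    actual = sha256_of(artifact)
    if actual != prov["sha256"]:              # hypothetical field name
        raise SystemExit(
            f"integrity failure: {artifact} hashes to {actual}, "
            f"expected {prov['sha256']}"
        )
```

Raising SystemExit gives a non-zero exit status, which is one simple way
for a build step to stop on a tampered file.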

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 10:06:33 -04:00
Levi Neuwirth 339433db20 Quote rsync target variables in Makefile deploy
A VPS_PATH containing whitespace or shell metacharacters would split
on the unquoted expansion and hand rsync extra arguments. The
existing VPS_PATH guard rejects obviously dangerous parents (/srv,
/var, etc.) but does not catch this. Quoting fails closed.
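
The splitting problem can be reproduced outside make. The sketch below
uses Python's shlex module to mimic POSIX shell word splitting; the path
value and the rsync flags are made up for illustration and are not the
ones in the Makefile.

```python
import shlex


def argv_unquoted(vps_path: str) -> list:
    """Arguments the shell would see if the path were expanded unquoted."""
    return shlex.split(f"rsync -a _site/ {vps_path}")


def argv_quoted(vps_path: str) -> list:
    """Arguments the shell would see if the path were quoted first."""
    return shlex.split(f"rsync -a _site/ {shlex.quote(vps_path)}")


# Hypothetical hostile value: whitespace smuggles in an extra option.
path = "/home/site/www -e evil"
print(argv_unquoted(path))  # path is split into three separate arguments
print(argv_quoted(path))    # path stays a single argument
```

With quoting, the odd value reaches rsync as one destination argument
that most likely does not exist, so the deploy fails rather than running
with extra options.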

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:08:23 -04:00
Levi Neuwirth cd94227acb Spec dilemma 2026-05-01 21:22:01 -04:00
Levi Neuwirth 42ba2bf972 Current rework 2026-04-26 19:42:47 -04:00
Levi Neuwirth 6585573dae States/Context/Embeddings fixes 2026-04-26 11:22:57 -04:00
Levi Neuwirth 6d2f9d12ae PDF compression 2026-04-22 12:40:22 -04:00
Levi Neuwirth 3a95a05284 Fix broken PDF hyperlinks 2026-04-22 12:10:31 -04:00
Levi Neuwirth 913a374fb2 Professional content refactor 2026-04-22 11:46:57 -04:00
Levi Neuwirth acb3ae7066 visual enhancements 2026-04-15 22:25:38 -04:00
Levi Neuwirth b02e1e868d audit: tooling, deploy ordering, README, repo hygiene 2026-04-10 17:41:33 -04:00
Levi Neuwirth c864e2f9cc makefile corrections + esoteric math rendering 2026-04-05 12:00:07 -04:00
Levi Neuwirth 1be6292757 ToC fix 2026-03-27 16:19:52 -04:00
Levi Neuwirth a5495035be epistemic v2 2026-03-26 09:10:35 -04:00
Levi Neuwirth 728afd4c68 affiliation, cabal helper script 2026-03-26 08:14:50 -04:00
Levi Neuwirth 5cfbfbc0ef GPG signing, embedding pipeline, visualization filter, search timing, sig popups
- GPG page signing: dedicated signing subkey in ~/.gnupg-signing, sign-site.sh
  walks _site/**/*.html producing .sig files, preset-signing-passphrase.sh caches
  passphrase via gpg-preset-passphrase; make sign target; make deploy chains it
- Footer sig link: $url$.sig with hover popup showing ASCII armor (popups.js
  sigContent provider; .footer-sig-link bound explicitly to bypass footer exclusion)
- Public key at static/gpg/pubkey.asc
- Embedding pipeline: tools/embed.py encodes _site pages with nomic-embed-text-v1.5
  + FAISS IndexFlatIP, writes data/similar-links.json; staleness check skips when
  JSON is newer than all HTML; make build invokes via uv, skips gracefully if .venv absent
- SimilarLinks.hs: similarLinksField loads similar-links.json with Hakyll dependency
  tracking; renders Related section in page-footer.html
- uv environment: pyproject.toml + uv.lock (CPU-only torch via pytorch-cpu index)
- Visualization filter: Filters/Viz.hs runs Python scripts for .figure (SVG) and
  .visualization (Vega-Lite JSON) fenced divs; viz.js renders with monochrome config
  and MutationObserver dark-mode re-render; viz.css layout
- Search timing: #search-timing element shows elapsed ms via MutationObserver
- Build telemetry timestamps removed from git tracking (now in .gitignore)
- spec.md updated to v9; WRITING.md updated with viz, related, signing, build docs
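
The staleness check mentioned for tools/embed.py (skip when the JSON is
newer than all HTML) could be approximated as below. This is a sketch
under assumptions, not the script's code: is_stale and its parameters are
hypothetical names, and comparing file modification times is assumed to
be the mechanism.

```python
from pathlib import Path


def is_stale(site_dir: Path, index_json: Path) -> bool:
    """True when the index is missing or any HTML page is at least as new."""
    if not index_json.exists():
        return True
    index_mtime = index_json.stat().st_mtime
    return any(
        page.stat().st_mtime >= index_mtime
        for page in site_dir.rglob("*.html")
    )


if __name__ == "__main__":
    # Paths follow the commit message; run from the repository root.
    if is_stale(Path("_site"), Path("data/similar-links.json")):
        print("embeddings are stale; re-encoding needed")
    else:
        print("similar-links.json is up to date; skipping")
```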

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 20:14:49 -04:00
Levi Neuwirth 26c067147a Build telemetry 2026-03-19 15:27:12 -04:00
Levi Neuwirth 9c47811372 Administrativa. 2026-03-17 22:31:24 -04:00
Levi Neuwirth 714824a0b5 initial deploy! whoop 2026-03-17 21:56:14 -04:00