Preserve external works the site cites against link rot, host them at
permanent /archive/<slug>/ URLs in site chrome, and treat them as
first-class citizens of the backlinks and similar-pages indexes.
Curated, not crawled: the author adds one line to archive/manifest.yaml
and the build fetches, hashes, snapshots, and indexes the work.
* archive/manifest.yaml + tools/archive.py (fetch / refresh / wayback /
check / gc) — PDFs downloaded directly, HTML pages snapshotted with a
vendored monolith (tools/bin/monolith @ 2.10.1) into a single
self-contained file with the archive CSP and a noarchive robots meta
injected. Per-entry PROVENANCE.json committed; gitignored .txt
sidecars regenerated from the artifact's SHA-256.
* build/Archive.hs + build/ArchiveIndex.hs + build/Filters/Archive.hs
— Hakyll rules for /archive/ and /archive/<slug>/, a body Pandoc
filter that appends an archive affordance to live citations and
flips dead ones to the local copy. Liveness comes from archive.py
check, which applies asymmetric hysteresis (rotted needs 3 fails
over >= 14 days; one ok recovers).
* build/Backlinks.hs — keeps archived external URLs through pass 1 and
canonicalises them to /archive/<slug>/ in pass 2, producing a
"Referenced by" section grouped by the fragment each citation
targets. build/Stats.hs gains a "Link archive" telemetry block on
/build/ (count, total size, median age, by-status / by-quality /
by-visibility, orphans).
* Integrity: archive.py fetch and build/Archive.hs (via sha256sum)
both re-hash every committed artifact, so a tampered file halts the
build even when cabal is invoked directly or no .venv is present.
refresh
refuses to replace an uncommitted prior snapshot and rolls back
atomically on any exit path. removed.yaml is honoured by fetch,
wayback, and check using canonical-form (tracking-stripped,
arXiv-canonicalised) comparison.
* visibility: private keeps an entry in-repo but undeployed.
nginx/archive.conf emits X-Robots-Tag: noindex, noarchive for raw
artifacts that cannot carry meta directives.
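
The asymmetric hysteresis mentioned above can be sketched as follows.
This is a minimal illustration, not the code in tools/archive.py: the
names LinkState and observe, the status strings, and the reading of
"3 fails over >= 14 days" as "at least 3 failures since the last ok,
with the first and latest at least 14 days apart" are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

FAILS_TO_ROT = 3                      # threshold stated in the commit message
MIN_ROT_SPAN = timedelta(days=14)     # span stated in the commit message


@dataclass
class LinkState:
    status: str = "live"              # "live" or "rotted" (hypothetical names)
    fail_count: int = 0               # failures since the last ok
    first_fail: Optional[date] = None


def observe(state: LinkState, ok: bool, today: date) -> LinkState:
    """Fold one check result into the link's state."""
    if ok:
        # Recovery is cheap: a single ok resets everything.
        return LinkState()
    first = state.first_fail or today
    fails = state.fail_count + 1
    # Rot is expensive: enough failures AND enough elapsed time.
    rotted = state.status == "rotted" or (
        fails >= FAILS_TO_ROT and today - first >= MIN_ROT_SPAN
    )
    return LinkState("rotted" if rotted else "live", fails, first)
```

Under these assumptions, three failures on consecutive days leave a
link live; a further failure two weeks after the first marks it
rotted, and the next ok flips it back.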
The full design, phase plan (1-5), and three refinement passes live
in ARCHIVE.md.
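
As a rough illustration of the integrity rule above (re-hash each
committed artifact and halt on mismatch), a sketch using only the
standard library. The function names are hypothetical, and the
expected digest is passed in as an argument because the layout of
PROVENANCE.json is not described here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(artifact: Path, expected_hex: str) -> None:
    """Exit with an error if the artifact no longer matches its recorded hash."""
    actual = sha256_of(artifact)
    if actual != expected_hex.lower():
        raise SystemExit(
            f"hash mismatch for {artifact}: expected {expected_hex}, got {actual}"
        )
```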
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Emit a minimal robots.txt that points at the sitemap.
- Emit sitemap.xml covering every dated content page (essays, blog,
fiction, poetry, music) with absolute <loc> and frontmatter-derived
<lastmod>. Standalone pages (about, colophon, etc.) are
intentionally omitted: they're reachable via the main nav, lack
date: frontmatter, and would force a fallback lastmod that
misrepresents staleness.
- Replace the magic 'drop 8' offset in essay routing with
stripPrefix "content/". Same behavior, but reads structurally and
fails closed if the prefix ever changes.
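
The difference between the fixed offset and the prefix match can be
shown with a Python analogue. The real change is in Haskell, where
Data.List.stripPrefix returns a Maybe; the helper names below are
hypothetical, and how the build handles the no-match case is not
shown here.

```python
from typing import Optional


def drop_offset(path: str) -> str:
    # Old approach: blindly drops 8 characters, whatever they are.
    return path[8:]


def strip_prefix(prefix: str, path: str) -> Optional[str]:
    # New approach: returns None when the prefix is absent, so the
    # caller has to deal with the mismatch instead of mangling the path.
    return path[len(prefix):] if path.startswith(prefix) else None
```

Both agree on "content/essays/foo.md", but on "src/essays/foo.md" the
offset version silently yields "ys/foo.md" while the prefix version
yields None.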
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each shelf gets a dingbat keyed by portal slug: laurel (research),
quill (nonfiction), open book (fiction), lyre (poetry), plus the
existing clef / ai / tech / trefoil glyphs for the remaining four.
Rendered via mask-image with currentColor so a single SVG per
portal inherits whatever color its heading carries. Between rendered
shelves, a centered fleuron flanked by thin rules (library-divider.svg)
is placed with a CSS adjacent-sibling selector, so hidden sections
leave no orphan dividers. The template swaps its Unicode placeholder
for a data-ornament span, wires a '$library-intro$' slot above the
shelves, and renders a "More on this shelf →" link when a shelf's
has-more gate fires.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>