Commit Graph

3 Commits

Author SHA1 Message Date
Levi Neuwirth 56afdb867a Feature modules: URL normalization, Maybe-trust, proper medians
- Empty/all-comments manifest.yaml is the empty archive, not a fatal
  parse error (AUDIT §3.11)
- Backlinks normaliseUrl strips index.html like SimilarLinks, so links
  to canonical directory URLs invert again; Stats normUrl updated in
  lockstep (§3.12)
- PDF viewer file= query value percent-encoded (hand-rolled RFC 3986
  encoder; network-uri is not a dependency) (§3.13)
- Photography feed thumbnails embed for flat singles and series
  children, not just directory entries (§3.14)
- Marks trust is Maybe Int: missing confidence/evidence collapses the
  figure to the bare frame as documented, instead of a literal
  "0 TRUST"; result-shape glyph centers when no score (§3.15)
- Unknown catalog categories fold into one Other bucket; medians take
  the mean of middle elements; protocol-relative URLs excluded from
  backlinks; @string/@comment/@preamble skipped in BibTeX parsing;
  watch-staleness of the once-per-process archive reads documented;
  stale comments fixed (§3.16, §3.9)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:13:34 -04:00
Levi Neuwirth 7ca937d98c Fix audit HIGHs/MEDs in build code
- ArchiveIndex: guard rawIndex/rawState with doesFileExist so a fresh
  clone (gitignored data/ JSONs absent) degrades to empty instead of
  crashing — the behavior the module doc already promised (AUDIT §1.2)
- Commonplace: decode YAML via encodeUtf8, not Char8.pack, which
  truncates codepoints above 0x7F (AUDIT §3.2)
- Stats: DayOfWeek is ISO-numbered (Mon=1..Sun=7); dowOf and weekStart
  assumed Mon=0..Sun=6, clipping every Sunday cell outside the heatmap
  viewBox and starting weeks on Sunday (AUDIT §3.1)
- Site: epistemicEntry now honors the proved/proven confidence sentinel
  like Contexts.overallScoreField (AUDIT §2.6)
- Contexts: affiliationField returns noResult instead of an empty list,
  so essays without affiliation no longer render an empty meta row
  (AUDIT §2.7)

Verified: full site build passes; proved page gets score=100 in
epistemic-meta.json; empty .meta-affiliation gone; heatmap rows
y=22..94 all inside the 104-high viewBox.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 09:21:30 -04:00
Levi Neuwirth 77e31efdae Add link archive system: snapshots, backlinks, link-rot
Preserve external works the site cites against link rot, host them at
permanent /archive/<slug>/ URLs in site chrome, and treat them as
first-class citizens of the backlinks and similar-pages indexes.
Curated, not crawled: the author adds one line to archive/manifest.yaml
and the build fetches, hashes, snapshots, and indexes the work.

* archive/manifest.yaml + tools/archive.py (fetch / refresh / wayback /
  check / gc) — PDFs downloaded directly, HTML pages snapshotted with a
  vendored monolith (tools/bin/monolith @ 2.10.1) into a single
  self-contained file with the archive CSP and a noarchive robots meta
  injected. Per-entry PROVENANCE.json committed; gitignored .txt
  sidecars regenerated from the artifact's SHA-256.
* build/Archive.hs + build/ArchiveIndex.hs + build/Filters/Archive.hs
  — Hakyll rules for /archive/ and /archive/<slug>/, a body Pandoc
  filter that appends an archive affordance to live citations and
  flips dead ones to the local copy on archive.py check's asymmetric
  hysteresis (rotted needs 3 fails over >= 14 days; one ok recovers).
* build/Backlinks.hs — keeps archived external URLs through pass 1 and
  canonicalises them to /archive/<slug>/ in pass 2, producing a
  "Referenced by" section grouped by the fragment each citation
  targets. build/Stats.hs gains a "Link archive" telemetry block on
  /build/ (count, total size, median age, by-status / by-quality /
  by-visibility, orphans).
* Integrity: archive.py fetch and build/Archive.hs (via sha256sum)
  both re-hash every committed artifact, so a tampered file halts the
  build even with cabal invoked directly or no .venv present. refresh
  refuses to replace an uncommitted prior snapshot and rolls back
  atomically on any exit path. removed.yaml is honoured by fetch,
  wayback, and check using canonical-form (tracking-stripped,
  arXiv-canonicalised) comparison.
* visibility: private keeps an entry in-repo but undeployed.
  nginx/archive.conf emits X-Robots-Tag: noindex, noarchive for raw
  artifacts that cannot carry meta directives.

The full design, phase plan (1-5), and three refinement passes live
in ARCHIVE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 10:06:33 -04:00