---
title: Repository audit
date: 2026-06-09
---

# Repository audit — levineuwirth.org (2026-06-09)

Comprehensive audit of the repo on `main` at commit `620b974` (working tree modified: branding refresh across `static/` + `templates/partials/`, plus `tools/embed.py` rework; untracked `static/og-image.png`, `templates/partials/logo-mark.svg`, `data/embed-cache-pages.npz.tmp.npz`).

Severity legend: **HIGH** (likely to break a build, cause data loss, or expose a security weakness) — **MED** (latent bug, brittleness, or documentation drift) — **LOW** (minor robustness gap or fragile assumption) — **NIT** (style, polish, or paranoia).

Numbers are file:line against the working tree at audit time. Findings marked "verified" were reproduced empirically (solver runs, built `_site/` output inspection, live HTTP checks, binary parsing); the rest were confirmed by reading the code.

Prior audit: `AUDIT.md` (2026-05-07). Follow-up status in §10.

---

## 1. Build & dependency chain

### 1.1 `cabal.project.freeze` is unsolvable again — next clean build fails — **HIGH**

`cabal build --dry-run` fails today (verified): the freeze pins `distributive ==0.6.2.1`, but the system (pacman) GHC package db has `comonad-5.0.10` built against `distributive-0.6.3`:

```
rejecting: distributive-0.6.3/installed... (constraint from cabal.project.freeze requires ==0.6.2.1)
After searching the rest of the dependency tree exhaustively...
```

The conflict set also names aeson, warp, hakyll, http2, semigroupoids. This is the same failure mode as prior-audit §1.1 — that audit's specific aeson pin was fixed (now 2.2.2.0/hashable 1.4.7.0), but a different package broke the same way after a system update. Recent builds succeed only off the cached `dist-newstyle/cache/plan.json`; the freeze file has since changed, so the next cabal invocation re-solves and fails. Because `make deploy` starts with `make clean`, the next deploy hits this.
`levineuwirth.cabal`'s own bounds are compatible with the freeze — the conflict is freeze-vs-installed-db, not freeze-vs-cabal-file. Fix: `tools/refreeze.sh` (written for exactly this post-`pacman -Syu` situation). The underlying fragility — freezing against a mutable system package db — remains; consider documenting the refreeze step as part of any system-upgrade ritual. *(In progress at time of writing.)*

### 1.2 Missing `data/archive-index.json` / `archive-state.json` crashes the build — **HIGH**

`build/ArchiveIndex.hs:134-146`. The module doc (lines 18-22) promises "An absent or malformed file degrades safely: an empty index makes the link consumers no-op; an absent state file makes every entry @Live@." But `rawIndex = unsafePerformIO $ do decoded <- A.eitherDecodeFileStrict' indexPath` (and identically `rawState`) never checks `doesFileExist`, and aeson's `eitherDecodeFileStrict'` throws an uncaught `IOException` on a missing file (verified: `withBinaryFile: does not exist`). Both files are gitignored (`.gitignore:84-85`), so a fresh clone or a no-`.venv` build — the exact path `build/Archive.hs:20-24` promises to support — throws when the CAF is first forced. Contrast `readUrlSet` (line 109) in the same file, which guards correctly. Currently latent on this machine only because both generated files happen to exist.

### 1.3 `embed.py` `trust_remote_code=True` executes unpinned third-party code — **HIGH**

`tools/embed.py:329` (line ~341 in the uncommitted version). The new page-model load is `SentenceTransformer(PAGE_MODEL_NAME, revision=PAGE_MODEL_REVISION, trust_remote_code=True)`. The `revision` arg pins only the `nomic-ai/nomic-embed-text-v1.5` repo; the actual modeling code is pulled via `auto_map` from a *different* repo — verified in the local HF cache: the executed code lives under `transformers_modules/nomic_hyphen_ai/nomic_hyphen_bert_hyphen_2048/...`, i.e. `nomic-ai/nomic-bert-2048` at its current head, which nothing pins.
A compromise of that second repo runs arbitrary Python at build time, in a repo whose every other download path (download-model.sh, pdfjs, leaflet) is sha256-pinned. The comment "Both pins are deliberate" is therefore misleading. Fix: pin via `code_revision`, or run with `HF_HUB_OFFLINE=1` after first fetch, or document the accepted risk.

### 1.4 Working-tree commit hazard: tracked templates reference untracked files — **HIGH (process)**

`templates/partials/nav.html:5` (tracked, modified) adds `$partial("templates/partials/logo-mark.svg")$` and `templates/partials/head.html` references `/og-image.png` — both target files are **untracked** (no git history). Committing the template diff without `git add`-ing both breaks every page's Hakyll build on a fresh clone (`$partial$` aborts compilation) and 404s the og:image. They must land in the same commit. Conversely, `data/embed-cache-pages.npz.tmp.npz` must **not** be committed (see §4.1). The partial itself is safe as a Hakyll template (verified: zero `$` characters; `match "templates/**"` compiles it).

### 1.5 `einops` dependency: undocumented, unbounded, imported nowhere — **LOW**

`pyproject.toml:27` adds `einops>=0.8.2`. No import anywhere in `tools/`/`build/`/`static/js/`; its only consumer is nomic's `trust_remote_code` module (§1.3). Every sibling dependency has an explanatory comment and an upper bound per the file's own stated policy ("Upper bounds are intentionally generous (next major) but always present"); einops has neither. `uv lock --check` passes (0.8.2 pinned).

---

## 2. Haskell build code — core

### 2.1 Nav, home grid, and library link `/fiction/` and `/poetry/` — confirmed 404s — **MED**

`build/Site.hs:50-60` (`homePortals` contains `("Fiction","fiction")`, `("Poetry","poetry")`), `templates/partials/nav.html:56,61`, `templates/library.html:44,58`.
No rule generates either index: fiction and poetry are not in `tagIndexable` (`build/Patterns.hs:148-151` = essays + blog + photos) and Site.hs has no landing rule. Verified: `_site/fiction` does not exist; `_site/poetry/` has no `index.html`. nginx has no redirects. Both links 404 in production today.

### 2.2 Tag/route collisions guarded for `photography` only — **MED**

`build/Tags.hs:98-99`. `tagIdentifier` maps tag `t` → `t ++ "/index.html"`; `sectionOwnedTopLevelTags = ["photography"]` is the only guard. A tagIndexable item tagged `music` (or `music/x`, which expands to `music`) emits `music/index.html`, already owned by the music index route (`build/Site.hs:486-487`); similarly `essays`, `blog`, `cv`, `archive`, `authors`, `bibliography`. Hakyll does not error on duplicate routes — one silently overwrites the other.

### 2.3 Sidenotes filter destroys the documented no-JS fallback — **MED**

`build/Filters/Sidenotes.hs:30-36` vs `static/css/sidenotes.css:125-135`. The module doc claims the Pandoc `<section class="footnotes">` "serves as fallback," but `apply` replaces every `Note`, so the writer never emits the section. CSS depends on it below 1500px. Verified in output: `_site/essays/scaling_outage.html` has 3 `class="sidenote"` and zero `footnotes` occurrences. With JS disabled, footnote content is invisible on narrow viewports. The comment, the CSS, and ozymandias.md's own prose all contradict actual behavior.

### 2.4 Sidenote bodies rendered without the KaTeX writer — **MED**

`build/Filters/Sidenotes.hs:103-115`. `inlinesToHtml`/`blocksToHtml` use `writeHtml5String (def :: WriterOptions)` (PlainMath), while the main pipeline uses `KaTeX ""` (`build/Compilers.hs:47`). Math inside a footnote never gets `\(...\)`, so KaTeX never renders it — degrades to plain italics, silently inconsistent with body math.

### 2.5 SourceRefs whitelist vs `/source/` serving whitelist have drifted — **MED**

`build/Filters/SourceRefs.hs:114-141` vs `build/Site.hs:217-240`. Site.hs:209 says "must stay aligned with 'isSourcePath'". Mismatches: SourceRefs wraps `content/` and `yaml-source/` (no Site counterpart); `static/` + any known ext vs Site's `static/js/**`/`static/css/**` only; `tools/` + any ext vs Site's `tools/**.sh`/`tools/**.py`; `data/` at any depth vs Site's top-level `data/*.{json,yaml,md,bib}`. Each mismatch yields a wrapped source-ref whose popup fetch 404s (Forgejo href fallback still works). Inverse: Site serves `data/*.bib` but `.bib` is missing from `hasKnownExt` — dead whitelist entry.

### 2.6 `epistemicEntry` ignores `confidence: proved` — **MED**

`build/Site.hs:1014-1024`. Comment: "Compute overall-score the same way Contexts.overallScoreField does," but it uses `readMaybe =<< lookupString "confidence" meta`, which is `Nothing` for `"proved"`/`"proven"`, whereas `Contexts.overallScoreField` (`build/Contexts.hs:574-576`) substitutes 100 via `isProvedConfidence`.
Proved pages get no `score` in `data/epistemic-meta.json` and export the raw string under `confidence`, so client-side filtering silently misses them.

### 2.7 Empty affiliation element ships on every essay without `affiliation:` — **MED**

`build/Contexts.hs:84-89` + `templates/partials/metadata-tail.html:12`. `affiliationField` returns an empty list instead of `noResult`; Hakyll's `$if$` is truthy for empty list fields (the codebase knows this — `tagLinksFieldExcludingScope` uses `noResult` for exactly this reason). Verified in output: `_site/essays/asymmetric-forgetting.html` contains the affiliation element with whitespace-only content.

### 2.8 Library page hard-depends on `content/library.md` — **LOW**

`build/Site.hs:675`. `_ <- loadSnapshot libraryIntroId "body"` is a top-level compiler statement (not inside a `field`), so it's a hard failure. The block is documented as "optional prose block"; deleting `content/library.md` breaks the whole `library.html` compile. Contrast the existence-guarded sidecars at `build/Tags.hs:277-283` and `build/Site.hs:843-850`.

### 2.9 Library `primaryPortalOf` reads only list-form `tags:` — **LOW**

`build/Site.hs:632-638`. `lookupStringList "tags"` returns `Nothing` for scalar comma form (`tags: research, ai`), which Hakyll's `getTags` accepts. Such an item appears on tag pages but is silently dropped from the library. All current content uses list form — latent.

### 2.10 `allContent` omits me/, memento-mori/, photography from the link graph — **LOW**

`build/Patterns.hs:124-133`, used by `build/Backlinks.hs:334,345`. Despite "Every content file the backlinks pass should index," `content/me/index.md` and `content/memento-mori/index.md` (full essays, rendered with `backlinksField`) never have their outgoing links extracted; photography likewise. Either deliberate-but-undocumented or the exact silent omission the module header says it exists to prevent.

### 2.11 Paginated tag pages: split by creation date, sorted by display date — **LOW**

`build/Tags.hs:371-377`. `buildPaginateWith (sortAndGroupAt tagPageSize)` partitions via `sortRecentFirst` (creation date), then each page re-sorts with `recentFirstByDisplay` (revision-aware). A recently revised old item stays on a late page but jumps to its top — cross-page ordering is not monotone. Only fires above the 150-item threshold.

### 2.12 `fill:#000` replacement corrupts longer hex colors — **LOW**

`build/Filters/Score.hs:118-133` (and `Filters/Viz.hs` `processColors`).
The 6-digit pass protects only `#000000`; for `fill:#000080` the 3-digit pass produces `fill:currentColor80` — invalid CSS, silently mangled SVG. Quoted attribute forms are safe; only unquoted style-property forms are exposed.

### 2.13 Source-level preprocessors rewrite inside fenced code blocks — **LOW**

`build/Filters/Wikilinks.hs:24-31`, `Filters/Transclusion.hs:18-20`, `Filters/EmbedPdf.hs`. All run on the raw source before Pandoc parses fences: `[[anything]]` in a code block becomes a link; a code-block line that is exactly `{{slug}}` or `{{pdf:...}}` becomes raw HTML. Transclusion's comment ("prevents accidental substitution inside prose or code") is false for full-line directives in code blocks. A live foot-gun for a site that documents its own syntax (ozymandias.md does exactly this).

### 2.14 `domainIcon` matches substrings of the whole URL, not the host — **LOW**

`build/Filters/Links.hs:120-153`. ``"x.com" `T.isInfixOf` url`` etc. — `https://example.org/why-x.com-failed` gets the Twitter icon. Contradicts the strict-hostname discipline `isExternal` documents at lines 95-101 of the same file. Cosmetic (icon only).

### 2.15 `gsubRoute "content/"` strips every occurrence, not just the prefix — **LOW**

`build/Site.hs:171,357,417` etc. Hakyll's `gsubRoute` is replace-all; a co-located directory literally named `content` would be silently mangled (`content/essays/slug/content/data.csv` → `essays/slug/data.csv`). Same for `gsubRoute "static/"`. Improbable but silent.

### 2.16 `existsCached` memoizes non-existence for the process lifetime — **LOW**

`build/Filters/SourceRefs.hs:160-166`. Under `make watch`, a source file created after first reference stays cached as absent until restart.

### 2.17 Core NITs

- `build/Site.hs:42-44`: comment says "eight portals"; the list has nine. Echoed at Site.hs:606 ("the eight") vs line 657's "nine times".
- `build/Site.hs:866-877`: random-pages.json comment says "essays + blog posts only" but the rule loads fiction and flat poetry too; uses flat-only `content/poetry/*.md` while the epistemic rule uses `allPoetry` — collection poems are epistemic-indexed but never randomizable.
- `build/Utils.hs:64-73`: `authorSlugify` comment claims runs of spaces collapse; code maps each space (`"A B"` → `"a--b"`). Consistent everywhere, so links work; comment wrong.
- `build/Utils.hs:31-32`: `readingTime` truncates (`div 200`) — 399 words reports "1 min"; comment implies ceiling semantics.
- `build/Pagination.hs:42` + `build/Site.hs:77-82`: hardcoded pattern literals duplicate `Patterns.hs`, defeating that module's stated purpose (Patterns.hs:6-10).
- `build/Contexts.hs:174-180`: plain `tagLinksField` returns an empty list rather than `noResult` — `$if(item-tags)$` is true and templates emit empty tag wrappers (author-index.html, item-card.html).
- `build/Tags.hs:296-304`: `tagItemCtx` composes `defaultContext`, not `siteCtx`, so `$if(has-monogram)$` never fires on tag pages — monograms render on new.html/library but silently never on tag indexes.
- `build/Contexts.hs:485-492`: `dotsField` comment says "1–5" but accepts 0 (`max 0 (min 5 n)`) — `importance: 0` renders five empty circles.
- `build/Contexts.hs:375-381`: `descriptionField` doc says `noResult`; code uses `fail` — behaviorally fine under Hakyll 4.16 `$if$` (verified against Hakyll 4.16.7.1 source) but logs `[ERROR]` debug noise per abstract-less page. Same in `abstractField`, `summaryField`, `bibliographyField`.
- `build/Filters/Images.hs:233-234`: `webpSrc` interpolated into `srcset` unescaped while sibling `src` goes through `esc`.
- `build/Filters/Links.hs:37-46,63-69`: internal PDF links double-classified (`pdf-link` + `link-internal` chrome) despite the "no overlap" comment.
- `build/Filters/Smallcaps.hs:31-34` + `Filters/Archive.hs:42-44`: "headers are skipped" only at top level; a Header nested in a Div/BlockQuote is processed, contradicting the comments.

Verified clean: no unguarded `head`/`fromJust`/`read`/`!!` hazards in the core modules; filter composition order matches its documenting comments; Hakyll 4.16.7.1 `$if$` treats both `fail` and `noResult` as false.

---

## 3. Haskell build code — feature modules

### 3.1 Stats heatmap day-of-week off-by-one: Sunday clipped out of the SVG — **MED**

`build/Stats.hs:185,300,317`. `dowOf d = fromEnum (dayOfWeek d) -- Mon=0..Sun=6` — but `time-1.12.2` is ISO-numbered (verified: `map fromEnum [Monday..Sunday] == [1..7]`). So Sunday lands at y=106 while `svgH` = 104 — every Sunday cell is clipped out of the viewBox and grid row 0 is permanently blank. Relatedly, `weekStart` returns the previous *Sunday* (and for a Sunday, 7 days back), not the "first Monday on or before" its comment claims; builds run on a Sunday also clip the newest column horizontally.

### 3.2 `Commonplace.hs` uses `Char8.pack` — non-ASCII YAML corruption — **MED**

`build/Commonplace.hs:143`. `Y.decodeEither' (BS.pack raw)` with `Data.ByteString.Char8` truncates each `Char` to 8 bits — the exact hazard `build/Now.hs:249-253` documents and fixes with `TE.encodeUtf8`. `data/commonplace.yaml` is currently pure ASCII, so latent — but a commonplace book of quotations is the likeliest file to acquire an em-dash or curly quote, which will then either fail the YAML parse or publish mojibake.

### 3.3 Backlinks: links inside tight lists are invisible — **MED**

`build/Backlinks.hs:220-226`. `extractLinksWithContext`'s `go` handles `Para`, `BlockQuote`, `Div`, `BulletList`, `OrderedList`, then `go _ = []`. Tight list items (the default `- item` form) are `Plain` blocks, not `Para`, so recursion into list children yields nothing. Every internal link written in a tight list never produces a backlink.
`Header`, `Table`, and `DefinitionList` blocks are likewise skipped. The doc comment implies coverage it doesn't deliver.

### 3.4 Stability "age" is the first→last commit span, not time since first commit — **MED**

`build/Stability.hs:89-93,99-112`. Docs say "age in days since first commit," but `classify (length dates) (daySpan (last dates) newest)` computes the span between first and most recent *commit*, with no reference to today. A piece written in a one-week burst years ago reports "volatile" forever; time passing without commits can never increase stability. Either the comment or the metric is wrong.

### 3.5 Frontmatter `history:` assumed newest-first; WRITING.md documents oldest-first — **MED**

`build/Stability.hs:204-217,299-336` vs `WRITING.md:105-109`. `loadVersionHistory` keeps authored order and all range fields treat the head as newest (`es@(newest:_) -> let oldest = last es`). Git history is newest-first, but WRITING.md's `history:` example is oldest-first. With the documented ordering, `version-history-range` renders reversed ("14 March 2026 – 1 March 2026"), `range-start` returns the newest date, and `version-history-primary` shows the three *oldest* entries.

### 3.6 Archive manifest→provenance join is exact-string, rest of system is normalized — **MED**

`build/Archive.hs:269`. `Map.lookup (meUrl me) provByUrl` joins on the raw URL; everywhere else equivalence is `normalizeUrl` (ArchiveIndex filtering, dup detection, ARCHIVE.md:189-192). Editing a manifest URL to a normalization-equivalent form (`http`→`https`, trailing slash, tracking param) silently unpublishes `/archive/<slug>/` while ArchiveIndex's normalized filter keeps the slug active — links keep pointing at a 404.

### 3.7 Photography `buildPin` computes wrong slug/thumb/title for flat entries — **MED**

`build/Photography.hs:354,362`.
`slug = takeFileName (takeDirectory fp)` — for a flat `content/photography/foo.md` this yields `"photography"`, so map.json gets `"slug": "photography"`, the title fallback is wrong, and `thumb = "/photography/photography/…"` 404s (flat-single assets route to `/photography/`). PHOTOGRAPHY.md:214 explicitly supports flat singles. Latent — `content/photography/` currently has only `index.md` — but breaks the first geo-tagged flat single.

### 3.8 `geo-precision` fails open: a typo'd "hidden" publishes coordinates — **MED**

`build/Photography.hs:347-349,312-320`. Only the exact string matches (`(_, Just "hidden", _) -> return Nothing`); any other value (e.g. `Hidden`, `hiddn`) falls into `roundCoord`, whose catch-all treats unknown values as `city` (~10 km rounding) — publishing coordinates the author meant to suppress. Contradicts the file's own privacy comment (lines 287-289) and the fail-closed precedent for `visibility:` in `build/Archive.hs:77-83`.

### 3.9 Archive state is process-lifetime cached — `watch` goes stale — **LOW**

`build/ArchiveIndex.hs:123-146` + `build/Archive.hs:304`. `activeUrls`/`rawIndex`/`rawState` are NOINLINE `unsafePerformIO` CAFs read once per process, and `archiveRules` reads the manifest in `preprocess`. Under `site watch`, edits to `manifest.yaml`, `removed.yaml`, or the regenerated state JSONs are never re-read until restart. One-shot builds unaffected.

### 3.10 Pinned pages render raw ISO in `$last-reviewed$` — **LOW**

`build/Stability.hs:166-170`. The git branch formats via `fmtIso` ("1 May 2026"); the IGNORE.txt-pinned branch returns the frontmatter value verbatim ("2026-05-01") — inconsistent display formatting.

### 3.11 Empty/all-comments `manifest.yaml` halts the build — **LOW**

`build/Archive.hs:158-170`. An empty YAML stream decodes as `Null`, which fails to parse as `[ManifestEntry]` and takes the `exitFailure` branch — draining the manifest to zero entries is fatal rather than the empty archive the absent-file branch supports.

### 3.12 Backlinks `normaliseUrl` misses directory-form canonical URLs — **LOW**

`build/Backlinks.hs:275-281`.
Strips `.html` but not `index.html`/trailing slash: a page routed `essays/foo/index.html` keys as `/essays/foo/index`, but a body link authored `/essays/foo/` doesn't match — backlink silently dropped. `build/SimilarLinks.hs:97-99` handles exactly this case and its comment flags the divergence.

### 3.13 SimilarLinks PDF viewer URL not percent-encoded — **LOW**

`build/SimilarLinks.hs:155-164`. `viewerUrl = "/pdfjs/web/viewer.html?file=" ++ escapeHtml raw` — `escapeHtml` handles HTML metachars only; a path containing `&`, `?`, `#`, or spaces breaks the `file=` query value.

### 3.14 Photography feed thumbnails only for directory-form entries — **LOW**

`build/Photography.hs:449-453`. `imgTag` requires `isDir`; flat singles and series children (`/.md`) get text-only feed entries, against PHOTOGRAPHY.md's "thumbnails embedded inline" (lines 36, 445) and the feed's deliberate inclusion of series children.

### 3.15 Marks: missing confidence/evidence renders a literal "0 TRUST" — **LOW**

`build/Marks.hs:272-278,565`. `computeTrust _ _ = 0` with a comment claiming the figure "collapses to the bare frame," but `renderEpistemicFigure` unconditionally calls `renderTrustLabel`, so a piece with `status:` but no `confidence`/`evidence` (a case MARKS.md:696 says should render) displays a prominent center "0" — indistinguishable from an authored zero-trust score.

### 3.16 Feature-module NITs

- `build/Catalog.hs:228-235`: two distinct unknown categories render as adjacent duplicate "Other" sections (equal rank, `groupBy` on raw string).
- `build/Stats.hs:754-777`: `pageTOC` comment says "nine h2 sections"; lists eleven (matching the eleven rendered).
- `build/SimilarLinks.hs:51-54`: comment says "the template caps the display"; the code caps it (`take maxSimilar` at line 80).
- `build/Stats.hs:169-171`, `build/Archive.hs:564-569`: "median" is the upper-median for even-length lists.
- `build/Backlinks.hs:133-153`: protocol-relative `//host/path` URLs pass `isPageLink` and pollute backlinks.json.
- `build/BibExtras.hs:75-98`: `@string`/`@comment`/`@preamble` blocks parsed as citekey entries — only consequential on a citekey/macro-name collision.

Verified clean: Marks tick positions/axis order/radii match MARKS.md §3; proved-confidence trust substitution matches §4.3; Archive's fail-closed `visibility` validation, removed.yaml conflict rejection, and double-sided SHA-256 verification all match ARCHIVE.md.

---

## 4. Python & shell tooling

### 4.1 `data/embed-cache-pages.npz.tmp.npz` orphan: explained; cleanup + ignore gaps — **MED**

The orphan (mtime May 26) is the fossil of a fixed bug: an earlier embed.py passed a bare path to `np.savez_compressed`, numpy appended `.npz` (verified in numpy's `_savez` source), and the subsequent `os.replace` raised FileNotFoundError, stranding the file. The current file-handle code (`tools/embed.py:173-183`) is correct, but: (a) nothing deletes the stale orphan — **delete it, don't commit it**; (b) the tmp write has no try/finally, so any mid-write exception strands `embed-cache-pages.npz.tmp`; (c) the new `.gitignore` entry is exact-path (`data/embed-cache-pages.npz`) and covers neither `.tmp` nor `.tmp.npz` variants — widen to `data/embed-cache-pages.npz*`; (d) the fixed tmp name means two concurrent runs interleave writes.

### 4.2 Corrupt embed cache crashes instead of being discarded — **MED**

`tools/embed.py:154`. The discard path catches `(OSError, KeyError, ValueError)`, but `np.load` on a truncated `.npz` raises `zipfile.BadZipFile` (verified MRO: `BadZipFile → Exception`), and `EOFError` is also uncaught. A half-written cache (exactly what §4.1(b) can produce) makes every subsequent build print "Warning: embedding failed" and leaves similar-links/semantic index stale until the file is manually deleted — the opposite of the docstring's "unreadable → discarding" contract.
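
The widened discard path that §4.2 calls for could look roughly like the sketch below. It is not the code in `tools/embed.py`: the names `read_cache_member` and `CACHE_READ_ERRORS` are hypothetical, and the standard-library `zipfile` module stands in for `np.load` (an `.npz` file is a zip container) so the example runs without numpy.

```python
import zipfile
from pathlib import Path
from typing import Optional

# Failures that mean "cache unusable, rebuild it". BadZipFile and EOFError are
# the two cases the audit found uncaught; BadZipFile derives directly from
# Exception, so a bare (OSError, KeyError, ValueError) tuple misses it.
CACHE_READ_ERRORS = (OSError, KeyError, ValueError, zipfile.BadZipFile, EOFError)


def read_cache_member(path: Path, member: str) -> Optional[bytes]:
    """Return the bytes of one archive member, or None if the cache is unusable.

    Hypothetical helper: zipfile stands in for np.load in this sketch.
    """
    try:
        # The context manager closes the archive even when reading fails.
        with zipfile.ZipFile(path) as archive:
            return archive.read(member)
    except CACHE_READ_ERRORS:
        return None  # caller treats None as "discard and re-embed"
```

A caller that treats `None` identically for a missing, truncated, or key-less cache would match the docstring's "unreadable → discarding" contract without needing a manual delete.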
### 4.3 embed.py staleness check structurally defeated by stamp-build-time — **MED**

`tools/embed.py:195-200` + `Makefile:68`. `needs_update()` compares `_site/**/*.html` mtimes against embed's outputs — but the build order is `embed.py` → `stamp-build-time.py _site`, and the stamper rewrites the footer timestamp in essentially every HTML file each build. So every page is always newer than embed's outputs and the "skip if fresh" fast path never fires: the full paragraph-embedding pass (and model load) runs on every build. The new page cache papers over half the cost; the paragraph pass pays full price every time. Related (`tools/embed.py:297-299`): model/config changes never invalidate outputs — currently masked by this bug; fixing one exposes the other.

### 4.4 archive.py writes provenance/index/state non-atomically — **MED**

`tools/archive.py:718-721,734-737,953-957,1077-1080`. All plain `write_text()`. An interrupt mid-write truncates `PROVENANCE.json`; the next build's `json.loads` (line 642) raises an unhandled `JSONDecodeError` — and a truncated provenance is indistinguishable from corruption in a tool whose whole contract is integrity checking. embed.py got atomic-write helpers; archive.py did not.

### 4.5 download-leaflet.sh: checksum verification bypassable — **MED**

`tools/download-leaflet.sh:43-47,90`. The early-exit skip checks file existence only (download-model.sh re-verifies on its skip path), and `curl -o "$target"` writes directly to the final path: a download that *fails* `verify_or_warn` aborts via `set -e` *after* the bad file is in place, and the next run's existence check accepts it permanently. A MITM'd unpkg.com download survives one failed run and is silently vendored on the next.
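
The two properties §4.5 finds missing (re-verifying on the skip path, and never letting an unverified file reach the final path) can be sketched in Python as follows. This is an illustration only, not a port of `download-leaflet.sh`: `install_verified` and `sha256_hex` are hypothetical names, and `payload` stands in for bytes that the real script would fetch with curl.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_hex(path: Path) -> str:
    """Hash a file in chunks so large assets are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_verified(payload: bytes, target: Path, expected_sha256: str) -> bool:
    """Place payload at target only if it hashes to expected_sha256.

    Returns False when an already-valid target made the write unnecessary.
    """
    # Skip path: re-verify the existing file instead of trusting its existence.
    if target.exists() and sha256_hex(target) == expected_sha256:
        return False

    # Stage in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    tmp = Path(tmp_name)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
        if sha256_hex(tmp) != expected_sha256:
            raise ValueError(f"checksum mismatch for {target.name}")
        os.replace(tmp, target)  # the final path only ever holds verified bytes
        return True
    finally:
        # After a successful replace the temp name is gone; otherwise clean up.
        if tmp.exists():
            tmp.unlink()
```

With this shape a failed verification leaves the final path untouched, and a bad file that was already in place is replaced on the next run rather than accepted by an existence check.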
### 4.6 Other download/convert scripts leave partial files in final paths — **LOW**

`tools/download-model.sh:84`: interrupted curl leaves a partial `model_quantized.onnx`; caught today only because model-checksums.sha256 pins all five files — any unpinned file would persist forever. Use `-o "$dst.part" && mv`.

`tools/convert-images.sh:33`: interrupted cwebp leaves a partial `.webp` that the `-nt` staleness gate then skips forever — a truncated WebP ships until manually deleted.

### 4.7 archive.py robustness gaps — **LOW**

- `tools/archive.py:788,795-799`: provenance missing the `artifact` key makes `prev_artifact == slug_dir`, then `sha256_of` raises an uncaught `IsADirectoryError` instead of the structured "prior snapshot incomplete" error.
- `tools/archive.py:614-617,938-940,1066-1068`: non-dict manifest entries (`- https://example.com` instead of `- url: ...`) crash with `AttributeError: 'str' object has no attribute 'get'`.
- `tools/archive.py:896`: `wayback_save` concatenates the raw URL (contrast `wayback_lookup` at 909, which uses `quote(url, safe="")`).

### 4.8 add-popup-source.sh: dead CSP reminder + unvalidated nginx interpolation — **LOW**

`tools/add-popup-source.sh:214`: the connect-src reminder gates on `[[ "$NEEDS_PROXY" -eq 0 && -n "$UPSTREAM_HOST" ]]`, but `UPSTREAM_HOST` is only set in the `NEEDS_PROXY -eq 1` branch (lines 124-131) — the reminder can never print, and the no-proxy case is exactly when it's needed (the provider will be CSP-blocked with no hint). Line 71: `NAME` from a free-text prompt is interpolated into `location /proxy/$NAME/`/`set $upstream_$NAME` with no `^[a-z0-9-]+$` validation (import-photo.sh validates; this doesn't).

### 4.9 refreeze.sh deletes the freeze before the replacement succeeds — **LOW**

`tools/refreeze.sh:13-16`. `rm -f "$FREEZE"` then `cabal freeze`; a failed resolve leaves no freeze file (recoverable via git, but write-temp-then-move is safer).
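
One way to keep the old file recoverable in the situation §4.9 describes is sketched below. It swaps in a move-aside-and-restore variant rather than strict write-temp-then-move, on the assumption (not confirmed by the audit) that the generating command can only write to the real path and must not see the old file while it runs. `regenerate_in_place` and its `generate` callback are hypothetical; the real script is shell.

```python
import os
from pathlib import Path
from typing import Callable


def regenerate_in_place(target: Path, generate: Callable[[], None]) -> None:
    """Run generate() with target moved aside; restore the old file on failure.

    Assumes generate() writes a fresh file at `target` and raises on failure
    (for a subprocess, e.g. subprocess.run(..., check=True)).
    """
    backup = target.with_name(target.name + ".bak")
    had_old = target.exists()
    if had_old:
        os.replace(target, backup)  # old file is out of the way, not deleted
    try:
        generate()
        if not target.exists():
            raise RuntimeError(f"{target.name} was not produced")
    except BaseException:
        # Drop any partial output, then put the previous file back.
        if target.exists():
            target.unlink()
        if had_old:
            os.replace(backup, target)
        raise
    if had_old:
        backup.unlink()  # success: the backup is no longer needed
```

The trade-off against a plain `rm -f` is one extra file for the duration of the run; in exchange, a failed resolve leaves the working tree exactly as it was.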
### 4.10 embed.py / atomic-write NITs — **LOW/NIT**

`tools/embed.py:109-115`: `atomic_write_bytes` uses a fixed `.tmp` name (concurrent-run collision) and no `fsync` before `os.replace` (power loss can leave an empty target). Same pattern in `_atomic_write_yaml` of extract-exif.py:377, extract-palette.py:65, extract-dimensions.py:65. `tools/embed.py:144`: NpzFile never closed — use `with np.load(...) as npz:`.

### 4.11 Tooling NITs

- `tools/import-photo.sh:147-155`: on `mogrify -strip` failure the EXIF-laden JPEG (GPS, serials) remains under `content/`, where `make build`'s `git add content/` could auto-commit it. Delete `$TARGET` on that failure path.
- `tools/hooks/pre-commit-marks.sh:28-31`: `awk '{ print $2 }'` truncates paths with spaces; the `status:` probe reads the working tree, not the staged blob. Advisory-only hook.
- `tools/preset-signing-passphrase.sh:30`: `echo -n "$PASSPHRASE"` eats a passphrase starting with `-e`/`-n`/`-E`; use `printf '%s'`.
- `tools/stamp-build-time.py:52-54`: in-place non-atomic rewrite of `_site/` HTML.
- `tools/archive.py:244`: `pdftotext` without `--`; a slug starting with `-` parses as an option. Same in extract-exif.py:159.
- `tools/monolith-version.txt` records a sha256 (matches the binary today, verified) but `find_monolith()` never checks it.

Verified clean: sign-site.sh (atomic sig writes, post-pass manifest verification); compress-assets.sh and download-pdfjs.sh (mktemp + EXIT trap, hash verified before extraction); audit-marks.py, viz_theme.py, extract-dimensions.py, extract-palette.py; embed.py's faiss `-1` padding is safely filtered; `uv lock --check` passes; model-checksums.sha256 pins all five model files.

---

## 5. Frontend JavaScript

### 5.1 Score-reader pages never restore theme/settings — **MED**

`templates/score-reader-default.html:10` + `static/js/theme.js:12-13`.
The template loads `theme.js` without `utils.js` (unlike head.html:66-67), so `window.lnUtils.safeStorage` is undefined and theme/text-size/focus-mode/reduce-motion all silently fail to restore — a dark-theme user gets a light flash-and-stay on every score page. Compounding: settings.js (line 15; the template does render the settings toggle) falls back to its no-op store, so theme picks made on score pages never persist either.

### 5.2 search-filters.js: epistemic filters silently bypass clean-URL pages — **MED**

`static/js/search-filters.js:117-125`. `normUrl()` returns `u.pathname` verbatim and looks it up in `epistemicMeta[url]`. Verified: `_site/data/epistemic-meta.json` keys include `/essays/beyond-comorbidity-indices/index.html` while rendered result links use `/essays/beyond-comorbidity-indices/`. The lookup misses, `passes(null)` returns true ("no metadata = don't filter"), so every directory-style page bypasses all active epistemic filters. Flat `.html` pages match fine, which hides the bug.

### 5.3 viz.js ignores the cappuccino theme — **MED**

`static/js/viz.js:94-99`. `isDark()` knows only `'dark'`/`'light'`/OS-preference, but theme.js/settings.js support `'cappuccino'` — a dark-brown theme (`--bg: #553a28`, base.css:203). With OS-light + cappuccino, charts render the LIGHT config (near-black marks and axis labels) on a dark background.

### 5.4 collapse.js localStorage keys collide across pages — **MED**

`static/js/collapse.js:44,83`. Key is `'section-collapsed:' + heading.id` with no pathname namespace (contrast annotations.js). Pandoc auto-slugs (`#introduction`, `#background`) recur across essays, so collapsing "Introduction" on one essay collapses it everywhere. Also uses raw `localStorage` rather than `lnUtils.safeStorage`.

### 5.5 semantic-search.js: stale-response race + duplicate index fetch — **MED**

`static/js/semantic-search.js:117-144`.
`runSearch` has no generation token; overlapping queries render in
promise-resolution order, so an older query's hits can replace a newer
one's (with `setStatus('')` masking it). `loadIndex()` (42-59) has no
in-flight-promise dedup (unlike `loadModel`'s `loadModelPromise`), so
concurrent first searches fetch `semantic-index.bin` +
`semantic-meta.json` twice.

### 5.6 lightbox.js: aria-modal with no focus trap, no keyboard activation — **MED**

`static/js/lightbox.js`. Overlay sets `role="dialog"` +
`aria-modal="true"` but has no Tab handling (gallery.js's `trapTab` at
235-257 shows the in-repo pattern) — focus walks into the obscured page.
Trigger images get only a `click` listener and no `tabindex`/keydown, so
keyboard users can't open it; `close()` focuses a non-focusable element,
which no-ops.

### 5.7 Frontend LOWs

- `static/js/gallery.js:122-125,270-275`: math/score overlay is
  click-only (no role/tabindex/keydown); `closeOverlay()` focus-returns
  to a non-focusable div, so focus is lost.
- `static/js/popups.js:478,515`: the Wikipedia provider's
  `decodeURIComponent` runs synchronously before the `.catch` attaches —
  a malformed percent sequence in a link path throws an uncaught
  `URIError` per hover.
- `static/js/popups.js:359,390`: fetched monogram SVG injected via
  `innerHTML` unescaped — the single unsanitized path in an otherwise
  fully escaped pipeline. Build-authored content, so not exploitable
  today; the comment acknowledges the trust assumption.
- `static/js/citations.js`: dead file — no template loads it; popups.js
  supersedes it. If ever re-added it would double-bind and inject
  bibliography innerHTML without popups.js's cloned-node hardening.
  Delete.
- `static/js/nav.js:26,30-31`: raw `localStorage` unguarded; if storage
  access throws, the throw lands before `toggle.addEventListener`,
  leaving the Portals toggle completely dead (utils.js exists precisely
  for this).
- `static/js/annotations.js:209-215`: marks are mouse-only; the tooltip's
  Delete button is unreachable by keyboard (only recourse is the
  all-or-nothing "Clear Annotations").
- `static/js/search.js:10`: unguarded `new PagefindUI(...)` — if the
  pagefind bundle 404s, the ReferenceError aborts the whole handler
  including the `?q=` pre-fill that the selection-popup "Here" flow
  depends on.
- `static/js/semantic-search.js:55-56,96-107`: no
  `vectors.length === meta.length * DIM` consistency check — a stale
  CDN-cached mismatch yields NaN scores and silently garbage ranking.
  (Current files verified consistent: 1,256,448 bytes = 818 × 384 × 4.)
- `static/js/transclude.js:149-151` + `collapse.js:111-114`: nested
  transcludes render a bare placeholder (no rescan of injected content);
  `reinitCollapse` is not idempotent (would stack toggle buttons if ever
  called twice on the same container).
- `static/js/popups.js:985-988,1009-1014`: `daysBetween` uses `Math.abs`,
  so future dates render "N days ago" (now.js:17 handles this correctly).

### 5.8 Frontend NITs

- `static/js/copy.js:20-22,39`: the code-less fallback copies the "copy"
  button label along with content.
- `static/js/score-reader.js:50`: URL rewritten to `?p=1` on every load
  even without a `?p=` param.
- `static/js/search-filters.js:271`: `parseInt(v,10) || 0` turns junk
  threshold input into an active ≥0 filter that matches everything.
- `static/js/selection-popup.js:90-95`: shift-keyup while typing capitals
  in the annotation picker re-summons the selection toolbar over it.

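For the `search-filters.js:271` item above, a stricter parse might look like the sketch below. It assumes the threshold is meant to be a non-negative integer typed into a text input. `parseThreshold` is a hypothetical helper, not a function in the repo; it returns `null` for empty or junk input so the caller can leave the filter inactive rather than apply a ≥0 threshold that matches everything.

```javascript
// Hypothetical helper (not in the repo): parse a threshold input strictly.
// Returns a non-negative integer, or null when the input is empty or junk,
// so the caller can treat null as "filter inactive".
function parseThreshold(raw) {
  const text = String(raw == null ? '' : raw).trim();
  // Accept plain decimal digits only; parseInt alone would accept "12abc".
  if (!/^\d+$/.test(text)) return null;
  return Number.parseInt(text, 10);
}
```
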
Verified clean: the semantic-search ↔ embed.py contract post-model-split
(DIM 384, 818-entry meta, no prefix for MiniLM — the nomic
`search_document:` prefix is confined to the build-only page path); XSS
escaping across semantic-search, popups providers, map tooltips,
annotations (sole exception §5.7 monogram); theme.js ↔ settings.js
storage schema identical; all JS selector contracts against templates
(including the uncommitted head/nav edits); popups/sidenotes
double-init guards; settings.js and gallery.js focus traps.
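
The size contract verified above (818 × 384 × 4 = 1,256,448 bytes) could also be asserted at load time, which is the guard the §5.7 semantic-search item notes is missing. A minimal sketch, assuming the index is a flat float32 buffer and the meta file parses to an array with one entry per vector; `checkIndexShape` is a hypothetical name, not an existing function.

```javascript
const DIM = 384; // embedding width stated in this audit

// Hypothetical guard: throw if the vector buffer and the metadata disagree.
// `buffer` is the ArrayBuffer of the index file, `meta` the parsed meta array.
function checkIndexShape(buffer, meta, dim = DIM) {
  const expectedBytes = meta.length * dim * Float32Array.BYTES_PER_ELEMENT;
  if (buffer.byteLength !== expectedBytes) {
    throw new Error(
      'semantic index mismatch: got ' + buffer.byteLength +
      ' bytes, expected ' + expectedBytes + ' for ' + meta.length + ' entries'
    );
  }
  // Sizes agree, so every meta entry has exactly `dim` floats behind it.
  return new Float32Array(buffer);
}
```

Failing loudly here would turn a stale-cache mismatch into a visible error instead of NaN scores.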

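For the §5.2 lookup miss, one possible normalization is sketched here under the assumption that metadata keys and result links differ only by a trailing `index.html`. `canonicalPath` and `canonicalizeMeta` are hypothetical helpers, and the placeholder base URL stands in for whatever origin the page actually runs on. Both URL forms map to the directory-style path, so either can serve as the lookup key.

```javascript
// Hypothetical helper: reduce a URL or path to one canonical lookup key.
// "/essays/x/index.html" and "/essays/x/" both become "/essays/x/".
function canonicalPath(href, base = 'https://example.invalid') {
  const path = new URL(href, base).pathname;
  return path.replace(/\/index\.html$/, '/');
}

// Re-key the metadata once so lookups use the canonical form on both sides.
function canonicalizeMeta(meta) {
  const out = {};
  for (const [key, value] of Object.entries(meta)) {
    out[canonicalPath(key)] = value;
  }
  return out;
}
```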
---

## 6. Templates & content

### 6.1 Draft in undocumented location is never built — **MED**

`content/drafts/inclusionist-manifesto.md`. WRITING.md:34 says drafts go
under `content/drafts/essays/`; `draftEssayPattern`
(`build/Patterns.hs:46-49`) matches only that, so this file is invisible
even to `make watch`/`make dev` — silently orphaned.

### 6.2 SIMD/PQC essay `repository:` URL 404s — **MED**

`content/essays/where-does-simd-help-post-quantum-cryptography/index.md:24`.
`https://git.levineuwirth.org/where-simd-helps` is missing the owner
segment — verified HTTP 404, while the sibling essay's
`.../neuwirth/beyond_comorbidity_indices` returns 200.

### 6.3 Tracked drafts contradict the gitignore policy — **MED**

`.gitignore:88` ignores `content/drafts/` as local-only "working notes,"
but `git ls-files -i -c` shows four tracked drafts
(`digital_progeny.md`, `modern_idolatry.md`, `test-essay.md`,
`university_care.md`) — ignore rules don't untrack, so edits are
auto-staged by `make build` and pushed publicly by deploy. The over-broad
`**/.env.*` pattern also matches the tracked `.env.example`.

### 6.4 Template/content LOWs and NITs

- `content/colophon.md:5`: `modified:` is dead frontmatter — nothing
  reads it; `$date-modified$` (page-footer.html:108) is Hakyll's
  `dateField` over the `date` key.
- Seven files end frontmatter with a valueless `confidence-history:`
  (YAML null; WRITING.md:97 documents a list of ints) — harmless, but
  `content/essays/scaling_outage.md` also retains the full WRITING.md
  scaffold comments in a published essay.
- `static/images/canto31.jpg`: still 4.0 MB (prior-audit §6.1 unfixed).
- `templates/blog-post.html:25,34`: `id="similar-links"` appears twice in
  mutually exclusive `$if$` branches — safe, fragile under edit.
- `content/drafts/essays/digital_progeny.md`: title duplicates the
  published "The Specification Dilemma" — stale draft.
- Frontmatter flags `home:`/`library:`/`links:`/`search:`/`portal:` are
  consumed (head.html CSS gates, default.html:6 `data-portal`) but
  undocumented in WRITING.md.

Verified clean: all `$partial(...)$` includes resolve; all ~140 distinct
template variables have context providers; no missing `alt` attributes,
tag-balance failures, or within-page duplicate IDs in composed pages; all
26 CSS files referenced by head.html exist; sampled enum values across
all sections are legal per WRITING.md and Contexts.hs validation lists.

---

## 7. Documentation / spec drift (WRITING.md, README.md)

### 7.1 `js:` page-script paths documented as content-relative; emitted root-relative — **MED**

`WRITING.md:773-775` vs `templates/default.html:37`
(`