Add 2026-06-09 repository audit findings
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
parent 7c5354efa7
commit 70ad44e9f4
@@ -0,0 +1,931 @@
---
title: Repository audit
date: 2026-06-09
---

# Repository audit — levineuwirth.org (2026-06-09)

Comprehensive audit of the repo on `main` at commit `620b974` (working tree
modified: branding refresh across `static/` + `templates/partials/`, plus
`tools/embed.py` rework; untracked `static/og-image.png`,
`templates/partials/logo-mark.svg`, `data/embed-cache-pages.npz.tmp.npz`).

Severity legend: **HIGH** (likely to break a build, cause data loss, or
expose a security weakness) — **MED** (latent bug, brittleness, or
documentation drift) — **LOW** (minor robustness gap or fragile assumption) —
**NIT** (style, polish, or paranoia).

Numbers are file:line against the working tree at audit time. Findings
marked "verified" were reproduced empirically (solver runs, built `_site/`
output inspection, live HTTP checks, binary parsing); the rest were
confirmed by reading the code.

Prior audit: `AUDIT.md` (2026-05-07). Follow-up status in §10.

---

## 1. Build & dependency chain

### 1.1 `cabal.project.freeze` is unsolvable again — next clean build fails — **HIGH**

`cabal build --dry-run` fails today (verified): the freeze pins
`distributive ==0.6.2.1`, but the system (pacman) GHC package db has
`comonad-5.0.10` built against `distributive-0.6.3`:

```
rejecting: distributive-0.6.3/installed... (constraint from
cabal.project.freeze requires ==0.6.2.1)
After searching the rest of the dependency tree exhaustively...
```

The conflict set also names aeson, warp, hakyll, http2, semigroupoids. This
is the same failure mode as prior-audit §1.1 — that audit's specific aeson
pin was fixed (now 2.2.2.0/hashable 1.4.7.0), but a different package broke
the same way after a system update. Recent builds succeed only off the
cached `dist-newstyle/cache/plan.json`; the freeze file has since changed,
so the next cabal invocation re-solves and fails. Because `make deploy`
starts with `make clean`, the next deploy hits this. `levineuwirth.cabal`'s
own bounds are compatible with the freeze — the conflict is
freeze-vs-installed-db, not freeze-vs-cabal-file.

Fix: `tools/refreeze.sh` (written for exactly this post-`pacman -Syu`
situation). The underlying fragility — freezing against a mutable system
package db — remains; consider documenting the refreeze step as part of any
system-upgrade ritual. *(In progress at time of writing.)*

### 1.2 Missing `data/archive-index.json` / `archive-state.json` crashes the build — **HIGH**

`build/ArchiveIndex.hs:134-146`. The module doc (lines 18-22) promises "An
absent or malformed file degrades safely: an empty index makes the link
consumers no-op; an absent state file makes every entry @Live@." But
`rawIndex = unsafePerformIO $ do decoded <- A.eitherDecodeFileStrict' indexPath`
(and identically `rawState`) never checks `doesFileExist`, and aeson's
`eitherDecodeFileStrict'` throws an uncaught `IOException` on a missing
file (verified: `withBinaryFile: does not exist`). Both files are
gitignored (`.gitignore:84-85`), so a fresh clone or a no-`.venv` build —
the exact path `build/Archive.hs:20-24` promises to support — throws when
the CAF is first forced. Contrast `readUrlSet` (line 109) in the same file,
which guards correctly. Currently latent on this machine only because both
generated files happen to exist.

### 1.3 `embed.py` `trust_remote_code=True` executes unpinned third-party code — **HIGH**

`tools/embed.py:329` (line ~341 in the uncommitted version). The new
page-model load is
`SentenceTransformer(PAGE_MODEL_NAME, revision=PAGE_MODEL_REVISION, trust_remote_code=True)`.
The `revision` arg pins only the `nomic-ai/nomic-embed-text-v1.5` repo; the
actual modeling code is pulled via `auto_map` from a *different* repo —
verified in the local HF cache: the executed code lives under
`transformers_modules/nomic_hyphen_ai/nomic_hyphen_bert_hyphen_2048/...`,
i.e. `nomic-ai/nomic-bert-2048` at its current head, which nothing pins. A
compromise of that second repo runs arbitrary Python at build time, in a
repo whose every other download path (download-model.sh, pdfjs, leaflet) is
sha256-pinned. The comment "Both pins are deliberate" is therefore
misleading. Fix: pin via `code_revision`, or run with `HF_HUB_OFFLINE=1`
after first fetch, or document the accepted risk.

### 1.4 Working-tree commit hazard: tracked templates reference untracked files — **HIGH (process)**

`templates/partials/nav.html:5` (tracked, modified) adds
`$partial("templates/partials/logo-mark.svg")$` and
`templates/partials/head.html` references `/og-image.png` — both target
files are **untracked** (no git history). Committing the template diff
without `git add`-ing both breaks every page's Hakyll build on a fresh
clone (`$partial$` aborts compilation) and 404s the og:image. They must
land in the same commit. Conversely, `data/embed-cache-pages.npz.tmp.npz`
must **not** be committed (see §4.1). The partial itself is safe as a
Hakyll template (verified: zero `$` characters; `match "templates/**"`
compiles it).

### 1.5 `einops` dependency: undocumented, unbounded, imported nowhere — **LOW**

`pyproject.toml:27` adds `einops>=0.8.2`. No import anywhere in
`tools/`/`build/`/`static/js/`; its only consumer is nomic's
`trust_remote_code` module (§1.3). Every sibling dependency has an
explanatory comment and an upper bound per the file's own stated policy
("Upper bounds are intentionally generous (next major) but always
present"); einops has neither. `uv lock --check` passes (0.8.2 pinned).

---

## 2. Haskell build code — core

### 2.1 Nav, home grid, and library link `/fiction/` and `/poetry/` — confirmed 404s — **MED**

`build/Site.hs:50-60` (`homePortals` contains `("Fiction","fiction")`,
`("Poetry","poetry")`), `templates/partials/nav.html:56,61`,
`templates/library.html:44,58`. No rule generates either index: fiction and
poetry are not in `tagIndexable` (`build/Patterns.hs:148-151` = essays +
blog + photos) and Site.hs has no landing rule. Verified: `_site/fiction`
does not exist; `_site/poetry/` has no `index.html`. nginx has no
redirects. Both links 404 in production today.

### 2.2 Tag/route collisions guarded for `photography` only — **MED**

`build/Tags.hs:98-99`. `tagIdentifier` maps tag `t` → `t ++ "/index.html"`;
`sectionOwnedTopLevelTags = ["photography"]` is the only guard. A
tagIndexable item tagged `music` (or `music/x`, which expands to `music`)
emits `music/index.html`, already owned by the music index route
(`build/Site.hs:486-487`); similarly `essays`, `blog`, `cv`, `archive`,
`authors`, `bibliography`. Hakyll does not error on duplicate routes — one
silently overwrites the other.

### 2.3 Sidenotes filter destroys the documented no-JS fallback — **MED**

`build/Filters/Sidenotes.hs:30-36` vs `static/css/sidenotes.css:125-135`.
The module doc claims the Pandoc `<section class="footnotes">` "serves as
fallback," but `apply` replaces every `Note`, so the writer never emits the
section. CSS depends on it below 1500px. Verified in output:
`_site/essays/scaling_outage.html` has 3 `class="sidenote"` and zero
`footnotes` occurrences. With JS disabled, footnote content is invisible on
narrow viewports. The comment, the CSS, and ozymandias.md's own prose all
contradict actual behavior.

### 2.4 Sidenote bodies rendered without the KaTeX writer — **MED**

`build/Filters/Sidenotes.hs:103-115`. `inlinesToHtml`/`blocksToHtml` use
`writeHtml5String (def :: WriterOptions)` (PlainMath), while the main
pipeline uses `KaTeX ""` (`build/Compilers.hs:47`). Math inside a footnote
never gets `<span class="math inline">\(...\)</span>`, so KaTeX never
renders it — degrades to plain italics, silently inconsistent with body
math.

### 2.5 SourceRefs whitelist vs `/source/` serving whitelist have drifted — **MED**

`build/Filters/SourceRefs.hs:114-141` vs `build/Site.hs:217-240`. Site.hs:209
says "must stay aligned with 'isSourcePath'". Mismatches: SourceRefs wraps
`content/` and `yaml-source/` (no Site counterpart); `static/` + any known
ext vs Site's `static/js/**`/`static/css/**` only; `tools/` + any ext vs
Site's `tools/**.sh`/`tools/**.py`; `data/` at any depth vs Site's
top-level `data/*.{json,yaml,md,bib}`. Each mismatch yields a wrapped
source-ref whose popup fetch 404s (Forgejo href fallback still works).
Inverse: Site serves `data/*.bib` but `.bib` is missing from
`hasKnownExt` — dead whitelist entry.

### 2.6 `epistemicEntry` ignores `confidence: proved` — **MED**

`build/Site.hs:1014-1024`. Comment: "Compute overall-score the same way
Contexts.overallScoreField does," but it uses
`readMaybe =<< lookupString "confidence" meta`, which is `Nothing` for
`"proved"`/`"proven"`, whereas `Contexts.overallScoreField`
(`build/Contexts.hs:574-576`) substitutes 100 via `isProvedConfidence`.
Proved pages get no `score` in `data/epistemic-meta.json` and export the
raw string under `confidence`, so client-side filtering silently misses
them.

### 2.7 Empty affiliation `<div>` ships on every essay without `affiliation:` — **MED**

`build/Contexts.hs:84-89` + `templates/partials/metadata-tail.html:12`.
`affiliationField` returns an empty list instead of `noResult`; Hakyll's
`$if$` is truthy for empty list fields (the codebase knows this —
`tagLinksFieldExcludingScope` uses `noResult` for exactly this reason).
Verified in output: `_site/essays/asymmetric-forgetting.html` contains
`<div class="meta-row meta-affiliation">` with whitespace-only content.

### 2.8 Library page hard-depends on `content/library.md` — **LOW**

`build/Site.hs:675`. `_ <- loadSnapshot libraryIntroId "body"` is a
top-level compiler statement (not inside a `field`), so it's a hard
failure. The block is documented as "optional prose block"; deleting
`content/library.md` breaks the whole `library.html` compile. Contrast the
existence-guarded sidecars at `build/Tags.hs:277-283` and
`build/Site.hs:843-850`.

### 2.9 Library `primaryPortalOf` reads only list-form `tags:` — **LOW**

`build/Site.hs:632-638`. `lookupStringList "tags"` returns `Nothing` for
scalar comma form (`tags: research, ai`), which Hakyll's `getTags`
accepts. Such an item appears on tag pages but is silently dropped from
the library. All current content uses list form — latent.

### 2.10 `allContent` omits me/, memento-mori/, photography from the link graph — **LOW**

`build/Patterns.hs:124-133`, used by `build/Backlinks.hs:334,345`. Despite
"Every content file the backlinks pass should index," `content/me/index.md`
and `content/memento-mori/index.md` (full essays, rendered with
`backlinksField`) never have their outgoing links extracted; photography
likewise. Either deliberate-but-undocumented or the exact silent omission
the module header says it exists to prevent.

### 2.11 Paginated tag pages: split by creation date, sorted by display date — **LOW**

`build/Tags.hs:371-377`. `buildPaginateWith (sortAndGroupAt tagPageSize)`
partitions via `sortRecentFirst` (creation date), then each page re-sorts
with `recentFirstByDisplay` (revision-aware). A recently revised old item
stays on a late page but jumps to its top — cross-page ordering is not
monotone. Only fires above the 150-item threshold.

### 2.12 `fill:#000` replacement corrupts longer hex colors — **LOW**

`build/Filters/Score.hs:118-133` (and `Filters/Viz.hs` `processColors`).
The 6-digit pass protects only `#000000`; for `fill:#000080` the 3-digit
pass produces `fill:currentColor80` — invalid CSS, silently mangled SVG.
Quoted attribute forms are safe; only unquoted style-property forms are
exposed.
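
One way to close this gap is to require that the matched color is not followed by another hex digit. The sketch below shows the idea in Python rather than Haskell; `recolor_black_fill` is a hypothetical name, and it covers only the unquoted `fill:` style-property form, not the full `processColors` logic.

```python
import re

# Match fill:#000 or fill:#000000 only when no further hex digit follows,
# so longer colors such as #000080 are left alone.
_BLACK_FILL = re.compile(r"fill:#(?:000000|000)(?![0-9a-fA-F])")


def recolor_black_fill(svg: str) -> str:
    """Replace pure-black fills with currentColor (hypothetical helper)."""
    return _BLACK_FILL.sub("fill:currentColor", svg)
```

The negative lookahead handles both passes in one expression, so no ordering between a 6-digit and a 3-digit pass is needed.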

### 2.13 Source-level preprocessors rewrite inside fenced code blocks — **LOW**

`build/Filters/Wikilinks.hs:24-31`, `Filters/Transclusion.hs:18-20`,
`Filters/EmbedPdf.hs`. All run on the raw source before Pandoc parses
fences: `[[anything]]` in a code block becomes a link; a code-block line
that is exactly `{{slug}}` or `{{pdf:...}}` becomes raw HTML.
Transclusion's comment ("prevents accidental substitution inside prose or
code") is false for full-line directives in code blocks. A live foot-gun
for a site that documents its own syntax (ozymandias.md does exactly
this).
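
A fence-aware line pass is one possible mitigation. The following Python sketch illustrates the control flow only: the link output form and the name `rewrite_outside_fences` are assumptions, and it toggles on any fence line without matching fence length or character, which a production filter would need to handle.

```python
import re

FENCE = "`" * 3  # built at runtime so the example can live inside a fenced block
_WIKILINK = re.compile(r"\[\[([^\]|]+)\]\]")


def rewrite_outside_fences(source: str) -> str:
    """Apply a wikilink rewrite only to lines outside fenced code blocks."""
    out, in_fence = [], False
    for line in source.split("\n"):
        if line.lstrip().startswith((FENCE, "~~~")):
            in_fence = not in_fence  # fence delimiter: toggle state, never rewrite
            out.append(line)
        elif in_fence:
            out.append(line)  # code content passes through untouched
        else:
            # Hypothetical output form; the real filter's link target may differ.
            out.append(_WIKILINK.sub(r"[\1](/\1)", line))
    return "\n".join(out)
```

The same state flag could guard the full-line `{{slug}}` and `{{pdf:...}}` directives.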

### 2.14 `domainIcon` matches substrings of the whole URL, not the host — **LOW**

`build/Filters/Links.hs:120-153`. ``"x.com" `T.isInfixOf` url`` etc. —
`https://example.org/why-x.com-failed` gets the Twitter icon. Contradicts
the strict-hostname discipline `isExternal` documents at lines 95-101 of
the same file. Cosmetic (icon only).
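
Matching on the parsed hostname avoids the false positive. This is a Python sketch of the comparison, not the Haskell code; `host_matches` is a hypothetical name.

```python
from urllib.parse import urlsplit


def host_matches(url: str, domain: str) -> bool:
    """True when the URL's hostname is `domain` or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)
```

The leading dot in the suffix test matters: without it, `netflix.com` would match `x.com`.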

### 2.15 `gsubRoute "content/"` strips every occurrence, not just the prefix — **LOW**

`build/Site.hs:171,357,417` etc. Hakyll's `gsubRoute` is replace-all; a
co-located directory literally named `content` would be silently mangled
(`content/essays/slug/content/data.csv` → `essays/slug/data.csv`). Same
for `gsubRoute "static/"`. Improbable but silent.

### 2.16 `existsCached` memoizes non-existence for the process lifetime — **LOW**

`build/Filters/SourceRefs.hs:160-166`. Under `make watch`, a source file
created after first reference stays cached as absent until restart.

### 2.17 Core NITs

- `build/Site.hs:42-44`: comment says "eight portals"; the list has nine.
  Echoed at Site.hs:606 ("the eight") vs line 657's "nine times".
- `build/Site.hs:866-877`: random-pages.json comment says "essays + blog
  posts only" but the rule loads fiction and flat poetry too; uses
  flat-only `content/poetry/*.md` while the epistemic rule uses
  `allPoetry` — collection poems are epistemic-indexed but never
  randomizable.
- `build/Utils.hs:64-73`: `authorSlugify` comment claims runs of spaces
  collapse; code maps each space (`"A  B"` → `"a--b"`). Consistent
  everywhere, so links work; the comment is wrong.
- `build/Utils.hs:31-32`: `readingTime` truncates (`div 200`) — 399 words
  reports "1 min"; comment implies ceiling semantics.
- `build/Pagination.hs:42` + `build/Site.hs:77-82`: hardcoded pattern
  literals duplicate `Patterns.hs`, defeating that module's stated purpose
  (Patterns.hs:6-10).
- `build/Contexts.hs:174-180`: plain `tagLinksField` returns an empty list
  rather than `noResult` — `$if(item-tags)$` is true and templates emit
  empty tag wrappers (author-index.html, item-card.html).
- `build/Tags.hs:296-304`: `tagItemCtx` composes `defaultContext`, not
  `siteCtx`, so `$if(has-monogram)$` never fires on tag pages — monograms
  render on new.html/library but silently never on tag indexes.
- `build/Contexts.hs:485-492`: `dotsField` comment says "1–5" but accepts
  0 (`max 0 (min 5 n)`) — `importance: 0` renders five empty circles.
- `build/Contexts.hs:375-381`: `descriptionField` doc says `noResult`;
  code uses `fail` — behaviorally fine under Hakyll 4.16 `$if$` (verified
  against Hakyll 4.16.7.1 source) but logs `[ERROR]` debug noise per
  abstract-less page. Same in `abstractField`, `summaryField`,
  `bibliographyField`.
- `build/Filters/Images.hs:233-234`: `webpSrc` interpolated into `srcset`
  unescaped while sibling `src` goes through `esc`.
- `build/Filters/Links.hs:37-46,63-69`: internal PDF links double-classified
  (`pdf-link` + `link-internal` chrome) despite the "no overlap" comment.
- `build/Filters/Smallcaps.hs:31-34` + `Filters/Archive.hs:42-44`:
  "headers are skipped" only at top level; a Header nested in a
  Div/BlockQuote is processed, contradicting the comments.

Verified clean: no unguarded `head`/`fromJust`/`read`/`!!` hazards in the
core modules; filter composition order matches its documenting comments;
Hakyll 4.16.7.1 `$if$` treats both `fail` and `noResult` as false.

---

## 3. Haskell build code — feature modules

### 3.1 Stats heatmap day-of-week off-by-one: Sunday clipped out of the SVG — **MED**

`build/Stats.hs:185,300,317`. `dowOf d = fromEnum (dayOfWeek d) -- Mon=0..Sun=6`
— but `time-1.12.2` is ISO-numbered (verified:
`map fromEnum [Monday..Sunday] == [1..7]`). So Sunday lands at y=106 while
`svgH` = 104 — every Sunday cell is clipped out of the viewBox and grid
row 0 is permanently blank. Relatedly, `weekStart` returns the previous
*Sunday* (and for a Sunday, 7 days back), not the "first Monday on or
before" its comment claims; builds run on a Sunday also clip the newest
column horizontally.
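
The intended arithmetic can be sketched in Python, whose `date.isoweekday()` uses the same ISO numbering (Mon=1..Sun=7) that `fromEnum` is reported to return here. The names `row_index` and `week_start` are illustrative, not the module's own.

```python
from datetime import date, timedelta


def row_index(d: date) -> int:
    """Monday-first grid row: Mon=0 .. Sun=6."""
    return d.isoweekday() - 1  # isoweekday() is Mon=1 .. Sun=7


def week_start(d: date) -> date:
    """First Monday on or before d."""
    return d - timedelta(days=row_index(d))
```

Subtracting one from the ISO number fixes the row, and deriving `week_start` from the same index keeps the column and row conventions in agreement.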

### 3.2 `Commonplace.hs` uses `Char8.pack` — non-ASCII YAML corruption — **MED**

`build/Commonplace.hs:143`. `Y.decodeEither' (BS.pack raw)` with
`Data.ByteString.Char8` truncates each `Char` to 8 bits — the exact hazard
`build/Now.hs:249-253` documents and fixes with `TE.encodeUtf8`.
`data/commonplace.yaml` is currently pure ASCII, so latent — but a
commonplace book of quotations is the likeliest file to acquire an em-dash
or curly quote, which will then either fail the YAML parse or publish
mojibake.
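
The truncation is easy to reproduce outside Haskell. This Python sketch mimics the 8-bit truncation described above (the helper name `char8_pack` is made up for illustration) and compares it with UTF-8 encoding for an em-dash, U+2014.

```python
def char8_pack(s: str) -> bytes:
    """Mimic the truncation: keep only the low 8 bits of each code point."""
    return bytes(ord(c) & 0xFF for c in s)


em_dash = "\u2014"
truncated = char8_pack(em_dash)       # a single control byte, 0x14
encoded = em_dash.encode("utf-8")     # the three-byte UTF-8 sequence
```

ASCII input is identical under both, which is why the bug stays latent until the first non-ASCII character arrives.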

### 3.3 Backlinks: links inside tight lists are invisible — **MED**

`build/Backlinks.hs:220-226`. `extractLinksWithContext`'s `go` handles
`Para`, `BlockQuote`, `Div`, `BulletList`, `OrderedList`, then `go _ = []`.
Tight list items (the default `- item` form) are `Plain` blocks, not
`Para`, so recursion into list children yields nothing. Every internal
link written in a tight list never produces a backlink. `Header`, `Table`,
and `DefinitionList` blocks are likewise skipped. The doc comment implies
coverage it doesn't deliver.

### 3.4 Stability "age" is the first→last commit span, not time since first commit — **MED**

`build/Stability.hs:89-93,99-112`. Docs say "age in days since first
commit," but `classify (length dates) (daySpan (last dates) newest)`
computes the span between first and most recent *commit*, with no
reference to today. A piece written in a one-week burst years ago reports
"volatile" forever; time passing without commits can never increase
stability. Either the comment or the metric is wrong.
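
The two candidate metrics differ only in their right-hand endpoint. A Python sketch of both, with hypothetical function names, makes the divergence concrete:

```python
from datetime import date


def commit_span_days(commit_dates: list[date]) -> int:
    """Days between first and last commit (what the code is reported to compute)."""
    return (max(commit_dates) - min(commit_dates)).days


def age_days(commit_dates: list[date], today: date) -> int:
    """Days since the first commit (what the docs describe)."""
    return (today - min(commit_dates)).days
```

For a piece committed twice within one week in early 2023, `commit_span_days` stays at 7 indefinitely, while `age_days` keeps growing with the calendar.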

### 3.5 Frontmatter `history:` assumed newest-first; WRITING.md documents oldest-first — **MED**

`build/Stability.hs:204-217,299-336` vs `WRITING.md:105-109`.
`loadVersionHistory` keeps authored order and all range fields treat the
head as newest (`es@(newest:_) -> let oldest = last es`). Git history is
newest-first, but WRITING.md's `history:` example is oldest-first. With
the documented ordering, `version-history-range` renders reversed
("14 March 2026 – 1 March 2026"), `range-start` returns the newest date,
and `version-history-primary` shows the three *oldest* entries.

### 3.6 Archive manifest→provenance join is exact-string, rest of system is normalized — **MED**

`build/Archive.hs:269`. `Map.lookup (meUrl me) provByUrl` joins on the raw
URL; everywhere else equivalence is `normalizeUrl` (ArchiveIndex
filtering, dup detection, ARCHIVE.md:189-192). Editing a manifest URL to a
normalization-equivalent form (`http`→`https`, trailing slash, tracking
param) silently unpublishes `/archive/<slug>/` while ArchiveIndex's
normalized filter keeps the slug active — links keep pointing at a 404.

### 3.7 Photography `buildPin` computes wrong slug/thumb/title for flat entries — **MED**

`build/Photography.hs:354,362`. `slug = takeFileName (takeDirectory fp)` —
for a flat `content/photography/foo.md` this yields `"photography"`, so
map.json gets `"slug": "photography"`, the title fallback is wrong, and
`thumb = "/photography/photography/<p>"` 404s (flat-single assets route to
`/photography/<asset>`). PHOTOGRAPHY.md:214 explicitly supports flat
singles. Latent — `content/photography/` currently has only `index.md` —
but breaks the first geo-tagged flat single.

### 3.8 `geo-precision` fails open: a typo'd "hidden" publishes coordinates — **MED**

`build/Photography.hs:347-349,312-320`. Only the exact string matches
(`(_, Just "hidden", _) -> return Nothing`); any other value (e.g.
`Hidden`, `hiddn`) falls into `roundCoord`, whose catch-all treats unknown
values as `city` (~10 km rounding) — publishing coordinates the author
meant to suppress. Contradicts the file's own privacy comment (lines
287-289) and the fail-closed precedent for `visibility:` in
`build/Archive.hs:77-83`.
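
A fail-closed variant publishes only for values it positively recognises. The Python sketch below is illustrative: only `hidden` and `city` appear in this audit, so the `exact` level and the name `effective_precision` are assumptions, not the site's actual vocabulary.

```python
from typing import Optional

# Hypothetical vocabulary: "city" comes from the audit text; "exact" is an
# assumed placeholder for any finer level the real module may define.
PUBLISHABLE = {"exact", "city"}


def effective_precision(raw: str) -> Optional[str]:
    """Return a precision to publish at, or None to suppress coordinates.

    Anything unrecognised (typos, wrong case, "hidden") suppresses output.
    """
    value = raw.strip()
    return value if value in PUBLISHABLE else None
```

A stricter build could raise on unrecognised values instead of returning `None`, so the typo is surfaced to the author rather than silently hiding the pin.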

### 3.9 Archive state is process-lifetime cached — `watch` goes stale — **LOW**

`build/ArchiveIndex.hs:123-146` + `build/Archive.hs:304`.
`activeUrls`/`rawIndex`/`rawState` are NOINLINE `unsafePerformIO` CAFs read
once per process, and `archiveRules` reads the manifest in `preprocess`.
Under `site watch`, edits to `manifest.yaml`, `removed.yaml`, or the
regenerated state JSONs are never re-read until restart. One-shot builds
unaffected.

### 3.10 Pinned pages render raw ISO in `$last-reviewed$` — **LOW**

`build/Stability.hs:166-170`. The git branch formats via `fmtIso`
("1 May 2026"); the IGNORE.txt-pinned branch returns the frontmatter value
verbatim ("2026-05-01") — inconsistent display formatting.

### 3.11 Empty/all-comments `manifest.yaml` halts the build — **LOW**

`build/Archive.hs:158-170`. An empty YAML stream decodes as `Null`, which
fails to parse as `[ManifestEntry]` and takes the `exitFailure` branch —
draining the manifest to zero entries is fatal rather than the empty
archive the absent-file branch supports.

### 3.12 Backlinks `normaliseUrl` misses directory-form canonical URLs — **LOW**

`build/Backlinks.hs:275-281`. Strips `.html` but not
`index.html`/trailing slash: a page routed `essays/foo/index.html` keys as
`/essays/foo/index`, but a body link authored `/essays/foo/` doesn't
match — backlink silently dropped. `build/SimilarLinks.hs:97-99` handles
exactly this case and its comment flags the divergence.
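
One normalisation that makes all of these forms agree is sketched below in Python. It is an illustration of the key computation under the assumption that `/a/b`, `/a/b/`, `/a/b.html`, and `/a/b/index.html` should be equivalent; `canonical_key` is a hypothetical name, not the function in either Haskell module.

```python
def canonical_key(url: str) -> str:
    """Collapse /a/b/index.html, /a/b/index, /a/b/ and /a/b.html to /a/b."""
    # Drop any fragment and query string before comparing paths.
    path = url.split("#", 1)[0].split("?", 1)[0]
    # Suffixes are anchored on "/" so a slug like "reindex.html" is not clipped.
    for suffix in ("/index.html", "/index", ".html"):
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    return path.rstrip("/") or "/"
```

Using one shared helper for both the page key and the link target would remove the divergence between the two modules.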

### 3.13 SimilarLinks PDF viewer URL not percent-encoded — **LOW**

`build/SimilarLinks.hs:155-164`.
`viewerUrl = "/pdfjs/web/viewer.html?file=" ++ escapeHtml raw` —
`escapeHtml` handles HTML metachars only; a path containing `&`, `?`, `#`,
or spaces breaks the `file=` query value.
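
The two encodings are separate layers: percent-encoding for the query value, then HTML escaping for the attribute. A Python sketch of that order, with `viewer_url` as a hypothetical stand-in for the Haskell binding:

```python
from html import escape
from urllib.parse import quote


def viewer_url(pdf_path: str) -> str:
    """Build the viewer href: percent-encode for the query, then HTML-escape."""
    encoded = quote(pdf_path, safe="/")  # '&', '?', '#', and spaces become %XX
    return escape("/pdfjs/web/viewer.html?file=" + encoded, quote=True)
```

Keeping `/` unescaped preserves readable paths; everything that could terminate or split the `file=` value is encoded.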

### 3.14 Photography feed thumbnails only for directory-form entries — **LOW**

`build/Photography.hs:449-453`. `imgTag` requires `isDir`; flat singles
and series children (`<series>/<photo>.md`) get text-only feed entries,
against PHOTOGRAPHY.md's "thumbnails embedded inline" (lines 36, 445) and
the feed's deliberate inclusion of series children.

### 3.15 Marks: missing confidence/evidence renders a literal "0 TRUST" — **LOW**

`build/Marks.hs:272-278,565`. `computeTrust _ _ = 0` with a comment
claiming the figure "collapses to the bare frame," but
`renderEpistemicFigure` unconditionally calls `renderTrustLabel`, so a
piece with `status:` but no `confidence`/`evidence` (a case MARKS.md:696
says should render) displays a prominent center "0" — indistinguishable
from an authored zero-trust score.

### 3.16 Feature-module NITs

- `build/Catalog.hs:228-235`: two distinct unknown categories render as
  adjacent duplicate "Other" sections (equal rank, `groupBy` on raw
  string).
- `build/Stats.hs:754-777`: `pageTOC` comment says "nine h2 sections";
  lists eleven (matching the eleven rendered).
- `build/SimilarLinks.hs:51-54`: comment says "the template caps the
  display"; the code caps it (`take maxSimilar` at line 80).
- `build/Stats.hs:169-171`, `build/Archive.hs:564-569`: "median" is the
  upper-median for even-length lists.
- `build/Backlinks.hs:133-153`: protocol-relative `//host/path` URLs pass
  `isPageLink` and pollute backlinks.json.
- `build/BibExtras.hs:75-98`: `@string`/`@comment`/`@preamble` blocks
  parsed as citekey entries — only consequential on a citekey/macro-name
  collision.

Verified clean: Marks tick positions/axis order/radii match MARKS.md §3;
proved-confidence trust substitution matches §4.3; Archive's fail-closed
`visibility` validation, removed.yaml conflict rejection, and double-sided
SHA-256 verification all match ARCHIVE.md.

---

## 4. Python & shell tooling

### 4.1 `data/embed-cache-pages.npz.tmp.npz` orphan: explained; cleanup + ignore gaps — **MED**

The orphan (mtime May 26) is the fossil of a fixed bug: an earlier
embed.py passed a bare path to `np.savez_compressed`, numpy appended
`.npz` (verified in numpy's `_savez` source), and the subsequent
`os.replace` raised FileNotFoundError, stranding the file. The current
file-handle code (`tools/embed.py:173-183`) is correct, but: (a) nothing
deletes the stale orphan — **delete it, don't commit it**; (b) the tmp
write has no try/finally, so any mid-write exception strands
`embed-cache-pages.npz.tmp`; (c) the new `.gitignore` entry is exact-path
(`data/embed-cache-pages.npz`) and covers neither `.tmp` nor `.tmp.npz`
variants — widen to `data/embed-cache-pages.npz*`; (d) the fixed tmp name
means two concurrent runs interleave writes.
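
Points (b) and (d) can be addressed together with a unique temp file and a cleanup guard. The helper below is a sketch, not the code in `tools/embed.py`; `atomic_write` and its callback parameter are hypothetical, and the callback is where a call such as `np.savez_compressed(fh, ...)` on the open handle would go.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def atomic_write(path: str, write: Callable[[BinaryIO], None]) -> None:
    """Write via a unique temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as fh:
            write(fh)  # caller writes the payload to the open handle
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when tmp and path share a filesystem
    finally:
        if os.path.exists(tmp):  # still present only if a step above failed
            os.remove(tmp)
```

Because the temp name starts with the target's basename, a widened ignore pattern such as `data/embed-cache-pages.npz*` would also cover it.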

### 4.2 Corrupt embed cache crashes instead of being discarded — **MED**

`tools/embed.py:154`. The discard path catches
`(OSError, KeyError, ValueError)`, but `np.load` on a truncated `.npz`
raises `zipfile.BadZipFile` (verified MRO: `BadZipFile → Exception`), and
`EOFError` is also uncaught. A half-written cache (exactly what §4.1(b)
can produce) makes every subsequent build print "Warning: embedding
failed" and leaves similar-links/semantic index stale until the file is
manually deleted — the opposite of the docstring's "unreadable →
discarding" contract.
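
Widening the exception tuple restores the discard contract. The sketch below uses a generic `loader` argument in place of `np.load` so that it runs with the standard library alone; `load_or_discard` and `CACHE_READ_ERRORS` are illustrative names.

```python
import zipfile

# An .npz file is a zip archive, so truncation can surface as BadZipFile
# or EOFError in addition to the three exceptions already handled.
CACHE_READ_ERRORS = (OSError, KeyError, ValueError, EOFError, zipfile.BadZipFile)


def load_or_discard(path, loader):
    """Return loader(path), or None when the cache is unreadable."""
    try:
        return loader(path)
    except CACHE_READ_ERRORS as exc:
        print(f"cache unreadable ({type(exc).__name__}); discarding")
        return None
```

Returning `None` lets the caller rebuild the cache from scratch instead of failing the whole embedding step.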

### 4.3 embed.py staleness check structurally defeated by stamp-build-time — **MED**

`tools/embed.py:195-200` + `Makefile:68`. `needs_update()` compares
`_site/**/*.html` mtimes against embed's outputs — but the build order is
`embed.py` → `stamp-build-time.py _site`, and the stamper rewrites the
footer timestamp in essentially every HTML file each build. So every page
is always newer than embed's outputs and the "skip if fresh" fast path
never fires: the full paragraph-embedding pass (and model load) runs on
every build. The new page cache papers over half the cost; the paragraph
pass pays full price every time. Related (`tools/embed.py:297-299`):
model/config changes never invalidate outputs — currently masked by this
bug; fixing one exposes the other.

### 4.4 archive.py writes provenance/index/state non-atomically — **MED**

`tools/archive.py:718-721,734-737,953-957,1077-1080`. All plain
`write_text()`. An interrupt mid-write truncates `PROVENANCE.json`; the
next build's `json.loads` (line 642) raises an unhandled
`JSONDecodeError` — and a truncated provenance is indistinguishable from
corruption in a tool whose whole contract is integrity checking. embed.py
got atomic-write helpers; archive.py did not.

### 4.5 download-leaflet.sh: checksum verification bypassable — **MED**

`tools/download-leaflet.sh:43-47,90`. The early-exit skip checks file
existence only (download-model.sh re-verifies on its skip path), and
`curl -o "$target"` writes directly to the final path: a download that
*fails* `verify_or_warn` aborts via `set -e` *after* the bad file is in
place, and the next run's existence check accepts it permanently. A
MITM'd unpkg.com download survives one failed run and is silently
vendored on the next.
|
||||||
|
|
### 4.6 Other download/convert scripts leave partial files in final paths — **LOW**

`tools/download-model.sh:84`: interrupted curl leaves a partial
`model_quantized.onnx`; caught today only because model-checksums.sha256
pins all five files — any unpinned file would persist forever. Use
`-o "$dst.part" && mv`. `tools/convert-images.sh:33`: interrupted cwebp
leaves a partial `.webp` that the `-nt` staleness gate then skips forever
— a truncated WebP ships until manually deleted.

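
The pattern that fixes both §4.5 and §4.6 is "verify before the final path ever exists, and re-verify on the skip path". The scripts in question are shell, where the equivalent is the `.part` + `mv` form already suggested; the Python below is only an illustration of the control flow. `install_verified`, `ChecksumMismatch`, and the `fetch` callable are hypothetical names, with `fetch` standing in for the curl call.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


class ChecksumMismatch(Exception):
    """Raised when downloaded bytes do not match the pinned digest."""


def install_verified(target: Path, expected_sha256: str,
                     fetch: Callable[[], bytes]) -> bool:
    """Install fetch() output at target only if it matches expected_sha256.

    Returns False when an existing, verified file was kept, True when a new
    file was installed.
    """
    target = Path(target)
    # Skip path re-verifies: existence alone is never enough.
    if target.exists():
        if hashlib.sha256(target.read_bytes()).hexdigest() == expected_sha256:
            return False
    data = fetch()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        # Nothing has been written to the final path at this point.
        raise ChecksumMismatch(f"{target.name}: expected {expected_sha256}, got {actual}")
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".part"
    )
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp_name, target)
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise
    return True
```

With this shape, a failed verification leaves nothing at the final path, and a bad file left by an earlier run is replaced instead of accepted.
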
### 4.7 archive.py robustness gaps — **LOW**

- `tools/archive.py:788,795-799`: provenance missing the `artifact` key
  makes `prev_artifact == slug_dir`, then `sha256_of` raises an uncaught
  `IsADirectoryError` instead of the structured "prior snapshot
  incomplete" error.
- `tools/archive.py:614-617,938-940,1066-1068`: non-dict manifest entries
  (`- https://example.com` instead of `- url: ...`) crash with
  `AttributeError: 'str' object has no attribute 'get'`.
- `tools/archive.py:896`: `wayback_save` concatenates the raw URL
  (contrast `wayback_lookup` at 909, which uses `quote(url, safe="")`).

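
Two of the gaps above are small enough to sketch. The helpers below are hypothetical (`manifest_urls` and `encode_url_segment` are not names from archive.py): the first turns a malformed manifest entry into a clear error instead of an `AttributeError`, and the second applies the same `quote(url, safe="")` encoding that `wayback_lookup` reportedly uses. Whether the save endpoint expects that encoding is an assumption to check against the Wayback API before changing `wayback_save`.

```python
from typing import Any, Iterable, Iterator
from urllib.parse import quote


def manifest_urls(entries: Iterable[Any]) -> Iterator[str]:
    """Yield each entry's URL, rejecting entries that are not `- url: ...` mappings."""
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(
                f"manifest entry {i}: expected a mapping with a 'url' key, "
                f"got {type(entry).__name__}: {entry!r}"
            )
        url = entry.get("url")
        if not isinstance(url, str) or not url:
            raise ValueError(f"manifest entry {i}: missing or empty 'url'")
        yield url


def encode_url_segment(url: str) -> str:
    """Percent-encode a full URL so it can be embedded in another URL's path."""
    return quote(url, safe="")
```
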
### 4.8 add-popup-source.sh: dead CSP reminder + unvalidated nginx interpolation — **LOW**

`tools/add-popup-source.sh:214`: the connect-src reminder gates on
`[[ "$NEEDS_PROXY" -eq 0 && -n "$UPSTREAM_HOST" ]]`, but `UPSTREAM_HOST`
is only set in the `NEEDS_PROXY -eq 1` branch (lines 124-131) — the
reminder can never print, and the no-proxy case is exactly when it's
needed (the provider will be CSP-blocked with no hint). Line 71: `NAME`
from a free-text prompt is interpolated into
`location /proxy/$NAME/`/`set $upstream_$NAME` with no
`^[a-z0-9-]+$` validation (import-photo.sh validates; this doesn't).

### 4.9 refreeze.sh deletes the freeze before the replacement succeeds — **LOW**

`tools/refreeze.sh:13-16`. `rm -f "$FREEZE"` then `cabal freeze`; a failed
resolve leaves no freeze file (recoverable via git, but write-temp-then-move
is safer).

### 4.10 embed.py / atomic-write NITs — **LOW/NIT**

`tools/embed.py:109-115`: `atomic_write_bytes` uses a fixed `.tmp` name
(concurrent-run collision) and no `fsync` before `os.replace` (power loss
can leave an empty target). Same pattern in `_atomic_write_yaml` of
extract-exif.py:377, extract-palette.py:65, extract-dimensions.py:65.
`tools/embed.py:144`: NpzFile never closed — use
`with np.load(...) as npz:`.

### 4.11 Tooling NITs

- `tools/import-photo.sh:147-155`: on `mogrify -strip` failure the
  EXIF-laden JPEG (GPS, serials) remains under `content/`, where
  `make build`'s `git add content/` could auto-commit it. Delete `$TARGET`
  on that failure path.
- `tools/hooks/pre-commit-marks.sh:28-31`: `awk '{ print $2 }'` truncates
  paths with spaces; the `status:` probe reads the working tree, not the
  staged blob. Advisory-only hook.
- `tools/preset-signing-passphrase.sh:30`: `echo -n "$PASSPHRASE"` eats a
  passphrase starting with `-e`/`-n`/`-E`; use `printf '%s'`.
- `tools/stamp-build-time.py:52-54`: in-place non-atomic rewrite of
  `_site/` HTML.
- `tools/archive.py:244`: `pdftotext` without `--`; a slug starting with
  `-` parses as an option. Same in extract-exif.py:159.
- `tools/monolith-version.txt` records a sha256 (matches the binary
  today, verified) but `find_monolith()` never checks it.

Verified clean: sign-site.sh (atomic sig writes, post-pass manifest
verification); compress-assets.sh and download-pdfjs.sh (mktemp + EXIT
trap, hash verified before extraction); audit-marks.py, viz_theme.py,
extract-dimensions.py, extract-palette.py; embed.py's faiss `-1` padding
is safely filtered; `uv lock --check` passes; model-checksums.sha256 pins
all five model files.

---

## 5. Frontend JavaScript

### 5.1 Score-reader pages never restore theme/settings — **MED**

`templates/score-reader-default.html:10` + `static/js/theme.js:12-13`. The
template loads `theme.js` without `utils.js` (unlike head.html:66-67), so
`window.lnUtils.safeStorage` is undefined and theme/text-size/focus-mode/
reduce-motion all silently fail to restore — a dark-theme user gets a
light flash-and-stay on every score page. Compounding: settings.js (line
15; the template does render the settings toggle) falls back to its no-op
store, so theme picks made on score pages never persist either.

### 5.2 search-filters.js: epistemic filters silently bypass clean-URL pages — **MED**

`static/js/search-filters.js:117-125`. `normUrl()` returns `u.pathname`
verbatim and looks it up in `epistemicMeta[url]`. Verified:
`_site/data/epistemic-meta.json` keys include
`/essays/beyond-comorbidity-indices/index.html` while rendered result
links use `/essays/beyond-comorbidity-indices/`. The lookup misses,
`passes(null)` returns true ("no metadata = don't filter"), so every
directory-style page bypasses all active epistemic filters. Flat `.html`
pages match fine, which hides the bug.

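
The mismatch comes down to one normalization rule. The real fix belongs in `normUrl()` in JavaScript; the Python below only illustrates the mapping, under the assumption that every directory-style page is keyed as `<dir>/index.html` in `epistemic-meta.json`, as the verified example suggests. `epistemic_key` and `lookup_meta` are hypothetical names.

```python
from typing import Optional


def epistemic_key(pathname: str) -> str:
    """Map a result-link pathname onto the key form used in epistemic-meta.json."""
    # Directory-style clean URLs are keyed by their index.html file.
    if pathname.endswith("/"):
        return pathname + "index.html"
    return pathname


def lookup_meta(meta: dict, pathname: str) -> Optional[dict]:
    """Return the page's metadata, or None when the page has no entry."""
    return meta.get(epistemic_key(pathname))
```

Paths with neither a trailing slash nor an extension (for example `/essays/foo`) are not handled here; whether the site ever emits such links would need checking before porting the rule.
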
### 5.3 viz.js ignores the cappuccino theme — **MED**

`static/js/viz.js:94-99`. `isDark()` knows only
`'dark'`/`'light'`/OS-preference, but theme.js/settings.js support
`'cappuccino'` — a dark-brown theme (`--bg: #553a28`, base.css:203). With
OS-light + cappuccino, charts render the LIGHT config (near-black marks
and axis labels) on a dark background.

### 5.4 collapse.js localStorage keys collide across pages — **MED**

`static/js/collapse.js:44,83`. Key is
`'section-collapsed:' + heading.id` with no pathname namespace (contrast
annotations.js). Pandoc auto-slugs (`#introduction`, `#background`) recur
across essays, so collapsing "Introduction" on one essay collapses it
everywhere. Also uses raw `localStorage` rather than
`lnUtils.safeStorage`.

### 5.5 semantic-search.js: stale-response race + duplicate index fetch — **MED**

`static/js/semantic-search.js:117-144`. `runSearch` has no generation
token; overlapping queries render in promise-resolution order, so an
older query's hits can replace a newer one's (with `setStatus('')`
masking it). `loadIndex()` (42-59) has no in-flight-promise dedup (unlike
`loadModel`'s `loadModelPromise`), so concurrent first searches fetch
`semantic-index.bin` + `semantic-meta.json` twice.

### 5.6 lightbox.js: aria-modal with no focus trap, no keyboard activation — **MED**

`static/js/lightbox.js`. Overlay sets `role="dialog"` +
`aria-modal="true"` but has no Tab handling (gallery.js's `trapTab` at
235-257 shows the in-repo pattern) — focus walks into the obscured page.
Trigger images get only a `click` listener and no `tabindex`/keydown, so
keyboard users can't open it; `close()` focuses a non-focusable `<img>`,
which no-ops.

### 5.7 Frontend LOWs

- `static/js/gallery.js:122-125,270-275`: math/score overlay is
  click-only (no role/tabindex/keydown); `closeOverlay()` focus-returns
  to a non-focusable div — focus drops to `<body>`.
- `static/js/popups.js:478,515`: the Wikipedia provider's
  `decodeURIComponent` runs synchronously before the `.catch` attaches —
  a malformed percent sequence in a link path throws an uncaught
  `URIError` per hover.
- `static/js/popups.js:359,390`: fetched monogram SVG injected via
  `innerHTML` unescaped — the single unsanitized path in an otherwise
  fully escaped pipeline. Build-authored content, so not exploitable
  today; the comment acknowledges the trust assumption.
- `static/js/citations.js`: dead file — no template loads it; popups.js
  supersedes it. If ever re-added it would double-bind and inject
  bibliography innerHTML without popups.js's cloned-node hardening.
  Delete.
- `static/js/nav.js:26,30-31`: raw `localStorage` unguarded; if storage
  access throws, the throw lands before `toggle.addEventListener`,
  leaving the Portals toggle completely dead (utils.js exists precisely
  for this).
- `static/js/annotations.js:209-215`: marks are mouse-only; the tooltip's
  Delete button is unreachable by keyboard (only recourse is the
  all-or-nothing "Clear Annotations").
- `static/js/search.js:10`: unguarded `new PagefindUI(...)` — if the
  pagefind bundle 404s, the ReferenceError aborts the whole handler
  including the `?q=` pre-fill that the selection-popup "Here" flow
  depends on.
- `static/js/semantic-search.js:55-56,96-107`: no
  `vectors.length === meta.length * DIM` consistency check — a stale
  CDN-cached mismatch yields NaN scores and silently garbage ranking.
  (Current files verified consistent: 1,256,448 bytes = 818 × 384 × 4.)
- `static/js/transclude.js:149-151` + `collapse.js:111-114`: nested
  transcludes render a bare placeholder (no rescan of injected content);
  `reinitCollapse` is not idempotent (would stack toggle buttons if ever
  called twice on the same container).
- `static/js/popups.js:985-988,1009-1014`: `daysBetween` uses `Math.abs`,
  so future dates render "N days ago" (now.js:17 handles this correctly).

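
The consistency figure in the `semantic-search.js` item above can be reproduced as a worked check. This is plain arithmetic, written in Python for convenience; the browser-side guard would be the JavaScript comparison quoted in that item. `index_matches_meta` is a hypothetical name, and 4 bytes per value assumes the index stores float32.

```python
DIM = 384           # embedding width stated in the audit (MiniLM, browser contract)
FLOAT32_BYTES = 4   # assumption: semantic-index.bin is a flat float32 array


def index_matches_meta(index_nbytes: int, meta_count: int, dim: int = DIM) -> bool:
    """True when the binary index holds exactly one dim-wide vector per meta entry."""
    return index_nbytes == meta_count * dim * FLOAT32_BYTES


# The verified figures: 818 entries x 384 dims x 4 bytes = 1,256,448 bytes.
print(index_matches_meta(1_256_448, 818))
```
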
### 5.8 Frontend NITs

- `static/js/copy.js:20-22,39`: code-less `<pre>` fallback copies the
  "copy" button label along with content.
- `static/js/score-reader.js:50`: URL rewritten to `?p=1` on every load
  even without a `?p=` param.
- `static/js/search-filters.js:271`: `parseInt(v,10) || 0` turns junk
  threshold input into an active ≥0 filter that matches everything.
- `static/js/selection-popup.js:90-95`: shift-keyup while typing capitals
  in the annotation picker re-summons the selection toolbar over it.

Verified clean: the semantic-search ↔ embed.py contract post-model-split
(DIM 384, 818-entry meta, no prefix for MiniLM — the nomic
`search_document:` prefix is confined to the build-only page path); XSS
escaping across semantic-search, popups providers, map tooltips,
annotations (sole exception §5.7 monogram); theme.js ↔ settings.js
storage schema identical; all JS selector contracts against templates
(including the uncommitted head/nav edits); popups/sidenotes
double-init guards; settings.js and gallery.js focus traps.

---

## 6. Templates & content

### 6.1 Draft in undocumented location is never built — **MED**

`content/drafts/inclusionist-manifesto.md`. WRITING.md:34 says drafts go
under `content/drafts/essays/`; `draftEssayPattern`
(`build/Patterns.hs:46-49`) matches only that, so this file is invisible
even to `make watch`/`make dev` — silently orphaned.

### 6.2 SIMD/PQC essay `repository:` URL 404s — **MED**

`content/essays/where-does-simd-help-post-quantum-cryptography/index.md:24`.
`https://git.levineuwirth.org/where-simd-helps` is missing the owner
segment — verified HTTP 404, while the sibling essay's
`.../neuwirth/beyond_comorbidity_indices` returns 200.

### 6.3 Tracked drafts contradict the gitignore policy — **MED**

`.gitignore:88` ignores `content/drafts/` as local-only "working notes,"
but `git ls-files -i -c --exclude-standard` shows four tracked drafts
(`digital_progeny.md`, `modern_idolatry.md`, `test-essay.md`,
`university_care.md`) — ignore rules don't untrack, so edits are
auto-staged by `make build` and pushed publicly by deploy. The over-broad
`**/.env.*` pattern also matches the tracked `.env.example`.

### 6.4 Template/content LOWs and NITs

- `content/colophon.md:5`: `modified:` is dead frontmatter — nothing
  reads it; `$date-modified$` (page-footer.html:108) is Hakyll's
  `dateField` over the `date` key.
- Seven files end frontmatter with a valueless `confidence-history:`
  (YAML null; WRITING.md:97 documents a list of ints) — harmless, but
  `content/essays/scaling_outage.md` also retains the full WRITING.md
  scaffold comments in a published essay.
- `static/images/canto31.jpg`: still 4.0 MB (prior-audit §6.1 unfixed).
- `templates/blog-post.html:25,34`: `id="similar-links"` appears twice in
  mutually exclusive `$if$` branches — safe, fragile under edit.
- `content/drafts/essays/digital_progeny.md`: title duplicates the
  published "The Specification Dilemma" — stale draft.
- Frontmatter flags `home:`/`library:`/`links:`/`search:`/`portal:` are
  consumed (head.html CSS gates, default.html:6 `data-portal`) but
  undocumented in WRITING.md.

Verified clean: all `$partial(...)$` includes resolve; all ~140 distinct
template variables have context providers; no missing `alt` attributes,
tag-balance failures, or within-page duplicate IDs in composed pages; all
26 CSS files referenced by head.html exist; sampled enum values across
all sections are legal per WRITING.md and Contexts.hs validation lists.

---

## 7. Documentation / spec drift (WRITING.md, README.md)

### 7.1 `js:` page-script paths documented as content-relative; emitted root-relative — **MED**

`WRITING.md:773-775` vs `templates/default.html:37`
(`<script src="/$script-src$" defer>`). The doc claims a composition's
`js: scripts/widget.js` serves at `/music/symphony/scripts/widget.js`; the
template emits raw root-relative frontmatter. The only current user
(memento-mori) works by coincidence of its root-level route. A
composition following the doc would 404.

### 7.2 "Standalone page `content/my-page/index.md`" has no generic rule — **MED**

`WRITING.md:20` presents directory-form standalone pages as a general
capability; `build/Site.hs` hardcodes only `content/me/index.md` (293) and
`content/memento-mori/index.md` (307); the generic rule (351) matches flat
`content/*.md` only. A new `content/my-page/index.md` silently doesn't
build.

### 7.3 Portal table lists 8 portals; the build has 9 — **MED**

`WRITING.md:221-231` omits Photography, which is in `homePortals`
(`build/Site.hs:50-60`), the nav, and `content/tag-meta/photography.md`.

### 7.4 Three implemented frontmatter fields undocumented — **MED**

WRITING.md:3 claims to cover "all frontmatter fields"; zero hits for:
`summary:` (`build/Contexts.hs:415-427`, rendered by essay.html:16 and
reading.html:12, in live use), `revised:` (`build/Contexts.hs:815`
`getRevisions` — drives `$date-display$`/`$date-original$`/
`$revision-note$` and list sort order), `keywords:`
(`build/Contexts.hs:283` → `/bibliography/<kw>/` links).

### 7.5 Documentation LOWs

- `WRITING.md:268-269,82`: default citation style called "Chicago
  Author-Date"; the injected CSL (`build/Citations.hs:114,167-168`) is
  `data/chicago-notes.csl`, titled "Chicago Notes Bibliography".
- `README.md:12,19`: `make watch` described as "rebuilds on save without
  a server"; it runs Hakyll's preview server (WRITING.md:1139 has it
  right).
- `WRITING.md:105-109`: `history:` example ordering contradicts the code
  (see §3.5).

---

## 8. nginx, Makefile & deployment

### 8.1 Multi-line CSP value embeds literal `\` + LF bytes — **MED**

`nginx/security-headers.conf:60-71`. The
`Content-Security-Policy-Report-Only` value is a single quoted string
spanning 12 lines with trailing `\` characters — nginx has no
line-continuation inside quoted strings, so the emitted header contains
raw backslash, LF, and leading-space bytes between directives. Raw LF in
a header value is illegal in HTTP/2 (vhost example enables `http2 on`);
strict clients reject the whole response. Sent on every response even as
Report-Only. Must be collapsed to one line.

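
If the 12-line layout exists for readability, one option is to keep the readable form in a small generator and emit the single-line value from it. The sketch below is a hypothetical helper, not something in the repo: `csp_value` and `nginx_add_header` are assumed names, the two directives shown are the ones quoted in §8.2 rather than the full policy, and the trailing `always` is an assumption about how the existing directive is written.

```python
from typing import Iterable, List, Tuple

# Characters that must never reach the header value or break the nginx string.
FORBIDDEN = ("\r", "\n", "\\", '"')


def csp_value(directives: Iterable[Tuple[str, List[str]]]) -> str:
    """Join (name, sources) pairs into a single-line CSP header value."""
    value = "; ".join(" ".join([name, *sources]) for name, sources in directives)
    bad = [ch for ch in FORBIDDEN if ch in value]
    if bad:
        raise ValueError(f"CSP value contains forbidden characters: {bad!r}")
    return value


def nginx_add_header(header: str, value: str) -> str:
    """Render one nginx add_header line with the value on a single line."""
    return f'add_header {header} "{value}" always;'


policy = [
    ("font-src", ["'self'", "data:"]),
    ("connect-src", ["'self'"]),
]
print(nginx_add_header("Content-Security-Policy-Report-Only", csp_value(policy)))
```

The guard rejects backslashes and line breaks outright, so the failure mode described above cannot be reintroduced by editing the directive list.
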
### 8.2 CSP gaps that will fire under enforcement — **MED**

`nginx/security-headers.conf:66-67`. (a) `font-src 'self' data:` blocks
KaTeX webfonts: head.html:61 loads `katex.min.css` from cdn.jsdelivr.net,
whose relative font URLs resolve to the CDN. (b) `connect-src 'self'`
blocks the onnxruntime `.wasm` that transformers.js v2 (dynamically
imported in `static/js/semantic-search.js:25`) fetches from jsdelivr —
the config comment covers the same-origin model files but not the
runtime. Both latent while Report-Only.

### 8.3 Makefile auto-commit sweeps any pre-staged changes — **MED**

`Makefile:28-29`. `git add content/` followed by
`git diff --cached --quiet || git commit -m "auto: ..."` commits the
*entire index* — anything previously staged gets folded into an
`auto: <timestamp> [skip ci]` commit and pushed publicly on deploy. Use
`git commit -- content/` or verify no foreign paths are staged.

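
The "verify no foreign paths are staged" option could look roughly like the guard below, run before the auto-commit. It is a sketch under assumptions: `staged_paths` and `foreign_staged` are hypothetical names, the `content/` prefix mirrors the `git add content/` in the Makefile, and in the Makefile itself the same check would more naturally be a one-line shell test.

```python
import subprocess
from typing import List


def staged_paths(repo: str = ".") -> List[str]:
    """List paths currently staged in the index (NUL-separated, so spaces are safe)."""
    out = subprocess.run(
        ["git", "-C", repo, "diff", "--cached", "--name-only", "-z"],
        check=True,
        capture_output=True,
    ).stdout
    return [p.decode("utf-8", "surrogateescape") for p in out.split(b"\0") if p]


def foreign_staged(paths: List[str], allowed_prefix: str = "content/") -> List[str]:
    """Return staged paths that the auto-commit should not be allowed to sweep up."""
    return [p for p in paths if not p.startswith(allowed_prefix)]


def check_before_auto_commit(repo: str = ".") -> None:
    """Abort instead of folding unrelated staged work into an auto commit."""
    foreign = foreign_staged(staged_paths(repo))
    if foreign:
        raise SystemExit("refusing auto-commit; foreign staged paths: " + ", ".join(foreign))
```

Restricting the commit to a pathspec achieves the same end with less code; the guard is mainly useful if the build should fail loudly rather than silently leave other staged changes behind.
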
### 8.4 Makefile LOWs

- pdf-thumbs: the `find | while read` pipeline swallows `pdftoppm`
  failures (loop exit status is the last iteration's) — a corrupt PDF
  silently ships without a thumbnail.
- deploy: prerequisite order `clean build sign` is guaranteed only under
  serial make; no `.NOTPARALLEL:` guard for `-j` invocations. (Confirmed:
  deploy does run `clean` first; `.PHONY` is complete; `.env` export
  allowlist is sound.)
- `tools/hooks/pre-commit-marks.sh` is documented (Makefile:175 comment)
  but not installed — `.git/hooks/` has only samples and `core.hooksPath`
  is unset.

Verified clean: all seven `data/` JSON/YAML files parse;
`data/embed-cache-pages.npz` is untracked, so the new gitignore entry is
fully effective; nginx archive.conf's add_header-inheritance re-include is
correct; no redirect loops; popup-proxy rate-limit/cache zones correctly
documented for http{} scope.

---

## 9. Working-tree diff review (branding refresh + embed split)

The model contract is **intact** — the diff splits one MiniLM pipeline
into two: pages now use nomic-embed-text-v1.5 (768d, build-only, for
similar-links.json); paragraphs stay on all-MiniLM-L6-v2@c9745ed (384d,
the browser contract). download-model.sh, model-checksums.sha256,
semantic-search.js (`DIM = 384`), and both WRITING.md lines (1108 nomic
for Related-pages, 1128 MiniLM for client search) are all consistent.
Icon declarations all match real files (verified with `file`: apple-touch
180×180, favicon-96 96×96, manifest PNGs 192/512, og-image 1200×630
matching declared og:image dimensions; the webp sidecar was regenerated).

Open items beyond §1.3/§1.4/§4.1:

### 9.1 32.8 KB traced SVG inlined into every page — **MED**

`templates/partials/logo-mark.svg` (32,818 bytes, potrace-style single
giant `<path>`) is inlined via the nav partial into every HTML page —
a ~33 KB per-page weight regression (pre-compression). The two-tone
`--logo-ink`/`--logo-bg` cutout (components.css:72-98) genuinely needs
inline SVG or `<use>`; an external sprite + `<use href>` restores
cacheability. Better still: a hand-drawn or simplified path — a traced
bitmap at nav size carries detail that can never resolve.

### 9.2 Icon asset bloat — **LOW**

`static/favicon.ico` is now 71,766 bytes; parsed directory shows
16/32/48/64/128/256 px entries, the 128+256 pair alone 55.8 KB. The .ico
is only the legacy fallback (modern browsers take the SVG); 16+32+48
(~8 KB) is conventional. `static/favicon.svg` is a 32,844-byte traced
path. `static/images/link-icons/internal.svg` went ~2 KB → 32,818 bytes
yet renders at 0.7–1.6 rem via CSS mask in three stylesheets
(components.css:853, typography.css:833, popups.css:161).

### 9.3 Webmanifest regressions — **NIT**

`static/site.webmanifest`: `purpose` changed maskable→`any` for both
icons (Android adaptive launchers will letterbox; convention is separate
`any` + `maskable` entries); still no `start_url`/`scope`/`description`
(Lighthouse installability warnings). JSON valid; icons verified.

---

## 10. Prior audit (AUDIT.md 2026-05-07) follow-up

| Finding | Status |
|---|---|
| §1.1 freeze unsolvable | **Effectively still open** — aeson pin fixed, but the freeze broke again via `distributive` after a system update (§1.1 above); the underlying freeze-vs-system-db fragility is unaddressed |
| §1.3 Python version mismatch | Fixed (`requires-python = ">=3.14"` matches `.python-version`) |
| §1.4 model checksums | Fixed (`tools/model-checksums.sha256`, 5 entries) |
| §9.1 nginx headers | Fixed (`nginx/security-headers.conf` + vhost example, README'd) — but see §8.1/§8.2 for new issues in that file |
| §6.1 `canto31.jpg` 4 MB | **Unfixed** |
| robots.txt / sitemap | Fixed (Site.hs:941/963, present in `_site/`) |
| README `paper/`/`spec.md` ghosts | Fixed |
| rsync target quoting | Fixed |
| date-quoting doc | Fixed (WRITING.md:106) |
| tag-meta no-title exception | Fixed (WRITING.md:238-251) |

---

## Suggested triage order

1. ~~`tools/refreeze.sh`~~ (§1.1 — in progress)
2. Delete `data/embed-cache-pages.npz.tmp.npz`; widen the gitignore
   pattern; `git add` `logo-mark.svg` + `og-image.png` before committing
   the branding diff (§1.4, §4.1)
3. Guard `ArchiveIndex.hs` file reads with `doesFileExist` (§1.2)
4. Pin or sandbox the nomic remote code (§1.3)
5. Fix the `/fiction/`–`/poetry/` 404s (§2.1) and the production-visible
   frontend MEDs (§5.1, §5.2)
6. Collapse the nginx CSP to one line before ever flipping it to
   enforcing (§8.1, §8.2)
7. The rest by severity as time allows