**Status:** LIVING DOCUMENT — Updated as implementation progresses.
---
## I. Vision & Philosophy
This website is an **intellectual home** — the permanent residence of a mind that moves freely between computer science, music composition, poetry, fiction, and whatever else catches fire.
### Commitments
1. **Long content over disposable content.** Essays are living documents.
2. **Semantic zoom.** Title → abstract → headers → body → sidenotes → citations → sources.
3. **Earned ornament.** Every decorative element serves a purpose.
4. **The site is the proof.** Entirely FOSS. No tracking. No analytics. No fingerprinting.
5. **Reader > Author.**
6. **Configuration is code.** The build system is a Haskell program.
7. **No homepage epigraph.**
8. **Extensible metadata.** Future-proofed for semantic embeddings via external JSON injection.
---
## II. All Resolved Decisions
### Typography
| Role | Font | License | Notes |
|------|------|---------|-------|
| **Body** | **Spectral** | SIL OFL | Screen-first serif. True smallcaps (`smcp`), four figure styles, ligatures, seven weights + italics. Self-hosted from source — Google Fonts strips OT features. |
- **Center column:** Body text in Spectral. 650–700px max-width.
- **Right margin:** Sidenotes only (right column).
### Color
Pure monochrome. No accent color. Light mode default (`#faf8f4` background, `#1a1a1a` text). Dark mode via `[data-theme="dark"]` + `prefers-color-scheme`.
### Content Systems
- **Tag system:** Hierarchical, slash-separated (`research/mathematics`). Hakyll `buildTags` + custom hierarchy. Tag pages at `/<tag>/` with no `/tags/` namespace prefix.
- **Pagination:** Blog index 20/page, tag pages 20/page. Essay index all on one page.
- **RSS:** Atom feed at `/feed.xml` (all content types, sorted by `date`) and `/music/feed.xml` (compositions only).
- **Citations:** Numbered superscript markers `[1]` linked to a bibliography section. Hover preview via `citations.js`. Further Reading section separate from cited works. `data/bibliography.bib` + Chicago Author-Date CSL.
- **Collapsible sections:** h2/h3 headings toggle their content via `collapse.js`. Smooth `max-height` transition. State persisted in `localStorage`.
### Gwern Codebase: Selective Adoption
| Component | Action | Actual outcome |
|-----------|--------|----------------|
| `sidenotes.js` | Adopt directly (Said Achmiz, MIT) | **Written from scratch** — purpose-built for our HTML structure |
| `popups.js` | Fork and simplify (Said Achmiz, MIT) | Exists in `static/js/popups.js`; Phase 3 |
Extensible YAML frontmatter. Hakyll strips frontmatter before passing to Pandoc, so all frontmatter access goes through Hakyll's metadata API (`lookupStringList`, `getMetadataField`, etc.), not through Pandoc `Meta`.
**Frontmatter keys in use:**
```yaml
title: # page title
date: # ISO date (YYYY-MM-DD) — used for sorting, feed, reading-time
abstract: # short description (1–3 sentences). Rendered via Pandoc to support LaTeX math and Markdown.
tags: # hierarchical tag list
authors: # list of author names (defaults to Levi Neuwirth)
further-reading: # list of BibTeX keys for the Further Reading section
bibliography: # path to .bib file (optional; defaults to data/bibliography.bib)
csl: # path to .csl file (optional; defaults to data/chicago-notes.csl)
repository: # external URL pointing to the content's source code or data repository
# Epistemic profile (all optional; section shown only if `status` is present)
status: # Draft | Working model | Durable | Refined | Superseded | Deprecated
confidence: # 0–100 integer (%)
importance: # 1–5 integer (rendered as filled/empty dots)
evidence: # 1–5 integer (rendered as filled/empty dots)
scope: # personal | local | average | broad | civilizational
**`IGNORE.txt`:** A file in the project root listing content paths (one per line) whose `stability` and `last-reviewed` should not be recomputed. Cleared automatically after every `make build`. Useful for pinning manually-set stability labels on pages whose git history is misleading.
**Top metadata block:**
1. **Tags** — hierarchical tag list with links to tag index pages
2. **Description** — the `abstract` field, rendered via Pandoc (supporting LaTeX math and Markdown formatting), typically in italic
- [x] Pagination (blog index and tag pages, 20/page)
- [x] Metadata: YAML frontmatter + auto-computed word count / reading time
- [x] Single Atom feed (`/feed.xml`, all content, sorted by date)
- [x] External link icons (SVG mask-image, domain-classified via `Filters.Links`)
- [x] Gallery / Exhibit system (`gallery.js`, `gallery.css`) — added (not in original spec)
### Phase 3: Rich Interactions
- [x] Link preview popups (`popups.js`) — internal page previews (title, abstract, authors, tags, reading time), Wikipedia excerpts, citation previews; relative-URL fix for index pages
- [x] Pagefind search (`/search.html`) — `search.js` pre-fills from `?q=` param; `#search-timing` shows elapsed ms (mono, faint) via `MutationObserver` on search results subtree
- [x] Author system — authors treated as tags; `build/Authors.hs`; author pages at `/authors/{slug}/`; `authorLinksField` in all contexts; defaults to Levi Neuwirth
- [x] Settings panel — `settings.js` + `settings.css` section in `components.css`; theme, text size (3 steps), focus mode, reduce motion, print; all state in `localStorage`; `theme.js` restores all settings before first paint
- [x] Selection popup — `selection-popup.js` / `selection-popup.css`; context-aware toolbar appears 450 ms after text selection; see Implementation Notes
- [x] Fiction reading mode — same codex layout; `fictionCompiler`; chapter drop caps + smallcaps lead-in via `h2 + p::first-letter`; reading mode infrastructure shared with poetry
- [x] Music section — score fragment system (A): inline SVG excerpts (motifs, passages) integrated into the gallery/exhibit system; named, TOC-listed, focusable in the shared overlay alongside equations; authored via `{.score-fragment score-name="..." score-caption="..."}` fenced-div; SVG inlined at build time by `Filters.Score`; black fills/strokes replaced with `currentColor` for dark-mode; see Implementation Notes
- [x] Music section — composition landing pages + full score reader (C): two-URL architecture per composition; `/music/{slug}/` (rich prose landing page with movement list, audio players, inline score fragments) and `/music/{slug}/score/` (minimal dedicated reader); Hakyll `version "score-reader"` mechanism; `compositionCtx` with `slug`, `score-url`, `has-score`, `score-page-count`, `score-pages` list, `has-movements`, `movements` list (Aeson-parsed nested YAML); `score-reader-default.html` minimal shell; `score-reader.js` (page navigation, movement jumps, `?p=` deep linking, preloading, keyboard); `score-reader.css`; dark mode via `filter: invert(1)`; see Implementation Notes
- [x] Accessibility audit — skip link, TOC collapsed-link tabbing (`visibility: hidden`), section-toggle focus visibility, lightbox/gallery/settings focus restoration, popup `aria-hidden`, metadata nav wrapping, footer `onclick` removal; settings panel focus-steal bug fixed (focus only returns to toggle when it was inside the panel, preventing interference with text-selection popup)
- [~] Visualization pipeline — Pandoc filter approach (`Filters.Viz`): `.figure` fenced divs run `python3 <script>`, capture SVG stdout, inline with `currentColor` replacement; `.visualization` fenced divs embed Vega-Lite JSON in a `<script type="application/json" class="vega-spec">` tag rendered by `viz.js`; `viz: true` frontmatter gates CDN Vega/Vega-Lite/Vega-Embed + `viz.js`; dark mode re-renders via `MutationObserver`; `tools/viz_theme.py` provides matplotlib monochrome helpers. Infrastructure complete; not yet used in production content.
- [ ] Content migration — migrate existing essays, poems, fiction, and music landing pages from prior formats into `content/`
### Phase 5: Infrastructure & Advanced
- [x] **Arch Linux VPS + nginx + certbot + DNS migration** — Hetzner VPS provisioned, Arch Linux installed, nginx configured (config in §III), TLS cert via certbot, DNS migrated from DreamHost. `make deploy` pushes to GitHub and rsyncs to VPS.
- [x] **Semantic embedding pipeline** — Implemented. See Phase 6 "Embedding-powered similar links" and "Full-text semantic search".
- [x] **Backlinks with context** — Two-pass build-time system (`build/Backlinks.hs`). Pass 1: `version "links"` compiles each page lightly (wikilinks preprocessed, links + context extracted, serialised as JSON). Pass 2: `create ["data/backlinks.json"]` inverts the map. `backlinksField` in `essayCtx` / `postCtx` loads the JSON and renders `<details>`-collapsible per-entry lists. `popups.js` excludes `.backlink-source` links from the preview popup. Context paragraph uses `runPure . writeHtml5String` on the surrounding `Para` block. See Implementation Notes.
- [ ] **Link archiving** — For all external links in `data/bibliography.bib` and in page bodies, check availability and save snapshots (Wayback Machine `save` API or local archivebox instance). Store archive URLs in `data/link-archive.json`; `Filters.Links` injects `data-archive-url` attributes; `popups.js` falls back to the archive if the live URL returns 404.
- [ ] **Self-hosted git (Forgejo)** — Run Forgejo on the VPS. Mirror the build repo. Link from the colophon. Not essential; can remain on GitHub indefinitely.
- [ ] **Reader mode** — Distraction-free reading overlay: hides nav, TOC, sidenotes; widens the body column to ~70ch; activated via a keyboard shortcut or settings panel toggle. Distinct from focus mode (which affects the nav) — reader mode affects the content layout.
- [ ] **HTTP/3 + QUIC** — nginx 1.25+ supports HTTP/3 via `listen 443 quic reuseport` + `http3 on` + `Alt-Svc` header. Requires UDP 443 open in Hetzner's Cloud Firewall. Deferred: latency is currently geographic RTT, not server processing; gains would be modest for a static site from a single DC. CDN alternatives (Bunny.net, multi-region Hetzner with GeoDNS) would address the root cause but raise ethical or operational complexity concerns. Revisit if latency becomes a real user complaint. Server-side improvements (brotli pre-compression, `open_file_cache`) are a lower-cost step first.
### Phase 6: Deferred Features
- [x] **Annotation UI** — `annotations.js` / `annotations.css`: localStorage storage, text-stream re-anchoring, four highlight colors (amber/sage/steel/rose), hover tooltip with delete. Selection popup "Annotate" button triggers a color-swatch + optional note picker; Enter or "Highlight" button commits; Escape cancels. Picker positioned above the selection, same inverted style as the tooltip. Settings panel includes a "Clear Annotations" button (with confirmation) that wipes all annotations site-wide via `Annotations.clearAll()`.
- [~] **Visualization pipeline** — Implemented as a Pandoc IO filter (`Filters.Viz`), not a per-slug Hakyll rule. See Phase 4 entry and Implementation Notes. Infrastructure complete; production content pending.
- [x] **Music catalog page** — `/music/` index listing all compositions grouped by instrumentation category (orchestral → chamber → solo → vocal → choral → electronic → other), with an optional Featured section. Auto-generated from composition frontmatter by `build/Catalog.hs`; renders HTML in Haskell (same pattern as backlinks). Category, year, duration, instrumentation, and ◼/♫ indicators for score/recording availability. `content/music/index.md` provides prose intro + abstract. Template: `templates/music-catalog.html`. CSS: `static/css/catalog.css`. Context: `musicCatalogCtx` (provides `catalog: true` flag, `featured-works`, `has-featured`, `catalog-by-category`).
- [x] **Score reader swipe gestures** — `touchstart`/`touchend` listeners on `#score-reader-stage` with passive: true. Threshold: ≥ 50 px horizontal, <30pxverticaldrift.Leftswipe→nextpage;rightswipe→previouspage.
- [x] **Full-piece audio on composition pages** — `recording` frontmatter key (path relative to the composition directory). Rendered as a full-width `<audio>` player in `composition.html`, above the per-movement list. Styled via `.comp-recording` / `.comp-recording-audio` in `components.css`. Per-movement `<audio>` players and `.comp-btn` / `.comp-movement-*` styles also added in the same pass.
- [x] **RSS/feed improvements** — `/feed.xml` now includes compositions (`content/music/*/index.md`) alongside essays, posts, fiction, poetry. New `/music/feed.xml` (compositions only, `musicFeedConfig`). Compositions already had `"content"` snapshots saved by the landing-page rule; no compiler changes needed.
- [ ] **Pagefind improvements** — Currently a basic full-text search. Consider: sub-result excerpts, portal-scoped search filters, weighting by `importance` frontmatter field.
- [ ] **Audio essays / podcast feed** — Record readings of select essays. Embed a native `<audio>` player at the top of the essay page, activated by an `audio` frontmatter key (path to MP3, relative to the content dir). Generate a separate `/podcast.xml` Atom feed with `<enclosure>` elements pointing to the MP3s so readers can subscribe in any podcast app. Stretch goal: a paragraph-sync mode where the player emits `timeupdate` events that highlight the paragraph currently being read — requires a `data/audio/{slug}-timestamps.json` file mapping paragraph indices to timestamps, authored manually or via forced-alignment tooling (e.g. `whisper` with word timestamps).
- [x] **Build telemetry page** — `/build/` page generated at build time. `build/Stats.hs` loads all content items by type, reads `"word-count"` snapshots, aggregates counts/words/reading-time per type, computes word-length distribution (5 buckets), and reads top-15 tags from the `Tags` object. Makefile writes `date +%s` to `data/build-start.txt` before Hakyll runs; after pagefind, computes elapsed and writes `data/last-build-seconds.txt` (read on next build). CSS in `static/css/build.css` (flex bar chart, tabular-nums table, grid dl); loaded conditionally via `$if(build)$` in head.html.
- [x] **Epistemic profile** — Replaces the old `certainty` / `importance` fields with a richer multi-axis system. **Compact** (always visible in footer): status chip · confidence % · importance dots · evidence dots. **Expanded** (`<details>`): stability (auto) · scope · novelty · practicality · last reviewed · confidence trend. Auto-calculation in `build/Stability.hs` via `git log --follow`; `IGNORE.txt` pins overrides. See Metadata section and Implementation Notes for full schema and vocabulary.
- [ ] **Writing statistics dashboard** — A `/stats` page computed entirely at build time from the corpus. Contents: total word count across all content types, essay/post/poem count, words written per month rendered as a GitHub-style contribution heatmap (SVG generated by Haskell or a Python script), average and median essay length, longest essay, most-cited essay (by backlink count), tag distribution as a treemap, reading-time histogram, site growth over time (cumulative word count by date). All data collected during the Hakyll build from compiled items and their snapshots; serialized to `data/stats.json` and rendered into a dedicated `stats.html` template.
- [x] **Memento mori** — Implemented at `/memento-mori/` as a full standalone page. 90×52 grid of weeks anchored to birthday anniversaries (nested year/week loop via `setFullYear`; week 52 stretched to eve of next birthday to absorb 365th/366th days). Week popup shows dynamic day-count and locale-derived day names. Score fragment (bassoon, `content/memento-mori/scores/bsn.svg`) inlined via `Filters.Score`. Linked from footer (MM).
- [x] **Embedding-powered similar links** — `tools/embed.py` encodes every `#markdownBody` page with `all-MiniLM-L6-v2` (384 dims, unit-normalised), builds a FAISS `IndexFlatIP`, queries top-5 neighbours per page (cosine ≥ 0.30), writes `data/similar-links.json`. `build/SimilarLinks.hs` provides `similarLinksField` in `essayCtx`/`postCtx` with Hakyll dependency tracking; "Related" section rendered in `page-footer.html`. Staleness check skips re-embedding when JSON is newer than all HTML. Called by `make build` via `uv run`; non-fatal if `.venv` absent. See Implementation Notes.
- [x] **Bidirectional backlinks with context** — See Phase 5 above; implemented with full context-paragraph extraction. Merged with the Phase 5 stub.
- [x] **Signed pages / content integrity** — `make sign` (called by `make deploy`) runs `tools/sign-site.sh`: walks `_site/**/*.html`, produces a detached ASCII-armored `.sig` per page. Signing uses a dedicated Ed25519 subkey isolated in `~/.gnupg-signing/` (master `sec#` stub + `ssb` signing subkey). Passphrase cached 24 h in the signing agent via `tools/preset-signing-passphrase.sh` + `gpg-preset-passphrase`; `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase`. Footer "sig" link points to `$url$.sig`; hovering shows the ASCII armor via `popups.js``sigContent` provider. Public key at `static/gpg/pubkey.asc` → served at `/gpg/pubkey.asc`. Fingerprints: master `CD90AE96…B5C9663`; signing subkey `C9A42A6F…2707066` (keygrip `619844703EC398E70B0045D7150F08179CFEEFE3`). See Implementation Notes.
- [x] **Self-hosted semantic search model** — `tools/download-model.sh` fetches the quantized ONNX model (`all-MiniLM-L6-v2`, ~22 MB, 5 files) from HuggingFace into `static/models/all-MiniLM-L6-v2/` (gitignored). `semantic-search.js` sets `env.localModelPath = '/models/'` and `env.allowRemoteModels = false` before calling `pipeline()`, so all model weight fetches are same-origin. The CDN import of transformers.js itself still requires `cdn.jsdelivr.net` in `script-src`; `connect-src` stays `'self'`. `make download-model` is a one-time setup step per machine. See Implementation Notes.
- [x] **Responsive images (WebP)** — `tools/convert-images.sh` walks `static/` and `content/`, calls `cwebp -q 85` to produce `.webp` companions alongside every JPEG/PNG (skips existing; exits gracefully if `cwebp` absent). `make build` runs it before Hakyll so WebP files are present when `static/**` is copied. `build/Filters/Images.hs` detects local raster images and emits `RawInline "html"``<picture>` elements with a `<source srcset="…webp" type="image/webp">` and an `<img>` fallback; SVG, external URLs, and `data:` URIs pass through as plain `<img>`. Generated `.webp` files gitignored via `static/**/*.webp` / `content/**/*.webp`. See Implementation Notes.
- [x] **Full-text semantic search** — `tools/embed.py` also produces paragraph-level embeddings (same `all-MiniLM-L6-v2` model, same pass): walks `<p>/<li>/<blockquote>` in `#markdownBody`, tracks nearest preceding heading for context, writes `data/semantic-index.bin` (raw Float32, N×384) + `data/semantic-meta.json` ([{url, title, heading, excerpt}]). Client: `static/js/semantic-search.js` dynamically imports `@xenova/transformers@2` from CDN, embeds the query, brute-force cosine-ranks all paragraph vectors (fast at <5kparagraphsinJS),renderstop-8resultsastitle+sectionheading+excerpt.Surfacedon`/search.html`asa**Keyword / Semantic**tabstrip;activetabpersistsin`localStorage`(keyworddefaultonfirstvisit).Bothtabsonsamepage;`semantic-search.js`loadedwithinexisting`$if(search)$`block.SeeImplementationNotes.
---
## VI. Implementation Notes
### sidenotes.js — Written from scratch
The spec called for adopting Said Achmiz's `sidenotes.js` directly. Instead a purpose-built version was written for the `<span class="sidenote">` structure produced by `Filters.Sidenotes`. Features: JS collision avoidance (`positionSidenotes`), bidirectional hover highlight, click-to-focus (sticky highlight on wide viewport, anchor scroll fallback on narrow), document-click dismissal. `window.resize` is used as the reposition signal; `collapse.js` dispatches it after each section transition.
### Gallery / Exhibit system — Added (not in original spec)
- **Exhibits** (`.exhibit--equation`, `.exhibit--proof`): always-visible inline blocks with overlay zoom on click.
- **TOC integration**: exhibits are listed under their parent heading.
- Implementation: `gallery.js`, `gallery.css`; Pandoc fenced-div syntax (`:::`) to avoid the 4-space code block trap.
### LaTeX Math — Client-side KaTeX
The spec described pure build-time SSR. In practice: Pandoc outputs `class="math"` spans, KaTeX renders client-side from a deferred script. Fully static (no server per request). Revisit if build-time SSR becomes important.
*Note:* The `abstract` field is parsed natively through Pandoc via a custom `abstractField` compiler in `Contexts.hs` (using KaTeX settings) so that LaTeX math in the frontmatter renders identically to the body text.
### Citation pipeline — key subtleties
1. **`Cite` nodes, not `Span` nodes.** `processCitations` with `class="in-text"` CSL does *not* convert `Cite` nodes to `Span class="citation"` nodes in the Pandoc AST — it only populates their inline content and creates the refs div. The HTML writer wraps them in `<span class="citation">` at write time. Our `Citations.hs` must match `Cite` nodes directly.
2. **Hakyll strips YAML frontmatter.** Hakyll reads frontmatter separately; the body passed to `readPandocWith` has no YAML block, so Pandoc `Meta` is empty. `further-reading` keys and custom `bibliography` paths are read from Hakyll's metadata API (`lookupStringList` / `lookupString`) in `Compilers.hs` and passed explicitly to `Citations.applyCitations` to support custom per-essay `.bib` files (defaulting to `data/bibliography.bib`).
3. **`nocite` format.** Each further-reading key must be a *separate*`Cite` node with `AuthorInText` mode and non-empty fallback content — matching what pandoc produces from `"@key1 @key2"` in YAML. A single `Cite` node with multiple citations is not recognized by citeproc's nocite processing.
4. **`collectCiteOrder` queries blocks only**, not the full `Pandoc` (which includes metadata). Querying metadata would pick up the injected `nocite``Cite` nodes and incorrectly classify further-reading entries as inline citations.
### External link icons
Implemented via `data-link-icon` and `data-link-icon-type="svg"` attributes set by `Filters.Links`. CSS uses `mask-image: url(...)` with `background-color: currentColor` so icons inherit the text color and work in dark mode. Icons in `static/images/link-icons/` as SVG files.
### Tags — Hierarchical, no namespace
Tags are slash-separated (`research/mathematics`). A tag is auto-expanded into all ancestor prefixes so that `/research/` aggregates all `research/*` content. Tag pages live directly at `/<tag>/` with no `/tags/` namespace.
### Collapsible sections
`collapse.js` wraps each h2/h3's following siblings in a `.section-body` div and injects a `.section-toggle` button into the heading. State is persisted per heading in `localStorage` under `section-collapsed:<id>`. After each `transitionend`, dispatches `window.resize` to retrigger sidenote positioning. Headings themselves are never hidden, preserving `IntersectionObserver` targets for `toc.js`.
### Atom feed
`/feed.xml` covers all essays and blog posts (up to 30 most recent). A `"content"` snapshot is saved in `Site.hs`*before* template application, so the feed body is just the compiled article HTML (not the full page with nav/footer). Dates from the `date` frontmatter key, formatted as RFC 3339.
### Author system
Authors are treated as a second tag dimension. `build/Authors.hs` provides `buildAllAuthors` (a `buildTagsWith` call keyed to `authors` frontmatter) and `authorLinksField` (a `listFieldWith` context that defaults to `["Levi Neuwirth"]` when no `authors` key is present, so all unattributed content contributes to his author page). Author pages live at `/authors/{slug}/`. `slugify` lowercases and hyphenates; pipe-separated values (`"Name | role"`) strip the role portion via `nameOf`.
### Settings panel
`settings.js` manages four independent settings, all persisted in `localStorage`:
- **Theme** (`data-theme` on `<html>`): light / dark, with `syncThemeButtons()` toggling `.is-active`.
- **Text size**: three steps `[20, 23, 26]` px (small / default / large), written as `--text-size` CSS custom property on `<html>`. Default index is 1 (23 px).
- **Focus mode** (`data-focus-mode` on `<html>`): hides TOC, fades header to 7% opacity until hover.
- **Reduce motion** (`data-reduce-motion` on `<html>`): collapses all `animation-duration` / `transition-duration` to `0.01ms`.
`theme.js` (sync, not deferred) restores all four attributes from `localStorage` before first paint to avoid flash.
### Selection popup
`selection-popup.js` / `selection-popup.css`. A toolbar appears 450 ms after any text selection of ≥ 2 characters. Context is detected from the DOM ancestors of the selection range:
| **prose** (multi-word) | fallback | BibTeX · Copy · DuckDuckGo · Here · Wikipedia |
| **prose** (single word) | `!/\s/.test(text)` | BibTeX · Copy · Define · DuckDuckGo · Here · Wikipedia |
16 languages are mapped to documentation providers (MDN, Hoogle, docs.python.org, doc.rust-lang.org, etc.) via `DOC_PROVIDERS`. **BibTeX** generates a `@online{...}` BibLaTeX entry (key = `lastname + year + firstWord`; selected text in `note={\enquote{...}}`; year scraped from `#version-history li`). **Here** opens `/search.html?q=` in a new tab. **Define** opens English Wiktionary. Popup positions above the selection, flips below if insufficient space; hides on scroll, outside mousedown, or Escape.
### Reading mode (poetry + fiction)
Shared codex layout for creative content. `templates/reading.html` omits the TOC and emits a `<div id="reading-progress">` progress bar instead. `body.reading-mode` (set via `$if(reading)$` in `default.html`) triggers a slightly warmer background (`#fdf9f1` / `#1c1917`). Poetry pages (`body.reading-mode.poetry`) use a 52ch measure, 1.85 line-height, stanza paragraph spacing, and suppressed dropcap/smallcaps lead-in; `poetryCompiler` enables `Ext_hard_line_breaks` so each source newline becomes `<br>`. Fiction pages (`body.reading-mode.fiction`) use a 62ch measure, centered Fira Sans smallcaps chapter headings, and a dropcap + smallcaps lead-in on each `h2 + p`. Progress bar is driven by `reading.js` (scroll position → `width` on `#reading-progress`). CSS and JS loaded conditionally via `$if(reading)$`. Content goes in `content/poetry/*.md` and `content/fiction/*.md`; tags `poetry` / `fiction` route items to the correct portal and library section.
### Score fragment system (option A)
`Filters/Score.hs` walks the Pandoc AST for `Div` nodes with class `score-fragment`. It reads the referenced SVG from disk (path resolved relative to the source file's directory via `getResourceFilePath` + `takeDirectory`), replaces hardcoded black fill/stroke values with `currentColor` (6-digit before 3-digit to prevent partial matches on `#000` vs `#000000`), and emits a `RawBlock "html"``<figure>` carrying `class="score-fragment exhibit"`, `data-exhibit-name`, and `data-exhibit-type="score"` for gallery.js TOC integration. SVGs are inlined at build time and never served as separate files. `gallery.js` discovers `.score-fragment` elements via `discoverFocusableScores`, adds them to the shared `focusables[]` array with `type: 'score'`, and the overlay's `renderOverlay` branches on type — score path clones the SVG into the overlay body (no font-size loop); math path keeps the KaTeX re-render. Overlay body receives class `is-score` for tighter horizontal padding (`2rem 1.5rem` vs `3.5rem 4.5rem`). CSS: background rect removed via `svg > rect:first-child { fill: none }`, SVG responsive via `width: 100%; height: auto`, dark mode via `color: var(--text)`.
**Authoring syntax:**
```markdown
::: {.score-fragment score-name="Main Theme, mm. 1–8" score-caption="The opening gesture."}

:::
```
### Music — Composition landing pages + full score reader (option C)
**Implemented.** Two URLs per composition from one source directory.
The Hakyll `version "score-reader"` mechanism compiles the same `index.md` twice: once as the landing page (default version) and once as the reader (`customRoute` to `music/{slug}/score/index.html`). Score reader uses `makeItem ""` — the prose body is irrelevant; only frontmatter fields are needed.
#### Source directory layout
```
content/music/symphonic-dances/
├── index.md ← composition frontmatter + program notes prose
├── scores/
│ ├── page-01.svg ← one file per score page (LilyPond SVG output)
│ ├── page-02.svg
│ └── symphonic-dances.pdf
└── audio/
├── movement-1.mp3
└── movement-2.mp3
```
SVG, MP3, and PDF files are copied to `_site/` via `copyFileCompiler`. Score reader SVGs are served as separate `<img>` files — inlining a full orchestral score is impractical.
| `$score-pages$` | list | each item: `$score-page-url$` (absolute URL) |
| `$has-movements$` | boolean | present when `movements` non-empty |
| `$movements$` | list | each item: `$movement-name$`, `$movement-page$`, `$movement-duration$`, `$movement-audio$`, `$has-audio$` |
| `$composition$` | flag | `"true"` — gates `score-reader.css` in `head.html` |
`movements` is parsed from the nested YAML using `Data.Aeson.KeyMap` (Aeson 2.x API). `score-pages` are resolved to absolute URLs (`/music/{slug}/{path}`) inside the context so the `data-pages` attribute in the score reader template needs no further processing.
#### Score reader
The reader template embeds page URLs as a comma-separated `data-pages` attribute on `#score-reader-stage`. `score-reader.js` splits on commas and filters empties.
`score-reader-default.html` loads only: `base.css`, `components.css` (for settings panel styles), `score-reader.css`, `theme.js` (sync, pre-paint), `settings.js` (theme toggle in the top bar), `score-reader.js`. No nav, no TOC, no sidenotes, no popups, no gallery, no lightbox.
`score-reader.js` behaviors:
- `navigate(page)`: swaps `<img src>`, updates counter, toggles prev/next disabled states, updates active movement button (last movement whose start page ≤ current page), calls `history.replaceState` for `?p=` deep linking, preloads ±1 pages.
- Keyboard: `ArrowLeft`/`ArrowRight`/`ArrowUp`/`ArrowDown` for page turns; `Escape` → `history.back()`. Suppressed when settings panel is open.
- Dark mode: `[data-theme="dark"] .score-page { filter: invert(1); }` — clean for pure B&W notation; revisit if LilyPond embeds colored elements.
- Mobile: score scrolls horizontally at ≤ 640px (`min-width: 600px` on `<img>`); arrow buttons hidden; pinch-to-zoom is native.
#### Known limitations / future work
- **Full-piece audio**: a `recording` frontmatter key for a complete performance would add a top-level audio player on the landing page. Not yet implemented.
- **LilyPond margin cropping**: the `viewBox` drives scaling but LilyPond's default page includes margins. May need per-composition `viewBox` overrides or CSS `object-fit` once real scores are tested.
### Backlinks — Two-pass dependency-correct system
`build/Backlinks.hs`. The fundamental challenge: backlinks for page A require knowing what other pages link to A, but those pages haven't been compiled yet when A is compiled. Solved with a two-version architecture:
1. **Pass 1** (`version "links"`): each content file is compiled lightly — wikilinks preprocessed, Markdown parsed, AST walked block-by-block. For every internal link, the URL and the HTML of its surrounding `Para` block are recorded as a `LinkEntry { leUrl, leContext }`. Context rendered via `runPure (writeHtml5String opts (Pandoc nullMeta [Plain inlines]))` with `writerTemplate = Nothing` (fragment only). Result serialised as JSON per page.
2. **Pass 2** (`create ["data/backlinks.json"]`): loads all `version "links"` items, inverts the map (target → [source]), resolves each source's title and abstract from its metadata, emits `data/backlinks.json`.
3. **Context** (`backlinksField`): loads `data/backlinks.json` via `load`, looks up the current page's normalised URL, renders `<ul>` with `<details>`-collapsible context per entry.
**Key implementation details:**
- All `loadAll` / `loadAllSnapshots` / `buildTagsWith` / `buildPaginateWith` calls use `.&&. hasNoVersion` to prevent "links" version items from being picked up alongside default versions.
- `isPageLink` filters out `http://`, `https://`, `#`-anchors, `mailto:`, `tel:`, and static-asset extensions (`.pdf`, `.svg`, `.mp3`, etc.).
- JSON encoding uses `TL.unpack . TLE.decodeUtf8 . Aeson.encode` (not `LBSC.unpack`) to preserve non-ASCII characters in context paragraphs.
- `popups.js` excludes `.backlink-source` links from the internal-preview popup (same exception pattern as `.meta-authors`).
### Epistemic Profile
Implemented across `build/Stability.hs`, `build/Contexts.hs`, `templates/partials/page-footer.html`, `templates/partials/metadata.html`, and `static/css/components.css`.
**Context fields provided by `epistemicCtx`** (included in `essayCtx`):
| Field | Source | Notes |
|-------|--------|-------|
| `$status$` | frontmatter `status` | via `defaultContext` |
| `$confidence$` | frontmatter `confidence` | via `defaultContext` |
- Heuristic: ≤ 1 commits or age <14days→*volatile*;≤5commitsandage<90days→*revising*;≤15commitsorage<365days→*fairly stable*;≤30commitsorage<730days→*stable*;otherwise→*established*.
- `IGNORE.txt`: paths listed here use frontmatter `stability`/`last-reviewed` verbatim. Cleared by `> IGNORE.txt` in the Makefile's `build` target (one-shot pins).
**Critical implementation note:** Fields that use `unsafeCompiler` must return `Maybe` from the IO block and call `fail` in the `Compiler` monad afterward — not inside the `IO` action. Calling `fail` inside `unsafeCompiler`'s IO block throws an `IOError` that Hakyll's `$if()$` template evaluation does not catch as `NoResult`, causing the entire item compilation to error silently.
`build/Filters/Viz.hs` walks the AST for `Div` blocks with class `figure` or `visualization`.
**Static figures (`.figure`):** reads the `script` attribute, runs `python3 <script>` with the source file's directory as cwd, captures stdout as SVG. Replaces hardcoded `#000000`/`black` fills/strokes with `currentColor` (same trick as `Filters.Score`). Wraps in `<figure class="viz-figure">`. Script is expected to import `tools/viz_theme` and call `save_svg()` which writes to stdout.
**Interactive figures (`.visualization`):** runs the script, expects Vega-Lite JSON on stdout. Embeds as `<script type="application/json" class="vega-spec">` inside a `.vega-container` div. `viz.js` finds all `.vega-spec` scripts, stores parsed spec on `container._vegaSpec`, calls `vegaEmbed`. Always applies the site's monochrome Vega config, ignoring the spec's own `config`. MutationObserver on `document.documentElement[data-theme]` triggers `reRenderAll()` on theme change.
**Frontmatter:** `viz: true` gates CDN loading of Vega/Vega-Lite/Vega-Embed and `viz.js` in `head.html` via `$if(viz)$`.
**`tools/viz_theme.py`:** `apply_monochrome()` sets matplotlib rcParams (transparent backgrounds, black lines); `save_svg(fig)` writes SVG to stdout via `io.StringIO`; `LINESTYLE_CYCLE` provides dash-pattern sequences for multi-series charts (no color distinction needed).
**Key architecture:** master certifying key in `~/.gnupg` (passphrase-protected, used for email). Dedicated signing keyring at `~/.gnupg-signing/` holds: `sec#` (master stub, no secret) + `ssb` Ed25519 signing subkey (with secret). Correct isolation: `gpg --export-secret-subkeys "FINGERPRINT!"` exports only the subkey secret.
**Passphrase caching:** GPG 2.4's `passwd` in `--edit-key` requires the master secret to be present — it cannot change a subkey passphrase in a stub+subkey-only keyring. Instead, `gpg-preset-passphrase` (`/usr/lib/gnupg/gpg-preset-passphrase`) caches the passphrase by keygrip directly in the agent. `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase` and `max-cache-ttl 86400`. `tools/preset-signing-passphrase.sh` prompts via the terminal, calls `gpg-preset-passphrase --preset <keygrip>`. Must be run once per boot (or when the 24h cache expires).
**Popup preview:** `popups.js``sigContent` provider fetches the `.sig` URL (same-origin), renders the ASCII armor in a `<pre>` inside a `.popup-sig` div. Bound to `a.footer-sig-link` explicitly in `bindTargets`, bypassing the footer-exclusion guard on internal links. Result cached in the shared `cache` map.
**nginx:** `.sig` files need no special handling — they're served as static files alongside `.html`. The `try_files` directive handles `$uri` directly.
### Embedding pipeline + semantic search
**Model unification:** Both similar-links (page-level) and semantic search (paragraph-level) use `all-MiniLM-L6-v2` (384 dims). This is a deliberate simplification: the same model runs at build time (Python/sentence-transformers) and query time (browser/transformers.js `Xenova/all-MiniLM-L6-v2` quantized), guaranteeing that query vectors and corpus vectors are in the same embedding space.
**Build-time (`tools/embed.py`):** One HTML parse pass per file extracts both the full-page text (for similar-links) and individual paragraphs (for semantic search). Model is loaded once and both encoding jobs run sequentially. Outputs: `data/similar-links.json`, `data/semantic-index.bin` (raw `float32`, shape `[N_paragraphs, 384]`), `data/semantic-meta.json`. All three are gitignored (generated). Staleness check: skips the entire run if all three outputs are newer than all `_site/` HTML.
**Binary index format:** `para_vecs.tobytes()` writes a flat, little-endian `float32` array. In JS: `new Float32Array(arrayBuffer)`. No header, no framing — row `i` starts at byte offset `i × 384 × 4`. This is the simplest possible format and avoids a numpy/npy parser in the browser.
**Client-side search (`semantic-search.js`):** Dynamically imports `@xenova/transformers@2` from jsDelivr CDN on first query (lazy — no load cost on pages that never use semantic search). Fetches binary index + metadata JSON (also lazy, browser-cached). Brute-force dot product over a `Float32Array` in a tight JS loop — fast enough at <5kparagraphs;revisitwithaWASMFAISSbindingifthecorpusgrowsbeyond~20kparagraphs.Vectorsareunit-normalisedatbuildtime,sodotproduct =cosinesimilarity.
**Tab default + localStorage:** Keyword (Pagefind) is the default on first visit — zero cold-start. User's last-used tab is stored under `search-tab` in `localStorage` and restored on load, so returning users who prefer semantic always land there. If `localStorage` is unavailable (private browsing restrictions), falls back silently to keyword.
**Self-hosted model:** `semantic-search.js` sets `mod.env.localModelPath = '/models/'` and `mod.env.allowRemoteModels = false` immediately after the CDN import. transformers.js then resolves model files as `GET /models/all-MiniLM-L6-v2/{file}`, which are same-origin. `make download-model` (= `tools/download-model.sh`) fetches 5 files from HuggingFace: `config.json`, `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`, `onnx/model_quantized.onnx`. The files live in `static/models/` (gitignored) and are copied to `_site/models/` by the existing `static/**` Hakyll rule.
**CSP:** `script-src` requires `https://cdn.jsdelivr.net` for the transformers.js library import. `connect-src` stays `'self'` — all model weight fetches are same-origin after `allowRemoteModels = false`. The binary index and meta JSON are also same-origin.
### Responsive images — WebP `<picture>` wrapping
`build/Filters/Images.hs` inspects each `Image` inline's `src`. If it is a local raster (not starting with `http://`, `https://`, `//`, or `data:`; extension `.jpg`/`.jpeg`/`.png`/`.gif`), it emits `RawInline (Format "html")` containing a `<picture>` element:
The WebP `srcset` is computed at build time by `System.FilePath.replaceExtension`. No IO is needed in the filter — the `<source>` is always emitted; browsers silently ignore it if the file doesn't exist (falling back to `<img>`). SVG, external URLs, and `data:` URIs remain plain `<img>` tags. Images inside `<a>` links get no `data-lightbox` marker (same as before).
`tools/convert-images.sh` performs the actual conversion: `cwebp -q 85` per file. Runs before Hakyll in `make build` and is also available as a standalone `make convert-images` target. Exits 0 with a notice if `cwebp` is not installed, so the build never fails on machines without `libwebp`. Generated `.webp` files are gitignored; `git add -f` to commit an authored WebP.
**Quality note:** `-q 85` is a good default for photographic images. For pixel-art, diagrams, or images that are already highly compressed, `-lossless` or a higher quality setting may be appropriate (edit the script).
### Annotations (infrastructure only)
`annotations.js` stores annotations as JSON in `localStorage` under `site-annotations`, scoped per `location.pathname`. On `DOMContentLoaded`, `applyAll()` re-anchors saved annotations via a `TreeWalker` text-stream search (concatenates all text nodes in `#markdownBody`, finds exact match by index, builds a `Range`, wraps with `<mark>`). Cross-element ranges use `extractContents()` + `insertNode()` fallback. Four highlight colors (amber / sage / steel / rose) defined in `annotations.css` as `rgba` overlays with `box-decoration-break: clone`. Hover tooltip shows note, date, and delete button. Public API: `window.Annotations.add(text, color, note)` / `.remove(id)`. The selection-popup "Annotate" button is currently removed pending a UI revision.
---
## VII. Reference: Inspirations
- **gwern.net** — Primary model (Gwern Branwen + Said Achmiz). Semantic zoom, sidenotes, popups, monochrome, Pandoc+Hakyll.
- **Edward Tufte** — Sidenotes, information design
- **Matthew Butterick's Practical Typography** — Web typography in practice
- **Traditional book design** — The standard to aspire to on screen
---
*This specification is a living document updated as implementation progresses.*