levineuwirth.org/spec.md

740 lines
62 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# levineuwirth.org — Design Specification v11
**Author:** Levi Neuwirth
**Date:** March 2026 (v11: 21 March 2026)
**Status:** LIVING DOCUMENT — Updated as implementation progresses.
---
## I. Vision & Philosophy
This website is an **intellectual home** — the permanent residence of a mind that moves freely between computer science, music composition, poetry, fiction, and whatever else catches fire.
### Commitments
1. **Long content over disposable content.** Essays are living documents.
2. **Semantic zoom.** Title → abstract → headers → body → sidenotes → citations → sources.
3. **Earned ornament.** Every decorative element serves a purpose.
4. **The site is the proof.** Entirely FOSS. No tracking. No analytics. No fingerprinting.
5. **Reader > Author.**
6. **Configuration is code.** The build system is a Haskell program.
7. **No homepage epigraph.**
8. **Extensible metadata.** Future-proofed for semantic embeddings via external JSON injection.
---
## II. All Resolved Decisions
### Typography
| Role | Font | License | Notes |
|------|------|---------|-------|
| **Body** | **Spectral** | SIL OFL | Screen-first serif. True smallcaps (`smcp`), four figure styles, ligatures, seven weights + italics. Self-hosted from source — Google Fonts strips OT features. |
| **UI / Headers** | **Fira Sans** | SIL OFL | Humanist sans-serif. Complements Spectral. |
| **Code** | **JetBrains Mono** | SIL OFL | Ligatures, excellent legibility. |
Font pairing has been tested across screens and confirmed.
**Self-hosting workflow:**
```bash
pyftsubset Spectral-Regular.ttf \
--output-file=spectral-regular.woff2 \
--flavor=woff2 \
--layout-features='liga,dlig,smcp,c2sc,onum,lnum,pnum,tnum,frac,ordn,sups,subs,ss01,ss02,ss03,ss04,ss05,kern' \
--unicodes='U+0000-00FF,U+0131,U+0152-0153,U+02BB-02BC,U+02C6,U+02DA,U+02DC,U+2000-206F,U+2074,U+20AC,U+2122,U+2191,U+2193,U+2212,U+2215,U+FEFF,U+FFFD' \
--no-hinting --desubroutinize
```
### LaTeX Math
Client-side KaTeX (not pure build-time SSR — see Implementation Notes):
- Pandoc outputs math spans with `class="math inline"` / `class="math display"`
- KaTeX renders client-side from a deferred script
- KaTeX CSS/fonts loaded conditionally only on pages with math (`$if(math)$` in head template)
### Navigation
```
Home | Me | Current | New | Links | Search [⚙]
───────────────────────────────────────────────
▼ Portals
AI | Fiction | Miscellany | Music | Nonfiction | Poetry | Research | Tech
```
- **Primary row (always visible):** Home, Me, Current (now-page), New (changelog), Links, Search; settings gear (⚙) on the right
- **Settings panel** (⚙ button): Theme (Light/Dark), Text size (A/A+), Focus Mode, Reduce Motion, Print — managed by `settings.js`; state persisted via `localStorage`
- **Expandable portal row:** AI, Fiction, Miscellany, Music, Nonfiction, Poetry, Research, Tech
- Portal row collapsed by default; expansion state persisted via `localStorage`
- Fira Sans smallcaps for primary row
### Layout
- **Left margin:** Interactive sticky TOC (`IntersectionObserver`). Collapses on narrow screens.
- **Center column:** Body text in Spectral. 650700px max-width.
- **Right margin:** Sidenotes only (right column).
### Color
Pure monochrome. No accent color. Light mode default (`#faf8f4` background, `#1a1a1a` text). Dark mode via `[data-theme="dark"]` + `prefers-color-scheme`.
### Content Systems
- **Tag system:** Hierarchical, slash-separated (`research/mathematics`). Hakyll `buildTags` + custom hierarchy. Tag pages at `/<tag>/` with no `/tags/` namespace prefix.
- **Pagination:** Blog index 20/page, tag pages 20/page. Essay index all on one page.
- **RSS:** Atom feed at `/feed.xml` (all content types, sorted by `date`) and `/music/feed.xml` (compositions only).
- **Citations:** Numbered superscript markers `[1]` linked to a bibliography section. Hover preview via `citations.js`. Further Reading section separate from cited works. `data/bibliography.bib` + Chicago Author-Date CSL.
- **Collapsible sections:** h2/h3 headings toggle their content via `collapse.js`. Smooth `max-height` transition. State persisted in `localStorage`.
### Gwern Codebase: Selective Adoption
| Component | Action | Actual outcome |
|-----------|--------|----------------|
| `sidenotes.js` | Adopt directly (Said Achmiz, MIT) | **Written from scratch** — purpose-built for our HTML structure |
| `popups.js` | Fork and simplify (Said Achmiz, MIT) | Exists in `static/js/popups.js`; Phase 3 |
| CSS typographic foundations | Extract and refactor | Done |
| Pandoc AST filters | Write from scratch | Done |
| Hakyll architecture | Rewrite, informed by gwern | Done |
| Everything else | Ignore | — |
### Metadata
Extensible YAML frontmatter. Hakyll strips frontmatter before passing to Pandoc, so all frontmatter access goes through Hakyll's metadata API (`lookupStringList`, `getMetadataField`, etc.), not through Pandoc `Meta`.
**Frontmatter keys in use:**
```yaml
title: # page title
date: # ISO date (YYYY-MM-DD) — used for sorting, feed, reading-time
abstract: # short description (13 sentences)
tags: # hierarchical tag list
authors: # list of author names (defaults to Levi Neuwirth)
further-reading: # list of BibTeX keys for the Further Reading section
bibliography: # path to .bib file (optional; defaults to data/bibliography.bib)
csl: # path to .csl file (optional; defaults to data/chicago-notes.csl)
# Epistemic profile (all optional; section shown only if `status` is present)
status: # Draft | Working model | Durable | Refined | Superseded | Deprecated
confidence: # 0100 integer (%)
importance: # 15 integer (rendered as filled/empty dots)
evidence: # 15 integer (rendered as filled/empty dots)
scope: # personal | local | average | broad | civilizational
novelty: # conventional | moderate | idiosyncratic | innovative
practicality: # abstract | low | moderate | high | exceptional
stability: # volatile | revising | fairly stable | stable | established
# (auto-computed from git history; use IGNORE.txt to pin)
last-reviewed: # ISO date — overrides git-derived date when in IGNORE.txt
confidence-history: # list of integers — trend derived from last two entries (↑↓→)
# Version history (optional; falls back to git log, then to date-created/date-modified)
history:
- date: "2026-03-01" # ISO date string (quote to prevent YAML date parsing)
note: Initial draft # human-readable annotation
- date: "2026-03-14"
note: Expanded typography section; added citations
```
Auto-computed at build time: `word-count`, `reading-time`.
Auto-derived at build time: `stability` (from `git log --follow`), `last-reviewed` (most recent commit date), `confidence-trend` (from `confidence-history`).
**`IGNORE.txt`:** A file in the project root listing content paths (one per line) whose `stability` and `last-reviewed` should not be recomputed. Cleared automatically after every `make build`. Useful for pinning manually-set stability labels on pages whose git history is misleading.
**Top metadata block:**
1. **Tags** — hierarchical tag list with links to tag index pages
2. **Description** — the `abstract` field, rendered in italic
3. **Authors**`authors` list
4. **Page info** — jump links to bottom metadata sections (Epistemic/Bibliography/Backlinks shown conditionally)
**Bottom metadata footer:**
- **Version history** — three-tier priority: (1) frontmatter `history` list with authored notes → (2) git log dates (date-only) → (3) `date-created` / `date-modified` fallback. `make build` auto-commits `content/` before building, keeping git history current.
- **Epistemic** (if `status` set) — compact: status chip · confidence % · importance dots · evidence dots; expanded `<details>`: stability · scope · novelty · practicality · last reviewed · confidence trend
- **Bibliography** — formatted citations + Further Reading
- **Backlinks** — auto-generated; each entry shows source title (link) + collapsible context paragraph
### Licensing
- **Content:** CC BY-SA-NC 4.0
- **Code:** MIT
---
## III. Deployment & Infrastructure
### Deployment Pipeline
```
[Local machine] [Arch Linux VPS / DreamHost]
content/*.md
cabal run site -- build nginx serving
↓ /var/www/levineuwirth.org/
pagefind --site _site
rsync -avz --delete \
_site/ \
vps:/var/www/levineuwirth.org/ ──→ Live site
```
```makefile
build:
@git add content/
@git diff --cached --quiet || git commit -m "auto: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
@date +%s > data/build-start.txt
@./tools/convert-images.sh # WebP conversion (skipped if cwebp absent)
cabal run site -- build
pagefind --site _site
@if [ -d .venv ]; then \
uv run python tools/embed.py || echo "Warning: embedding failed"; \
fi
> IGNORE.txt # clear stability pins after each build
@BUILD_END=$(date +%s); BUILD_START=$(cat data/build-start.txt); \
echo $((BUILD_END - BUILD_START)) > data/last-build-seconds.txt
sign:
@./tools/sign-site.sh # detach-sign every _site/**/*.html; requires passphrase cached via preset-signing-passphrase.sh
deploy: build sign
rsync -avz --delete _site/ vps:/var/www/levineuwirth.org/
watch:
cabal run site -- watch
clean:
cabal run site -- clean
download-model:
@./tools/download-model.sh # fetch quantized ONNX model to static/models/ (once per machine)
convert-images:
@./tools/convert-images.sh # manual trigger; also runs in build
```
### Hosting Timeline
1. **Immediate:** Deploy to DreamHost (rsync static files)
2. **Phase 5:** Provision Arch VPS (Hetzner), configure nginx + certbot, migrate DNS
### VPS: nginx config (Arch Linux)
```nginx
server {
listen 443 ssl http2;
server_name levineuwirth.org www.levineuwirth.org;
root /var/www/levineuwirth.org;
# TLS (managed by certbot)
ssl_certificate /etc/letsencrypt/live/levineuwirth.org/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/levineuwirth.org/privkey.pem;
# cdn.jsdelivr.net required for transformers.js (semantic search library).
# Model weights served same-origin from /models/ — connect-src stays 'self'.
add_header Content-Security-Policy "default-src 'self'; script-src 'self' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; img-src 'self'; font-src 'self';" always;
gzip on;
gzip_types text/html text/css application/javascript application/json image/svg+xml;
location ~* \.(woff2|css|js|svg|png|jpg|webp)$ {
expires 1y;
add_header Cache-Control "public, immutable";
}
location ~* \.html$ {
expires 1h;
add_header Cache-Control "public, must-revalidate";
}
try_files $uri $uri.html $uri/ =404;
error_page 404 /404.html;
}
server {
listen 80;
server_name levineuwirth.org www.levineuwirth.org;
return 301 https://$host$request_uri;
}
```
---
## IV. Repository Structure
```
levineuwirth.org/
├── content/
│ ├── essays/
│ │ └── test-essay.md # Feature test document
│ ├── blog/
│ ├── music/
│ │ └── {slug}/
│ │ ├── index.md # Composition frontmatter + program notes
│ │ ├── scores/ # LilyPond SVG pages + PDF
│ │ └── audio/ # Per-movement MP3s
│ └── *.md # Standalone pages (me, colophon, etc.)
├── static/
│ ├── css/
│ │ ├── base.css # CSS variables, palette, dark mode
│ │ ├── typography.css # Spectral OT features, dropcaps, smallcaps, link icons
│ │ ├── layout.css # 3-column layout, responsive breakpoints
│ │ ├── sidenotes.css # Sidenote positioning
│ │ ├── popups.css # Link preview popup styles
│ │ ├── syntax.css # Monochrome code highlighting (JetBrains Mono)
│ │ ├── components.css # Nav (incl. settings panel), TOC, metadata, citations, collapsibles
│ │ ├── viz.css # Visualization figure layout (.viz-figure, .vega-container, .viz-caption)
│ │ ├── gallery.css # Exhibit system + annotation callouts
│ │ ├── selection-popup.css # Text-selection toolbar
│ │ ├── annotations.css # User highlight marks + annotation tooltip
│ │ ├── images.css # Figure layout, captions, lightbox overlay
│ │ ├── score-reader.css # Full-page score reader layout
│ │ ├── catalog.css # Music catalog page (`/music/`)
│ │ └── print.css # Print stylesheet (media="print")
│ ├── js/
│ │ ├── theme.js # Dark/light toggle (sync, not deferred)
│ │ ├── sidenotes.js # Written from scratch — collision avoidance, hover/focus
│ │ ├── toc.js # Sticky TOC + scroll tracking + animated collapse
│ │ ├── nav.js # Portal row expand/collapse + localStorage
│ │ ├── collapse.js # Section collapsing with localStorage persistence
│ │ ├── citations.js # Citation hover previews
│ │ ├── gallery.js # Exhibit overlay + annotation toggle
│ │ ├── popups.js # Link preview popups (internal, Wikipedia, citations)
│ │ ├── settings.js # Settings panel (theme, text size, focus mode, reduce motion, print)
│ │ ├── selection-popup.js # Context-aware text-selection toolbar
│ │ ├── annotations.js # localStorage highlight/annotation engine (UI deferred)
│ │ ├── score-reader.js # Score reader: page-turn, movement jumps, deep linking
│ │ ├── viz.js # Vega-Lite render + dark mode re-render via MutationObserver
│ │ ├── semantic-search.js # Client-side semantic search: transformers.js + Float32Array cosine ranking
│ │ ├── search.js # Pagefind UI init + ?q= pre-fill + search timing (#search-timing)
│ │ └── prism.min.js # Syntax highlighting
│ ├── fonts/ # Self-hosted WOFF2 (subsetted with OT features)
│ ├── gpg/
│ │ └── pubkey.asc # Ed25519 signing subkey public key (master: CD90AE96…; subkey: C9A42A6F…)
│ ├── models/ # Self-hosted ONNX model (gitignored; run: make download-model)
│ │ └── all-MiniLM-L6-v2/ # ~22 MB quantized — served at /models/ for semantic-search.js
│ └── images/
│ └── link-icons/ # SVG icons for external link classification
│ ├── external.svg
│ ├── wikipedia.svg
│ ├── github.svg
│ ├── arxiv.svg
│ └── doi.svg
├── templates/
│ ├── default.html # Outer shell: nav, head, footer JS
│ ├── essay.html # 3-column layout with TOC
│ ├── composition.html # Music landing page (metadata block, movements, body, recording player)
│ ├── music-catalog.html # Music catalog index (`/music/`)
│ ├── score-reader.html # Minimal score reader body (top bar + SVG stage)
│ ├── score-reader-default.html # Minimal HTML shell for score reader (no nav/footer)
│ ├── blog-post.html
│ ├── page.html # Simple standalone pages
│ ├── essay-index.html
│ ├── blog-index.html
│ ├── tag-index.html
│ └── partials/
│ ├── head.html # CSS, conditional JS (citations, collapse)
│ ├── nav.html # Two-row nav with portals
│ ├── footer.html
│ ├── metadata.html # Essay metadata block (top)
│ └── page-footer.html # Essay footer (bibliography, backlinks)
├── build/
│ ├── Main.hs # Entry point
│ ├── Site.hs # Hakyll rules (all routes + Atom feed)
│ ├── Compilers.hs # Pandoc compiler wrappers
│ ├── Contexts.hs # Template contexts (word-count, reading-time, bibliography)
│ ├── Citations.hs # citeproc pipeline: Cite→superscript + bibliography HTML
│ ├── Filters.hs # Re-exports all filter modules
│ ├── Filters/
│ │ ├── Typography.hs # Smart quotes, dashes
│ │ ├── Sidenotes.hs # Footnote → sidenote conversion
│ │ ├── Dropcaps.hs # Decorative first-letter drop caps
│ │ ├── Smallcaps.hs # Smallcaps via smcp OT feature
│ │ ├── Wikilinks.hs # [[wikilink]] syntax
│ │ ├── Links.hs # External link classification + data-link-icon attributes
│ │ ├── Math.hs # Simple LaTeX → Unicode conversion
│ │ ├── Code.hs # Prepend language- prefix for Prism.js
│ │ ├── Images.hs # Lazy loading, lightbox data-attributes, WebP <picture> wrapper for local raster images
│ │ ├── Score.hs # Score fragment SVG inlining + currentColor replacement
│ │ └── Viz.hs # Visualization IO filter: runs Python scripts, inlines SVG / Vega-Lite JSON
│ ├── Authors.hs # Author-as-tag system (slugify, authorLinksField, author pages)
│ ├── Backlinks.hs # Two-pass build-time backlinks with context paragraph extraction
│ ├── Catalog.hs # Music catalog: featured works + grouped-by-category HTML rendering
│ ├── Stability.hs # Git-based stability auto-calculation + last-reviewed derivation
│ ├── Metadata.hs # Stub (Phase 2+)
│ ├── Tags.hs # Hierarchical tag system
│ ├── Pagination.hs # 20/page for blog + tag indexes
│ └── Utils.hs # Shared helpers (wordCount, readingTime)
├── data/
│ ├── bibliography.bib # BibTeX references
│ ├── chicago-notes.csl # CSL style (in-text, Chicago Author-Date)
│ └── (future: embeddings.json, similar-links.json)
├── tools/
│ ├── subset-fonts.sh
│ ├── viz_theme.py # Matplotlib monochrome helpers (apply_monochrome, save_svg, LINESTYLE_CYCLE)
│ ├── sign-site.sh # Detach-sign every _site/**/*.html → .html.sig (called by `make sign`)
│ ├── preset-signing-passphrase.sh # Cache signing subkey passphrase in gpg-agent (run once per boot)
│ ├── download-model.sh # Fetch quantized ONNX model to static/models/ (run once per machine)
│ ├── convert-images.sh # Convert JPEG/PNG → WebP companions via cwebp (runs automatically in build)
│ └── embed.py # Build-time embedding pipeline: similar-links + semantic search index
├── levineuwirth.cabal
├── cabal.project
├── cabal.project.freeze
├── Makefile
└── CLAUDE.md
```
---
## V. Implementation Phases
### Phase 1: Foundation ✓
- [x] Init Hakyll project, modular Haskell build system
- [x] Font subsetting + self-hosting (Spectral, Fira Sans, JetBrains Mono)
- [x] CSS: base (palette, variables, dark mode), typography (Spectral features), layout (3-column), sidenotes
- [x] `sidenotes.js` — written from scratch (not adopted; see Implementation Notes)
- [x] Two-row navigation with expandable portals
- [x] Templates: default, essay, blog-post, index
- [x] Dark/light toggle with `localStorage` + `prefers-color-scheme`
- [x] Basic Pandoc pipeline (Markdown → HTML, smart typography)
- [x] Deploy to DreamHost via rsync — deployed to Hetzner VPS instead
### Phase 2: Content Features ✓
- [x] Pandoc filters: sidenotes, dropcaps, smallcaps, wikilinks, typography, link classification, code, math
- [x] Interactive sticky TOC — IntersectionObserver, animated expand/collapse, page-title display, auto-collapse on scroll
- [x] Citation system — numbered superscript markers, hover preview, bibliography + Further Reading sections
- [x] Monochrome syntax highlighting (Prism.js + `Filters.Code`)
- [x] Collapsible h2/h3 sections (`collapse.js`) — `max-height` transition, localStorage persistence
- [x] Hierarchical tag system + tag index pages
- [x] Pagination (blog index and tag pages, 20/page)
- [x] Metadata: YAML frontmatter + auto-computed word count / reading time
- [x] Single Atom feed (`/feed.xml`, all content, sorted by date)
- [x] External link icons (SVG mask-image, domain-classified via `Filters.Links`)
- [x] Gallery / Exhibit system (`gallery.js`, `gallery.css`) — added (not in original spec)
### Phase 3: Rich Interactions
- [x] Link preview popups (`popups.js`) — internal page previews (title, abstract, authors, tags, reading time), Wikipedia excerpts, citation previews; relative-URL fix for index pages
- [x] Pagefind search (`/search.html`) — `search.js` pre-fills from `?q=` param; `#search-timing` shows elapsed ms (mono, faint) via `MutationObserver` on search results subtree
- [x] Author system — authors treated as tags; `build/Authors.hs`; author pages at `/authors/{slug}/`; `authorLinksField` in all contexts; defaults to Levi Neuwirth
- [x] Settings panel — `settings.js` + `settings.css` section in `components.css`; theme, text size (3 steps), focus mode, reduce motion, print; all state in `localStorage`; `theme.js` restores all settings before first paint
- [x] Selection popup — `selection-popup.js` / `selection-popup.css`; context-aware toolbar appears 450 ms after text selection; see Implementation Notes
- [x] Print stylesheet — `print.css` (media="print"); single-column, light colors, sidenotes as indented blocks, external URLs shown
- [x] Current page (`/current.html`) — now-page; added to primary nav
- [x] Annotations — `annotations.js` / `annotations.css`; localStorage storage, text re-anchoring, highlight marks, tooltip with delete; color-picker UI in selection popup (four swatches + optional note field)
### Phase 4: Creative Content & Polish
- [x] Image handling (lazy load, lightbox, figures, WebP `<picture>` wrapper for local raster images)
- [x] Homepage (replaces standalone index; gateway + curated recent content)
- [x] Poetry typesetting — codex reading mode (`reading.html`, `reading.css`, `reading.js`); `poetryCompiler` with `Ext_hard_line_breaks`; narrower measure, stanza spacing, drop-cap suppressed
- [x] Fiction reading mode — same codex layout; `fictionCompiler`; chapter drop caps + smallcaps lead-in via `h2 + p::first-letter`; reading mode infrastructure shared with poetry
- [x] Music section — score fragment system (A): inline SVG excerpts (motifs, passages) integrated into the gallery/exhibit system; named, TOC-listed, focusable in the shared overlay alongside equations; authored via `{.score-fragment score-name="..." score-caption="..."}` fenced-div; SVG inlined at build time by `Filters.Score`; black fills/strokes replaced with `currentColor` for dark-mode; see Implementation Notes
- [x] Music section — composition landing pages + full score reader (C): two-URL architecture per composition; `/music/{slug}/` (rich prose landing page with movement list, audio players, inline score fragments) and `/music/{slug}/score/` (minimal dedicated reader); Hakyll `version "score-reader"` mechanism; `compositionCtx` with `slug`, `score-url`, `has-score`, `score-page-count`, `score-pages` list, `has-movements`, `movements` list (Aeson-parsed nested YAML); `score-reader-default.html` minimal shell; `score-reader.js` (page navigation, movement jumps, `?p=` deep linking, preloading, keyboard); `score-reader.css`; dark mode via `filter: invert(1)`; see Implementation Notes
- [x] Accessibility audit — skip link, TOC collapsed-link tabbing (`visibility: hidden`), section-toggle focus visibility, lightbox/gallery/settings focus restoration, popup `aria-hidden`, metadata nav wrapping, footer `onclick` removal; settings panel focus-steal bug fixed (focus only returns to toggle when it was inside the panel, preventing interference with text-selection popup)
- [~] Visualization pipeline — Pandoc filter approach (`Filters.Viz`): `.figure` fenced divs run `python3 <script>`, capture SVG stdout, inline with `currentColor` replacement; `.visualization` fenced divs embed Vega-Lite JSON in a `<script type="application/json" class="vega-spec">` tag rendered by `viz.js`; `viz: true` frontmatter gates CDN Vega/Vega-Lite/Vega-Embed + `viz.js`; dark mode re-renders via `MutationObserver`; `tools/viz_theme.py` provides matplotlib monochrome helpers. Infrastructure complete; not yet used in production content.
- [ ] Content migration — migrate existing essays, poems, fiction, and music landing pages from prior formats into `content/`
### Phase 5: Infrastructure & Advanced
- [x] **Arch Linux VPS + nginx + certbot + DNS migration** — Hetzner VPS provisioned, Arch Linux installed, nginx configured (config in §III), TLS cert via certbot, DNS migrated from DreamHost. `make deploy` pushes to GitHub and rsyncs to VPS.
- [x] **Semantic embedding pipeline** — Implemented. See Phase 6 "Embedding-powered similar links" and "Full-text semantic search".
- [x] **Backlinks with context** — Two-pass build-time system (`build/Backlinks.hs`). Pass 1: `version "links"` compiles each page lightly (wikilinks preprocessed, links + context extracted, serialised as JSON). Pass 2: `create ["data/backlinks.json"]` inverts the map. `backlinksField` in `essayCtx` / `postCtx` loads the JSON and renders `<details>`-collapsible per-entry lists. `popups.js` excludes `.backlink-source` links from the preview popup. Context paragraph uses `runPure . writeHtml5String` on the surrounding `Para` block. See Implementation Notes.
- [ ] **Link archiving** — For all external links in `data/bibliography.bib` and in page bodies, check availability and save snapshots (Wayback Machine `save` API or local archivebox instance). Store archive URLs in `data/link-archive.json`; `Filters.Links` injects `data-archive-url` attributes; `popups.js` falls back to the archive if the live URL returns 404.
- [ ] **Self-hosted git (Forgejo)** — Run Forgejo on the VPS. Mirror the build repo. Link from the colophon. Not essential; can remain on GitHub indefinitely.
- [ ] **Reader mode** — Distraction-free reading overlay: hides nav, TOC, sidenotes; widens the body column to ~70ch; activated via a keyboard shortcut or settings panel toggle. Distinct from focus mode (which affects the nav) — reader mode affects the content layout.
- [ ] **HTTP/3 + QUIC** — nginx 1.25+ supports HTTP/3 via `listen 443 quic reuseport` + `http3 on` + `Alt-Svc` header. Requires UDP 443 open in Hetzner's Cloud Firewall. Deferred: latency is currently geographic RTT, not server processing; gains would be modest for a static site from a single DC. CDN alternatives (Bunny.net, multi-region Hetzner with GeoDNS) would address the root cause but raise ethical or operational complexity concerns. Revisit if latency becomes a real user complaint. Server-side improvements (brotli pre-compression, `open_file_cache`) are a lower-cost step first.
### Phase 6: Deferred Features
- [x] **Annotation UI**`annotations.js` / `annotations.css`: localStorage storage, text-stream re-anchoring, four highlight colors (amber/sage/steel/rose), hover tooltip with delete. Selection popup "Annotate" button triggers a color-swatch + optional note picker; Enter or "Highlight" button commits; Escape cancels. Picker positioned above the selection, same inverted style as the tooltip. Settings panel includes a "Clear Annotations" button (with confirmation) that wipes all annotations site-wide via `Annotations.clearAll()`.
- [~] **Visualization pipeline** — Implemented as a Pandoc IO filter (`Filters.Viz`), not a per-slug Hakyll rule. See Phase 4 entry and Implementation Notes. Infrastructure complete; production content pending.
- [x] **Music catalog page**`/music/` index listing all compositions grouped by instrumentation category (orchestral → chamber → solo → vocal → choral → electronic → other), with an optional Featured section. Auto-generated from composition frontmatter by `build/Catalog.hs`; renders HTML in Haskell (same pattern as backlinks). Category, year, duration, instrumentation, and ◼/♫ indicators for score/recording availability. `content/music/index.md` provides prose intro + abstract. Template: `templates/music-catalog.html`. CSS: `static/css/catalog.css`. Context: `musicCatalogCtx` (provides `catalog: true` flag, `featured-works`, `has-featured`, `catalog-by-category`).
- [x] **Score reader swipe gestures**`touchstart`/`touchend` listeners on `#score-reader-stage` with passive: true. Threshold: ≥ 50 px horizontal, < 30 px vertical drift. Left swipe next page; right swipe previous page.
- [x] **Full-piece audio on composition pages** `recording` frontmatter key (path relative to the composition directory). Rendered as a full-width `<audio>` player in `composition.html`, above the per-movement list. Styled via `.comp-recording` / `.comp-recording-audio` in `components.css`. Per-movement `<audio>` players and `.comp-btn` / `.comp-movement-*` styles also added in the same pass.
- [x] **RSS/feed improvements** `/feed.xml` now includes compositions (`content/music/*/index.md`) alongside essays, posts, fiction, poetry. New `/music/feed.xml` (compositions only, `musicFeedConfig`). Compositions already had `"content"` snapshots saved by the landing-page rule; no compiler changes needed.
- [ ] **Pagefind improvements** Currently a basic full-text search. Consider: sub-result excerpts, portal-scoped search filters, weighting by `importance` frontmatter field.
- [ ] **Audio essays / podcast feed** Record readings of select essays. Embed a native `<audio>` player at the top of the essay page, activated by an `audio` frontmatter key (path to MP3, relative to the content dir). Generate a separate `/podcast.xml` Atom feed with `<enclosure>` elements pointing to the MP3s so readers can subscribe in any podcast app. Stretch goal: a paragraph-sync mode where the player emits `timeupdate` events that highlight the paragraph currently being read requires a `data/audio/{slug}-timestamps.json` file mapping paragraph indices to timestamps, authored manually or via forced-alignment tooling (e.g. `whisper` with word timestamps).
- [x] **Build telemetry page** `/build/` page generated at build time. `build/Stats.hs` loads all content items by type, reads `"word-count"` snapshots, aggregates counts/words/reading-time per type, computes word-length distribution (5 buckets), and reads top-15 tags from the `Tags` object. Makefile writes `date +%s` to `data/build-start.txt` before Hakyll runs; after pagefind, computes elapsed and writes `data/last-build-seconds.txt` (read on next build). CSS in `static/css/build.css` (flex bar chart, tabular-nums table, grid dl); loaded conditionally via `$if(build)$` in head.html.
- [x] **Epistemic profile** Replaces the old `certainty` / `importance` fields with a richer multi-axis system. **Compact** (always visible in footer): status chip · confidence % · importance dots · evidence dots. **Expanded** (`<details>`): stability (auto) · scope · novelty · practicality · last reviewed · confidence trend. Auto-calculation in `build/Stability.hs` via `git log --follow`; `IGNORE.txt` pins overrides. See Metadata section and Implementation Notes for full schema and vocabulary.
- [ ] **Writing statistics dashboard** — A `/stats` page computed entirely at build time from the corpus. Contents: total word count across all content types, essay/post/poem count, words written per month rendered as a GitHub-style contribution heatmap (SVG generated by Haskell or a Python script), average and median essay length, longest essay, most-cited essay (by backlink count), tag distribution as a treemap, reading-time histogram, site growth over time (cumulative word count by date). All data collected during the Hakyll build from compiled items and their snapshots; serialized to `data/stats.json` and rendered into a dedicated `stats.html` template.
- [x] **Memento mori** — Implemented at `/memento-mori/` as a full standalone page. 90×52 grid of weeks anchored to birthday anniversaries (nested year/week loop via `setFullYear`; week 52 stretched to eve of next birthday to absorb 365th/366th days). Week popup shows dynamic day-count and locale-derived day names. Score fragment (bassoon, `content/memento-mori/scores/bsn.svg`) inlined via `Filters.Score`. Linked from footer (MM).
- [x] **Embedding-powered similar links**`tools/embed.py` encodes every `#markdownBody` page with `all-MiniLM-L6-v2` (384 dims, unit-normalised), builds a FAISS `IndexFlatIP`, queries top-5 neighbours per page (cosine ≥ 0.30), writes `data/similar-links.json`. `build/SimilarLinks.hs` provides `similarLinksField` in `essayCtx`/`postCtx` with Hakyll dependency tracking; "Related" section rendered in `page-footer.html`. Staleness check skips re-embedding when JSON is newer than all HTML. Called by `make build` via `uv run`; non-fatal if `.venv` absent. See Implementation Notes.
- [x] **Bidirectional backlinks with context** — See Phase 5 above; implemented with full context-paragraph extraction. Merged with the Phase 5 stub.
- [x] **Signed pages / content integrity**`make sign` (called by `make deploy`) runs `tools/sign-site.sh`: walks `_site/**/*.html`, produces a detached ASCII-armored `.sig` per page. Signing uses a dedicated Ed25519 subkey isolated in `~/.gnupg-signing/` (master `sec#` stub + `ssb` signing subkey). Passphrase cached 24 h in the signing agent via `tools/preset-signing-passphrase.sh` + `gpg-preset-passphrase`; `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase`. Footer "sig" link points to `$url$.sig`; hovering shows the ASCII armor via `popups.js` `sigContent` provider. Public key at `static/gpg/pubkey.asc` → served at `/gpg/pubkey.asc`. Fingerprints: master `CD90AE96…B5C9663`; signing subkey `C9A42A6F…2707066` (keygrip `619844703EC398E70B0045D7150F08179CFEEFE3`). See Implementation Notes.
- [x] **Self-hosted semantic search model**`tools/download-model.sh` fetches the quantized ONNX model (`all-MiniLM-L6-v2`, ~22 MB, 5 files) from HuggingFace into `static/models/all-MiniLM-L6-v2/` (gitignored). `semantic-search.js` sets `env.localModelPath = '/models/'` and `env.allowRemoteModels = false` before calling `pipeline()`, so all model weight fetches are same-origin. The CDN import of transformers.js itself still requires `cdn.jsdelivr.net` in `script-src`; `connect-src` stays `'self'`. `make download-model` is a one-time setup step per machine. See Implementation Notes.
- [x] **Responsive images (WebP)**`tools/convert-images.sh` walks `static/` and `content/`, calls `cwebp -q 85` to produce `.webp` companions alongside every JPEG/PNG (skips existing; exits gracefully if `cwebp` absent). `make build` runs it before Hakyll so WebP files are present when `static/**` is copied. `build/Filters/Images.hs` detects local raster images and emits `RawInline "html"` `<picture>` elements with a `<source srcset="…webp" type="image/webp">` and an `<img>` fallback; SVG, external URLs, and `data:` URIs pass through as plain `<img>`. Generated `.webp` files gitignored via `static/**/*.webp` / `content/**/*.webp`. See Implementation Notes.
- [x] **Full-text semantic search**`tools/embed.py` also produces paragraph-level embeddings (same `all-MiniLM-L6-v2` model, same pass): walks `<p>/<li>/<blockquote>` in `#markdownBody`, tracks nearest preceding heading for context, writes `data/semantic-index.bin` (raw Float32, N×384) + `data/semantic-meta.json` ([{url, title, heading, excerpt}]). Client: `static/js/semantic-search.js` dynamically imports `@xenova/transformers@2` from CDN, embeds the query, brute-force cosine-ranks all paragraph vectors (fast at <5k paragraphs in JS), renders top-8 results as title + section heading + excerpt. Surfaced on `/search.html` as a **Keyword / Semantic** tab strip; active tab persists in `localStorage` (keyword default on first visit). Both tabs on same page; `semantic-search.js` loaded within existing `$if(search)$` block. See Implementation Notes.
---
## VI. Implementation Notes
### sidenotes.js — Written from scratch
The spec called for adopting Said Achmiz's `sidenotes.js` directly. Instead a purpose-built version was written for the `<span class="sidenote">` structure produced by `Filters.Sidenotes`. Features: JS collision avoidance (`positionSidenotes`), bidirectional hover highlight, click-to-focus (sticky highlight on wide viewport, anchor scroll fallback on narrow), document-click dismissal. `window.resize` is used as the reposition signal; `collapse.js` dispatches it after each section transition.
### Gallery / Exhibit system — Added (not in original spec)
- **Exhibits** (`.exhibit--equation`, `.exhibit--proof`): always-visible inline blocks with overlay zoom on click.
- **Annotations** (`.annotation--static`, `.annotation--collapsible`): editorial callout boxes.
- **TOC integration**: exhibits are listed under their parent heading.
- Implementation: `gallery.js`, `gallery.css`; Pandoc fenced-div syntax (`:::`) to avoid the 4-space code block trap.
### LaTeX Math — Client-side KaTeX
The spec described pure build-time SSR. In practice: Pandoc outputs `class="math"` spans, KaTeX renders client-side from a deferred script. Fully static (no server per request). Revisit if build-time SSR becomes important.
### Citation pipeline — key subtleties
1. **`Cite` nodes, not `Span` nodes.** `processCitations` with `class="in-text"` CSL does *not* convert `Cite` nodes to `Span class="citation"` nodes in the Pandoc AST it only populates their inline content and creates the refs div. The HTML writer wraps them in `<span class="citation">` at write time. Our `Citations.hs` must match `Cite` nodes directly.
2. **Hakyll strips YAML frontmatter.** Hakyll reads frontmatter separately; the body passed to `readPandocWith` has no YAML block, so Pandoc `Meta` is empty. `further-reading` keys are read from Hakyll's metadata API (`lookupStringList`) in `Compilers.hs` and passed explicitly to `Citations.applyCitations`.
3. **`nocite` format.** Each further-reading key must be a *separate* `Cite` node with `AuthorInText` mode and non-empty fallback content matching what pandoc produces from `"@key1 @key2"` in YAML. A single `Cite` node with multiple citations is not recognized by citeproc's nocite processing.
4. **`collectCiteOrder` queries blocks only**, not the full `Pandoc` (which includes metadata). Querying metadata would pick up the injected `nocite` `Cite` nodes and incorrectly classify further-reading entries as inline citations.
### External link icons
Implemented via `data-link-icon` and `data-link-icon-type="svg"` attributes set by `Filters.Links`. CSS uses `mask-image: url(...)` with `background-color: currentColor` so icons inherit the text color and work in dark mode. Icons in `static/images/link-icons/` as SVG files.
### Tags — Hierarchical, no namespace
Tags are slash-separated (`research/mathematics`). A tag is auto-expanded into all ancestor prefixes so that `/research/` aggregates all `research/*` content. Tag pages live directly at `/<tag>/` with no `/tags/` namespace.
### Collapsible sections
`collapse.js` wraps each h2/h3's following siblings in a `.section-body` div and injects a `.section-toggle` button into the heading. State is persisted per heading in `localStorage` under `section-collapsed:<id>`. After each `transitionend`, dispatches `window.resize` to retrigger sidenote positioning. Headings themselves are never hidden, preserving `IntersectionObserver` targets for `toc.js`.
### Atom feed
`/feed.xml` covers all essays and blog posts (up to 30 most recent). A `"content"` snapshot is saved in `Site.hs` *before* template application, so the feed body is just the compiled article HTML (not the full page with nav/footer). Dates from the `date` frontmatter key, formatted as RFC 3339.
### Author system
Authors are treated as a second tag dimension. `build/Authors.hs` provides `buildAllAuthors` (a `buildTagsWith` call keyed to `authors` frontmatter) and `authorLinksField` (a `listFieldWith` context that defaults to `["Levi Neuwirth"]` when no `authors` key is present, so all unattributed content contributes to his author page). Author pages live at `/authors/{slug}/`. `slugify` lowercases and hyphenates; pipe-separated values (`"Name | role"`) strip the role portion via `nameOf`.
### Settings panel
`settings.js` manages four independent settings, all persisted in `localStorage`:
- **Theme** (`data-theme` on `<html>`): light / dark, with `syncThemeButtons()` toggling `.is-active`.
- **Text size**: three steps `[20, 23, 26]` px (small / default / large), written as `--text-size` CSS custom property on `<html>`. Default index is 1 (23 px).
- **Focus mode** (`data-focus-mode` on `<html>`): hides TOC, fades header to 7% opacity until hover.
- **Reduce motion** (`data-reduce-motion` on `<html>`): collapses all `animation-duration` / `transition-duration` to `0.01ms`.
`theme.js` (sync, not deferred) restores all four attributes from `localStorage` before first paint to avoid flash.
### Selection popup
`selection-popup.js` / `selection-popup.css`. A toolbar appears 450 ms after any text selection of 2 characters. Context is detected from the DOM ancestors of the selection range:
| Context | Detection | Buttons |
|---------|-----------|---------|
| **code** (known lang) | `closest('pre, code, .sourceCode')` + `language-*` class | Copy · \<MDN / Hoogle / Docs\> |
| **code** (unknown) | same, no `language-*` | Copy |
| **math** | `closest('.math, .katex')` + `Range.intersectsNode` fallback | Copy · nLab · OEIS · Wolfram |
| **prose** (multi-word) | fallback | BibTeX · Copy · DuckDuckGo · Here · Wikipedia |
| **prose** (single word) | `!/\s/.test(text)` | BibTeX · Copy · Define · DuckDuckGo · Here · Wikipedia |
16 languages are mapped to documentation providers (MDN, Hoogle, docs.python.org, doc.rust-lang.org, etc.) via `DOC_PROVIDERS`. **BibTeX** generates a `@online{...}` BibLaTeX entry (key = `lastname + year + firstWord`; selected text in `note={\enquote{...}}`; year scraped from `#version-history li`). **Here** opens `/search.html?q=` in a new tab. **Define** opens English Wiktionary. Popup positions above the selection, flips below if insufficient space; hides on scroll, outside mousedown, or Escape.
### Reading mode (poetry + fiction)
Shared codex layout for creative content. `templates/reading.html` omits the TOC and emits a `<div id="reading-progress">` progress bar instead. `body.reading-mode` (set via `$if(reading)$` in `default.html`) triggers a slightly warmer background (`#fdf9f1` / `#1c1917`). Poetry pages (`body.reading-mode.poetry`) use a 52ch measure, 1.85 line-height, stanza paragraph spacing, and suppressed dropcap/smallcaps lead-in; `poetryCompiler` enables `Ext_hard_line_breaks` so each source newline becomes `<br>`. Fiction pages (`body.reading-mode.fiction`) use a 62ch measure, centered Fira Sans smallcaps chapter headings, and a dropcap + smallcaps lead-in on each `h2 + p`. Progress bar is driven by `reading.js` (scroll position `width` on `#reading-progress`). CSS and JS loaded conditionally via `$if(reading)$`. Content goes in `content/poetry/*.md` and `content/fiction/*.md`; tags `poetry` / `fiction` route items to the correct portal and library section.
### Score fragment system (option A)
`Filters/Score.hs` walks the Pandoc AST for `Div` nodes with class `score-fragment`. It reads the referenced SVG from disk (path resolved relative to the source file's directory via `getResourceFilePath` + `takeDirectory`), replaces hardcoded black fill/stroke values with `currentColor` (6-digit before 3-digit to prevent partial matches on `#000` vs `#000000`), and emits a `RawBlock "html"` `<figure>` carrying `class="score-fragment exhibit"`, `data-exhibit-name`, and `data-exhibit-type="score"` for gallery.js TOC integration. SVGs are inlined at build time and never served as separate files. `gallery.js` discovers `.score-fragment` elements via `discoverFocusableScores`, adds them to the shared `focusables[]` array with `type: 'score'`, and the overlay's `renderOverlay` branches on type score path clones the SVG into the overlay body (no font-size loop); math path keeps the KaTeX re-render. Overlay body receives class `is-score` for tighter horizontal padding (`2rem 1.5rem` vs `3.5rem 4.5rem`). CSS: background rect removed via `svg > rect:first-child { fill: none }`, SVG responsive via `width: 100%; height: auto`, dark mode via `color: var(--text)`.
**Authoring syntax:**
```markdown
::: {.score-fragment score-name="Main Theme, mm. 18" score-caption="The opening gesture."}
![](scores/main-theme.svg)
:::
```
### Music — Composition landing pages + full score reader (option C)
**Implemented.** Two URLs per composition from one source directory.
#### Architecture
| URL | Templates | Purpose |
|-----|-----------|---------|
| `/music/{slug}/` | `composition.html` + `default.html` | Rich prose landing page |
| `/music/{slug}/score/` | `score-reader.html` + `score-reader-default.html` | Minimal page-turn reader |
The Hakyll `version "score-reader"` mechanism compiles the same `index.md` twice: once as the landing page (default version) and once as the reader (`customRoute` to `music/{slug}/score/index.html`). Score reader uses `makeItem ""` the prose body is irrelevant; only frontmatter fields are needed.
#### Source directory layout
```
content/music/symphonic-dances/
├── index.md ← composition frontmatter + program notes prose
├── scores/
│ ├── page-01.svg ← one file per score page (LilyPond SVG output)
│ ├── page-02.svg
│ └── symphonic-dances.pdf
└── audio/
├── movement-1.mp3
└── movement-2.mp3
```
SVG, MP3, and PDF files are copied to `_site/` via `copyFileCompiler`. Score reader SVGs are served as separate `<img>` files inlining a full orchestral score is impractical.
#### Frontmatter schema
```yaml
---
title: "Symphonic Dances with Claude"
date: 2026-03-01
abstract: >
A five-movement work for orchestra.
tags: [music]
instrumentation: "orchestra (2+picc.2+ca.2+bcl.2 — 4.3.3.1 — timp+3perc — hp — str)"
duration: "ca. 24'"
premiere: "2026-05-01"
commissioned-by: "—" # optional
pdf: scores/symphonic-dances.pdf # optional; path relative to composition dir
score-pages: # required for reader; landing page works without it
- scores/page-01.svg
- scores/page-02.svg
movements: # optional; omit entirely if no movement structure
- name: "I. Allegro con brio"
page: 1 # 1-indexed starting page in the reader
duration: "8'"
audio: audio/movement-1.mp3 # optional; omit if no recording
- name: "II. Adagio cantabile"
page: 8
duration: "10'"
---
```
#### `compositionCtx` fields
Extends `essayCtx` (all essay fields available `abstract`, `toc`, `word-count`, etc.). Additional fields:
| Field | Type | Value |
|-------|------|-------|
| `$slug$` | string | `takeFileName . takeDirectory` of source path |
| `$score-url$` | string | `/music/{slug}/score/` |
| `$has-score$` | boolean | present when `score-pages` non-empty |
| `$score-page-count$` | string | `show (length score-pages)` |
| `$score-pages$` | list | each item: `$score-page-url$` (absolute URL) |
| `$has-movements$` | boolean | present when `movements` non-empty |
| `$movements$` | list | each item: `$movement-name$`, `$movement-page$`, `$movement-duration$`, `$movement-audio$`, `$has-audio$` |
| `$composition$` | flag | `"true"` gates `score-reader.css` in `head.html` |
`movements` is parsed from the nested YAML using `Data.Aeson.KeyMap` (Aeson 2.x API). `score-pages` are resolved to absolute URLs (`/music/{slug}/{path}`) inside the context so the `data-pages` attribute in the score reader template needs no further processing.
#### Score reader
The reader template embeds page URLs as a comma-separated `data-pages` attribute on `#score-reader-stage`. `score-reader.js` splits on commas and filters empties.
`score-reader-default.html` loads only: `base.css`, `components.css` (for settings panel styles), `score-reader.css`, `theme.js` (sync, pre-paint), `settings.js` (theme toggle in the top bar), `score-reader.js`. No nav, no TOC, no sidenotes, no popups, no gallery, no lightbox.
`score-reader.js` behaviors:
- `navigate(page)`: swaps `<img src>`, updates counter, toggles prev/next disabled states, updates active movement button (last movement whose start page current page), calls `history.replaceState` for `?p=` deep linking, preloads ±1 pages.
- Keyboard: `ArrowLeft`/`ArrowRight`/`ArrowUp`/`ArrowDown` for page turns; `Escape` `history.back()`. Suppressed when settings panel is open.
- Dark mode: `[data-theme="dark"] .score-page { filter: invert(1); }` clean for pure B&W notation; revisit if LilyPond embeds colored elements.
- Mobile: score scrolls horizontally at 640px (`min-width: 600px` on `<img>`); arrow buttons hidden; pinch-to-zoom is native.
#### Known limitations / future work
- **Full-piece audio**: a `recording` frontmatter key for a complete performance would add a top-level audio player on the landing page. Not yet implemented.
- **LilyPond margin cropping**: the `viewBox` drives scaling but LilyPond's default page includes margins. May need per-composition `viewBox` overrides or CSS `object-fit` once real scores are tested.
### Backlinks — Two-pass dependency-correct system
`build/Backlinks.hs`. The fundamental challenge: backlinks for page A require knowing what other pages link to A, but those pages haven't been compiled yet when A is compiled. Solved with a two-version architecture:
1. **Pass 1** (`version "links"`): each content file is compiled lightly wikilinks preprocessed, Markdown parsed, AST walked block-by-block. For every internal link, the URL and the HTML of its surrounding `Para` block are recorded as a `LinkEntry { leUrl, leContext }`. Context rendered via `runPure (writeHtml5String opts (Pandoc nullMeta [Plain inlines]))` with `writerTemplate = Nothing` (fragment only). Result serialised as JSON per page.
2. **Pass 2** (`create ["data/backlinks.json"]`): loads all `version "links"` items, inverts the map (target [source]), resolves each source's title and abstract from its metadata, emits `data/backlinks.json`.
3. **Context** (`backlinksField`): loads `data/backlinks.json` via `load`, looks up the current page's normalised URL, renders `<ul>` with `<details>`-collapsible context per entry.
**Key implementation details:**
- All `loadAll` / `loadAllSnapshots` / `buildTagsWith` / `buildPaginateWith` calls use `.&&. hasNoVersion` to prevent "links" version items from being picked up alongside default versions.
- `isPageLink` filters out `http://`, `https://`, `#`-anchors, `mailto:`, `tel:`, and static-asset extensions (`.pdf`, `.svg`, `.mp3`, etc.).
- JSON encoding uses `TL.unpack . TLE.decodeUtf8 . Aeson.encode` (not `LBSC.unpack`) to preserve non-ASCII characters in context paragraphs.
- Decoding uses `Aeson.decodeStrict (TE.encodeUtf8 (T.pack s))` symmetrically.
- `popups.js` excludes `.backlink-source` links from the internal-preview popup (same exception pattern as `.meta-authors`).
### Epistemic Profile
Implemented across `build/Stability.hs`, `build/Contexts.hs`, `templates/partials/page-footer.html`, `templates/partials/metadata.html`, and `static/css/components.css`.
**Context fields provided by `epistemicCtx`** (included in `essayCtx`):
| Field | Source | Notes |
|-------|--------|-------|
| `$status$` | frontmatter `status` | via `defaultContext` |
| `$confidence$` | frontmatter `confidence` | via `defaultContext` |
| `$importance-dots$` | frontmatter `importance` (15) | `●●●○○` rendered in Haskell |
| `$evidence-dots$` | frontmatter `evidence` (15) | same |
| `$confidence-trend$` | frontmatter `confidence-history` list | / / from last two entries |
| `$stability$` | auto-computed via `git log --follow` | always resolves; never fails |
| `$last-reviewed$` | most recent commit date | formatted "%-d %B %Y"; `noResult` if no commits |
| `$scope$`, `$novelty$`, `$practicality$` | frontmatter | via `defaultContext` |
**Stability auto-calculation** (`build/Stability.hs`):
- Runs `git log --follow --format=%ad --date=short -- <filepath>` via `readProcessWithExitCode`.
- Heuristic: 1 commits or age < 14 days *volatile*; 5 commits and age < 90 days *revising*; 15 commits or age < 365 days *fairly stable*; 30 commits or age < 730 days *stable*; otherwise *established*.
- `IGNORE.txt`: paths listed here use frontmatter `stability`/`last-reviewed` verbatim. Cleared by `> IGNORE.txt` in the Makefile's `build` target (one-shot pins).
**Critical implementation note:** Fields that use `unsafeCompiler` must return `Maybe` from the IO block and call `fail` in the `Compiler` monad afterward not inside the `IO` action. Calling `fail` inside `unsafeCompiler`'s IO block throws an `IOError` that Hakyll's `$if()$` template evaluation does not catch as `NoResult`, causing the entire item compilation to error silently.
### Visualization pipeline — Pandoc IO filter approach
`build/Filters/Viz.hs` walks the AST for `Div` blocks with class `figure` or `visualization`.
**Static figures (`.figure`):** reads the `script` attribute, runs `python3 <script>` with the source file's directory as cwd, captures stdout as SVG. Replaces hardcoded `#000000`/`black` fills/strokes with `currentColor` (same trick as `Filters.Score`). Wraps in `<figure class="viz-figure">`. Script is expected to import `tools/viz_theme` and call `save_svg()` which writes to stdout.
**Interactive figures (`.visualization`):** runs the script, expects Vega-Lite JSON on stdout. Embeds as `<script type="application/json" class="vega-spec">` inside a `.vega-container` div. `viz.js` finds all `.vega-spec` scripts, stores parsed spec on `container._vegaSpec`, calls `vegaEmbed`. Always applies the site's monochrome Vega config, ignoring the spec's own `config`. MutationObserver on `document.documentElement[data-theme]` triggers `reRenderAll()` on theme change.
**Frontmatter:** `viz: true` gates CDN loading of Vega/Vega-Lite/Vega-Embed and `viz.js` in `head.html` via `$if(viz)$`.
**`tools/viz_theme.py`:** `apply_monochrome()` sets matplotlib rcParams (transparent backgrounds, black lines); `save_svg(fig)` writes SVG to stdout via `io.StringIO`; `LINESTYLE_CYCLE` provides dash-pattern sequences for multi-series charts (no color distinction needed).
**Authoring syntax:**
```markdown
::: {.figure script="figures/plot.py" caption="Caption text"}
:::
::: {.visualization script="figures/chart.py" caption="Caption text"}
:::
```
### GPG signing — dedicated subkey + preset passphrase
**Key architecture:** master certifying key in `~/.gnupg` (passphrase-protected, used for email). Dedicated signing keyring at `~/.gnupg-signing/` holds: `sec#` (master stub, no secret) + `ssb` Ed25519 signing subkey (with secret). Correct isolation: `gpg --export-secret-subkeys "FINGERPRINT!"` exports only the subkey secret.
**Passphrase caching:** GPG 2.4's `passwd` in `--edit-key` requires the master secret to be present it cannot change a subkey passphrase in a stub+subkey-only keyring. Instead, `gpg-preset-passphrase` (`/usr/lib/gnupg/gpg-preset-passphrase`) caches the passphrase by keygrip directly in the agent. `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase` and `max-cache-ttl 86400`. `tools/preset-signing-passphrase.sh` prompts via the terminal, calls `gpg-preset-passphrase --preset <keygrip>`. Must be run once per boot (or when the 24h cache expires).
**Popup preview:** `popups.js` `sigContent` provider fetches the `.sig` URL (same-origin), renders the ASCII armor in a `<pre>` inside a `.popup-sig` div. Bound to `a.footer-sig-link` explicitly in `bindTargets`, bypassing the footer-exclusion guard on internal links. Result cached in the shared `cache` map.
**nginx:** `.sig` files need no special handling they're served as static files alongside `.html`. The `try_files` directive handles `$uri` directly.
### Embedding pipeline + semantic search
**Model unification:** Both similar-links (page-level) and semantic search (paragraph-level) use `all-MiniLM-L6-v2` (384 dims). This is a deliberate simplification: the same model runs at build time (Python/sentence-transformers) and query time (browser/transformers.js `Xenova/all-MiniLM-L6-v2` quantized), guaranteeing that query vectors and corpus vectors are in the same embedding space.
**Build-time (`tools/embed.py`):** One HTML parse pass per file extracts both the full-page text (for similar-links) and individual paragraphs (for semantic search). Model is loaded once and both encoding jobs run sequentially. Outputs: `data/similar-links.json`, `data/semantic-index.bin` (raw `float32`, shape `[N_paragraphs, 384]`), `data/semantic-meta.json`. All three are gitignored (generated). Staleness check: skips the entire run if all three outputs are newer than all `_site/` HTML.
**Binary index format:** `para_vecs.tobytes()` writes a flat, little-endian `float32` array. In JS: `new Float32Array(arrayBuffer)`. No header, no framing row `i` starts at byte offset `i × 384 × 4`. This is the simplest possible format and avoids a numpy/npy parser in the browser.
**Client-side search (`semantic-search.js`):** Dynamically imports `@xenova/transformers@2` from jsDelivr CDN on first query (lazy no load cost on pages that never use semantic search). Fetches binary index + metadata JSON (also lazy, browser-cached). Brute-force dot product over a `Float32Array` in a tight JS loop fast enough at <5k paragraphs; revisit with a WASM FAISS binding if the corpus grows beyond ~20k paragraphs. Vectors are unit-normalised at build time, so dot product = cosine similarity.
**Tab default + localStorage:** Keyword (Pagefind) is the default on first visit zero cold-start. User's last-used tab is stored under `search-tab` in `localStorage` and restored on load, so returning users who prefer semantic always land there. If `localStorage` is unavailable (private browsing restrictions), falls back silently to keyword.
**Self-hosted model:** `semantic-search.js` sets `mod.env.localModelPath = '/models/'` and `mod.env.allowRemoteModels = false` immediately after the CDN import. transformers.js then resolves model files as `GET /models/all-MiniLM-L6-v2/{file}`, which are same-origin. `make download-model` (= `tools/download-model.sh`) fetches 5 files from HuggingFace: `config.json`, `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`, `onnx/model_quantized.onnx`. The files live in `static/models/` (gitignored) and are copied to `_site/models/` by the existing `static/**` Hakyll rule.
**CSP:** `script-src` requires `https://cdn.jsdelivr.net` for the transformers.js library import. `connect-src` stays `'self'` all model weight fetches are same-origin after `allowRemoteModels = false`. The binary index and meta JSON are also same-origin.
### Responsive images — WebP `<picture>` wrapping
`build/Filters/Images.hs` inspects each `Image` inline's `src`. If it is a local raster (not starting with `http://`, `https://`, `//`, or `data:`; extension `.jpg`/`.jpeg`/`.png`/`.gif`), it emits `RawInline (Format "html")` containing a `<picture>` element:
```html
<picture>
<source srcset="/images/foo.webp" type="image/webp">
<img src="/images/foo.jpg" alt="…" loading="lazy" data-lightbox="true">
</picture>
```
The WebP `srcset` is computed at build time by `System.FilePath.replaceExtension`. No IO is needed in the filter the `<source>` is always emitted; browsers silently ignore it if the file doesn't exist (falling back to `<img>`). SVG, external URLs, and `data:` URIs remain plain `<img>` tags. Images inside `<a>` links get no `data-lightbox` marker (same as before).
`tools/convert-images.sh` performs the actual conversion: `cwebp -q 85` per file. Runs before Hakyll in `make build` and is also available as a standalone `make convert-images` target. Exits 0 with a notice if `cwebp` is not installed, so the build never fails on machines without `libwebp`. Generated `.webp` files are gitignored; `git add -f` to commit an authored WebP.
**Quality note:** `-q 85` is a good default for photographic images. For pixel-art, diagrams, or images that are already highly compressed, `-lossless` or a higher quality setting may be appropriate (edit the script).
### Annotations (infrastructure only)
`annotations.js` stores annotations as JSON in `localStorage` under `site-annotations`, scoped per `location.pathname`. On `DOMContentLoaded`, `applyAll()` re-anchors saved annotations via a `TreeWalker` text-stream search (concatenates all text nodes in `#markdownBody`, finds exact match by index, builds a `Range`, wraps with `<mark>`). Cross-element ranges use `extractContents()` + `insertNode()` fallback. Four highlight colors (amber / sage / steel / rose) defined in `annotations.css` as `rgba` overlays with `box-decoration-break: clone`. Hover tooltip shows note, date, and delete button. Public API: `window.Annotations.add(text, color, note)` / `.remove(id)`. The selection-popup "Annotate" button is currently removed pending a UI revision.
---
## VII. Reference: Inspirations
- **gwern.net** Primary model (Gwern Branwen + Said Achmiz). Semantic zoom, sidenotes, popups, monochrome, Pandoc+Hakyll.
- **Edward Tufte** Sidenotes, information design
- **Matthew Butterick's Practical Typography** Web typography in practice
- **Traditional book design** The standard to aspire to on screen
---
*This specification is a living document updated as implementation progresses.*