levineuwirth.org — Design Specification v11

Author: Levi Neuwirth
Date: March 2026 (v11: 21 March 2026)
Status: LIVING DOCUMENT. Updated as implementation progresses.


I. Vision & Philosophy

This website is an intellectual home — the permanent residence of a mind that moves freely between computer science, music composition, poetry, fiction, and whatever else catches fire.

Commitments

  1. Long content over disposable content. Essays are living documents.
  2. Semantic zoom. Title → abstract → headers → body → sidenotes → citations → sources.
  3. Earned ornament. Every decorative element serves a purpose.
  4. The site is the proof. Entirely FOSS. No tracking. No analytics. No fingerprinting.
  5. Reader > Author.
  6. Configuration is code. The build system is a Haskell program.
  7. No homepage epigraph.
  8. Extensible metadata. Future-proofed for semantic embeddings via external JSON injection.

II. All Resolved Decisions

Typography

| Role | Font | License | Notes |
|---|---|---|---|
| Body | Spectral | SIL OFL | Screen-first serif. True smallcaps (smcp), four figure styles, ligatures, seven weights + italics. Self-hosted from source, since Google Fonts strips OT features. |
| UI / Headers | Fira Sans | SIL OFL | Humanist sans-serif. Complements Spectral. |
| Code | JetBrains Mono | SIL OFL | Ligatures, excellent legibility. |

Font pairing has been tested across screens and confirmed.

Self-hosting workflow:

pyftsubset Spectral-Regular.ttf \
  --output-file=spectral-regular.woff2 \
  --flavor=woff2 \
  --layout-features='liga,dlig,smcp,c2sc,onum,lnum,pnum,tnum,frac,ordn,sups,subs,ss01,ss02,ss03,ss04,ss05,kern' \
  --unicodes='U+0000-00FF,U+0131,U+0152-0153,U+02BB-02BC,U+02C6,U+02DA,U+02DC,U+2000-206F,U+2074,U+20AC,U+2122,U+2191,U+2193,U+2212,U+2215,U+FEFF,U+FFFD' \
  --no-hinting --desubroutinize

LaTeX Math

Client-side KaTeX (not pure build-time SSR — see Implementation Notes):

  • Pandoc outputs math spans with class="math inline" / class="math display"
  • KaTeX renders client-side from a deferred script
  • KaTeX CSS/fonts loaded conditionally only on pages with math ($if(math)$ in head template)

Navigation

Home | Me | Current | New | Links | Search    [⚙]
───────────────────────────────────────────────
▼ Portals
AI | Fiction | Miscellany | Music | Nonfiction | Poetry | Research | Tech
  • Primary row (always visible): Home, Me, Current (now-page), New (changelog), Links, Search; settings gear (⚙) on the right
  • Settings panel (⚙ button): Theme (Light/Dark), Text size (A/A+), Focus Mode, Reduce Motion, Print — managed by settings.js; state persisted via localStorage
  • Expandable portal row: AI, Fiction, Miscellany, Music, Nonfiction, Poetry, Research, Tech
  • Portal row collapsed by default; expansion state persisted via localStorage
  • Fira Sans smallcaps for primary row

Layout

  • Left margin: Interactive sticky TOC (IntersectionObserver). Collapses on narrow screens.
  • Center column: Body text in Spectral. 650–700px max-width.
  • Right margin: Sidenotes only.

Color

Pure monochrome. No accent color. Light mode default (#faf8f4 background, #1a1a1a text). Dark mode via [data-theme="dark"] + prefers-color-scheme.

Content Systems

  • Tag system: Hierarchical, slash-separated (research/mathematics). Hakyll buildTags + custom hierarchy. Tag pages at /<tag>/ with no /tags/ namespace prefix.
  • Pagination: Blog index 20/page, tag pages 20/page. Essay index all on one page.
  • RSS: Atom feed at /feed.xml (all content types, sorted by date) and /music/feed.xml (compositions only).
  • Citations: Numbered superscript markers [1] linked to a bibliography section. Hover preview via citations.js. Further Reading section separate from cited works. data/bibliography.bib + Chicago Author-Date CSL.
  • Collapsible sections: h2/h3 headings toggle their content via collapse.js. Smooth max-height transition. State persisted in localStorage.
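
The hierarchical tag scheme can be illustrated with a short sketch. It assumes that a slash-separated tag implies an index page for every prefix of the path; the function names `tag_ancestors` and `tag_url` are hypothetical, and the real logic lives in the Haskell build (build/Tags.hs) and may differ.

```python
from typing import List


def tag_ancestors(tag: str) -> List[str]:
    """Expand 'research/mathematics' into every prefix of the hierarchy."""
    parts = [p for p in tag.strip("/").split("/") if p]
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def tag_url(tag: str) -> str:
    """Tag pages live at /<tag>/ with no /tags/ namespace prefix."""
    return "/" + tag.strip("/") + "/"
```

Under this assumption, an essay tagged research/mathematics would be listed on both /research/ and /research/mathematics/.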

Gwern Codebase: Selective Adoption

| Component | Action | Actual outcome |
|---|---|---|
| sidenotes.js | Adopt directly (Said Achmiz, MIT) | Written from scratch; purpose-built for our HTML structure |
| popups.js | Fork and simplify (Said Achmiz, MIT) | Exists in static/js/popups.js; Phase 3 |
| CSS typographic foundations | Extract and refactor | Done |
| Pandoc AST filters | Write from scratch | Done |
| Hakyll architecture | Rewrite, informed by gwern | Done |
| Everything else | Ignore | |

Metadata

Extensible YAML frontmatter. Hakyll strips frontmatter before passing to Pandoc, so all frontmatter access goes through Hakyll's metadata API (lookupStringList, getMetadataField, etc.), not through Pandoc Meta.

Frontmatter keys in use:

title:              # page title
date:               # ISO date (YYYY-MM-DD) — used for sorting, feed, reading-time
abstract:           # short description (1–3 sentences)
tags:               # hierarchical tag list
authors:            # list of author names (defaults to Levi Neuwirth)
further-reading:    # list of BibTeX keys for the Further Reading section
bibliography:       # path to .bib file (optional; defaults to data/bibliography.bib)
csl:                # path to .csl file (optional; defaults to data/chicago-notes.csl)

# Epistemic profile (all optional; section shown only if `status` is present)
status:             # Draft | Working model | Durable | Refined | Superseded | Deprecated
confidence:         # 0–100 integer (%)
importance:         # 1–5 integer (rendered as filled/empty dots)
evidence:           # 1–5 integer (rendered as filled/empty dots)
scope:              # personal | local | average | broad | civilizational
novelty:            # conventional | moderate | idiosyncratic | innovative
practicality:       # abstract | low | moderate | high | exceptional
stability:          # volatile | revising | fairly stable | stable | established
                    # (auto-computed from git history; use IGNORE.txt to pin)
last-reviewed:      # ISO date — overrides git-derived date when in IGNORE.txt
confidence-history: # list of integers — trend derived from last two entries (↑↓→)

# Version history (optional; falls back to git log, then to date-created/date-modified)
history:
  - date: "2026-03-01"    # ISO date string (quote to prevent YAML date parsing)
    note: Initial draft   # human-readable annotation
  - date: "2026-03-14"
    note: Expanded typography section; added citations

Auto-computed at build time: word-count, reading-time. Auto-derived at build time: stability (from git log --follow), last-reviewed (most recent commit date), confidence-trend (from confidence-history).
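
As a rough illustration of two of the derived fields, the sketch below computes the confidence trend from the last two entries of confidence-history and renders a 1–5 score as dots. The function names, the dot glyphs, and the choice to treat a history shorter than two entries as flat are assumptions, not the behaviour of the Haskell build.

```python
from typing import List


def confidence_trend(history: List[int]) -> str:
    """Compare the last two confidence-history entries and return an arrow."""
    if len(history) < 2:
        return "→"  # assumption: too little data reads as "no change"
    previous, latest = history[-2], history[-1]
    if latest > previous:
        return "↑"
    if latest < previous:
        return "↓"
    return "→"


def dots(score: int, total: int = 5) -> str:
    """Render a 1-5 score as filled/empty dots (glyphs are illustrative)."""
    score = max(0, min(total, score))  # clamp out-of-range values
    return "●" * score + "○" * (total - score)
```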

IGNORE.txt: A file in the project root listing content paths (one per line) whose stability and last-reviewed should not be recomputed. Cleared automatically after every make build. Useful for pinning manually-set stability labels on pages whose git history is misleading.

Top metadata block:

  1. Tags — hierarchical tag list with links to tag index pages
  2. Description — the abstract field, rendered in italic
  3. Authors: the authors list
  4. Page info — jump links to bottom metadata sections (Epistemic/Bibliography/Backlinks shown conditionally)

Bottom metadata footer:

  • Version history — three-tier priority: (1) frontmatter history list with authored notes → (2) git log dates (date-only) → (3) date-created / date-modified fallback. make build auto-commits content/ before building, keeping git history current.
  • Epistemic (if status set) — compact: status chip · confidence % · importance dots · evidence dots; expanded <details>: stability · scope · novelty · practicality · last reviewed · confidence trend
  • Bibliography — formatted citations + Further Reading
  • Backlinks — auto-generated; each entry shows source title (link) + collapsible context paragraph
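
The three-tier version-history priority described above could be sketched as follows. This is a minimal illustration, assuming the frontmatter arrives as a dictionary and the git log dates as a list of ISO strings; `resolve_history` is a hypothetical name and the entry shape is an assumption.

```python
from typing import Dict, List, Optional


def resolve_history(
    frontmatter: Dict, git_dates: List[str]
) -> List[Dict[str, Optional[str]]]:
    """Pick the version history source by priority."""
    # Tier 1: authored history entries, which carry human-written notes.
    authored = frontmatter.get("history") or []
    if authored:
        return [{"date": str(e["date"]), "note": e.get("note")} for e in authored]
    # Tier 2: dates from git log (no notes available).
    if git_dates:
        return [{"date": d, "note": None} for d in git_dates]
    # Tier 3: fall back to date-created / date-modified if present.
    entries = []
    for key in ("date-created", "date-modified"):
        if frontmatter.get(key):
            entries.append({"date": str(frontmatter[key]), "note": None})
    return entries
```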

Licensing

  • Content: CC BY-NC-SA 4.0
  • Code: MIT

III. Deployment & Infrastructure

Deployment Pipeline

[Local machine]                          [Arch Linux VPS / DreamHost]

content/*.md
    ↓
cabal run site -- build                  nginx serving
    ↓                                    /var/www/levineuwirth.org/
pagefind --site _site
    ↓
rsync -avz --delete \
  _site/ \
  vps:/var/www/levineuwirth.org/  ──→   Live site

Makefile:

build:
	@git add content/
	@git diff --cached --quiet || git commit -m "auto: $$(date -u +%Y-%m-%dT%H:%M:%SZ)"
	@date +%s > data/build-start.txt
	@./tools/convert-images.sh        # WebP conversion (skipped if cwebp absent)
	cabal run site -- build
	pagefind --site _site
	@if [ -d .venv ]; then \
	  uv run python tools/embed.py || echo "Warning: embedding failed"; \
	fi
	> IGNORE.txt          # clear stability pins after each build
	@BUILD_END=$$(date +%s); BUILD_START=$$(cat data/build-start.txt); \
	  echo $$((BUILD_END - BUILD_START)) > data/last-build-seconds.txt

sign:
	@./tools/sign-site.sh    # detach-sign every _site/**/*.html; requires passphrase cached via preset-signing-passphrase.sh

deploy: build sign
	rsync -avz --delete _site/ vps:/var/www/levineuwirth.org/

watch:
	cabal run site -- watch

clean:
	cabal run site -- clean

download-model:
	@./tools/download-model.sh   # fetch quantized ONNX model to static/models/ (once per machine)

convert-images:
	@./tools/convert-images.sh   # manual trigger; also runs in build

Hosting Timeline

  1. Immediate: Deploy to DreamHost (rsync static files)
  2. Phase 5: Provision Arch VPS (Hetzner), configure nginx + certbot, migrate DNS

VPS: nginx config (Arch Linux)

server {
    listen 443 ssl http2;
    server_name levineuwirth.org www.levineuwirth.org;
    root /var/www/levineuwirth.org;

    # TLS (managed by certbot)
    ssl_certificate /etc/letsencrypt/live/levineuwirth.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/levineuwirth.org/privkey.pem;

    # cdn.jsdelivr.net required for transformers.js (semantic search library).
    # Model weights served same-origin from /models/ — connect-src stays 'self'.
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; img-src 'self'; font-src 'self';" always;

    gzip on;
    gzip_types text/html text/css application/javascript application/json image/svg+xml;

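    # Note: a location that sets its own add_header does not inherit the
    # server-level Content-Security-Policy header; repeat it here if needed.
    location ~* \.(woff2|css|js|svg|png|jpg|webp)$ {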
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
    location ~* \.html$ {
        expires 1h;
        add_header Cache-Control "public, must-revalidate";
    }

    try_files $uri $uri.html $uri/ =404;
    error_page 404 /404.html;
}
server {
    listen 80;
    server_name levineuwirth.org www.levineuwirth.org;
    return 301 https://$host$request_uri;
}

IV. Repository Structure

levineuwirth.org/
├── content/
│   ├── essays/
│   │   └── test-essay.md           # Feature test document
│   ├── blog/
│   ├── music/
│   │   └── {slug}/
│   │       ├── index.md            # Composition frontmatter + program notes
│   │       ├── scores/             # LilyPond SVG pages + PDF
│   │       └── audio/              # Per-movement MP3s
│   └── *.md                        # Standalone pages (me, colophon, etc.)
├── static/
│   ├── css/
│   │   ├── base.css                # CSS variables, palette, dark mode
│   │   ├── typography.css          # Spectral OT features, dropcaps, smallcaps, link icons
│   │   ├── layout.css              # 3-column layout, responsive breakpoints
│   │   ├── sidenotes.css           # Sidenote positioning
│   │   ├── popups.css              # Link preview popup styles
│   │   ├── syntax.css              # Monochrome code highlighting (JetBrains Mono)
│   │   ├── components.css          # Nav (incl. settings panel), TOC, metadata, citations, collapsibles
│   │   ├── viz.css                 # Visualization figure layout (.viz-figure, .vega-container, .viz-caption)
│   │   ├── gallery.css             # Exhibit system + annotation callouts
│   │   ├── selection-popup.css     # Text-selection toolbar
│   │   ├── annotations.css         # User highlight marks + annotation tooltip
│   │   ├── images.css              # Figure layout, captions, lightbox overlay
│   │   ├── score-reader.css        # Full-page score reader layout
│   │   ├── catalog.css             # Music catalog page (`/music/`)
│   │   └── print.css               # Print stylesheet (media="print")
│   ├── js/
│   │   ├── theme.js                # Dark/light toggle (sync, not deferred)
│   │   ├── sidenotes.js            # Written from scratch — collision avoidance, hover/focus
│   │   ├── toc.js                  # Sticky TOC + scroll tracking + animated collapse
│   │   ├── nav.js                  # Portal row expand/collapse + localStorage
│   │   ├── collapse.js             # Section collapsing with localStorage persistence
│   │   ├── citations.js            # Citation hover previews
│   │   ├── gallery.js              # Exhibit overlay + annotation toggle
│   │   ├── popups.js               # Link preview popups (internal, Wikipedia, citations)
│   │   ├── settings.js             # Settings panel (theme, text size, focus mode, reduce motion, print)
│   │   ├── selection-popup.js      # Context-aware text-selection toolbar
│   │   ├── annotations.js          # localStorage highlight/annotation engine (UI deferred)
│   │   ├── score-reader.js         # Score reader: page-turn, movement jumps, deep linking
│   │   ├── viz.js                  # Vega-Lite render + dark mode re-render via MutationObserver
│   │   ├── semantic-search.js      # Client-side semantic search: transformers.js + Float32Array cosine ranking
│   │   ├── search.js               # Pagefind UI init + ?q= pre-fill + search timing (#search-timing)
│   │   └── prism.min.js            # Syntax highlighting
│   ├── fonts/                      # Self-hosted WOFF2 (subsetted with OT features)
│   ├── gpg/
│   │   └── pubkey.asc              # Ed25519 signing subkey public key (master: CD90AE96…; subkey: C9A42A6F…)
│   ├── models/                     # Self-hosted ONNX model (gitignored; run: make download-model)
│   │   └── all-MiniLM-L6-v2/      # ~22 MB quantized — served at /models/ for semantic-search.js
│   └── images/
│       └── link-icons/             # SVG icons for external link classification
│           ├── external.svg
│           ├── wikipedia.svg
│           ├── github.svg
│           ├── arxiv.svg
│           └── doi.svg
├── templates/
│   ├── default.html                # Outer shell: nav, head, footer JS
│   ├── essay.html                  # 3-column layout with TOC
│   ├── composition.html            # Music landing page (metadata block, movements, body, recording player)
│   ├── music-catalog.html          # Music catalog index (`/music/`)
│   ├── score-reader.html           # Minimal score reader body (top bar + SVG stage)
│   ├── score-reader-default.html   # Minimal HTML shell for score reader (no nav/footer)
│   ├── blog-post.html
│   ├── page.html                   # Simple standalone pages
│   ├── essay-index.html
│   ├── blog-index.html
│   ├── tag-index.html
│   └── partials/
│       ├── head.html               # CSS, conditional JS (citations, collapse)
│       ├── nav.html                # Two-row nav with portals
│       ├── footer.html
│       ├── metadata.html           # Essay metadata block (top)
│       └── page-footer.html        # Essay footer (bibliography, backlinks)
├── build/
│   ├── Main.hs                     # Entry point
│   ├── Site.hs                     # Hakyll rules (all routes + Atom feed)
│   ├── Compilers.hs                # Pandoc compiler wrappers
│   ├── Contexts.hs                 # Template contexts (word-count, reading-time, bibliography)
│   ├── Citations.hs                # citeproc pipeline: Cite→superscript + bibliography HTML
│   ├── Filters.hs                  # Re-exports all filter modules
│   ├── Filters/
│   │   ├── Typography.hs           # Smart quotes, dashes
│   │   ├── Sidenotes.hs            # Footnote → sidenote conversion
│   │   ├── Dropcaps.hs             # Decorative first-letter drop caps
│   │   ├── Smallcaps.hs            # Smallcaps via smcp OT feature
│   │   ├── Wikilinks.hs            # [[wikilink]] syntax
│   │   ├── Links.hs                # External link classification + data-link-icon attributes
│   │   ├── Math.hs                 # Simple LaTeX → Unicode conversion
│   │   ├── Code.hs                 # Prepend language- prefix for Prism.js
│   │   ├── Images.hs               # Lazy loading, lightbox data-attributes, WebP <picture> wrapper for local raster images
│   │   ├── Score.hs                # Score fragment SVG inlining + currentColor replacement
│   │   └── Viz.hs                  # Visualization IO filter: runs Python scripts, inlines SVG / Vega-Lite JSON
│   ├── Authors.hs                  # Author-as-tag system (slugify, authorLinksField, author pages)
│   ├── Backlinks.hs                # Two-pass build-time backlinks with context paragraph extraction
│   ├── Catalog.hs                  # Music catalog: featured works + grouped-by-category HTML rendering
│   ├── Stability.hs                # Git-based stability auto-calculation + last-reviewed derivation
│   ├── Metadata.hs                 # Stub (Phase 2+)
│   ├── Tags.hs                     # Hierarchical tag system
│   ├── Pagination.hs               # 20/page for blog + tag indexes
│   └── Utils.hs                    # Shared helpers (wordCount, readingTime)
├── data/
│   ├── bibliography.bib            # BibTeX references
│   ├── chicago-notes.csl           # CSL style (in-text, Chicago Author-Date)
│   └── (future: embeddings.json, similar-links.json)
├── tools/
│   ├── subset-fonts.sh
│   ├── viz_theme.py                # Matplotlib monochrome helpers (apply_monochrome, save_svg, LINESTYLE_CYCLE)
│   ├── sign-site.sh                # Detach-sign every _site/**/*.html → .html.sig (called by `make sign`)
│   ├── preset-signing-passphrase.sh  # Cache signing subkey passphrase in gpg-agent (run once per boot)
│   ├── download-model.sh           # Fetch quantized ONNX model to static/models/ (run once per machine)
│   ├── convert-images.sh           # Convert JPEG/PNG → WebP companions via cwebp (runs automatically in build)
│   └── embed.py                    # Build-time embedding pipeline: similar-links + semantic search index
├── levineuwirth.cabal
├── cabal.project
├── cabal.project.freeze
├── Makefile
└── CLAUDE.md

V. Implementation Phases

Phase 1: Foundation ✓

  • Init Hakyll project, modular Haskell build system
  • Font subsetting + self-hosting (Spectral, Fira Sans, JetBrains Mono)
  • CSS: base (palette, variables, dark mode), typography (Spectral features), layout (3-column), sidenotes
  • sidenotes.js — written from scratch (not adopted; see Implementation Notes)
  • Two-row navigation with expandable portals
  • Templates: default, essay, blog-post, index
  • Dark/light toggle with localStorage + prefers-color-scheme
  • Basic Pandoc pipeline (Markdown → HTML, smart typography)
  • Deploy to DreamHost via rsync — deployed to Hetzner VPS instead

Phase 2: Content Features ✓

  • Pandoc filters: sidenotes, dropcaps, smallcaps, wikilinks, typography, link classification, code, math
  • Interactive sticky TOC — IntersectionObserver, animated expand/collapse, page-title display, auto-collapse on scroll
  • Citation system — numbered superscript markers, hover preview, bibliography + Further Reading sections
  • Monochrome syntax highlighting (Prism.js + Filters.Code)
  • Collapsible h2/h3 sections (collapse.js) — max-height transition, localStorage persistence
  • Hierarchical tag system + tag index pages
  • Pagination (blog index and tag pages, 20/page)
  • Metadata: YAML frontmatter + auto-computed word count / reading time
  • Single Atom feed (/feed.xml, all content, sorted by date)
  • External link icons (SVG mask-image, domain-classified via Filters.Links)
  • Gallery / Exhibit system (gallery.js, gallery.css) — added (not in original spec)
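
The domain-based link classification can be sketched in a few lines. The icon names match the SVG files under static/images/link-icons/, but the matching rules and the name `classify_link` are assumptions for illustration; the actual filter is build/Filters/Links.hs.

```python
from urllib.parse import urlparse

# Assumption: links to the site's own host get no icon.
SITE_HOSTS = ("levineuwirth.org", "www.levineuwirth.org")


def _matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def classify_link(url: str) -> str:
    """Return the icon name for a link, or '' for internal/relative links."""
    host = (urlparse(url).hostname or "").lower()
    if not host or host in SITE_HOSTS:
        return ""
    if _matches(host, "wikipedia.org"):
        return "wikipedia"
    if _matches(host, "github.com"):
        return "github"
    if _matches(host, "arxiv.org"):
        return "arxiv"
    if _matches(host, "doi.org"):
        return "doi"
    return "external"
```

The returned name would then be written into the link's data-link-icon attribute.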

Phase 3: Rich Interactions

  • Link preview popups (popups.js) — internal page previews (title, abstract, authors, tags, reading time), Wikipedia excerpts, citation previews; relative-URL fix for index pages
  • Pagefind search (/search.html) — search.js pre-fills from ?q= param; #search-timing shows elapsed ms (mono, faint) via MutationObserver on search results subtree
  • Author system — authors treated as tags; build/Authors.hs; author pages at /authors/{slug}/; authorLinksField in all contexts; defaults to Levi Neuwirth
  • Settings panel — settings.js + settings.css section in components.css; theme, text size (3 steps), focus mode, reduce motion, print; all state in localStorage; theme.js restores all settings before first paint
  • Selection popup — selection-popup.js / selection-popup.css; context-aware toolbar appears 450 ms after text selection; see Implementation Notes
  • Print stylesheet — print.css (media="print"); single-column, light colors, sidenotes as indented blocks, external URLs shown
  • Current page (/current.html) — now-page; added to primary nav
  • [~] Annotations — annotations.js / annotations.css; localStorage infrastructure + highlight re-anchoring written; UI (button in selection popup) deferred

Phase 4: Creative Content & Polish

  • Image handling (lazy load, lightbox, figures, WebP <picture> wrapper for local raster images)
  • Homepage (replaces standalone index; gateway + curated recent content)
  • Poetry typesetting — codex reading mode (reading.html, reading.css, reading.js); poetryCompiler with Ext_hard_line_breaks; narrower measure, stanza spacing, drop-cap suppressed
  • Fiction reading mode — same codex layout; fictionCompiler; chapter drop caps + smallcaps lead-in via h2 + p::first-letter; reading mode infrastructure shared with poetry
  • Music section — score fragment system (A): inline SVG excerpts (motifs, passages) integrated into the gallery/exhibit system; named, TOC-listed, focusable in the shared overlay alongside equations; authored via {.score-fragment score-name="..." score-caption="..."} fenced-div; SVG inlined at build time by Filters.Score; black fills/strokes replaced with currentColor for dark-mode; see Implementation Notes
  • Music section — composition landing pages + full score reader (C): two-URL architecture per composition; /music/{slug}/ (rich prose landing page with movement list, audio players, inline score fragments) and /music/{slug}/score/ (minimal dedicated reader); Hakyll version "score-reader" mechanism; compositionCtx with slug, score-url, has-score, score-page-count, score-pages list, has-movements, movements list (Aeson-parsed nested YAML); score-reader-default.html minimal shell; score-reader.js (page navigation, movement jumps, ?p= deep linking, preloading, keyboard); score-reader.css; dark mode via filter: invert(1); see Implementation Notes
  • Accessibility audit — skip link, TOC collapsed-link tabbing (visibility: hidden), section-toggle focus visibility, lightbox/gallery/settings focus restoration, popup aria-hidden, metadata nav wrapping, footer onclick removal; settings panel focus-steal bug fixed (focus only returns to toggle when it was inside the panel, preventing interference with text-selection popup)
  • [~] Visualization pipeline — Pandoc filter approach (Filters.Viz): .figure fenced divs run python3 <script>, capture SVG stdout, inline with currentColor replacement; .visualization fenced divs embed Vega-Lite JSON in a <script type="application/json" class="vega-spec"> tag rendered by viz.js; viz: true frontmatter gates CDN Vega/Vega-Lite/Vega-Embed + viz.js; dark mode re-renders via MutationObserver; tools/viz_theme.py provides matplotlib monochrome helpers. Infrastructure complete; not yet used in production content.
  • Content migration — migrate existing essays, poems, fiction, and music landing pages from prior formats into content/
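
The currentColor replacement used for score fragments can be approximated with regular expressions. This is a sketch only: the real work happens in Haskell (Filters.Score), and the set of spellings treated as "black" here is an assumption.

```python
import re

# Spellings of black this sketch recognises (assumption, not exhaustive).
_BLACK = r"(?:#000000|#000|black|rgb\(\s*0\s*,\s*0\s*,\s*0\s*\))"

# fill="..." / stroke="..." presentation attributes.
_ATTR = re.compile(
    r'(?<![\w-])(fill|stroke)\s*=\s*(["\'])' + _BLACK + r"\2", re.IGNORECASE
)
# fill:... / stroke:... inside style declarations; the lookahead keeps
# longer hex colours such as #0000ff from matching on their #000 prefix.
_STYLE = re.compile(
    r"(?<![\w-])(fill|stroke)\s*:\s*" + _BLACK + r"(?![0-9a-fA-F])", re.IGNORECASE
)


def use_current_color(svg: str) -> str:
    """Replace black fills and strokes with currentColor."""
    svg = _ATTR.sub(r"\1=\2currentColor\2", svg)
    return _STYLE.sub(r"\1:currentColor", svg)
```

Because the inlined SVG then inherits the surrounding text color, the fragment follows the theme without a separate dark-mode asset.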

Phase 5: Infrastructure & Advanced

  • Arch Linux VPS + nginx + certbot + DNS migration — Hetzner VPS provisioned, Arch Linux installed, nginx configured (config in §III), TLS cert via certbot, DNS migrated from DreamHost. make deploy pushes to GitHub and rsyncs to VPS.
  • Semantic embedding pipeline — Implemented. See Phase 6 "Embedding-powered similar links" and "Full-text semantic search".
  • Backlinks with context — Two-pass build-time system (build/Backlinks.hs). Pass 1: version "links" compiles each page lightly (wikilinks preprocessed, links + context extracted, serialised as JSON). Pass 2: create ["data/backlinks.json"] inverts the map. backlinksField in essayCtx / postCtx loads the JSON and renders <details>-collapsible per-entry lists. popups.js excludes .backlink-source links from the preview popup. Context paragraph uses runPure . writeHtml5String on the surrounding Para block. See Implementation Notes.
  • Link archiving — For all external links in data/bibliography.bib and in page bodies, check availability and save snapshots (Wayback Machine save API or local archivebox instance). Store archive URLs in data/link-archive.json; Filters.Links injects data-archive-url attributes; popups.js falls back to the archive if the live URL returns 404.
  • Self-hosted git (Forgejo) — Run Forgejo on the VPS. Mirror the build repo. Link from the colophon. Not essential; can remain on GitHub indefinitely.
  • Reader mode — Distraction-free reading overlay: hides nav, TOC, sidenotes; widens the body column to ~70ch; activated via a keyboard shortcut or settings panel toggle. Distinct from focus mode (which affects the nav) — reader mode affects the content layout.
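
The second pass of the backlinks system, inverting the outgoing-link map, can be sketched as below. The JSON shape (a list of target/context records per source) and the rule that self-links are skipped are assumptions; `invert_links` is a hypothetical name, not the function in build/Backlinks.hs.

```python
from collections import defaultdict
from typing import Dict, List


def invert_links(
    outgoing: Dict[str, List[Dict[str, str]]]
) -> Dict[str, List[Dict[str, str]]]:
    """Turn source -> [{target, context}] into target -> [{source, context}]."""
    backlinks = defaultdict(list)
    for source in sorted(outgoing):  # sorted for deterministic output
        for link in outgoing[source]:
            target = link["target"]
            if target == source:
                continue  # assumption: a page does not backlink to itself
            backlinks[target].append(
                {"source": source, "context": link["context"]}
            )
    return dict(backlinks)
```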

Phase 6: Deferred Features

  • Annotation UI — The annotations.js / annotations.css infrastructure exists (localStorage storage, re-anchoring on load, four highlight colors, hover tooltip). The selection popup "Annotate" button was removed pending a design decision on the color-picker and note-entry UX. Revisit: a popover with four color swatches and an optional text field, triggered from the selection popup.
  • [~] Visualization pipeline — Implemented as a Pandoc IO filter (Filters.Viz), not a per-slug Hakyll rule. See Phase 4 entry and Implementation Notes. Infrastructure complete; production content pending.
  • Music catalog page: /music/ index listing all compositions grouped by instrumentation category (orchestral → chamber → solo → vocal → choral → electronic → other), with an optional Featured section. Auto-generated from composition frontmatter by build/Catalog.hs; renders HTML in Haskell (same pattern as backlinks). Category, year, duration, instrumentation, and ◼/♫ indicators for score/recording availability. content/music/index.md provides prose intro + abstract. Template: templates/music-catalog.html. CSS: static/css/catalog.css. Context: musicCatalogCtx (provides catalog: true flag, featured-works, has-featured, catalog-by-category).
  • Score reader swipe gestures: touchstart/touchend listeners on #score-reader-stage with passive: true. Threshold: ≥ 50 px horizontal, < 30 px vertical drift. Left swipe → next page; right swipe → previous page.
  • Full-piece audio on composition pages: recording frontmatter key (path relative to the composition directory). Rendered as a full-width <audio> player in composition.html, above the per-movement list. Styled via .comp-recording / .comp-recording-audio in components.css. Per-movement <audio> players and .comp-btn / .comp-movement-* styles also added in the same pass.
  • RSS/feed improvements: /feed.xml now includes compositions (content/music/*/index.md) alongside essays, posts, fiction, poetry. New /music/feed.xml (compositions only, musicFeedConfig). Compositions already had "content" snapshots saved by the landing-page rule; no compiler changes needed.
  • Pagefind improvements — Currently a basic full-text search. Consider: sub-result excerpts, portal-scoped search filters, weighting by importance frontmatter field.
  • Audio essays / podcast feed — Record readings of select essays. Embed a native <audio> player at the top of the essay page, activated by an audio frontmatter key (path to MP3, relative to the content dir). Generate a separate /podcast.xml Atom feed with <enclosure> elements pointing to the MP3s so readers can subscribe in any podcast app. Stretch goal: a paragraph-sync mode where the player emits timeupdate events that highlight the paragraph currently being read — requires a data/audio/{slug}-timestamps.json file mapping paragraph indices to timestamps, authored manually or via forced-alignment tooling (e.g. whisper with word timestamps).
  • Build telemetry page: /build/ page generated at build time. build/Stats.hs loads all content items by type, reads "word-count" snapshots, aggregates counts/words/reading-time per type, computes word-length distribution (5 buckets), and reads top-15 tags from the Tags object. Makefile writes date +%s to data/build-start.txt before Hakyll runs; after pagefind, computes elapsed and writes data/last-build-seconds.txt (read on next build). CSS in static/css/build.css (flex bar chart, tabular-nums table, grid dl); loaded conditionally via $if(build)$ in head.html.
  • Epistemic profile — Replaces the old certainty / importance fields with a richer multi-axis system. Compact (always visible in footer): status chip · confidence % · importance dots · evidence dots. Expanded (<details>): stability (auto) · scope · novelty · practicality · last reviewed · confidence trend. Auto-calculation in build/Stability.hs via git log --follow; IGNORE.txt pins overrides. See Metadata section and Implementation Notes for full schema and vocabulary.
  • Writing statistics dashboard — A /stats page computed entirely at build time from the corpus. Contents: total word count across all content types, essay/post/poem count, words written per month rendered as a GitHub-style contribution heatmap (SVG generated by Haskell or a Python script), average and median essay length, longest essay, most-cited essay (by backlink count), tag distribution as a treemap, reading-time histogram, site growth over time (cumulative word count by date). All data collected during the Hakyll build from compiled items and their snapshots; serialized to data/stats.json and rendered into a dedicated stats.html template.
  • Memento mori — Implemented at /memento-mori/ as a full standalone page. 90×52 grid of weeks anchored to birthday anniversaries (nested year/week loop via setFullYear; week 52 stretched to eve of next birthday to absorb 365th/366th days). Week popup shows dynamic day-count and locale-derived day names. Score fragment (bassoon, content/memento-mori/scores/bsn.svg) inlined via Filters.Score. Linked from footer (MM).
  • Embedding-powered similar links: tools/embed.py encodes every #markdownBody page with all-MiniLM-L6-v2 (384 dims, unit-normalised), builds a FAISS IndexFlatIP, queries top-5 neighbours per page (cosine ≥ 0.30), writes data/similar-links.json. build/SimilarLinks.hs provides similarLinksField in essayCtx/postCtx with Hakyll dependency tracking; "Related" section rendered in page-footer.html. Staleness check skips re-embedding when JSON is newer than all HTML. Called by make build via uv run; non-fatal if .venv absent. See Implementation Notes.
  • Bidirectional backlinks with context — See Phase 5 above; implemented with full context-paragraph extraction. Merged with the Phase 5 stub.
  • Signed pages / content integrity: make sign (called by make deploy) runs tools/sign-site.sh, which walks _site/**/*.html and produces a detached ASCII-armored .sig per page. Signing uses a dedicated Ed25519 subkey isolated in ~/.gnupg-signing/ (master sec# stub + ssb signing subkey). Passphrase cached 24 h in the signing agent via tools/preset-signing-passphrase.sh + gpg-preset-passphrase; ~/.gnupg-signing/gpg-agent.conf sets allow-preset-passphrase. Footer "sig" link points to $url$.sig; hovering shows the ASCII armor via popups.js sigContent provider. Public key at static/gpg/pubkey.asc → served at /gpg/pubkey.asc. Fingerprints: master CD90AE96…B5C9663; signing subkey C9A42A6F…2707066 (keygrip 619844703EC398E70B0045D7150F08179CFEEFE3). See Implementation Notes.
  • Self-hosted semantic search model: tools/download-model.sh fetches the quantized ONNX model (all-MiniLM-L6-v2, ~22 MB, 5 files) from HuggingFace into static/models/all-MiniLM-L6-v2/ (gitignored). semantic-search.js sets env.localModelPath = '/models/' and env.allowRemoteModels = false before calling pipeline(), so all model weight fetches are same-origin. The CDN import of transformers.js itself still requires cdn.jsdelivr.net in script-src; connect-src stays 'self'. make download-model is a one-time setup step per machine. See Implementation Notes.
  • Responsive images (WebP): tools/convert-images.sh walks static/ and content/, calls cwebp -q 85 to produce .webp companions alongside every JPEG/PNG (skips existing; exits gracefully if cwebp absent). make build runs it before Hakyll so WebP files are present when static/** is copied. build/Filters/Images.hs detects local raster images and emits RawInline "html" <picture> elements with a <source srcset="…webp" type="image/webp"> and an <img> fallback; SVG, external URLs, and data: URIs pass through as plain <img>. Generated .webp files gitignored via static/**/*.webp / content/**/*.webp. See Implementation Notes.
  • Full-text semantic search: tools/embed.py also produces paragraph-level embeddings (same all-MiniLM-L6-v2 model, same pass): walks <p>/<li>/<blockquote> in #markdownBody, tracks nearest preceding heading for context, writes data/semantic-index.bin (raw Float32, N×384) + data/semantic-meta.json ([{url, title, heading, excerpt}]). Client: static/js/semantic-search.js dynamically imports @xenova/transformers@2 from CDN, embeds the query, brute-force cosine-ranks all paragraph vectors (fast at <5k paragraphs in JS), renders top-8 results as title + section heading + excerpt. Surfaced on /search.html as a Keyword / Semantic tab strip; active tab persists in localStorage (keyword default on first visit). Both tabs on same page; semantic-search.js loaded within existing $if(search)$ block. See Implementation Notes.
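
The memento mori week grid listed above can be sketched as date arithmetic. This is a Python illustration only, not the page's JavaScript: the function names are hypothetical, and the 29 February handling (rolling to 1 March) is an assumption chosen to mirror how setFullYear overflows.

```python
from datetime import date, timedelta

YEARS, WEEKS = 90, 52  # grid dimensions from the spec


def anniversary(birth: date, n: int) -> date:
    """Birthday n years after birth (29 Feb rolls to 1 Mar: an assumption)."""
    try:
        return birth.replace(year=birth.year + n)
    except ValueError:  # 29 Feb in a non-leap target year
        return date(birth.year + n, 3, 1)


def week_cell(birth: date, year: int, week: int):
    """Inclusive (start, end) dates for the 0-based cell (year, week)."""
    start = anniversary(birth, year) + timedelta(days=7 * week)
    if week == WEEKS - 1:
        # Last week of the row is stretched to the eve of the next birthday,
        # absorbing the 365th (and, in leap years, 366th) day.
        end = anniversary(birth, year + 1) - timedelta(days=1)
    else:
        end = start + timedelta(days=6)
    return start, end
```

Under these assumptions every row starts exactly on a birthday, so the grid never drifts against the calendar; only the final cell of each row is 8 or 9 days long.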

VI. Implementation Notes

sidenotes.js — Written from scratch

The spec called for adopting Said Achmiz's sidenotes.js directly. Instead a purpose-built version was written for the <span class="sidenote"> structure produced by Filters.Sidenotes. Features: JS collision avoidance (positionSidenotes), bidirectional hover highlight, click-to-focus (sticky highlight on wide viewport, anchor scroll fallback on narrow), document-click dismissal. window.resize is used as the reposition signal; collapse.js dispatches it after each section transition.

  • Exhibits (.exhibit--equation, .exhibit--proof): always-visible inline blocks with overlay zoom on click.
  • Annotations (.annotation--static, .annotation--collapsible): editorial callout boxes.
  • TOC integration: exhibits are listed under their parent heading.
  • Implementation: gallery.js, gallery.css; Pandoc fenced-div syntax (:::) to avoid the 4-space code block trap.

LaTeX Math — Client-side KaTeX

The spec described pure build-time SSR. In practice: Pandoc outputs class="math" spans, KaTeX renders client-side from a deferred script. Fully static (no server per request). Revisit if build-time SSR becomes important.

Citation pipeline — key subtleties

  1. Cite nodes, not Span nodes. processCitations with class="in-text" CSL does not convert Cite nodes to Span class="citation" nodes in the Pandoc AST — it only populates their inline content and creates the refs div. The HTML writer wraps them in <span class="citation"> at write time. Our Citations.hs must match Cite nodes directly.
  2. Hakyll strips YAML frontmatter. Hakyll reads frontmatter separately; the body passed to readPandocWith has no YAML block, so Pandoc Meta is empty. further-reading keys are read from Hakyll's metadata API (lookupStringList) in Compilers.hs and passed explicitly to Citations.applyCitations.
  3. nocite format. Each further-reading key must be a separate Cite node with AuthorInText mode and non-empty fallback content — matching what pandoc produces from "@key1 @key2" in YAML. A single Cite node with multiple citations is not recognized by citeproc's nocite processing.
  4. collectCiteOrder queries blocks only, not the full Pandoc (which includes metadata). Querying metadata would pick up the injected nocite Cite nodes and incorrectly classify further-reading entries as inline citations.

Link icons are implemented via data-link-icon and data-link-icon-type="svg" attributes set by Filters.Links. CSS uses mask-image: url(...) with background-color: currentColor so icons inherit the text color and work in dark mode. Icons are stored in static/images/link-icons/ as SVG files.


Tags — Hierarchical, no namespace

Tags are slash-separated (research/mathematics). A tag is auto-expanded into all ancestor prefixes so that /research/ aggregates all research/* content. Tag pages live directly at /<tag>/ with no /tags/ namespace.
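
The ancestor expansion can be illustrated with a short Python sketch. The real logic lives in the Haskell build; expand_tag and expand_tags are hypothetical names used only for this example.

```python
def expand_tag(tag: str) -> list:
    """Return the tag plus every ancestor prefix, shortest first."""
    parts = [p for p in tag.strip("/").split("/") if p]
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def expand_tags(tags) -> list:
    """Expand a list of tags, dropping duplicates but keeping first-seen order."""
    seen = {}
    for tag in tags:
        for ancestor in expand_tag(tag):
            seen.setdefault(ancestor, None)
    return list(seen)
```

For example, expand_tag("research/mathematics") yields ["research", "research/mathematics"], which is why an item tagged only with the leaf still appears on /research/.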

Collapsible sections

collapse.js wraps each h2/h3's following siblings in a .section-body div and injects a .section-toggle button into the heading. State is persisted per heading in localStorage under section-collapsed:<id>. After each transitionend, dispatches window.resize to retrigger sidenote positioning. Headings themselves are never hidden, preserving IntersectionObserver targets for toc.js.

Atom feed

/feed.xml covers all essays and blog posts (up to 30 most recent). A "content" snapshot is saved in Site.hs before template application, so the feed body is just the compiled article HTML (not the full page with nav/footer). Dates from the date frontmatter key, formatted as RFC 3339.

Author system

Authors are treated as a second tag dimension. build/Authors.hs provides buildAllAuthors (a buildTagsWith call keyed to authors frontmatter) and authorLinksField (a listFieldWith context that defaults to ["Levi Neuwirth"] when no authors key is present, so all unattributed content contributes to his author page). Author pages live at /authors/{slug}/. slugify lowercases and hyphenates; pipe-separated values ("Name | role") strip the role portion via nameOf.
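
A rough Python sketch of the name handling described above. The spec only says that slugify "lowercases and hyphenates"; collapsing every run of non-alphanumeric ASCII characters into one hyphen is an assumption, and name_of / author_slugs are illustrative stand-ins for the Haskell functions.

```python
import re

DEFAULT_AUTHORS = ["Levi Neuwirth"]  # fallback when no authors key is present


def name_of(entry: str) -> str:
    """Strip the role from a pipe-separated "Name | role" value."""
    return entry.split("|", 1)[0].strip()


def slugify(name: str) -> str:
    """Lowercase and hyphenate (exact character rules are assumed here)."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def author_slugs(authors=None) -> list:
    """Slugs for an item's authors, defaulting when the list is missing or empty."""
    return [slugify(name_of(a)) for a in (authors or DEFAULT_AUTHORS)]
```

With these definitions author_slugs(["Ada Lovelace | editor"]) gives ["ada-lovelace"], so the page would live at /authors/ada-lovelace/.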

Settings panel

settings.js manages four independent settings, all persisted in localStorage:

  • Theme (data-theme on <html>): light / dark, with syncThemeButtons() toggling .is-active.
  • Text size: three steps [20, 23, 26] px (small / default / large), written as --text-size CSS custom property on <html>. Default index is 1 (23 px).
  • Focus mode (data-focus-mode on <html>): hides TOC, fades header to 7% opacity until hover.
  • Reduce motion (data-reduce-motion on <html>): collapses all animation-duration / transition-duration to 0.01ms.

theme.js (sync, not deferred) restores all four attributes from localStorage before first paint to avoid flash.

Selection popup

selection-popup.js / selection-popup.css. A toolbar appears 450 ms after any text selection of ≥ 2 characters. Context is detected from the DOM ancestors of the selection range:

| Context | Detection | Buttons |
|---|---|---|
| code (known lang) | closest('pre, code, .sourceCode') + language-* class | Copy · <MDN / Hoogle / Docs…> |
| code (unknown) | same, no language-* | Copy |
| math | closest('.math, .katex') + Range.intersectsNode fallback | Copy · nLab · OEIS · Wolfram |
| prose (multi-word) | fallback | BibTeX · Copy · DuckDuckGo · Here · Wikipedia |
| prose (single word) | !/\s/.test(text) | BibTeX · Copy · Define · DuckDuckGo · Here · Wikipedia |

16 languages are mapped to documentation providers (MDN, Hoogle, docs.python.org, doc.rust-lang.org, etc.) via DOC_PROVIDERS. BibTeX generates a @online{...} BibLaTeX entry (key = lastname + year + firstWord; selected text in note={\enquote{...}}; year scraped from #version-history li). Here opens /search.html?q= in a new tab. Define opens English Wiktionary. Popup positions above the selection, flips below if insufficient space; hides on scroll, outside mousedown, or Escape.

Reading mode (poetry + fiction)

Shared codex layout for creative content. templates/reading.html omits the TOC and emits a <div id="reading-progress"> progress bar instead. body.reading-mode (set via $if(reading)$ in default.html) triggers a slightly warmer background (#fdf9f1 / #1c1917). Poetry pages (body.reading-mode.poetry) use a 52ch measure, 1.85 line-height, stanza paragraph spacing, and suppressed dropcap/smallcaps lead-in; poetryCompiler enables Ext_hard_line_breaks so each source newline becomes <br>. Fiction pages (body.reading-mode.fiction) use a 62ch measure, centered Fira Sans smallcaps chapter headings, and a dropcap + smallcaps lead-in on each h2 + p. Progress bar is driven by reading.js (scroll position → width on #reading-progress). CSS and JS loaded conditionally via $if(reading)$. Content goes in content/poetry/*.md and content/fiction/*.md; tags poetry / fiction route items to the correct portal and library section.

Score fragment system (option A)

Filters/Score.hs walks the Pandoc AST for Div nodes with class score-fragment. It reads the referenced SVG from disk (path resolved relative to the source file's directory via getResourceFilePath + takeDirectory), replaces hardcoded black fill/stroke values with currentColor (6-digit before 3-digit to prevent partial matches on #000 vs #000000), and emits a RawBlock "html" <figure> carrying class="score-fragment exhibit", data-exhibit-name, and data-exhibit-type="score" for gallery.js TOC integration. SVGs are inlined at build time and never served as separate files. gallery.js discovers .score-fragment elements via discoverFocusableScores, adds them to the shared focusables[] array with type: 'score', and the overlay's renderOverlay branches on type — score path clones the SVG into the overlay body (no font-size loop); math path keeps the KaTeX re-render. Overlay body receives class is-score for tighter horizontal padding (2rem 1.5rem vs 3.5rem 4.5rem). CSS: background rect removed via svg > rect:first-child { fill: none }, SVG responsive via width: 100%; height: auto, dark mode via color: var(--text).

Authoring syntax:

::: {.score-fragment score-name="Main Theme, mm. 18" score-caption="The opening gesture."}
![](scores/main-theme.svg)
:::

Music — Composition landing pages + full score reader (option C)

Implemented. Two URLs per composition from one source directory.

Architecture

| URL | Templates | Purpose |
|---|---|---|
| /music/{slug}/ | composition.html + default.html | Rich prose landing page |
| /music/{slug}/score/ | score-reader.html + score-reader-default.html | Minimal page-turn reader |

The Hakyll version "score-reader" mechanism compiles the same index.md twice: once as the landing page (default version) and once as the reader (customRoute to music/{slug}/score/index.html). Score reader uses makeItem "" — the prose body is irrelevant; only frontmatter fields are needed.

Source directory layout

content/music/symphonic-dances/
├── index.md              ← composition frontmatter + program notes prose
├── scores/
│   ├── page-01.svg       ← one file per score page (LilyPond SVG output)
│   ├── page-02.svg
│   └── symphonic-dances.pdf
└── audio/
    ├── movement-1.mp3
    └── movement-2.mp3

SVG, MP3, and PDF files are copied to _site/ via copyFileCompiler. Score reader SVGs are served as separate <img> files — inlining a full orchestral score is impractical.

Frontmatter schema

---
title: "Symphonic Dances with Claude"
date: 2026-03-01
abstract: >
  A five-movement work for orchestra.  
tags: [music]
instrumentation: "orchestra (2+picc.2+ca.2+bcl.2 — 4.3.3.1 — timp+3perc — hp — str)"
duration: "ca. 24'"
premiere: "2026-05-01"
commissioned-by: "—"          # optional
pdf: scores/symphonic-dances.pdf   # optional; path relative to composition dir
score-pages:                  # required for reader; landing page works without it
  - scores/page-01.svg
  - scores/page-02.svg
movements:                    # optional; omit entirely if no movement structure
  - name: "I. Allegro con brio"
    page: 1                   # 1-indexed starting page in the reader
    duration: "8'"
    audio: audio/movement-1.mp3   # optional; omit if no recording
  - name: "II. Adagio cantabile"
    page: 8
    duration: "10'"
---

compositionCtx fields

Extends essayCtx (all essay fields available — abstract, toc, word-count, etc.). Additional fields:

| Field | Type | Value |
|---|---|---|
| $slug$ | string | takeFileName . takeDirectory of source path |
| $score-url$ | string | /music/{slug}/score/ |
| $has-score$ | boolean | present when score-pages non-empty |
| $score-page-count$ | string | show (length score-pages) |
| $score-pages$ | list | each item: $score-page-url$ (absolute URL) |
| $has-movements$ | boolean | present when movements non-empty |
| $movements$ | list | each item: $movement-name$, $movement-page$, $movement-duration$, $movement-audio$, $has-audio$ |
| $composition$ | flag | "true"; gates score-reader.css in head.html |

movements is parsed from the nested YAML using Data.Aeson.KeyMap (Aeson 2.x API). score-pages are resolved to absolute URLs (/music/{slug}/{path}) inside the context so the data-pages attribute in the score reader template needs no further processing.

Score reader

The reader template embeds page URLs as a comma-separated data-pages attribute on #score-reader-stage. score-reader.js splits on commas and filters empties.

score-reader-default.html loads only: base.css, components.css (for settings panel styles), score-reader.css, theme.js (sync, pre-paint), settings.js (theme toggle in the top bar), score-reader.js. No nav, no TOC, no sidenotes, no popups, no gallery, no lightbox.

score-reader.js behaviors:

  • navigate(page): swaps <img src>, updates counter, toggles prev/next disabled states, updates active movement button (last movement whose start page ≤ current page), calls history.replaceState for ?p= deep linking, preloads ±1 pages.
  • Keyboard: ArrowLeft/ArrowRight/ArrowUp/ArrowDown for page turns; Escape → history.back(). Suppressed when settings panel is open.
  • Dark mode: [data-theme="dark"] .score-page { filter: invert(1); } — clean for pure B&W notation; revisit if LilyPond embeds colored elements.
  • Mobile: score scrolls horizontally at ≤ 640px (min-width: 600px on <img>); arrow buttons hidden; pinch-to-zoom is native.
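
Two pieces of the reader logic above are easy to misread, so here is a minimal sketch of them in Python. The reader itself is JavaScript; parse_pages and active_movement are hypothetical names, and movement start pages are assumed to be listed in ascending order as in the frontmatter example.

```python
def parse_pages(data_pages: str) -> list:
    """Split the data-pages attribute on commas and drop empty entries."""
    return [p for p in data_pages.split(",") if p]


def active_movement(start_pages, page: int):
    """Index of the last movement whose start page <= page, or None if none has started."""
    active = None
    for i, start in enumerate(start_pages):
        if start <= page:
            active = i
    return active
```

With movements starting on pages 1 and 8, pages 1 through 7 highlight the first movement and page 8 onward highlights the second.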

Known limitations / future work

  • Full-piece audio: a recording frontmatter key for a complete performance would add a top-level audio player on the landing page. Not yet implemented.
  • LilyPond margin cropping: the viewBox drives scaling but LilyPond's default page includes margins. May need per-composition viewBox overrides or CSS object-fit once real scores are tested.

Backlinks are implemented in build/Backlinks.hs. The fundamental challenge: backlinks for page A require knowing which other pages link to A, but those pages haven't been compiled yet when A is compiled. Solved with a two-version architecture:

  1. Pass 1 (version "links"): each content file is compiled lightly — wikilinks preprocessed, Markdown parsed, AST walked block-by-block. For every internal link, the URL and the HTML of its surrounding Para block are recorded as a LinkEntry { leUrl, leContext }. Context rendered via runPure (writeHtml5String opts (Pandoc nullMeta [Plain inlines])) with writerTemplate = Nothing (fragment only). Result serialised as JSON per page.

  2. Pass 2 (create ["data/backlinks.json"]): loads all version "links" items, inverts the map (target → [source]), resolves each source's title and abstract from its metadata, emits data/backlinks.json.

  3. Context (backlinksField): loads data/backlinks.json via load, looks up the current page's normalised URL, renders <ul> with <details>-collapsible context per entry.

Key implementation details:

  • All loadAll / loadAllSnapshots / buildTagsWith / buildPaginateWith calls use .&&. hasNoVersion to prevent "links" version items from being picked up alongside default versions.
  • isPageLink filters out http://, https://, #-anchors, mailto:, tel:, and static-asset extensions (.pdf, .svg, .mp3, etc.).
  • JSON encoding uses TL.unpack . TLE.decodeUtf8 . Aeson.encode (not LBSC.unpack) to preserve non-ASCII characters in context paragraphs.
  • Decoding uses Aeson.decodeStrict (TE.encodeUtf8 (T.pack s)) symmetrically.
  • popups.js excludes .backlink-source links from the internal-preview popup (same exception pattern as .meta-authors).
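
The inversion step in Pass 2 can be sketched in a few lines of Python. This is an illustration of the data flow, not the Haskell code: the "url" and "context" keys stand in for leUrl and leContext, the extension list covers only the three extensions named above (the spec's list continues), and URL normalisation is left out.

```python
from collections import defaultdict

NON_PAGE_PREFIXES = ("http://", "https://", "#", "mailto:", "tel:")
STATIC_EXTS = (".pdf", ".svg", ".mp3")  # partial list; the spec says "etc."


def is_page_link(url: str) -> bool:
    """True for internal links that point at another page of the site."""
    lowered = url.lower()
    if lowered.startswith(NON_PAGE_PREFIXES):
        return False
    path = lowered.split("#", 1)[0].split("?", 1)[0]
    return not path.endswith(STATIC_EXTS)


def invert(links_by_source: dict) -> dict:
    """Turn {source: [link entries]} into {target: [backlink entries]}."""
    backlinks = defaultdict(list)
    for source, entries in links_by_source.items():
        for entry in entries:
            if is_page_link(entry["url"]):
                backlinks[entry["url"]].append(
                    {"source": source, "context": entry["context"]}
                )
    return dict(backlinks)
```

Because the inversion reads only the lightweight "links" items, it never needs the fully compiled pages, which is what breaks the dependency cycle.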

Epistemic Profile

Implemented across build/Stability.hs, build/Contexts.hs, templates/partials/page-footer.html, templates/partials/metadata.html, and static/css/components.css.

Context fields provided by epistemicCtx (included in essayCtx):

| Field | Source | Notes |
|---|---|---|
| $status$ | frontmatter status | via defaultContext |
| $confidence$ | frontmatter confidence | via defaultContext |
| $importance-dots$ | frontmatter importance (1–5) | ●●●○○ rendered in Haskell |
| $evidence-dots$ | frontmatter evidence (1–5) | same |
| $confidence-trend$ | frontmatter confidence-history list | ↑ / ↓ / → from last two entries |
| $stability$ | auto-computed via git log --follow | always resolves; never fails |
| $last-reviewed$ | most recent commit date | formatted "%-d %B %Y"; noResult if no commits |
| $scope$, $novelty$, $practicality$ | frontmatter | via defaultContext |
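
The dot and trend fields are simple enough to sketch. The following Python is illustrative only (the rendering is done in Haskell); it assumes a five-point scale, numeric confidence-history entries, and no trend when fewer than two entries exist.

```python
def dots(score: int, total: int = 5) -> str:
    """Render a score as filled and empty dots, clamped to the scale."""
    filled = max(0, min(total, score))
    return "●" * filled + "○" * (total - filled)


def trend(history):
    """Arrow comparing the last two history entries; None if there are fewer than two."""
    if len(history) < 2:
        return None
    previous, latest = history[-2], history[-1]
    if latest > previous:
        return "↑"
    if latest < previous:
        return "↓"
    return "→"
```

So an importance of 3 renders as ●●●○○, and a confidence-history ending in 60, 75 renders as ↑.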

Stability auto-calculation (build/Stability.hs):

  • Runs git log --follow --format=%ad --date=short -- <filepath> via readProcessWithExitCode.
  • Heuristic: ≤ 1 commit or age < 14 days → volatile; ≤ 5 commits and age < 90 days → revising; ≤ 15 commits or age < 365 days → fairly stable; ≤ 30 commits or age < 730 days → stable; otherwise → established.
  • IGNORE.txt: paths listed here use frontmatter stability/last-reviewed verbatim. Cleared by > IGNORE.txt in the Makefile's build target (one-shot pins).
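
Read as an ordered chain of rules, the heuristic above looks like the following Python sketch. Two things are assumptions rather than statements from the spec: that the first matching rule wins, and that "age" means days since the oldest commit. The actual implementation is the Haskell in build/Stability.hs.

```python
from datetime import date


def summarise(log_output: str, today: date):
    """(commit count, age in days) from `git log --format=%ad --date=short` output."""
    dates = [date.fromisoformat(line) for line in log_output.split()]
    if not dates:
        return 0, 0
    return len(dates), (today - min(dates)).days  # age from oldest commit (assumed)


def stability(commits: int, age_days: int) -> str:
    """Apply the thresholds in order; the first matching rule wins (assumed)."""
    if commits <= 1 or age_days < 14:
        return "volatile"
    if commits <= 5 and age_days < 90:
        return "revising"
    if commits <= 15 or age_days < 365:
        return "fairly stable"
    if commits <= 30 or age_days < 730:
        return "stable"
    return "established"
```

Note that the second rule uses "and" where the others use "or", so a page with 4 commits that is 200 days old skips "revising" and lands on "fairly stable".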

Critical implementation note: Fields that use unsafeCompiler must return Maybe from the IO block and call fail in the Compiler monad afterward — not inside the IO action. Calling fail inside unsafeCompiler's IO block throws an IOError that Hakyll's $if()$ template evaluation does not catch as NoResult, causing the entire item compilation to error silently.

Visualization pipeline — Pandoc IO filter approach

build/Filters/Viz.hs walks the AST for Div blocks with class figure or visualization.

Static figures (.figure): reads the script attribute, runs python3 <script> with the source file's directory as cwd, captures stdout as SVG. Replaces hardcoded #000000/black fills/strokes with currentColor (same trick as Filters.Score). Wraps in <figure class="viz-figure">. Script is expected to import tools/viz_theme and call save_svg() which writes to stdout.

Interactive figures (.visualization): runs the script, expects Vega-Lite JSON on stdout. Embeds as <script type="application/json" class="vega-spec"> inside a .vega-container div. viz.js finds all .vega-spec scripts, stores parsed spec on container._vegaSpec, calls vegaEmbed. Always applies the site's monochrome Vega config, ignoring the spec's own config. MutationObserver on document.documentElement[data-theme] triggers reRenderAll() on theme change.

Frontmatter: viz: true gates CDN loading of Vega/Vega-Lite/Vega-Embed and viz.js in head.html via $if(viz)$.

tools/viz_theme.py: apply_monochrome() sets matplotlib rcParams (transparent backgrounds, black lines); save_svg(fig) writes SVG to stdout via io.StringIO; LINESTYLE_CYCLE provides dash-pattern sequences for multi-series charts (no color distinction needed).

Authoring syntax:

::: {.figure script="figures/plot.py" caption="Caption text"}
:::

::: {.visualization script="figures/chart.py" caption="Caption text"}
:::

GPG signing — dedicated subkey + preset passphrase

Key architecture: master certifying key in ~/.gnupg (passphrase-protected, used for email). Dedicated signing keyring at ~/.gnupg-signing/ holds: sec# (master stub, no secret) + ssb Ed25519 signing subkey (with secret). Correct isolation: gpg --export-secret-subkeys "FINGERPRINT!" exports only the subkey secret.

Passphrase caching: GPG 2.4's passwd in --edit-key requires the master secret to be present — it cannot change a subkey passphrase in a stub+subkey-only keyring. Instead, gpg-preset-passphrase (/usr/lib/gnupg/gpg-preset-passphrase) caches the passphrase by keygrip directly in the agent. ~/.gnupg-signing/gpg-agent.conf sets allow-preset-passphrase and max-cache-ttl 86400. tools/preset-signing-passphrase.sh prompts via the terminal, calls gpg-preset-passphrase --preset <keygrip>. Must be run once per boot (or when the 24h cache expires).

Popup preview: popups.js sigContent provider fetches the .sig URL (same-origin), renders the ASCII armor in a <pre> inside a .popup-sig div. Bound to a.footer-sig-link explicitly in bindTargets, bypassing the footer-exclusion guard on internal links. Result cached in the shared cache map.

nginx: .sig files need no special handling — they're served as static files alongside .html. The try_files directive handles $uri directly.

Model unification: Both similar-links (page-level) and semantic search (paragraph-level) use all-MiniLM-L6-v2 (384 dims). This is a deliberate simplification: the same model runs at build time (Python/sentence-transformers) and query time (browser/transformers.js Xenova/all-MiniLM-L6-v2 quantized), guaranteeing that query vectors and corpus vectors are in the same embedding space.

Build-time (tools/embed.py): One HTML parse pass per file extracts both the full-page text (for similar-links) and individual paragraphs (for semantic search). Model is loaded once and both encoding jobs run sequentially. Outputs: data/similar-links.json, data/semantic-index.bin (raw float32, shape [N_paragraphs, 384]), data/semantic-meta.json. All three are gitignored (generated). Staleness check: skips the entire run if all three outputs are newer than all _site/ HTML.
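
The staleness rule can be expressed as a comparison of modification times. This is a minimal sketch rather than the code in tools/embed.py; the function name is hypothetical, and treating a missing output as stale and an empty site as fresh are both assumptions.

```python
from pathlib import Path


def is_stale(outputs, site_dir) -> bool:
    """True unless every output exists and is newer than every HTML file in site_dir."""
    outs = [Path(p) for p in outputs]
    if not all(p.exists() for p in outs):
        return True  # a missing output always forces a rebuild (assumed)
    oldest_output = min(p.stat().st_mtime for p in outs)
    newest_html = max(
        (p.stat().st_mtime for p in Path(site_dir).rglob("*.html")),
        default=None,
    )
    if newest_html is None:
        return False  # nothing to embed (assumed)
    return newest_html >= oldest_output
```

Comparing the oldest output against the newest page means one rebuilt page, or one deleted output, is enough to trigger a full re-embedding run.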

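Binary index format: para_vecs.tobytes() writes a flat float32 array in the array's own byte order, which is little-endian on x86 and ARM build hosts. In JS: new Float32Array(arrayBuffer), which reads in the platform's byte order (also little-endian on those platforms). No header, no framing: row i starts at byte offset i × 384 × 4. This is the simplest possible format and avoids a numpy/npy parser in the browser.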
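
To make the offset arithmetic concrete, here is a small standard-library sketch that writes and reads the same layout without numpy. The helper names are illustrative; the explicit "<" in the struct format pins the byte order to little-endian.

```python
import struct

DIMS = 384
ROW_BYTES = DIMS * 4  # four bytes per float32
ROW_FORMAT = f"<{DIMS}f"  # little-endian float32 row


def pack_rows(rows) -> bytes:
    """Concatenate rows with no header and no framing."""
    return b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)


def row_count(buf: bytes) -> int:
    """Number of rows, inferred purely from the buffer length."""
    if len(buf) % ROW_BYTES:
        raise ValueError("buffer length is not a multiple of the row size")
    return len(buf) // ROW_BYTES


def read_row(buf: bytes, i: int) -> list:
    """Row i starts at byte offset i * 384 * 4."""
    return list(struct.unpack_from(ROW_FORMAT, buf, i * ROW_BYTES))
```

Since the file carries no shape information, the row count has to come from the file size or from the length of the metadata array, and the two should agree.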

Client-side search (semantic-search.js): Dynamically imports @xenova/transformers@2 from jsDelivr CDN on first query (lazy — no load cost on pages that never use semantic search). Fetches binary index + metadata JSON (also lazy, browser-cached). Brute-force dot product over a Float32Array in a tight JS loop — fast enough at <5k paragraphs; revisit with a WASM FAISS binding if the corpus grows beyond ~20k paragraphs. Vectors are unit-normalised at build time, so dot product = cosine similarity.
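
The brute-force ranking can be sketched as follows. It is written in Python for illustration (the client code is JavaScript), top_k and normalise are hypothetical names, and the index is modelled as one flat sequence of floats to match the binary layout.

```python
import heapq
import math


def normalise(vec) -> list:
    """Scale a vector to unit length so that dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, index, dims: int, k: int = 8) -> list:
    """Return the k best (score, row) pairs by dot product over a flat index."""
    scores = []
    for row in range(len(index) // dims):
        base = row * dims
        score = sum(query[j] * index[base + j] for j in range(dims))
        scores.append((score, row))
    return heapq.nlargest(k, scores)
```

The cost is one multiply-add per stored float, so it grows linearly with the number of paragraphs; that is the reason the text suggests revisiting the approach as the corpus grows.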

Tab default + localStorage: Keyword (Pagefind) is the default on first visit — zero cold-start. User's last-used tab is stored under search-tab in localStorage and restored on load, so returning users who prefer semantic always land there. If localStorage is unavailable (private browsing restrictions), falls back silently to keyword.

Self-hosted model: semantic-search.js sets mod.env.localModelPath = '/models/' and mod.env.allowRemoteModels = false immediately after the CDN import. transformers.js then resolves model files as GET /models/all-MiniLM-L6-v2/{file}, which are same-origin. make download-model (= tools/download-model.sh) fetches 5 files from HuggingFace: config.json, tokenizer.json, tokenizer_config.json, special_tokens_map.json, onnx/model_quantized.onnx. The files live in static/models/ (gitignored) and are copied to _site/models/ by the existing static/** Hakyll rule.

CSP: script-src requires https://cdn.jsdelivr.net for the transformers.js library import. connect-src stays 'self' — all model weight fetches are same-origin after allowRemoteModels = false. The binary index and meta JSON are also same-origin.

Responsive images — WebP <picture> wrapping

build/Filters/Images.hs inspects each Image inline's src. If it is a local raster (not starting with http://, https://, //, or data:; extension .jpg/.jpeg/.png/.gif), it emits RawInline (Format "html") containing a <picture> element:

<picture>
  <source srcset="/images/foo.webp" type="image/webp">
  <img src="/images/foo.jpg" alt="…" loading="lazy" data-lightbox="true">
</picture>

The WebP srcset is computed at build time by System.FilePath.replaceExtension. No IO is needed in the filter: the <source> is always emitted. Note that browsers choose a <source> by its type attribute and do not retry the <img> fallback when the chosen URL returns 404, so the .webp companion must actually exist in _site/ for every wrapped image. SVG, external URLs, and data: URIs remain plain <img> tags. Images inside <a> links get no data-lightbox marker (same as before).
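
The classification and path rewrite can be illustrated in Python. This is a sketch of the rule, not the Haskell filter: webp_source and picture_html are hypothetical names, and giving the plain <img> case the same attributes as the fallback is an assumption.

```python
import posixpath
from html import escape

REMOTE_PREFIXES = ("http://", "https://", "//", "data:")
RASTER_EXTS = {".jpg", ".jpeg", ".png", ".gif"}


def webp_source(src: str):
    """Path of the .webp companion for a local raster image, else None."""
    if src.startswith(REMOTE_PREFIXES):
        return None
    root, ext = posixpath.splitext(src)
    if ext.lower() not in RASTER_EXTS:
        return None
    return root + ".webp"


def picture_html(src: str, alt: str, lightbox: bool = True) -> str:
    """Wrap local rasters in <picture>; everything else stays a plain <img>."""
    marker = ' data-lightbox="true"' if lightbox else ""
    img = f'<img src="{escape(src)}" alt="{escape(alt)}" loading="lazy"{marker}>'
    webp = webp_source(src)
    if webp is None:
        return img
    return (
        "<picture>\n"
        f'  <source srcset="{escape(webp)}" type="image/webp">\n'
        f"  {img}\n"
        "</picture>"
    )
```

The rewrite is a pure string operation, which is why the filter needs no IO; it also means the filter cannot know whether the companion file was actually generated.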

tools/convert-images.sh performs the actual conversion: cwebp -q 85 per file. Runs before Hakyll in make build and is also available as a standalone make convert-images target. Exits 0 with a notice if cwebp is not installed, so the build never fails on machines without libwebp. Generated .webp files are gitignored; git add -f to commit an authored WebP.

Quality note: -q 85 is a good default for photographic images. For pixel-art, diagrams, or images that are already highly compressed, -lossless or a higher quality setting may be appropriate (edit the script).

Annotations (infrastructure only)

annotations.js stores annotations as JSON in localStorage under site-annotations, scoped per location.pathname. On DOMContentLoaded, applyAll() re-anchors saved annotations via a TreeWalker text-stream search (concatenates all text nodes in #markdownBody, finds exact match by index, builds a Range, wraps with <mark>). Cross-element ranges use extractContents() + insertNode() fallback. Four highlight colors (amber / sage / steel / rose) defined in annotations.css as rgba overlays with box-decoration-break: clone. Hover tooltip shows note, date, and delete button. Public API: window.Annotations.add(text, color, note) / .remove(id). The selection-popup "Annotate" button is currently removed pending a UI revision.
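
The text-stream search used for re-anchoring can be sketched without a DOM. In this Python illustration a list of strings stands in for the text nodes of #markdownBody, and locate is a hypothetical name; it returns (node index, offset) pairs that would become the start and end of a Range.

```python
def locate(text_nodes, needle: str):
    """Find needle in the concatenated text; return ((node, offset), (node, offset)) or None."""
    if not needle:
        return None
    start = "".join(text_nodes).find(needle)
    if start < 0:
        return None
    end = start + len(needle)  # exclusive end position in the stream
    start_point = end_point = None
    consumed = 0
    for i, node in enumerate(text_nodes):
        node_end = consumed + len(node)
        if start_point is None and start < node_end:
            start_point = (i, start - consumed)
        if end <= node_end:
            end_point = (i, end - consumed)
            break
        consumed = node_end
    return start_point, end_point
```

When the two points fall in different nodes the match crosses element boundaries, which is the case that needs the extractContents() + insertNode() fallback.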


VII. Reference: Inspirations

  • gwern.net — Primary model (Gwern Branwen + Said Achmiz). Semantic zoom, sidenotes, popups, monochrome, Pandoc+Hakyll.
  • Edward Tufte — Sidenotes, information design
  • Matthew Butterick's Practical Typography — Web typography in practice
  • Traditional book design — The standard to aspire to on screen

This specification is a living document updated as implementation progresses.