Compare commits
23 Commits
620b974d3f
...
5d344f940e
| Author | SHA1 | Date |
|---|---|---|
| | 5d344f940e | |
| | 23bc2d0dc1 | |
| | 9f61ce5949 | |
| | 56afdb867a | |
| | f254ce866e | |
| | c8eeaaa9bc | |
| | 945086421a | |
| | b2951c0c2c | |
| | aeb2937f7c | |
| | 8ca22a45d2 | |
| | 4e28c82e4c | |
| | 8040be1aee | |
| | caa113e036 | |
| | c17c203747 | |
| | c68d03af31 | |
| | 902e43ea19 | |
| | f11495ff9a | |
| | c64f3d63c0 | |
| | 7ca937d98c | |
| | 70ad44e9f4 | |
| | 7c5354efa7 | |
| | 37665f67db | |
| | a7b3b9cd07 | |

@@ -10,6 +10,9 @@ _cache/
**/.env
**/.env.*
**/*.env
# .env.example is documentation (tracked), not a credential file — the
# patterns above would otherwise shadow it for status/add purposes.
!.env.example
**/*.key
**/*.pem
**/*.p12

@@ -73,6 +76,9 @@ data/build-stamp.txt
data/last-build-seconds.txt
data/semantic-index.bin
data/semantic-meta.json
# Both embed caches (pages + paragraphs); the trailing glob also
# catches interrupted-write debris (.tmp / .tmp.npz)
data/embed-cache-*

# Archive: generated text + its staleness stamp (recreated from the
# committed artifact on every build — deterministic, so committing them is

@@ -0,0 +1,931 @@
---
title: Repository audit
date: 2026-06-09
---

# Repository audit — levineuwirth.org (2026-06-09)

Comprehensive audit of the repo on `main` at commit `620b974` (working tree
modified: branding refresh across `static/` + `templates/partials/`, plus
`tools/embed.py` rework; untracked `static/og-image.png`,
`templates/partials/logo-mark.svg`, `data/embed-cache-pages.npz.tmp.npz`).

Severity legend: **HIGH** (likely to break a build, cause data loss, or
expose a security weakness) — **MED** (latent bug, brittleness, or
documentation drift) — **LOW** (minor robustness gap or fragile assumption) —
**NIT** (style, polish, or paranoia).

Numbers are file:line against the working tree at audit time. Findings
marked "verified" were reproduced empirically (solver runs, built `_site/`
output inspection, live HTTP checks, binary parsing); the rest were
confirmed by reading the code.

Prior audit: `AUDIT.md` (2026-05-07). Follow-up status in §10.

---

## 1. Build & dependency chain

### 1.1 `cabal.project.freeze` is unsolvable again — next clean build fails — **HIGH**

`cabal build --dry-run` fails today (verified): the freeze pins
`distributive ==0.6.2.1`, but the system (pacman) GHC package db has
`comonad-5.0.10` built against `distributive-0.6.3`:

```
rejecting: distributive-0.6.3/installed... (constraint from
cabal.project.freeze requires ==0.6.2.1)
After searching the rest of the dependency tree exhaustively...
```

The conflict set also names aeson, warp, hakyll, http2, semigroupoids. This
is the same failure mode as prior-audit §1.1 — that audit's specific aeson
pin was fixed (now 2.2.2.0/hashable 1.4.7.0), but a different package broke
the same way after a system update. Recent builds succeed only off the
cached `dist-newstyle/cache/plan.json`; the freeze file has since changed,
so the next cabal invocation re-solves and fails. Because `make deploy`
starts with `make clean`, the next deploy hits this. `levineuwirth.cabal`'s
own bounds are compatible with the freeze — the conflict is
freeze-vs-installed-db, not freeze-vs-cabal-file.

Fix: `tools/refreeze.sh` (written for exactly this post-`pacman -Syu`
situation). The underlying fragility — freezing against a mutable system
package db — remains; consider documenting the refreeze step as part of any
system-upgrade ritual. *(In progress at time of writing.)*

### 1.2 Missing `data/archive-index.json` / `archive-state.json` crashes the build — **HIGH**

`build/ArchiveIndex.hs:134-146`. The module doc (lines 18-22) promises "An
absent or malformed file degrades safely: an empty index makes the link
consumers no-op; an absent state file makes every entry @Live@." But
`rawIndex = unsafePerformIO $ do decoded <- A.eitherDecodeFileStrict' indexPath`
(and identically `rawState`) never checks `doesFileExist`, and aeson's
`eitherDecodeFileStrict'` throws an uncaught `IOException` on a missing
file (verified: `withBinaryFile: does not exist`). Both files are
gitignored (`.gitignore:84-85`), so a fresh clone or a no-`.venv` build —
the exact path `build/Archive.hs:20-24` promises to support — throws when
the CAF is first forced. Contrast `readUrlSet` (line 109) in the same file,
which guards correctly. Currently latent on this machine only because both
generated files happen to exist.

### 1.3 `embed.py` `trust_remote_code=True` executes unpinned third-party code — **HIGH**

`tools/embed.py:329` (line ~341 in the uncommitted version). The new
page-model load is
`SentenceTransformer(PAGE_MODEL_NAME, revision=PAGE_MODEL_REVISION, trust_remote_code=True)`.
The `revision` arg pins only the `nomic-ai/nomic-embed-text-v1.5` repo; the
actual modeling code is pulled via `auto_map` from a *different* repo —
verified in the local HF cache: the executed code lives under
`transformers_modules/nomic_hyphen_ai/nomic_hyphen_bert_hyphen_2048/...`,
i.e. `nomic-ai/nomic-bert-2048` at its current head, which nothing pins. A
compromise of that second repo runs arbitrary Python at build time, in a
repo whose every other download path (download-model.sh, pdfjs, leaflet) is
sha256-pinned. The comment "Both pins are deliberate" is therefore
misleading. Fix: pin via `code_revision`, or run with `HF_HUB_OFFLINE=1`
after first fetch, or document the accepted risk.

### 1.4 Working-tree commit hazard: tracked templates reference untracked files — **HIGH (process)**

`templates/partials/nav.html:5` (tracked, modified) adds
`$partial("templates/partials/logo-mark.svg")$` and
`templates/partials/head.html` references `/og-image.png` — both target
files are **untracked** (no git history). Committing the template diff
without `git add`-ing both breaks every page's Hakyll build on a fresh
clone (`$partial$` aborts compilation) and 404s the og:image. They must
land in the same commit. Conversely, `data/embed-cache-pages.npz.tmp.npz`
must **not** be committed (see §4.1). The partial itself is safe as a
Hakyll template (verified: zero `$` characters; `match "templates/**"`
compiles it).

### 1.5 `einops` dependency: undocumented, unbounded, imported nowhere — **LOW**

`pyproject.toml:27` adds `einops>=0.8.2`. No import anywhere in
`tools/`/`build/`/`static/js/`; its only consumer is nomic's
`trust_remote_code` module (§1.3). Every sibling dependency has an
explanatory comment and an upper bound per the file's own stated policy
("Upper bounds are intentionally generous (next major) but always
present"); einops has neither. `uv lock --check` passes (0.8.2 pinned).

---

## 2. Haskell build code — core

### 2.1 Nav, home grid, and library link `/fiction/` and `/poetry/` — confirmed 404s — **MED**

`build/Site.hs:50-60` (`homePortals` contains `("Fiction","fiction")`,
`("Poetry","poetry")`), `templates/partials/nav.html:56,61`,
`templates/library.html:44,58`. No rule generates either index: fiction and
poetry are not in `tagIndexable` (`build/Patterns.hs:148-151` = essays +
blog + photos) and Site.hs has no landing rule. Verified: `_site/fiction`
does not exist; `_site/poetry/` has no `index.html`. nginx has no
redirects. Both links 404 in production today.

### 2.2 Tag/route collisions guarded for `photography` only — **MED**

`build/Tags.hs:98-99`. `tagIdentifier` maps tag `t` → `t ++ "/index.html"`;
`sectionOwnedTopLevelTags = ["photography"]` is the only guard. A
tagIndexable item tagged `music` (or `music/x`, which expands to `music`)
emits `music/index.html`, already owned by the music index route
(`build/Site.hs:486-487`); similarly `essays`, `blog`, `cv`, `archive`,
`authors`, `bibliography`. Hakyll does not error on duplicate routes — one
silently overwrites the other.

### 2.3 Sidenotes filter destroys the documented no-JS fallback — **MED**

`build/Filters/Sidenotes.hs:30-36` vs `static/css/sidenotes.css:125-135`.
The module doc claims the Pandoc `<section class="footnotes">` "serves as
fallback," but `apply` replaces every `Note`, so the writer never emits the
section. CSS depends on it below 1500px. Verified in output:
`_site/essays/scaling_outage.html` has 3 `class="sidenote"` and zero
`footnotes` occurrences. With JS disabled, footnote content is invisible on
narrow viewports. The comment, the CSS, and ozymandias.md's own prose all
contradict actual behavior.

### 2.4 Sidenote bodies rendered without the KaTeX writer — **MED**

`build/Filters/Sidenotes.hs:103-115`. `inlinesToHtml`/`blocksToHtml` use
`writeHtml5String (def :: WriterOptions)` (PlainMath), while the main
pipeline uses `KaTeX ""` (`build/Compilers.hs:47`). Math inside a footnote
never gets `<span class="math inline">\(...\)</span>`, so KaTeX never
renders it — degrades to plain italics, silently inconsistent with body
math.

### 2.5 SourceRefs whitelist vs `/source/` serving whitelist have drifted — **MED**

`build/Filters/SourceRefs.hs:114-141` vs `build/Site.hs:217-240`. Site.hs:209
says "must stay aligned with 'isSourcePath'". Mismatches: SourceRefs wraps
`content/` and `yaml-source/` (no Site counterpart); `static/` + any known
ext vs Site's `static/js/**`/`static/css/**` only; `tools/` + any ext vs
Site's `tools/**.sh`/`tools/**.py`; `data/` at any depth vs Site's
top-level `data/*.{json,yaml,md,bib}`. Each mismatch yields a wrapped
source-ref whose popup fetch 404s (Forgejo href fallback still works).
Inverse: Site serves `data/*.bib` but `.bib` is missing from
`hasKnownExt` — dead whitelist entry.

### 2.6 `epistemicEntry` ignores `confidence: proved` — **MED**

`build/Site.hs:1014-1024`. Comment: "Compute overall-score the same way
Contexts.overallScoreField does," but it uses
`readMaybe =<< lookupString "confidence" meta`, which is `Nothing` for
`"proved"`/`"proven"`, whereas `Contexts.overallScoreField`
(`build/Contexts.hs:574-576`) substitutes 100 via `isProvedConfidence`.
Proved pages get no `score` in `data/epistemic-meta.json` and export the
raw string under `confidence`, so client-side filtering silently misses
them.

### 2.7 Empty affiliation `<div>` ships on every essay without `affiliation:` — **MED**

`build/Contexts.hs:84-89` + `templates/partials/metadata-tail.html:12`.
`affiliationField` returns an empty list instead of `noResult`; Hakyll's
`$if$` is truthy for empty list fields (the codebase knows this —
`tagLinksFieldExcludingScope` uses `noResult` for exactly this reason).
Verified in output: `_site/essays/asymmetric-forgetting.html` contains
`<div class="meta-row meta-affiliation">` with whitespace-only content.

### 2.8 Library page hard-depends on `content/library.md` — **LOW**

`build/Site.hs:675`. `_ <- loadSnapshot libraryIntroId "body"` is a
top-level compiler statement (not inside a `field`), so it's a hard
failure. The block is documented as "optional prose block"; deleting
`content/library.md` breaks the whole `library.html` compile. Contrast the
existence-guarded sidecars at `build/Tags.hs:277-283` and
`build/Site.hs:843-850`.

### 2.9 Library `primaryPortalOf` reads only list-form `tags:` — **LOW**

`build/Site.hs:632-638`. `lookupStringList "tags"` returns `Nothing` for
scalar comma form (`tags: research, ai`), which Hakyll's `getTags`
accepts. Such an item appears on tag pages but is silently dropped from
the library. All current content uses list form — latent.

### 2.10 `allContent` omits me/, memento-mori/, photography from the link graph — **LOW**

`build/Patterns.hs:124-133`, used by `build/Backlinks.hs:334,345`. Despite
"Every content file the backlinks pass should index," `content/me/index.md`
and `content/memento-mori/index.md` (full essays, rendered with
`backlinksField`) never have their outgoing links extracted; photography
likewise. Either deliberate-but-undocumented or the exact silent omission
the module header says it exists to prevent.

### 2.11 Paginated tag pages: split by creation date, sorted by display date — **LOW**

`build/Tags.hs:371-377`. `buildPaginateWith (sortAndGroupAt tagPageSize)`
partitions via `sortRecentFirst` (creation date), then each page re-sorts
with `recentFirstByDisplay` (revision-aware). A recently revised old item
stays on a late page but jumps to its top — cross-page ordering is not
monotone. Only fires above the 150-item threshold.

### 2.12 `fill:#000` replacement corrupts longer hex colors — **LOW**

`build/Filters/Score.hs:118-133` (and `Filters/Viz.hs` `processColors`).
The 6-digit pass protects only `#000000`; for `fill:#000080` the 3-digit
pass produces `fill:currentColor80` — invalid CSS, silently mangled SVG.
Quoted attribute forms are safe; only unquoted style-property forms are
exposed.
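
One way to close this gap is to refuse a match whenever another hex digit follows. The sketch below is written in Python purely for illustration, since the real filter is Haskell; `naive_recolor` is an assumed reconstruction of the two-pass behaviour described above (not a copy of `Score.hs`), and `safe_recolor` is a hypothetical replacement.

```python
import re


def naive_recolor(svg: str) -> str:
    """Assumed shape of the two-pass replacement: 6-digit first, then 3-digit."""
    svg = svg.replace("fill:#000000", "fill:currentColor")
    return svg.replace("fill:#000", "fill:currentColor")


# Match #000000 or #000 only when no further hex digit follows, so longer
# colours that merely start with three zeros are left alone.
_BLACK_FILL = re.compile(r"fill:#(?:000000|000)(?![0-9a-fA-F])")


def safe_recolor(svg: str) -> str:
    """Replace pure-black fills with currentColor without touching other colours."""
    return _BLACK_FILL.sub("fill:currentColor", svg)
```

With the lookahead in place, `fill:#000080` passes through unchanged, while `fill:#000;` and `fill:#000000` are still rewritten.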

### 2.13 Source-level preprocessors rewrite inside fenced code blocks — **LOW**

`build/Filters/Wikilinks.hs:24-31`, `Filters/Transclusion.hs:18-20`,
`Filters/EmbedPdf.hs`. All run on the raw source before Pandoc parses
fences: `[[anything]]` in a code block becomes a link; a code-block line
that is exactly `{{slug}}` or `{{pdf:...}}` becomes raw HTML.
Transclusion's comment ("prevents accidental substitution inside prose or
code") is false for full-line directives in code blocks. A live foot-gun
for a site that documents its own syntax (ozymandias.md does exactly
this).

### 2.14 `domainIcon` matches substrings of the whole URL, not the host — **LOW**

`build/Filters/Links.hs:120-153`. `"x.com" `T.isInfixOf` url` etc. —
`https://example.org/why-x.com-failed` gets the Twitter icon. Contradicts
the strict-hostname discipline `isExternal` documents at lines 95-101 of
the same file. Cosmetic (icon only).
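
Host-based matching can be sketched as follows. This is Python for illustration only (the filter itself is Haskell); `ICON_BY_DOMAIN` and `domain_icon` are hypothetical names, and the table holds just the one domain mentioned in the finding.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical icon table; the real mapping lives in build/Filters/Links.hs.
ICON_BY_DOMAIN = {"x.com": "twitter"}


def domain_icon(url: str) -> Optional[str]:
    """Return an icon name only when the URL's host is the domain or a subdomain."""
    host = (urlsplit(url).hostname or "").lower()
    for domain, icon in ICON_BY_DOMAIN.items():
        if host == domain or host.endswith("." + domain):
            return icon
    return None
```

Comparing against the parsed hostname means a path such as `/why-x.com-failed` no longer influences the result, and the `"." + domain` suffix check keeps look-alike hosts such as `notx.com` from matching.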

### 2.15 `gsubRoute "content/"` strips every occurrence, not just the prefix — **LOW**

`build/Site.hs:171,357,417` etc. Hakyll's `gsubRoute` is replace-all; a
co-located directory literally named `content` would be silently mangled
(`content/essays/slug/content/data.csv` → `essays/slug/data.csv`). Same
for `gsubRoute "static/"`. Improbable but silent.
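
The difference between replace-all and prefix-only stripping is easy to see in isolation. The following is a Python illustration of the two semantics, not Hakyll code; both function names are hypothetical.

```python
def strip_all(path: str, needle: str) -> str:
    """Replace-all semantics, analogous to the behaviour the finding describes."""
    return path.replace(needle, "")


def strip_prefix(path: str, prefix: str) -> str:
    """Remove `prefix` only when it starts the path; leave later occurrences alone."""
    return path[len(prefix):] if path.startswith(prefix) else path
```

For `content/essays/slug/content/data.csv`, `strip_all` yields `essays/slug/data.csv`, while `strip_prefix` keeps the nested directory and yields `essays/slug/content/data.csv`.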

### 2.16 `existsCached` memoizes non-existence for the process lifetime — **LOW**

`build/Filters/SourceRefs.hs:160-166`. Under `make watch`, a source file
created after first reference stays cached as absent until restart.

### 2.17 Core NITs

- `build/Site.hs:42-44`: comment says "eight portals"; the list has nine.
  Echoed at Site.hs:606 ("the eight") vs line 657's "nine times".
- `build/Site.hs:866-877`: random-pages.json comment says "essays + blog
  posts only" but the rule loads fiction and flat poetry too; uses
  flat-only `content/poetry/*.md` while the epistemic rule uses
  `allPoetry` — collection poems are epistemic-indexed but never
  randomizable.
- `build/Utils.hs:64-73`: `authorSlugify` comment claims runs of spaces
  collapse; code maps each space (`"A B"` → `"a--b"`). Consistent
  everywhere, so links work; comment wrong.
- `build/Utils.hs:31-32`: `readingTime` truncates (`div 200`) — 399 words
  reports "1 min"; comment implies ceiling semantics.
- `build/Pagination.hs:42` + `build/Site.hs:77-82`: hardcoded pattern
  literals duplicate `Patterns.hs`, defeating that module's stated purpose
  (Patterns.hs:6-10).
- `build/Contexts.hs:174-180`: plain `tagLinksField` returns an empty list
  rather than `noResult` — `$if(item-tags)$` is true and templates emit
  empty tag wrappers (author-index.html, item-card.html).
- `build/Tags.hs:296-304`: `tagItemCtx` composes `defaultContext`, not
  `siteCtx`, so `$if(has-monogram)$` never fires on tag pages — monograms
  render on new.html/library but silently never on tag indexes.
- `build/Contexts.hs:485-492`: `dotsField` comment says "1–5" but accepts
  0 (`max 0 (min 5 n)`) — `importance: 0` renders five empty circles.
- `build/Contexts.hs:375-381`: `descriptionField` doc says `noResult`;
  code uses `fail` — behaviorally fine under Hakyll 4.16 `$if$` (verified
  against Hakyll 4.16.7.1 source) but logs `[ERROR]` debug noise per
  abstract-less page. Same in `abstractField`, `summaryField`,
  `bibliographyField`.
- `build/Filters/Images.hs:233-234`: `webpSrc` interpolated into `srcset`
  unescaped while sibling `src` goes through `esc`.
- `build/Filters/Links.hs:37-46,63-69`: internal PDF links double-classified
  (`pdf-link` + `link-internal` chrome) despite the "no overlap" comment.
- `build/Filters/Smallcaps.hs:31-34` + `Filters/Archive.hs:42-44`:
  "headers are skipped" only at top level; a Header nested in a
  Div/BlockQuote is processed, contradicting the comments.

Verified clean: no unguarded `head`/`fromJust`/`read`/`!!` hazards in the
core modules; filter composition order matches its documenting comments;
Hakyll 4.16.7.1 `$if$` treats both `fail` and `noResult` as false.

---

## 3. Haskell build code — feature modules

### 3.1 Stats heatmap day-of-week off-by-one: Sunday clipped out of the SVG — **MED**

`build/Stats.hs:185,300,317`. `dowOf d = fromEnum (dayOfWeek d) -- Mon=0..Sun=6`
— but `time-1.12.2` is ISO-numbered (verified:
`map fromEnum [Monday..Sunday] == [1..7]`). So Sunday lands at y=106 while
`svgH` = 104 — every Sunday cell is clipped out of the viewBox and grid
row 0 is permanently blank. Relatedly, `weekStart` returns the previous
*Sunday* (and for a Sunday, 7 days back), not the "first Monday on or
before" its comment claims; builds run on a Sunday also clip the newest
column horizontally.
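
The intended mapping can be written out explicitly. Python's `date.isoweekday()` uses the same ISO numbering (Monday = 1 through Sunday = 7) that the finding reports for the `time` library, so it serves as a convenient stand-in; this is an illustrative sketch with hypothetical function names, not the Haskell fix itself.

```python
from datetime import date, timedelta


def heatmap_row(d: date) -> int:
    """Zero-based grid row: Monday -> 0 ... Sunday -> 6."""
    # isoweekday() is ISO-numbered (Mon=1..Sun=7), so shift down by one.
    return d.isoweekday() - 1


def week_start(d: date) -> date:
    """First Monday on or before d (a Monday maps to itself)."""
    return d - timedelta(days=heatmap_row(d))
```

Subtracting one from the ISO number keeps Sunday inside a seven-row grid, and deriving `week_start` from the same row index keeps the two definitions from drifting apart.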

### 3.2 `Commonplace.hs` uses `Char8.pack` — non-ASCII YAML corruption — **MED**

`build/Commonplace.hs:143`. `Y.decodeEither' (BS.pack raw)` with
`Data.ByteString.Char8` truncates each `Char` to 8 bits — the exact hazard
`build/Now.hs:249-253` documents and fixes with `TE.encodeUtf8`.
`data/commonplace.yaml` is currently pure ASCII, so latent — but a
commonplace book of quotations is the likeliest file to acquire an em-dash
or curly quote, which will then either fail the YAML parse or publish
mojibake.
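
The truncation is easy to demonstrate on a single em-dash (U+2014). The sketch below emulates the per-character 8-bit truncation described above in Python and compares it with a UTF-8 encoding; `char8_pack` and `utf8_pack` are hypothetical names for illustration.

```python
def char8_pack(text: str) -> bytes:
    """Emulate truncating each character to its low 8 bits."""
    return bytes(ord(ch) & 0xFF for ch in text)


def utf8_pack(text: str) -> bytes:
    """Encode as UTF-8, which round-trips every character."""
    return text.encode("utf-8")
```

For pure-ASCII input the two agree byte for byte, which is why the bug stays latent; for `"a\u2014b"` the truncating version emits the control byte `0x14` in place of the dash, whereas the UTF-8 version emits the three-byte sequence `E2 80 94`.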

### 3.3 Backlinks: links inside tight lists are invisible — **MED**

`build/Backlinks.hs:220-226`. `extractLinksWithContext`'s `go` handles
`Para`, `BlockQuote`, `Div`, `BulletList`, `OrderedList`, then `go _ = []`.
Tight list items (the default `- item` form) are `Plain` blocks, not
`Para`, so recursion into list children yields nothing. Every internal
link written in a tight list never produces a backlink. `Header`, `Table`,
and `DefinitionList` blocks are likewise skipped. The doc comment implies
coverage it doesn't deliver.

### 3.4 Stability "age" is the first→last commit span, not time since first commit — **MED**

`build/Stability.hs:89-93,99-112`. Docs say "age in days since first
commit," but `classify (length dates) (daySpan (last dates) newest)`
computes the span between first and most recent *commit*, with no
reference to today. A piece written in a one-week burst years ago reports
"volatile" forever; time passing without commits can never increase
stability. Either the comment or the metric is wrong.
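
The two candidate metrics differ only in their right-hand endpoint. The following Python sketch (hypothetical function names, commit dates assumed newest-first as in git history) puts them side by side; it does not reproduce the classification thresholds, which are not quoted in this audit.

```python
from datetime import date
from typing import Sequence


def commit_span_days(dates_newest_first: Sequence[date]) -> int:
    """Days between the oldest and newest commit (what the code computes)."""
    return (dates_newest_first[0] - dates_newest_first[-1]).days


def age_since_first_commit_days(dates_newest_first: Sequence[date], today: date) -> int:
    """Days between the oldest commit and today (what the docs describe)."""
    return (today - dates_newest_first[-1]).days
```

For a piece committed three times within one week, `commit_span_days` stays at 7 no matter how much time passes, while `age_since_first_commit_days` keeps growing with `today`.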

### 3.5 Frontmatter `history:` assumed newest-first; WRITING.md documents oldest-first — **MED**

`build/Stability.hs:204-217,299-336` vs `WRITING.md:105-109`.
`loadVersionHistory` keeps authored order and all range fields treat the
head as newest (`es@(newest:_) -> let oldest = last es`). Git history is
newest-first, but WRITING.md's `history:` example is oldest-first. With
the documented ordering, `version-history-range` renders reversed
("14 March 2026 – 1 March 2026"), `range-start` returns the newest date,
and `version-history-primary` shows the three *oldest* entries.

### 3.6 Archive manifest→provenance join is exact-string, rest of system is normalized — **MED**

`build/Archive.hs:269`. `Map.lookup (meUrl me) provByUrl` joins on the raw
URL; everywhere else equivalence is `normalizeUrl` (ArchiveIndex
filtering, dup detection, ARCHIVE.md:189-192). Editing a manifest URL to a
normalization-equivalent form (`http`→`https`, trailing slash, tracking
param) silently unpublishes `/archive/<slug>/` while ArchiveIndex's
normalized filter keeps the slug active — links keep pointing at a 404.

### 3.7 Photography `buildPin` computes wrong slug/thumb/title for flat entries — **MED**

`build/Photography.hs:354,362`. `slug = takeFileName (takeDirectory fp)` —
for a flat `content/photography/foo.md` this yields `"photography"`, so
map.json gets `"slug": "photography"`, the title fallback is wrong, and
`thumb = "/photography/photography/<p>"` 404s (flat-single assets route to
`/photography/<asset>`). PHOTOGRAPHY.md:214 explicitly supports flat
singles. Latent — `content/photography/` currently has only `index.md` —
but breaks the first geo-tagged flat single.

### 3.8 `geo-precision` fails open: a typo'd "hidden" publishes coordinates — **MED**

`build/Photography.hs:347-349,312-320`. Only the exact string matches
(`(_, Just "hidden", _) -> return Nothing`); any other value (e.g.
`Hidden`, `hiddn`) falls into `roundCoord`, whose catch-all treats unknown
values as `city` (~10 km rounding) — publishing coordinates the author
meant to suppress. Contradicts the file's own privacy comment (lines
287-289) and the fail-closed precedent for `visibility:` in
`build/Archive.hs:77-83`.
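
A fail-closed version inverts the default: only values on an allow-list publish anything. The Python sketch below illustrates the idea under several loudly stated assumptions: only `hidden` and `city` are named in this finding, so the `exact` level, the decimal-place counts, and the treatment of a missing field as `city` are all hypothetical, as are the names `_DECIMALS` and `published_coords`.

```python
from typing import Optional, Tuple

# Hypothetical allow-list: decimal places kept per precision level.
# One decimal place of latitude is roughly 11 km, in line with "~10 km".
_DECIMALS = {"exact": 5, "city": 1}


def published_coords(
    lat: float, lon: float, precision: Optional[str]
) -> Optional[Tuple[float, float]]:
    """Return rounded coordinates, or None when they must not be published."""
    # Assumption: an absent field means "city"; adjust to the site's real default.
    key = (precision or "city").strip().lower()
    if key not in _DECIMALS:
        # "hidden", typos such as "hiddn", and any unknown value suppress output.
        return None
    places = _DECIMALS[key]
    return (round(lat, places), round(lon, places))
```

Because suppression is the fall-through case, a misspelled `hidden` can no longer leak a location; the cost is that a misspelled `city` silently drops a pin, which a build-time warning could surface.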

### 3.9 Archive state is process-lifetime cached — `watch` goes stale — **LOW**

`build/ArchiveIndex.hs:123-146` + `build/Archive.hs:304`.
`activeUrls`/`rawIndex`/`rawState` are NOINLINE `unsafePerformIO` CAFs read
once per process, and `archiveRules` reads the manifest in `preprocess`.
Under `site watch`, edits to `manifest.yaml`, `removed.yaml`, or the
regenerated state JSONs are never re-read until restart. One-shot builds
unaffected.

### 3.10 Pinned pages render raw ISO in `$last-reviewed$` — **LOW**

`build/Stability.hs:166-170`. The git branch formats via `fmtIso`
("1 May 2026"); the IGNORE.txt-pinned branch returns the frontmatter value
verbatim ("2026-05-01") — inconsistent display formatting.

### 3.11 Empty/all-comments `manifest.yaml` halts the build — **LOW**

`build/Archive.hs:158-170`. An empty YAML stream decodes as `Null`, which
fails to parse as `[ManifestEntry]` and takes the `exitFailure` branch —
draining the manifest to zero entries is fatal rather than the empty
archive the absent-file branch supports.

### 3.12 Backlinks `normaliseUrl` misses directory-form canonical URLs — **LOW**

`build/Backlinks.hs:275-281`. Strips `.html` but not
`index.html`/trailing slash: a page routed `essays/foo/index.html` keys as
`/essays/foo/index`, but a body link authored `/essays/foo/` doesn't
match — backlink silently dropped. `build/SimilarLinks.hs:97-99` handles
exactly this case and its comment flags the divergence.
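
A normaliser that maps file-form and directory-form URLs to the same key could look like the Python sketch below. It is an illustration with a hypothetical name, not a port of either Haskell function, and it deliberately ignores query strings and fragments.

```python
def normalise_url(url: str) -> str:
    """Map file-form and directory-form URLs for the same page to one key."""
    path = url
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # keep the slash for now
    elif path.endswith(".html"):
        path = path[: -len(".html")]
    if len(path) > 1:
        path = path.rstrip("/")  # drop the trailing slash except for the root
    return path or "/"
```

Under this scheme `/essays/foo/index.html` and `/essays/foo/` both key as `/essays/foo`, so the route and the authored link meet in the backlink map.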

### 3.13 SimilarLinks PDF viewer URL not percent-encoded — **LOW**

`build/SimilarLinks.hs:155-164`.
`viewerUrl = "/pdfjs/web/viewer.html?file=" ++ escapeHtml raw` —
`escapeHtml` handles HTML metachars only; a path containing `&`, `?`, `#`,
or spaces breaks the `file=` query value.
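
The two escaping layers compose in a fixed order: percent-encode the value for the URL first, then HTML-escape the finished URL for the attribute. A minimal Python sketch of that order, using the standard library's `urllib.parse.quote` and `html.escape` (the function name `viewer_url` is hypothetical):

```python
from html import escape
from urllib.parse import quote


def viewer_url(pdf_path: str) -> str:
    """Build the viewer link: percent-encode the query value, then HTML-escape."""
    encoded = quote(pdf_path, safe="/")  # encodes space, &, ?, # and similar
    return escape("/pdfjs/web/viewer.html?file=" + encoded, quote=True)
```

Keeping `/` in the `safe` set leaves the path readable; whether the viewer expects slashes encoded as well is an open question this sketch does not settle.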

### 3.14 Photography feed thumbnails only for directory-form entries — **LOW**

`build/Photography.hs:449-453`. `imgTag` requires `isDir`; flat singles
and series children (`<series>/<photo>.md`) get text-only feed entries,
against PHOTOGRAPHY.md's "thumbnails embedded inline" (lines 36, 445) and
the feed's deliberate inclusion of series children.

### 3.15 Marks: missing confidence/evidence renders a literal "0 TRUST" — **LOW**

`build/Marks.hs:272-278,565`. `computeTrust _ _ = 0` with a comment
claiming the figure "collapses to the bare frame," but
`renderEpistemicFigure` unconditionally calls `renderTrustLabel`, so a
piece with `status:` but no `confidence`/`evidence` (a case MARKS.md:696
says should render) displays a prominent center "0" — indistinguishable
from an authored zero-trust score.

### 3.16 Feature-module NITs

- `build/Catalog.hs:228-235`: two distinct unknown categories render as
  adjacent duplicate "Other" sections (equal rank, `groupBy` on raw
  string).
- `build/Stats.hs:754-777`: `pageTOC` comment says "nine h2 sections";
  lists eleven (matching the eleven rendered).
- `build/SimilarLinks.hs:51-54`: comment says "the template caps the
  display"; the code caps it (`take maxSimilar` at line 80).
- `build/Stats.hs:169-171`, `build/Archive.hs:564-569`: "median" is the
  upper-median for even-length lists.
- `build/Backlinks.hs:133-153`: protocol-relative `//host/path` URLs pass
  `isPageLink` and pollute backlinks.json.
- `build/BibExtras.hs:75-98`: `@string`/`@comment`/`@preamble` blocks
  parsed as citekey entries — only consequential on a citekey/macro-name
  collision.

Verified clean: Marks tick positions/axis order/radii match MARKS.md §3;
proved-confidence trust substitution matches §4.3; Archive's fail-closed
`visibility` validation, removed.yaml conflict rejection, and double-sided
SHA-256 verification all match ARCHIVE.md.

---

## 4. Python & shell tooling

### 4.1 `data/embed-cache-pages.npz.tmp.npz` orphan: explained; cleanup + ignore gaps — **MED**

The orphan (mtime May 26) is the fossil of a fixed bug: an earlier
embed.py passed a bare path to `np.savez_compressed`, numpy appended
`.npz` (verified in numpy's `_savez` source), and the subsequent
`os.replace` raised FileNotFoundError, stranding the file. The current
file-handle code (`tools/embed.py:173-183`) is correct, but: (a) nothing
deletes the stale orphan — **delete it, don't commit it**; (b) the tmp
write has no try/finally, so any mid-write exception strands
`embed-cache-pages.npz.tmp`; (c) the new `.gitignore` entry is exact-path
(`data/embed-cache-pages.npz`) and covers neither `.tmp` nor `.tmp.npz`
variants — widen to `data/embed-cache-pages.npz*`; (d) the fixed tmp name
means two concurrent runs interleave writes.
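
Points (b) and (d) can be addressed together with a uniquely named temp file and a cleanup path. The following is a minimal standard-library sketch, not the code in `tools/embed.py`; the name `atomic_write_bytes_sketch` is hypothetical, and it writes raw bytes rather than an `.npz` archive.

```python
import os
import tempfile


def atomic_write_bytes_sketch(target: str, data: bytes) -> None:
    """Write `data` to `target` via a uniquely named temp file in the same directory."""
    directory = os.path.dirname(os.path.abspath(target))
    # mkstemp picks a unique name, so concurrent runs never share a temp file.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(target) + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, target)  # atomic within one filesystem
    except BaseException:
        # Remove the partial temp file on any failure, including Ctrl-C.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Creating the temp file in the target's own directory keeps `os.replace` on a single filesystem. The generated names still begin with the target's filename, so a widened ignore pattern such as the one suggested in (c) would continue to cover any debris left by a hard kill.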

### 4.2 Corrupt embed cache crashes instead of being discarded — **MED**

`tools/embed.py:154`. The discard path catches
`(OSError, KeyError, ValueError)`, but `np.load` on a truncated `.npz`
raises `zipfile.BadZipFile` (verified MRO: `BadZipFile → Exception`), and
`EOFError` is also uncaught. A half-written cache (exactly what §4.1(b)
can produce) makes every subsequent build print "Warning: embedding
failed" and leaves similar-links/semantic index stale until the file is
manually deleted — the opposite of the docstring's "unreadable →
discarding" contract.
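
A widened discard path might look like the sketch below. It is standard-library Python with a hypothetical helper name (`load_or_discard`) and an injected `loader` callable standing in for the real `np.load` call, so it shows the exception handling only, not the cache format.

```python
import zipfile
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Exceptions treated as "cache unreadable": the three already caught, plus
# the two the finding reports as escaping.
UNREADABLE = (OSError, KeyError, ValueError, zipfile.BadZipFile, EOFError)


def load_or_discard(path: str, loader: Callable[[str], T]) -> Optional[T]:
    """Return loader(path), or None when the cache is missing or corrupt."""
    try:
        return loader(path)
    except UNREADABLE as exc:
        print(f"cache unreadable ({type(exc).__name__}), discarding: {path}")
        return None
```

Listing the exception types explicitly, rather than catching `Exception`, keeps genuine programming errors in the loader visible while still honouring the "unreadable, so discard" contract.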

### 4.3 embed.py staleness check structurally defeated by stamp-build-time — **MED**

`tools/embed.py:195-200` + `Makefile:68`. `needs_update()` compares
`_site/**/*.html` mtimes against embed's outputs — but the build order is
`embed.py` → `stamp-build-time.py _site`, and the stamper rewrites the
footer timestamp in essentially every HTML file each build. So every page
is always newer than embed's outputs and the "skip if fresh" fast path
never fires: the full paragraph-embedding pass (and model load) runs on
every build. The new page cache papers over half the cost; the paragraph
pass pays full price every time. Related (`tools/embed.py:297-299`):
model/config changes never invalidate outputs — currently masked by this
bug; fixing one exposes the other.

### 4.4 archive.py writes provenance/index/state non-atomically — **MED**

`tools/archive.py:718-721,734-737,953-957,1077-1080`. All plain
`write_text()`. An interrupt mid-write truncates `PROVENANCE.json`; the
next build's `json.loads` (line 642) raises an unhandled
`JSONDecodeError` — and a truncated provenance is indistinguishable from
corruption in a tool whose whole contract is integrity checking. embed.py
got atomic-write helpers; archive.py did not.

### 4.5 download-leaflet.sh: checksum verification bypassable — **MED**

`tools/download-leaflet.sh:43-47,90`. The early-exit skip checks file
existence only (download-model.sh re-verifies on its skip path), and
`curl -o "$target"` writes directly to the final path: a download that
*fails* `verify_or_warn` aborts via `set -e` *after* the bad file is in
place, and the next run's existence check accepts it permanently. A
MITM'd unpkg.com download survives one failed run and is silently
vendored on the next.

### 4.6 Other download/convert scripts leave partial files in final paths — **LOW**

`tools/download-model.sh:84`: interrupted curl leaves a partial
`model_quantized.onnx`; caught today only because model-checksums.sha256
pins all five files — any unpinned file would persist forever. Use
`-o "$dst.part" && mv`. `tools/convert-images.sh:33`: interrupted cwebp
leaves a partial `.webp` that the `-nt` staleness gate then skips forever
— a truncated WebP ships until manually deleted.

### 4.7 archive.py robustness gaps — **LOW**

- `tools/archive.py:788,795-799`: provenance missing the `artifact` key
  makes `prev_artifact == slug_dir`, then `sha256_of` raises an uncaught
  `IsADirectoryError` instead of the structured "prior snapshot
  incomplete" error.
- `tools/archive.py:614-617,938-940,1066-1068`: non-dict manifest entries
  (`- https://example.com` instead of `- url: ...`) crash with
  `AttributeError: 'str' object has no attribute 'get'`.
- `tools/archive.py:896`: `wayback_save` concatenates the raw URL
  (contrast `wayback_lookup` at 909, which uses `quote(url, safe="")`).
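
The first two gaps above are both input-validation problems, and a sketch of the guards might look like this. All names here (`normalize_entry`, `resolve_prev_artifact`, `ManifestError`, `SnapshotIncomplete`) are hypothetical; how `archive.py` actually structures its errors is not shown in this audit, so treat the exception types as assumptions.

```python
from pathlib import Path


class ManifestError(ValueError):
    """A manifest entry has the wrong shape."""


class SnapshotIncomplete(Exception):
    """The prior snapshot cannot be verified (structured, not a raw OSError)."""


def normalize_entry(entry, index: int) -> dict:
    """Accept `- url: ...` mappings and bare `- https://...` strings."""
    if isinstance(entry, str):
        entry = {"url": entry}
    if not isinstance(entry, dict):
        raise ManifestError(
            f"manifest entry {index}: expected a mapping or URL string, "
            f"got {type(entry).__name__}"
        )
    url = entry.get("url")
    if not isinstance(url, str) or not url.strip():
        raise ManifestError(f"manifest entry {index}: missing 'url'")
    return {**entry, "url": url.strip()}


def resolve_prev_artifact(slug_dir: Path, provenance: dict) -> Path:
    """Return the prior artifact path, or fail with a structured error."""
    name = provenance.get("artifact")
    if not isinstance(name, str) or not name:
        raise SnapshotIncomplete(f"{slug_dir}: provenance has no 'artifact' key")
    candidate = Path(slug_dir) / name
    if not candidate.is_file():
        # Covers both a missing file and the directory case, so hashing
        # never sees a path that would raise IsADirectoryError.
        raise SnapshotIncomplete(f"{candidate}: prior artifact is not a regular file")
    return candidate
```

Whether bare-string entries should be accepted or rejected with a clear message is a design choice; the sketch accepts them, and rejecting them would only change the first branch.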

### 4.8 add-popup-source.sh: dead CSP reminder + unvalidated nginx interpolation — **LOW**

`tools/add-popup-source.sh:214`: the connect-src reminder gates on
`[[ "$NEEDS_PROXY" -eq 0 && -n "$UPSTREAM_HOST" ]]`, but `UPSTREAM_HOST`
is only set in the `NEEDS_PROXY -eq 1` branch (lines 124-131) — the
reminder can never print, and the no-proxy case is exactly when it's
needed (the provider will be CSP-blocked with no hint). Line 71: `NAME`
from a free-text prompt is interpolated into
`location /proxy/$NAME/`/`set $upstream_$NAME` with no
`^[a-z0-9-]+$` validation (import-photo.sh validates; this doesn't).

### 4.9 refreeze.sh deletes the freeze before the replacement succeeds — **LOW**

`tools/refreeze.sh:13-16`. `rm -f "$FREEZE"` then `cabal freeze`; a failed
resolve leaves no freeze file (recoverable via git, but write-temp-then-move
is safer).

### 4.10 embed.py / atomic-write NITs — **LOW/NIT**

`tools/embed.py:109-115`: `atomic_write_bytes` uses a fixed `.tmp` name
(concurrent-run collision) and no `fsync` before `os.replace` (power loss
can leave an empty target). Same pattern in `_atomic_write_yaml` of
extract-exif.py:377, extract-palette.py:65, extract-dimensions.py:65.
`tools/embed.py:144`: NpzFile never closed — use
`with np.load(...) as npz:`.

### 4.11 Tooling NITs

- `tools/import-photo.sh:147-155`: on `mogrify -strip` failure the
  EXIF-laden JPEG (GPS, serials) remains under `content/`, where
  `make build`'s `git add content/` could auto-commit it. Delete `$TARGET`
  on that failure path.
- `tools/hooks/pre-commit-marks.sh:28-31`: `awk '{ print $2 }'` truncates
  paths with spaces; the `status:` probe reads the working tree, not the
  staged blob. Advisory-only hook.
- `tools/preset-signing-passphrase.sh:30`: `echo -n "$PASSPHRASE"` eats a
  passphrase starting with `-e`/`-n`/`-E`; use `printf '%s'`.
- `tools/stamp-build-time.py:52-54`: in-place non-atomic rewrite of
  `_site/` HTML.
- `tools/archive.py:244`: `pdftotext` without `--`; a slug starting with
  `-` parses as an option. Same in extract-exif.py:159.
- `tools/monolith-version.txt` records a sha256 (matches the binary
  today, verified) but `find_monolith()` never checks it.

Verified clean: sign-site.sh (atomic sig writes, post-pass manifest
verification); compress-assets.sh and download-pdfjs.sh (mktemp + EXIT
trap, hash verified before extraction); audit-marks.py, viz_theme.py,
extract-dimensions.py, extract-palette.py; embed.py's faiss `-1` padding
is safely filtered; `uv lock --check` passes; model-checksums.sha256 pins
all five model files.

---

## 5. Frontend JavaScript

### 5.1 Score-reader pages never restore theme/settings — **MED**

`templates/score-reader-default.html:10` + `static/js/theme.js:12-13`. The
template loads `theme.js` without `utils.js` (unlike head.html:66-67), so
`window.lnUtils.safeStorage` is undefined and theme/text-size/focus-mode/
reduce-motion all silently fail to restore — a dark-theme user gets a
light flash-and-stay on every score page. Compounding: settings.js (line
15; the template does render the settings toggle) falls back to its no-op
store, so theme picks made on score pages never persist either.

### 5.2 search-filters.js: epistemic filters silently bypass clean-URL pages — **MED**

`static/js/search-filters.js:117-125`. `normUrl()` returns `u.pathname`
verbatim and looks it up in `epistemicMeta[url]`. Verified:
`_site/data/epistemic-meta.json` keys include
`/essays/beyond-comorbidity-indices/index.html` while rendered result
links use `/essays/beyond-comorbidity-indices/`. The lookup misses,
`passes(null)` returns true ("no metadata = don't filter"), so every
directory-style page bypasses all active epistemic filters. Flat `.html`
pages match fine, which hides the bug.
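
The mismatch is a key-normalisation problem, and the rule that fixes it is small enough to show. The sketch below is in Python only to keep one language across the examples in this report; the real change would live either in `normUrl()` (JavaScript) or in whichever build step emits `epistemic-meta.json`. `norm_key` and `build_table` are hypothetical names.

```python
from urllib.parse import urlsplit


def norm_key(url: str) -> str:
    """Map both `/x/index.html` and `/x/` to `/x/`; leave flat pages alone."""
    path = urlsplit(url).path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    return path


def build_table(meta: dict) -> dict:
    """Re-key the metadata once so lookups use the normalised form."""
    return {norm_key(key): value for key, value in meta.items()}


# Illustrative data only; the values are placeholders, not real metadata.
meta = {
    "/essays/beyond-comorbidity-indices/index.html": {"confidence": 70},
    "/colophon.html": {"confidence": 90},
}
table = build_table(meta)
hit = table.get(norm_key("/essays/beyond-comorbidity-indices/"))
```

Normalising on both sides (the table keys and the looked-up URL) means neither the data file nor the rendered links has to change form for the filter to apply.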

### 5.3 viz.js ignores the cappuccino theme — **MED**

`static/js/viz.js:94-99`. `isDark()` knows only
`'dark'`/`'light'`/OS-preference, but theme.js/settings.js support
`'cappuccino'` — a dark-brown theme (`--bg: #553a28`, base.css:203). With
OS-light + cappuccino, charts render the LIGHT config (near-black marks
and axis labels) on a dark background.

### 5.4 collapse.js localStorage keys collide across pages — **MED**

`static/js/collapse.js:44,83`. Key is
`'section-collapsed:' + heading.id` with no pathname namespace (contrast
annotations.js). Pandoc auto-slugs (`#introduction`, `#background`) recur
across essays, so collapsing "Introduction" on one essay collapses it
everywhere. Also uses raw `localStorage` rather than
`lnUtils.safeStorage`.

### 5.5 semantic-search.js: stale-response race + duplicate index fetch — **MED**

`static/js/semantic-search.js:117-144`. `runSearch` has no generation
token; overlapping queries render in promise-resolution order, so an
older query's hits can replace a newer one's (with `setStatus('')`
masking it). `loadIndex()` (42-59) has no in-flight-promise dedup (unlike
`loadModel`'s `loadModelPromise`), so concurrent first searches fetch
`semantic-index.bin` + `semantic-meta.json` twice.
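
The generation-token idea is language-independent; here it is sketched with Python's `asyncio` instead of browser promises, purely as an illustration of the pattern. `LatestOnly` and its `run` method are hypothetical names and do not correspond to anything in `semantic-search.js`.

```python
import asyncio


class LatestOnly:
    """Render a result only if no newer query started while it was in flight."""

    def __init__(self) -> None:
        self._generation = 0
        self.rendered = None

    async def run(self, query: str, search) -> bool:
        self._generation += 1
        mine = self._generation          # token captured before awaiting
        hits = await search(query)
        if mine != self._generation:
            return False                 # stale: a newer query superseded this one
        self.rendered = (query, hits)
        return True


async def fake_search(query: str) -> list:
    # Stand-in for the real search: the "old" query resolves more slowly.
    await asyncio.sleep(0.05 if query == "old" else 0.0)
    return [query + "-hit"]


async def demo():
    box = LatestOnly()
    outcomes = await asyncio.gather(
        box.run("old", fake_search),
        box.run("new", fake_search),
    )
    return box, outcomes
```

The same capture-then-compare step, applied around the awaited search in `runSearch`, would stop an older query's late result from overwriting a newer one.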

### 5.6 lightbox.js: aria-modal with no focus trap, no keyboard activation — **MED**

`static/js/lightbox.js`. Overlay sets `role="dialog"` +
`aria-modal="true"` but has no Tab handling (gallery.js's `trapTab` at
235-257 shows the in-repo pattern) — focus walks into the obscured page.
Trigger images get only a `click` listener and no `tabindex`/keydown, so
keyboard users can't open it; `close()` focuses a non-focusable `<img>`,
which no-ops.

### 5.7 Frontend LOWs

- `static/js/gallery.js:122-125,270-275`: math/score overlay is
  click-only (no role/tabindex/keydown); `closeOverlay()` focus-returns
  to a non-focusable div — focus drops to `<body>`.
- `static/js/popups.js:478,515`: the Wikipedia provider's
  `decodeURIComponent` runs synchronously before the `.catch` attaches —
  a malformed percent sequence in a link path throws an uncaught
  `URIError` per hover.
- `static/js/popups.js:359,390`: fetched monogram SVG injected via
  `innerHTML` unescaped — the single unsanitized path in an otherwise
  fully escaped pipeline. Build-authored content, so not exploitable
  today; the comment acknowledges the trust assumption.
- `static/js/citations.js`: dead file — no template loads it; popups.js
  supersedes it. If ever re-added it would double-bind and inject
  bibliography innerHTML without popups.js's cloned-node hardening.
  Delete.
- `static/js/nav.js:26,30-31`: raw `localStorage` unguarded; if storage
  access throws, the throw lands before `toggle.addEventListener`,
  leaving the Portals toggle completely dead (utils.js exists precisely
  for this).
- `static/js/annotations.js:209-215`: marks are mouse-only; the tooltip's
  Delete button is unreachable by keyboard (only recourse is the
  all-or-nothing "Clear Annotations").
- `static/js/search.js:10`: unguarded `new PagefindUI(...)` — if the
  pagefind bundle 404s, the ReferenceError aborts the whole handler
  including the `?q=` pre-fill that the selection-popup "Here" flow
  depends on.
- `static/js/semantic-search.js:55-56,96-107`: no
  `vectors.length === meta.length * DIM` consistency check — a stale
  CDN-cached mismatch yields NaN scores and silently garbage ranking.
  (Current files verified consistent: 1,256,448 bytes = 818 × 384 × 4.)
- `static/js/transclude.js:149-151` + `collapse.js:111-114`: nested
  transcludes render a bare placeholder (no rescan of injected content);
  `reinitCollapse` is not idempotent (would stack toggle buttons if ever
  called twice on the same container).
- `static/js/popups.js:985-988,1009-1014`: `daysBetween` uses `Math.abs`,
  so future dates render "N days ago" (now.js:17 handles this correctly).
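
For the semantic-search consistency bullet above, the arithmetic is simple enough to enforce on the build side as well as in the browser. A minimal sketch, assuming float32 vectors and the 384-dimension contract stated in this audit; `check_index` is a hypothetical helper, not a function in `embed.py`.

```python
DIM = 384              # paragraph-embedding dimension per the browser contract
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as raw float32


def check_index(index_bytes: int, meta_entries: int, dim: int = DIM) -> None:
    """Fail loudly when the vector file and the metadata disagree in size."""
    expected = meta_entries * dim * BYTES_PER_FLOAT32
    if index_bytes != expected:
        raise ValueError(
            f"semantic index mismatch: {index_bytes} bytes on disk, "
            f"but {meta_entries} entries x {dim} dims need {expected}"
        )


# The figures verified in this audit: 818 entries, 1,256,448 bytes.
check_index(1_256_448, 818)
```

The browser-side equivalent would compare the typed-array length against `meta.length * DIM` and fall back to an error status instead of ranking.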

### 5.8 Frontend NITs

- `static/js/copy.js:20-22,39`: code-less `<pre>` fallback copies the
  "copy" button label along with content.
- `static/js/score-reader.js:50`: URL rewritten to `?p=1` on every load
  even without a `?p=` param.
- `static/js/search-filters.js:271`: `parseInt(v,10) || 0` turns junk
  threshold input into an active ≥0 filter that matches everything.
- `static/js/selection-popup.js:90-95`: shift-keyup while typing capitals
  in the annotation picker re-summons the selection toolbar over it.

Verified clean: the semantic-search ↔ embed.py contract post-model-split
(DIM 384, 818-entry meta, no prefix for MiniLM — the nomic
`search_document:` prefix is confined to the build-only page path); XSS
escaping across semantic-search, popups providers, map tooltips,
annotations (sole exception §5.7 monogram); theme.js ↔ settings.js
storage schema identical; all JS selector contracts against templates
(including the uncommitted head/nav edits); popups/sidenotes
double-init guards; settings.js and gallery.js focus traps.

---

## 6. Templates & content

### 6.1 Draft in undocumented location is never built — **MED**

`content/drafts/inclusionist-manifesto.md`. WRITING.md:34 says drafts go
under `content/drafts/essays/`; `draftEssayPattern`
(`build/Patterns.hs:46-49`) matches only that, so this file is invisible
even to `make watch`/`make dev` — silently orphaned.

### 6.2 SIMD/PQC essay `repository:` URL 404s — **MED**

`content/essays/where-does-simd-help-post-quantum-cryptography/index.md:24`.
`https://git.levineuwirth.org/where-simd-helps` is missing the owner
segment — verified HTTP 404, while the sibling essay's
`.../neuwirth/beyond_comorbidity_indices` returns 200.

### 6.3 Tracked drafts contradict the gitignore policy — **MED**

`.gitignore:88` ignores `content/drafts/` as local-only "working notes,"
but `git ls-files -i -c` shows four tracked drafts
(`digital_progeny.md`, `modern_idolatry.md`, `test-essay.md`,
`university_care.md`) — ignore rules don't untrack, so edits are
auto-staged by `make build` and pushed publicly by deploy. The over-broad
`**/.env.*` pattern also matches the tracked `.env.example`.

### 6.4 Template/content LOWs and NITs

- `content/colophon.md:5`: `modified:` is dead frontmatter — nothing
  reads it; `$date-modified$` (page-footer.html:108) is Hakyll's
  `dateField` over the `date` key.
- Seven files end frontmatter with a valueless `confidence-history:`
  (YAML null; WRITING.md:97 documents a list of ints) — harmless, but
  `content/essays/scaling_outage.md` also retains the full WRITING.md
  scaffold comments in a published essay.
- `static/images/canto31.jpg`: still 4.0 MB (prior-audit §6.1 unfixed).
- `templates/blog-post.html:25,34`: `id="similar-links"` appears twice in
  mutually exclusive `$if$` branches — safe, fragile under edit.
- `content/drafts/essays/digital_progeny.md`: title duplicates the
  published "The Specification Dilemma" — stale draft.
- Frontmatter flags `home:`/`library:`/`links:`/`search:`/`portal:` are
  consumed (head.html CSS gates, default.html:6 `data-portal`) but
  undocumented in WRITING.md.

Verified clean: all `$partial(...)$` includes resolve; all ~140 distinct
template variables have context providers; no missing `alt` attributes,
tag-balance failures, or within-page duplicate IDs in composed pages; all
26 CSS files referenced by head.html exist; sampled enum values across
all sections are legal per WRITING.md and Contexts.hs validation lists.

---

## 7. Documentation / spec drift (WRITING.md, README.md)

### 7.1 `js:` page-script paths documented as content-relative; emitted root-relative — **MED**

`WRITING.md:773-775` vs `templates/default.html:37`
(`<script src="/$script-src$" defer>`). The doc claims a composition's
`js: scripts/widget.js` serves at `/music/symphony/scripts/widget.js`; the
template emits raw root-relative frontmatter. The only current user
(memento-mori) works by coincidence of its root-level route. A
composition following the doc would 404.

### 7.2 "Standalone page `content/my-page/index.md`" has no generic rule — **MED**

`WRITING.md:20` presents directory-form standalone pages as a general
capability; `build/Site.hs` hardcodes only `content/me/index.md` (293) and
`content/memento-mori/index.md` (307); the generic rule (351) matches flat
`content/*.md` only. A new `content/my-page/index.md` silently doesn't
build.

### 7.3 Portal table lists 8 portals; the build has 9 — **MED**

`WRITING.md:221-231` omits Photography, which is in `homePortals`
(`build/Site.hs:50-60`), the nav, and `content/tag-meta/photography.md`.

### 7.4 Three implemented frontmatter fields undocumented — **MED**

WRITING.md:3 claims to cover "all frontmatter fields"; zero hits for:
`summary:` (`build/Contexts.hs:415-427`, rendered by essay.html:16 and
reading.html:12, in live use), `revised:` (`build/Contexts.hs:815`
`getRevisions` — drives `$date-display$`/`$date-original$`/
`$revision-note$` and list sort order), `keywords:`
(`build/Contexts.hs:283` → `/bibliography/<kw>/` links).

### 7.5 Documentation LOWs

- `WRITING.md:268-269,82`: default citation style called "Chicago
  Author-Date"; the injected CSL (`build/Citations.hs:114,167-168`) is
  `data/chicago-notes.csl`, titled "Chicago Notes Bibliography".
- `README.md:12,19`: `make watch` described as "rebuilds on save without
  a server"; it runs Hakyll's preview server (WRITING.md:1139 has it
  right).
- `WRITING.md:105-109`: `history:` example ordering contradicts the code
  (see §3.5).

---

## 8. nginx, Makefile & deployment

### 8.1 Multi-line CSP value embeds literal `\` + LF bytes — **MED**

`nginx/security-headers.conf:60-71`. The
`Content-Security-Policy-Report-Only` value is a single quoted string
spanning 12 lines with trailing `\` characters — nginx has no
line-continuation inside quoted strings, so the emitted header contains
raw backslash, LF, and leading-space bytes between directives. Raw LF in
a header value is illegal in HTTP/2 (vhost example enables `http2 on`);
strict clients reject the whole response. Sent on every response even as
Report-Only. Must be collapsed to one line.
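
Collapsing the value is mechanical, and a small helper can do it (or serve as a lint in CI). The sketch below is an assumption about tooling, not something present in the repo: `collapse_header_value` is a hypothetical name, and the sample policy is illustrative apart from the `font-src 'self' data:` directive quoted in §8.2.

```python
import re


def collapse_header_value(raw: str) -> str:
    """Turn a backslash-continued, multi-line value into one header-safe line."""
    # Drop each backslash-plus-newline "continuation", then squeeze any
    # remaining whitespace runs (including stray newlines) to single spaces.
    without_continuations = re.sub(r"\\\s*\n", " ", raw)
    return re.sub(r"\s+", " ", without_continuations).strip()


# Illustrative multi-line value in the shape described above.
multi_line = (
    "default-src 'self'; \\\n"
    "    font-src 'self' data:; \\\n"
    "    connect-src 'self'"
)
one_line = collapse_header_value(multi_line)
```

The collapsed string can then be pasted into the `add_header` directive as a single quoted line; the helper does not validate the policy itself.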

### 8.2 CSP gaps that will fire under enforcement — **MED**

`nginx/security-headers.conf:66-67`. (a) `font-src 'self' data:` blocks
KaTeX webfonts: head.html:61 loads `katex.min.css` from cdn.jsdelivr.net,
whose relative font URLs resolve to the CDN. (b) `connect-src 'self'`
blocks the onnxruntime `.wasm` that transformers.js v2 (dynamically
imported in `static/js/semantic-search.js:25`) fetches from jsdelivr —
the config comment covers the same-origin model files but not the
runtime. Both latent while Report-Only.

### 8.3 Makefile auto-commit sweeps any pre-staged changes — **MED**

`Makefile:28-29`. `git add content/` followed by
`git diff --cached --quiet || git commit -m "auto: ..."` commits the
*entire index* — anything previously staged gets folded into an
`auto: <timestamp> [skip ci]` commit and pushed publicly on deploy. Use
`git commit -- content/` or verify no foreign paths are staged.

### 8.4 Makefile LOWs

- pdf-thumbs: the `find | while read` pipeline swallows `pdftoppm`
  failures (loop exit status is the last iteration's) — a corrupt PDF
  silently ships without a thumbnail.
- deploy: prerequisite order `clean build sign` is guaranteed only under
  serial make; no `.NOTPARALLEL:` guard for `-j` invocations. (Confirmed:
  deploy does run `clean` first; `.PHONY` is complete; `.env` export
  allowlist is sound.)
- `tools/hooks/pre-commit-marks.sh` is documented (Makefile:175 comment)
  but not installed — `.git/hooks/` has only samples and `core.hooksPath`
  is unset.

Verified clean: all seven `data/` JSON/YAML files parse;
`data/embed-cache-pages.npz` is untracked, so the new gitignore entry is
fully effective; nginx archive.conf's add_header-inheritance re-include is
correct; no redirect loops; popup-proxy rate-limit/cache zones correctly
documented for http{} scope.

---

## 9. Working-tree diff review (branding refresh + embed split)

The model contract is **intact** — the diff splits one MiniLM pipeline
into two: pages now use nomic-embed-text-v1.5 (768d, build-only, for
similar-links.json); paragraphs stay on all-MiniLM-L6-v2@c9745ed (384d,
the browser contract). download-model.sh, model-checksums.sha256,
semantic-search.js (`DIM = 384`), and both WRITING.md lines (1108 nomic
for Related-pages, 1128 MiniLM for client search) are all consistent.
Icon declarations all match real files (verified with `file`: apple-touch
180×180, favicon-96 96×96, manifest PNGs 192/512, og-image 1200×630
matching declared og:image dimensions; the webp sidecar was regenerated).

Open items beyond §1.3/§1.4/§4.1:

### 9.1 32.8 KB traced SVG inlined into every page — **MED**

`templates/partials/logo-mark.svg` (32,818 bytes, potrace-style single
giant `<path>`) is inlined via the nav partial into every HTML page —
a ~33 KB per-page weight regression (pre-compression). The two-tone
`--logo-ink`/`--logo-bg` cutout (components.css:72-98) genuinely needs
inline SVG or `<use>`; an external sprite + `<use href>` restores
cacheability. Better still: a hand-drawn or simplified path — a traced
bitmap at nav size carries detail that can never resolve.

### 9.2 Icon asset bloat — **LOW**

`static/favicon.ico` is now 71,766 bytes; parsed directory shows
16/32/48/64/128/256 px entries, the 128+256 pair alone 55.8 KB. The .ico
is only the legacy fallback (modern browsers take the SVG); 16+32+48
(~8 KB) is conventional. `static/favicon.svg` is a 32,844-byte traced
path. `static/images/link-icons/internal.svg` went ~2 KB → 32,818 bytes
yet renders at 0.7–1.6 rem via CSS mask in three stylesheets
(components.css:853, typography.css:833, popups.css:161).

### 9.3 Webmanifest regressions — **NIT**

`static/site.webmanifest`: `purpose` changed maskable→`any` for both
icons (Android adaptive launchers will letterbox; convention is separate
`any` + `maskable` entries); still no `start_url`/`scope`/`description`
(Lighthouse installability warnings). JSON valid; icons verified.

---

## 10. Prior audit (AUDIT.md 2026-05-07) follow-up

| Finding | Status |
|---|---|
| §1.1 freeze unsolvable | **Effectively still open** — aeson pin fixed, but the freeze broke again via `distributive` after a system update (§1.1 above); the underlying freeze-vs-system-db fragility is unaddressed |
| §1.3 Python version mismatch | Fixed (`requires-python = ">=3.14"` matches `.python-version`) |
| §1.4 model checksums | Fixed (`tools/model-checksums.sha256`, 5 entries) |
| §9.1 nginx headers | Fixed (`nginx/security-headers.conf` + vhost example, README'd) — but see §8.1/§8.2 for new issues in that file |
| §6.1 `canto31.jpg` 4 MB | **Unfixed** |
| robots.txt / sitemap | Fixed (Site.hs:941/963, present in `_site/`) |
| README `paper/`/`spec.md` ghosts | Fixed |
| rsync target quoting | Fixed |
| date-quoting doc | Fixed (WRITING.md:106) |
| tag-meta no-title exception | Fixed (WRITING.md:238-251) |

---

## Suggested triage order

1. ~~`tools/refreeze.sh`~~ (§1.1 — in progress)
2. Delete `data/embed-cache-pages.npz.tmp.npz`; widen the gitignore
   pattern; `git add` `logo-mark.svg` + `og-image.png` before committing
   the branding diff (§1.4, §4.1)
3. Guard `ArchiveIndex.hs` file reads with `doesFileExist` (§1.2)
4. Pin or sandbox the nomic remote code (§1.3)
5. Fix the `/fiction/`–`/poetry/` 404s (§2.1) and the production-visible
   frontend MEDs (§5.1, §5.2)
6. Collapse the nginx CSP to one line before ever flipping it to
   enforcing (§8.1, §8.2)
7. The rest by severity as time allows

17
Makefile
@ -1,5 +1,10 @@
  .PHONY: build deploy sign download-model download-pdfjs download-leaflet compress-assets convert-images pdf-thumbs pdfs watch clean dev audit-marks archive-gc archive-wayback archive-check

+ # deploy's prerequisite order (clean -> build -> sign) is only correct
+ # serially; under `make -j` they could interleave. This build has no
+ # intra-target parallelism worth preserving, so disable it outright.
+ .NOTPARALLEL:
+
  # Source .env for deploy / GitHub config if it exists.
  # .env format: KEY=value (one per line, no `export` prefix, no quotes needed).
  # Only the variables explicitly listed below are exported to recipe

@ -21,8 +26,12 @@ build:
  # so a stray secret dropped under content/ is NOT auto-staged. To
  # intentionally commit a normally-ignored file, use `git add -f`
  # manually before running `make build`.
+ #
+ # The commit and its guard are pathspec-limited to content/ so that
+ # anything the user had previously staged for other reasons is left
+ # staged, not silently swept into the auto-commit.
  @git add content/
- @git diff --cached --quiet || git commit -m "auto: $$(date -u +%Y-%m-%dT%H:%M:%SZ) [skip ci]"
+ @git diff --cached --quiet -- content/ || git commit -m "auto: $$(date -u +%Y-%m-%dT%H:%M:%SZ) [skip ci]" -- content/
  @mkdir -p data
  @date +%s > data/build-start.txt
  @./tools/convert-images.sh

@ -110,12 +119,16 @@ convert-images:
  # Thumbnails are written as static/papers/foo.thumb.png alongside each PDF.
  # Skipped silently when pdftoppm is not installed or static/papers/ is empty.
  pdf-thumbs:
+ # A failing pdftoppm must at least warn: the `find | while` pipeline's
+ # exit status is the last iteration's, so without the `||` a corrupt
+ # PDF would silently ship without a thumbnail.
  @if command -v pdftoppm >/dev/null 2>&1; then \
  find static/papers -name '*.pdf' 2>/dev/null | while read pdf; do \
  thumb="$${pdf%.pdf}.thumb"; \
  if [ ! -f "$${thumb}.png" ] || [ "$$pdf" -nt "$${thumb}.png" ]; then \
  echo " pdf-thumb $$pdf"; \
- pdftoppm -r 100 -f 1 -l 1 -png -singlefile "$$pdf" "$$thumb"; \
+ pdftoppm -r 100 -f 1 -l 1 -png -singlefile "$$pdf" "$$thumb" \
+ || echo "Warning: pdf-thumb failed for $$pdf (page ships without a thumbnail)" >&2; \
  fi; \
  done; \
  else \
|
|
|
|||
|
|
@ -9,14 +9,15 @@ with a custom build system in `build/` and a Haskell + JS + Python toolchain.
  ```sh
  make build # one-shot production build into _site/
  make dev # dev build (drafts visible) + local server on :8000
- make watch # cabal-watch rebuild (drafts visible)
+ make watch # Hakyll live-reload dev server (drafts visible)
  make clean # cabal run site -- clean
  make deploy # clean → build → sign → push → rsync to VPS
  ```

  `make build` always runs `make clean` implicitly when invoked from `make deploy`.
  For day-to-day work, prefer `make dev` (which serves the site on
- `http://localhost:8000`) or `make watch` (rebuilds on save without a server).
+ `http://localhost:8000`) or `make watch` (Hakyll's live-reload preview server,
+ which rebuilds on save and serves the site locally).

  **Run `make build` any time you add or replace binary assets** (JPEG/PNG
  figures, PDFs, music assets). `make dev` and `make watch` skip the

110
WRITING.md
@ -17,15 +17,22 @@ frontmatter fields, and every authoring feature available in the Markdown source
  | Fiction | `content/fiction/my-story.md` | `/fiction/my-story.html` |
  | Composition | `content/music/{slug}/index.md` | `/music/{slug}/` |
  | Standalone page | `content/my-page.md` | `/my-page.html` |
- | Standalone page (with co-located assets) | `content/my-page/index.md` | `/my-page.html` |
+ | Standalone page (with co-located assets; needs a dedicated rule) | `content/me/index.md` | `/me.html` |
  | Draft essay | `content/drafts/essays/my-draft.md` | `/drafts/essays/my-draft.html` (dev only) |

  File names become URL slugs. Use lowercase, hyphen-separated words.

- If a standalone page embeds co-located SVG score fragments or other relative assets,
- place it in its own directory (`content/my-page/index.md`) rather than as a flat file.
- Score fragment paths are resolved relative to the source file's directory; a flat
- `content/my-page.md` would resolve them from `content/`, which is wrong.
+ Flat `content/<page>.md` is the generic standalone form — any flat file dropped
+ into `content/` builds automatically. Directory-form standalone pages
+ (`content/my-page/index.md`) are **not** picked up by the generic rule; each one
+ requires its own dedicated `match` rule in `build/Site.hs`. The two existing
+ ones are `content/me/index.md` and `content/memento-mori/index.md` — follow
+ their pattern when adding another.
+
+ The directory form exists for pages that embed co-located SVG score fragments
+ or other relative assets: score fragment paths are resolved relative to the
+ source file's directory, and a flat `content/my-page.md` would resolve them
+ from `content/`, which is wrong.

  ---


@ -65,9 +72,12 @@ subtitle: "An Optional Secondary Line" # optional; rendered below the title in
  date: 2026-03-15 # required; used for ordering, feed, and display
  abstract: > # optional; shown in the metadata block and link previews
  A one-paragraph description of the piece.
+ summary: | # optional; rendered in a "Summary" box near the abstract
+ A structured summary. **Markdown allowed** — bold, lists, multiple paragraphs.
  tags: # optional; see Tags section
  - nonfiction
  - nonfiction/philosophy
+ keywords: [lattices, simd] # optional; links to /bibliography/<keyword>/ pages (list or comma-separated string)
  authors: # optional; overrides the default "Levi Neuwirth" link
  - "Levi Neuwirth | /me.html"
  - "Collaborator | https://their.site"

@ -79,7 +89,7 @@ further-reading: # optional; see Citations section
  - someKey
  - anotherKey
  bibliography: data/custom.bib # optional; overrides data/bibliography.bib
- csl: data/custom.csl # optional; overrides Chicago Author-Date
+ csl: data/custom.csl # optional; overrides Chicago Notes Bibliography
  no-collapse: true # optional; disables collapsible h2/h3 sections
  repository: https://git.levineuwirth.org/levi/repo # optional; "Repository" link in metadata
  preprint: /papers/my-essay.pdf # optional; "Preprint" link in metadata (typeset PDF version)

@ -101,12 +111,20 @@ confidence-history: # list of integers; trend arrow derived from last two
  peer-status: under-review # optional; unreviewed (default) | under-review | peer-reviewed | published | retracted
  result-shape: mixed # optional; positive | negative | mixed | comparative | descriptive

- # Version history — optional; falls back to git log, then to date frontmatter
+ # Version history — optional; falls back to git log, then to date frontmatter.
+ # Entries may be listed in any order — they are sorted by date at build time.
  history:
- - date: 2026-03-01 # ISO date; unquoted is fine (the Haskell YAML parser keeps it as a string)
- note: Initial draft
- - date: 2026-03-14
+ - date: 2026-03-14 # ISO date; unquoted is fine (the Haskell YAML parser keeps it as a string)
  note: Expanded typography section; added citations
+ - date: 2026-03-01
+ note: Initial draft
+
+ # Revision log — optional; drives the date shown on cards and list pages
+ # (see Revision dates section)
+ revised:
+ - date: "2026-04-10"
+ note: "expanded the section on typography"
+ - date: "2026-03-20" # note is optional per-entry
  ---
  ```


@ -226,6 +244,7 @@ The top-level segment maps to a **portal** in the nav:
  | Miscellany | `/miscellany/` |
  | Music | `/music/` |
  | Nonfiction | `/nonfiction/` |
+ | Photography | `/photography/` |
  | Poetry | `/poetry/` |
  | Research | `/research/` |
  | Tech | `/tech/` |

@ -265,7 +284,8 @@ The URL part is optional.

  ## Citations

- The citation pipeline uses Chicago Author-Date style. The bibliography lives at
+ The citation pipeline uses Chicago Notes Bibliography style
+ (`data/chicago-notes.csl`). The bibliography lives at
  `data/bibliography.bib` (BibLaTeX format) by default; override per-page with
  `bibliography` and `csl`.


@ -278,7 +298,7 @@ Multiple sources agree.[@jones2019; @brown2021]
  ```

  Inline citations render as numbered superscripts `[1]`, `[2]`, etc. The
- bibliography section appears automatically in the page footer. `citations.js`
+ bibliography section appears automatically in the page footer. `popups.js`
  adds hover previews showing the full reference.

  ### Further reading

@ -754,9 +774,8 @@ at the top of the catalog.
  ## Page scripts

  For pages that need custom JavaScript (interactive widgets, visualisations, etc.),
- place the JS file alongside the content and reference it via the `js:` frontmatter
- key. The file is copied to `_site/` and injected as a deferred `<script>` at the
- bottom of `<body>`.
+ reference the JS file via the `js:` frontmatter key. The file is injected as a
+ deferred `<script>` at the bottom of `<body>`.

  ```yaml
  js: scripts/memento-mori.js # single file
|
|
@ -770,12 +789,18 @@ js:
|
|||
- scripts/widget-b.js
|
||||
```
|
||||
|
||||
Paths are relative to the content file. A composition at
|
||||
`content/music/symphony/index.md` with `js: scripts/widget.js` serves the
|
||||
script at `/music/symphony/scripts/widget.js`.
|
||||
Paths are **site-root-relative**, not relative to the content file: the template
|
||||
emits the value verbatim with a leading `/` prepended. Write the path without a
|
||||
leading slash. `js: scripts/widget.js` loads `/scripts/widget.js` regardless of
|
||||
where the page lives — a composition at `content/music/symphony/index.md` with
|
||||
that value does *not* get `/music/symphony/scripts/widget.js`.
|
||||
|
||||
No changes to the build system are needed — the `content/**/*.js` glob rule
|
||||
copies all JS files from `content/` to `_site/` automatically.
|
||||
The script file must live where the build serves that URL. The `content/**/*.js`
|
||||
glob rule copies JS files to `_site/` with the `content/` prefix stripped, so
|
||||
`content/scripts/widget.js` is served at `/scripts/widget.js` — this is the
|
||||
current convention (the memento-mori page keeps its script at
|
||||
`content/scripts/memento-mori.js` and references it as
|
||||
`js: scripts/memento-mori.js`).
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -896,7 +921,8 @@ should copy and adapt it; the file documents the §2.2 visual contract
|
|||
|
||||
The version history footer section uses a three-tier fallback:
|
||||
|
||||
1. **`history:` frontmatter** — your authored notes, shown exactly as written.
|
||||
1. **`history:` frontmatter** — your authored notes. Entries may be listed in
|
||||
any order — they are sorted by date at build time.
|
||||
2. **Git log** — if no `history:` key, dates are extracted from `git log --follow`.
|
||||
Entries have no message (date only).
|
||||
3. **`date:` frontmatter** — if git has no commits for the file, falls back to
|
||||
|
|
@ -910,14 +936,50 @@ descriptive:
|
|||
|
||||
```yaml
|
||||
history:
|
||||
- date: 2026-03-01
|
||||
note: Initial draft
|
||||
- date: 2026-03-14
|
||||
note: Expanded section 3; incorporated feedback from peer review
|
||||
- date: 2026-03-01
|
||||
note: Initial draft
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Revision dates
|
||||
|
||||
The `revised:` key records substantive revisions and drives the date shown on
|
||||
item cards and list pages. Two accepted shapes:
|
||||
|
||||
```yaml
|
||||
revised: "2026-04-10" # scalar shorthand — one revision, no note
|
||||
|
||||
revised: # canonical list of objects
|
||||
- date: "2026-04-10"
|
||||
note: "expanded the section on Shestov"
|
||||
- date: "2025-12-03" # note is optional per-entry
|
||||
```
|
||||
|
||||
Dates are ISO `YYYY-MM-DD` strings. Entries may be listed in any order — they
|
||||
are sorted by date at build time, most recent first. Entries missing `date:`
|
||||
or carrying non-string values are silently dropped; the build never fails on
|
||||
a malformed `revised:` block.
|
||||
|
||||
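The ordering and dropping rules described above can be sketched in Haskell. This is an illustration under stated assumptions, not the site's implementation: `RawRevision` and `cleanRevisions` are hypothetical names, and entries are modelled as optional strings rather than parsed YAML values.

```haskell
import Data.List (sortOn)
import Data.Maybe (mapMaybe)
import Data.Ord (Down (..))

-- Hypothetical stand-in for one parsed `revised:` entry; the real build
-- reads these from YAML frontmatter.
data RawRevision = RawRevision
  { rawDate :: Maybe String  -- Nothing models a missing or non-string date
  , rawNote :: Maybe String  -- the note is optional per entry
  }

-- Drop entries without a usable date, then sort most recent first.
-- ISO YYYY-MM-DD strings sort chronologically under plain string
-- comparison, so no date parsing is needed for the ordering.
cleanRevisions :: [RawRevision] -> [(String, Maybe String)]
cleanRevisions = sortOn (Down . fst) . mapMaybe keep
  where
    keep r = fmap (\d -> (d, rawNote r)) (rawDate r)
```

Under this sketch the head of the result is the entry that would supply the display date and the revision note; an empty result corresponds to a page with no usable `revised:` data.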
Effects:
|
||||
|
||||
- **`$date-display$` / `$date-iso$`** — cards and list pages show the
|
||||
most-recent revision date instead of the creation date.
|
||||
- **Sort order** — revision-aware lists (`/new.html`, tag pages, the library)
|
||||
sort by the display date, so a freshly revised piece moves to the top.
|
||||
- **`$date-original$`** — when the latest revision date differs from the
|
||||
creation date, the card adds a "revised from …" annotation showing the
|
||||
original date.
|
||||
- **`$revision-note$`** — the note on the most-recent entry renders as an
|
||||
italicized line under the abstract on the card.
|
||||
|
||||
`revised:` is independent of `history:` (the version-history footer above);
|
||||
add a matching `history:` entry if the revision should appear there too.
|
||||
|
||||
---
|
||||
|
||||
## Typography features
|
||||
|
||||
Applied automatically at build time; no markup needed.
|
||||
|
|
@ -1125,7 +1187,7 @@ These pages are built automatically and require no content files or markup:
|
|||
| Author indexes | `/authors/<slug>/` | All content attributed to an author |
|
||||
| Random manifest | `/random-pages.json` | JSON array of page URLs for the random-page button |
|
||||
| Atom feeds | `/feed.xml`, `/music/feed.xml` | All content feed + music-only feed |
|
||||
| Search | `/search.html` | Pagefind full-text search + client-side semantic search (`nomic-embed-text-v1.5` ONNX model) |
|
||||
| Search | `/search.html` | Pagefind full-text search + client-side semantic search (`all-MiniLM-L6-v2` ONNX model) |
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -163,10 +163,18 @@ readManifest = do
|
|||
else do
|
||||
parsed <- Y.decodeFileEither manifestPath
|
||||
case parsed of
|
||||
Right es -> return es
|
||||
Left e -> do
|
||||
hPutStrLn stderr $
|
||||
"[archive] FATAL: manifest.yaml: " ++ show e
|
||||
-- An empty or all-comments file decodes as YAML @Null@,
|
||||
-- not as a list. That is the legitimate "drained to zero
|
||||
-- entries" state, not a broken file — treat it as the
|
||||
-- empty manifest the absent-file branch already supports.
|
||||
Right A.Null -> return []
|
||||
Right v -> case A.fromJSON v of
|
||||
A.Success es -> return es
|
||||
A.Error msg -> fatal msg
|
||||
Left e -> fatal (show e)
|
||||
where
|
||||
fatal msg = do
|
||||
hPutStrLn stderr $ "[archive] FATAL: manifest.yaml: " ++ msg
|
||||
exitFailure
|
||||
|
||||
readRemovedUrls :: IO (Set.Set T.Text)
|
||||
|
|
@ -265,8 +273,17 @@ loadArchiveEntries = do
|
|||
removed <- readRemovedUrls
|
||||
validateManifestEntries manifest removed
|
||||
provByUrl <- readProvenances
|
||||
-- Join on normalised URLs, like every other URL comparison in the
|
||||
-- archive system: editing a manifest URL to a normalisation-
|
||||
-- equivalent form (http->https, trailing slash, tracking params)
|
||||
-- must keep matching its provenance — an exact-string join would
|
||||
-- silently unpublish the page while ArchiveIndex's normalised
|
||||
-- filter keeps links pointing at it. Key collisions can't occur:
|
||||
-- validateManifestEntries rejects normalised duplicates.
|
||||
let normKey = T.unpack . normalizeUrl . T.pack
|
||||
provByNorm = Map.mapKeys normKey provByUrl
|
||||
fmap catMaybes $ forM manifest $ \me ->
|
||||
case Map.lookup (meUrl me) provByUrl of
|
||||
case Map.lookup (normKey (meUrl me)) provByNorm of
|
||||
Nothing -> return Nothing
|
||||
Just (slug, pv) -> do
|
||||
let dir = "archive/" ++ slug
|
||||
|
|
@ -299,6 +316,12 @@ loadArchiveEntries = do
|
|||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | All archive rules. Called once from 'Site.rules'.
|
||||
--
|
||||
-- The manifest is read here in 'preprocess' (and 'ArchiveIndex' reads
|
||||
-- its sidecars in once-per-process CAFs), so archive state is fixed at
|
||||
-- rule-generation time: under @site watch@, edits to @manifest.yaml@,
|
||||
-- @removed.yaml@, or the regenerated state JSONs are not picked up
|
||||
-- until the process restarts. One-shot builds are unaffected.
|
||||
archiveRules :: Rules ()
|
||||
archiveRules = do
|
||||
entries <- preprocess loadArchiveEntries
|
||||
|
|
@ -562,10 +585,17 @@ tallyOf xs = intercalate " \183 "
|
|||
| (k, c) <- Map.toList (Map.fromListWith (+) [ (x, 1 :: Int) | x <- xs ]) ]
|
||||
|
||||
-- | The median of a list of ages, as @"N days"@; an em dash when empty.
|
||||
-- An even-length list takes the mean of the two middle elements,
|
||||
-- rounded to the nearest whole day.
|
||||
medianAge :: [Int] -> String
|
||||
medianAge [] = "\8212"
|
||||
medianAge xs =
|
||||
let m = sort xs !! (length xs `div` 2)
|
||||
let sorted = sort xs
|
||||
n = length sorted
|
||||
upper = sorted !! (n `div` 2)
|
||||
lower = sorted !! (n `div` 2 - 1) -- forced only when n is even
|
||||
m | odd n = upper
|
||||
| otherwise = (lower + upper + 1) `div` 2
|
||||
in show m ++ if m == 1 then " day" else " days"
|
||||
|
||||
-- | Parse a @YYYY-MM-DD@ date; 'Nothing' on malformed input.
|
||||
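A few worked cases for the revised `medianAge`, with the function restated so the block loads on its own. The sample lists in `medianExamples` are invented for illustration; only the function body comes from the hunk above.

```haskell
import Data.List (sort)

-- Restated from the hunk above so this block is self-contained.
medianAge :: [Int] -> String
medianAge [] = "\8212"
medianAge xs =
  let sorted = sort xs
      n      = length sorted
      upper  = sorted !! (n `div` 2)
      lower  = sorted !! (n `div` 2 - 1)  -- lazy: only evaluated when n is even
      m | odd n     = upper
        | otherwise = (lower + upper + 1) `div` 2  -- mean, halves round up
  in show m ++ if m == 1 then " day" else " days"

-- Invented sample inputs paired with the result each should produce.
medianExamples :: [(String, String)]
medianExamples =
  [ (medianAge [10, 2, 7], "7 days")  -- odd length: the middle element
  , (medianAge [1, 4],     "3 days")  -- even length: (1 + 4 + 1) `div` 2
  , (medianAge [0, 1],     "1 day")   -- 0.5 rounds up, singular unit
  , (medianAge [],         "\8212")   -- empty list: em dash placeholder
  ]
```

The `[1, 4]` case is where the change is visible: the previous definition picked the upper middle element and would have reported 4 days.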
|
|
|
|||
|
|
@ -15,11 +15,18 @@
|
|||
-- * @Archive@ — surfaces each entry's rot status on its page, the
|
||||
-- @/archive/@ index, and the @/build/@ telemetry.
|
||||
--
|
||||
-- Both files are loaded once per build via @unsafePerformIO@ CAFs. An
|
||||
-- absent or malformed file degrades safely: an empty index makes the
|
||||
-- Both files are loaded once per *process* via NOINLINE
|
||||
-- @unsafePerformIO@ CAFs (as are the manifest/removed URL sets below).
|
||||
-- An absent or malformed file degrades safely: an empty index makes the
|
||||
-- link consumers no-op; an absent state file makes every entry @Live@
|
||||
-- (the safe default — no link flip). @archive.py check@ is decoupled
|
||||
-- from @make build@; a build consumes whatever state file exists.
|
||||
--
|
||||
-- Consequence of the once-per-process read (shared with the manifest
|
||||
-- read in 'Archive.archiveRules'): under @site watch@, edits to
|
||||
-- @manifest.yaml@, @removed.yaml@, or the regenerated state JSONs are
|
||||
-- not re-read — the server renders stale archive state until restart.
|
||||
-- One-shot builds (@make build@ / @make deploy@) are unaffected.
|
||||
module ArchiveIndex
|
||||
( ArchiveStatus (..)
|
||||
, statusName
|
||||
|
|
@ -132,6 +139,10 @@ activeUrls = unsafePerformIO $ do
|
|||
{-# NOINLINE rawIndex #-}
|
||||
rawIndex :: Map Text IdxEntry
|
||||
rawIndex = unsafePerformIO $ do
|
||||
exists <- doesFileExist indexPath
|
||||
if not exists
|
||||
then return Map.empty
|
||||
else do
|
||||
decoded <- A.eitherDecodeFileStrict' indexPath
|
||||
let parsed = either (const Map.empty) id decoded
|
||||
return $ Map.filterWithKey
|
||||
|
|
@ -142,6 +153,10 @@ rawIndex = unsafePerformIO $ do
|
|||
{-# NOINLINE rawState #-}
|
||||
rawState :: Map Text ArchiveStatus
|
||||
rawState = unsafePerformIO $ do
|
||||
exists <- doesFileExist statePath
|
||||
if not exists
|
||||
then return Map.empty
|
||||
else do
|
||||
decoded <- A.eitherDecodeFileStrict' statePath
|
||||
return $ either (const Map.empty) (Map.map seStatus) decoded
|
||||
|
||||
|
|
|
|||
|
|
@ -138,6 +138,8 @@ isPageLink u
|
|||
| otherwise =
|
||||
not (T.isPrefixOf "http://" u) &&
|
||||
not (T.isPrefixOf "https://" u) &&
|
||||
-- protocol-relative //host/path is external, not a page path
|
||||
not (T.isPrefixOf "//" u) &&
|
||||
not (T.isPrefixOf "#" u) &&
|
||||
not (T.isPrefixOf "mailto:" u) &&
|
||||
not (T.isPrefixOf "tel:" u) &&
|
||||
|
|
@ -213,18 +215,28 @@ splitSentences = go []
|
|||
-- For every internal link in a paragraph, emit an entry carrying the HTML
|
||||
-- of the sentence containing the link (default display) and the HTML of
|
||||
-- the full paragraph (hover/popup context).
|
||||
-- Recurses into Div, BlockQuote, BulletList, and OrderedList.
|
||||
-- Recurses into Div, BlockQuote, BulletList, OrderedList, and
|
||||
-- DefinitionList. @Plain@ matters as much as @Para@: Pandoc renders
|
||||
-- tight list items (the default @- item@ Markdown form) as @Plain@
|
||||
-- blocks, so without it every link written in a tight list would be
|
||||
-- invisible to the backlinks system.
|
||||
extractLinksWithContext :: Pandoc -> [LinkEntry]
|
||||
extractLinksWithContext (Pandoc _ blocks) = concatMap go blocks
|
||||
where
|
||||
go :: Block -> [LinkEntry]
|
||||
go (Para inlines) = paraEntries inlines
|
||||
go (Plain inlines) = paraEntries inlines
|
||||
go (BlockQuote bs) = concatMap go bs
|
||||
go (Div _ bs) = concatMap go bs
|
||||
go (BulletList items) = concatMap (concatMap go) items
|
||||
go (OrderedList _ items) = concatMap (concatMap go) items
|
||||
go (DefinitionList defs) = concatMap defEntries defs
|
||||
go _ = []
|
||||
|
||||
defEntries :: ([Inline], [[Block]]) -> [LinkEntry]
|
||||
defEntries (term, bodies) =
|
||||
paraEntries term ++ concatMap (concatMap go) bodies
|
||||
|
||||
paraEntries :: [Inline] -> [LinkEntry]
|
||||
paraEntries inlines =
|
||||
let paraHtml = renderInlines inlines
|
||||
|
|
@ -268,17 +280,25 @@ linksCompiler = do
|
|||
-- URL normalisation
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Normalise an internal URL as a map key: strip query string, fragment,
|
||||
-- and trailing @.html@; ensure a leading slash; percent-decode the path
|
||||
-- so that @\/essays\/caf%C3%A9@ and @\/essays\/café@ collide on the same
|
||||
-- key.
|
||||
-- | Normalise an internal URL as a map key: strip query string and
|
||||
-- fragment; ensure a leading slash; strip a trailing @index.html@
|
||||
-- (keeping the directory slash) before the bare @.html@ extension, so a
|
||||
-- page routed @essays\/foo\/index.html@ and a body link authored in the
|
||||
-- canonical directory form @\/essays\/foo\/@ collide on the same key
|
||||
-- (mirrors 'SimilarLinks.normaliseUrl'); percent-decode the path so that
|
||||
-- @\/essays\/caf%C3%A9@ and @\/essays\/café@ collide on the same key.
|
||||
--
|
||||
-- Both sides of the backlink join go through this function: page keys
|
||||
-- via 'backlinksFieldWith' (@normaliseUrl ("/" ++ route)@) and link
|
||||
-- targets via 'targetKey' — so the two always agree.
|
||||
normaliseUrl :: String -> String
|
||||
normaliseUrl url =
|
||||
let t = T.pack url
|
||||
t1 = fst (T.breakOn "?" (fst (T.breakOn "#" t)))
|
||||
t2 = if T.isPrefixOf "/" t1 then t1 else "/" `T.append` t1
|
||||
t3 = fromMaybe t2 (T.stripSuffix ".html" t2)
|
||||
in percentDecode (T.unpack t3)
|
||||
t3 = fromMaybe t2 (T.stripSuffix "index.html" t2)
|
||||
t4 = fromMaybe t3 (T.stripSuffix ".html" t3)
|
||||
in percentDecode (T.unpack t4)
|
||||
|
||||
-- | Decode percent-escapes (@%XX@) into raw bytes, then re-interpret the
|
||||
-- resulting bytestring as UTF-8. Invalid escapes are passed through
|
||||
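The two-step suffix stripping can be tried in isolation with the simplified sketch below. It assumes percent-decoding is not needed for the sample inputs and therefore omits the `percentDecode` call; `normaliseKey` is a hypothetical name chosen to avoid implying it is the module's own function.

```haskell
import Data.Maybe (fromMaybe)
import qualified Data.Text as T

-- Simplified sketch of the normalisation in the hunk above, without the
-- percent-decoding step.
normaliseKey :: String -> String
normaliseKey url =
  let t  = T.pack url
      -- drop the fragment first, then the query string
      t1 = fst (T.breakOn (T.pack "?") (fst (T.breakOn (T.pack "#") t)))
      -- ensure a leading slash
      t2 = if T.isPrefixOf (T.pack "/") t1 then t1 else T.cons '/' t1
      -- strip index.html first (keeps the directory slash), then .html
      t3 = fromMaybe t2 (T.stripSuffix (T.pack "index.html") t2)
      t4 = fromMaybe t3 (T.stripSuffix (T.pack ".html") t3)
  in T.unpack t4
```

With this sketch, `normaliseKey "essays/foo/index.html"` and `normaliseKey "/essays/foo/"` both give `"/essays/foo/"`, which is the collision the comment describes. One caveat the sketch makes visible: the suffix test matches any path ending in `index.html`, so a hypothetical `/reindex.html` would be shortened to `/re`. Whether that matters depends on the site's actual routes.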
|
|
|
|||
|
|
@ -72,6 +72,8 @@ parseBibExtras path = Map.fromList . parseBib <$> readFile' path
|
|||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Enumerate all entries in a .bib file as (citekey, extra) pairs.
|
||||
-- @\@string@ \/ @\@comment@ \/ @\@preamble@ blocks (case-insensitive)
|
||||
-- carry no citekey and are skipped wholesale.
|
||||
parseBib :: String -> [(String, BibExtra)]
|
||||
parseBib input = go (dropTo '@' input)
|
||||
where
|
||||
|
|
@ -81,10 +83,17 @@ parseBib input = go (dropTo '@' input)
|
|||
go [] = []
|
||||
go ('@':rest) =
|
||||
let -- Entry type, then '{', then citekey, then ',', then fields, then '}'.
|
||||
r1 = dropWhile isAlphaNum rest -- skip type name
|
||||
(typeName, r1) = span isAlphaNum rest
|
||||
r2 = dropWhile isSpace r1
|
||||
in case r2 of
|
||||
'{':r3 ->
|
||||
'{':r3
|
||||
-- Not citekey entries: a @string macro name (or the body
|
||||
-- of a @comment/@preamble) must never be parsed as a
|
||||
-- citekey. Skip the balanced brace group and carry on.
|
||||
| map toLower typeName `elem` ["string", "comment", "preamble"] ->
|
||||
let (_, r4) = readBraces 1 "" r3
|
||||
in go (dropTo '@' r4)
|
||||
| otherwise ->
|
||||
let (citekey, r4) = span (\c -> c /= ',' && not (isSpace c)) r3
|
||||
r5 = dropWhile (\c -> c /= ',' && c /= '}') r4
|
||||
in case r5 of
|
||||
|
|
|
|||
|
|
@ -99,7 +99,12 @@ parseCatalogEntry item = do
|
|||
year = parseYear meta
|
||||
dur = lookupString "duration" meta
|
||||
instr = lookupString "instrumentation" meta
|
||||
cat = fromMaybe "other" (lookupString "category" meta)
|
||||
-- Fold unknown categories into the canonical "other"
|
||||
-- bucket here: two distinct unknown values share a rank
|
||||
-- but would groupBy into separate groups, rendering as
|
||||
-- adjacent duplicate "Other" sections.
|
||||
rawCat = fromMaybe "other" (lookupString "category" meta)
|
||||
cat = if rawCat `elem` categoryOrder then rawCat else "other"
|
||||
return $ Just CatalogEntry
|
||||
{ ceTitle = title
|
||||
, ceUrl = url
|
||||
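The duplicate-section problem that the fold avoids can be reproduced with a small sketch. The contents of `categoryOrder` here are invented (the real list is not shown in this diff), and `rankOf`, `foldCategory`, and `sections` are hypothetical helpers standing in for the catalog's sorting and grouping.

```haskell
import Data.List (elemIndex, group, sortOn)
import Data.Maybe (fromMaybe)

-- Invented canonical list; the real 'categoryOrder' is not visible here.
categoryOrder :: [String]
categoryOrder = ["orchestral", "chamber", "other"]

-- Unknown categories share the rank of the final "other" bucket.
rankOf :: String -> Int
rankOf c = fromMaybe (length categoryOrder - 1) (elemIndex c categoryOrder)

-- The fold from the hunk above: anything unknown becomes "other".
foldCategory :: String -> String
foldCategory c = if c `elem` categoryOrder then c else "other"

-- Sort by rank, then group adjacent equal category names.
sections :: [String] -> [[String]]
sections = group . sortOn rankOf
```

With the raw values, `sections ["zarzuela", "chamber", "jingle"]` yields three groups, two of which would both render under an "Other" heading. After `map foldCategory` the same input yields two groups, with both unknown entries in a single `"other"` group.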
|
|
|
|||
|
|
@ -9,7 +9,8 @@ module Commonplace
|
|||
import Data.Aeson (FromJSON (..), withObject, (.:), (.:?), (.!=))
|
||||
import Data.List (nub, sortBy)
|
||||
import Data.Ord (comparing, Down (..))
|
||||
import qualified Data.ByteString.Char8 as BS
|
||||
import qualified Data.Text as T
|
||||
import qualified Data.Text.Encoding as TE
|
||||
import qualified Data.Yaml as Y
|
||||
import Hakyll hiding (escapeHtml, renderTags)
|
||||
import Contexts (siteCtx)
|
||||
|
|
@ -140,7 +141,10 @@ loadCommonplace :: Compiler [CPEntry]
|
|||
loadCommonplace = do
|
||||
rawItem <- load (fromFilePath "data/commonplace.yaml") :: Compiler (Item String)
|
||||
let raw = itemBody rawItem
|
||||
case Y.decodeEither' (BS.pack raw) of
|
||||
-- encodeUtf8, not Char8.pack: Char8 truncates each Char to 8 bits,
|
||||
-- silently corrupting any codepoint above 0x7F (same hazard Now.hs
|
||||
-- documents — em-dash 0x2014 would become control char 0x14).
|
||||
case Y.decodeEither' (TE.encodeUtf8 (T.pack raw)) of
|
||||
Left err -> fail ("commonplace.yaml: " ++ show err)
|
||||
Right entries -> return entries
|
||||
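The truncation hazard named in the comment can be checked directly. The sketch below only uses the `bytestring` and `text` packages; `emDash`, `truncatedBytes`, and `utf8Bytes` are illustrative names.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)

-- U+2014, the em dash used as the example in the comment above.
emDash :: String
emDash = "\8212"

-- Char8.pack keeps only the low 8 bits of each Char: 0x2014 becomes 0x14.
truncatedBytes :: [Word8]
truncatedBytes = B.unpack (BS.pack emDash)

-- encodeUtf8 produces the proper three-byte UTF-8 sequence.
utf8Bytes :: [Word8]
utf8Bytes = B.unpack (TE.encodeUtf8 (T.pack emDash))
```

Here `truncatedBytes` is `[0x14]`, a control character, while `utf8Bytes` is `[0xE2, 0x80, 0x94]`, which a YAML decoder can read back as the original character.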
|
||||
|
|
|
|||
|
|
@ -22,6 +22,7 @@ module Contexts
|
|||
, recentFirstByDisplay
|
||||
, Revision (..)
|
||||
, getRevisions
|
||||
, isProvedConfidence
|
||||
) where
|
||||
|
||||
import Data.Aeson (Value (..))
|
||||
|
|
@ -86,7 +87,12 @@ affiliationField = listFieldWith "affiliation-links" ctx $ \item -> do
|
|||
let entries = case lookupStringList "affiliation" meta of
|
||||
Just xs -> xs
|
||||
Nothing -> maybe [] (:[]) (lookupString "affiliation" meta)
|
||||
return $ map (Item (fromFilePath "") . parseEntry) entries
|
||||
-- noResult, not an empty list: Hakyll's $if$ treats an empty
|
||||
-- ListField as truthy, so returning [] would render the wrapper
|
||||
-- markup (an empty .meta-affiliation row) on every page.
|
||||
if null entries
|
||||
then noResult "no affiliation"
|
||||
else return $ map (Item (fromFilePath "") . parseEntry) entries
|
||||
where
|
||||
ctx = field "affiliation-name" (return . fst . itemBody)
|
||||
<> field "affiliation-url" (\i -> let u = snd (itemBody i)
|
||||
|
|
@ -170,10 +176,17 @@ pageScriptsField = listFieldWith "page-scripts" ctx $ \item -> do
|
|||
-- | List context field exposing an item's own (non-expanded) tags as
|
||||
-- @tag-name@ / @tag-url@ objects.
|
||||
--
|
||||
-- Fails with 'noResult' when the item has no tags — same discipline
|
||||
-- as the @Excluding@ variants below — so @$if(...)$@ gates are false
|
||||
-- and templates don't emit empty tag-wrapper markup.
|
||||
--
|
||||
-- $for(essay-tags)$<a href="$tag-url$">$tag-name$</a>$endfor$
|
||||
tagLinksField :: String -> Context a
|
||||
tagLinksField fieldName = listFieldWith fieldName ctx $ \item ->
|
||||
map toItem <$> getTags (itemIdentifier item)
|
||||
tagLinksField fieldName = listFieldWith fieldName ctx $ \item -> do
|
||||
ts <- getTags (itemIdentifier item)
|
||||
if null ts
|
||||
then noResult "no tags"
|
||||
else return (map toItem ts)
|
||||
where
|
||||
toItem t = Item (fromFilePath (t ++ "/index.html")) t
|
||||
ctx = field "tag-name" (return . itemBody)
|
||||
|
|
@ -345,7 +358,7 @@ abstractField :: Context String
|
|||
abstractField = field "abstract" $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
case lookupString "abstract" meta of
|
||||
Nothing -> fail "no abstract"
|
||||
Nothing -> noResult "no abstract"
|
||||
Just src -> do
|
||||
let pandocResult = runPure $ do
|
||||
doc <- readMarkdown defaultHakyllReaderOptions (T.pack src)
|
||||
|
|
@ -379,7 +392,7 @@ descriptionField :: Context String
|
|||
descriptionField = field "description" $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
case lookupString "abstract" meta of
|
||||
Nothing -> fail "no abstract"
|
||||
Nothing -> noResult "no abstract"
|
||||
Just src -> do
|
||||
let pandocResult = runPure $ do
|
||||
doc <- readMarkdown defaultHakyllReaderOptions (T.pack src)
|
||||
|
|
@ -416,7 +429,7 @@ summaryField :: Context String
|
|||
summaryField = field "summary" $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
case lookupString "summary" meta of
|
||||
Nothing -> fail "no summary"
|
||||
Nothing -> noResult "no summary"
|
||||
Just src -> do
|
||||
let pandocResult = runPure $ do
|
||||
doc <- readMarkdown defaultHakyllReaderOptions (T.pack src)
|
||||
|
|
@ -462,11 +475,11 @@ bibliographyField = bibContent <> hasCitations
|
|||
where
|
||||
bibContent = field "bibliography" $ \item -> do
|
||||
bib <- itemBody <$> loadSnapshot (itemIdentifier item) "bibliography"
|
||||
if null bib then fail "no bibliography" else return bib
|
||||
if null bib then noResult "no bibliography" else return bib
|
||||
hasCitations = field "has-citations" $ \item -> do
|
||||
bib <- itemBody <$> (loadSnapshot (itemIdentifier item) "bibliography"
|
||||
:: Compiler (Item String))
|
||||
if null bib then fail "no citations" else return "true"
|
||||
if null bib then noResult "no citations" else return "true"
|
||||
|
||||
-- | Further-reading field: loads the further-reading HTML saved by essayCompiler.
|
||||
-- Returns noResult (making $if(further-reading-refs)$ false) when empty.
|
||||
|
|
@ -474,21 +487,25 @@ furtherReadingField :: Context String
|
|||
furtherReadingField = field "further-reading-refs" $ \item -> do
|
||||
fr <- itemBody <$> (loadSnapshot (itemIdentifier item) "further-reading-refs"
|
||||
:: Compiler (Item String))
|
||||
if null fr then fail "no further reading" else return fr
|
||||
if null fr then noResult "no further reading" else return fr
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Epistemic fields
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Render an integer 1–5 frontmatter key as filled/empty dot chars.
|
||||
-- Returns @noResult@ when the key is absent or unparseable.
|
||||
-- Returns @noResult@ when the key is absent, unparseable, or below 1
|
||||
-- (a zero would otherwise render five empty circles); values above 5
|
||||
-- clamp to 5.
|
||||
dotsField :: String -> String -> Context String
|
||||
dotsField ctxKey metaKey = field ctxKey $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
case lookupString metaKey meta >>= readMaybe of
|
||||
Nothing -> fail (ctxKey ++ ": not set")
|
||||
Just (n :: Int) ->
|
||||
let v = max 0 (min 5 n)
|
||||
Nothing -> noResult (ctxKey ++ ": not set")
|
||||
Just (n :: Int)
|
||||
| n < 1 -> noResult (ctxKey ++ ": value below the 1-5 scale")
|
||||
| otherwise ->
|
||||
let v = min 5 n
|
||||
in return (replicate v '\x25CF' ++ replicate (5 - v) '\x25CB')
|
||||
|
||||
-- | @$confidence-trend$@: ↑, ↓, or → derived from the last two entries
|
||||
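The clamping rule in the revised `dotsField` can be isolated as a pure function. This is a sketch: `dots` is a hypothetical helper, and `Nothing` stands in for the `noResult` branch that makes the template gate false.

```haskell
-- Pure core of the dot rendering: Nothing models 'noResult'.
dots :: Int -> Maybe String
dots n
  | n < 1     = Nothing  -- below the 1-5 scale: render nothing at all
  | otherwise = Just (replicate v '\x25CF' ++ replicate (5 - v) '\x25CB')
  where
    v = min 5 n          -- values above 5 clamp to five filled dots
```

For example, `dots 3` gives three filled circles followed by two empty ones, `dots 9` gives five filled circles, and `dots 0` gives `Nothing` where the earlier code would have rendered five empty circles.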
|
|
@ -513,11 +530,11 @@ confidenceTrendField = field "confidence-trend" $ \item -> do
|
|||
"[Marks] " ++ toFilePath (itemIdentifier item) ++
|
||||
": confidence: proved is incompatible with confidence-history; ignoring history"
|
||||
Nothing -> return ()
|
||||
fail "confidence is proved; trend suppressed"
|
||||
noResult "confidence is proved; trend suppressed"
|
||||
else case lookupStringList "confidence-history" meta of
|
||||
Nothing -> fail "no confidence history"
|
||||
Nothing -> noResult "no confidence history"
|
||||
Just xs -> case lastTwo xs of
|
||||
Nothing -> fail "no confidence history"
|
||||
Nothing -> noResult "no confidence history"
|
||||
Just (prevS, curS) ->
|
||||
let prev = readMaybe prevS :: Maybe Int
|
||||
cur = readMaybe curS :: Maybe Int
|
||||
|
|
@ -583,7 +600,7 @@ overallScoreField = field "overall-score" $ \item -> do
|
|||
+ fromIntegral (ev - 1) / 4.0 * 0.4
|
||||
score = max 0 (min 100 (round (raw * 100.0) :: Int))
|
||||
in return (show score)
|
||||
_ -> fail "overall-score: confidence or evidence not set"
|
||||
_ -> noResult "overall-score: confidence or evidence not set"
|
||||
|
||||
-- | @$confidence$@: numeric override that suppresses the @proved@ /
|
||||
-- @proven@ sentinel. When the frontmatter value is parseable as an
|
||||
|
|
@ -996,7 +1013,7 @@ compositionCtx =
|
|||
hasScoreField = field "has-score" $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
let pages = fromMaybe [] (lookupStringList "score-pages" meta)
|
||||
if null pages then fail "no score pages" else return "true"
|
||||
if null pages then noResult "no score pages" else return "true"
|
||||
|
||||
scorePageCountField = field "score-page-count" $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
|
|
@ -1014,7 +1031,7 @@ compositionCtx =
|
|||
|
||||
hasMovementsField = field "has-movements" $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
if null (parseMovements meta) then fail "no movements" else return "true"
|
||||
if null (parseMovements meta) then noResult "no movements" else return "true"
|
||||
|
||||
movementsListField = listFieldWith "movements" movCtx $ \item -> do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
|
|
@ -1032,9 +1049,9 @@ compositionCtx =
|
|||
<> field "movement-page" (return . show . movPage . itemBody)
|
||||
<> field "movement-duration" (return . movDuration . itemBody)
|
||||
<> field "movement-audio"
|
||||
(\i -> maybe (fail "no audio") return (movAudio (itemBody i)))
|
||||
(\i -> maybe (noResult "no audio") return (movAudio (itemBody i)))
|
||||
<> field "has-audio"
|
||||
(\i -> maybe (fail "no audio") (const (return "true"))
|
||||
(\i -> maybe (noResult "no audio") (const (return "true"))
|
||||
(movAudio (itemBody i)))
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
|
|
|||
|
|
@ -30,22 +30,45 @@ import Text.Pandoc.Walk (walk)
|
|||
import ArchiveIndex (ArchiveStatus (..), archiveIndexIsEmpty,
|
||||
archiveSlugFor, archiveStatusForSlug)
|
||||
|
||||
-- | Annotate body links. Headings are left alone — an affordance there
|
||||
-- would be noise. Identity when the index is empty.
|
||||
-- | Annotate body links. Links inside headings are left alone at
|
||||
-- /every/ nesting depth — an affordance there would be noise, and a
|
||||
-- top-level pattern match would miss a @Header@ inside a @Div@ or
|
||||
-- @BlockQuote@. Header links are tagged with a sentinel class before
|
||||
-- the annotation walk and stripped of it afterwards, so the sentinel
|
||||
-- can never leak into the writer. Identity when the index is empty.
|
||||
apply :: Pandoc -> Pandoc
|
||||
apply doc@(Pandoc meta blocks)
|
||||
apply doc
|
||||
| archiveIndexIsEmpty = doc
|
||||
| otherwise = Pandoc meta (map annotateBlock blocks)
|
||||
| otherwise =
|
||||
walk unprotectLink . walk annotateInlines . walk protectHeader $ doc
|
||||
|
||||
annotateBlock :: Block -> Block
|
||||
annotateBlock h@Header{} = h
|
||||
annotateBlock b = walk annotateInlines b
|
||||
-- | Sentinel class marking a link the annotation walk must skip. It
|
||||
-- only exists between the protect and unprotect walks inside 'apply'.
|
||||
skipClass :: T.Text
|
||||
skipClass = "archive-header-skip"
|
||||
|
||||
protectHeader :: Block -> Block
|
||||
protectHeader (Header lvl attr ils) = Header lvl attr (walk protect ils)
|
||||
where
|
||||
protect (Link (ident, cls, kvs) text target) =
|
||||
Link (ident, skipClass : cls, kvs) text target
|
||||
protect x = x
|
||||
protectHeader b = b
|
||||
|
||||
unprotectLink :: Inline -> Inline
|
||||
unprotectLink (Link (ident, cls, kvs) text target)
|
||||
| skipClass `elem` cls =
|
||||
Link (ident, filter (/= skipClass) cls, kvs) text target
|
||||
unprotectLink x = x
|
||||
|
||||
-- | For each archived @Link@: flip it if the target is 'Rotted', else
|
||||
-- append the affordance. Non-archived links pass through untouched.
|
||||
-- append the affordance. Non-archived links — and links protected by
|
||||
-- 'protectHeader' — pass through untouched.
|
||||
annotateInlines :: [Inline] -> [Inline]
|
||||
annotateInlines = concatMap expand
|
||||
where
|
||||
expand l@(Link (_, cls, _) _ _)
|
||||
| skipClass `elem` cls = [l]
|
||||
expand l@(Link attr text (url, _)) =
|
||||
case archiveSlugFor url of
|
||||
Nothing -> [l]
|
||||
|
|
|
|||
|
|
@ -12,15 +12,23 @@
|
|||
--
|
||||
-- The file path must be root-relative (begins with @/@).
|
||||
-- PDF.js is expected to be vendored at @/pdfjs/web/viewer.html@.
|
||||
--
|
||||
-- Code protection (honest scope): lines inside /fenced/ code blocks
|
||||
-- are passed through untouched ('Filters.Wikilinks.mapOutsideFences'),
|
||||
-- so fenced examples can show @{{pdf:…}}@ literally. Indented code
|
||||
-- blocks and inline code spans are NOT recognised — a full-line
|
||||
-- directive inside either is still rewritten.
|
||||
module Filters.EmbedPdf (preprocess) where
|
||||
|
||||
import Data.Char (isDigit)
|
||||
import Data.List (isPrefixOf, isSuffixOf)
|
||||
import Filters.Wikilinks (mapOutsideFences)
|
||||
import qualified Utils as U
|
||||
|
||||
-- | Apply PDF-embed substitution to the raw Markdown source string.
|
||||
-- | Apply PDF-embed substitution to the raw Markdown source string,
|
||||
-- skipping lines inside fenced code blocks.
|
||||
preprocess :: String -> String
|
||||
preprocess = unlines . map processLine . lines
|
||||
preprocess = mapOutsideFences processLine
|
||||
|
||||
processLine :: String -> String
|
||||
processLine line =
|
||||
|
|
|
|||
|
|
@ -231,7 +231,7 @@ renderPicture :: Attr -> [Inline] -> Target -> Bool -> Maybe (Int, Int) -> Text
|
|||
renderPicture (ident, classes, kvs) alt (src, title) lightbox dims =
|
||||
T.concat
|
||||
[ "<picture>"
|
||||
, "<source srcset=\"", T.pack webpSrc, "\" type=\"image/webp\">"
|
||||
, "<source srcset=\"", esc (T.pack webpSrc), "\" type=\"image/webp\">"
|
||||
, "<img"
|
||||
, attrId ident
|
||||
, attrClasses classes
|
||||
|
|
|
|||
|
|
@ -16,8 +16,11 @@ import Text.Pandoc.Definition
|
|||
import Text.Pandoc.Walk (walk)
|
||||
|
||||
-- | Apply link classification to the entire document.
|
||||
-- Two passes: PDF links first (rewrites href to viewer URL), then external
|
||||
-- link classification (operates on http/https, so no overlap).
|
||||
-- Two passes: PDF links first (rewrites href to the viewer URL and tags
|
||||
-- the anchor @pdf-link@), then general classification. The second pass
|
||||
-- explicitly skips anchors the PDF pass already claimed — the viewer URL
|
||||
-- is root-relative, so without that guard it would also be classified as
|
||||
-- an internal page link and get double chrome.
|
||||
apply :: Pandoc -> Pandoc
|
||||
apply = walk classifyLink . walk classifyPdfLink
|
||||
|
||||
|
|
@ -49,6 +52,11 @@ classifyLink l@(Link (_, classes, _) _ _)
|
|||
-- brand icon stamp, and have their own popup provider. Leave them
|
||||
-- entirely alone.
|
||||
| "source-ref" `elem` classes = l
|
||||
-- PDF links were already rewritten to the (root-relative) viewer URL
|
||||
-- and given their own chrome by 'classifyPdfLink' in the preceding
|
||||
-- pass; without this guard they would be double-classified as
|
||||
-- internal page links.
|
||||
| "pdf-link" `elem` classes = l
|
||||
classifyLink (Link (ident, classes, kvs) ils (url, title))
|
||||
| isExternal url =
|
||||
let icon = domainIcon url
|
||||
|
|
@ -100,8 +108,9 @@ isExternal url =
|
|||
where
|
||||
siteHost = "levineuwirth.org"
|
||||
|
||||
-- | Extract the lowercased hostname from an absolute http(s) URL.
|
||||
-- Returns 'Nothing' for non-http(s) URLs (relative paths, mailto:, etc.).
|
||||
-- | Extract the lowercased hostname from an absolute http(s) URL,
|
||||
-- stripping any userinfo (@user:pass\@@) and port. Returns 'Nothing'
|
||||
-- for non-http(s) URLs (relative paths, mailto:, etc.).
|
||||
extractHost :: Text -> Maybe Text
|
||||
extractHost url
|
||||
| Just rest <- T.stripPrefix "https://" url = Just (hostOf rest)
|
||||
|
|
@ -109,45 +118,60 @@ extractHost url
|
|||
| otherwise = Nothing
|
||||
where
|
||||
hostOf rest =
|
||||
let withPort = T.takeWhile (\c -> c /= '/' && c /= '?' && c /= '#') rest
|
||||
host = T.takeWhile (/= ':') withPort
|
||||
let authority = T.takeWhile (\c -> c /= '/' && c /= '?' && c /= '#') rest
|
||||
-- 'T.breakOnEnd' yields the segment after the last @\@@, or
|
||||
-- the whole authority when there is no userinfo.
|
||||
(_, hostPort) = T.breakOnEnd "@" authority
|
||||
host = T.takeWhile (/= ':') hostPort
|
||||
in T.toLower host
|
||||
|
||||
-- | Icon name for the link, matching a file in /images/link-icons/<name>.svg.
|
||||
--
|
||||
-- Matches on the URL's host only, never on the full URL — a path like
|
||||
-- @https://example.org/why-x.com-failed@ must not get the Twitter
|
||||
-- icon. URLs with no extractable host get the generic icon.
|
||||
domainIcon :: Text -> Text
|
||||
domainIcon url
|
||||
domainIcon url = maybe "external" iconForHost (extractHost url)
|
||||
|
||||
iconForHost :: Text -> Text
|
||||
iconForHost host
|
||||
-- Scholarly / reference
|
||||
| "wikipedia.org" `T.isInfixOf` url = "wikipedia"
|
||||
| "arxiv.org" `T.isInfixOf` url = "arxiv"
|
||||
| "doi.org" `T.isInfixOf` url = "doi"
|
||||
| "worldcat.org" `T.isInfixOf` url = "worldcat"
|
||||
| "orcid.org" `T.isInfixOf` url = "orcid"
|
||||
| "archive.org" `T.isInfixOf` url = "internet-archive"
|
||||
| m "wikipedia.org" = "wikipedia"
|
||||
| m "arxiv.org" = "arxiv"
|
||||
| m "doi.org" = "doi"
|
||||
| m "worldcat.org" = "worldcat"
|
||||
| m "orcid.org" = "orcid"
|
||||
| m "archive.org" = "internet-archive"
|
||||
-- Code / software
|
||||
| "github.com" `T.isInfixOf` url = "github"
|
||||
| "git.levineuwirth.org" `T.isInfixOf` url = "forgejo"
|
||||
| "tensorflow.org" `T.isInfixOf` url = "tensorflow"
|
||||
| m "github.com" = "github"
|
||||
| m "git.levineuwirth.org" = "forgejo"
|
||||
| m "tensorflow.org" = "tensorflow"
|
||||
-- AI companies (consumer products share a brand icon with the lab)
|
||||
| "anthropic.com" `T.isInfixOf` url = "anthropic"
|
||||
| "claude.ai" `T.isInfixOf` url = "anthropic"
|
||||
| "openai.com" `T.isInfixOf` url = "openai"
|
||||
| "chatgpt.com" `T.isInfixOf` url = "openai"
|
||||
| m "anthropic.com" = "anthropic"
|
||||
| m "claude.ai" = "anthropic"
|
||||
| m "openai.com" = "openai"
|
||||
| m "chatgpt.com" = "openai"
|
||||
-- Social / media
|
||||
| "twitter.com" `T.isInfixOf` url = "twitter"
|
||||
| "x.com" `T.isInfixOf` url = "twitter"
|
||||
| "reddit.com" `T.isInfixOf` url = "reddit"
|
||||
| "youtube.com" `T.isInfixOf` url = "youtube"
|
||||
| "youtu.be" `T.isInfixOf` url = "youtube"
|
||||
| "tiktok.com" `T.isInfixOf` url = "tiktok"
|
||||
| "substack.com" `T.isInfixOf` url = "substack"
|
||||
| "news.ycombinator.com" `T.isInfixOf` url = "hacker-news"
|
||||
| "lesswrong.com" `T.isInfixOf` url = "lesswrong"
|
||||
| m "twitter.com" = "twitter"
|
||||
| m "x.com" = "twitter"
|
||||
| m "reddit.com" = "reddit"
|
||||
| m "youtube.com" = "youtube"
|
||||
| m "youtu.be" = "youtube"
|
||||
| m "tiktok.com" = "tiktok"
|
||||
| m "substack.com" = "substack"
|
||||
| m "news.ycombinator.com" = "hacker-news"
|
||||
| m "lesswrong.com" = "lesswrong"
|
||||
-- News
|
||||
| "nytimes.com" `T.isInfixOf` url = "new-york-times"
|
||||
| m "nytimes.com" = "new-york-times"
|
||||
-- Institutions
|
||||
| "nasa.gov" `T.isInfixOf` url = "nasa"
|
||||
| "apple.com" `T.isInfixOf` url = "apple"
|
||||
| m "nasa.gov" = "nasa"
|
||||
| m "apple.com" = "apple"
|
||||
| otherwise = "external"
|
||||
where
|
||||
-- Label-suffix match: the host is the domain itself or a subdomain
|
||||
-- of it. Never fires on a lookalike label (@notx.com@) or on text
|
||||
-- in the path or query.
|
||||
m d = host == d || ("." <> d) `T.isSuffixOf` host
|
||||
|
||||
-- | Percent-encode characters that would break a @?file=@ query-string value.
|
||||
-- Slashes are intentionally left unencoded so root-relative paths remain
|
||||
|
|
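Note on the `Filters.Links` hunks above: the host-only icon matching can be exercised in isolation. The following is a minimal sketch, not the module itself. `hostOf`, `matchesDomain` and `isTwitterLike` are names chosen for this example, and the `http://` branch is an assumption because the hunk shows only the `https://` guard.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import           Control.Monad (unless)
import           Data.Text     (Text)
import qualified Data.Text     as T

-- Lowercased hostname of an absolute http(s) URL, with userinfo and
-- port stripped. Same shape as 'extractHost' in the diff; the http://
-- branch is assumed, it is not visible in the hunk.
hostOf :: Text -> Maybe Text
hostOf url
  | Just rest <- T.stripPrefix "https://" url = Just (go rest)
  | Just rest <- T.stripPrefix "http://"  url = Just (go rest)
  | otherwise                                 = Nothing
  where
    go rest =
      let authority     = T.takeWhile (`notElem` ("/?#" :: String)) rest
          (_, hostPort) = T.breakOnEnd "@" authority
      in T.toLower (T.takeWhile (/= ':') hostPort)

-- Label-suffix match: the host is the domain itself or a subdomain.
matchesDomain :: Text -> Text -> Bool
matchesDomain d host = host == d || ("." <> d) `T.isSuffixOf` host

-- Hypothetical helper, for this sketch only.
isTwitterLike :: Text -> Bool
isTwitterLike url = maybe False (matchesDomain "x.com") (hostOf url)

main :: IO ()
main = do
  let check label ok = unless ok (error ("failed: " ++ label))
  check "userinfo and port are stripped"
    (hostOf "https://user:pw@Mobile.X.com:8443/a" == Just "mobile.x.com")
  check "non-http URL has no host" (hostOf "mailto:a@b.c" == Nothing)
  check "exact domain"             (isTwitterLike "https://x.com/someone")
  check "lookalike label"          (not (isTwitterLike "https://notx.com/"))
  check "domain text in the path"
    (not (isTwitterLike "https://example.org/why-x.com-failed"))
  putStrLn "ok"
```

The last two checks are the cases the old `T.isInfixOf` match on the full URL got wrong.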
|
|||
|
|
@ -15,6 +15,7 @@
|
|||
module Filters.Score (inlineScores) where
|
||||
|
||||
import Control.Exception (IOException, try)
|
||||
import Data.Char (isHexDigit)
|
||||
import Data.Maybe (listToMaybe)
|
||||
import qualified Data.Text as T
|
||||
import qualified Data.Text.IO as TIO
|
||||
|
|
@ -86,25 +87,48 @@ findImagePath blocks = listToMaybe
|
|||
-- | Replace hardcoded black fill/stroke values with @currentColor@ so the
|
||||
-- SVG inherits the CSS @color@ property in both light and dark modes.
|
||||
--
|
||||
-- 6-digit hex patterns are at the bottom of the composition chain
|
||||
-- (applied first) so they are replaced before the 3-digit shorthand,
|
||||
-- preventing partial matches (e.g. @#000@ matching the prefix of @#000000@).
|
||||
-- Quoted attribute forms (@fill="#000"@) are self-delimiting — the
|
||||
-- closing quote bounds the match — so plain 'T.replace' is safe for
|
||||
-- them. Unquoted style-property forms (@fill:#000@) are not: naive
|
||||
-- replacement would also fire on the prefix of a longer hex colour
|
||||
-- (@fill:#000080@ → @fill:currentColor080@, invalid CSS). Those go
|
||||
-- through 'replaceHexColor', which rewrites a match only when it is
|
||||
-- not followed by another hex digit; the boundary check also makes
|
||||
-- the 3-digit/6-digit application order irrelevant.
|
||||
processColors :: T.Text -> T.Text
|
||||
processColors
|
||||
-- 3-digit hex and keyword patterns (applied after 6-digit replacements)
|
||||
-- 3-digit hex and keyword patterns
|
||||
= T.replace "fill=\"#000\"" "fill=\"currentColor\""
|
||||
. T.replace "fill=\"black\"" "fill=\"currentColor\""
|
||||
. T.replace "stroke=\"#000\"" "stroke=\"currentColor\""
|
||||
. T.replace "stroke=\"black\"" "stroke=\"currentColor\""
|
||||
. T.replace "fill:#000" "fill:currentColor"
|
||||
. replaceHexColor "fill:#000" "fill:currentColor"
|
||||
. T.replace "fill:black" "fill:currentColor"
|
||||
. T.replace "stroke:#000" "stroke:currentColor"
|
||||
. replaceHexColor "stroke:#000" "stroke:currentColor"
|
||||
. T.replace "stroke:black" "stroke:currentColor"
|
||||
-- 6-digit hex patterns (applied first — bottom of the chain)
|
||||
. T.replace "fill=\"#000000\"" "fill=\"currentColor\""
|
||||
. T.replace "stroke=\"#000000\"" "stroke=\"currentColor\""
|
||||
. T.replace "fill:#000000" "fill:currentColor"
|
||||
. T.replace "stroke:#000000" "stroke:currentColor"
|
||||
. replaceHexColor "fill:#000000" "fill:currentColor"
|
||||
. replaceHexColor "stroke:#000000" "stroke:currentColor"
|
||||
|
||||
-- | 'T.replace' restricted to hex-boundary-terminated matches: an
|
||||
-- occurrence of @needle@ is rewritten only when the character after
|
||||
-- it is not another hex digit, so @fill:#000@ never fires inside the
|
||||
-- longer colours @fill:#0008@, @fill:#000080@, or @fill:#00000080@.
|
||||
replaceHexColor :: T.Text -> T.Text -> T.Text -> T.Text
|
||||
replaceHexColor needle replacement = go
|
||||
where
|
||||
go t =
|
||||
let (pre, rest) = T.breakOn needle t
|
||||
in if T.null rest
|
||||
then pre
|
||||
else
|
||||
let after = T.drop (T.length needle) rest
|
||||
in case T.uncons after of
|
||||
Just (c, _) | isHexDigit c ->
|
||||
pre <> needle <> go after
|
||||
_ -> pre <> replacement <> go after
|
||||
|
||||
buildHtml :: Maybe T.Text -> Maybe T.Text -> T.Text -> T.Text
|
||||
buildHtml mName mCaption svgContent = T.concat
|
||||
|
|
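Note on the `Filters.Score` hunk above: the difference between plain `T.replace` and the boundary-checked variant is easiest to see on one concrete style string. This sketch copies `replaceHexColor` from the diff. The sample string and the `naive`/`guarded` names are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import           Control.Monad (unless)
import           Data.Char     (isHexDigit)
import qualified Data.Text     as T

-- As in the diff: rewrite an occurrence of the needle only when the
-- character after it is not another hex digit.
replaceHexColor :: T.Text -> T.Text -> T.Text -> T.Text
replaceHexColor needle replacement = go
  where
    go t =
      let (pre, rest) = T.breakOn needle t
      in if T.null rest
           then pre
           else
             let after = T.drop (T.length needle) rest
             in case T.uncons after of
                  Just (c, _) | isHexDigit c -> pre <> needle <> go after
                  _                          -> pre <> replacement <> go after

main :: IO ()
main = do
  let naive   = T.replace       "fill:#000" "fill:currentColor"
      guarded = replaceHexColor "fill:#000" "fill:currentColor"
      style   = "a{fill:#000080} b{fill:#000}"
  -- Plain replacement also fires on the prefix of the navy colour.
  unless (naive style == "a{fill:currentColor080} b{fill:currentColor}")
    (error "naive result differs")
  -- The guarded version leaves #000080 alone and rewrites only #000.
  unless (guarded style == "a{fill:#000080} b{fill:currentColor}")
    (error "guarded result differs")
  putStrLn "ok"
```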
|
|||
|
|
@ -4,12 +4,23 @@
|
|||
--
|
||||
-- Each footnote becomes:
|
||||
-- * A @<sup class="sidenote-ref">@ anchor in the body text.
|
||||
-- * An @<aside class="sidenote">@ immediately following it, containing
|
||||
-- * A @<span class="sidenote">@ immediately following it, containing
|
||||
-- the rendered note content.
|
||||
--
|
||||
-- On wide viewports, sidenotes.css floats asides into the right margin.
|
||||
-- On narrow viewports they are hidden; the standard Pandoc-generated
|
||||
-- @<section class="footnotes">@ at the document end serves as fallback.
|
||||
-- Additionally, every consumed note is re-emitted in a
|
||||
-- @<section class="footnotes">@ appended at the document end. The
|
||||
-- filter swallows Pandoc's own @Note@ inlines, so Pandoc's writer
|
||||
-- never produces that section itself — without this re-emission,
|
||||
-- narrow viewports with JavaScript disabled (where sidenotes.css
|
||||
-- hides @.sidenote@ and sidenotes.js's bottom sheet never runs)
|
||||
-- would lose footnote content entirely.
|
||||
--
|
||||
-- On wide viewports, sidenotes.css floats the spans into the right
|
||||
-- margin and hides @section.footnotes@; on narrow viewports the
|
||||
-- spans are hidden and the section is shown. The in-text anchor
|
||||
-- targets the footnotes item (the only target visible on narrow
|
||||
-- no-JS viewports); sidenotes.js intercepts clicks and pairs
|
||||
-- ref\/note by element id, so the href is purely the no-JS path.
|
||||
module Filters.Sidenotes (apply) where
|
||||
|
||||
import Control.Monad.State.Strict
|
||||
|
|
@ -18,21 +29,58 @@ import Data.Text (Text)
|
|||
import qualified Data.Text as T
|
||||
import Text.Pandoc.Class (runPure)
|
||||
import Text.Pandoc.Definition
|
||||
import Text.Pandoc.Options (WriterOptions)
|
||||
import Text.Pandoc.Options (WriterOptions (..),
|
||||
HTMLMathMethod (KaTeX))
|
||||
import Text.Pandoc.Walk (walkM)
|
||||
import Text.Pandoc.Writers.HTML (writeHtml5String)
|
||||
|
||||
-- | Transform all @Note@ inlines in the document to inline sidenote HTML.
|
||||
apply :: Pandoc -> Pandoc
|
||||
apply doc = evalState (walkM convertNote doc) (1 :: Int)
|
||||
-- | Accumulator: next label counter plus collected notes
|
||||
-- (newest-first; reversed before rendering the fallback section).
|
||||
type NoteState = (Int, [(Text, [Block])])
|
||||
|
||||
convertNote :: Inline -> State Int Inline
|
||||
-- | Transform all @Note@ inlines in the document to inline sidenote
|
||||
-- HTML, and append the collected notes as a @section.footnotes@
|
||||
-- fallback block.
|
||||
apply :: Pandoc -> Pandoc
|
||||
apply doc =
|
||||
let (Pandoc m blocks, (_, collected)) =
|
||||
runState (walkM convertNote doc) (1, [])
|
||||
notes = reverse collected
|
||||
in Pandoc m $
|
||||
if null notes
|
||||
then blocks
|
||||
else blocks ++ [footnotesSection notes]
|
||||
|
||||
convertNote :: Inline -> State NoteState Inline
|
||||
convertNote (Note blocks) = do
|
||||
n <- get
|
||||
put (n + 1)
|
||||
(n, acc) <- get
|
||||
put (n + 1, (toLabel n, blocks) : acc)
|
||||
return $ RawInline "html" (renderNote n blocks)
|
||||
convertNote x = return x
|
||||
|
||||
-- | The end-of-document fallback list. Letter labels are rendered
|
||||
-- explicitly (an @<ol>@'s automatic numbering would disagree with
|
||||
-- the in-text letters), so the list itself is unstyled.
|
||||
footnotesSection :: [(Text, [Block])] -> Block
|
||||
footnotesSection notes = RawBlock "html" $ T.concat $
|
||||
[ "<section class=\"footnotes\" role=\"doc-endnotes\">"
|
||||
, "<ol class=\"footnotes-list\">"
|
||||
]
|
||||
++ map item notes ++
|
||||
[ "</ol>"
|
||||
, "</section>"
|
||||
]
|
||||
where
|
||||
item (lbl, blocks) = T.concat
|
||||
[ "<li id=\"fn-", lbl, "\" class=\"footnote-item\">"
|
||||
, "<span class=\"footnote-label\" aria-hidden=\"true\">", lbl, "</span>"
|
||||
, blocksToHtml blocks
|
||||
, "<a href=\"#snref-", lbl
|
||||
, "\" class=\"footnote-back\" role=\"doc-backlink\""
|
||||
, " aria-label=\"Back to reference ", lbl, "\">\x21a9\xfe0e</a>"
|
||||
, "</li>"
|
||||
]
|
||||
|
||||
-- | Convert a 1-based counter to a letter label using base-26 expansion
|
||||
-- (Excel-column style): 1→a, 2→b, … 26→z, 27→aa, 28→ab, … 52→az,
|
||||
-- 53→ba, … 702→zz, 703→aaa. Guarantees a unique label per counter so
|
||||
|
|
@ -53,8 +101,14 @@ renderNote n blocks =
|
|||
let inner = blocksToInlineHtml blocks
|
||||
lbl = toLabel n
|
||||
in T.concat
|
||||
-- href targets the footnotes-section item: on narrow no-JS
|
||||
-- viewports that is the only visible rendering of the note
|
||||
-- (the adjacent .sidenote span is display:none there, and on
|
||||
-- wide viewports the note is already visible in the margin).
|
||||
-- sidenotes.js pairs ref/note by id and preventDefaults the
|
||||
-- click, so the href only ever navigates without JS.
|
||||
[ "<sup class=\"sidenote-ref\" id=\"snref-", lbl, "\">"
|
||||
, "<a href=\"#sn-", lbl, "\">", lbl, "</a>"
|
||||
, "<a href=\"#fn-", lbl, "\">", lbl, "</a>"
|
||||
, "</sup>"
|
||||
, "<span class=\"sidenote\" id=\"sn-", lbl, "\">"
|
||||
, "<sup class=\"sidenote-num\">", lbl, "</sup>\x00a0"
|
||||
|
|
@ -84,16 +138,25 @@ blocksToInlineHtml = T.concat . map renderOne
|
|||
renderOne b =
|
||||
blocksToHtml [b]
|
||||
|
||||
-- | Writer options for note bodies. Must agree with the math method in
|
||||
-- 'Compilers.writerOpts' (KaTeX), or math inside a footnote silently
|
||||
-- degrades to the writer default (PlainMath -> italics) and the
|
||||
-- client-side KaTeX pass never sees it. Defined locally because
|
||||
-- importing Compilers from here would create a module cycle
|
||||
-- (Compilers -> Filters -> Filters.Sidenotes).
|
||||
noteWriterOpts :: WriterOptions
|
||||
noteWriterOpts = def { writerHTMLMathMethod = KaTeX "" }
|
||||
|
||||
-- | Render a list of inlines to HTML (no surrounding @<p>@).
|
||||
inlinesToHtml :: [Inline] -> Text
|
||||
inlinesToHtml inlines =
|
||||
case runPure (writeHtml5String (def :: WriterOptions) (Pandoc mempty [Plain inlines])) of
|
||||
case runPure (writeHtml5String noteWriterOpts (Pandoc mempty [Plain inlines])) of
|
||||
Left _ -> T.empty
|
||||
Right t -> t
|
||||
|
||||
-- | Render a list of Pandoc blocks to an HTML fragment via a pure writer run.
|
||||
blocksToHtml :: [Block] -> Text
|
||||
blocksToHtml blocks =
|
||||
case runPure (writeHtml5String (def :: WriterOptions) (Pandoc mempty blocks)) of
|
||||
case runPure (writeHtml5String noteWriterOpts (Pandoc mempty blocks)) of
|
||||
Left _ -> T.empty
|
||||
Right t -> t
|
||||
|
|
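Note on the `Filters.Sidenotes` hunks above: the body of `toLabel` falls outside the diff context, so only its documented mapping (1→a, 26→z, 27→aa, 702→zz, 703→aaa) is visible. One way to obtain that mapping is bijective base-26. The sketch below is an assumption about the approach, not the module's code, and it returns `String` where the real function evidently produces `Text`.

```haskell
module Main (main) where

import Control.Monad (unless)
import Data.List     (nub)

-- Bijective base-26 (spreadsheet-column style) label for a 1-based
-- counter. Hypothetical reimplementation; see the note above.
toLabel :: Int -> String
toLabel n
  | n <= 0    = ""
  | otherwise = toLabel q ++ [toEnum (fromEnum 'a' + r)]
  where
    (q, r) = (n - 1) `divMod` 26

main :: IO ()
main = do
  -- The mapping stated in the module's doc comment.
  let expected =
        [ (1, "a"), (2, "b"), (26, "z"), (27, "aa"), (28, "ab")
        , (52, "az"), (53, "ba"), (702, "zz"), (703, "aaa") ]
  unless (all (\(n, l) -> toLabel n == l) expected) (error "label mismatch")
  -- Every counter gets its own label, so fn-/sn-/snref- ids cannot collide.
  let labels = map toLabel [1 .. 1000]
  unless (length (nub labels) == length labels) (error "duplicate label")
  putStrLn "ok"
```

Subtracting one before each `divMod` is what makes the scheme bijective: there is no zero digit, so `z` is followed by `aa` rather than `ba`.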
|
|||
|
|
@ -14,7 +14,8 @@
|
|||
-- extra filter logic is needed for that case.
|
||||
--
|
||||
-- The filter is /not/ applied inside headings (where Fira Sans uppercase
|
||||
-- text looks intentional) or inside @Code@/@RawInline@ inlines.
|
||||
-- text looks intentional, at any nesting depth — including headings
|
||||
-- inside divs and block quotes) or inside @Code@/@RawInline@ inlines.
|
||||
module Filters.Smallcaps (apply) where
|
||||
|
||||
import Data.Char (isUpper, isAlpha)
|
||||
|
|
@ -25,13 +26,31 @@ import Text.Pandoc.Walk (walk)
|
|||
import qualified Utils as U
|
||||
|
||||
-- | Apply smallcaps detection to paragraph-level content.
|
||||
-- Skips heading blocks to avoid false positives.
|
||||
-- Heading blocks are skipped at /every/ nesting level (a top-level
|
||||
-- pattern match would miss a @Header@ inside a @Div@ or
|
||||
-- @BlockQuote@): each header's @Str@ content is swapped for a
|
||||
-- sentinel 'RawInline' before the wrapping walk and restored
|
||||
-- afterwards, so 'wrapCaps' can never see it, wherever the header
|
||||
-- sits in the block tree.
|
||||
apply :: Pandoc -> Pandoc
|
||||
apply (Pandoc meta blocks) = Pandoc meta (map applyBlock blocks)
|
||||
apply = walk restoreStr . walk wrapCaps . walk protectHeader
|
||||
|
||||
applyBlock :: Block -> Block
|
||||
applyBlock b@(Header {}) = b -- leave headings untouched
|
||||
applyBlock b = walk wrapCaps b
|
||||
-- | Sentinel format marking a @Str@ that must not be wrapped. It only
|
||||
-- exists between the protect and restore walks inside 'apply' and
|
||||
-- can never leak into the writer.
|
||||
skipFmt :: Format
|
||||
skipFmt = Format "smallcaps-skip"
|
||||
|
||||
protectHeader :: Block -> Block
|
||||
protectHeader (Header lvl attr ils) = Header lvl attr (walk protectStr ils)
|
||||
where
|
||||
protectStr (Str t) = RawInline skipFmt t
|
||||
protectStr x = x
|
||||
protectHeader b = b
|
||||
|
||||
restoreStr :: Inline -> Inline
|
||||
restoreStr (RawInline fmt t) | fmt == skipFmt = Str t
|
||||
restoreStr x = x
|
||||
|
||||
-- | Wrap an all-caps Str token in an abbr element, preserving any trailing
|
||||
-- punctuation (comma, period, colon, semicolon, closing paren/bracket)
|
||||
|
|
|
|||
|
|
@ -19,12 +19,15 @@
|
|||
-- source-preview rule in 'Site.rules') and renders a
|
||||
-- syntax-highlighted snippet via Prism.
|
||||
--
|
||||
-- Conservative-by-design: the trigger only fires on paths under a
|
||||
-- short whitelist of top-level directories, or a small set of named
|
||||
-- root files. This keeps the parser cheap and avoids false positives
|
||||
-- on words that happen to contain a slash and a dot.
|
||||
-- Conservative-by-design: the trigger only fires on paths the
|
||||
-- @/source/@ serving rule actually publishes ('isServedPath', a
|
||||
-- mirror of @sourcePreviewable@ in 'Site.rules'), or a small set of
|
||||
-- named root files. This keeps the parser cheap, avoids false
|
||||
-- positives on words that happen to contain a slash and a dot, and
|
||||
-- guarantees every wrapped path has a fetchable @/source/…@ copy.
|
||||
module Filters.SourceRefs (apply, isSourcePath, forgejoSourceUrl) where
|
||||
|
||||
import Control.Monad (when)
|
||||
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
|
||||
import qualified Data.Map.Strict as Map
|
||||
import Data.Text (Text)
|
||||
|
|
@ -94,16 +97,17 @@ classifyExistingLink x = pure x
|
|||
-- Heuristic
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | True when the text looks like a repo-relative path under one of
|
||||
-- the whitelisted directories (or is a whitelisted root file), ends
|
||||
-- in a known source extension, and contains only safe path
|
||||
-- characters. Conservative by design — the goal is no false
|
||||
-- positives on prose that incidentally contains a slash and a dot.
|
||||
-- | True when the text looks like a repo-relative path that the
|
||||
-- @/source/@ serving rule actually publishes (or is a whitelisted
|
||||
-- root file), ends in a known source extension, and contains only
|
||||
-- safe path characters. Conservative by design — the goal is no
|
||||
-- false positives on prose that incidentally contains a slash and a
|
||||
-- dot, and no wrapped path whose popup fetch would 404.
|
||||
isSourcePath :: Text -> Bool
|
||||
isSourcePath t = and
|
||||
[ not (T.null t)
|
||||
, T.all safeChar t
|
||||
, (hasKnownPrefix t && hasKnownExt t) || isKnownRootFile t
|
||||
, (isServedPath t && hasKnownExt t) || isKnownRootFile t
|
||||
]
|
||||
where
|
||||
safeChar c =
|
||||
|
|
@ -112,11 +116,26 @@ isSourcePath t = and
|
|||
|| ('0' <= c && c <= '9')
|
||||
|| c == '/' || c == '.' || c == '_' || c == '-' || c == '+'
|
||||
|
||||
hasKnownPrefix :: Text -> Bool
|
||||
hasKnownPrefix t = any (`T.isPrefixOf` t)
|
||||
[ "build/", "static/", "templates/", "tools/"
|
||||
, "nginx/", "data/", "content/", "yaml-source/"
|
||||
-- | Mirror of the @sourcePreviewable@ whitelist in 'Site.rules' (the
|
||||
-- rule that copies files to @/source/<path>@) — the two must stay
|
||||
-- aligned so every link this filter emits has a corresponding
|
||||
-- @/source/…@ target for the popup to fetch. Directories Site.hs
|
||||
-- does not serve (e.g. @content/@) are deliberately absent here:
|
||||
-- wrapping them would emit popups that are guaranteed to 404.
|
||||
isServedPath :: Text -> Bool
|
||||
isServedPath t = or
|
||||
[ "build/" `T.isPrefixOf` t && hasExt ".hs"
|
||||
, "static/js/" `T.isPrefixOf` t
|
||||
, "static/css/" `T.isPrefixOf` t
|
||||
, "templates/" `T.isPrefixOf` t
|
||||
, "tools/" `T.isPrefixOf` t && (hasExt ".sh" || hasExt ".py")
|
||||
, "nginx/" `T.isPrefixOf` t && hasExt ".conf"
|
||||
, "data/" `T.isPrefixOf` t
|
||||
&& not ("/" `T.isInfixOf` T.drop 5 t) -- top-level data files only
|
||||
&& (hasExt ".json" || hasExt ".yaml" || hasExt ".md" || hasExt ".bib")
|
||||
]
|
||||
where
|
||||
hasExt e = e `T.isSuffixOf` T.toLower t
|
||||
|
||||
hasKnownExt :: Text -> Bool
|
||||
hasKnownExt t =
|
||||
|
|
@ -125,7 +144,7 @@ hasKnownExt t =
|
|||
[ ".hs", ".js", ".mjs", ".css", ".html"
|
||||
, ".py", ".cabal", ".md", ".yaml", ".yml"
|
||||
, ".toml", ".sh", ".bash", ".svg", ".conf"
|
||||
, ".json", ".ini", ".tex"
|
||||
, ".json", ".ini", ".tex", ".bib"
|
||||
]
|
||||
|
||||
isKnownRootFile :: Text -> Bool
|
||||
|
|
@ -142,14 +161,19 @@ isKnownRootFile t = t `elem`
|
|||
-- File existence cache
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Process-wide memo of @doesFileExist@ results, keyed by the same
|
||||
-- path the popup will fetch. Hakyll runs this filter once per
|
||||
-- compiled page and the same source-file references recur across
|
||||
-- | Process-wide memo of /positive/ @doesFileExist@ results, keyed by
|
||||
-- the same path the popup will fetch. Hakyll runs this filter once
|
||||
-- per compiled page and the same source-file references recur across
|
||||
-- many pages (e.g. @build\/Filters\/Links.hs@ in the Links page,
|
||||
-- the Colophon, several essays); the cache turns N stats into one
|
||||
-- per distinct path. The build process's working directory is the
|
||||
-- project root, so the path can be passed straight to
|
||||
-- 'doesFileExist' without prefixing.
|
||||
-- per distinct path. Only existence is memoized: a missing file is
|
||||
-- re-stat'ed on every miss, so a source file created during a
|
||||
-- long-lived @make watch@ session is picked up on the next rebuild
|
||||
-- instead of staying "absent" for the process lifetime. (A file
|
||||
-- /deleted/ mid-watch stays cached as present until restart — the
|
||||
-- benign direction: the popup fetch 404s and simply never appears.)
|
||||
-- The build process's working directory is the project root, so the
|
||||
-- path can be passed straight to 'doesFileExist' without prefixing.
|
||||
{-# NOINLINE existsCacheRef #-}
|
||||
existsCacheRef :: IORef (Map.Map Text Bool)
|
||||
existsCacheRef = unsafePerformIO (newIORef Map.empty)
|
||||
|
|
@ -161,6 +185,7 @@ existsCached path = do
|
|||
Just b -> pure b
|
||||
Nothing -> do
|
||||
b <- doesFileExist (T.unpack path)
|
||||
when b $
|
||||
atomicModifyIORef' existsCacheRef (\m -> (Map.insert path b m, ()))
|
||||
pure b
|
||||
|
||||
|
|
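Note on the `existsCached` change above: caching only positive answers can be checked without touching the filesystem by passing the probe in as an argument. This is a sketch under that assumption. `cachedPositive` and the fake probe are hypothetical, and the real module uses a process-wide `IORef` with `doesFileExist` instead.

```haskell
module Main (main) where

import           Control.Monad   (unless, when)
import           Data.IORef      (IORef, atomicModifyIORef', newIORef,
                                  readIORef, writeIORef)
import qualified Data.Map.Strict as Map

-- Memoise only the positive results of a probe; a negative answer is
-- recomputed on every call.
cachedPositive :: Ord k => IORef (Map.Map k Bool) -> (k -> IO Bool) -> k -> IO Bool
cachedPositive ref probe key = do
  cache <- readIORef ref
  case Map.lookup key cache of
    Just b  -> pure b
    Nothing -> do
      b <- probe key
      when b $ atomicModifyIORef' ref (\m -> (Map.insert key b m, ()))
      pure b

main :: IO ()
main = do
  cache  <- newIORef Map.empty
  probes <- newIORef (0 :: Int)   -- how many times the probe actually ran
  exists <- newIORef False        -- stand-in for the filesystem state
  let probe _ = do
        atomicModifyIORef' probes (\n -> (n + 1, ()))
        readIORef exists
      ask = cachedPositive cache probe ("build/Site.hs" :: String)
  r1 <- ask                       -- missing: probed, result not cached
  writeIORef exists True          -- the file "appears" mid-session
  r2 <- ask                       -- probed again, now cached as present
  r3 <- ask                       -- answered from the cache
  n  <- readIORef probes
  unless ((r1, r2, r3, n) == (False, True, True, 2)) (error "unexpected")
  putStrLn "ok"
```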
|
|||
|
|
@ -5,7 +5,13 @@
|
|||
-- HTML placeholders that transclude.js resolves at runtime.
|
||||
--
|
||||
-- A directive must be the sole content of a line (after trimming) to be
|
||||
-- replaced — this prevents accidental substitution inside prose or code.
|
||||
-- replaced — this prevents accidental substitution inside prose.
|
||||
--
|
||||
-- Code protection (honest scope): lines inside /fenced/ code blocks
|
||||
-- are passed through untouched ('Filters.Wikilinks.mapOutsideFences'),
|
||||
-- so fenced examples can show @{{slug}}@ literally. Indented code
|
||||
-- blocks and inline code spans are NOT recognised — a full-line
|
||||
-- directive inside either is still rewritten.
|
||||
--
|
||||
-- Examples:
|
||||
-- {{my-essay}} → full-page transclusion of /my-essay.html
|
||||
|
|
@ -14,11 +20,13 @@
|
|||
module Filters.Transclusion (preprocess) where
|
||||
|
||||
import Data.List (isSuffixOf, isPrefixOf, stripPrefix)
|
||||
import Filters.Wikilinks (mapOutsideFences)
|
||||
import qualified Utils as U
|
||||
|
||||
-- | Apply transclusion substitution to the raw Markdown source string.
|
||||
-- | Apply transclusion substitution to the raw Markdown source string,
|
||||
-- skipping lines inside fenced code blocks.
|
||||
preprocess :: String -> String
|
||||
preprocess = unlines . map processLine . lines
|
||||
preprocess = mapOutsideFences processLine
|
||||
|
||||
processLine :: String -> String
|
||||
processLine line =
|
||||
|
|
|
|||
|
|
@ -37,6 +37,7 @@
|
|||
module Filters.Viz (inlineViz) where
|
||||
|
||||
import Control.Exception (IOException, catch)
|
||||
import Data.Char (isHexDigit)
|
||||
import Data.Maybe (fromMaybe)
|
||||
import qualified Data.Text as T
|
||||
import System.Directory (doesFileExist)
|
||||
|
|
@ -117,20 +118,47 @@ runScript baseDir attrs =
|
|||
|
||||
-- | Replace hardcoded black fill/stroke values with @currentColor@ so the
|
||||
-- embedded SVG inherits the CSS text colour in both light and dark modes.
|
||||
--
|
||||
-- Quoted attribute forms (@fill="#000"@) are self-delimiting — the
|
||||
-- closing quote bounds the match — so plain 'T.replace' is safe for
|
||||
-- them. Unquoted style-property forms (@fill:#000@) are not: naive
|
||||
-- replacement would also fire on the prefix of a longer hex colour
|
||||
-- (@fill:#000080@ → @fill:currentColor080@, invalid CSS). Those go
|
||||
-- through 'replaceHexColor', which rewrites a match only when it is
|
||||
-- not followed by another hex digit.
|
||||
processColors :: T.Text -> T.Text
|
||||
processColors
|
||||
= T.replace "fill=\"#000\"" "fill=\"currentColor\""
|
||||
. T.replace "fill=\"black\"" "fill=\"currentColor\""
|
||||
. T.replace "stroke=\"#000\"" "stroke=\"currentColor\""
|
||||
. T.replace "stroke=\"black\"" "stroke=\"currentColor\""
|
||||
. T.replace "fill:#000" "fill:currentColor"
|
||||
. replaceHexColor "fill:#000" "fill:currentColor"
|
||||
. T.replace "fill:black" "fill:currentColor"
|
||||
. T.replace "stroke:#000" "stroke:currentColor"
|
||||
. replaceHexColor "stroke:#000" "stroke:currentColor"
|
||||
. T.replace "stroke:black" "stroke:currentColor"
|
||||
. T.replace "fill=\"#000000\"" "fill=\"currentColor\""
|
||||
. T.replace "stroke=\"#000000\"" "stroke=\"currentColor\""
|
||||
. T.replace "fill:#000000" "fill:currentColor"
|
||||
. T.replace "stroke:#000000" "stroke:currentColor"
|
||||
. replaceHexColor "fill:#000000" "fill:currentColor"
|
||||
. replaceHexColor "stroke:#000000" "stroke:currentColor"
|
||||
|
||||
-- | 'T.replace' restricted to hex-boundary-terminated matches: an
|
||||
-- occurrence of @needle@ is rewritten only when the character after
|
||||
-- it is not another hex digit, so @fill:#000@ never fires inside the
|
||||
-- longer colours @fill:#0008@, @fill:#000080@, or @fill:#00000080@.
|
||||
-- (Mirrors 'Filters.Score.replaceHexColor'.)
|
||||
replaceHexColor :: T.Text -> T.Text -> T.Text -> T.Text
|
||||
replaceHexColor needle replacement = go
|
||||
where
|
||||
go t =
|
||||
let (pre, rest) = T.breakOn needle t
|
||||
in if T.null rest
|
||||
then pre
|
||||
else
|
||||
let after = T.drop (T.length needle) rest
|
||||
in case T.uncons after of
|
||||
Just (c, _) | isHexDigit c ->
|
||||
pre <> needle <> go after
|
||||
_ -> pre <> replacement <> go after
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- JSON safety for <script> embedding
|
||||
|
|
|
|||
|
|
@ -12,23 +12,129 @@
|
|||
-- replaced with hyphens, non-alphanumeric characters stripped, and
|
||||
-- a @.html@ suffix appended so the link resolves identically under
|
||||
-- the dev server, file:// previews, and nginx in production.
|
||||
module Filters.Wikilinks (preprocess) where
|
||||
--
|
||||
-- Code protection (honest scope): lines inside /fenced/ code blocks
|
||||
-- are passed through untouched (see 'mapOutsideFences'), and within a
|
||||
-- line, inline code spans (backtick runs, CommonMark equal-length
|
||||
-- matching) are skipped — so both fenced and @`inline`@ examples can
|
||||
-- show @[[…]]@ literally. Indented code blocks and code spans that
|
||||
-- cross a line break are NOT recognised; a wikilink inside those is
|
||||
-- still rewritten.
|
||||
module Filters.Wikilinks (preprocess, mapOutsideFences) where
|
||||
|
||||
import Data.Char (isAlphaNum, toLower, isSpace)
|
||||
import Data.List (intercalate)
|
||||
import qualified Utils as U
|
||||
|
||||
-- | Scan the raw Markdown source for @[[…]]@ wikilinks and replace them
|
||||
-- with standard Markdown link syntax.
|
||||
-- with standard Markdown link syntax. Processing is line-by-line and
|
||||
-- skips fenced code blocks; a wikilink therefore cannot span a line
|
||||
-- break (which was never a sensible authoring form).
|
||||
preprocess :: String -> String
|
||||
preprocess [] = []
|
||||
preprocess ('[':'[':rest) =
|
||||
preprocess = mapOutsideFences replaceWikilinks
|
||||
|
||||
replaceWikilinks :: String -> String
|
||||
replaceWikilinks = go
|
||||
where
|
||||
go [] = []
|
||||
-- Inline code span: a backtick run opens a span closed by a run of
|
||||
-- exactly the same length (CommonMark). Its body passes through
|
||||
-- verbatim so documentation can quote @`[[…]]`@ literally. An
|
||||
-- unclosed run is literal text — and then a following @[[…]]@ is
|
||||
-- genuinely a wikilink, matching how Pandoc will read the line.
|
||||
go s@('`':_) =
|
||||
let (run, afterRun) = span (== '`') s
|
||||
in case codeSpan (length run) afterRun of
|
||||
Just (body, after) -> run ++ body ++ run ++ go after
|
||||
Nothing -> run ++ go afterRun
|
||||
go ('[':'[':rest) =
|
||||
case break (== ']') rest of
|
||||
(inner, ']':']':after)
|
||||
| not (null inner) ->
|
||||
toMarkdownLink inner ++ preprocess after
|
||||
_ -> '[' : '[' : preprocess rest
|
||||
preprocess (c:rest) = c : preprocess rest
|
||||
toMarkdownLink inner ++ go after
|
||||
_ -> '[' : '[' : go rest
|
||||
go (c:rest) = c : go rest
|
||||
|
||||
-- @codeSpan n s@: the span body and the remainder after a closing
|
||||
-- run of exactly @n@ backticks; 'Nothing' when no closer exists on
|
||||
-- this line.
|
||||
codeSpan :: Int -> String -> Maybe (String, String)
|
||||
codeSpan n = loop
|
||||
where
|
||||
loop [] = Nothing
|
||||
loop s@('`':_) =
|
||||
let (run, rest) = span (== '`') s
|
||||
in if length run == n
|
||||
then Just ("", rest)
|
||||
else prepend run <$> loop rest
|
||||
loop (c:cs) = prepend [c] <$> loop cs
|
||||
prepend pre (body, after) = (pre ++ body, after)
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Fence-aware line mapping (shared by all source-level preprocessors)
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Apply a line transformation to every line that is not part of a
|
||||
-- fenced code block. Shared by the three source-level preprocessors
|
||||
-- (wikilinks here, 'Filters.Transclusion', 'Filters.EmbedPdf') so
|
||||
-- their directive syntax can be quoted literally inside fenced code.
|
||||
--
|
||||
-- Fence tracking follows CommonMark: an opener is at most three
|
||||
-- spaces of indentation followed by a run of at least three backticks
|
||||
-- or tildes (longer runs allowed); for backtick fences the info
|
||||
-- string may not contain a backtick. The closer uses the same fence
|
||||
-- character, a run at least as long as the opener, and nothing but
|
||||
-- whitespace after it. An unclosed fence extends to the end of the
|
||||
-- document. Fence delimiter lines themselves pass through untouched.
|
||||
--
|
||||
-- Honest scope: only /fenced/ code blocks are protected. Indented
|
||||
-- code blocks and inline code spans are not recognised here — a
|
||||
-- directive inside either is still rewritten.
|
||||
mapOutsideFences :: (String -> String) -> String -> String
|
||||
mapOutsideFences f = unlines . go Nothing . lines
|
||||
where
|
||||
go _ [] = []
|
||||
go Nothing (l:ls) =
|
||||
case openingFence l of
|
||||
Just fence -> l : go (Just fence) ls
|
||||
Nothing -> f l : go Nothing ls
|
||||
go st@(Just fence) (l:ls)
|
||||
| closesFence fence l = l : go Nothing ls
|
||||
| otherwise = l : go st ls
|
||||
|
||||
-- | The fence character and run length of a CommonMark fence opener,
|
||||
-- or 'Nothing' when the line does not open a fence.
|
||||
openingFence :: String -> Maybe (Char, Int)
|
||||
openingFence l = do
|
||||
rest <- stripFenceIndent l
|
||||
case rest of
|
||||
(c:_) | c == '`' || c == '~' ->
|
||||
let run = takeWhile (== c) rest
|
||||
n = length run
|
||||
info = drop n rest
|
||||
in if n >= 3 && (c == '~' || '`' `notElem` info)
|
||||
then Just (c, n)
|
||||
else Nothing
|
||||
_ -> Nothing
|
||||
|
||||
-- | True when the line closes the fence opened by @(c, n)@: the same
|
||||
-- fence character, a run at least as long as the opener, and only
|
||||
-- whitespace after it.
|
||||
closesFence :: (Char, Int) -> String -> Bool
|
||||
closesFence (c, n) l =
|
||||
case stripFenceIndent l of
|
||||
Nothing -> False
|
||||
Just rest ->
|
||||
let run = takeWhile (== c) rest
|
||||
in length run >= n && all isSpace (drop (length run) rest)
|
||||
|
||||
-- | Strip up to three leading spaces (the indentation CommonMark allows
|
||||
-- on a fence line); 'Nothing' for four or more, which would be an
|
||||
-- indented code block rather than a fence.
|
||||
stripFenceIndent :: String -> Maybe String
|
||||
stripFenceIndent l =
|
||||
let (indent, rest) = span (== ' ') l
|
||||
in if length indent <= 3 then Just rest else Nothing
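The fence-tracking rules described above can be exercised in isolation. The sketch below restates the four functions from this hunk as a standalone module and adds a small `main`; the sample document and the upper-casing transformation are illustrative assumptions, not part of the repository.

```haskell
import Data.Char (isSpace, toUpper)

-- Apply f to every line outside a fenced code block; fence delimiter
-- lines and fenced content pass through untouched.
mapOutsideFences :: (String -> String) -> String -> String
mapOutsideFences f = unlines . go Nothing . lines
  where
    go _ [] = []
    go Nothing (l:ls) =
      case openingFence l of
        Just fence -> l   : go (Just fence) ls
        Nothing    -> f l : go Nothing ls
    go st@(Just fence) (l:ls)
      | closesFence fence l = l : go Nothing ls
      | otherwise           = l : go st ls

-- Fence character and run length of an opener, if the line is one.
openingFence :: String -> Maybe (Char, Int)
openingFence l = do
  rest <- stripFenceIndent l
  case rest of
    (c:_) | c == '`' || c == '~' ->
      let n    = length (takeWhile (== c) rest)
          info = drop n rest
      in if n >= 3 && (c == '~' || '`' `notElem` info)
           then Just (c, n)
           else Nothing
    _ -> Nothing

-- Same character, run at least as long as the opener, then only whitespace.
closesFence :: (Char, Int) -> String -> Bool
closesFence (c, n) l =
  case stripFenceIndent l of
    Nothing   -> False
    Just rest ->
      let run = takeWhile (== c) rest
      in length run >= n && all isSpace (drop (length run) rest)

-- Up to three leading spaces are allowed on a fence line.
stripFenceIndent :: String -> Maybe String
stripFenceIndent l =
  let (indent, rest) = span (== ' ') l
  in if length indent <= 3 then Just rest else Nothing

main :: IO ()
main = do
  -- Illustrative input: a four-tilde fence that a three-tilde line must not close.
  let doc = unlines ["see [[a]]", "~~~~", "see [[b]]", "~~~", "~~~~", "see [[c]]"]
  putStr (mapOutsideFences (map toUpper) doc)
```

In this run only the first and last lines are upper-cased: the `~~~` line is shorter than the opener, so it is treated as fenced content, not as a closer.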
|
||||
|
||||
-- | Convert the inner content of @[[…]]@ to a Markdown link.
|
||||
--
|
||||
|
|
|
|||
|
|
@ -230,7 +230,7 @@ data EpistemicData = EpistemicData
|
|||
, epPeerStatus :: Maybe String -- ^ Validated peer-status slug ('Nothing' when absent / unreviewed / invalid).
|
||||
, epResultShape :: Maybe String -- ^ Validated result-shape value.
|
||||
, epStability :: String -- ^ Always one of the five stability labels.
|
||||
, epTrust :: Int -- ^ Trust score 0–100 (60/40 weighted; @proved@ substitutes 100 for confidence).
|
||||
, epTrust :: Maybe Int -- ^ Trust score 0–100 (60/40 weighted; @proved@ substitutes 100 for confidence). 'Nothing' when confidence or evidence is missing — no label is rendered.
|
||||
}
|
||||
|
||||
-- | Read the figure inputs from a Hakyll item's metadata + git history.
|
||||
|
|
@ -267,15 +267,16 @@ readEpistemicData item = do
|
|||
trimS = trim'
|
||||
|
||||
-- | Trust score: the same 60/40 weighted composite of confidence and
|
||||
-- evidence used by 'Contexts.overallScoreField'. Returns 0 when either
|
||||
-- input is missing — which is fine for the figure (the polygon and
|
||||
-- trust label simply collapse to the bare frame).
|
||||
computeTrust :: Maybe Int -> Maybe Int -> Int
|
||||
-- evidence used by 'Contexts.overallScoreField'. Returns 'Nothing'
|
||||
-- when either input is missing — the figure then renders no trust
|
||||
-- label at all (it collapses to the bare frame), rather than a
|
||||
-- literal "0" indistinguishable from an authored zero score.
|
||||
computeTrust :: Maybe Int -> Maybe Int -> Maybe Int
|
||||
computeTrust (Just c) (Just e) =
|
||||
let raw :: Double
|
||||
raw = fromIntegral c / 100.0 * 0.6 + fromIntegral (e - 1) / 4.0 * 0.4
|
||||
in max 0 (min 100 (round (raw * 100.0)))
|
||||
computeTrust _ _ = 0
|
||||
in Just (max 0 (min 100 (round (raw * 100.0))))
|
||||
computeTrust _ _ = Nothing
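As a worked example of the new return type, the sketch below copies the revised `computeTrust` from this hunk into a standalone module. The input values are made up for illustration, and the 1 to 5 evidence scale is inferred from the `(e - 1) / 4.0` term; it is not stated in the diff.

```haskell
-- Revised computeTrust, as in the hunk above: Nothing when either input is missing.
computeTrust :: Maybe Int -> Maybe Int -> Maybe Int
computeTrust (Just c) (Just e) =
  let raw :: Double
      raw = fromIntegral c / 100.0 * 0.6 + fromIntegral (e - 1) / 4.0 * 0.4
  in Just (max 0 (min 100 (round (raw * 100.0))))
computeTrust _ _ = Nothing

main :: IO ()
main = do
  -- confidence 80, evidence 3: 0.6 * 0.80 + 0.4 * 0.50 = 0.68
  print (computeTrust (Just 80) (Just 3))
  -- evidence missing: no score at all, so callers can skip the label
  print (computeTrust (Just 80) Nothing)
  -- an authored zero stays distinguishable from "missing"
  print (computeTrust (Just 0) (Just 1))
```

The last two calls show the distinction the change is after: a missing input yields `Nothing`, while an authored minimum yields `Just 0`.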
|
||||
|
||||
-- | Same predicate as 'Contexts.isProvedConfidence' — local copy to keep
|
||||
-- the module's dependency graph light (Marks → Stability only). The
|
||||
|
|
@ -390,15 +391,16 @@ renderEpistemicFigure d = T.concat
|
|||
[ "<svg xmlns=\"http://www.w3.org/2000/svg\""
|
||||
, " viewBox=\"0 0 200 200\""
|
||||
, " role=\"img\""
|
||||
, " aria-label=\"Epistemic figure: trust ", T.pack (show (epTrust d))
|
||||
, ", stability ", T.pack (epStability d), "\">"
|
||||
, " aria-label=\"Epistemic figure: "
|
||||
, maybe "" (\t -> "trust " <> T.pack (show t) <> ", ") (epTrust d)
|
||||
, "stability ", T.pack (epStability d), "\">"
|
||||
, renderRoundel
|
||||
, renderGuides
|
||||
, renderAxes
|
||||
, renderPolygon d
|
||||
, renderVertexMarks d
|
||||
, renderTicks (epStability d) (epPeerStatus d)
|
||||
, renderTrustLabel (epTrust d)
|
||||
, maybe "" renderTrustLabel (epTrust d)
|
||||
, renderResultShape (epResultShape d) (epTrust d)
|
||||
, "</svg>"
|
||||
]
|
||||
|
|
@ -578,10 +580,11 @@ renderTrustLabel score = T.concat
|
|||
, " opacity=\"0.7\">TRUST</text>"
|
||||
]
|
||||
|
||||
-- | Result-shape glyph immediately to the right of the trust score.
|
||||
renderResultShape :: Maybe String -> Int -> T.Text
|
||||
-- | Result-shape glyph immediately to the right of the trust score —
|
||||
-- or centred in its place when no trust score is rendered.
|
||||
renderResultShape :: Maybe String -> Maybe Int -> T.Text
|
||||
renderResultShape Nothing _ = ""
|
||||
renderResultShape (Just shape) score =
|
||||
renderResultShape (Just shape) mScore =
|
||||
let glyph = case shape of
|
||||
"positive" -> "+"
|
||||
"negative" -> "\x2212" -- minus sign (not hyphen-minus)
|
||||
|
|
@ -589,15 +592,20 @@ renderResultShape (Just shape) score =
|
|||
"comparative" -> "\x223C" -- ∼
|
||||
"descriptive" -> "\x25A1" -- □
|
||||
_ -> ""
|
||||
-- Offset proportional to the trust number's width (digits ≈ 8 px each).
|
||||
digitCount = length (show score)
|
||||
-- Offset proportional to the trust number's width (digits ≈ 8 px
|
||||
-- each); with no trust label the glyph takes the centre itself.
|
||||
(x, anchor) = case mScore of
|
||||
Just score ->
|
||||
let digitCount = length (show score)
|
||||
offset = fromIntegral digitCount * 4.5 + 3 :: Double
|
||||
in (fxCenter + offset, "start")
|
||||
Nothing -> (fxCenter, "middle")
|
||||
in if T.null (T.pack glyph)
|
||||
then ""
|
||||
else T.concat
|
||||
[ "<text x=\"", ff (fxCenter + offset)
|
||||
[ "<text x=\"", ff x
|
||||
, "\" y=\"", ff (fyCenter + 4)
|
||||
, "\" text-anchor=\"start\""
|
||||
, "\" text-anchor=\"", anchor, "\""
|
||||
, " fill=\"currentColor\" stroke=\"none\""
|
||||
, " font-family=\"Spectral, serif\" font-size=\"16\">"
|
||||
, T.pack glyph
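The placement logic in this hunk reduces to a small pure function. In the sketch below, `glyphPlacement` is a hypothetical name, and the centre coordinate is passed in as a parameter because the value of `fxCenter` is not shown in the diff; 100 is used in `main` only as a plausible guess for a 200-unit viewBox.

```haskell
-- Hypothetical extraction of the (x, anchor) choice from renderResultShape.
-- cx stands in for fxCenter, whose actual value is not visible in this diff.
glyphPlacement :: Double -> Maybe Int -> (Double, String)
glyphPlacement cx (Just score) =
  let digitCount = length (show score)
      offset     = fromIntegral digitCount * 4.5 + 3
  in (cx + offset, "start")       -- sits to the right of the trust number
glyphPlacement cx Nothing = (cx, "middle")  -- no trust label: take the centre

main :: IO ()
main = do
  print (glyphPlacement 100 (Just 68))  -- two digits: offset 12
  print (glyphPlacement 100 (Just 100)) -- three digits: offset 16.5
  print (glyphPlacement 100 Nothing)
```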
|
||||
|
|
|
|||
|
|
@ -12,6 +12,7 @@ module Pagination
|
|||
) where
|
||||
|
||||
import Hakyll
|
||||
import Patterns (blogPattern)
|
||||
|
||||
|
||||
-- | Items per page across most paginated lists (e.g. the blog).
|
||||
|
|
@ -39,7 +40,7 @@ blogPageId n = fromFilePath $ "blog/page/" ++ show n ++ "/index.html"
|
|||
-- @baseCtx@: site-level context (siteCtx).
|
||||
blogPaginateRules :: Context String -> Context String -> Rules ()
|
||||
blogPaginateRules itemCtx baseCtx = do
|
||||
paginate <- buildPaginateWith sortAndGroup ("content/blog/*.md" .&&. hasNoVersion) blogPageId
|
||||
paginate <- buildPaginateWith sortAndGroup (blogPattern .&&. hasNoVersion) blogPageId
|
||||
paginateRules paginate $ \pageNum pat -> do
|
||||
route idRoute
|
||||
compile $ do
|
||||
|
|
|
|||
|
|
@ -122,7 +122,14 @@ allWritings :: Pattern
|
|||
allWritings = essayPattern .||. blogPattern .||. poetryPattern .||. fictionPattern
|
||||
|
||||
-- | Every content file the backlinks pass should index. Includes music
|
||||
-- landing pages and top-level standalone pages, in addition to writings.
|
||||
-- landing pages and top-level standalone pages, in addition to writings,
|
||||
-- plus the two directory-form standalone essays (@content/me/index.md@
|
||||
-- and @content/memento-mori/index.md@) — full essays rendered with
|
||||
-- backlinks, whose outgoing links must be visible to the link graph.
|
||||
--
|
||||
-- Photography is deliberately excluded: photo pages do not render the
|
||||
-- backlinks block (see 'Contexts.photographyCtx'), and caption-scale
|
||||
-- entries would add link-graph noise with no consuming surface.
|
||||
allContent :: Pattern
|
||||
allContent =
|
||||
essayPattern
|
||||
|
|
@ -131,6 +138,8 @@ allContent =
|
|||
.||. fictionPattern
|
||||
.||. musicPattern
|
||||
.||. standalonePagesPattern
|
||||
.||. "content/me/index.md"
|
||||
.||. "content/memento-mori/index.md"
|
||||
|
||||
-- | Content shown on author index pages — essays + blog posts.
|
||||
-- (Poetry and fiction have their own dedicated indexes and are not
|
||||
|
|
|
|||
|
|
@ -27,7 +27,7 @@ import Data.Maybe (mapMaybe, fromMaybe, catMaybes)
|
|||
import qualified Data.Set as Set
|
||||
import Data.Set (Set)
|
||||
import Data.Ord (Down (..), comparing)
|
||||
import System.FilePath (takeDirectory, takeFileName, replaceExtension)
|
||||
import System.FilePath (takeBaseName, takeDirectory, takeFileName, replaceExtension)
|
||||
import qualified Data.Aeson as Aeson
|
||||
import Data.Aeson (Value (..), (.=))
|
||||
import qualified Data.Aeson.KeyMap as KM
|
||||
|
|
@ -305,10 +305,11 @@ stripIndexHtml r
|
|||
-- * @exact@: 4 decimal places (~10 m)
|
||||
-- * @km@ : 2 decimal places (~1 km)
|
||||
-- * @city@ : 1 decimal place (~10 km) — default
|
||||
-- * other : treated as @city@
|
||||
-- * other : treated as @city@ (defensive only — 'buildPin' validates
|
||||
-- the precision and fails closed before consulting this function)
|
||||
--
|
||||
-- @hidden@ is handled at the call site by skipping the pin entirely;
|
||||
-- this function is not consulted in that case.
|
||||
-- @hidden@ and unrecognised values are handled at the call site by
|
||||
-- skipping the pin entirely; this function is not consulted then.
|
||||
roundCoord :: String -> Double -> Double
|
||||
roundCoord prec x =
|
||||
let n = case prec of
|
||||
|
|
@ -336,7 +337,10 @@ parseGeo meta = case KM.lookup "geo" meta of
|
|||
-- | Build a single pin object from a photo entry. Returns 'Nothing'
|
||||
-- when:
|
||||
-- * the entry has no @geo:@ frontmatter, or
|
||||
-- * it has @geo-precision: hidden@, or
|
||||
-- * @geo-precision:@ is anything other than @exact@/@km@/@city@ —
|
||||
-- @hidden@ and unrecognised values (typos, wrong case) alike.
|
||||
-- Failing closed means a typo'd \"hidden\" can never publish
|
||||
-- coordinates the author meant to suppress.
|
||||
-- * the entry has no resolvable route (shouldn't happen for
|
||||
-- photographyPattern items, but be defensive).
|
||||
buildPin :: Item String -> Compiler (Maybe Value)
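The fail-closed rule listed above can be sketched as a pure function over the `geo-precision` value alone. `effectivePrecision` is a hypothetical helper, not a function in the repository; it mirrors the guard and the `"city"` default that the next hunk adds to `buildPin`.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical helper: the precision to round with, or Nothing when the
-- pin must be suppressed. An absent value defaults to "city"; anything
-- other than exact/km/city (including "hidden" and typos) suppresses it.
effectivePrecision :: Maybe String -> Maybe String
effectivePrecision prec
  | maybe True (`elem` ["exact", "km", "city"]) prec = Just (fromMaybe "city" prec)
  | otherwise                                        = Nothing

main :: IO ()
main = mapM_ (print . effectivePrecision)
  [ Nothing          -- no geo-precision key: default to city
  , Just "km"
  , Just "hidden"    -- explicit suppression
  , Just "Hidden"    -- wrong case: suppressed, not published at city level
  , Just "hiden"     -- typo: suppressed as well
  ]
```

The design point is that the allow-list is checked, not a deny-list: an unrecognised value can only ever hide a pin, never publish one.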
|
||||
|
|
@ -345,13 +349,21 @@ buildPin item = do
|
|||
meta <- getMetadata ident
|
||||
mRoute <- getRoute ident
|
||||
case (parseGeo meta, lookupString "geo-precision" meta, mRoute) of
|
||||
(_, Just "hidden", _) -> return Nothing
|
||||
(Just (lat, lon), prec, Just r) ->
|
||||
(Just (lat, lon), prec, Just r)
|
||||
| maybe True (`elem` ["exact", "km", "city"]) prec ->
|
||||
let prec' = fromMaybe "city" prec
|
||||
rLat = roundCoord prec' lat
|
||||
rLon = roundCoord prec' lon
|
||||
fp = toFilePath ident
|
||||
slug = takeFileName (takeDirectory fp)
|
||||
-- Directory entries (<slug>/index.md) and series children
|
||||
-- (<series>/<photo>.md) both key assets off the parent
|
||||
-- directory; a flat single (content/photography/foo.md)
|
||||
-- has no entry directory, so its slug is its basename and
|
||||
-- its co-located assets route to /photography/ directly.
|
||||
isFlat = takeDirectory fp == "content/photography"
|
||||
&& takeFileName fp /= "index.md"
|
||||
slug = if isFlat then takeBaseName fp
|
||||
else takeFileName (takeDirectory fp)
|
||||
title = fromMaybe slug (lookupString "title" meta)
|
||||
photo = lookupString "photo" meta
|
||||
-- Trim trailing "index.html" so the click-through URL
|
||||
|
|
@ -359,7 +371,8 @@ buildPin item = do
|
|||
url = "/" ++ stripIndexHtml r
|
||||
thumb = case photo of
|
||||
Just p | not (null p) ->
|
||||
"/photography/" ++ slug ++ "/" ++ p
|
||||
if isFlat then "/photography/" ++ p
|
||||
else "/photography/" ++ slug ++ "/" ++ p
|
||||
_ -> ""
|
||||
captured = lookupString "captured" meta
|
||||
in return $ Just $ Aeson.object $
|
||||
|
|
@ -443,13 +456,20 @@ photographyFeedDescription = field "description" $ \item -> do
|
|||
body <- itemBody <$> (loadSnapshot ident "content" :: Compiler (Item String))
|
||||
meta <- getMetadata ident
|
||||
let fp = toFilePath ident
|
||||
isDir = takeFileName fp == "index.md"
|
||||
-- Same asset-path derivation as 'buildPin': directory entries
|
||||
-- (<slug>/index.md) and series children (<series>/<photo>.md)
|
||||
-- both key assets off the parent directory; a flat single
|
||||
-- (content/photography/foo.md) has no entry directory, so its
|
||||
-- co-located assets route to /photography/ directly.
|
||||
isFlat = takeDirectory fp == "content/photography"
|
||||
&& takeFileName fp /= "index.md"
|
||||
slug = takeFileName (takeDirectory fp)
|
||||
photo = lookupString "photo" meta
|
||||
imgTag = case (isDir, photo) of
|
||||
(True, Just p) | not (null p) ->
|
||||
"<p><img src=\"https://levineuwirth.org/photography/"
|
||||
++ slug ++ "/" ++ p ++ "\" alt=\"\"></p>\n"
|
||||
imgTag = case lookupString "photo" meta of
|
||||
Just p | not (null p) ->
|
||||
let src = if isFlat then "/photography/" ++ p
|
||||
else "/photography/" ++ slug ++ "/" ++ p
|
||||
in "<p><img src=\"https://levineuwirth.org"
|
||||
++ src ++ "\" alt=\"\"></p>\n"
|
||||
_ -> ""
|
||||
return (imgTag ++ body)
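Both `buildPin` and `photographyFeedDescription` now derive asset paths the same way. The sketch below isolates that derivation; `photoAssetPath` is a hypothetical name, and the sample file names are invented for illustration.

```haskell
import System.FilePath (takeDirectory, takeFileName)

-- Hypothetical helper: site-relative path of a photo asset, given the
-- entry's source path and the value of its photo: field.
photoAssetPath :: FilePath -> String -> String
photoAssetPath fp photo
  | isFlat    = "/photography/" ++ photo
  | otherwise = "/photography/" ++ slug ++ "/" ++ photo
  where
    -- A flat single lives directly in content/photography and is not an index.
    isFlat = takeDirectory fp == "content/photography"
          && takeFileName fp /= "index.md"
    -- Directory entries and series children key off the parent directory.
    slug   = takeFileName (takeDirectory fp)

main :: IO ()
main = do
  putStrLn (photoAssetPath "content/photography/foo.md" "foo.jpg")
  putStrLn (photoAssetPath "content/photography/trip/index.md" "cover.jpg")
  putStrLn (photoAssetPath "content/photography/series/shot-1.md" "shot-1.jpg")
```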
|
||||
|
||||
|
|
|
|||
|
|
@ -49,7 +49,8 @@ instance Aeson.FromJSON SimilarEntry where
|
|||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Maximum entries rendered in the "Related" block. The on-disk JSON may
|
||||
-- contain more (embed.py's TOP_N); the template caps the display.
|
||||
-- contain more (embed.py's TOP_N); 'similarLinksField' caps the list
|
||||
-- (@take maxSimilar@) before rendering.
|
||||
maxSimilar :: Int
|
||||
maxSimilar = 3
|
||||
|
||||
|
|
@ -101,10 +102,10 @@ normaliseUrl url =
|
|||
|
||||
-- | Percent-decode @%XX@ escapes (UTF-8) so percent-encoded paths
|
||||
-- collide with their decoded form on map lookup. Mirrors
|
||||
-- 'Backlinks.percentDecode'; the two implementations are intentionally
|
||||
-- duplicated because they apply different normalisations *before*
|
||||
-- decoding (Backlinks strips @.html@ unconditionally; SimilarLinks
|
||||
-- preserves the trailing-slash form for index pages).
|
||||
-- 'Backlinks.percentDecode' (and 'Backlinks.normaliseUrl' now applies
|
||||
-- the same strip-@index.html@-then-@.html@ normalisation as this
|
||||
-- module); the duplication keeps the two modules dependency-free of
|
||||
-- each other.
|
||||
percentDecode :: String -> String
|
||||
percentDecode = T.unpack . TE.decodeUtf8With TE.lenientDecode . BS.pack . go
|
||||
where
|
||||
|
|
@ -121,6 +122,25 @@ percentDecode = T.unpack . TE.decodeUtf8With TE.lenientDecode . BS.pack . go
|
|||
| c >= 'A' && c <= 'F' = Just (fromEnum c - fromEnum 'A' + 10)
|
||||
| otherwise = Nothing
|
||||
|
||||
-- | Percent-encode a string for use as a URI query value: RFC 3986
|
||||
-- unreserved characters pass through; everything else — including @&@,
|
||||
-- @?@, @#@, spaces, and non-ASCII text via its UTF-8 bytes — becomes
|
||||
-- @%XX@. Hand-rolled (the moral equivalent of network-uri's
|
||||
-- @escapeURIString isUnreserved@) because network-uri is not otherwise
|
||||
-- a dependency. The output is also HTML-attribute-safe: it contains
|
||||
-- only unreserved characters and @%XX@ escapes.
|
||||
percentEncode :: String -> String
|
||||
percentEncode = concatMap enc . BS.unpack . TE.encodeUtf8 . T.pack
|
||||
where
|
||||
enc b
|
||||
| unreserved b = [toEnum (fromIntegral b)]
|
||||
| otherwise = ['%', hexDigit (b `div` 16), hexDigit (b `mod` 16)]
|
||||
unreserved b =
|
||||
let c = toEnum (fromIntegral b) :: Char
|
||||
in (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
|
||||
|| (c >= '0' && c <= '9') || c `elem` ("-._~" :: String)
|
||||
hexDigit n = "0123456789ABCDEF" !! fromIntegral n
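A standalone version of `percentEncode` makes its behaviour easy to check. The sketch below assumes `BS` is `Data.ByteString` and `TE` is `Data.Text.Encoding`, as the qualified names suggest; `viewerUrl` is a simplified illustration of using the result as a `file=` query value, and it leaves the fragment verbatim where the real code may escape it.

```haskell
import qualified Data.ByteString    as BS
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE

-- Percent-encode everything except RFC 3986 unreserved characters,
-- working on the UTF-8 bytes of the input.
percentEncode :: String -> String
percentEncode = concatMap enc . BS.unpack . TE.encodeUtf8 . T.pack
  where
    enc b
      | unreserved b = [toEnum (fromIntegral b)]
      | otherwise    = ['%', hexDigit (b `div` 16), hexDigit (b `mod` 16)]
    unreserved b =
      let c = toEnum (fromIntegral b) :: Char
      in (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
         || (c >= '0' && c <= '9') || c `elem` ("-._~" :: String)
    hexDigit n = "0123456789ABCDEF" !! fromIntegral n

-- Illustrative only: encode the path, keep any #fragment on the viewer URL.
viewerUrl :: String -> String
viewerUrl raw =
  let (path, frag) = break (== '#') raw
  in "/pdfjs/web/viewer.html?file=" ++ percentEncode path ++ frag

main :: IO ()
main = do
  putStrLn (percentEncode "a b&c")
  putStrLn (percentEncode "é")
  putStrLn (viewerUrl "/papers/a b.pdf#page=3")
```

Note that `/` is not an unreserved character, so path separators are encoded too; that is expected for a value carried inside a query string.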
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- HTML rendering
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
|
@ -153,8 +173,14 @@ renderSimilarLinks entries =
|
|||
++ "</a></li>\n"
|
||||
|
||||
renderPdf se =
|
||||
-- The PDF path becomes the @file=@ query value, so it must be
|
||||
-- percent-encoded (HTML escaping alone leaves @&@/@?@/@#@/spaces
|
||||
-- free to break the query). A @#page=N@ fragment stays a fragment
|
||||
-- of the viewer URL itself — PDF.js reads it from location.hash.
|
||||
let raw = seUrl se
|
||||
viewerUrl = "/pdfjs/web/viewer.html?file=" ++ escapeHtml raw
|
||||
(path, frag) = break (== '#') raw
|
||||
viewerUrl = "/pdfjs/web/viewer.html?file="
|
||||
++ percentEncode path ++ escapeHtml frag
|
||||
in "<li class=\"similar-links-item\">"
|
||||
++ "<a class=\"similar-link pdf-link\""
|
||||
++ " href=\"" ++ viewerUrl ++ "\""
|
||||
|
|
|
|||
210
build/Site.hs
|
|
@ -31,7 +31,7 @@ import Commonplace (commonplaceCtx)
|
|||
import Now (nowCtx)
|
||||
import Contexts (siteCtx, essayCtx, postCtx, pageCtx, poetryCtx, fictionCtx, compositionCtx,
|
||||
contentKindField, recentFirstByDisplay,
|
||||
tagLinksFieldExcludingTopSegment)
|
||||
tagLinksFieldExcludingTopSegment, isProvedConfidence)
|
||||
import qualified Patterns as P
|
||||
import Photography (photographyRules)
|
||||
import Tags (buildAllTags, applyTagRules, sidecarIdentifier,
|
||||
|
|
@ -40,7 +40,7 @@ import Pagination (blogPaginateRules)
|
|||
import Stats (statsRules)
|
||||
|
||||
-- | Home-page portal grid order. Canonical ordering authority for every
|
||||
-- rendering of the eight portals (currently: the home page; future
|
||||
-- rendering of the portals (currently: the home page; future
|
||||
-- consumers follow this list). Each entry is (display name, tag name);
|
||||
-- the tag name is the key to everything else — URL (@/\<tag\>/@),
|
||||
-- sidecar path (@content\/tag-meta\/\<tag\>.md@), and the Tags.hs
|
||||
|
|
@ -73,13 +73,17 @@ libraryShelfMax = 5
|
|||
libraryIntroId :: Identifier
|
||||
libraryIntroId = fromFilePath "content/library.md"
|
||||
|
||||
-- Poems inside collection subdirectories, excluding their index pages.
|
||||
collectionPoems :: Pattern
|
||||
collectionPoems = "content/poetry/*/*.md" .&&. complement "content/poetry/*/index.md"
|
||||
|
||||
-- All poetry content (flat + collection), excluding collection index pages.
|
||||
allPoetry :: Pattern
|
||||
allPoetry = "content/poetry/*.md" .||. collectionPoems
|
||||
-- | Route that strips a literal prefix from the identifier's path.
|
||||
-- Hakyll's 'gsubRoute' replaces /every/ occurrence of its pattern, so
|
||||
-- @gsubRoute "content/"@ would also mangle a co-located directory that
|
||||
-- happened to be named @content@ deeper in the path
|
||||
-- (@content/essays/slug/content/data.csv@ → @essays/slug/data.csv@).
|
||||
-- This touches only the leading occurrence; identifiers that don't
|
||||
-- start with the prefix pass through unchanged.
|
||||
stripPrefixRoute :: String -> Routes
|
||||
stripPrefixRoute prefix = customRoute $ \ident ->
|
||||
let fp = toFilePath ident
|
||||
in fromMaybe fp (stripPrefix prefix fp)
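The difference between the two routing strategies can be shown on plain strings, without Hakyll. In the sketch below, `removeAll` is a hypothetical stand-in for the replace-every-occurrence behaviour attributed to `gsubRoute` in the comment above, and `stripLeading` is the string-level core of `stripPrefixRoute`.

```haskell
import Data.List  (isPrefixOf, stripPrefix)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in: delete every occurrence of pat, wherever it appears.
removeAll :: String -> String -> String
removeAll pat
  | null pat  = id
  | otherwise = go
  where
    go [] = []
    go s@(c:cs)
      | pat `isPrefixOf` s = go (drop (length pat) s)
      | otherwise          = c : go cs

-- Core of stripPrefixRoute: drop the prefix only at the start; paths
-- that do not begin with it pass through unchanged.
stripLeading :: String -> FilePath -> FilePath
stripLeading prefix fp = fromMaybe fp (stripPrefix prefix fp)

main :: IO ()
main = do
  let fp = "content/essays/slug/content/data.csv"
  putStrLn (removeAll    "content/" fp)  -- inner directory is lost
  putStrLn (stripLeading "content/" fp)  -- inner directory survives
  putStrLn (stripLeading "content/" "static/css/site.css")
```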
|
||||
|
||||
feedConfig :: FeedConfiguration
|
||||
feedConfig = FeedConfiguration
|
||||
|
|
@ -168,18 +172,18 @@ rules = do
|
|||
-- Per-page JS files — authored alongside content in content/**/*.js.
|
||||
-- Draft JS is handled by a separate dev-only rule below.
|
||||
match ("content/**/*.js" .&&. complement "content/drafts/**") $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- Per-page JS co-located with draft essays (dev-only).
|
||||
when isDev $ match "content/drafts/**/*.js" $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- CSS — must be matched before the broad static/** rule to avoid
|
||||
-- double-matching (compressCssCompiler vs. copyFileCompiler).
|
||||
match "static/css/*" $ do
|
||||
route $ gsubRoute "static/" (const "")
|
||||
route $ stripPrefixRoute "static/"
|
||||
compile compressCssCompiler
|
||||
|
||||
-- All other static files (fonts, JS, images, …). Build-time
|
||||
|
|
@ -192,7 +196,7 @@ rules = do
|
|||
.&&. complement "static/**/*.exif.yaml"
|
||||
.&&. complement "static/**/*.palette.yaml"
|
||||
) $ do
|
||||
route $ gsubRoute "static/" (const "")
|
||||
route $ stripPrefixRoute "static/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- Templates
|
||||
|
|
@ -299,7 +303,7 @@ rules = do
|
|||
|
||||
-- SVG score fragments co-located with me/index.md.
|
||||
match "content/me/scores/*.svg" $ do
|
||||
route $ gsubRoute "content/me/" (const "")
|
||||
route $ stripPrefixRoute "content/me/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- memento-mori/index.md — lives in its own directory so co-located SVG
|
||||
|
|
@ -315,7 +319,7 @@ rules = do
|
|||
|
||||
-- SVG score fragments co-located with memento-mori/index.md.
|
||||
match "content/memento-mori/scores/*.svg" $ do
|
||||
route $ gsubRoute "content/memento-mori/" (const "")
|
||||
route $ stripPrefixRoute "content/memento-mori/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
|
@ -354,7 +358,7 @@ rules = do
|
|||
.&&. complement "content/colophon.md"
|
||||
.&&. complement "content/current.md"
|
||||
.&&. complement "content/library.md") $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ pageCompiler
|
||||
>>= loadAndApplyTemplate "templates/page.html" pageCtx
|
||||
|
|
@ -414,7 +418,7 @@ rules = do
|
|||
.&&. complement "content/essays/*.md"
|
||||
.&&. complement "content/essays/*/index.md"
|
||||
.&&. complement "content/essays/**/*.dims.yaml") $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- Static assets co-located with draft essays (dev-only).
|
||||
|
|
@ -422,14 +426,14 @@ rules = do
|
|||
.&&. complement "content/drafts/essays/*.md"
|
||||
.&&. complement "content/drafts/essays/*/index.md"
|
||||
.&&. complement "content/drafts/essays/**/*.dims.yaml") $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Blog posts
|
||||
-- ---------------------------------------------------------------------------
|
||||
match "content/blog/*.md" $ do
|
||||
route $ gsubRoute "content/blog/" (const "blog/")
|
||||
route $ stripPrefixRoute "content/"
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ postCompiler
|
||||
>>= saveSnapshot "content"
|
||||
|
|
@ -440,19 +444,12 @@ rules = do
|
|||
-- ---------------------------------------------------------------------------
|
||||
-- Poetry
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Flat poems (e.g. content/poetry/sonnet-60.md)
|
||||
match "content/poetry/*.md" $ do
|
||||
route $ gsubRoute "content/poetry/" (const "poetry/")
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ poetryCompiler
|
||||
>>= saveSnapshot "content"
|
||||
>>= loadAndApplyTemplate "templates/reading.html" poetryCtx
|
||||
>>= loadAndApplyTemplate "templates/default.html" poetryCtx
|
||||
>>= relativizeUrls
|
||||
|
||||
-- Collection poems (e.g. content/poetry/shakespeare-sonnets/sonnet-1.md)
|
||||
match collectionPoems $ do
|
||||
route $ gsubRoute "content/poetry/" (const "poetry/")
|
||||
-- All poems — flat (content/poetry/sonnet-60.md) and collection
|
||||
-- (content/poetry/shakespeare-sonnets/sonnet-1.md) forms share one
|
||||
-- rule; collection index pages are excluded by 'P.poetryPattern'
|
||||
-- itself and matched separately below.
|
||||
match P.poetryPattern $ do
|
||||
route $ stripPrefixRoute "content/"
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ poetryCompiler
|
||||
>>= saveSnapshot "content"
|
||||
|
|
@ -462,7 +459,7 @@ rules = do
|
|||
|
||||
-- Collection index pages (e.g. content/poetry/shakespeare-sonnets/index.md)
|
||||
match "content/poetry/*/index.md" $ do
|
||||
route $ gsubRoute "content/poetry/" (const "poetry/")
|
||||
route $ stripPrefixRoute "content/"
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ pageCompiler
|
||||
>>= loadAndApplyTemplate "templates/default.html" pageCtx
|
||||
|
|
@ -472,7 +469,7 @@ rules = do
|
|||
-- Fiction
|
||||
-- ---------------------------------------------------------------------------
|
||||
match "content/fiction/*.md" $ do
|
||||
route $ gsubRoute "content/fiction/" (const "fiction/")
|
||||
route $ stripPrefixRoute "content/"
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ fictionCompiler
|
||||
>>= saveSnapshot "content"
|
||||
|
|
@ -496,20 +493,20 @@ rules = do
|
|||
|
||||
-- Static assets (SVG score pages, audio, PDF) served unchanged.
|
||||
match "content/music/**/*.svg" $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
match "content/music/**/*.mp3" $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
match "content/music/**/*.pdf" $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
compile copyFileCompiler
|
||||
|
||||
-- Landing page — full essay pipeline.
|
||||
match "content/music/*/index.md" $ do
|
||||
route $ gsubRoute "content/" (const "")
|
||||
route $ stripPrefixRoute "content/"
|
||||
`composeRoutes` setExtension "html"
|
||||
compile $ compositionCompiler
|
||||
>>= saveSnapshot "content"
|
||||
|
|
@ -566,6 +563,46 @@ rules = do
|
|||
>>= loadAndApplyTemplate "templates/default.html" ctx
|
||||
>>= relativizeUrls
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Poetry index
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Nav, the home portal grid, and the library all link /poetry/; this
|
||||
-- rule is what keeps those links from 404ing. Lists flat poems and
|
||||
-- collection poems alike; collection index pages are excluded by
|
||||
-- 'P.poetryPattern' itself.
|
||||
create ["poetry/index.html"] $ do
|
||||
route idRoute
|
||||
compile $ do
|
||||
poems <- recentFirst =<< loadAll (P.poetryPattern .&&. hasNoVersion)
|
||||
let ctx =
|
||||
listField "essays" poetryCtx (return poems)
|
||||
<> constField "title" "Poetry"
|
||||
<> constField "portal" "true"
|
||||
<> siteCtx
|
||||
makeItem ""
|
||||
>>= loadAndApplyTemplate "templates/essay-index.html" ctx
|
||||
>>= loadAndApplyTemplate "templates/default.html" ctx
|
||||
>>= relativizeUrls
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Fiction index
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Same rationale as the poetry index. content/fiction/ has no entries
|
||||
-- yet; an empty match list renders an empty index rather than a 404.
|
||||
create ["fiction/index.html"] $ do
|
||||
route idRoute
|
||||
compile $ do
|
||||
stories <- recentFirst =<< loadAll (P.fictionPattern .&&. hasNoVersion)
|
||||
let ctx =
|
||||
listField "essays" fictionCtx (return stories)
|
||||
<> constField "title" "Fiction"
|
||||
<> constField "portal" "true"
|
||||
<> siteCtx
|
||||
makeItem ""
|
||||
>>= loadAndApplyTemplate "templates/essay-index.html" ctx
|
||||
>>= loadAndApplyTemplate "templates/default.html" ctx
|
||||
>>= relativizeUrls
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- New page — all content sorted by creation date, newest first
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
|
@ -573,10 +610,10 @@ rules = do
|
|||
route idRoute
|
||||
compile $ do
|
||||
let allContent = ( allEssays
|
||||
.||. "content/blog/*.md"
|
||||
.||. "content/fiction/*.md"
|
||||
.||. allPoetry
|
||||
.||. "content/music/*/index.md"
|
||||
.||. P.blogPattern
|
||||
.||. P.fictionPattern
|
||||
.||. P.poetryPattern
|
||||
.||. P.musicPattern
|
||||
) .&&. hasNoVersion
|
||||
items <- recentFirstByDisplay =<< loadAll allContent
|
||||
let itemCtx = contentKindField
|
||||
|
|
@ -601,7 +638,7 @@ rules = do
|
|||
-- Library — portal-grouped view over the /new.html dataset, deduplicated
|
||||
-- by primary portal. An item's primary portal is the top segment of the
|
||||
-- first tag in its frontmatter 'tags:' list whose top segment matches a
|
||||
-- known portal (the eight in 'homePortals'). Items with no such tag are
|
||||
-- known portal (those in 'homePortals'). Items with no such tag are
|
||||
-- silently dropped from the library (they remain on /new.html and on any
|
||||
-- tag pages their frontmatter produces).
|
||||
--
|
||||
|
|
@ -629,9 +666,11 @@ rules = do
|
|||
|
||||
-- Top segment of the first tag that names a known portal.
|
||||
-- Nothing when no tag matches — item is excluded from library.
|
||||
-- Reads tags via 'getTags' (not lookupStringList) so the
|
||||
-- scalar comma form ("tags: research, ai") is accepted with
|
||||
-- the same semantics the tag pages use.
|
||||
primaryPortalOf item = do
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
let ts = fromMaybe [] (lookupStringList "tags" meta)
|
||||
ts <- getTags (itemIdentifier item)
|
||||
return $ listToMaybe
|
||||
[ p | t <- ts
|
||||
, let p = takeWhile (/= '/') t
|
||||
|
|
@ -654,12 +693,12 @@ rules = do
|
|||
|
||||
-- Load every content item once, then partition by primary portal
|
||||
-- so each shelf draws from a pre-filtered list rather than
|
||||
-- re-scanning the whole corpus nine times.
|
||||
-- re-scanning the whole corpus once per portal.
|
||||
essays <- loadAll (allEssays .&&. hasNoVersion)
|
||||
posts <- loadAll ("content/blog/*.md" .&&. hasNoVersion)
|
||||
fiction <- loadAll ("content/fiction/*.md" .&&. hasNoVersion)
|
||||
poetry <- loadAll (allPoetry .&&. hasNoVersion)
|
||||
music <- loadAll ("content/music/*/index.md" .&&. hasNoVersion)
|
||||
posts <- loadAll (P.blogPattern .&&. hasNoVersion)
|
||||
fiction <- loadAll (P.fictionPattern .&&. hasNoVersion)
|
||||
poetry <- loadAll (P.poetryPattern .&&. hasNoVersion)
|
||||
music <- loadAll (P.musicPattern .&&. hasNoVersion)
|
||||
photos <- loadAll (P.photographyPattern .&&. hasNoVersion)
|
||||
let allContent = essays ++ posts ++ fiction ++ poetry ++ music ++ photos
|
||||
:: [Item String]
|
||||
|
|
@ -668,21 +707,30 @@ rules = do
|
|||
itemsByPortal =
|
||||
Map.fromListWith (++) [(p, [i]) | (Just p, i) <- tagged]
|
||||
|
||||
-- Eager snapshot load registers the library-intro dependency
|
||||
-- unconditionally, so a first-populate of content/library.md
|
||||
-- re-renders the library page even when the gate was previously
|
||||
-- false (see 'sidecarContext' in Tags.hs for the same pattern).
|
||||
-- Existence-guarded, like the sidecar contexts in Tags.hs:
|
||||
-- deleting content/library.md degrades to a library page with
|
||||
-- no intro block rather than failing the whole compile. When
|
||||
-- the file exists, the eager snapshot load registers the
|
||||
-- library-intro dependency unconditionally, so a first-populate
|
||||
-- of content/library.md re-renders the library page even when
|
||||
-- the gate was previously false (see 'sidecarContext' in
|
||||
-- Tags.hs for the same pattern).
|
||||
introIds <- getMatches "content/library.md"
|
||||
libraryIntroFld <-
|
||||
if libraryIntroId `elem` introIds
|
||||
then do
|
||||
_ <- loadSnapshot libraryIntroId "body" :: Compiler (Item String)
|
||||
let libraryIntroFld = field "library-intro" $ \_ -> do
|
||||
return $ field "library-intro" $ \_ -> do
|
||||
html <- itemBody <$> loadSnapshot libraryIntroId "body"
|
||||
if all isSpace html
|
||||
then noResult "empty library intro"
|
||||
else return html
|
||||
else return mempty
|
||||
|
||||
-- One shelf's context contribution: the @<slug>-entries@
|
||||
-- listField (or absent via noResult when the shelf is
|
||||
-- empty) plus an optional @<slug>-has-more@ gate.
|
||||
portalSection p = do
|
||||
let portalSection p = do
|
||||
let portalItems = fromMaybe [] (Map.lookup p itemsByPortal)
|
||||
sorted <- recentFirstByDisplay portalItems
|
||||
|
||||
|
|
@ -763,10 +811,10 @@ rules = do
|
|||
bibKwMap = invertKeywordsBib bibExtrasAll
|
||||
|
||||
writingIds <- getMatches $ (P.essayPattern
|
||||
.||. "content/blog/*.md"
|
||||
.||. "content/fiction/*.md"
|
||||
.||. P.blogPattern
|
||||
.||. P.fictionPattern
|
||||
.||. P.poetryPattern
|
||||
.||. "content/music/*/index.md")
|
||||
.||. P.musicPattern)
|
||||
.&&. hasNoVersion
|
||||
|
||||
writingKwPairs <- forM writingIds $ \ident -> do
|
||||
|
|
@ -863,15 +911,17 @@ rules = do
|
|||
>>= relativizeUrls
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Random page manifest — essays + blog posts only (no pagination/index pages)
|
||||
-- Random page manifest — essays, blog posts, fiction, and poetry (flat
|
||||
-- and collection poems alike). No pagination/index pages; music and
|
||||
-- photography landings are also excluded.
|
||||
-- ---------------------------------------------------------------------------
|
||||
create ["random-pages.json"] $ do
|
||||
route idRoute
|
||||
compile $ do
|
||||
essays <- loadAll (allEssays .&&. hasNoVersion) :: Compiler [Item String]
|
||||
posts <- loadAll ("content/blog/*.md" .&&. hasNoVersion) :: Compiler [Item String]
|
||||
fiction <- loadAll ("content/fiction/*.md" .&&. hasNoVersion) :: Compiler [Item String]
|
||||
poetry <- loadAll ("content/poetry/*.md" .&&. hasNoVersion) :: Compiler [Item String]
|
||||
posts <- loadAll (P.blogPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
fiction <- loadAll (P.fictionPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
poetry <- loadAll (P.poetryPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
routes <- mapM (getRoute . itemIdentifier) (essays ++ posts ++ fiction ++ poetry)
|
||||
let urls = [ "/" ++ r | Just r <- routes ]
|
||||
makeItem $ LBS.unpack (Aeson.encode urls)
|
||||
|
|
@ -885,10 +935,10 @@ rules = do
|
|||
route idRoute
|
||||
compile $ do
|
||||
essays <- loadAll (allEssays .&&. hasNoVersion) :: Compiler [Item String]
|
||||
posts <- loadAll ("content/blog/*.md" .&&. hasNoVersion) :: Compiler [Item String]
|
||||
fiction <- loadAll ("content/fiction/*.md" .&&. hasNoVersion) :: Compiler [Item String]
|
||||
poetry <- loadAll (allPoetry .&&. hasNoVersion) :: Compiler [Item String]
|
||||
music <- loadAll ("content/music/*/index.md" .&&. hasNoVersion) :: Compiler [Item String]
|
||||
posts <- loadAll (P.blogPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
fiction <- loadAll (P.fictionPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
poetry <- loadAll (P.poetryPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
music <- loadAll (P.musicPattern .&&. hasNoVersion) :: Compiler [Item String]
|
||||
let items = essays ++ posts ++ fiction ++ poetry ++ music
|
||||
pairs <- mapM epistemicEntry items
|
||||
let metaMap = Map.fromList (catMaybes pairs)
|
||||
|
|
@ -903,10 +953,10 @@ rules = do
|
|||
posts <- fmap (take 30) . recentFirst
|
||||
=<< loadAllSnapshots
|
||||
( ( allEssays
|
||||
.||. "content/blog/*.md"
|
||||
.||. "content/fiction/*.md"
|
||||
.||. allPoetry
|
||||
.||. "content/music/*/index.md"
|
||||
.||. P.blogPattern
|
||||
.||. P.fictionPattern
|
||||
.||. P.poetryPattern
|
||||
.||. P.musicPattern
|
||||
)
|
||||
.&&. hasNoVersion
|
||||
)
|
||||
|
|
@ -926,7 +976,7 @@ rules = do
|
|||
compile $ do
|
||||
compositions <- recentFirst
|
||||
=<< loadAllSnapshots
|
||||
("content/music/*/index.md" .&&. hasNoVersion)
|
||||
(P.musicPattern .&&. hasNoVersion)
|
||||
"content"
|
||||
let feedCtx =
|
||||
dateField "updated" "%Y-%m-%dT%H:%M:%SZ"
|
||||
|
|
@ -966,10 +1016,10 @@ rules = do
|
|||
entries <- recentFirst
|
||||
=<< loadAllSnapshots
|
||||
( ( allEssays
|
||||
.||. "content/blog/*.md"
|
||||
.||. "content/fiction/*.md"
|
||||
.||. allPoetry
|
||||
.||. "content/music/*/index.md"
|
||||
.||. P.blogPattern
|
||||
.||. P.fictionPattern
|
||||
.||. P.poetryPattern
|
||||
.||. P.musicPattern
|
||||
)
|
||||
.&&. hasNoVersion
|
||||
)
|
||||
|
|
@ -1011,8 +1061,12 @@ epistemicEntry item = do
|
|||
, grab "stability" meta
|
||||
]
|
||||
obj = Map.fromList fields
|
||||
-- Compute overall-score the same way Contexts.overallScoreField does.
|
||||
obj' = case ( readMaybe =<< lookupString "confidence" meta :: Maybe Int
|
||||
-- Compute overall-score the same way Contexts.overallScoreField
|
||||
-- does, including the "proved"/"proven" sentinel -> 100.
|
||||
confRaw = lookupString "confidence" meta
|
||||
confInt | isProvedConfidence confRaw = Just 100
|
||||
| otherwise = readMaybe =<< confRaw :: Maybe Int
|
||||
obj' = case ( confInt
|
||||
, readMaybe =<< lookupString "evidence" meta :: Maybe Int
|
||||
) of
|
||||
(Just conf, Just ev) ->
|
||||
|
|
|
|||
|
|
@ -33,8 +33,11 @@ import Control.Exception (catch, IOException)
|
|||
import Data.Aeson (Value (..))
|
||||
import qualified Data.Aeson.KeyMap as KM
|
||||
import qualified Data.Vector as V
|
||||
import Data.List (sortBy)
|
||||
import Data.Maybe (catMaybes, fromMaybe, listToMaybe)
|
||||
import Data.Ord (comparing, Down (..))
|
||||
import Data.Time.Calendar (Day, diffDays)
|
||||
import Data.Time.Clock (getCurrentTime, utctDay)
|
||||
import Data.Time.Format (parseTimeM, formatTime, defaultTimeLocale)
|
||||
import qualified Data.Text as T
|
||||
import qualified Data.Text.IO as TIO
|
||||
|
|
@ -85,14 +88,8 @@ gitDates fp = do
|
|||
parseIso :: String -> Maybe Day
|
||||
parseIso = parseTimeM True defaultTimeLocale "%Y-%m-%d"
|
||||
|
||||
-- | Approximate day-span between the oldest and newest ISO date strings.
|
||||
daySpan :: String -> String -> Int
|
||||
daySpan oldest newest =
|
||||
case (parseIso oldest, parseIso newest) of
|
||||
(Just o, Just n) -> fromIntegral (abs (diffDays n o))
|
||||
_ -> 0
|
||||
|
||||
-- | Derive stability label from commit dates (newest-first).
|
||||
-- | Derive stability label from commit dates (newest-first), judged as
|
||||
-- of @today@.
|
||||
--
|
||||
-- Thresholds (commit count + age in days since first commit):
|
||||
--
|
||||
|
|
@ -104,13 +101,18 @@ daySpan oldest newest =
|
|||
--
|
||||
-- These cliffs are deliberately conservative: a fast burst of commits
|
||||
-- early in a piece's life looks volatile until enough time has passed
|
||||
-- to demonstrate it has settled.
|
||||
stabilityFromDates :: [String] -> String
|
||||
stabilityFromDates [] = "volatile"
|
||||
stabilityFromDates dates@(newest : _) =
|
||||
-- 'last' is safe: the (newest:_) pattern guarantees non-empty.
|
||||
classify (length dates) (daySpan (last dates) newest)
|
||||
-- to demonstrate it has settled. Age is measured from the first commit
|
||||
-- to /today/, not to the most recent commit — a piece written in a
|
||||
-- one-week burst must be able to stabilise as quiet time accumulates.
|
||||
stabilityFromDates :: Day -> [String] -> String
|
||||
stabilityFromDates _ [] = "volatile"
|
||||
stabilityFromDates today dates =
|
||||
classify (length dates) ageDays
|
||||
where
|
||||
-- 'last' is safe: the [] case is handled above.
|
||||
ageDays = case parseIso (last dates) of
|
||||
Just firstDay -> fromIntegral (diffDays today firstDay)
|
||||
Nothing -> 0
|
||||
classify n age
|
||||
| n <= 1 || age < volatileAge = "volatile"
|
||||
| n <= 5 && age < revisingAge = "revising"
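Review note: the effect of the new age computation can be checked in isolation. The sketch below reproduces only the `ageDays` part of the hunk; `ageDays` as a top-level function and the sample dates are assumptions made for illustration, and the `volatileAge` / `revisingAge` thresholds are deliberately left out because their values are not shown in this diff.

```haskell
import Data.Time.Calendar (Day, diffDays, fromGregorian)
import Data.Time.Format (defaultTimeLocale, parseTimeM)

-- Same parser as in the hunk: strict ISO yyyy-mm-dd.
parseIso :: String -> Maybe Day
parseIso = parseTimeM True defaultTimeLocale "%Y-%m-%d"

-- Age in days from the first commit (last element of a newest-first
-- list) to the supplied "today". Hypothetical top-level version of the
-- where-bound 'ageDays' in the diff.
ageDays :: Day -> [String] -> Int
ageDays _     []    = 0
ageDays today dates = case parseIso (last dates) of
  Just firstDay -> fromIntegral (diffDays today firstDay)
  Nothing       -> 0

main :: IO ()
main = do
  let today = fromGregorian 2026 6 9
      burst = ["2026-03-07", "2026-03-04", "2026-03-01"]  -- newest-first
  -- The removed 'daySpan' would have reported 6 (first to last commit);
  -- measuring to today reports 100.
  print (ageDays today burst)
```

With the old first-to-last span, a piece written in one week would keep an age of 6 days indefinitely; measured to today, its age grows as quiet time accumulates.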
|
||||
|
|
@ -149,7 +151,9 @@ resolveStability item = do
|
|||
ignored <- readIgnore
|
||||
if srcPath `elem` ignored
|
||||
then return $ fromMaybe "volatile" (lookupString "stability" meta)
|
||||
else stabilityFromDates <$> gitDates srcPath
|
||||
else do
|
||||
today <- utctDay <$> getCurrentTime
|
||||
stabilityFromDates today <$> gitDates srcPath
|
||||
|
||||
-- | Context field @$stability$@.
|
||||
-- Always resolves to a label; prefers frontmatter when the file is pinned.
|
||||
|
|
@ -166,7 +170,9 @@ lastReviewedField = field "last-reviewed" $ \item -> do
|
|||
mDate <- unsafeCompiler $ do
|
||||
ignored <- readIgnore
|
||||
if srcPath `elem` ignored
|
||||
then return $ lookupString "last-reviewed" meta
|
||||
-- Frontmatter convention is ISO; format it like the git
|
||||
-- branch so pinned pages don't render a raw "2026-05-01".
|
||||
then return $ fmtIso <$> lookupString "last-reviewed" meta
|
||||
else fmap fmtIso . listToMaybe <$> gitDates srcPath
|
||||
case mDate of
|
||||
Nothing -> fail "no last-reviewed"
|
||||
|
|
@ -228,14 +234,21 @@ versionHistoryHeadCount = 3
|
|||
|
||||
-- | Load version-history entries for an item.
|
||||
-- Priority: frontmatter @history:@ list → git log dates → empty.
|
||||
--
|
||||
-- Entries are sorted newest-first by ISO date regardless of authored
|
||||
-- order: every consumer (primary/rest split, range fields) assumes the
|
||||
-- head is the newest entry, and the @history:@ list may be authored in
|
||||
-- either direction. Git dates already arrive newest-first; the sort is
|
||||
-- idempotent there.
|
||||
loadVersionHistory :: Item a -> Compiler [VHEntry]
|
||||
loadVersionHistory item = do
|
||||
let srcPath = toFilePath (itemIdentifier item)
|
||||
meta <- getMetadata (itemIdentifier item)
|
||||
let fmEntries = parseFmHistory meta
|
||||
let newestFirst = sortBy (comparing (Down . vhDateIso))
|
||||
fmEntries = newestFirst (parseFmHistory meta)
|
||||
if not (null fmEntries)
|
||||
then return fmEntries
|
||||
else unsafeCompiler (gitLogHistory srcPath)
|
||||
else unsafeCompiler (newestFirst <$> gitLogHistory srcPath)
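Review note: a minimal sketch of the newest-first sort, assuming a stand-in record. `Entry`, `entryDateIso` and `entryNote` are hypothetical names (the real `VHEntry` fields other than `vhDateIso` are not visible in this diff). It relies on ISO `yyyy-mm-dd` strings ordering lexicographically in date order.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- Hypothetical stand-in for VHEntry; only the ISO date matters here.
data Entry = Entry
  { entryDateIso :: String
  , entryNote    :: String
  } deriving (Eq, Show)

-- Same shape as the 'newestFirst' binding in the hunk.
newestFirst :: [Entry] -> [Entry]
newestFirst = sortBy (comparing (Down . entryDateIso))

main :: IO ()
main = do
  let authored =
        [ Entry "2026-03-01" "Initial draft"
        , Entry "2026-03-14" "Expanded sections"
        ]
  mapM_ print (newestFirst authored)
  -- Sorting an already-sorted list changes nothing.
  print (newestFirst (newestFirst authored) == newestFirst authored)
```

Because `sortBy` is stable, entries sharing a date keep their authored relative order.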
|
||||
|
||||
-- | Wrap a list of 'VHEntry' as Hakyll Items with unique paths so the
|
||||
-- list field works correctly inside @$for$@.
|
||||
|
|
|
|||
|
|
@ -156,21 +156,35 @@ stripHtmlTags = go
|
|||
skipApos (_:rs) = skipApos rs
|
||||
skipApos [] = []
|
||||
|
||||
-- | Normalise a page URL for backlink map lookup (strip trailing .html).
|
||||
-- | Normalise a page URL for backlink map lookup. Must mirror
|
||||
-- 'Backlinks.normaliseUrl': strip a trailing @index.html@ (keeping the
|
||||
-- directory slash) before the bare @.html@ extension, so the keys this
|
||||
-- produces match the keys written into @data/backlinks.json@.
|
||||
normUrl :: String -> String
|
||||
normUrl u
|
||||
| "index.html" `isSuffixOf` u = take (length u - 10) u
|
||||
| ".html" `isSuffixOf` u = take (length u - 5) u
|
||||
| otherwise = u
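Review note: the guard order matters, since every URL ending in `index.html` also ends in `.html`. The sketch below copies `normUrl` from the hunk and runs it on a few sample URLs; the URLs themselves are made up for illustration.

```haskell
import Data.List (isSuffixOf)

-- Copied from the hunk: strip "index.html" (keeping the directory
-- slash) before falling back to stripping a bare ".html".
normUrl :: String -> String
normUrl u
  | "index.html" `isSuffixOf` u = take (length u - 10) u
  | ".html"      `isSuffixOf` u = take (length u - 5) u
  | otherwise                   = u

main :: IO ()
main = mapM_ (putStrLn . normUrl)
  [ "/essays/foo.html"   -- bare extension is dropped
  , "/music/index.html"  -- directory index keeps its trailing slash
  , "/feed.xml"          -- anything else passes through unchanged
  ]
```

With the guards swapped, `/music/index.html` would normalise to `/music/index` and miss the key written into `data/backlinks.json`.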
|
||||
|
||||
pad2 :: (Show a, Integral a) => a -> String
|
||||
pad2 n = if n < 10 then "0" ++ show n else show n
|
||||
|
||||
-- | Median of a non-empty list; returns 0 for empty.
|
||||
-- | Median of a non-empty list; returns 0 for empty. An even-length
|
||||
-- list takes the mean of the two middle elements, rounded to the
|
||||
-- nearest unit.
|
||||
median :: [Int] -> Int
|
||||
median [] = 0
|
||||
median xs = sort xs !! (length xs `div` 2)
|
||||
-- Index is < length xs for non-empty xs, so '(!!)' is safe here
|
||||
-- by construction. The empty case is caught by the first equation.
|
||||
median xs
|
||||
| odd n = upper
|
||||
| otherwise = (lower + upper + 1) `div` 2
|
||||
where
|
||||
-- Indexes are in range for non-empty xs (lower is consulted only
|
||||
-- when n >= 2), so '(!!)' is safe here by construction. The empty
|
||||
-- case is caught by the first equation.
|
||||
sorted = sort xs
|
||||
n = length sorted
|
||||
upper = sorted !! (n `div` 2)
|
||||
lower = sorted !! (n `div` 2 - 1)
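Review note: a standalone copy of the new `median`, with a few sample inputs chosen for illustration. For an even-length list the `+ 1` before `div 2` rounds a half upwards, so `[10, 20]` now yields 15 where the old upper-middle pick yielded 20.

```haskell
import Data.List (sort)

-- Copied from the hunk. 'lower' is only forced in the even branch, so
-- the index n `div` 2 - 1 is never evaluated for a one-element list.
median :: [Int] -> Int
median [] = 0
median xs
  | odd n     = upper
  | otherwise = (lower + upper + 1) `div` 2
  where
    sorted = sort xs
    n      = length sorted
    upper  = sorted !! (n `div` 2)
    lower  = sorted !! (n `div` 2 - 1)

main :: IO ()
main = mapM_ (print . median)
  [ []            -- empty: 0 by the first equation
  , [7]           -- single element
  , [5, 1, 9]     -- odd length: middle of the sorted list
  , [10, 20]      -- even length: mean of the two middle values
  , [1, 2]        -- mean 1.5 rounds up to 2
  ]
```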
|
||||
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
|
@ -181,8 +195,11 @@ parseDay :: String -> Maybe Day
|
|||
parseDay = parseTimeM True defaultTimeLocale "%Y-%m-%d"
|
||||
|
||||
-- | First Monday on or before 'day' (start of its ISO week).
|
||||
-- 'fromEnum' on 'DayOfWeek' is ISO-numbered (Monday=1 .. Sunday=7),
|
||||
-- so Monday must subtract 0 days, Sunday 6.
|
||||
weekStart :: Day -> Day
|
||||
weekStart day = addDays (fromIntegral (negate (fromEnum (dayOfWeek day)))) day
|
||||
weekStart day =
|
||||
addDays (fromIntegral (negate (fromEnum (dayOfWeek day) - 1))) day
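Review note: the off-by-one is easy to confirm with three dates from the same ISO week. The sketch copies the corrected `weekStart`; the sample dates are chosen for illustration (8 June 2026 is a Monday).

```haskell
import Data.Time.Calendar (Day, addDays, dayOfWeek, fromGregorian)

-- Copied from the hunk. 'fromEnum' on 'DayOfWeek' gives Monday=1 ..
-- Sunday=7, so subtracting 1 moves Monday back 0 days and Sunday 6.
weekStart :: Day -> Day
weekStart day =
  addDays (fromIntegral (negate (fromEnum (dayOfWeek day) - 1))) day

main :: IO ()
main = mapM_ (print . weekStart)
  [ fromGregorian 2026 6 8    -- Monday: maps to itself
  , fromGregorian 2026 6 9    -- Tuesday
  , fromGregorian 2026 6 14   -- Sunday: still the same week
  ]
```

All three inputs map to 2026-06-08. The removed version subtracted the raw `fromEnum` value and so landed on the preceding Sunday instead.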
|
||||
|
||||
-- | Intensity class for the heatmap (hm0 … hm4).
|
||||
heatClass :: Int -> String
|
||||
|
|
@ -297,7 +314,7 @@ renderHeatmap wordsByDay today =
|
|||
nDays = diffDays today startDay + 1
|
||||
allDays = [addDays i startDay | i <- [0 .. nDays - 1]]
|
||||
weekOf d = fromIntegral (diffDays d startDay `div` 7) :: Int
|
||||
dowOf d = fromEnum (dayOfWeek d) -- Mon=0..Sun=6
|
||||
dowOf d = fromEnum (dayOfWeek d) - 1 -- ISO 1..7 -> Mon=0..Sun=6
|
||||
svgW = (nWeeks - 1) * step + cellSz
|
||||
svgH = 6 * step + cellSz + hdrH
|
||||
|
||||
|
|
@ -752,7 +769,7 @@ renderArchive metrics =
|
|||
dl [ (k, txt v) | (k, v) <- metrics ]
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Static TOC (matches the nine h2 sections above)
|
||||
-- Static TOC (matches the eleven h2 sections above)
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
pageTOC :: H.Html
|
||||
|
|
|
|||
|
|
@ -30,16 +30,18 @@ module Tags
|
|||
) where
|
||||
|
||||
import Data.Char (isSpace)
|
||||
import Data.List (intercalate, isPrefixOf, nub, sort)
|
||||
import Data.List (intercalate, isPrefixOf, nub, sort, sortBy)
|
||||
import Data.Maybe (fromMaybe, isNothing, maybeToList)
|
||||
import Data.Ord (comparing)
|
||||
import Data.Set (Set)
|
||||
import qualified Data.Set as Set
|
||||
import Data.Time.Clock (UTCTime)
|
||||
import Data.Time.Format (defaultTimeLocale, parseTimeM)
|
||||
import Hakyll
|
||||
import Pagination (sortAndGroupAt)
|
||||
import Patterns (tagIndexable)
|
||||
import Contexts (abstractField, contentKindField,
|
||||
recentFirstByDisplay, revisionDateFields,
|
||||
tagLinksFieldExcludingScope)
|
||||
import Contexts (Revision (..), abstractField, contentKindField,
|
||||
getRevisions, recentFirstByDisplay, revisionDateFields,
|
||||
siteCtx, tagLinksFieldExcludingScope)
|
||||
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
|
@ -80,23 +82,23 @@ expandTag t =
|
|||
|
||||
-- | Top-level tags that own a section URL outside the tag system, and
|
||||
-- therefore must NOT be created as tag pages — doing so would
|
||||
-- collide with a section landing route. The literal @"photography"@
|
||||
-- is the only one currently affected: every photo's @tags:@ list
|
||||
-- begins with the bare @"photography"@ portal tag (per the section's
|
||||
-- convention), and 'tagIdentifier' would route that to
|
||||
-- @"photography/index.html"@ — already owned by
|
||||
-- @photographyLandingRules@.
|
||||
-- collide with a section landing route. Hakyll does not error on
|
||||
-- duplicate routes (one item silently overwrites the other), so an
|
||||
-- essay tagged e.g. @music@ would otherwise clobber
|
||||
-- @music/index.html@. The set therefore lists every namespace that
|
||||
-- owns a @<name>/index.html@ route, not just the tags currently in
|
||||
-- use: @photography@ (every photo's @tags:@ list begins with it, per
|
||||
-- the section convention) plus the other section landings and
|
||||
-- generated index namespaces.
|
||||
--
|
||||
-- Sub-tags (@photography/landscape@, @photography/film@, …) are
|
||||
-- unaffected; they keep their tag pages because no section landing
|
||||
-- claims those URLs.
|
||||
--
|
||||
-- Other portal tags (@music@, @poetry@, @fiction@, …) don't appear
|
||||
-- here because their content types don't currently feed
|
||||
-- 'tagIndexable', so the top-level tag never enters the tag system.
|
||||
-- Add to this set if that ever changes.
|
||||
sectionOwnedTopLevelTags :: [String]
|
||||
sectionOwnedTopLevelTags = ["photography"]
|
||||
sectionOwnedTopLevelTags =
|
||||
[ "photography", "poetry", "fiction", "music", "essays", "blog"
|
||||
, "cv", "archive", "authors", "bibliography"
|
||||
]
|
||||
|
||||
-- | All expanded tags for an item (reads the "tags" metadata field).
|
||||
-- Filters out any 'sectionOwnedTopLevelTags' to prevent route
|
||||
|
|
@ -293,6 +295,10 @@ sidecarContext sidecarSet tag
|
|||
-- Provides the fields consumed by @templates/partials/item-card.html@
|
||||
-- (@$item-kind$@, @$date-iso$@, @$date-created$@, @$abstract$@,
|
||||
-- @$item-tags$@) with tag-ribbon suppression scoped to the current tag.
|
||||
--
|
||||
-- Composes 'siteCtx' (not bare 'defaultContext') so per-item fields
|
||||
-- the card partial gates on — notably @$has-monogram$@ — fire here
|
||||
-- the same way they do on /new.html and the library.
|
||||
tagItemCtx :: String -> Context String
|
||||
tagItemCtx scope =
|
||||
contentKindField
|
||||
|
|
@ -301,7 +307,7 @@ tagItemCtx scope =
|
|||
<> revisionDateFields
|
||||
<> tagLinksFieldExcludingScope "item-tags" scope
|
||||
<> abstractField
|
||||
<> defaultContext
|
||||
<> siteCtx
|
||||
|
||||
-- | Page identifier for a tag index page.
|
||||
-- Page 1 → <tag>/index.html
|
||||
|
|
@ -359,9 +365,39 @@ clientPaginatedRule tag pat sidecarSet saCtx baseCtx = do
|
|||
>>= loadAndApplyTemplate "templates/default.html" ctx
|
||||
>>= relativizeUrls
|
||||
|
||||
-- | Display date of an identifier: the most-recent @revised:@ entry's
|
||||
-- date when present and parseable, else the creation date. Mirrors
|
||||
-- the (unexported) @itemDisplayUTC@ behind 'Contexts.recentFirstByDisplay',
|
||||
-- but needs only 'MonadMetadata' — the paginate grouper runs in
|
||||
-- 'Rules' over bare 'Identifier's, where no 'Item's exist yet.
|
||||
identifierDisplayUTC :: (MonadMetadata m, MonadFail m)
|
||||
=> Identifier -> m UTCTime
|
||||
identifierDisplayUTC ident = do
|
||||
meta <- getMetadata ident
|
||||
case getRevisions meta of
|
||||
(r:_) | Just utc <- (parseTimeM True defaultTimeLocale "%Y-%m-%d"
|
||||
(revisionDateISO r) :: Maybe UTCTime)
|
||||
-> return utc
|
||||
_ -> getItemUTC defaultTimeLocale ident
|
||||
|
||||
-- | Partition identifiers into pages of @n@, most recent first by
|
||||
-- /display/ date — the same revision-aware key
|
||||
-- 'recentFirstByDisplay' sorts by within each rendered page — so
|
||||
-- cross-page ordering is monotone. With creation-date partitioning
|
||||
-- (plain @sortRecentFirst@), a recently revised old item stayed on a
|
||||
-- late page but jumped to its top; now it migrates to the early page
|
||||
-- where its displayed date says it belongs.
|
||||
sortAndGroupByDisplayAt :: (MonadMetadata m, MonadFail m)
|
||||
=> Int -> [Identifier] -> m [[Identifier]]
|
||||
sortAndGroupByDisplayAt n ids = do
|
||||
keyed <- mapM (\i -> (,) <$> identifierDisplayUTC i <*> pure i) ids
|
||||
return $ paginateEvery n $ map snd $ sortBy (flip (comparing fst)) keyed
|
||||
|
||||
-- | Server-side pagination at 'tagPageSize' per page. Previous/next
|
||||
-- navigation renders via @templates/partials/paginate-nav.html@;
|
||||
-- the count toggle operates within the current page only.
|
||||
-- the count toggle operates within the current page only. Pages are
|
||||
-- partitioned and sorted by the same display-date key (see
|
||||
-- 'sortAndGroupByDisplayAt').
|
||||
serverPaginatedRule :: String
|
||||
-> Pattern
|
||||
-> Set Identifier
|
||||
|
|
@ -369,7 +405,7 @@ serverPaginatedRule :: String
|
|||
-> Context String -- ^ base (siteCtx)
|
||||
-> Rules ()
|
||||
serverPaginatedRule tag pat sidecarSet saCtx baseCtx = do
|
||||
paginate <- buildPaginateWith (sortAndGroupAt tagPageSize) pat (tagPageId tag)
|
||||
paginate <- buildPaginateWith (sortAndGroupByDisplayAt tagPageSize) pat (tagPageId tag)
|
||||
paginateRules paginate $ \pageNum pat' -> do
|
||||
route idRoute
|
||||
compile $ do
|
||||
|
|
|
|||
|
|
@ -27,9 +27,9 @@ wordCount :: String -> Int
|
|||
wordCount = length . words
|
||||
|
||||
-- | Estimate reading time in minutes (assumes 200 words per minute).
|
||||
-- Minimum is 1 minute.
|
||||
-- Rounds up — 399 words is 2 minutes, not 1. Minimum is 1 minute.
|
||||
readingTime :: String -> Int
|
||||
readingTime s = max 1 (wordCount s `div` 200)
|
||||
readingTime s = max 1 ((wordCount s + 199) `div` 200)
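Review note: adding 199 before the integer division turns floor into ceiling for non-negative counts. A small sketch using the two functions from the hunk; the generated inputs are illustrative only.

```haskell
-- Copied from the hunk.
wordCount :: String -> Int
wordCount = length . words

-- Ceiling division by 200 words per minute, with a floor of 1 minute.
readingTime :: String -> Int
readingTime s = max 1 ((wordCount s + 199) `div` 200)

-- Helper for the demo only: a text of exactly n one-letter words.
sample :: Int -> String
sample n = unwords (replicate n "w")

main :: IO ()
main = mapM_ (print . readingTime . sample) [0, 200, 201, 399, 400, 401]
```

The old `wordCount s `div` 200` reported 1 minute for 399 words; the ceiling form reports 2.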
|
||||
|
||||
-- | Escape HTML special characters: @&@, @<@, @>@, @\"@, @\'@.
|
||||
--
|
||||
|
|
@ -62,7 +62,11 @@ trim :: String -> String
|
|||
trim = dropWhileEnd isSpace . dropWhile isSpace
|
||||
|
||||
-- | Lowercase a string, drop everything that isn't alphanumeric or
|
||||
-- space, then replace runs of spaces with single hyphens.
|
||||
-- space, then replace each space with a hyphen. Note that a run of
|
||||
-- spaces therefore becomes a run of hyphens (@"A  B" → "a--b"@) —
|
||||
-- deliberately left as-is, since every slug on the site is generated
|
||||
-- by this one function and collapsing runs now would move existing
|
||||
-- author URLs.
|
||||
--
|
||||
-- Used for author URL slugs (e.g. @"Levi Neuwirth" → "levi-neuwirth"@).
|
||||
-- Centralised here so 'Authors' and 'Contexts' cannot drift on Unicode
|
||||
|
|
|
|||
|
|
@ -68,8 +68,8 @@ constraints: any.Glob ==0.10.2,
|
|||
any.deepseq ==1.4.8.1,
|
||||
any.digest ==0.0.2.1,
|
||||
any.directory ==1.3.8.5,
|
||||
any.distributive ==0.6.2.1,
|
||||
any.djot ==0.1.2.3,
|
||||
any.distributive ==0.6.3,
|
||||
any.djot ==0.1.2.4,
|
||||
any.dlist ==1.0,
|
||||
any.doclayout ==0.5.0.1,
|
||||
any.doctemplates ==0.11.0.1,
|
||||
|
|
@ -198,7 +198,7 @@ constraints: any.Glob ==0.10.2,
|
|||
any.unliftio-core ==0.2.1.0,
|
||||
any.unordered-containers ==0.2.20.1,
|
||||
any.utf8-string ==1.0.2,
|
||||
any.uuid-types ==1.0.6,
|
||||
any.uuid-types ==1.0.6.1,
|
||||
any.vault ==0.3.1.6,
|
||||
any.vector ==0.13.2.0,
|
||||
any.vector-algorithms ==0.9.1.0,
|
||||
|
|
|
|||
|
|
@ -1,7 +1,6 @@
|
|||
---
|
||||
title: Colophon
|
||||
date: 2026-03-21
|
||||
modified: 2026-04-27
|
||||
status: "Durable"
|
||||
confidence: 93
|
||||
tags: [meta]
|
||||
|
|
|
|||
|
|
@ -1,23 +0,0 @@
|
|||
---
|
||||
title: "The Specification Dilemma"
|
||||
date: 2026-04-20 # required; used for ordering, feed, and display
|
||||
abstract: > # optional; shown in the metadata block and link previews
|
||||
We should not consider AI entities as mere tools, though they may be the raw foundation from which exceptional tools for thought are constructed to augment the human mind. Rather, we should consider AI as the ultimate distillation and consolidation of humanity's achievements - the ultimate progeny of our civilization.
|
||||
tags: # optional; see Tags section
|
||||
- ai
|
||||
- tech
|
||||
|
||||
# Epistemic profile — all optional; the entire section is hidden unless `status` is set
|
||||
status: "Draft" # Draft | Working model | Durable | Refined | Superseded | Deprecated
|
||||
confidence: 100 # 0–100 integer (%)
|
||||
importance: 5 # 1–5 integer (rendered as filled/empty dots ●●●○○)
|
||||
evidence: 1 # 1–5 integer (same)
|
||||
scope: civilizational # personal | local | average | broad | civilizational
|
||||
novelty: idiosyncratic # conventional | moderate | idiosyncratic | innovative
|
||||
practicality: moderate # abstract | low | moderate | high | exceptional
|
||||
confidence-history: # list of integers; trend arrow derived from last two entries
|
||||
---
|
||||
|
||||
TODO: block quote about Richard Feynman and the beauty of science - idea "it's more beautiful this way"
|
||||
|
||||
I have often felt there has been a loss of wonder from the world, and I lament this fact.
|
||||
|
|
@ -1,41 +0,0 @@
|
|||
---
|
||||
title: "The Modern Idolatry"
|
||||
date: 2026-04-06
|
||||
abstract: >
|
||||
Thoughts on idolizing notions of success, whether extrinsic or intrinsic, prompted by my upcoming graduation from Brown University and a recent week spent in Paris.
|
||||
tags:
|
||||
- miscellany
|
||||
- philosophy
|
||||
- personal
|
||||
- personal/travel
|
||||
authors:
|
||||
- "Levi Neuwirth | /me.html"
|
||||
status: "Draft"
|
||||
history:
|
||||
- date: "2026-04-06"
|
||||
---
|
||||
|
||||
Travel affects me profoundly, and the effect is strangely uniform. There is a hierarchical structure of dichotomies that seems to define most aspects of my life, and my interactions with place are no exception to this rule. One of the dichotomies is as follows: I am rather accustomed to moving around in my adult life to date, never spending more than 4 months in a place before spending at least a few weeks somewhere else, and yet I rapidly develop a sense of "home" wherever I am - a stagnation of sorts, an acceptance of the region in which I reside and an abstraction away of the remainder of the world to some vast, esoteric TERRA INCOGNITA. Perhaps the most profound, persistent personal effect of travel on me is that it knocks me out of this mental state of spatial hibernation, reminding me that there is an entire world beyond that which I consistently perceive, and that I have the means to do something to have a positive impact on it. This has been a profoundly important sensation for me to have for many years now, and is thus one basis by which travel is consistently a high priority for me.
|
||||
|
||||
This is often combined with a sense of grand melancholy, the sort that for me is nearly ubiquitous in the presence of grandeur and beauty. It is a different incarnation of the same melancholy^[I should emphasize here that while "melancholy" may in general invoke a negative connotation, I do not feel that this is a negative emotion whatsoever. To me, the primary effect of melancholy, or at least melancholy of this sort, is an amplification of the imposing impetus, usually some sense of grandeur. The melancholy is like delicate cinnamon powder added to the top of a pristine flat white.] that I feel when I listen to a profound piece of music, view a painting that I enjoy, or reach the summit of a mountain that I have been embracing for hours. In this case the strength is perhaps yielded by the confluence of grandeur of the natural world - the vastness of space, the mystery of distinct regions that I have yet to know and the warm embrace of returning to those which I know but not well - and that of the human world - the various cultures, languages, beliefs, institutions, and above all people that are present in various places.
|
||||
|
||||
This grand, amplified melancholy typically has three causes in my life, two of which I have already mentioned. The third is instances of outward-facing "success" - I typically feel melancholic and pensive when I have done something or crossed some milestone^["Milestones" are not terms that I would use nor guidelines or aspects of some personal timeline or plan, but rather things that society imposes. They don't mean much to me on a personal level, but do unavoidably impact how I feel, since I cannot avoid societal influences as much as I sometimes wish I could.] that many folks see as an indicator of success (or the potential for it). One might imagine, then, that I felt quite a sensation as I was travelling in Paris during my most recent spring break, on the verge of graduating from Brown University after four years of work and extreme personal growth, and such an imagination would be highly warranted. As I took endless walks on the [Champ de Mars](https://en.wikipedia.org/wiki/Champ_de_Mars) and along the [Seine](https://en.wikipedia.org/wiki/Seine) many thoughts and musings were prompted by the grand sensations of emotion, grandeur, and wonder that I felt. They are largely concentrated around the theme of modern idolatry in the name of "success" and the implications of this, on both a personal and broader philosophical and societal level. My attempts to collect them into a format that I can share follow.
|
||||
|
||||
## Dichotomies
|
||||
<figure class="prose-excerpt">
|
||||
<blockquote>
|
||||
|
||||
"Everything is a dichotomy; that is perhaps the grandeur of life, of the Universe itself."
|
||||
|
||||
|
||||
</blockquote>
|
||||
<figcaption>Levi's personal journal, 29 January 2026</figcaption>
|
||||
</figure>
|
||||
|
||||
::: dropcap
|
||||
What of "success" do I understand, and what of it have I cumulatively failed to understand? Of course, this question depends on one's chosen definition of "success," so perhaps the most interesting approach is to parameterize our choice of definition. Indeed, SUCCESS is a concept that means different things to different people, so perhaps such parameterization is implicitly necessary. Yet such parameterization unsettles me greatly on a personal level. It is the first example of dichotomy that we, together, may explore.
|
||||
:::
|
||||
|
||||
Society widely seems to view success as the fulfillment of goals rooted in extrinsic motivations. The credentialist nature of our society seems to conflate one's ability to earn a title with competence, experience, and, in some cases, worthiness - and who, exactly, is worthy of success, or, rather, is it success that deems one worthy in the eyes of the world? In more ways than one, it seems that we have been conditioned somehow through our institutions, both explicit and implicit, to conflate worthiness with success, and this conflation is perhaps grounded in the idea that success will be transitive; that is, one's continued association with successful people leads to more successful outcomes. This seems to imply that "success" is somehow a communal thing, inherently extrinsic that it diffuses and saturates, so long as those who have it^[For the sake of illustration here we are assuming that "success" is something to be had, a notion that will be debunked later.] are willing to continue associating with those who have less of it.
|
||||
|
||||
Yet this is in direct contrast to what is arguably the foundation of our^[I use "our" here to refer to citizens of the United States, my country of birth and the culture that largely influenced my perception of success.] success. The extrinsic nature of such success is not problematic, but the communal aspect is. The ethos of the [American Dream](https://en.wikipedia.org/wiki/American_Dream) is largely that of individualism - the promise that dense individual effort leads to success.
|
||||
|
|
@ -1,236 +0,0 @@
|
|||
---
|
||||
title: A Test Essay
|
||||
date: 2026-03-14
|
||||
abstract: A comprehensive end-to-end exercise of the Hakyll pipeline — typography, code, math, sidenotes, filters, tables, exhibits, and annotations.
|
||||
tags: [meta]
|
||||
affiliation: "Department of Imaginary Systems, University of Nowhere | https://example.com"
|
||||
status: Working model
|
||||
confidence: 72
|
||||
importance: 3
|
||||
evidence: 2
|
||||
scope: average
|
||||
novelty: moderate
|
||||
practicality: moderate
|
||||
confidence-history: [55, 63, 72]
|
||||
history:
|
||||
- date: "2026-03-01"
|
||||
note: Initial draft
|
||||
- date: "2026-03-14"
|
||||
note: Expanded typography and citation sections; added math examples
|
||||
---
|
||||
|
||||
The body typeface is Spectral, a screen-first serif with seven weights and full OpenType support. Old-style figures are enabled by default: the year 2026, the number 1984, Euler's number 2.718. Standard ligatures are active: *first*, *fifty*, *ffle*. The typographic principles informing this layout draw on Butterick[@butterick2019] and Tufte[@tufte1983]. This document is built with Pandoc[@pandoc].
|
||||
|
||||
Paragraphs following one another use first-line indentation in the traditional book manner, with no inter-paragraph vertical gap. This is the second paragraph of the opening section, and you should see the indent at the start of this line.
|
||||
|
||||
A third paragraph to confirm the indent is consistent across multiple consecutive paragraphs and does not drift or accumulate.
|
||||
|
||||
## Typography
|
||||
|
||||
### Headings
|
||||
|
||||
Headings are set in Fira Sans Semibold, a humanist sans-serif that complements Spectral. The hierarchy below demonstrates all levels used in practice.
|
||||
|
||||
## Section heading (H2)
|
||||
|
||||
### Subsection heading (H3)
|
||||
|
||||
#### Minor heading (H4)
|
||||
|
||||
##### Rarely used (H5)
|
||||
|
||||
Body text resumes here, following the heading sequence above. The vertical rhythm above each heading and the transition back to Spectral below it should feel natural, not abrupt.
|
||||
|
||||
### Inline Elements
|
||||
|
||||
This sentence demonstrates **bold emphasis (700)** and <strong class="semibold">semibold emphasis (600)</strong> side by side — the authorial choice the spec describes. Italic text looks like *this phrase set in Spectral italic*. Combined: ***bold italic***.
|
||||
|
||||
Abbreviations use Spectral's true small-caps via the `smcp` OpenType feature: the organisations <abbr title="National Science Foundation">NSF</abbr>, <abbr title="American Civil Liberties Union">ACLU</abbr>, and <abbr title="Central Intelligence Agency">CIA</abbr>. These should appear as genuine small capitals, not scaled-down full caps.
|
||||
|
||||
Superscripts use Spectral's `sups` glyphs: E = mc^2^, footnote reference^1^, ordinals like 1^st^ and 2^nd^. Subscripts use `subs`: H~2~O, CO~2~.
|
||||
|
||||
Inline code looks like `cabal run site -- build` and sits comfortably in a line of Spectral body text. The size differential and background tint should clearly distinguish it without being jarring.
|
||||
|
||||
### Blockquotes
|
||||
|
||||
> The site is the proof. If a site about careful writing is itself carelessly made, the argument is self-defeating. Every element must earn its presence.
|
||||
|
||||
Text resumes after the blockquote without indent — the indent reset rule is working if this line begins flush left.
|
||||
|
||||
> A nested quotation scenario: this outer blockquote contains ordinary text, establishing the left-border visual hierarchy.
|
||||
|
||||
## Code
|
||||
|
||||
JetBrains Mono is used for all code. Ligatures and contextual alternates are active: `->` `=>` `!=` `::` `>=` in inline code, and in blocks below.
|
||||
|
||||
```haskell
-- Hakyll site compiler entry point
module Main where

import Hakyll (hakyll)
import Site (rules)

main :: IO ()
main = hakyll rules
```
|
||||
|
||||
```css
/* CSS custom property example */
:root {
  --bg: #faf8f4;
  --text: #1a1a1a;
}

body {
  background-color: var(--bg);
  color: var(--text);
  font-feature-settings: 'liga' 1, 'onum' 1;
}
```
|
||||
|
||||
```python
def greet(name: str) -> str:
    return f"Hello, {name}!"
```
|
||||
|
||||
The code block border, background tint, and monospaced font should feel quiet — part of the page, not a jarring box.
|
||||
|
||||
## Tables
|
||||
|
||||
Tables use Fira Sans at 90% size, with lining figures and tabular spacing enabled for numeric alignment.
|
||||
|
||||
| Font           | Role            | Weight(s)     | File size |
|:---------------|:----------------|:--------------|:----------|
| Spectral       | Body text       | 400, 600, 700 | 21–24 KB  |
| Fira Sans      | UI / headings   | 400, 600      | 16 KB     |
| JetBrains Mono | Code            | 400           | 19–20 KB  |
|
||||
|
||||
## Dark Mode
|
||||
|
||||
Use the toggle in the top-right corner of the nav to switch between light and dark. Both themes use warm monochrome palettes derived from the same base hue. The background, text, borders, muted text, code blocks, and blockquote borders should all shift coherently.
|
||||
|
||||
Check the following specifically in dark mode: sidenotes, code block backgrounds, the blockquote border, and the table header row. The `transition` on `body` should make the switch feel smooth rather than abrupt.
|
||||
|
||||
- Background: `#1c1a18` (warm dark, not pure black)
|
||||
- Text: `#e8e5df` (warm off-white, not pure white)
|
||||
- Muted text, borders: proportionally darker warm greys
|
||||
|
||||
## Mathematics
|
||||
|
||||
The quadratic formula solves $ax^2 + bx + c = 0$ for real roots:
|
||||
|
||||
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
|
||||
|
||||
This is a well-known result.[^quadratic] Euler's identity is often cited as the most beautiful equation in mathematics:
|
||||
|
||||
$$e^{i\pi} + 1 = 0$$
|
||||
|
||||
It connects the five most important constants in mathematics.[^euler] The CSS smallcaps filter should catch abbreviations like NASA, HTML, CSS, and API automatically.
|
||||
|
||||
[^quadratic]: The formula follows directly from completing the square. For a derivation, see any introductory algebra text, e.g. Stewart's *Precalculus*.
|
||||
|
||||
[^euler]: This follows from Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$ evaluated at $\theta = \pi$.
|
||||
|
||||
### Turán's Theorem
|
||||
|
||||
The Turán graph $T(n,k)$ is the complete $k$-partite graph on $n$ vertices with part sizes as equal as possible. For a complete $k$-partite graph with part sizes $m_1, \ldots, m_k$, the edge count is given by the formula below; this is the identity the moving-vertex argument exploits.
|
||||
|
||||
::: {.exhibit .exhibit--equation data-exhibit-name="Turán Edge Count" data-exhibit-type="equation" data-exhibit-caption="Edge count of a complete k-partite graph: total pairs minus same-part pairs."}

:::: exhibit-body
$$\binom{n}{2} - \sum_{i=1}^{k}\binom{m_i}{2}$$
::::

:::
|
||||
|
||||
Every pair of vertices is adjacent *except* those within the same part, so the formula counts edges by subtracting same-part pairs from all pairs.
|
||||
|
||||
::: {.annotation .annotation--static}
|
||||
<div class="annotation-header">
|
||||
<span class="annotation-label">Remark</span>
|
||||
<span class="annotation-name">Equal parts maximise edges</span>
|
||||
</div>
|
||||
<div class="annotation-body">
|
||||
Intuitively: if two parts differ in size by more than one vertex, moving a vertex from the larger to the smaller part creates more cross-part pairs than it destroys within-part pairs. The moving-vertex argument below makes this precise.
|
||||
</div>
|
||||
:::
|
||||
|
||||
::: {.annotation .annotation--collapsible}
|
||||
<div class="annotation-header">
|
||||
<span class="annotation-label">Note</span>
|
||||
<span class="annotation-name">Turán graph definition</span>
|
||||
<button class="annotation-toggle" aria-expanded="false">▸ expand</button>
|
||||
</div>
|
||||
<div class="annotation-body">
|
||||
The *Turán graph* $T(n,k)$ is the unique (up to isomorphism) complete $k$-partite graph on $n$ vertices whose part sizes differ by at most one. By Turán's theorem, $T(n,k)$ is the $K_{k+1}$-free graph on $n$ vertices with the maximum number of edges.
|
||||
</div>
|
||||
:::
|
||||
|
||||
::: {.exhibit .exhibit--proof data-exhibit-name="Turán Bound" data-exhibit-type="proof" data-exhibit-caption="Moving one vertex from the larger to the smaller part strictly increases the edge count when parts differ by ≥ 2."}
|
||||
|
||||
:::: exhibit-body
|
||||
Let $G$ be a complete $k$-partite graph with part sizes $n_1, \ldots, n_k$, and without loss of generality suppose $n_1 - n_2 \ge 2$. Form a new complete $k$-partite graph $G'$ by moving one vertex from part 1 to part 2. Since $G'$ is still complete $k$-partite on the same $n$ vertices, it suffices to show that it has strictly more edges than $G$.
|
||||
|
||||
The number of edges in any complete $k$-partite graph $M_{m_1,\ldots,m_k}$ is
|
||||
|
||||
$$\binom{n}{2} - \sum_{i=1}^{k}\binom{m_i}{2},$$
|
||||
|
||||
since every pair of vertices is adjacent *except* those within the same part. Therefore
|
||||
|
||||
$$|E(G')| - |E(G)| = \binom{n_1}{2} + \binom{n_2}{2} - \binom{n_1-1}{2} - \binom{n_2+1}{2}.$$
|
||||
|
||||
Using $\binom{m}{2} = \frac{m(m-1)}{2}$, this simplifies to $(n_1 - 1) - n_2 = n_1 - n_2 - 1$. Since $n_1 - n_2 \ge 2$, we get $|E(G')| - |E(G)| \ge 1 > 0$. [□]{.proof-qed}
|
||||
::::
|
||||
|
||||
:::
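
As a quick numerical check of the identity used in the proof, the following JavaScript sketch counts the edges of a complete multipartite graph from its part sizes and confirms that moving one vertex from a larger part to a smaller one gains exactly $n_1 - n_2 - 1$ edges. The names `choose2`, `edgeCount`, and `moveVertex` are illustrative assumptions and are not part of the site's code.

```javascript
// Number of unordered pairs among m items: C(m, 2) = m(m - 1) / 2.
function choose2(m) {
  return (m * (m - 1)) / 2;
}

// Edges of the complete multipartite graph with the given part sizes:
// all vertex pairs minus the pairs that fall inside the same part.
function edgeCount(parts) {
  const n = parts.reduce((sum, m) => sum + m, 0);
  const samePart = parts.reduce((sum, m) => sum + choose2(m), 0);
  return choose2(n) - samePart;
}

// Move one vertex from part `from` to part `to`; returns new part sizes.
function moveVertex(parts, from, to) {
  const next = parts.slice();
  next[from] -= 1;
  next[to] += 1;
  return next;
}

const before = [5, 2, 3];               // n1 = 5, n2 = 2
const after = moveVertex(before, 0, 1); // [4, 3, 3]
const gain = edgeCount(after) - edgeCount(before);
console.log(gain); // 2, which equals n1 - n2 - 1
```

With parts of sizes 5, 2, and 3 the graph has 31 edges; after the move it has 33, matching the predicted gain of $5 - 2 - 1 = 2$.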
|
||||
|
||||
## Music Notation
|
||||
|
||||
Score fragments are embedded inline as responsive SVGs, integrated with the gallery focusable system. Clicking the fragment — or the expand glyph that appears on hover — opens the shared overlay. The SVG inherits the page's text color via `currentColor`, so notation renders correctly in both light and dark modes. The caption below the score is a persistent `<figcaption>`, in keeping with the convention of printed musical editions.
|
||||
|
||||
Prose commentary surrounds the fragment just as it would in an analytical text — above to introduce the passage, below to elaborate on what was shown.
|
||||
|
||||
## Links and Wikilinks
|
||||
|
||||
External links with domain classes: [Wikipedia on the quadratic formula](https://en.wikipedia.org/wiki/Quadratic_formula), an [arXiv preprint](https://arxiv.org/abs/1234.5678), a [DOI link](https://doi.org/10.1000/xyz123), and [jgm/pandoc on GitHub](https://github.com/jgm/pandoc). A generic external: [example.com](https://example.com).
|
||||
|
||||
An internal link [to the essay index](/essays/index.html) is left completely unchanged — no extra classes or attributes added.
|
||||
|
||||
Wikilinks: [[About This Site]] resolved from `[[About This Site]]`, and [[The Colophon|the colophon]] resolved from `[[The Colophon|the colophon]]`.
|
||||
|
||||
## Filter Output
|
||||
|
||||
### Abbreviations
|
||||
|
||||
`Filters.Typography` matches exact Pandoc `Str` tokens against a table of common Latin abbreviations and wraps them in `<abbr title="…">` elements. Hover over the highlighted abbreviations below to see the tooltip.
|
||||
|
||||
Common scholarly shorthand: e.g. the quadratic formula, i.e. the formula $x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}$. See cf. Stewart §3.4. The argument follows from first principles, viz. the moving-vertex technique. NB: the result holds only for $k \ge 2$.
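
The exact-token matching described above can be sketched as a plain table lookup. This is a JavaScript illustration only, not the Haskell filter itself: the `ABBREVIATIONS` table, its entries, and the `wrapAbbreviation` helper are assumptions made for the example.

```javascript
// Hypothetical table; the real filter's entries are not shown in this document.
const ABBREVIATIONS = new Map([
  ['e.g.', 'exempli gratia (for example)'],
  ['i.e.', 'id est (that is)'],
  ['cf.', 'confer (compare)'],
  ['viz.', 'videlicet (namely)'],
]);

// Wrap a token in <abbr> only when it matches a table key exactly;
// anything else passes through unchanged.
function wrapAbbreviation(token) {
  const title = ABBREVIATIONS.get(token);
  return title === undefined
    ? token
    : `<abbr title="${title}">${token}</abbr>`;
}

console.log(wrapAbbreviation('cf.'));
console.log(wrapAbbreviation('cf'));
```

Because the match is exact, a token without its trailing period (`cf`) is left alone in this sketch.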
|
||||
|
||||
### Smallcaps
|
||||
|
||||
`Filters.Smallcaps` detects runs of three or more uppercase letters and wraps them in `<abbr class="smallcaps">`. Technology acronyms detected automatically: HTML, CSS, API, JSON, URL, NASA, MIT. Trailing punctuation is stripped before the check so HTTP, and REST. also work correctly.
|
||||
|
||||
Not converted: short tokens like I, OK (two letters), or mixed-case tokens like JavaScript, macOS, or LaTeX.
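
The detection rule described in this section can be sketched in JavaScript as follows. This is an illustration under stated assumptions, not the actual `Filters.Smallcaps` code: the set of trailing punctuation characters, the whole-token match, and the choice to keep the punctuation outside the `<abbr>` are all guesses made for the example.

```javascript
// Assumed set of trailing punctuation to ignore when testing a token.
const TRAILING_PUNCT = /[.,;:!?)]+$/;

// True when the token, minus trailing punctuation, is three or more
// uppercase ASCII letters and nothing else.
function isSmallcapsToken(token) {
  const core = token.replace(TRAILING_PUNCT, '');
  return /^[A-Z]{3,}$/.test(core);
}

// Wrap qualifying tokens; keep stripped punctuation after the element.
function wrapSmallcaps(token) {
  if (!isSmallcapsToken(token)) return token;
  const core = token.replace(TRAILING_PUNCT, '');
  const tail = token.slice(core.length);
  return `<abbr class="smallcaps">${core}</abbr>${tail}`;
}

console.log(wrapSmallcaps('HTTP,'));
console.log(wrapSmallcaps('OK'));
console.log(wrapSmallcaps('JavaScript'));
```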
|
||||
|
||||
### Annotations
|
||||
|
||||
::: {.annotation .annotation--static}
|
||||
<div class="annotation-header">
|
||||
<span class="annotation-label">Remark</span>
|
||||
<span class="annotation-name">On static annotations</span>
|
||||
</div>
|
||||
<div class="annotation-body">
|
||||
This is a static annotation. It is always visible and has no toggle. The border separates the header from the body.
|
||||
</div>
|
||||
:::
|
||||
|
||||
::: {.annotation .annotation--collapsible}
|
||||
<div class="annotation-header">
|
||||
<span class="annotation-label">Note</span>
|
||||
<span class="annotation-name">On collapsible annotations</span>
|
||||
<button class="annotation-toggle" aria-expanded="false">▸ expand</button>
|
||||
</div>
|
||||
<div class="annotation-body">
|
||||
This annotation is collapsed by default. The abbreviations i.e. and e.g. should be wrapped in `<abbr>` tags by `Filters.Typography`. Clicking the button should expand and collapse this body smoothly, with the last line fully visible.
|
||||
</div>
|
||||
:::
|
||||
|
|
@ -1,47 +0,0 @@
|
|||
---
|
||||
title: "Universities Should Care"
|
||||
date: 2026-04-28 # required; used for ordering, feed, and display
|
||||
abstract: > # optional; shown in the metadata block and link previews
|
||||
Students should be more than a mere statistic to the universities at which they study. I critique Brown University, my undergraduate institution, in this regard. The reduction of students to mere statistics is potentially a major reason for the decline of postsecondary education in the modern United States.
|
||||
tags: # optional; see Tags section
|
||||
- ai
|
||||
- tech
|
||||
|
||||
# Epistemic profile — all optional; the entire section is hidden unless `status` is set
|
||||
status: "Draft" # Draft | Working model | Durable | Refined | Superseded | Deprecated
|
||||
confidence: 85 # 0–100 integer (%)
|
||||
importance: 4 # 1–5 integer (rendered as filled/empty dots ●●●○○)
|
||||
evidence: 5 # 1–5 integer (same)
|
||||
scope: broad # personal | local | average | broad | civilizational
|
||||
novelty: moderate # conventional | moderate | idiosyncratic | innovative
|
||||
practicality: high # abstract | low | moderate | high | exceptional
|
||||
confidence-history: # list of integers; trend arrow derived from last two entries
|
||||
---
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
---
|
||||
Planning: List of grievances
|
||||
|
||||
COMPUTER SCIENCE
|
||||
- TA System section.
|
||||
-
|
||||
|
||||
RES LIFE
|
||||
- Obviously: repeated requests for discussion and process for moving out in Fall '23.
|
||||
- Unable to control heat
|
||||
- Lack of bathrooms.
|
||||
- Lack of kitchens
|
||||
|
||||
DINING
|
||||
- Let's run through some calculations to see the actual cost of every meal averaged across a semester.
|
||||
- No real late night options.
|
||||
- Poor optimization of queues / high demand items like grilled cheese.
|
||||
- Inconsistent pricing for the same items across locations.
|
||||
|
||||
SECURITY
|
||||
- No substantive changes since December 13th.
|
||||
|
||||
EFFECTS ON THE CULTURE
|
||||
|
|
@ -13,7 +13,6 @@ importance: 1
|
|||
scope: personal
|
||||
novelty: conventional
|
||||
practicality: moderate
|
||||
confidence-history:
|
||||
---
|
||||
|
||||
A fuller write-up follows. In the meantime, see the [projects index](/cv/projects/).
|
||||
|
|
|
|||
|
|
@ -18,7 +18,6 @@ evidence: 4
|
|||
scope: broad
|
||||
novelty: innovative
|
||||
practicality: high
|
||||
confidence-history:
|
||||
---
|
||||
|
||||
A fuller write-up follows with the clinical-implications manuscript. In the meantime, see the [projects index](/cv/projects/).
|
||||
|
|
|
|||
|
|
@ -1,23 +1,20 @@
|
|||
---
|
||||
title: "Speculative Reluctance"
|
||||
date: 2026-04-15 # required; used for ordering, feed, and display
|
||||
abstract: > # optional; shown in the metadata block and link previews
|
||||
date: 2026-04-15
|
||||
abstract: >
|
||||
AI labs are likely deliberately reluctant to scale because they are aware that any imminent shift to locally run models as the norm would render their compute redundant. We take Anthropic as a principal case study to validate this hypothesis.
|
||||
tags: # optional; see Tags section
|
||||
tags:
|
||||
- ai
|
||||
- tech
|
||||
- speculative
|
||||
- open
|
||||
|
||||
# Epistemic profile — all optional; the entire section is hidden unless `status` is set
|
||||
status: "Draft" # Draft | Working model | Durable | Refined | Superseded | Deprecated
|
||||
confidence: 55 # 0–100 integer (%)
|
||||
importance: 3 # 1–5 integer (rendered as filled/empty dots ●●●○○)
|
||||
evidence: 1 # 1–5 integer (same)
|
||||
scope: broad # personal | local | average | broad | civilizational
|
||||
novelty: moderate # conventional | moderate | idiosyncratic | innovative
|
||||
practicality: high # abstract | low | moderate | high | exceptional
|
||||
confidence-history: # list of integers; trend arrow derived from last two entries
|
||||
status: "Draft"
|
||||
confidence: 55
|
||||
importance: 3
|
||||
evidence: 1
|
||||
scope: broad
|
||||
novelty: moderate
|
||||
practicality: high
|
||||
---
|
||||
|
||||
Running a lab that develops frontier LLMs is somewhat like playing a game that, by every externally measurable metric, you are bound to lose. The compute required to train a frontier LLM is extraordinarily expensive, and the cost of inference is higher still. OpenAI claims at the time of this writing to have somewhere between 900 million and 1 billion active users, all of whom incur some inference cost, and a small subset of whom consume an enormous amount of compute - to use their words, this is ["commercial scale"](https://openai.com/index/accelerating-the-next-phase-ai/). This isn't to mention the immense competition - there are many major players in the United States alone contributing models that push the boundaries. OpenAI may have been the first, but Anthropic, Google, Meta, xAI, and, yes, even Amazon and ByteDance are following right along.
|
||||
|
|
|
|||
|
|
@ -19,7 +19,6 @@ evidence: 5
|
|||
scope: civilizational
|
||||
novelty: innovative
|
||||
practicality: moderate
|
||||
confidence-history:
|
||||
---
|
||||
|
||||
There are at least two distinct ways to reduce the search space over which AGI^[The definition of "Artificial General Intelligence", or whether such a definition exists, is contentious. My use of the term is not intended to endorse any proposed timeline for AGI, nor to suggest that it is inevitable. It is rather to provide calibration through a hypothetical goal that clearly justifies pursuit.] will have to operate. The first involves a harmonious interaction of agent and human, not transactional in origin, not fully autonomous nor fully human-driven, but rather collaborative in nature - the agent augments the capacity of the human, just as any other good tool for thought does, by working within the scope of something well specified and ideated upon. This is not to say that the agent cannot have a place in such planning, but rather that the human is ultimately the driver of the actions and tasks, defining the scope of what is to be done in as much detail as possible without being the one to actually do it.
|
||||
|
|
|
|||
|
|
@ -14,7 +14,6 @@ importance: 1
|
|||
scope: local
|
||||
novelty: moderate
|
||||
practicality: low
|
||||
confidence-history:
|
||||
---
|
||||
|
||||
A fuller write-up follows. In the meantime, see the [projects index](/cv/projects/).
|
||||
|
|
|
|||
|
|
@ -19,7 +19,7 @@ authors:
|
|||
affiliation:
|
||||
- "Department of Computer Science, Brown University | https://cs.brown.edu"
|
||||
bibliography: data/simd-paper.bib
|
||||
repository: "https://git.levineuwirth.org/where-simd-helps"
|
||||
repository: "https://git.levineuwirth.org/neuwirth/where-simd-helps"
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
|
|
|||
|
|
@ -42,8 +42,20 @@ add_header Permissions-Policy
|
|||
# report stream has been clean for a week.
|
||||
#
|
||||
# External origins justified inline:
|
||||
# cdn.jsdelivr.net KaTeX CSS + JS, Vega / Vega-Lite / Vega-Embed
|
||||
# cdn.jsdelivr.net KaTeX CSS + JS + webfonts (the KaTeX CSS
|
||||
# references its fonts relatively, so they
|
||||
# resolve to the CDN -> font-src), Vega /
|
||||
# Vega-Lite / Vega-Embed, transformers.js
|
||||
# (whose onnxruntime fetches its .wasm from
|
||||
# the CDN via fetch() -> connect-src)
|
||||
# *.basemaps.cartocdn.com Leaflet basemap tiles (photography map only)
|
||||
# connect-src API hosts link-popup providers fetched directly via
|
||||
# CORS (the list popups.js documents in its
|
||||
# header, plus git.levineuwirth.org for the
|
||||
# Forgejo provider). The CORS-broken trio
|
||||
# (arxiv, archive.org, pubmed) goes through
|
||||
# the same-origin /proxy/ instead — see
|
||||
# nginx/popup-proxy.conf.
|
||||
#
|
||||
# Why 'unsafe-inline' on style:
|
||||
# - photography.html emits <span style="background:$swatch$"> for
|
||||
|
|
@ -53,18 +65,14 @@ add_header Permissions-Policy
|
|||
# Why 'unsafe-eval' on script:
|
||||
# - vega-embed compiles Vega-Lite specs at runtime via new Function().
|
||||
# Removing this would require pre-compiling specs at build time.
|
||||
# - it also covers WebAssembly.instantiate for onnxruntime-web
|
||||
# (semantic search).
|
||||
#
|
||||
# The value MUST stay on one physical line: nginx has no line
|
||||
# continuation inside quoted strings — a trailing backslash would embed
|
||||
# literal backslash + LF bytes in the header value, which is illegal in
|
||||
# HTTP/2 and gets whole responses rejected by strict clients.
|
||||
#
|
||||
# To collect violation reports, set up a `report-uri` endpoint and add
|
||||
# `report-uri /csp-report;` (and/or `report-to <group>;`) below.
|
||||
add_header Content-Security-Policy-Report-Only
|
||||
"default-src 'self'; \
|
||||
script-src 'self' 'unsafe-eval' https://cdn.jsdelivr.net; \
|
||||
style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; \
|
||||
img-src 'self' data: https://*.basemaps.cartocdn.com; \
|
||||
font-src 'self' data:; \
|
||||
connect-src 'self'; \
|
||||
frame-ancestors 'none'; \
|
||||
base-uri 'self'; \
|
||||
form-action 'self'; \
|
||||
object-src 'none'; \
|
||||
upgrade-insecure-requests" always;
|
||||
add_header Content-Security-Policy-Report-Only "default-src 'self'; script-src 'self' 'unsafe-eval' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; img-src 'self' data: https://*.basemaps.cartocdn.com; font-src 'self' data: https://cdn.jsdelivr.net; connect-src 'self' https://cdn.jsdelivr.net https://*.wikipedia.org https://api.crossref.org https://api.github.com https://openlibrary.org https://api.biorxiv.org https://www.youtube.com https://git.levineuwirth.org; frame-ancestors 'none'; base-uri 'self'; form-action 'self'; object-src 'none'; upgrade-insecure-requests" always;
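
One way to keep the header value on a single physical line while still editing it in a readable form is to generate it. The sketch below is a hypothetical build-time helper in JavaScript, not something this repository is known to contain; `buildCsp` and the abbreviated directive map are assumptions for illustration only.

```javascript
// Join CSP directives into one line. Throws if any source would smuggle
// a line break or backslash into the header value.
function buildCsp(directives) {
  const value = Object.entries(directives)
    .map(([name, sources]) =>
      sources.length ? `${name} ${sources.join(' ')}` : name)
    .join('; ');
  if (/[\r\n\\]/.test(value)) {
    throw new Error('CSP value must be a single line without backslashes');
  }
  return value;
}

// Abbreviated example policy (a subset of the directives in the header).
const csp = buildCsp({
  'default-src': ["'self'"],
  'font-src': ["'self'", 'data:', 'https://cdn.jsdelivr.net'],
  'object-src': ["'none'"],
  'upgrade-insecure-requests': [],
});

console.log(`add_header Content-Security-Policy-Report-Only "${csp}" always;`);
```

Generating the line keeps the readable, one-directive-per-entry layout in source while guaranteeing that the emitted value contains no line breaks.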
|
||||
|
|
|
|||
|
|
@ -7,7 +7,6 @@ dependencies = [
|
|||
# Visualization
|
||||
"matplotlib>=3.9,<4",
|
||||
"altair>=5.4,<6",
|
||||
|
||||
# Embedding pipeline
|
||||
# Upper bounds are intentionally generous (next major) but always
|
||||
# present so that an unrelated `uv sync` upgrade can't silently pull
|
||||
|
|
@ -18,7 +17,6 @@ dependencies = [
|
|||
"beautifulsoup4>=4.12,<5",
|
||||
# CPU-only torch — avoids pulling ~3 GB of CUDA libraries
|
||||
"torch>=2.5,<3",
|
||||
|
||||
# Photography pipeline
|
||||
# Pillow handles EXIF reading when exiftool is not installed (the
|
||||
# preferred path); colorthief computes the 5-color palette strip.
|
||||
|
|
@ -26,6 +24,10 @@ dependencies = [
|
|||
"pillow>=10.0,<12",
|
||||
"colorthief>=0.2,<1",
|
||||
"pyyaml>=6.0,<7",
|
||||
# Not imported by this repo: required at runtime by nomic-embed's
|
||||
# remote modeling code (nomic-bert-2048, loaded by embed.py's page
|
||||
# pass under trust_remote_code with a pinned code_revision).
|
||||
"einops>=0.8.2,<1",
|
||||
]
|
||||
|
||||
[[tool.uv.index]]
|
||||
|
|
|
|||
|
Image changed. Before: 1.3 KiB. After: 20 KiB.
|
|
@ -70,34 +70,35 @@ nav.site-nav {
|
|||
}
|
||||
|
||||
/* Home logo — square button flush into the top-left corner of the nav bar.
|
||||
The L silhouette is rendered via ::before mask-image so the background
|
||||
matches --bg-nav exactly and the foreground follows --nav-logo-fg (set
|
||||
per theme in base.css — override there to restyle for light mode). */
|
||||
The rooted-L mark lives in /logo-sprite.svg and is referenced with
|
||||
<use> (cacheable once, not ~33 KB inlined per page). Its two-tone
|
||||
cutout still renders because CSS custom properties cascade into the
|
||||
use-element shadow tree: the letter is drawn in --logo-ink and the
|
||||
root filament is punched through in --logo-bg. Mapping --logo-bg to
|
||||
--bg-nav (the button's own surface) makes the roots read as the nav
|
||||
background showing through. Both tokens are theme-driven in
|
||||
base.css — override --nav-logo-fg / --bg-nav there to restyle per
|
||||
theme. */
|
||||
.nav-logo {
|
||||
position: absolute;
|
||||
left: 0;
|
||||
top: 0;
|
||||
bottom: 0;
|
||||
aspect-ratio: 1 / 1;
|
||||
display: block;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
overflow: hidden;
|
||||
flex-shrink: 0;
|
||||
text-decoration: none;
|
||||
background-color: var(--bg-nav);
|
||||
--logo-ink: var(--nav-logo-fg);
|
||||
--logo-bg: var(--bg-nav);
|
||||
}
|
||||
.nav-logo::before {
|
||||
content: '';
|
||||
position: absolute;
|
||||
inset: 12%;
|
||||
background-color: var(--nav-logo-fg);
|
||||
mask-image: url('/images/link-icons/internal.svg');
|
||||
mask-size: contain;
|
||||
mask-repeat: no-repeat;
|
||||
mask-position: center;
|
||||
-webkit-mask-image: url('/images/link-icons/internal.svg');
|
||||
-webkit-mask-size: contain;
|
||||
-webkit-mask-repeat: no-repeat;
|
||||
-webkit-mask-position: center;
|
||||
.nav-logo__mark {
|
||||
width: 76%;
|
||||
height: 76%;
|
||||
display: block;
|
||||
}
|
||||
|
||||
/* Controls cluster: portals toggle + theme toggle, pinned right */
|
||||
|
|
|
|||
|
|
@ -16,8 +16,10 @@
|
|||
For an inline <span> inside a <p>, this is roughly the line containing
|
||||
the sidenote reference, giving correct vertical alignment without JS.
|
||||
|
||||
On narrow viewports the <span> is hidden and the Pandoc-generated
|
||||
<section class="footnotes"> at document end is shown instead.
|
||||
On narrow viewports the <span> is hidden and the
|
||||
<section class="footnotes"> the Sidenotes filter appends at document
|
||||
end is shown instead (Pandoc's own footnote section never exists —
|
||||
the filter consumes every Note, and re-emits this fallback itself).
|
||||
*/
|
||||
|
||||
/* ============================================================
|
||||
|
|
@ -137,22 +139,54 @@
|
|||
|
||||
|
||||
/* ============================================================
|
||||
FOOTNOTE REFERENCES — shown on narrow viewports alongside
|
||||
section.footnotes
|
||||
FOOTNOTES FALLBACK LIST — the section the Sidenotes filter
|
||||
appends at document end; visible on narrow viewports only
|
||||
(see the media queries above). Letter labels are rendered
|
||||
explicitly because an <ol>'s automatic numbers would disagree
|
||||
with the in-text letter refs.
|
||||
============================================================ */
|
||||
|
||||
a.footnote-ref {
|
||||
text-decoration: none;
|
||||
color: var(--text-faint);
|
||||
font-size: 0.75em;
|
||||
line-height: 0;
|
||||
section.footnotes .footnotes-list {
|
||||
list-style: none;
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
}
|
||||
|
||||
.footnote-item {
|
||||
position: relative;
|
||||
top: -0.4em;
|
||||
padding-left: 1.5rem;
|
||||
margin-bottom: 0.85rem;
|
||||
font-size: 0.85rem;
|
||||
line-height: 1.6;
|
||||
color: var(--text-muted);
|
||||
}
|
||||
|
||||
.footnote-label {
|
||||
position: absolute;
|
||||
left: 0;
|
||||
top: 0.15em;
|
||||
font-family: var(--font-sans);
|
||||
font-size: 0.75em;
|
||||
color: var(--text-faint);
|
||||
}
|
||||
|
||||
/* First paragraph flows on the label's line; later ones stack. */
|
||||
.footnote-item > p {
|
||||
margin: 0 0 0.5em;
|
||||
}
|
||||
.footnote-item > p:first-of-type {
|
||||
display: inline;
|
||||
}
|
||||
|
||||
.footnote-back {
|
||||
margin-left: 0.35em;
|
||||
text-decoration: none;
|
||||
font-family: var(--font-sans);
|
||||
color: var(--text-faint);
|
||||
transition: color var(--transition-fast);
|
||||
}
|
||||
|
||||
a.footnote-ref:hover {
|
||||
.footnote-back:hover {
|
||||
color: var(--text-muted);
|
||||
}
|
||||
|
||||
|
|
|
|||
|
Image changed. Before: 16 KiB. After: 8.8 KiB.
Image changed. Before: 15 KiB. After: 15 KiB.
Image changed. Before: 114 KiB. After: 32 KiB.
Image changed. Before: 4.0 MiB. After: 1.8 MiB.
|
|
@ -12,6 +12,8 @@
|
|||
var STORAGE_KEY = 'site-annotations';
|
||||
var tooltip = null;
|
||||
var tooltipTimer = null;
|
||||
var tooltipPinned = false; /* keyboard-opened: blur must not dismiss */
|
||||
var tooltipMark = null; /* mark that opened the tooltip, for focus return */
|
||||
|
||||
/* ------------------------------------------------------------------
|
||||
Storage
|
||||
|
|
@ -148,6 +150,18 @@
|
|||
|
||||
tooltip.addEventListener('mouseenter', function () { clearTimeout(tooltipTimer); });
|
||||
tooltip.addEventListener('mouseleave', function () { hideTooltip(false); });
|
||||
|
||||
/* Keyboard flow: Escape closes a pinned tooltip and returns focus
|
||||
to its mark; tabbing out of the tooltip dismisses it. */
|
||||
tooltip.addEventListener('keydown', function (e) {
|
||||
if (e.key === 'Escape') {
|
||||
hideTooltip(true);
|
||||
if (tooltipMark) tooltipMark.focus();
|
||||
}
|
||||
});
|
||||
tooltip.addEventListener('focusout', function (e) {
|
||||
if (!tooltip.contains(e.relatedTarget)) hideTooltip(false);
|
||||
});
|
||||
}
|
||||
|
||||
/* Defer to the shared utility (loaded synchronously from
|
||||
|
|
@ -159,6 +173,8 @@
|
|||
|
||||
function showTooltip(mark, ann) {
|
||||
clearTimeout(tooltipTimer);
|
||||
tooltipPinned = false;
|
||||
tooltipMark = mark;
|
||||
|
||||
var note = ann.note || '';
|
||||
var created = ann.created ? new Date(ann.created).toLocaleDateString() : '';
|
||||
|
|
@ -197,6 +213,7 @@
|
|||
|
||||
function hideTooltip(immediate) {
|
||||
clearTimeout(tooltipTimer);
|
||||
tooltipPinned = false;
|
||||
if (immediate) {
|
||||
if (tooltip) tooltip.classList.remove('is-visible');
|
||||
} else {
|
||||
|
|
@ -212,6 +229,28 @@
|
|||
showTooltip(mark, ann);
|
||||
});
|
||||
mark.addEventListener('mouseleave', function () { hideTooltip(false); });
|
||||
|
||||
/* Keyboard: focus mirrors hover; Enter/Space pins the tooltip and
|
||||
moves focus to its Delete button; Escape dismisses. */
|
||||
mark.setAttribute('tabindex', '0');
|
||||
mark.addEventListener('focus', function () {
|
||||
clearTimeout(tooltipTimer);
|
||||
showTooltip(mark, ann);
|
||||
});
|
||||
mark.addEventListener('blur', function () {
|
||||
if (!tooltipPinned) hideTooltip(false);
|
||||
});
|
||||
mark.addEventListener('keydown', function (e) {
|
||||
if (e.key === 'Enter' || e.key === ' ') {
|
||||
e.preventDefault();
|
||||
showTooltip(mark, ann);
|
||||
tooltipPinned = true;
|
||||
var del = tooltip.querySelector('.ann-tooltip-delete');
|
||||
if (del) del.focus();
|
||||
} else if (e.key === 'Escape') {
|
||||
hideTooltip(true);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
/* ------------------------------------------------------------------
|
||||
|
|
|
|||
|
|
@ -1,86 +0,0 @@
|
|||
/* citations.js — hover tooltip for inline citation markers.
|
||||
On hover of a .cite-marker, reads the matching bibliography entry from
|
||||
the DOM and shows it in a floating tooltip. On click, follows the href
|
||||
to jump to the bibliography section. Phase 3 popups.js can supersede this. */
|
||||
|
||||
(function () {
|
||||
'use strict';
|
||||
|
||||
let activeTooltip = null;
|
||||
let hideTimer = null;
|
||||
|
||||
function makeTooltip(html) {
|
||||
const el = document.createElement('div');
|
||||
el.className = 'cite-tooltip';
|
||||
el.innerHTML = html;
|
||||
el.addEventListener('mouseenter', () => clearTimeout(hideTimer));
|
||||
el.addEventListener('mouseleave', scheduleHide);
|
||||
return el;
|
||||
}
|
||||
|
||||
function positionTooltip(tooltip, anchor) {
|
||||
document.body.appendChild(tooltip);
|
||||
const aRect = anchor.getBoundingClientRect();
|
||||
const tRect = tooltip.getBoundingClientRect();
|
||||
|
||||
let left = aRect.left + window.scrollX;
|
||||
let top = aRect.top + window.scrollY - tRect.height - 10;
|
||||
|
||||
// Keep horizontally within viewport with margin
|
||||
const maxLeft = window.innerWidth - tRect.width - 12;
|
||||
left = Math.max(8, Math.min(left, maxLeft));
|
||||
|
||||
// Flip below anchor if not enough room above
|
||||
if (top < window.scrollY + 8) {
|
||||
top = aRect.bottom + window.scrollY + 10;
|
||||
}
|
||||
|
||||
tooltip.style.left = left + 'px';
|
||||
tooltip.style.top = top + 'px';
|
||||
}
|
||||
|
||||
function scheduleHide() {
|
||||
hideTimer = setTimeout(() => {
|
||||
if (activeTooltip) {
|
||||
activeTooltip.remove();
|
||||
activeTooltip = null;
|
||||
}
|
||||
}, 180);
|
||||
}
|
||||
|
||||
function getRefHtml(refEl) {
|
||||
// Strip the [N] number span, return the remaining innerHTML
|
||||
const clone = refEl.cloneNode(true);
|
||||
const num = clone.querySelector('.ref-num');
|
||||
if (num) num.remove();
|
||||
return clone.innerHTML.trim();
|
||||
}
|
||||
|
||||
function init() {
|
||||
document.querySelectorAll('.cite-marker').forEach(marker => {
|
||||
const link = marker.querySelector('a.cite-link');
|
||||
if (!link) return;
|
||||
|
||||
const href = link.getAttribute('href');
|
||||
if (!href || !href.startsWith('#')) return;
|
||||
|
||||
const refEl = document.getElementById(href.slice(1));
|
||||
if (!refEl) return;
|
||||
|
||||
marker.addEventListener('mouseenter', () => {
|
||||
clearTimeout(hideTimer);
|
||||
if (activeTooltip) { activeTooltip.remove(); }
|
||||
activeTooltip = makeTooltip(getRefHtml(refEl));
|
||||
positionTooltip(activeTooltip, marker);
|
||||
});
|
||||
|
||||
marker.addEventListener('mouseleave', scheduleHide);
|
||||
});
|
||||
}
|
||||
|
||||
if (document.readyState === 'loading') {
|
||||
document.addEventListener('DOMContentLoaded', init);
|
||||
} else {
|
||||
init();
|
||||
}
|
||||
})();
|
||||
|
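The placement rule in `positionTooltip` above can be read as a pure function of three rectangles. The sketch below is illustrative only: `placeTooltip` and its plain-object arguments are hypothetical names, not part of the file, and it simply mirrors the arithmetic shown in the diff (10px gap, 8px and 12px margins) without touching the DOM.

```javascript
// Hypothetical pure version of the placement arithmetic in positionTooltip.
// anchor: { left, top, bottom } as returned by getBoundingClientRect()
// tip:    { width, height } of the tooltip
// view:   { scrollX, scrollY, innerWidth } taken from window
function placeTooltip(anchor, tip, view) {
  var left = anchor.left + view.scrollX;
  var top = anchor.top + view.scrollY - tip.height - 10;

  // Keep horizontally within the viewport, with a margin on each side.
  var maxLeft = view.innerWidth - tip.width - 12;
  left = Math.max(8, Math.min(left, maxLeft));

  // Flip below the anchor when there is not enough room above it.
  if (top < view.scrollY + 8) {
    top = anchor.bottom + view.scrollY + 10;
  }
  return { left: left, top: top };
}
```

Separating the arithmetic from `document.body.appendChild` makes the clamp and flip cases easy to check in isolation.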
|
@ -9,9 +9,18 @@
|
|||
(function () {
|
||||
'use strict';
|
||||
|
||||
var PREFIX = 'section-collapsed:';
|
||||
/* Keys are namespaced by pathname: Pandoc auto-slugs (#introduction,
|
||||
#background) recur across essays, and an un-namespaced key would
|
||||
collapse the same-named section on every page. */
|
||||
var PREFIX = 'section-collapsed:' + location.pathname + ':';
|
||||
var store = window.lnUtils && window.lnUtils.safeStorage;
|
||||
|
||||
function initHeading(heading) {
|
||||
// Idempotence guard: reinitCollapse may be called more than once on
|
||||
// the same container — never re-wrap a section or stack toggle
|
||||
// buttons (matches the popups.js/sidenotes.js convention).
|
||||
if (heading.dataset.collapseBound === '1') return;
|
||||
|
||||
var level = parseInt(heading.tagName[1], 10);
|
||||
var content = [];
|
||||
var node = heading.nextElementSibling;
|
||||
|
|
@ -24,6 +33,7 @@
|
|||
node = node.nextElementSibling;
|
||||
}
|
||||
if (!content.length) return;
|
||||
heading.dataset.collapseBound = '1';
|
||||
|
||||
// Wrap collected nodes in a .section-body div.
|
||||
var wrapper = document.createElement('div');
|
||||
|
|
@ -41,7 +51,7 @@
|
|||
|
||||
// Restore persisted state without transition flash.
|
||||
var key = PREFIX + heading.id;
|
||||
var collapsed = localStorage.getItem(key) === '1';
|
||||
var collapsed = store ? store.get(key) === '1' : false;
|
||||
|
||||
function setCollapsed(c, animate) {
|
||||
if (!animate) wrapper.style.transition = 'none';
|
||||
|
|
@ -80,7 +90,7 @@
|
|||
void wrapper.offsetHeight; // force reflow
|
||||
}
|
||||
setCollapsed(!isCollapsed, true);
|
||||
localStorage.setItem(key, isCollapsed ? '0' : '1');
|
||||
if (store) store.set(key, isCollapsed ? '0' : '1');
|
||||
});
|
||||
|
||||
// After open animation: release the height cap so late-rendering
|
||||
|
|
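The two changes in this file (pathname-namespaced keys, and reads/writes routed through `window.lnUtils.safeStorage`) can be sketched together. `utils.js` is not shown in this diff, so `makeSafeStorage` below is an assumption about what such a wrapper might look like, and `keyFor` is a hypothetical helper standing in for the `PREFIX + heading.id` concatenation.

```javascript
// Assumed shape of a storage wrapper that never throws; the real
// lnUtils.safeStorage lives in utils.js and may differ.
function makeSafeStorage(backend) {
  return {
    get: function (key) {
      try { return backend.getItem(key); } catch (e) { return null; }
    },
    set: function (key, value) {
      try { backend.setItem(key, value); return true; } catch (e) { return false; }
    }
  };
}

// Hypothetical helper: namespace the key by pathname so that two essays
// with the same auto-generated heading id do not share collapse state.
function keyFor(pathname, headingId) {
  return 'section-collapsed:' + pathname + ':' + headingId;
}
```

With a backend that throws on every access (as some privacy modes do), `get` yields `null` and the section simply starts expanded, which matches the `store ? store.get(key) === '1' : false` default in the diff.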
|
|||
|
|
@ -17,9 +17,18 @@
|
|||
btn.setAttribute('aria-label', 'Copy code to clipboard');
|
||||
|
||||
btn.addEventListener('click', function () {
|
||||
var text = pre.querySelector('code')
|
||||
? pre.querySelector('code').innerText
|
||||
: pre.innerText;
|
||||
var code = pre.querySelector('code');
|
||||
var text;
|
||||
if (code) {
|
||||
text = code.innerText;
|
||||
} else {
|
||||
/* Code-less <pre>: clone and strip the injected button so
|
||||
its label is not copied along with the content. */
|
||||
var clone = pre.cloneNode(true);
|
||||
var cloneBtn = clone.querySelector('.copy-btn');
|
||||
if (cloneBtn) cloneBtn.remove();
|
||||
text = clone.innerText;
|
||||
}
|
||||
|
||||
navigator.clipboard.writeText(text).then(function () {
|
||||
btn.textContent = 'copied';
|
||||
|
|
|
|||
|
|
@ -88,6 +88,21 @@
|
|||
return exhibit.dataset.exhibitCaption || '';
|
||||
}
|
||||
|
||||
/* Make an exhibit wrapper keyboard-operable: role=button, tabindex,
|
||||
and Enter/Space sharing the click path. closeOverlay()'s focus
|
||||
return relies on the wrapper being focusable. */
|
||||
function bindActivation(el, activate) {
|
||||
el.setAttribute('role', 'button');
|
||||
el.setAttribute('tabindex', '0');
|
||||
el.addEventListener('click', activate);
|
||||
el.addEventListener('keydown', function (e) {
|
||||
if (e.key === 'Enter' || e.key === ' ') {
|
||||
e.preventDefault();
|
||||
activate();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
function discoverFocusableMath(markdownBody) {
|
||||
markdownBody.querySelectorAll('.katex-display').forEach(function (katexEl) {
|
||||
var source = getSource(katexEl);
|
||||
|
|
@ -118,8 +133,8 @@
|
|||
};
|
||||
focusables.push(entry);
|
||||
|
||||
/* Click anywhere on the wrapper opens the overlay */
|
||||
wrapper.addEventListener('click', function () {
|
||||
/* Click or Enter/Space anywhere on the wrapper opens the overlay */
|
||||
bindActivation(wrapper, function () {
|
||||
openOverlay(focusables.indexOf(entry));
|
||||
});
|
||||
});
|
||||
|
|
@ -151,7 +166,7 @@
|
|||
};
|
||||
focusables.push(entry);
|
||||
|
||||
figEl.addEventListener('click', function () {
|
||||
bindActivation(figEl, function () {
|
||||
openOverlay(focusables.indexOf(entry));
|
||||
});
|
||||
});
|
||||
|
|
|
|||
|
|
@ -165,7 +165,12 @@
|
|||
var images = document.querySelectorAll('img[data-lightbox]');
|
||||
|
||||
images.forEach(function (el) {
|
||||
el.addEventListener('click', function () {
|
||||
// Keyboard activation: the trigger acts as a button, and the
|
||||
// tabindex also lets close() return focus to it.
|
||||
el.setAttribute('tabindex', '0');
|
||||
el.setAttribute('role', 'button');
|
||||
|
||||
function activate() {
|
||||
// Look for a sibling figcaption in the parent figure
|
||||
var figcaptionText = '';
|
||||
var parent = el.parentElement;
|
||||
|
|
@ -176,6 +181,14 @@
|
|||
}
|
||||
}
|
||||
open(el.src, el.alt, figcaptionText, el);
|
||||
}
|
||||
|
||||
el.addEventListener('click', activate);
|
||||
el.addEventListener('keydown', function (e) {
|
||||
if (e.key === 'Enter' || e.key === ' ') {
|
||||
e.preventDefault();
|
||||
activate();
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
|
|
@ -199,11 +212,42 @@
|
|||
setInfoVisible(!overlay.classList.contains('is-info-visible'));
|
||||
});
|
||||
|
||||
// Escape closes; "i" toggles info panel (darkroom only).
|
||||
/* Focus trap for the overlay: cycle Tab/Shift+Tab through the
|
||||
focusable controls inside the lightbox so keyboard users
|
||||
cannot tab out into the obscured page background. Same
|
||||
approach as gallery.js's trapTab; the [hidden] exclusion
|
||||
covers infoBtn, which is hidden outside darkroom mode. */
|
||||
function trapTab(e) {
|
||||
var focusable = Array.from(overlay.querySelectorAll(
|
||||
'button:not([disabled]):not([hidden]), [tabindex]:not([tabindex="-1"])'
|
||||
));
|
||||
if (focusable.length === 0) {
|
||||
e.preventDefault();
|
||||
return;
|
||||
}
|
||||
var first = focusable[0];
|
||||
var last = focusable[focusable.length - 1];
|
||||
var active = document.activeElement;
|
||||
if (e.shiftKey) {
|
||||
if (active === first || !overlay.contains(active)) {
|
||||
e.preventDefault();
|
||||
last.focus();
|
||||
}
|
||||
} else {
|
||||
if (active === last || !overlay.contains(active)) {
|
||||
e.preventDefault();
|
||||
first.focus();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Escape closes; Tab is trapped; "i" toggles info panel (darkroom only).
|
||||
document.addEventListener('keydown', function (e) {
|
||||
if (!overlay.classList.contains('is-open')) return;
|
||||
if (e.key === 'Escape') {
|
||||
close();
|
||||
} else if (e.key === 'Tab') {
|
||||
trapTab(e);
|
||||
} else if ((e.key === 'i' || e.key === 'I')
|
||||
&& overlay.classList.contains('darkroom')
|
||||
&& !infoBtn.hidden) {
|
||||
|
|
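The wrap-around rule inside `trapTab` can be stated without the DOM. The following is a minimal sketch under that assumption: `trapDecision` is a hypothetical name, `activeIndex` is the position of `document.activeElement` in the focusable list, and `-1` stands for "focus is outside the overlay".

```javascript
// Hypothetical pure form of the trapTab decision.
// Returns { prevent, focus }: whether to call preventDefault(), and
// which index in the focusable list should receive focus (or null).
function trapDecision(count, activeIndex, shiftKey) {
  if (count === 0) return { prevent: true, focus: null };

  var outside = activeIndex < 0;
  var atEdge = shiftKey ? activeIndex === 0 : activeIndex === count - 1;
  if (!atEdge && !outside) {
    // Interior element: let the browser move focus normally.
    return { prevent: false, focus: null };
  }
  return { prevent: true, focus: shiftKey ? count - 1 : 0 };
}
```

Only the two edges (and stray focus outside the overlay) are intercepted; every other Tab press is left to the browser's native ordering.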
|
|||
|
|
@ -17,17 +17,23 @@
|
|||
const toggle = document.querySelector('.nav-portal-toggle');
|
||||
if (!portals || !toggle) return;
|
||||
|
||||
// safeStorage (utils.js, loaded synchronously before us) so a
|
||||
// storage-blocked context can't throw before the click listener
|
||||
// below binds; guarded like theme.js in case utils.js itself
|
||||
// failed to load.
|
||||
const store = window.lnUtils && window.lnUtils.safeStorage;
|
||||
|
||||
function setOpen(open) {
|
||||
portals.classList.toggle('is-open', open);
|
||||
toggle.setAttribute('aria-expanded', String(open));
|
||||
// Rotate arrow indicator if present.
|
||||
const arrow = toggle.querySelector('.nav-portal-arrow');
|
||||
if (arrow) arrow.textContent = open ? '▲' : '▼';
|
||||
localStorage.setItem(STORAGE_KEY, open ? '1' : '0');
|
||||
if (store) store.set(STORAGE_KEY, open ? '1' : '0');
|
||||
}
|
||||
|
||||
// Restore persisted state; default is collapsed.
|
||||
const stored = localStorage.getItem(STORAGE_KEY);
|
||||
const stored = store ? store.get(STORAGE_KEY) : null;
|
||||
setOpen(stored === '1');
|
||||
|
||||
toggle.addEventListener('click', function () {
|
||||
|
|
|
|||
|
|
@ -472,7 +472,12 @@
|
|||
if (!match) return Promise.resolve(null);
|
||||
|
||||
var ctx = { match: match, href: href };
|
||||
var url = p.url(ctx);
|
||||
/* p.url runs synchronously (before the .catch below attaches) and
|
||||
can throw — e.g. decodeURIComponent on a malformed percent
|
||||
sequence in the link path. Treat a throw as "no popup". */
|
||||
var url;
|
||||
try { url = p.url(ctx); }
|
||||
catch (e) { return Promise.resolve(null); }
|
||||
var fetcher = p.fetchType === 'xml' ? fetchXml : fetchJson;
|
||||
|
||||
return fetcher(url, p.fetchInit).then(function (data) {
|
||||
|
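The reason for the `try`/`catch` is that a synchronous throw happens before any promise exists, so a later `.catch` never sees it. A small sketch of the same guard, with a hypothetical `safeUrl` helper and a hypothetical URL builder that decodes part of the link path:

```javascript
// Hypothetical wrapper: run a URL builder that may throw synchronously
// and turn a throw into null ("no popup").
function safeUrl(build, ctx) {
  try {
    return build(ctx);
  } catch (e) {
    return null;
  }
}

// Hypothetical builder; decodeURIComponent throws URIError on a
// malformed percent sequence such as '%E0%A4%A'.
function buildFromPath(ctx) {
  return '/proxy/example/' + decodeURIComponent(ctx.path);
}
```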
|
@ -951,10 +956,10 @@
|
|||
var agoDays = daysBetween(start, today);
|
||||
/* "~" prefix when we've rounded to a unit larger than days. */
|
||||
var span = humanDuration(spanDays, true);
|
||||
var ago = humanAgo(agoDays);
|
||||
var ago = humanAgo(agoDays); /* '' when start is in the future */
|
||||
lines.push(
|
||||
'<div class="popup-date-primary">'
|
||||
+ esc(span) + ' · started ' + esc(ago)
|
||||
+ esc(span) + (ago ? ' · started ' + esc(ago) : '')
|
||||
+ '</div>');
|
||||
if (commits && /^\d+$/.test(commits)) {
|
||||
var n = parseInt(commits, 10);
|
||||
|
|
@ -965,10 +970,16 @@
|
|||
}
|
||||
} else {
|
||||
var days = daysBetween(start, today);
|
||||
var ago2 = humanAgo(days); /* '' when the date is in the future */
|
||||
if (ago2) {
|
||||
lines.push(
|
||||
'<div class="popup-date-primary">'
|
||||
+ esc(humanAgo(days)) + '</div>');
|
||||
+ esc(ago2) + '</div>');
|
||||
}
|
||||
}
|
||||
|
||||
/* Nothing renderable (e.g. a lone future date): no popup. */
|
||||
if (!lines.length) return Promise.resolve(null);
|
||||
|
||||
return Promise.resolve('<div class="popup-date">' + lines.join('') + '</div>');
|
||||
}
|
||||
|
|
@ -981,9 +992,10 @@
|
|||
return isNaN(d.getTime()) ? null : d;
|
||||
}
|
||||
|
||||
/* Whole-day difference between two Dates, floored (never negative). */
|
||||
/* Whole-day difference b − a, floored. Negative when b precedes a,
|
||||
so callers can detect future dates instead of mislabelling them. */
|
||||
function daysBetween(a, b) {
|
||||
var ms = Math.abs(b.getTime() - a.getTime());
|
||||
var ms = b.getTime() - a.getTime();
|
||||
return Math.floor(ms / 86400000);
|
||||
}
|
||||
|
||||
|
|
@ -1005,9 +1017,12 @@
|
|||
return (approx ? '~' : '') + y + ' year' + (y === 1 ? '' : 's');
|
||||
}
|
||||
|
||||
/* Past-tense phrasing for a date N days in the past. */
|
||||
/* Past-tense phrasing for a date N days in the past. Returns '' for
|
||||
future dates (negative N), mirroring now.js, so callers render
|
||||
nothing rather than a false "N days ago". */
|
||||
function humanAgo(days) {
|
||||
if (days <= 0) return 'today';
|
||||
if (days < 0) return ''; /* future / clock skew */
|
||||
if (days === 0) return 'today';
|
||||
if (days === 1) return 'yesterday';
|
||||
if (days < 14) return days + ' days ago';
|
||||
return humanDuration(days, true) + ' ago';
|
||||
|
|
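Taken together, the three date changes above make the sign of the day difference meaningful. The sketch below restates them in simplified form: `humanAgo` here omits the `humanDuration` branch for spans of 14 days or more (that function is only partly visible in the diff), and `startedSuffix` is a hypothetical helper for the `' · started '` concatenation.

```javascript
var MS_PER_DAY = 86400000;

// Signed whole-day difference b - a; negative when b precedes a.
function daysBetween(a, b) {
  return Math.floor((b.getTime() - a.getTime()) / MS_PER_DAY);
}

// Simplified humanAgo: '' for future dates, no week/month rounding.
function humanAgo(days) {
  if (days < 0) return '';
  if (days === 0) return 'today';
  if (days === 1) return 'yesterday';
  return days + ' days ago';
}

// Hypothetical helper: omit the "started" clause entirely when the
// start date lies in the future.
function startedSuffix(start, today) {
  var ago = humanAgo(daysBetween(start, today));
  return ago ? ' · started ' + ago : '';
}
```

With the earlier `Math.abs` version, a start date eight days ahead would have been reported as "8 days ago"; with the signed difference it produces an empty string and the caller renders nothing.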
|
|||
|
|
@ -23,6 +23,9 @@
|
|||
|
||||
/* Read ?p= from the query string for deep linking. */
|
||||
var qs = new URLSearchParams(window.location.search);
|
||||
/* Keep the canonical URL clean on plain loads: only sync ?p= back to
|
||||
the URL when one was already present or the user navigates. */
|
||||
var syncUrl = qs.has('p');
|
||||
var initPage = parseInt(qs.get('p'), 10);
|
||||
if (!isNaN(initPage) && initPage >= 1 && initPage <= pageCount) {
|
||||
currentPage = initPage;
|
||||
|
|
@ -47,7 +50,7 @@
|
|||
|
||||
/* Replace URL so the page is bookmarkable at the current position.
|
||||
The back button still returns to the landing page. */
|
||||
history.replaceState(null, '', '?p=' + currentPage);
|
||||
if (syncUrl) history.replaceState(null, '', '?p=' + currentPage);
|
||||
|
||||
/* Preload the adjacent pages for smooth turning. */
|
||||
if (currentPage > 1) new Image().src = pages[currentPage - 2];
|
||||
|
|
@ -132,4 +135,5 @@
|
|||
------------------------------------------------------------------ */
|
||||
|
||||
navigate(currentPage);
|
||||
syncUrl = true; /* any later navigate() is a user action — sync from here on */
|
||||
}());
|
||||
|
|
|
|||
|
|
@ -113,12 +113,17 @@
|
|||
/* ---- URL extraction ---- */
|
||||
|
||||
/* Normalise a URL to a pathname for lookup in epistemicMeta.
|
||||
Pagefind results use full URLs; semantic results use relative paths. */
|
||||
Pagefind results use full URLs; semantic results use relative paths.
|
||||
epistemicMeta keys are emitted as routed paths (".../index.html"),
|
||||
while result links use the clean directory form (".../"), so the
|
||||
trailing-slash form must be expanded before lookup. */
|
||||
function normUrl(href) {
|
||||
if (!href) return null;
|
||||
try {
|
||||
var u = new URL(href, window.location.origin);
|
||||
return u.pathname;
|
||||
var p = u.pathname;
|
||||
if (p.charAt(p.length - 1) === '/') p += 'index.html';
|
||||
return p;
|
||||
} catch (e) {
|
||||
return href;
|
||||
}
|
||||
|
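A standalone version of the normalisation makes the trailing-slash expansion easy to check. In this sketch the base origin is passed as a parameter (the file itself uses `window.location.origin`), and `https://example.org` in the usage below is a placeholder.

```javascript
// Same logic as normUrl in the diff, with the base origin made explicit
// so it can run outside a browser.
function normUrl(href, origin) {
  if (!href) return null;
  try {
    var u = new URL(href, origin);
    var p = u.pathname;
    // Directory-style links ("/essays/foo/") map to the routed file.
    if (p.charAt(p.length - 1) === '/') p += 'index.html';
    return p;
  } catch (e) {
    return href;
  }
}
```

Both a full Pagefind URL and a relative semantic-search path then resolve to the same `.../index.html` form before lookup.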
|
@ -268,7 +273,12 @@
|
|||
if (!el) return;
|
||||
el.addEventListener('input', function () {
|
||||
var v = el.value.trim();
|
||||
state[field] = v !== '' ? Math.max(0, Math.min(100, parseInt(v, 10) || 0)) : null;
|
||||
var n = parseInt(v, 10);
|
||||
/* Non-numeric input deactivates the filter (null) rather
|
||||
than coercing to an always-matching >= 0 threshold. */
|
||||
state[field] = (v !== '' && !isNaN(n))
|
||||
? Math.max(0, Math.min(100, n))
|
||||
: null;
|
||||
loadMeta().then(applyFilters);
|
||||
});
|
||||
});
|
||||
|
|
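The threshold parsing introduced above can be isolated as a small function. `parseThreshold` is a hypothetical name; the body follows the new expression in the diff.

```javascript
// Parse a 0..100 threshold from a text input. Returns null (filter off)
// for empty or non-numeric input instead of coercing to 0.
function parseThreshold(raw) {
  var v = String(raw).trim();
  var n = parseInt(v, 10);
  return (v !== '' && !isNaN(n))
    ? Math.max(0, Math.min(100, n))
    : null;
}
```

Note that `parseInt` still accepts a numeric prefix, so an input such as `12abc` is read as 12; only input with no leading digits deactivates the filter.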
|
|||
|
|
@ -7,11 +7,18 @@
|
|||
'use strict';
|
||||
|
||||
window.addEventListener('DOMContentLoaded', function () {
|
||||
var ui = new PagefindUI({
|
||||
/* If the Pagefind bundle failed to load (e.g. 404), skip only the
|
||||
Pagefind setup — the rest of this handler must still run. */
|
||||
var ui = null;
|
||||
if (typeof PagefindUI === 'undefined') {
|
||||
console.warn('search.js: PagefindUI not loaded — keyword search disabled.');
|
||||
} else {
|
||||
ui = new PagefindUI({
|
||||
element: '#search',
|
||||
showImages: false,
|
||||
excerptLength: 30,
|
||||
});
|
||||
}
|
||||
|
||||
/* Timing instrumentation ------------------------------------------ */
|
||||
var timingEl = document.getElementById('search-timing');
|
||||
|
|
@ -46,7 +53,7 @@
|
|||
/* Pre-fill from URL parameter and trigger the search -------------- */
|
||||
var params = new URLSearchParams(window.location.search);
|
||||
var q = params.get('q');
|
||||
if (q) {
|
||||
if (q && ui) {
|
||||
startTime = performance.now();
|
||||
ui.triggerSearch(q);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -88,6 +88,15 @@
|
|||
}
|
||||
|
||||
function onKeyUp(e) {
|
||||
/* Typing capitals in the annotation picker's note input (or any
|
||||
other editable field) releases Shift — don't re-summon the
|
||||
toolbar over the UI the user is typing into. */
|
||||
var t = e.target;
|
||||
if (t && t.nodeType === Node.ELEMENT_NODE) {
|
||||
if (popup.contains(t)) return;
|
||||
if (picker && picker.contains(t)) return;
|
||||
if (t.isContentEditable || t.closest('input, textarea')) return;
|
||||
}
|
||||
if (e.shiftKey || e.key === 'End' || e.key === 'Home') {
|
||||
clearTimeout(showTimer);
|
||||
showTimer = setTimeout(tryShow, SHOW_DELAY);
|
||||
|
|
|
|||
|
|
@ -39,10 +39,17 @@
|
|||
Index loading — fetch once, lazily
|
||||
------------------------------------------------------------------ */
|
||||
|
||||
/* In-flight promise so concurrent first searches share a single
|
||||
index fetch (mirrors loadModelPromise below). Without this guard,
|
||||
two rapid keystrokes would each fetch semantic-index.bin and
|
||||
semantic-meta.json before the first resolves. */
|
||||
var loadIndexPromise = null;
|
||||
|
||||
function loadIndex() {
|
||||
if (indexReady) return Promise.resolve();
|
||||
if (loadIndexPromise) return loadIndexPromise;
|
||||
|
||||
return Promise.all([
|
||||
loadIndexPromise = Promise.all([
|
||||
fetch('/data/semantic-index.bin').then(function (r) {
|
||||
if (!r.ok) throw new Error('semantic-index.bin not found');
|
||||
return r.arrayBuffer();
|
||||
|
|
@ -54,8 +61,23 @@
|
|||
]).then(function (results) {
|
||||
vectors = new Float32Array(results[0]);
|
||||
meta = results[1];
|
||||
/* Consistency check: a stale CDN-cached bin/json pair would
|
||||
otherwise produce NaN scores and silently garbage ranking. */
|
||||
if (vectors.length !== meta.length * DIM) {
|
||||
console.error('semantic-search: index/meta size mismatch ('
|
||||
+ vectors.length + ' floats vs ' + meta.length + ' × ' + DIM + ')');
|
||||
vectors = null;
|
||||
meta = null;
|
||||
throw new Error('semantic index not available: index/meta size mismatch');
|
||||
}
|
||||
indexReady = true;
|
||||
}).catch(function (err) {
|
||||
/* Allow a retry on the next call instead of caching the
|
||||
failed promise forever. */
|
||||
loadIndexPromise = null;
|
||||
throw err;
|
||||
});
|
||||
return loadIndexPromise;
|
||||
}
|
||||
|
||||
/* ------------------------------------------------------------------
|
||||
|
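The in-flight guard added to `loadIndex` is an instance of a general pattern: remember the pending promise, hand it to every concurrent caller, and forget it on failure so the next call can retry. A minimal sketch, where `shareInFlight` is a hypothetical helper and the loader is any function returning a promise:

```javascript
// Wrap a promise-returning loader so concurrent callers share one run.
// A rejection clears the cached promise, allowing a later retry.
function shareInFlight(loader) {
  var pending = null;
  return function load() {
    if (pending) return pending;
    pending = loader().catch(function (err) {
      pending = null;
      throw err;
    });
    return pending;
  };
}
```

Unlike the file's version, this sketch has no separate `indexReady` flag; a resolved promise simply stays cached and is returned to later callers.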
|
@ -114,14 +136,23 @@
|
|||
});
|
||||
}
|
||||
|
||||
/* Generation token: each runSearch call invalidates all still-in-flight
|
||||
predecessors, so a stale (earlier) query's results can never render
|
||||
after a newer query's. */
|
||||
var searchGeneration = 0;
|
||||
|
||||
function runSearch(query) {
|
||||
var gen = ++searchGeneration;
|
||||
|
||||
query = query.trim();
|
||||
if (!query) { clearResults(); return; }
|
||||
|
||||
setStatus('Searching…');
|
||||
|
||||
var indexPromise = loadIndex().catch(function (err) {
|
||||
if (gen === searchGeneration) {
|
||||
setStatus('Semantic index not available — run make build first.');
|
||||
}
|
||||
throw err;
|
||||
});
|
||||
var modelPromise = loadModel();
|
||||
|
|
@ -130,12 +161,14 @@
|
|||
var pipe = results[1];
|
||||
return pipe(query, { pooling: 'mean', normalize: true });
|
||||
}).then(function (output) {
|
||||
if (gen !== searchGeneration) return; /* superseded by a newer query */
|
||||
var queryVec = output.data; /* Float32Array, length 384 */
|
||||
var scores = cosineSims(queryVec);
|
||||
var hits = topK(scores);
|
||||
renderResults(hits);
|
||||
setStatus(hits.length ? '' : 'No results found.');
|
||||
}).catch(function (err) {
|
||||
if (gen !== searchGeneration) return; /* superseded by a newer query */
|
||||
if (err.message && err.message.indexOf('not available') === -1) {
|
||||
setStatus('Search error — see console for details.');
|
||||
console.error('semantic-search:', err);
|
||||
|
|
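The generation token used by `runSearch` can be sketched on its own. `latestOnly` is a hypothetical helper name; each call to `begin()` invalidates every earlier ticket, which is the property the `gen !== searchGeneration` checks rely on.

```javascript
// Create a "latest wins" gate. begin() returns an isCurrent() check that
// turns false as soon as a newer begin() has been issued.
function latestOnly() {
  var generation = 0;
  return function begin() {
    var mine = ++generation;
    return function isCurrent() {
      return mine === generation;
    };
  };
}
```

An async handler would call `begin()` on entry and test `isCurrent()` before rendering or setting status, so a slow earlier query cannot overwrite the output of a newer one.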
|
|||
|
|
@ -108,11 +108,26 @@
|
|||
}
|
||||
}
|
||||
|
||||
function loadTransclusion(el) {
|
||||
/* Nested transclusion limits: ancestors carries the chain of srcs
|
||||
* currently being expanded (cycle guard — a self-transcluding page
|
||||
* must not loop), and MAX_DEPTH caps pathological nesting. */
|
||||
var MAX_DEPTH = 3;
|
||||
|
||||
function loadTransclusion(el, depth, ancestors) {
|
||||
depth = depth || 0;
|
||||
ancestors = ancestors || [];
|
||||
|
||||
var src = el.dataset.src;
|
||||
var section = el.dataset.section || null;
|
||||
if (!src) return;
|
||||
|
||||
if (depth >= MAX_DEPTH || ancestors.indexOf(src) !== -1) {
|
||||
el.classList.add('transclude--error');
|
||||
el.textContent = '[transclusion omitted (cycle or depth limit): '
|
||||
+ src + (section ? '#' + section : '') + ']';
|
||||
return;
|
||||
}
|
||||
|
||||
el.classList.add('transclude--loading');
|
||||
|
||||
fetchPage(src)
|
||||
|
|
@ -138,6 +153,14 @@
|
|||
el.classList.replace('transclude--loading', 'transclude--loaded');
|
||||
el.appendChild(wrapper);
|
||||
|
||||
/* The fetched page may itself contain transclusion
|
||||
placeholders — process them too, extending the
|
||||
ancestor chain for cycle/depth guarding. */
|
||||
var chain = ancestors.concat(src);
|
||||
wrapper.querySelectorAll('div.transclude').forEach(function (nested) {
|
||||
loadTransclusion(nested, depth + 1, chain);
|
||||
});
|
||||
|
||||
reinitFragment(el);
|
||||
})
|
||||
.catch(function (err) {
|
||||
|
|
@ -147,6 +170,8 @@
|
|||
}
|
||||
|
||||
document.addEventListener('DOMContentLoaded', function () {
|
||||
document.querySelectorAll('div.transclude').forEach(loadTransclusion);
|
||||
document.querySelectorAll('div.transclude').forEach(function (el) {
|
||||
loadTransclusion(el);
|
||||
});
|
||||
});
|
||||
}());
|
||||
|
|
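The cycle and depth guard can be exercised without fetching anything. The sketch below assumes an in-memory map from page `src` to the list of pages it transcludes; `expand` and the `loaded:`/`omitted:` labels are hypothetical, while `MAX_DEPTH` and the ancestor-chain check follow the diff.

```javascript
var MAX_DEPTH = 3; // same cap as in the diff

// Walk nested transclusions over an in-memory page map, recording which
// sources would be loaded and which would be omitted by the guard.
function expand(pages, src, depth, ancestors, out) {
  depth = depth || 0;
  ancestors = ancestors || [];
  out = out || [];

  if (depth >= MAX_DEPTH || ancestors.indexOf(src) !== -1) {
    out.push('omitted:' + src);
    return out;
  }
  out.push('loaded:' + src);

  // Extend the chain per branch so siblings do not block each other.
  var chain = ancestors.concat(src);
  (pages[src] || []).forEach(function (nested) {
    expand(pages, nested, depth + 1, chain, out);
  });
  return out;
}
```

Because the chain is copied with `concat` rather than mutated, the same page may legitimately appear in two sibling branches; only a page that transcludes one of its own ancestors is cut off.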
|
|||
|
|
@ -94,6 +94,9 @@
|
|||
function isDark() {
|
||||
var t = document.documentElement.dataset.theme;
|
||||
if (t === 'dark') return true;
|
||||
/* cappuccino is a dark-brown theme (light text on #553a28) — charts
|
||||
need the dark palette or axis labels become unreadable. */
|
||||
if (t === 'cappuccino') return true;
|
||||
if (t === 'light') return false;
|
||||
return window.matchMedia('(prefers-color-scheme: dark)').matches;
|
||||
}
|
||||
|
|
|
|||
|
[binary image added: 32 KiB]
|
[binary image added: 79 KiB]
|
|
@ -1,13 +1,28 @@
|
|||
{
|
||||
"name": "levineuwirth.org",
|
||||
"name": "Levi Neuwirth",
|
||||
"short_name": "ln",
|
||||
"description": "Personal site of Levi Neuwirth — essays, research, music, and photography.",
|
||||
"start_url": "/",
|
||||
"scope": "/",
|
||||
"icons": [
|
||||
{
|
||||
"src": "/web-app-manifest-192x192.png",
|
||||
"sizes": "192x192",
|
||||
"type": "image/png",
|
||||
"purpose": "any"
|
||||
},
|
||||
{
|
||||
"src": "/web-app-manifest-192x192.png",
|
||||
"sizes": "192x192",
|
||||
"type": "image/png",
|
||||
"purpose": "maskable"
|
||||
},
|
||||
{
|
||||
"src": "/web-app-manifest-512x512.png",
|
||||
"sizes": "512x512",
|
||||
"type": "image/png",
|
||||
"purpose": "any"
|
||||
},
|
||||
{
|
||||
"src": "/web-app-manifest-512x512.png",
|
||||
"sizes": "512x512",
|
||||
|
|
@ -15,7 +30,7 @@
|
|||
"purpose": "maskable"
|
||||
}
|
||||
],
|
||||
"theme_color": "#ffffff",
|
||||
"background_color": "#ffffff",
|
||||
"theme_color": "#16140f",
|
||||
"background_color": "#16140f",
|
||||
"display": "standalone"
|
||||
}
|
||||
|
[binary image changed: 1.3 KiB before, 22 KiB after]
|
[binary image changed: 7.8 KiB before, 106 KiB after]
|
|
@ -17,24 +17,27 @@
|
|||
$body$
|
||||
$if(backlinks)$
|
||||
<footer class="page-meta-footer">
|
||||
$else$
|
||||
$if(similar-links)$
|
||||
<footer class="page-meta-footer">
|
||||
$endif$
|
||||
$endif$
|
||||
$if(backlinks)$
|
||||
<div class="meta-footer-full meta-footer-backlinks" id="backlinks">
|
||||
<h3>Backlinks</h3>
|
||||
$backlinks$
|
||||
</div>
|
||||
$endif$
|
||||
$if(similar-links)$
|
||||
<div class="meta-footer-full meta-footer-similar" id="similar-links">
|
||||
<h3>Related</h3>
|
||||
$similar-links$
|
||||
</div>
|
||||
$endif$
|
||||
$if(backlinks)$
|
||||
</footer>
|
||||
$else$
|
||||
$if(similar-links)$
|
||||
<footer class="page-meta-footer">
|
||||
<div class="meta-footer-full meta-footer-similar" id="similar-links">
|
||||
<h3>Related</h3>
|
||||
$similar-links$
|
||||
</div>
|
||||
</footer>
|
||||
$endif$
|
||||
$endif$
|
||||
|
|
|
|||
|
|
@ -14,8 +14,10 @@ $if(home)$<meta property="og:title" content="Levi Neuwirth">$else$$if(title)$<me
|
|||
$if(description)$<meta property="og:description" content="$description$">$endif$
|
||||
<meta property="og:url" content="$site-url$$url$">
|
||||
$if(date)$<meta property="og:type" content="article">$else$<meta property="og:type" content="website">$endif$
|
||||
<meta property="og:image" content="$site-url$/web-app-manifest-512x512.png">
|
||||
<meta name="twitter:card" content="summary">
|
||||
<meta property="og:image" content="$site-url$/og-image.png">
|
||||
<meta property="og:image:width" content="1200">
|
||||
<meta property="og:image:height" content="630">
|
||||
<meta name="twitter:card" content="summary_large_image">
|
||||
$if(description)$<meta name="twitter:description" content="$description$">$endif$
|
||||
|
||||
<link rel="icon" type="image/png" href="/favicon-96x96.png" sizes="96x96">
|
||||
|
|
|
|||
|
|
@ -2,7 +2,13 @@
|
|||
<nav class="site-nav">
|
||||
<!-- Row 1: primary links -->
|
||||
<div class="nav-row-primary">
|
||||
<a href="/" class="nav-logo" aria-label="Home"></a>
|
||||
<!-- The mark lives in /logo-sprite.svg and is referenced via
|
||||
<use> instead of being inlined: the traced path is ~33 KB,
|
||||
and a per-page inline copy would dwarf most documents. CSS
|
||||
custom properties (--logo-ink/--logo-bg) cascade into the
|
||||
use-element shadow tree, so the two-tone cutout still
|
||||
renders. -->
|
||||
<a href="/" class="nav-logo" aria-label="Home"><svg class="nav-logo__mark" aria-hidden="true" focusable="false"><use href="/logo-sprite.svg#logo-mark"/></svg></a>
|
||||
<div class="nav-primary">
|
||||
<a href="/">Home</a>
|
||||
<a href="/current.html">Current</a>
|
||||
|
|
|
|||
|
|
@ -7,6 +7,9 @@
|
|||
<link rel="stylesheet" href="/css/base.css">
|
||||
<link rel="stylesheet" href="/css/components.css">
|
||||
<link rel="stylesheet" href="/css/score-reader.css">
|
||||
<!-- utils.js must precede theme.js: theme.js reads saved settings via
|
||||
window.lnUtils.safeStorage and silently restores nothing without it. -->
|
||||
<script src="/js/utils.js"></script>
|
||||
<script src="/js/theme.js"></script>
|
||||
</head>
|
||||
<body class="score-reader-page">
|
||||
|
|
|
|||
|
|
@ -49,6 +49,10 @@ EOF
|
|||
bold "── new popup provider ──"
|
||||
NAME=$(prompt "slug (lowercase, used as class + data-popup-source key, e.g. 'zenodo'):")
|
||||
[[ -z "$NAME" ]] && { warn "slug required"; exit 1; }
|
||||
# The slug is interpolated into nginx directives (location /proxy/$NAME/,
|
||||
# set \$upstream_$NAME) — validate like import-photo.sh does so a space,
|
||||
# ';', or '{' can't produce a config that fails to load.
|
||||
[[ "$NAME" =~ ^[a-z0-9-]+$ ]] || { warn "slug must match ^[a-z0-9-]+\$"; exit 1; }
|
||||
|
||||
LABEL=$(prompt "display label (e.g. 'Zenodo'):")
|
||||
[[ -z "$LABEL" ]] && LABEL="$NAME"
|
||||
|
|
@ -107,14 +111,16 @@ fi
|
|||
|
||||
# ── proxy prefix + upstream host derivation ──────────────────────────
|
||||
|
||||
if [[ "$NEEDS_PROXY" -eq 1 ]]; then
|
||||
# UPSTREAM_HOST is derived unconditionally: the no-proxy (direct CORS
|
||||
# fetch) case is exactly when the host must be added to connect-src, so
|
||||
# the checklist's CSP reminder below needs it populated either way.
|
||||
UPSTREAM_HOST=$(printf '%s' "$API_URL" | awk -F/ '{print $3}')
|
||||
if [[ "$NEEDS_PROXY" -eq 1 ]]; then
|
||||
UPSTREAM_PATH=$(printf '%s' "$API_URL" | awk -F/ 'BEGIN{OFS="/"} {$1=""; $2=""; $3=""; print}' | sed 's|^///||')
|
||||
PROXY_PATH="/proxy/$NAME/"
|
||||
PROXY_API_URL="$PROXY_PATH${UPSTREAM_PATH%%\?*}"
|
||||
[[ "$API_URL" == *"?"* ]] && PROXY_API_URL="$PROXY_API_URL?${API_URL#*\?}"
|
||||
else
|
||||
UPSTREAM_HOST=""
|
||||
PROXY_API_URL="$API_URL"
|
||||
fi
|
||||
|
||||
|
|
@ -205,8 +211,9 @@ cat <<EOF
|
|||
EOF
|
||||
|
||||
if [[ "$NEEDS_PROXY" -eq 0 && -n "$UPSTREAM_HOST" ]]; then
|
||||
echo " 5. In static/js/popups.js top-comment: add $UPSTREAM_HOST to the"
|
||||
echo " connect-src CSP list."
|
||||
echo " 5. Add https://$UPSTREAM_HOST to connect-src in"
|
||||
echo " nginx/security-headers.conf (direct CORS fetches are blocked"
|
||||
echo " by CSP otherwise), and mirror it in the popups.js top-comment."
|
||||
fi
|
||||
|
||||
echo
|
||||
|
|
|
|||
128
tools/archive.py
|
|
@ -104,6 +104,30 @@ def err(msg: str) -> None:
|
|||
print(f"[archive] ERROR: {msg}", file=sys.stderr)
|
||||
|
||||
|
||||
def atomic_write_text(path: Path, text: str) -> None:
|
||||
"""Write to a PID-unique temp then os.replace. PROVENANCE.json and
|
||||
the generated index/state files are integrity records — an interrupt
|
||||
mid-write must never leave a truncated file that the next run parses
|
||||
(or mistakes for corruption); fsync makes the rename durable and the
|
||||
PID suffix keeps concurrent runs from sharing a temp file."""
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
tmp = path.with_suffix(path.suffix + f".tmp.{os.getpid()}")
|
||||
try:
|
||||
with tmp.open("w", encoding="utf-8") as f:
|
||||
f.write(text)
|
||||
f.flush()
|
||||
os.fsync(f.fileno())
|
||||
os.replace(tmp, path)
|
||||
except BaseException:
|
||||
tmp.unlink(missing_ok=True)
|
||||
raise
|
||||
|
||||
|
||||
def atomic_write_json(path: Path, obj) -> None:
|
||||
atomic_write_text(
|
||||
path, json.dumps(obj, indent=2, ensure_ascii=False) + "\n")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Manifest / removed.yaml
|
||||
# ---------------------------------------------------------------------------
|
||||
|
|
@ -119,6 +143,15 @@ def load_yaml_list(path: Path) -> list[dict]:
|
|||
if not isinstance(data, list):
|
||||
err(f"{path.name}: expected a YAML list, got {type(data).__name__}")
|
||||
sys.exit(1)
|
||||
# Validate items too: a stray scalar line (`- https://example.com`
|
||||
# instead of `- url: ...`) would otherwise surface much later as an
|
||||
# AttributeError deep inside fetch/wayback/check.
|
||||
for i, item in enumerate(data):
|
||||
if not isinstance(item, dict):
|
||||
err(f"{path.name}: entry {i + 1} is not a mapping "
|
||||
f"(got {type(item).__name__}: {item!r}); "
|
||||
f"each entry must be `- url: ...`")
|
||||
sys.exit(1)
|
||||
return data
|
||||
|
||||
|
||||
|
|
@ -241,7 +274,10 @@ def extract_text_pdf(pdf: Path, txt: Path) -> None:
|
|||
"""Extract plain text from `pdf` into `txt` via pdftotext. On any
|
||||
failure an empty file is written so downstream steps still find it."""
|
||||
try:
|
||||
subprocess.run(["pdftotext", "-q", str(pdf), str(txt)], check=True)
|
||||
# `--` ends option parsing so a slug starting with `-` cannot be
|
||||
# mistaken for a pdftotext option.
|
||||
subprocess.run(["pdftotext", "-q", "--", str(pdf), str(txt)],
|
||||
check=True)
|
||||
except (subprocess.CalledProcessError, FileNotFoundError) as exc:
|
||||
err(f"{pdf.name}: pdftotext failed ({exc}); writing empty text sidecar")
|
||||
txt.write_text("", encoding="utf-8")
|
||||
|
|
@@ -263,6 +299,51 @@ def find_monolith() -> str | None:
     return shutil.which("monolith")
 
 
+MONOLITH_VERSION_FILE = REPO_ROOT / "tools" / "monolith-version.txt"
+
+# Binaries already verified this run — the pin check hashes the binary
+# once, not once per snapshot.
+_monolith_verified: set[str] = set()
+
+
+def _pinned_monolith_sha256() -> str | None:
+    """Parse the `sha256 = <hex>` line from tools/monolith-version.txt.
+    Returns None when the file is missing or unparseable (the caller
+    warns and continues — only a *mismatch* is fatal)."""
+    try:
+        text = MONOLITH_VERSION_FILE.read_text(encoding="utf-8")
+    except OSError:
+        return None
+    m = re.search(r"^\s*sha256\s*=\s*([0-9a-fA-F]{64})\s*$",
+                  text, re.MULTILINE)
+    return m.group(1).lower() if m else None
+
+
+def verify_monolith(mono: str) -> None:
+    """Integrity gate for the snapshot tool itself: the binary that
+    produces committed artifacts must match the SHA-256 pinned in
+    tools/monolith-version.txt. A mismatch is an integrity error (print
+    loudly, exit non-zero, halt `make build`); a missing or unparseable
+    version file is a warning only."""
+    if mono in _monolith_verified:
+        return
+    pinned = _pinned_monolith_sha256()
+    if pinned is None:
+        print(f"[archive] WARNING: {MONOLITH_VERSION_FILE.name} is missing "
+              f"or has no parseable `sha256 = …` line — monolith binary "
+              f"integrity NOT verified ({mono})", file=sys.stderr)
+        _monolith_verified.add(mono)
+        return
+    live = sha256_of(Path(mono))
+    if live != pinned:
+        err(f"monolith binary {mono} fails SHA-256 verification "
+            f"(pinned {pinned}, found {live}). The snapshot tool's bytes "
+            f"do not match tools/monolith-version.txt — re-vendor the "
+            f"binary or update the pin (see that file's instructions).")
+        sys.exit(1)
+    _monolith_verified.add(mono)
+
+
 def body_noarchive(path: Path) -> bool:
     """True if the snapshot declares <meta name=robots ... noarchive> —
     the in-document equivalent of the X-Robots-Tag header."""
@@ -327,6 +408,7 @@ def fetch_html(url: str, dest: Path) -> bool:
             f"tools/bin/monolith (see tools/monolith-version.txt) or set "
             f"$MONOLITH_BIN; HTML snapshot skipped")
         return False
+    verify_monolith(mono)
 
     source = dest.with_suffix(dest.suffix + ".source.part")
     tmp = dest.with_suffix(dest.suffix + ".part")
@@ -715,10 +797,7 @@ def cmd_fetch() -> int:
             "snapshot-quality": quality,
             "wayback": None,
         }
-        prov_path.write_text(
-            json.dumps(prov, indent=2, ensure_ascii=False) + "\n",
-            encoding="utf-8",
-        )
+        atomic_write_json(prov_path, prov)
         log(f"{slug}: archived [{atype}, {quality}] ({prov['bytes']} bytes)")
 
         # --- contribute to the Hakyll index -------------------------------
@@ -730,11 +809,7 @@ def cmd_fetch() -> int:
         }
 
     # archive-index.json is always rewritten to mirror the manifest exactly.
-    INDEX_OUT.parent.mkdir(parents=True, exist_ok=True)
-    INDEX_OUT.write_text(
-        json.dumps(index, indent=2, ensure_ascii=False) + "\n",
-        encoding="utf-8",
-    )
+    atomic_write_json(INDEX_OUT, index)
     log(f"wrote {INDEX_OUT.relative_to(REPO_ROOT)} ({len(index)} entries)")
 
     if skipped:
@@ -785,14 +860,18 @@ def cmd_refresh(argv: list[str]) -> int:
         try:
             prev = json.loads(prov_path.read_text(encoding="utf-8"))
             prev_sha = prev.get("sha256")
-            prev_artifact = slug_dir / prev.get("artifact", "")
+            prev_art_name = prev.get("artifact") or ""
+            prev_artifact = slug_dir / prev_art_name
         except Exception as exc: # noqa: BLE001
             err(f"refresh: cannot parse prior provenance for {slug}: {exc}")
             return 2
         # The prior snapshot must be committed and clean — otherwise
         # `previous-sha256` would point at bytes git can no longer give
-        # back, breaking the auditable replacement contract.
-        if not prev_sha or not prev_artifact.exists():
+        # back, breaking the auditable replacement contract. The empty-
+        # artifact guard matters: without it prev_artifact would be the
+        # slug directory itself, which exists() accepts and sha256_of
+        # then crashes on with IsADirectoryError.
+        if not prev_sha or not prev_art_name or not prev_artifact.is_file():
             err(f"refresh: prior snapshot for {slug} is incomplete; restore "
                 f"its artifact and provenance before replacing it.")
             return 2
@@ -850,11 +929,7 @@ def cmd_refresh(argv: list[str]) -> int:
                 if art_name and (slug_dir / art_name).exists():
                     if prev_sha:
                         new_prov["previous-sha256"] = prev_sha
-                        prov_path.write_text(
-                            json.dumps(new_prov, indent=2,
-                                       ensure_ascii=False) + "\n",
-                            encoding="utf-8",
-                        )
+                        atomic_write_json(prov_path, new_prov)
                         log(f"refresh: recorded previous-sha256 "
                             f"{prev_sha[:12]}…")
                     succeeded = True
@@ -893,7 +968,11 @@ def wayback_save(url: str) -> None:
     """Trigger a fresh Wayback capture via Save Page Now. Best-effort: any
     outcome is tolerated — the resulting URL is read back via the
     availability API (which also surfaces a pre-existing capture)."""
-    req = urllib.request.Request("https://web.archive.org/save/" + url,
+    # Quote only what can't appear raw in a request line (spaces,
+    # control chars); URL structure (:/?&=#) passes through so Save
+    # Page Now sees the original URL shape.
+    req = urllib.request.Request(
+        "https://web.archive.org/save/" + quote(url, safe=":/?&=#"),
         headers={"User-Agent": USER_AGENT})
     try:
         with urllib.request.urlopen(req, timeout=WAYBACK_TIMEOUT):
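To make the effect of the `safe=":/?&=#"` argument in the hunk above concrete, here is a small sketch using the standard library's `urllib.parse.quote`. The `save_url` helper and the sample URLs are illustrative assumptions, not code from the repository. One behaviour that may be worth checking against the intended use: `%` is not in the safe set, so a URL that arrives already percent-encoded has its `%` escaped a second time.

```python
from urllib.parse import quote

SAFE = ":/?&=#"  # the same safe set the hunk passes to quote()


def save_url(url: str) -> str:
    """Hypothetical helper: build the Save Page Now URL as the hunk does."""
    return "https://web.archive.org/save/" + quote(url, safe=SAFE)


# Made-up sample URLs, for illustration only.
plain = save_url("https://example.com/a?x=1&y=2#frag")   # structure passes through
spaced = save_url("https://example.com/a b")             # space becomes %20
encoded = save_url("https://example.com/a%20b")          # % becomes %25 (double-encoded)

if __name__ == "__main__":
    for result in (plain, spaced, encoded):
        print(result)
```

If manifest URLs can contain existing percent-escapes, adding `%` to the safe set is one possible way to leave them alone; whether that matches what Save Page Now expects is not something this excerpt settles.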
@@ -951,10 +1030,7 @@ def cmd_wayback() -> int:
         capture = wayback_lookup(url)
         if capture:
             prov["wayback"] = capture
-            prov_path.write_text(
-                json.dumps(prov, indent=2, ensure_ascii=False) + "\n",
-                encoding="utf-8",
-            )
+            atomic_write_json(prov_path, prov)
             log(f"{slug}: wayback -> {capture}")
             backfilled += 1
         else:
@@ -1073,11 +1149,7 @@ def cmd_check() -> int:
         note = f" -> {new_url}" if new_url else ""
         log(f"check: {url} [{rec['status']}]{note}")
 
-    STATE_OUT.parent.mkdir(parents=True, exist_ok=True)
-    STATE_OUT.write_text(
-        json.dumps(state, indent=2, ensure_ascii=False) + "\n",
-        encoding="utf-8",
-    )
+    atomic_write_json(STATE_OUT, state)
     log(f"check: {tally['live']} live, {tally['moved']} moved, "
         f"{tally['error']} error, {tally['rotted']} rotted "
         f"-> {STATE_OUT.relative_to(REPO_ROOT)}")

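The hunks above replace several `write_text(json.dumps(...))` calls with `atomic_write_json(path, obj)`, whose definition falls outside this excerpt. As a rough sketch of what such a helper could look like, assuming it keeps the same serialization (`indent=2`, `ensure_ascii=False`, trailing newline) and creates the parent directory (which would explain why the `mkdir` calls were dropped), one might write the following. The body is an assumption modelled on the temp-then-`os.replace` pattern used elsewhere in this diff, not the repository's actual implementation.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_write_json(path: Path, obj: Any) -> None:
    """Hypothetical sketch: write JSON to a sibling temp file, then rename.

    A reader never observes a half-written file at `path`: either the old
    content or the complete new content is present.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = json.dumps(obj, indent=2, ensure_ascii=False) + "\n"
    # PID-suffixed temp name in the same directory, so the rename stays on
    # one filesystem and concurrent runs do not share a temp file.
    tmp = path.with_name(path.name + f".tmp.{os.getpid()}")
    try:
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        tmp.unlink(missing_ok=True)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "state" / "archive-index.json"
        atomic_write_json(target, {"slug": "example"})
        print(target.read_text(encoding="utf-8"))
```

Keeping the temp file next to the target matters because `os.replace` is only atomic within a single filesystem.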
@@ -32,7 +32,11 @@ while IFS= read -r -d '' img; do
         skipped=$((skipped + 1))
     else
         echo " webp ${img#"$REPO_ROOT/"}"
-        cwebp -quiet -q 85 "$img" -o "$webp"
+        # Write to a temp name then move: an interrupted cwebp would
+        # otherwise leave a truncated .webp that is newer than its
+        # source, which the staleness gate above then skips forever.
+        cwebp -quiet -q 85 "$img" -o "$webp.part"
+        mv "$webp.part" "$webp"
         converted=$((converted + 1))
     fi
 done < <(find "$REPO_ROOT/static" "$REPO_ROOT/content" \

@@ -7,8 +7,9 @@
 # the site, no third-party request at view time.
 #
 # Run once before deploying. The vendored copy is gitignored
-# (~150 KB total); re-running is safe — the script skips when the
-# files already exist.
+# (~150 KB total); re-running is safe — files that already exist AND
+# match their pinned checksum are skipped; anything missing or
+# mismatched is re-fetched.
 #
 # To bump the pinned versions, set LEAFLET_VERSION / MARKERCLUSTER_VERSION,
 # re-run, then update tools/leaflet-checksums.sha256 with the new hashes.
@@ -39,13 +40,6 @@ files_to_fetch=(
     "$UNPKG_MC|MarkerCluster.Default.css|leaflet.markercluster-${MARKERCLUSTER_VERSION}-MarkerCluster.Default.css"
 )
 
-# Skip the whole step if the canonical entry-point already exists.
-# Force a re-fetch by removing the directory.
-if [ -f "$LEAFLET_DIR/leaflet.js" ] && [ -f "$LEAFLET_DIR/leaflet.markercluster.js" ]; then
-    echo "leaflet: already vendored at $LEAFLET_DIR (skipping)"
-    exit 0
-fi
-
 mkdir -p "$LEAFLET_DIR/images"
 
 verify_or_warn() {
@@ -71,15 +65,35 @@ verify_or_warn() {
     fi
 }
 
+# Per-file skip: existing files are skipped only after re-verifying
+# their checksum, so a partial or tampered file from an interrupted
+# earlier run can never be silently accepted. Downloads land in a
+# .part temp and are only moved into place after verification — a
+# failed verification leaves nothing at the final path.
 for entry in "${files_to_fetch[@]}"; do
     IFS='|' read -r url_base local_path pin_key <<<"$entry"
     src_name="${local_path##*/}"
     target="$LEAFLET_DIR/$local_path"
     mkdir -p "$(dirname "$target")"
 
+    if [ -f "$target" ]; then
+        if verify_or_warn "$target" "$pin_key"; then
+            echo "leaflet: $local_path present and verified (skipping)"
+            continue
+        fi
+        echo "leaflet: $local_path failed verification — re-fetching" >&2
+        rm -f "$target"
+    fi
+
     echo "leaflet: fetching $local_path ($pin_key)"
-    curl -fsSL --progress-bar "$url_base/$src_name" -o "$target"
-    verify_or_warn "$target" "$pin_key"
+    tmp="$target.part"
+    curl -fsSL --progress-bar "$url_base/$src_name" -o "$tmp"
+    if ! verify_or_warn "$tmp" "$pin_key"; then
+        rm -f "$tmp"
+        echo "leaflet: refusing to vendor unverified $local_path" >&2
+        exit 1
+    fi
+    mv "$tmp" "$target"
 done
 
 echo "leaflet: vendored to $LEAFLET_DIR"

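The download loop above follows a "verify, then promote" pattern: the bytes land in a `.part` file, are checked against a pinned SHA-256, and only then are moved to the final path. The following is a minimal Python sketch of the same idea, written under the assumption that the pin is a hex digest string; the function names `sha256_file` and `promote_if_verified` are illustrative and do not come from the repository.

```python
import hashlib
import os
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def promote_if_verified(tmp: Path, target: Path, expected_sha256: str) -> bool:
    """Move `tmp` to `target` only when its digest matches the pin.

    On a mismatch the temp file is deleted and nothing appears at
    `target`, so a later "file already exists" check cannot accept
    unverified bytes.
    """
    if sha256_file(tmp) != expected_sha256.lower():
        tmp.unlink(missing_ok=True)
        return False
    os.replace(tmp, target)
    return True
```

A caller would download to `target.with_name(target.name + ".part")`, call `promote_if_verified`, and treat a `False` return as a hard failure, mirroring the `exit 1` in the shell version.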
@@ -68,8 +68,13 @@ fetch() {
         return
     fi
     echo " fetch $src"
-    curl -fsSL --progress-bar "$BASE_URL/$src" -o "$dst"
-    verify_sha "$src" "$dst"
+    # Download to a temp name and move into place only after
+    # verification: an interrupted curl must never leave a partial
+    # file at the final path, where the present-file skip (or, for an
+    # unpinned file, nothing at all) would accept it forever.
+    curl -fsSL --progress-bar "$BASE_URL/$src" -o "$dst.part"
+    verify_sha "$src" "$dst.part"
+    mv "$dst.part" "$dst"
 }
 
 if [ ! -f "$CHECKSUMS" ]; then

253
tools/embed.py
@@ -5,20 +5,36 @@ embed.py — Build-time embedding pipeline.
 Produces two outputs from _site/**/*.html:
 
 data/similar-links.json Page-level similarity (for "Related" footer section)
-data/semantic-index.bin Paragraph vectors as raw Float32 array (N × DIM)
+data/semantic-index.bin Paragraph vectors as raw Float32 array (N × PARA_DIM)
 data/semantic-meta.json Paragraph metadata: [{url, title, heading, excerpt}]
 
-Both use all-MiniLM-L6-v2 (384 dims) — the same model shipped to the browser
-via transformers.js for query-time semantic search.
+Two models, one process:
+
+* Pages use nomic-embed-text-v1.5 (768 dims) — build-time only, never
+shipped to the browser. Chosen for its well-separated cosine scores on
+small corpora, which keeps the MIN_SCORE gate meaningful so every essay
+reliably gets a "Related" footer section.
+
+* Paragraphs use all-MiniLM-L6-v2 (384 dims) — must match what the
+browser runs via transformers.js (static/js/semantic-search.js) since
+query vectors are dotted against the shipped index.
 
 Called by `make build` when .venv exists. Failures are non-fatal.
-Staleness check: skips if all output files are newer than every HTML in _site/.
+
+Staleness: both passes are content-hash cached (data/embed-cache-*.npz),
+so an unchanged site re-embeds nothing and loads no model — only the
+HTML extraction pass runs. There is deliberately no mtime-based skip:
+stamp-build-time.py rewrites every page's footer after this script runs,
+so "are outputs newer than the HTML" is always false and a check based
+on it can never fire.
 """
 
+import hashlib
 import json
 import os
 import re
 import sys
+import zipfile
 from pathlib import Path
 
 import faiss
@@ -35,13 +51,48 @@ SITE_DIR = REPO_ROOT / "_site"
 SIMILAR_OUT = REPO_ROOT / "data" / "similar-links.json"
 SEMANTIC_BIN = REPO_ROOT / "data" / "semantic-index.bin"
 SEMANTIC_META = REPO_ROOT / "data" / "semantic-meta.json"
+# Content-addressed caches, one per pass. Keyed by sha256 of the (prefixed)
+# input text; invalidated wholesale on model name/revision/dim change.
+# Gitignored — build artifacts, not source. Survive `make clean`.
+PAGE_CACHE = REPO_ROOT / "data" / "embed-cache-pages.npz"
+PARA_CACHE = REPO_ROOT / "data" / "embed-cache-paragraphs.npz"
 
-MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"
-# Pinned to a specific HuggingFace commit so a future model bump can't
-# silently change embedding semantics. Bump deliberately when validating
-# (and re-run a full embed pass to refresh data/semantic-* + similar-links).
-MODEL_REVISION = "c9745ed1d9f207416be6d2e6f8de32d1f16199bf"
-DIM = 384
+# Two models, deliberately split:
+#
+# PARA_MODEL — embeds paragraphs for data/semantic-index.bin. This index
+# is fetched by the browser at /search/ and ranked against query vectors
+# computed client-side. The client (static/js/semantic-search.js) embeds
+# queries with MiniLM-L6-v2 via transformers.js, so the build-time model
+# must match exactly — both the architecture and the embedding dimension
+# are part of the wire contract.
+#
+# PAGE_MODEL — embeds full pages for data/similar-links.json. This file
+# is consumed only at Hakyll-build time (SimilarLinks.hs) and never
+# shipped to the browser, so it is free to use a different, stronger
+# model. nomic-embed-text-v1.5 produces well-separated cosine scores on
+# small corpora (top neighbours at 0.7–0.9 instead of MiniLM's compressed
+# 0.1–0.3), so the MIN_SCORE gate below is meaningful and every essay
+# reliably gets a "Related" footer section.
+#
+# Both pins are deliberate. Bump only when validating and re-run a full
+# embed pass to refresh the corresponding output files.
+
+PARA_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"
+PARA_MODEL_REVISION = "c9745ed1d9f207416be6d2e6f8de32d1f16199bf"
+PARA_DIM = 384
+
+PAGE_MODEL_NAME = "nomic-ai/nomic-embed-text-v1.5"
+PAGE_MODEL_REVISION = "e9b6763023c676ca8431644204f50c2b100d9aab"
+# The weights repo above declares its modeling code via auto_map in a
+# SEPARATE repo (nomic-ai/nomic-bert-2048), which `revision=` does NOT
+# pin — without this second pin, trust_remote_code executes whatever is
+# at that repo's head at build time.
+PAGE_MODEL_CODE_REVISION = "7710840340a098cfb869c4f65e87cf2b1b70caca"
+PAGE_DIM = 768
+# Nomic requires task-prefixed input. Documents (corpus side) get
+# "search_document: "; queries would get "search_query: ". similar-links
+# only ever embeds documents, so the prefix is constant here.
+PAGE_PREFIX = "search_document: "
 
 TOP_N = 5 # similar-links: neighbours per page
 MIN_SCORE = 0.30 # similar-links: discard weak matches
@@ -69,33 +120,111 @@ PORTAL_BODY_ATTR = "data-portal"
 
 
 def atomic_write_bytes(path: Path, data: bytes) -> None:
-    """Write to path.tmp then os.replace, so an interrupt mid-write
-    cannot leave a truncated file that the next build/serve loads."""
+    """Write to a PID-unique temp then os.replace: an interrupt mid-write
+    cannot leave a truncated file at the final path, fsync makes the
+    rename durable across power loss, and the PID suffix keeps two
+    concurrent runs from interleaving writes into one temp file."""
+    path.parent.mkdir(parents=True, exist_ok=True)
-    tmp = path.with_suffix(path.suffix + ".tmp")
-    tmp.write_bytes(data)
+    tmp = path.with_suffix(path.suffix + f".tmp.{os.getpid()}")
+    try:
+        with tmp.open("wb") as f:
+            f.write(data)
+            f.flush()
+            os.fsync(f.fileno())
         os.replace(tmp, path)
+    except BaseException:
+        tmp.unlink(missing_ok=True)
+        raise
 
 
 def atomic_write_text(path: Path, text: str) -> None:
     atomic_write_bytes(path, text.encode("utf-8"))
 
 
+# ---------------------------------------------------------------------------
+# Page-embedding cache
+# ---------------------------------------------------------------------------
+#
+# Loading the nomic model and embedding 26 pages on CPU takes ~3 minutes
+# every `make build`. Pages rarely change between builds — usually one
+# essay is edited and everything else is identical. This cache stores
+# one nomic vector per page content hash so unchanged pages are reused
+# verbatim and only edited/new pages are re-embedded. A fully-warm cache
+# skips the model load entirely.
+
+def content_hash(text: str) -> str:
+    return hashlib.sha256(text.encode("utf-8")).hexdigest()
+
+
+def load_vec_cache(path: Path, model: str, revision: str,
+                   dim: int) -> dict[str, np.ndarray]:
+    """Load {hash: vector} from disk. Returns an empty dict if the cache
+    is absent, unreadable, or pinned to a different model — in those
+    cases save_vec_cache() will overwrite the stale file on next save."""
+    if not path.exists():
+        return {}
+    try:
+        npz = np.load(path, allow_pickle=False)
+        if (npz["model"].item() != model or
+                npz["revision"].item() != revision or
+                int(npz["dim"].item()) != dim):
+            return {}
+        hashes = npz["hashes"]
+        vectors = npz["vectors"]
+        if vectors.shape != (len(hashes), dim):
+            return {}
+        return {h.item(): vectors[i] for i, h in enumerate(hashes)}
+    except (OSError, KeyError, ValueError, EOFError,
+            zipfile.BadZipFile) as e:
+        print(f"embed.py: cache {path.name} unreadable ({e}) — discarding",
+              file=sys.stderr)
+        return {}
+
+
+def save_vec_cache(path: Path, model: str, revision: str, dim: int,
+                   cache: dict[str, np.ndarray]) -> None:
+    """Atomically persist {hash: vector}. Empty cache writes an empty
+    file so a subsequent load returns {} cleanly (instead of falling
+    through to the "no file" path)."""
+    if cache:
+        hashes = np.array(list(cache.keys()))
+        vectors = np.stack(list(cache.values())).astype(np.float32)
+    else:
+        hashes = np.array([], dtype="U64")
+        vectors = np.zeros((0, dim), dtype=np.float32)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    # Pass an open file handle, not a path: np.savez_compressed appends
+    # ".npz" to bare paths, which would mangle our atomic-rename target.
+    # PID-unique temp so concurrent runs can't interleave; fsync so the
+    # rename is durable.
+    tmp = path.with_suffix(path.suffix + f".tmp.{os.getpid()}")
+    try:
+        with open(tmp, "wb") as f:
+            np.savez_compressed(
+                f,
+                model=model,
+                revision=revision,
+                dim=dim,
+                hashes=hashes,
+                vectors=vectors,
+            )
+            f.flush()
+            os.fsync(f.fileno())
+        os.replace(tmp, path)
+    except BaseException:
+        tmp.unlink(missing_ok=True)
+        raise
+
+
 STRIP_SELECTORS = [
     "nav", "footer", "#toc", ".link-popup", "script", "style",
     ".page-meta-footer", ".metadata", "[data-pagefind-ignore]",
+    # The no-JS footnotes fallback duplicates each sidenote's text
+    # verbatim at the document end — indexing it would double every
+    # footnote in search results and skew page similarity.
+    "section.footnotes",
 ]
 
-# ---------------------------------------------------------------------------
-# Staleness check
-# ---------------------------------------------------------------------------
-
-def needs_update() -> bool:
-    outputs = [SIMILAR_OUT, SEMANTIC_BIN, SEMANTIC_META]
-    if not all(p.exists() for p in outputs):
-        return True
-    oldest = min(p.stat().st_mtime for p in outputs)
-    return any(html.stat().st_mtime > oldest for html in SITE_DIR.rglob("*.html"))
-
 # ---------------------------------------------------------------------------
 # HTML parsing helpers
 # ---------------------------------------------------------------------------
@@ -191,10 +320,6 @@ def main() -> int:
         print("embed.py: _site/ not found — skipping", file=sys.stderr)
         return 0
 
-    if not needs_update():
-        print("embed.py: all outputs up to date — skipping")
-        return 0
-
     # --- Extract pages + paragraphs in one pass ---
     print("embed.py: extracting pages…")
     pages = []
@@ -211,18 +336,44 @@ def main() -> int:
         print("embed.py: no indexable pages found", file=sys.stderr)
         return 0
 
-    # --- Load model once for both tasks ---
-    print(f"embed.py: loading {MODEL_NAME}@{MODEL_REVISION[:8]}…")
-    model = SentenceTransformer(MODEL_NAME, revision=MODEL_REVISION)
+    # --- Similar-links (page level, nomic, content-hash cached) ---
+    cache = load_vec_cache(PAGE_CACHE, PAGE_MODEL_NAME,
+                           PAGE_MODEL_REVISION, PAGE_DIM)
+    page_inputs = [PAGE_PREFIX + p["text"] for p in pages]
+    hashes = [content_hash(t) for t in page_inputs]
+    miss_idxs = [i for i, h in enumerate(hashes) if h not in cache]
 
-    # --- Similar-links (page level) ---
-    print(f"embed.py: embedding {len(pages)} pages…")
-    page_vecs = model.encode(
-        [p["text"] for p in pages],
+    print(f"embed.py: pages: {len(pages) - len(miss_idxs)} cached / "
+          f"{len(miss_idxs)} to embed")
+
+    if miss_idxs:
+        print(f"embed.py: loading {PAGE_MODEL_NAME}@{PAGE_MODEL_REVISION[:8]}…")
+        page_model = SentenceTransformer(
+            PAGE_MODEL_NAME, revision=PAGE_MODEL_REVISION, trust_remote_code=True,
+            # code_revision pins the auto_map modeling repo; it must reach
+            # both AutoConfig and AutoModel.from_pretrained.
+            model_kwargs={"code_revision": PAGE_MODEL_CODE_REVISION},
+            config_kwargs={"code_revision": PAGE_MODEL_CODE_REVISION},
+        )
+        new_vecs = page_model.encode(
+            [page_inputs[i] for i in miss_idxs],
             normalize_embeddings=True,
             show_progress_bar=True,
-            batch_size=64,
+            batch_size=8,
         ).astype(np.float32)
+        for i, vec in zip(miss_idxs, new_vecs):
+            cache[hashes[i]] = vec
+        # Drop the model before loading MiniLM below; sentence-transformers
+        # holds the full weight tensor in RAM until GC runs.
+        del page_model
+
+    # Assemble page_vecs in the original pages[] order.
+    page_vecs = np.stack([cache[h] for h in hashes]).astype(np.float32)
+
+    # Prune the cache to only currently-present hashes so a deleted page
+    # doesn't keep its vector around forever. Then persist.
+    save_vec_cache(PAGE_CACHE, PAGE_MODEL_NAME, PAGE_MODEL_REVISION,
+                   PAGE_DIM, {h: cache[h] for h in hashes})
 
     index = faiss.IndexFlatIP(page_vecs.shape[1])
     index.add(page_vecs)
@@ -245,18 +396,38 @@ def main() -> int:
     atomic_write_text(SIMILAR_OUT, json.dumps(similar, ensure_ascii=False, indent=2))
     print(f"embed.py: wrote {len(similar)} similar-links entries")
 
-    # --- Semantic index (paragraph level) ---
+    # --- Semantic index (paragraph level, MiniLM, content-hash cached) ---
     if not paragraphs:
         print("embed.py: no paragraphs extracted — skipping semantic index")
         return 0
 
-    print(f"embed.py: embedding {len(paragraphs)} paragraphs…")
-    para_vecs = model.encode(
-        [p["text"] for p in paragraphs],
+    pcache = load_vec_cache(PARA_CACHE, PARA_MODEL_NAME,
+                            PARA_MODEL_REVISION, PARA_DIM)
+    para_inputs = [p["text"] for p in paragraphs]
+    para_hashes = [content_hash(t) for t in para_inputs]
+    para_miss = [i for i, h in enumerate(para_hashes) if h not in pcache]
+
+    print(f"embed.py: paragraphs: {len(paragraphs) - len(para_miss)} cached / "
+          f"{len(para_miss)} to embed")
+
+    if para_miss:
+        print(f"embed.py: loading {PARA_MODEL_NAME}@{PARA_MODEL_REVISION[:8]}…")
+        para_model = SentenceTransformer(PARA_MODEL_NAME,
+                                         revision=PARA_MODEL_REVISION)
+        new_para_vecs = para_model.encode(
+            [para_inputs[i] for i in para_miss],
             normalize_embeddings=True,
             show_progress_bar=True,
             batch_size=64,
         ).astype(np.float32)
+        for i, vec in zip(para_miss, new_para_vecs):
+            pcache[para_hashes[i]] = vec
+        del para_model
+
+    # Assemble in original paragraph order; prune + persist the cache.
+    para_vecs = np.stack([pcache[h] for h in para_hashes]).astype(np.float32)
+    save_vec_cache(PARA_CACHE, PARA_MODEL_NAME, PARA_MODEL_REVISION,
+                   PARA_DIM, {h: pcache[h] for h in para_hashes})
 
     atomic_write_bytes(SEMANTIC_BIN, para_vecs.tobytes())
 

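The two cached passes in the embed.py hunks above share one shape: hash each input, embed only the misses, reassemble vectors in input order, and prune the cache to the hashes still in use. The sketch below isolates that logic with plain lists in place of NumPy arrays and a stand-in `embed` callable in place of a real model; `embed_with_cache` and its return shape are illustrative assumptions, not functions from the repository.

```python
import hashlib
from typing import Callable

Vector = list[float]


def content_hash(text: str) -> str:
    """SHA-256 of the UTF-8 text, as in the diff's content_hash()."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_with_cache(
    texts: list[str],
    cache: dict[str, Vector],
    embed: Callable[[list[str]], list[Vector]],
) -> tuple[list[Vector], dict[str, Vector], int]:
    """Return (vectors in input order, pruned cache, number of misses).

    `embed` is only called when at least one text is missing from the
    cache, which is what lets a fully warm run skip loading the model.
    """
    hashes = [content_hash(t) for t in texts]
    miss = [i for i, h in enumerate(hashes) if h not in cache]
    if miss:
        new_vecs = embed([texts[i] for i in miss])
        for i, vec in zip(miss, new_vecs):
            cache[hashes[i]] = vec
    vectors = [cache[h] for h in hashes]
    # Keep only hashes that are present now, so deleted inputs drop out.
    pruned = {h: cache[h] for h in hashes}
    return vectors, pruned, len(miss)


if __name__ == "__main__":
    # Stand-in embedder: one-dimensional "vector" equal to the text length.
    fake = lambda batch: [[float(len(t))] for t in batch]
    _, warm, misses = embed_with_cache(["alpha", "beta"], {}, fake)
    print(misses)
    _, warm, misses = embed_with_cache(["beta", "gamma"], warm, fake)
    print(misses)
```

Because the key is a hash of the exact input string (including any task prefix), changing the prefix or the extraction rules invalidates entries naturally, while a model change still needs the separate name/revision/dim check that `load_vec_cache` performs.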
@@ -31,6 +31,7 @@ images are logged and the rest of the walk continues.
 
 from __future__ import annotations
 
+import os
 import sys
 from pathlib import Path
 from typing import Any
@@ -62,13 +63,20 @@ def _is_stale(image: Path, sidecar: Path) -> bool:
 
 
 def _atomic_write_yaml(path: Path, data: dict[str, Any]) -> None:
-    tmp = path.with_suffix(path.suffix + ".tmp")
+    # PID-unique temp (concurrent runs can't share it), removed on
+    # failure. No fsync: sidecars are regenerated from the photo on the
+    # next build, so a lost rename costs one re-extraction, not data.
+    tmp = path.with_suffix(path.suffix + f".tmp.{os.getpid()}")
+    try:
         with tmp.open("w", encoding="utf-8") as f:
             # Preserve a stable key order (width before height) so a manual
             # diff stays easy to read across regenerations.
             ordered = {k: data[k] for k in ("width", "height") if k in data}
             yaml.safe_dump(ordered, f, sort_keys=False, allow_unicode=True)
         tmp.replace(path)
+    except BaseException:
+        tmp.unlink(missing_ok=True)
+        raise
 
 
 def _read_dimensions(image: Path) -> dict[str, int]:

@@ -36,6 +36,7 @@ images are logged and the rest of the walk continues.
 from __future__ import annotations
 
 import json
+import os
 import shutil
 import subprocess
 import sys
@@ -133,6 +134,12 @@ def _read_exif_via_exiftool(image: Path) -> dict[str, Any]:
     entry. Numeric values come through as numbers; text values as
     strings. We accept missing keys silently.
     """
+    # exiftool does not reliably support `--` as an end-of-options
+    # marker, so make the path argument non-option-shaped instead: a
+    # relative path is prefixed with ./ so it can never start with `-`.
+    image_arg = str(image)
+    if not os.path.isabs(image_arg):
+        image_arg = os.path.join(os.curdir, image_arg)
     result = subprocess.run(
         [
             "exiftool",
@@ -156,7 +163,7 @@ def _read_exif_via_exiftool(image: Path) -> dict[str, Any]:
             "-ImageWidth",
             "-ImageHeight",
             "-n", # numeric output for shutter/aperture/GPS/dimensions
-            str(image),
+            image_arg,
         ],
         capture_output=True,
         text=True,
@@ -374,12 +381,19 @@ def _is_stale(image: Path, sidecar: Path) -> bool:
 
 
 def _atomic_write_yaml(path: Path, data: dict[str, Any]) -> None:
-    tmp = path.with_suffix(path.suffix + ".tmp")
+    # PID-unique temp (concurrent runs can't share it), removed on
+    # failure. No fsync: sidecars are regenerated from the photo on the
+    # next build, so a lost rename costs one re-extraction, not data.
+    tmp = path.with_suffix(path.suffix + f".tmp.{os.getpid()}")
+    try:
         with tmp.open("w", encoding="utf-8") as f:
             # Preserve the SIDECAR_KEYS order so a manual diff is easy to read.
             ordered = {k: data[k] for k in SIDECAR_KEYS if k in data}
             yaml.safe_dump(ordered, f, sort_keys=False, allow_unicode=True)
         tmp.replace(path)
+    except BaseException:
+        tmp.unlink(missing_ok=True)
+        raise
 
 
 def _read_one(image: Path) -> dict[str, Any]:

@@ -23,6 +23,7 @@ a palette extraction error.
 
 from __future__ import annotations
 
+import os
 import sys
 from pathlib import Path
 from typing import Any
@@ -62,10 +63,17 @@ def _is_stale(image: Path, sidecar: Path) -> bool:
 
 
 def _atomic_write_yaml(path: Path, data: dict[str, Any]) -> None:
-    tmp = path.with_suffix(path.suffix + ".tmp")
+    # PID-unique temp (concurrent runs can't share it), removed on
+    # failure. No fsync: sidecars are regenerated from the photo on the
+    # next build, so a lost rename costs one re-extraction, not data.
+    tmp = path.with_suffix(path.suffix + f".tmp.{os.getpid()}")
+    try:
         with tmp.open("w", encoding="utf-8") as f:
             yaml.safe_dump(data, f, sort_keys=False, allow_unicode=True)
         tmp.replace(path)
+    except BaseException:
+        tmp.unlink(missing_ok=True)
+        raise
 
 
 def _extract_palette(image: Path) -> list[str]:

@@ -20,9 +20,11 @@
 set -u
 
 # Newly-added .md files under content/essays/ in this commit.
+# `--name-status` output is TAB-separated (status<TAB>path); split on the
+# tab so paths containing spaces survive intact.
 mapfile -t added < <(
     git diff --cached --name-status --diff-filter=A -- 'content/essays/*.md' \
-        | awk '{ print $2 }'
+        | cut -f2-
 )
 
 if [[ ${#added[@]} -eq 0 ]]; then
@@ -47,8 +49,10 @@ for path in "${added[@]}"; do
     # Best-effort frontmatter probe: does any line in the YAML head
     # block start with `status:`? Avoids a YAML dependency in the
     # hook, which has to run before the build environment is sourced.
-    if awk '/^---$/{f++; next} f==1 && /^status:[[:space:]]*[^[:space:]]/{print; exit}' \
-        -- "$path" \
+    # Probe the STAGED blob (`git show :path`), not the working tree —
+    # the commit contains the index content, which may differ.
+    if git show ":$path" 2>/dev/null \
+        | awk '/^---$/{f++; next} f==1 && /^status:[[:space:]]*[^[:space:]]/{print; exit}' \
         | grep -q .; then
         has_status=1
     fi

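The switch from `awk '{ print $2 }'` to `cut -f2-` in the hook above comes down to which separator is honoured: `--name-status` puts a tab between the status letter and the path, while awk's default field splitting breaks on any whitespace. A small Python sketch of the difference, using a made-up sample of `git diff --name-status` output (the `added_paths` helper is illustrative, not part of the hook):

```python
def added_paths(name_status: str) -> list[str]:
    """Extract paths of added files from `git diff --name-status` text.

    Splits each line on the first TAB only, so spaces inside a path
    are preserved.
    """
    paths = []
    for line in name_status.splitlines():
        if not line:
            continue
        status, _, path = line.partition("\t")
        if status == "A" and path:
            paths.append(path)
    return paths


# Hypothetical sample output: one path contains a space.
sample = "A\tcontent/essays/on writing.md\nA\tcontent/essays/plain.md\n"

# Whitespace splitting (what awk's default $2 does) truncates the first path.
whitespace_split = [line.split()[1] for line in sample.splitlines()]

tab_split = added_paths(sample)

if __name__ == "__main__":
    print(whitespace_split)
    print(tab_split)
```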
@@ -148,7 +148,14 @@ fi
 
 echo "import-photo: stripping EXIF from delivered file..."
 magick mogrify -strip "$TARGET" \
-    || { echo "import-photo: magick mogrify -strip failed for $TARGET (EXIF NOT stripped)" >&2; exit 1; }
+    || {
+        # The copy under content/ still carries full EXIF (GPS, serial
+        # numbers); the Makefile's `git add content/` could auto-commit
+        # and publish it. Remove it before bailing out.
+        rm -f -- "$TARGET"
+        echo "import-photo: magick mogrify -strip failed for $TARGET (EXIF NOT stripped); deleted the copied target so the EXIF-laden JPEG cannot be auto-committed" >&2
+        exit 1
+    }
 
 # ---------------------------------------------------------------------------
 # Step 4: extract palette (does its own walk; idempotent on already-done photos)

@@ -28,7 +28,9 @@ echo -n "Signing subkey passphrase: "
 read -rs PASSPHRASE
 echo
 
-echo -n "$PASSPHRASE" | GNUPGHOME="$GNUPGHOME" "$GPG_PRESET" --homedir "$GNUPGHOME" --preset "$KEYGRIP"
+# printf, not `echo -n`: a passphrase starting with -e/-n/-E would be
+# eaten as an echo option.
+printf '%s' "$PASSPHRASE" | GNUPGHOME="$GNUPGHOME" "$GPG_PRESET" --homedir "$GNUPGHOME" --preset "$KEYGRIP"
 
 echo "Passphrase cached for keygrip $KEYGRIP (24 h TTL)."
 echo "Test: GNUPGHOME=$GNUPGHOME gpg --homedir $GNUPGHOME --batch --detach-sign --armor --output /dev/null /dev/null"

@@ -8,11 +8,29 @@ FREEZE="$REPO_ROOT/cabal.project.freeze"

 cd "$REPO_ROOT"

+# Back up the current freeze and restore it if resolution fails, so an
+# unsolvable index never leaves the repo with no freeze file at all
+# (recoverable via git, but the script shouldn't depend on that).
+BACKUP=""
+if [ -f "$FREEZE" ]; then
+    BACKUP="$(mktemp "$FREEZE.bak.XXXXXX")"
+    cp "$FREEZE" "$BACKUP"
+fi
+restore_on_failure() {
+    if [ -n "$BACKUP" ]; then
+        echo "==> Refreeze failed — restoring previous freeze file." >&2
+        mv "$BACKUP" "$FREEZE"
+    fi
+}
+trap restore_on_failure ERR
+
 echo "==> Removing stale freeze file..."
 rm -f "$FREEZE"

 echo "==> Resolving dependencies and writing new freeze file..."
 cabal freeze
+trap - ERR
+[ -n "$BACKUP" ] && rm -f "$BACKUP"

 echo "==> Verifying build..."
 cabal build

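The backup-and-restore flow in the hunk above (copy the file aside, delete it, regenerate, put the copy back on failure) can also be sketched in Python. This is only an analogue under stated assumptions, not the script's actual mechanism, which relies on a shell `ERR` trap. The helper name `regenerate_with_backup` and the `regenerate` callback (standing in for `cabal freeze`) are hypothetical.

```python
import os
import shutil
import tempfile
from typing import Callable, Optional


def regenerate_with_backup(target: str, regenerate: Callable[[], None]) -> None:
    """Regenerate `target`, restoring the previous copy if regeneration fails."""
    backup: Optional[str] = None
    if os.path.isfile(target):
        # Sibling temp file, comparable to `mktemp "$FREEZE.bak.XXXXXX"`.
        fd, backup = tempfile.mkstemp(
            prefix=os.path.basename(target) + ".bak.",
            dir=os.path.dirname(target) or ".",
        )
        os.close(fd)
        shutil.copy2(target, backup)
        os.remove(target)  # remove the stale file before regenerating
    try:
        regenerate()
    except BaseException:
        if backup is not None:
            # Put the old file back, overwriting any partial output.
            os.replace(backup, target)
        raise
    if backup is not None:
        os.remove(backup)  # success: the backup is no longer needed
```

On failure the caller still sees the original exception; the only difference is that the previous file is back in place.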
@@ -49,8 +49,19 @@ def stamp_file(path: str, replacement_bytes: bytes) -> bool:
         data,
     )
     if count and new_data != data:
-        with open(path, "wb") as f:
+        # Write to a sibling temp file and os.replace so an interrupt
+        # mid-write never leaves a truncated deployed HTML file.
+        tmp = path + ".stamp-tmp"
+        try:
+            with open(tmp, "wb") as f:
                 f.write(new_data)
+            os.replace(tmp, path)
+        except BaseException:
+            try:
+                os.unlink(tmp)
+            except FileNotFoundError:
+                pass
+            raise
         return True
     return False

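The temp-file-plus-`os.replace` pattern in the hunk above can be exercised in isolation to see why an interrupted write leaves the original intact. The sketch below is a standalone illustration, not the project's code: the helper name `atomic_write` and its chunked input are assumptions made so that a failure can be triggered partway through the write.

```python
import os
from typing import Iterable


def atomic_write(path: str, chunks: Iterable[bytes]) -> None:
    """Write `chunks` to `path` so readers see the old file or the new one, never a partial."""
    tmp = path + ".stamp-tmp"  # sibling file, so os.replace stays on one filesystem
    try:
        with open(tmp, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
        # Swap the finished file into place in one step.
        os.replace(tmp, path)
    except BaseException:
        # Clean up the partial temp file; the original at `path` is untouched.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

If the iterable raises halfway through, `path` still holds its previous bytes and no `.stamp-tmp` file is left behind. Catching `BaseException` means the cleanup also runs on `KeyboardInterrupt`.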
uv.lock (11 lines changed)

@@ -156,6 +156,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl", hash = "sha256:85cef7cff222d8644161529808465972e51340599459b8ac3ccbac5a854e0d30", size = 8321, upload-time = "2023-10-07T05:32:16.783Z" },
 ]

+[[package]]
+name = "einops"
+version = "0.8.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/2c/77/850bef8d72ffb9219f0b1aac23fbc1bf7d038ee6ea666f331fa273031aa2/einops-0.8.2.tar.gz", hash = "sha256:609da665570e5e265e27283aab09e7f279ade90c4f01bcfca111f3d3e13f2827", size = 56261, upload-time = "2026-01-26T04:13:17.638Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2a/09/f8d8f8f31e4483c10a906437b4ce31bdf3d6d417b73fe33f1a8b59e34228/einops-0.8.2-py3-none-any.whl", hash = "sha256:54058201ac7087911181bfec4af6091bb59380360f069276601256a76af08193", size = 65638, upload-time = "2026-01-26T04:13:18.546Z" },
+]
+
 [[package]]
 name = "faiss-cpu"
 version = "1.13.2"

@@ -364,6 +373,7 @@ dependencies = [
     { name = "altair" },
     { name = "beautifulsoup4" },
     { name = "colorthief" },
+    { name = "einops" },
     { name = "faiss-cpu" },
     { name = "matplotlib" },
     { name = "numpy" },

@@ -379,6 +389,7 @@ requires-dist = [
     { name = "altair", specifier = ">=5.4,<6" },
     { name = "beautifulsoup4", specifier = ">=4.12,<5" },
     { name = "colorthief", specifier = ">=0.2,<1" },
+    { name = "einops", specifier = ">=0.8.2,<1" },
     { name = "faiss-cpu", specifier = ">=1.9,<2" },
     { name = "matplotlib", specifier = ">=3.9,<4" },
     { name = "numpy", specifier = ">=2.0,<3" },
