Compare commits

...

9 Commits

Author SHA1 Message Date
Levi Neuwirth 620b974d3f auto: 2026-05-26T15:50:02Z [skip ci] 2026-05-26 11:50:02 -04:00
Levi Neuwirth b83af076e0 Bump cabal.project.freeze: minor patch versions
Patch-level bumps to five transitive dependencies (attoparsec-aeson,
http2, prettyprinter-ansi-terminal, semialign, zlib). index-state is
unchanged; refreezes against the same hackage snapshot.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:07:02 -04:00
Levi Neuwirth af27479c6e Add now.js: recompute "Last updated" relative phrase client-side
build/Now.hs renders the .now-stamp-relative phrase ("3 days ago") at
build time against the build machine's clock; a page served days later
from cache or a CDN would then read stale. now.js recomputes the
phrase in the browser from the <time datetime> attribute (an
unambiguous YYYY-MM-DD) against the visitor's clock, with bucket
thresholds that mirror Now.hs:relativeTime exactly so the no-JS
fallback and the recomputed value agree.

* static/js/now.js — the recomputation script.
* templates/default.html includes it via $if(now)$ so it only loads
  on the Current page.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:06:49 -04:00
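The recomputation described in the now.js commit above can be sketched as follows, in Python for consistency with the repo's tools/ scripts. This is only an illustration: the function name `relative_phrase` and every bucket boundary below are assumptions, since the real thresholds live in Now.hs:relativeTime and are not shown in this comparison. The `today` argument stands in for the visitor's clock.

```python
from datetime import date


def relative_phrase(stamp: str, today: date) -> str:
    """Turn a YYYY-MM-DD stamp into a coarse relative phrase.

    The bucket boundaries are illustrative assumptions, not the
    thresholds used by Now.hs:relativeTime.
    """
    days = (today - date.fromisoformat(stamp)).days
    if days <= 0:
        return "today"
    if days == 1:
        return "yesterday"
    if days < 7:
        return f"{days} days ago"
    if days < 30:
        weeks = days // 7
        return f"{weeks} week{'s' if weeks != 1 else ''} ago"
    months = days // 30
    return f"{months} month{'s' if months != 1 else ''} ago"
```

Passing the reference date as a parameter, rather than reading the clock inside the function, lets the build-time and view-time computations share one set of buckets and makes them easy to compare in a test.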
Levi Neuwirth 70dda56625 Popups: render the source page's monogram in internal previews
Internal-page hover popups now show the source's monogram alongside
the title / abstract / metadata when one exists. Two-column grid is
gated on .has-monogram so popups for pieces without an authored mark
keep their default single-column body. The serialised SVG comes from
the rendered page's own .frontmatter-mark--monogram figure, excluded
when it is the symmetric-layout placeholder roundel so empty-slot
pieces do not get a fake mark in the preview.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:06:35 -04:00
Levi Neuwirth a3b3457803 print.css: refresh page rules and prose treatments
Tightens what gets printed and how. Reading-mode warm tints are
disabled so pages do not repaint cream; the mobile TOC bar's
screen-only body padding is reset; sidenote / footnote treatments
are reworked so the prose flows continuously instead of breaking
into a separate footnotes section; decorative link-icon glyphs
are suppressed while external links keep their underline so a
reader can follow them in the printed copy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:06:20 -04:00
Levi Neuwirth 802fc75968 Layout: page-shell wrapper for iOS sticky, scrollable TOC
iOS WebKit silently degrades position: sticky to static on direct
flex/grid children of <body>, so the sticky nav was breaking on
mobile. Wrapping everything below the nav in a .page-shell flex
column keeps the sticky-footer math out of body { } and restores
sticky behaviour across browsers. The essay-frontmatter hoisted in
the Marks II commit becomes a body-level sibling of .page-shell so
its monogram and epistemic-figure columns can span viewport width.

* templates/default.html wraps $body$ + footer in .page-shell.
* static/css/layout.css moves the flex-column + min-height math from
  body to .page-shell; the body > header rule now excludes
  .essay-frontmatter so the essay header does not inherit nav chrome
  (sticky, nav-bg, border-bottom).
* static/css/base.css drops the html/body overflow-x: clip — the
  page-shell wrapper handles horizontal containment and clipping at
  the viewport level was interfering with position: sticky.
* static/css/reading.css updates its #markdownBody centering selector
  to match .page-shell > #markdownBody.
* static/css/components.css makes the TOC outline scrollable when
  it overflows: bounded max-height tied to the sticky budget plus a
  thin themed scrollbar, with overflow: hidden preserved for the
  collapse transition.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:06:02 -04:00
Levi Neuwirth fad8719045 Stamp the site-wide build time post-render
Hakyll caches per-page outputs, so pages whose dependencies have not
changed are not recompiled and the rendered $build-time$ in their
footer goes stale relative to a fresh build. The right granularity for
"last built at" is site-wide rather than per-page; wrapping the footer
timestamp in <span data-build-time> and rewriting it after Hakyll lets
every page reflect the current build without paying recompilation cost.

* tools/stamp-build-time.py walks _site/**/*.html after Hakyll runs and
  rewrites each element wrapped in [data-build-time] to the same format
  Contexts.hs:buildTimeField emits, so a fresh render and the post-pass
  agree.
* templates/partials/footer.html wraps $build-time$ in
  <span class="footer-build-time" data-build-time>...</span> so the
  sweep has a stable selector.
* Makefile invokes the sweep between embed.py and compress-assets so
  the .gz/.br sidecars include the fresh stamp.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:05:28 -04:00
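A minimal sketch of the post-render sweep described in the commit above, assuming the stamped element holds plain text with no nested tags (as in the footer partial). The names `STAMP_RE`, `stamp_html`, and `stamp_site` are hypothetical, and the timestamp format is left to the caller because the format emitted by Contexts.hs:buildTimeField is not shown here. The actual tools/stamp-build-time.py may work differently.

```python
import re
from pathlib import Path

# Matches an element that carries a data-build-time attribute and captures
# its opening tag, inner text, and closing tag. Assumes no nested tags
# inside the element.
STAMP_RE = re.compile(
    r"(<(\w+)\b[^>]*\bdata-build-time\b[^>]*>)(.*?)(</\2>)",
    re.DOTALL,
)


def stamp_html(html: str, stamp: str) -> str:
    """Replace the inner text of every [data-build-time] element."""
    # A function replacement avoids backslash-escape handling in `stamp`.
    return STAMP_RE.sub(lambda m: m.group(1) + stamp + m.group(4), html)


def stamp_site(root: Path, stamp: str) -> int:
    """Rewrite every *.html file under root; return how many files changed."""
    changed = 0
    for page in root.rglob("*.html"):
        old = page.read_text(encoding="utf-8")
        new = stamp_html(old, stamp)
        if new != old:
            page.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

Under these assumptions, `stamp_site(Path("_site"), stamp)` would run once per build, after Hakyll and before compression, so the compressed sidecars pick up the rewritten footer.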
Levi Neuwirth 154b47a4cb Marks II: broader monogram coverage + audit-marks tool
Extends the Phase-1 monogram mark system to every long-form content
type (essays, blog posts, poems, fiction, music) and introduces a
coverage audit so gaps are visible.

* build/Marks.hs gains hasMonogram (predicate), monogramSvgFieldFor +
  hasMonogramFieldFor (for explicit-path callers like the /build/ and
  /stats/ pages). Contexts.hs exports hasMonogramField as a siteCtx
  boolean so templates can conditionally render the slot without
  emitting an empty <div>.
* essay.html, blog-post.html, reading.html: hoist the frontmatter
  block out of <main id="markdownBody"> so the monogram + epistemic
  marks render as wrapper chrome rather than indexable prose; left
  + right mark slots are now unconditional (CSS handles the empty
  state) so the layout is grid-stable across pieces.
* templates/partials/item-card.html: optional monogram chip on cards
  (item-card--has-monogram modifier), gated on $has-monogram$ so
  monogram-less pieces stay flush.
* build/Stats.hs grows a "Marks coverage" telemetry section: per-type
  pieces / monogram / epistemic-figure counts + a coverage rollup,
  rendered between epistemic and output on /build/.
* tools/audit-marks.py: coverage report (ASCII table) walking
  content/**/*.md, plus a pre-commit hook at
  tools/hooks/pre-commit-marks.sh that runs the same scan against
  newly-staged .md files. New `make audit-marks` runs the report
  manually; the hook gates commits.
* static/css/marks.css: layout for the new frontmatter slots and the
  item-card monogram chip.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:05:08 -04:00
Levi Neuwirth 77e31efdae Add link archive system: snapshots, backlinks, link-rot
Preserve external works the site cites against link rot, host them at
permanent /archive/<slug>/ URLs in site chrome, and treat them as
first-class citizens of the backlinks and similar-pages indexes.
Curated, not crawled: the author adds one line to archive/manifest.yaml
and the build fetches, hashes, snapshots, and indexes the work.

* archive/manifest.yaml + tools/archive.py (fetch / refresh / wayback /
  check / gc) — PDFs downloaded directly, HTML pages snapshotted with a
  vendored monolith (tools/bin/monolith @ 2.10.1) into a single
  self-contained file with the archive CSP and a noarchive robots meta
  injected. Per-entry PROVENANCE.json committed; gitignored .txt
  sidecars regenerated from the artifact's SHA-256.
* build/Archive.hs + build/ArchiveIndex.hs + build/Filters/Archive.hs
  — Hakyll rules for /archive/ and /archive/<slug>/, a body Pandoc
  filter that appends an archive affordance to live citations and
  flips dead ones to the local copy on archive.py check's asymmetric
  hysteresis (rotted needs 3 fails over >= 14 days; one ok recovers).
* build/Backlinks.hs — keeps archived external URLs through pass 1 and
  canonicalises them to /archive/<slug>/ in pass 2, producing a
  "Referenced by" section grouped by the fragment each citation
  targets. build/Stats.hs gains a "Link archive" telemetry block on
  /build/ (count, total size, median age, by-status / by-quality /
  by-visibility, orphans).
* Integrity: archive.py fetch and build/Archive.hs (via sha256sum)
  both re-hash every committed artifact, so a tampered file halts the
  build even with cabal invoked directly or no .venv present. refresh
  refuses to replace an uncommitted prior snapshot and rolls back
  atomically on any exit path. removed.yaml is honoured by fetch,
  wayback, and check using canonical-form (tracking-stripped,
  arXiv-canonicalised) comparison.
* visibility: private keeps an entry in-repo but undeployed.
  nginx/archive.conf emits X-Robots-Tag: noindex, noarchive for raw
  artifacts that cannot carry meta directives.

The full design, phase plan (1-5), and three refinement passes live
in ARCHIVE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 10:06:33 -04:00
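The asymmetric hysteresis mentioned in the commit above (three consecutive failures spanning at least 14 days before an entry counts as rotted, and a single success to recover) can be sketched as a small state update. The `LinkState` fields and the `record_probe` name are hypothetical; the real schema of data/archive-state.json is not shown in this comparison.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

FAILS_NEEDED = 3    # consecutive failures required before "rotted"
MIN_SPAN_DAYS = 14  # the failure streak must span at least this many days


@dataclass
class LinkState:
    status: str = "ok"                 # "ok" or "rotted"
    fails: int = 0                     # consecutive failed probes
    first_fail: Optional[date] = None  # day the current failure streak began


def record_probe(state: LinkState, ok: bool, today: date) -> LinkState:
    """Fold one probe result into the state and return the new state."""
    if ok:
        # A single success recovers immediately and clears the streak.
        return LinkState()
    fails = state.fails + 1
    first = state.first_fail or today
    span_days = (today - first).days
    rotted = fails >= FAILS_NEEDED and span_days >= MIN_SPAN_DAYS
    return LinkState("rotted" if rotted else state.status, fails, first)
```

Requiring both a failure count and a minimum span keeps a short outage, however many times it is probed, from flipping a citation to the local copy.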
53 changed files with 6375 additions and 178 deletions

10
.gitignore vendored

@@ -69,10 +69,20 @@ data/similar-links.json
data/backlinks.json
data/build-stats.json
data/build-start.txt
data/build-stamp.txt
data/last-build-seconds.txt
data/semantic-index.bin
data/semantic-meta.json
# Archive: generated text + its staleness stamp (recreated from the
# committed artifact on every build — deterministic, so committing them is
# churn). archive/**/PROVENANCE.json is deliberately NOT ignored — it is
# the committed, immutable record of each archival event.
archive/**/*.txt
archive/**/*.txt.sha256
data/archive-index.json
data/archive-state.json
# IGNORE.txt is for the local build and need not be synced.
IGNORE.txt

1535
ARCHIVE.md Normal file

File diff suppressed because it is too large

@@ -1,4 +1,4 @@
.PHONY: build deploy sign download-model download-pdfjs download-leaflet compress-assets convert-images pdf-thumbs pdfs watch clean dev
.PHONY: build deploy sign download-model download-pdfjs download-leaflet compress-assets convert-images pdf-thumbs pdfs watch clean dev audit-marks archive-gc archive-wayback archive-check
# Source .env for deploy / GitHub config if it exists.
# .env format: KEY=value (one per line, no `export` prefix, no quotes needed).
@@ -43,6 +43,16 @@ build:
	else \
		echo "Photography sidecars skipped: run 'uv sync' to enable EXIF + palette + dimension extraction (build continues with frontmatter only)"; \
	fi
# Archive pipeline (Phase 1): fetch any manifest URL without a local
# artifact, extract text, write archive/<slug>/PROVENANCE.json and
# data/archive-index.json. Gated on .venv, same as embed.py. A SHA or
# slug-URL integrity error exits non-zero and halts the build; a
# transient network failure is non-fatal (the entry retries next build).
	@if [ -d .venv ]; then \
		uv run python tools/archive.py fetch; \
	else \
		echo "Archive fetch skipped: run 'uv sync' to enable link archiving (build continues)"; \
	fi
	cabal run site -- build
	pagefind --site _site
	@if [ -d .venv ]; then \
@@ -50,6 +60,12 @@ build:
	else \
		echo "Embedding skipped: run 'uv sync' to enable similar-links (build continues)"; \
	fi
# Site-wide footer timestamp: rewrite every <span data-build-time>
# in _site/**/*.html so cached (un-recompiled) pages don't show a
# stale per-page build time. See tools/stamp-build-time.py for the
# full rationale. Must run before compress-assets so the .gz/.br
# sidecars include the fresh stamp.
	@python3 tools/stamp-build-time.py _site
	@./tools/compress-assets.sh _site
	> IGNORE.txt
	@BUILD_END=$$(date +%s); \
@@ -153,6 +169,50 @@ watch:
clean:
	cabal run site -- clean

# Report which content pieces are missing a monogram (mark.svg) and / or
# the epistemic figure (status: frontmatter). Exits 0 unconditionally;
# this is a coverage report, not a build gate. The pre-commit hook at
# tools/hooks/pre-commit-marks.sh runs the same script for newly-staged
# .md files.
audit-marks:
	@if [ -d .venv ]; then \
		uv run python tools/audit-marks.py; \
	else \
		python3 tools/audit-marks.py; \
	fi

# Evict archived works: delete archive/<slug>/ directories whose slug is
# recorded in archive/removed.yaml. Opt-in — NEVER run by `make build`.
# Orphan directories (not in manifest.yaml, not in removed.yaml) are
# reported, never deleted. See ARCHIVE.md - Eviction & removal.
archive-gc:
	@if [ -d .venv ]; then \
		uv run python tools/archive.py gc; \
	else \
		python3 tools/archive.py gc; \
	fi

# Submit archived URLs to the Wayback Machine and backfill the capture URL
# into each PROVENANCE.json. A slow network job — opt-in, never run by
# `make build`. Always exits 0; an entry without a capture retries next run.
archive-wayback:
	@if [ -d .venv ]; then \
		uv run python tools/archive.py wayback; \
	else \
		python3 tools/archive.py wayback; \
	fi

# Probe every archived URL for link rot, updating data/archive-state.json.
# A slow network job — opt-in, never run by `make build`. Asymmetric
# hysteresis: `rotted` needs 3 consecutive failures over >=14 days; a
# single success recovers immediately. The next build consumes the state.
archive-check:
	@if [ -d .venv ]; then \
		uv run python tools/archive.py check; \
	else \
		python3 tools/archive.py check; \
	fi

# Dev build includes any in-progress drafts under content/drafts/essays/.
# SITE_ENV=dev is read by build/Site.hs; drafts are otherwise invisible to
# every build (make build / make deploy / cabal run site -- build directly).

@@ -0,0 +1,14 @@
{
"url": "https://cr.yp.to/aes-speed.html",
"slug": "djb-aes-speed",
"title": "Cache-timing attacks on AES (cr.yp.to)",
"type": "html",
"artifact": "snapshot.html",
"sha256": "8da2d5aedeccf9f602e1680631aa77308683803c0cc9b04caad52c7a70c60832",
"previous-sha256": "0a50bf6d64b2ec08771d83be5ef47721ecbfc431e3512ff55978e76f452dbd3f",
"bytes": 26186,
"archived": "2026-05-23",
"source-date": null,
"snapshot-quality": "ok",
"wayback": null
}
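
The integrity check described in the link-archive commit message (re-hashing each committed artifact) could look roughly like the sketch below. It assumes that the `sha256` field is the hex digest of the file named by `artifact` and that `bytes` is that file's size, which is what the record above suggests but is not confirmed by this comparison. The function name `verify_entry` is hypothetical.

```python
import hashlib
import json
from pathlib import Path


def verify_entry(entry_dir: Path) -> None:
    """Raise ValueError if the artifact no longer matches PROVENANCE.json.

    Assumes "sha256" is the artifact's hex digest and "bytes" its size.
    """
    prov = json.loads((entry_dir / "PROVENANCE.json").read_text(encoding="utf-8"))
    data = (entry_dir / prov["artifact"]).read_bytes()
    if len(data) != prov["bytes"]:
        raise ValueError(
            f"{prov['artifact']}: size {len(data)} != recorded {prov['bytes']}"
        )
    digest = hashlib.sha256(data).hexdigest()
    if digest != prov["sha256"]:
        raise ValueError(
            f"{prov['artifact']}: sha256 {digest} != recorded {prov['sha256']}"
        )
```

Checking the size first is only a cheap early exit; the digest comparison is what detects a tampered file of the same length.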

@@ -0,0 +1,470 @@
<!-- Saved from https://cr.yp.to/aes-speed.html at 2026-05-23T13:04:33Z using monolith v2.10.1 -->
<html><head><meta content="default-src 'none'; img-src data:; style-src 'unsafe-inline'; style-src-elem 'unsafe-inline'; style-src-attr 'unsafe-inline'; font-src data:; script-src 'none'; object-src 'none'; frame-src 'none'" http-equiv="Content-Security-Policy"/><meta content="noindex, noarchive" name="robots"/><link href="data:text/html;base64,PGh0bWw+PGJvZHk+ZmlsZSBkb2VzIG5vdCBleGlzdDwvYm9keT48L2h0bWw+DQo=" rel="icon"/></head><body>
<title>AES speed</title>
<meta content="aes" name="keywords"/>
<a href="https://cr.yp.to/djb.html">D. J. Bernstein</a>
<br/><a href="https://cr.yp.to/hash.html">Hash functions and ciphers</a>
<h1>AES speed</h1>
<b>Update:</b>
Peter Schwabe and I now have a paper on this topic:
<ul>
<li>
<a name="aesspeed-paper">[aesspeed]</a>
15pp.
<a href="https://cr.yp.to/aes-speed/aesspeed-20080926.pdf">(PDF)</a>
D. J. Bernstein, Peter Schwabe.
New AES software speed records.
Document ID: b90c51d2f7eef86b78068511135a231f.
URL: https://cr.yp.to/papers.html#aesspeed.
Date: 2008.09.26.
Supersedes:
<a href="https://cr.yp.to/aes-speed/aesspeed-20080908.pdf">(PDF)</a>
2008.09.08.
</li></ul>
The software is now available as part of the
<a href="https://cr.yp.to/streamciphers/timings.html#toolkit-estreambench">estreambench</a>
toolkit.
We have placed the software into the public domain;
feel free to integrate it into your own AES applications!
<p>
Information below this line has not yet been updated.
</p><hr/>
This document describes various speedups in AES software.
This document assumes that
the software is going to be used in an application
where timing information is <i>not</i> exposed to attackers.
<p>
The reader is expected to already know the standard structure of AES software:
</p><ul>
<li>each of the 16 state bytes is used as an index for a table lookup producing a 32-bit word;
</li><li>16 xors combine these 16 words and 4 expanded key words into 4 new state words;
</li><li>those 4 words are viewed as the starting 16 bytes for the next round.
</li></ul>
See Section 5.2.1 of "AES Proposal: Rijndael" by Daemen and Rijmen.
<h2>Endianness</h2>
On a little-endian CPU,
extracting the first byte of a 32-bit word
is an &amp;0xff arithmetic instruction;
on a big-endian CPU,
extracting the first byte of a 32-bit word
is a &gt;&gt;24 arithmetic instruction.
Similar comments apply to the other bytes.
<p>
One can write AES software
that uses arithmetic instructions as if the CPU were little-endian.
If the CPU is actually big-endian,
the software swaps the bytes of the AES key, input, and output (at run time).
The software also swaps the bytes of the table (at compile time),
for example by expressing the table as a sequence of 32-bit integers.
</p><p>
<b>Matched endianness.</b>
One can easily eliminate the byte-swapping time for the AES key, input, and output:
simply use the appropriate arithmetic instructions
for the endianness of the CPU.
In this case the table must not be swapped.
</p><h2>Table structure</h2>
All else being equal, smaller AES tables are faster:
they take less time to load into cache and are more likely to stay in cache.
Beware that most benchmarking tools preload caches and thus can't see this speedup.
<p>
Daemen and Rijmen suggest "4 KBytes of tables."
There are 4 tables.
Each table has 256 words occupying 1024 bytes.
The loads are spread evenly across the tables.
</p><p>
<b>Rotated lookups.</b>
Daemen and Rijmen suggest an alternative "with a total table size of 1KByte"
but with extra arithmetic.
The point is that the tables are rotations of each other:
for example,
the first word of the first table is (0xc6,0x63,0x63,0xa5),
the first word of the second table is (0xa5,0xc6,0x63,0x63),
the first word of the third table is (0x63,0xa5,0xc6,0x63),
and the first word of the fourth table is (0x63,0x63,0xa5,0xc6).
One can store the first table,
and simulate a lookup in another table at the cost of an extra rotation.
</p><p>
<b>Unaligned loads.</b>
One can instead use a single 2KB table having 256 8-byte entries
such as (0x00,0x63,0xa5,0xc6,0x63,0x63,0xa5,0xc6).
There are many reasonable choices of pattern here;
what's important is that the pattern includes the desired
(0xc6,0x63,0x63,0xa5) and (0xa5,0xc6,0x63,0x63) and so on as substrings.
On the Pentium, the PowerPC, et al.,
one can load 4-byte words from memory addresses that aren't divisible by 4,
and there's no penalty when the word doesn't cross an 8-byte boundary.
</p><h2>Masked loads</h2>
16 of the 160 table lookups in 10-round AES are masked.
The 40 table lookups in 10-round AES key expansion are also masked.
The masks are 0x000000ff, 0x0000ff00, 0x00ff0000, and 0xff000000, each used equally often.
<p>
The simplest way to compute a mask is with an arithmetic instruction: for example, &amp;0xff00.
</p><p>
<b>Byte loads.</b>
One can eliminate 25% of the masks,
namely the bottom-byte masks,
by combining them with load instructions.
All popular CPUs have single-byte-load instructions.
</p><p>
<b>Two-byte loads.</b>
One can eliminate another 25% of the masks
on CPUs with two-byte-load instructions.
This constrains the table pattern:
it's important to have (0x00,0x63) on little-endian CPUs,
and (0x63,0x00) on big-endian CPUs.
</p><p>
<b>Masked tables.</b>
One can eliminate all of the masks by precomputing masked tables, using extra table space.
The simplest table structure uses a total of 8KB.
Two tables, one with entries such as (0x00,0x63,0xa5,0xc6,0x63,0x63,0xa5,0xc6)
and another with entries such as (0x00,0x00,0x00,0x00,0x63,0x00,0x00,0x00),
use a total of 4KB.
In my experience,
the cost of larger tables outweighs the benefit of eliminating a few masks.
</p><h2>Key expansion</h2>
A 4-word (128-bit) key is expanded in 40 steps.
Each step produces a new word, totalling 44 words in the expanded key.
A step has a byte extraction (see below), a masked load, and two xors.
The total work is 40 byte extractions, 40 masked loads, and 80 xors.
For comparison, the subsequent work to encrypt a block involves
160 byte extractions, 160 loads (of which 16 are masked), and 160 xors.
<p>
Daemen and Rijmen say (Section 4.3.2)
that key expansion involves "almost no computational overhead."
Obviously key expansion is less expensive than encrypting a block.
On the other hand, the cost of key expansion is still quite noticeable.
</p><p>
<b>Expanded keys.</b>
A typical AES implementation precomputes and stores an expanded key.
The 40 byte extractions, 40 masked loads, and 80 xors aren't repeated for every block;
they are done only once, along with 44 stores.
Each block then involves 44 extra loads for the expanded key.
Some stores and loads can be eliminated
if many blocks are handled at once
and some extra registers are available.
</p><p>
Long-term storage of an expanded key can slow down applications that handle many keys:
the expanded keys take more time to load into cache
than the original keys and are less likely to stay in cache.
</p><p>
<b>Partially expanded keys.</b>
An alternative is to precompute and store a partially expanded key,
only 14 words instead of 44 words.
The partially expanded key consists of words
0, 1, 2, 3, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40 from the expanded key.
Loading the partially expanded key, and converting it into the fully expanded key,
takes only 14 loads and 30 xors.
</p><p>
One can interpolate between partial expansion and full expansion,
using various amounts of storage per key and achieving various balances between load and xor.
</p><h2>Index extraction</h2>
The 16 xor operations in an AES round
produce 4 words in 4 integer registers.
The 16 bytes of these words are then extracted and used as indices for the next round.
<p>
The simplest way to extract 4 bytes is using 6 instructions,
namely 3 shifts and 3 bottom-byte extractions:
&amp;255;
(&gt;&gt;8)&amp;255;
(&gt;&gt;16)&amp;255;
&gt;&gt;24.
</p><p>
Using a byte as an index then requires multiplying the byte by a constant
that depends on the table structure.
Let's assume the 2KB tables described above; then the constant is 8.
The multiplications use 4 shifts:
&lt;&lt;3;
&lt;&lt;3;
&lt;&lt;3;
&lt;&lt;3.
</p><p>
<b>Scaled-index loads.</b>
Many CPUs can multiply an index register by 8 for free as part of a load.
</p><p>
<b>Scaled-index extractions.</b>
What about CPUs that can't multiply an index register by 8 for free?
Two of the multiplications can nevertheless be eliminated,
because they can be combined with shifts.
The overall extract-and-scale sequence has 8 instructions:
(&lt;&lt;3)&amp;2040;
(&gt;&gt;5)&amp;2040;
(&gt;&gt;13)&amp;2040;
(&gt;&gt;21)&amp;2040.
The PowerPC has a combined rotate-and-mask instruction,
making this sequence take only 4 instructions.
</p><p>
<b>Scaled tables.</b>
One can rotate table entries by 3 bits,
reducing the above 8 instructions to 7 instructions.
</p><p>
<b>Second-byte instructions.</b>
The x86 architecture (Pentium, Athlon, etc.)
includes a combined (&gt;&gt;8)&amp;255 instruction.
This means that extracting 4 bytes takes only 5 instructions:
&amp;255;
(&gt;&gt;8)&amp;255;
&gt;&gt;16;
&amp;255;
&gt;&gt;8.
Alternate 5-instruction sequence:
&amp;255;
(&gt;&gt;8)&amp;255;
&gt;&gt;16;
&amp;255;
(&gt;&gt;8)&amp;255.
</p><p>
Of course, the ultimate measure of performance is a cycle count, not an instruction count.
Matsui states that the (&gt;&gt;8)&amp;255; instruction is "a bit expensive"
on the Pentium 4 Prescott (f33, f34, f41);
presumably this means that the instruction takes more cycles than, e.g., a mere &amp;255.
But all of the measurements I've seen indicate the opposite.
I'm not sure what I'm missing here.
</p><p>
<b>32-bit shifts on 64-bit architectures.</b>
The amd64 architecture (P4E, Athlon 64, Core 2, etc.) can right-shift a 64-bit register,
but Matsui comments that this operation is extremely slow on the P4E.
It's much better to use the amd64's x86-compatible right-shift instruction;
this instruction sets the top 32 bits of its 64-bit input to 0 before shifting.
</p><p>
<b>Byte extraction via loads.</b>
A completely different way to extract 4 bytes is with 1 store and 4 loads.
One can mix this with the previous approaches
to achieve various balances between load and arithmetic.
</p><p>
Consider, for example, the UltraSPARC,
which has 2 integer units and 1 load/store unit.
A traditional sequence of
14 partially-expanded-key loads (see below), 30 key-expansion xors,
160 scaled-index extractions, 160 table-lookup loads, 160 xors, 16 masks,
4 input loads, and 4 output stores
occupies a total of 526 integer instructions (at least 263 cycles)
and 182 loads (at least 182 cycles).
Using loads for some byte extractions,
replacing 36 scaled-index extractions with 9 stores and 36 loads,
means a total of 454 integer instructions (at least 227 cycles)
and 227 loads/stores (at least 227 cycles).
</p><h2>Unrolling</h2>
A typical 9-iteration AES loop
involves 9 increments of a loop index, 9 comparisons, and 9 branches,
one of which is mispredicted on most CPUs.
The loop index also consumes a register,
forcing an extra 9 stores and 9 loads on CPUs that don't have registers to spare.
<p>
<b>Full unrolling.</b>
One can eliminate all of these costs by fully unrolling the loop.
Beware, however, that full unrolling costs a few kilobytes of code-cache space.
</p><p>
<b>Partial unrolling.</b>
CPUs are more likely to correctly predict a 4-iteration loop than a 9-iteration loop.
</p><h2>Instruction scheduling</h2>
The 16 table lookups in an AES round are independent
and can be scheduled in many different ways.
One can, for example,
perform all the table lookups for the first input from bottom byte to top
(outputs 0, 3, 2, 1),
then perform all the table lookups for the second input from bottom byte to top
(outputs 1, 0, 3, 2),
then perform all the table lookups for the third input from bottom byte to top
(outputs 2, 1, 0, 3),
then perform all the table lookups for the fourth input from bottom byte to top
(outputs 3, 2, 1, 0).
One can, as another example,
first perform all the table lookups for the first output in order of the inputs,
then perform all the table lookups for the second output in order of the inputs,
etc.
<p>
<b>Maximum parallelism.</b>
The overall depth of the AES round is
one byte extraction plus one table lookup plus two xors:
a mythical CPU offering extensive parallelism
could perform all sixteen byte extractions in parallel,
then all sixteen table lookups in parallel,
then eight xors in parallel,
then four xors in parallel.
Note that each output is obtained by xor'ing two parallel xor's,
rather than by three serial xor's.
</p><p>
<b>Deferring loads.</b>
The amd64 architecture poses several challenges to AES instruction scheduling.
First,
most integer instructions require the output register to be one of the input registers.
Second,
typical amd64 CPUs handle a load and xor most efficiently as a unified load-xor,
but a unified load-xor gives no opportunity to switch registers.
Third,
only 4 registers (eax, ebx, ecx, edx) allow second-byte instructions.
</p><p>
Matsui concludes that, on amd64 (and x86),
keeping each round's inputs y0, y1, y2, y3 and outputs z0, z1, z2, z3 in eax, ebx, ecx, edx,
to allow second-byte instructions,
is "impossible without saving/restoring."
But that's incorrect.
No extra copies are required.
A careful instruction sequence
uses the minimal conceivable number of instructions:
20 for byte extraction,
16 for table lookups,
and 4 for handling the expanded key.
The idea is to extract all the bytes from an input,
freeing the input's register for an output,
before doing any table lookups involving that output:
</p><ul>
<li>Extract the 4 bytes from y0.
At this point y1, y2, y3, and the 4 bytes are live.
</li><li>Feed 1 byte into z0.
At this point y1, y2, y3, z0, and 3 more bytes are live.
</li><li>Extract the 4 bytes from y1, immediately feeding 1 into z0.
At this point y2, y3, z0, and 6 more bytes are live.
</li><li>Feed 2 bytes into z1.
At this point y2, y3, z0, z1, and 4 more bytes are live.
</li><li>Extract the 4 bytes from y2, immediately feeding 2 into z0 and z1.
At this point y3, z0, z1, and 6 more bytes are live.
</li><li>Feed 3 bytes into z2.
At this point y3, z0, z1, z2, and 3 more bytes are live.
</li><li>Extract the 4 bytes from y3, immediately feeding 3 into z0, z1, and z2.
At this point z0, z1, z2, and 4 more bytes are live.
</li><li>Feed 4 bytes into z3.
At this point z0, z1, z2, and z3 are live.
</li><li>Handle 4 words of the expanded key.
</li></ul>
The maximum number of live registers here is 9,
fitting easily into the amd64 instruction set.
<p>
<b>Squeezing inputs and outputs into 7 32-bit registers.</b>
The x86 architecture poses an additional challenge to AES instruction scheduling:
there are only 7 general-purpose integer registers.
</p><p>
It's still possible to handle a round with 0 stores, 4 expanded-key loads,
and 16 loads for table lookups.
The shortest instruction sequence that I know has a total of 46 instructions,
6 more than what would be possible with extra registers;
1 of the 46 instructions can be eliminated if the key expansion is changed.
</p><p>
The idea of this instruction sequence
is to rotate y0 by 16 bits,
use the bottom two bytes of both y0 and y2,
and then merge the remaining four bytes of y0 and y2 into a single register
(for example, shifting y0 down 16 bits, masking y1, and adding the results),
freeing a register at the cost of 3 extra instructions (the rotate, the mask, and the add);
splitting 3 load-xor instructions into 3 loads and 3 xors
then easily puts all outputs into suitable registers.
The rotation can be eliminated if the expanded-key word that corresponds to y0
is rotated by 16 bits.
</p><h2>Speed reports</h2>
Speed reports vary in whether they use CTR, CBC, etc.,
and in the exact rules for measuring speeds.
The "eSTREAM" cycles/byte counts are
for counter-mode AES measured by the eSTREAM benchmarking toolkit;
future implementors are encouraged to support the eSTREAM interface for direct comparability.
<table border="">
<tbody><tr><th>Architecture</th><th>CPU</th><th>eSTREAM cycles/byte</th><th>Ad-hoc cycles/byte</th><th>Software</th></tr>
<tr><td>amd64</td><td>Intel Core 2 Duo (6f6)?</td><td></td><td>9.2</td><td>Matsui/Nakajima (CHES 2007)</td></tr>
<tr><td>amd64</td><td>AMD Athlon 64 (15,75,2)?</td><td></td><td>10.625 (170/block)</td><td>Matsui (FSE 2006)</td></tr>
<tr><td>amd64</td><td>AMD Athlon 64 (15,75,2)?</td><td></td><td>12.4375 (199/block)</td><td>Lipmaa</td></tr>
<tr><td>amd64</td><td>Intel Core 2 Duo (6f6); katana</td><td>12.56</td><td></td><td>hongjun/v1/1</td></tr>
<tr><td>amd64</td><td>Intel Core 2 Quad Q6600 (6fb); latour</td><td>12.57</td><td></td><td>hongjun/v1/1</td></tr>
<tr><td>amd64</td><td>AMD Athlon 64 (15,75,2)?</td><td></td><td>13.125 (210/block)</td><td>Osvik</td></tr>
<tr><td>amd64</td><td>AMD Athlon 64 X2 (15,75,2); mace</td><td>13.32</td><td></td><td>hongjun/v1/1</td></tr>
<tr><td>amd64</td><td>AMD Opteron 240 (f58); nmisles8amd64</td><td>13.45</td><td></td><td>bernstein/amd64-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium III (68a)?</td><td></td><td>14 (224/block)</td><td>Osvik</td></tr>
<tr><td>x86</td><td>AMD Athlon (622)?</td><td></td><td>14.0625 (225/block)</td><td>Osvik</td></tr>
<tr><td>x86</td><td>Intel Pentium III (68a)?</td><td></td><td>14.125 (226/block)</td><td>Lipmaa</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f12)?</td><td></td><td>15 (240/block)</td><td>Osvik</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f12)?</td><td></td><td>15.875 (254/block)</td><td>Lipmaa</td></tr>
<tr><td>x86</td><td>Intel Pentium M (695); whisper</td><td>15.96</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium 4 (f64)?</td><td></td><td>16 (256/block)</td><td>Matsui (FSE 2006)</td></tr>
<tr><td>x86</td><td>Intel Pentium III (68a)?</td><td></td><td>16.25 (260/block)</td><td>Gladman</td></tr>
<tr><td>amd64</td><td>Intel Pentium D (f64); nmi0161</td><td>16.74</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium D (f64); svlin001</td><td>16.75</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>amd64</td><td>Intel Xeon (f41); nmi0056</td><td>16.75</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>amd64</td><td>Intel Xeon (f4a); nmi0090</td><td>16.77</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>sparc</td><td>Sun UltraSPARC III</td><td></td><td>16.875 (270/block)</td><td>Lipmaa</td></tr>
<tr><td>amd64</td><td>Intel Xeon (f41); nmi0057</td><td>16.89</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium D (f64); speed</td><td>16.90</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium D (f64); nmi0104</td><td>16.90</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium D (f64); nmi0241</td><td>16.93</td><td></td><td>bernstein/amd64-2/1</td></tr>
<tr><td>ppc64</td><td>IBM POWER5; nmi0154</td><td>16.93</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f24); nmi0086</td><td>16.96</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f12); fireball</td><td>16.98</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f24); nmitest4</td><td>17.01</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>ppc64</td><td>IBM PowerPC G5 970; nmi0048</td><td>17.17</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 2 (652); boris</td><td>17.33</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 3 (68a)</td><td>17.49</td><td></td><td>Bernstein aes-128/x86-mmx-1</td></tr>
<tr><td>x86</td><td>Intel Pentium 3 (672); orpheus</td><td>17.55</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium M (6d8)</td><td>17.57</td><td></td><td>Wu v0/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f33)?</td><td></td><td>17.75 (284/block)</td><td>Matsui/Fukuda (FSE 2005)</td></tr>
<tr><td>x86</td><td>Intel Xeon (f29); nmibuild40</td><td>17.79</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f27); nmi0059</td><td>17.79</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmibuild16</td><td>17.79</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmi0013</td><td>17.79</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f29); nmi0059</td><td>17.80</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f29); nmibuild17</td><td>17.81</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmibuild15</td><td>17.82</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmibuild26</td><td>17.83</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmibuild21</td><td>17.83</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmi0036</td><td>17.84</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f25); nmibuild22</td><td>17.84</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>AMD Athlon (622); thoth</td><td>18.38</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>ppc32</td><td>IBM POWER4; nmibuild14</td><td>18.55</td><td></td><td>bernstein/little-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f41); nmi0079</td><td>18.88</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f41); nmi0062</td><td>18.89</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>amd64</td><td>Intel Core 2 Duo (6f6)</td><td></td><td>18.9</td><td>OpenSSL 0.9.8e</td></tr>
<tr><td>x86</td><td>Intel Xeon (f41); nmi0061</td><td>18.91</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f41); svlin002</td><td>18.94</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f41); nmi0076</td><td>18.96</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f4a); nmi0102</td><td>18.97</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f41); nmi0060</td><td>18.97</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Xeon (f41); nmi0063</td><td>18.95</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium 3 (68a)</td><td>19.06</td><td></td><td>Wu v1/1</td></tr>
<tr><td>ppc32</td><td>Motorola PowerPC G4 7410; gggg</td><td>19.11</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>amd64</td><td>Intel Core 2 Duo (6f6)</td><td></td><td>19.5</td><td>OpenSSL 0.9.8a</td></tr>
<tr><td>x86</td><td>AMD Athlon (622)?</td><td></td><td>19.9375 (319/block)</td><td>Lipmaa</td></tr>
<tr><td>x86</td><td>Intel Pentium 1 (52c)</td><td></td><td>20 (320/block)</td><td>Lipmaa</td></tr>
<tr><td>sparc</td><td>Sun UltraSPARC III</td><td>20.75</td><td></td><td>Bernstein big-1/1</td></tr>
<tr><td>amd64</td><td>AMD Athlon 64 (15,75,2)</td><td></td><td>20.9</td><td>OpenSSL 0.9.8e</td></tr>
<tr><td>ppc32</td><td>Motorola PowerPC G4 7400; nmi0042</td><td>20.92</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>x86</td><td>Intel Pentium M (6d8)</td><td></td><td>21</td><td>OpenSSL 0.9.8a</td></tr>
<tr><td>x86</td><td>Intel Pentium D (f47); shell</td><td>21.58</td><td></td><td>bernstein/x86-mmx-1/1</td></tr>
<tr><td>x86</td><td>AMD Athlon (622)</td><td></td><td>22</td><td>OpenSSL 0.9.8a</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f29)</td><td></td><td>22</td><td>OpenSSL 0.9.8b</td></tr>
<tr><td>amd64</td><td>AMD Athlon 64 (15,75,2)?</td><td></td><td>23.5</td><td>OpenSSL 0.9.7e</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f41)</td><td></td><td>23.5</td><td>OpenSSL 0.9.8a</td></tr>
<tr><td>x86</td><td>Intel Pentium 3 (672); orpheus</td><td></td><td>23.62</td><td>OpenSSL 0.9.8e</td></tr>
<tr><td>ppc32</td><td>Motorola PowerPC G4 7410</td><td></td><td>24.0625 (385/block)</td><td>Ahrens</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f12)</td><td></td><td>24.4</td><td>OpenSSL 0.9.8a</td></tr>
<tr><td>sparc</td><td>Sun UltraSPARC III</td><td></td><td>25</td><td>OpenSSL</td></tr>
<tr><td>ppc32</td><td>Motorola PowerPC G4 7410</td><td></td><td>25.0625 (401/block)</td><td>Ahrens</td></tr>
<tr><td>x86</td><td>Intel Core Duo; nmi0068</td><td>25.74</td><td></td><td>gladman/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium D (f64); speed</td><td></td><td>27.33</td><td>OpenSSL 0.9.8e</td></tr>
<tr><td>ppc32</td><td>Motorola PowerPC G4 7410; gggg</td><td></td><td>29.32</td><td>OpenSSL 0.9.8c</td></tr>
<tr><td>sparcv9</td><td>Sun UltraSPARC III; nmi0051</td><td>29.45</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>sparcv9</td><td>Sun UltraSPARC III; nmisolaris10</td><td>29.46</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>ppc64</td><td>IBM Cell PPE; nmips3</td><td>35.20</td><td></td><td>bernstein/big-1/1</td></tr>
<tr><td>amd64</td><td>Intel Pentium 4 (f64)</td><td></td><td>37</td><td>OpenSSL 0.9.7f</td></tr>
<tr><td>x86</td><td>Intel Pentium 4 (f29)</td><td></td><td>39</td><td>OpenSSL 0.9.7e</td></tr>
<tr><td>sparc</td><td>Sun UltraSPARC III</td><td></td><td>46.875 (750/block)</td><td>Bassham</td></tr>
<tr><td>x86</td><td>Intel Pentium 1 (52c); cruncher</td><td>38.20</td><td></td><td>hongjun/v1/1</td></tr>
</tbody></table>
<p>
Regarding amd64 Intel Pentium 4,
Matsui writes:
"The number of memory reads
for one block encryption of AES
is 4 (for plaintext loads)
+ 11 x 4 (for subkey loads)
+ 16 x 10 (for table lookups)
= 208,
which means that Pentium 4 takes at least 208 cycles/block for one block encryption."
But this lower bound ignores the possibility of loading partially expanded keys,
saving as many as 30 loads,
and using 64-bit loads for keys and plaintext,
saving 9 more loads.
</p><p>
Regarding amd64 AMD Athlon 64,
Matsui writes:
"Considering an instruction latency of Athlon 64, the theoretical limit of AES
performance on this processor seems around 16 cycles/round = 160 cycles/block.
Our result is hence reaching closely this limit."
</p></body></html>
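
The arithmetic in Matsui's quoted bound, and the conversion between the table's cycles/block and cycles/byte figures, can be checked with a few lines of Python. This is only a sketch of the numbers already given above: the variable and function names are illustrative, the 16-byte block size is the standard AES block, and the "adjusted" figure simply subtracts the 30 and 9 loads mentioned in the objection.

```python
# Matsui's count of memory reads for one AES-128 block encryption, as quoted.
plaintext_loads = 4        # plaintext loads
subkey_loads = 11 * 4      # subkey loads
table_lookups = 16 * 10    # table lookups
reads_per_block = plaintext_loads + subkey_loads + table_lookups

# The objection above: partially expanded keys save up to 30 loads,
# and 64-bit loads for keys and plaintext save 9 more.
adjusted_reads = reads_per_block - 30 - 9

BLOCK_BYTES = 16  # one AES block is 128 bits


def cycles_per_byte(cycles_per_block: float) -> float:
    """Convert a cycles/block figure to the table's cycles/byte unit."""
    return cycles_per_block / BLOCK_BYTES


print(reads_per_block)                   # 208
print(adjusted_reads)                    # 169
print(cycles_per_byte(reads_per_block))  # 13.0
print(cycles_per_byte(170))              # 10.625, the Athlon 64 row for Matsui (FSE 2006)
print(cycles_per_byte(160))              # 10.0, Matsui's stated Athlon 64 limit
```

The table's "(N/block)" annotations follow the same division, for example 256/block is 16 cycles/byte.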

28
archive/manifest.yaml Normal file

@ -0,0 +1,28 @@
# archive/manifest.yaml — curated list of works to preserve.
# Edited by hand. Tools never write to this file. See ARCHIVE.md.
#
# Per-artifact cap: 25 MB. Above that, archive.py warns and skips the fetch;
# commit an oversize artifact deliberately with `git add -f`.
#
# To evict an entry, see archive/removed.yaml — record there FIRST, then
# delete the line here, then run `make archive-gc`.
- url: "https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf"
  slug: nist-fips-203
  title: "FIPS 203 — Module-Lattice-Based Key-Encapsulation Mechanism Standard"
  type: pdf
  tags: [research]
  note: >
    The ML-KEM standard. Cited in the SIMD / post-quantum systems work;
    archived so the citation survives any future reorganization of the
    NIST publications site.
- url: "https://cr.yp.to/aes-speed.html"
  slug: djb-aes-speed
  title: "AES speed (cr.yp.to)"
  # type: html; auto-detected from the .html extension, no override needed.
  tags: [research]
  note: >
    Bernstein's AES software-speed page (cycles/byte tables across CPUs),
    cited in the SIMD work. The Phase 2 bootstrap entry: a stable,
    JavaScript-free static page, so its monolith snapshot is reproducible
    and classifies cleanly as `ok`.
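
The two rules stated in the manifest comments (the 25 MB per-artifact cap and type auto-detection from the URL extension) could be sketched as below. This is not the code in `archive.py`: `CAP_BYTES`, `detect_type`, and `should_fetch` are hypothetical names, reading "25 MB" as 25 MiB is an assumption, and the handling of URLs with an unrecognised extension is a guess.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: "25 MB" is read as 25 MiB; the real tool may use 25 * 10**6.
CAP_BYTES = 25 * 1024 * 1024


def detect_type(url: str, override: Optional[str] = None) -> Optional[str]:
    """Return 'pdf' or 'html' from an explicit `type:` override or the URL path.

    Hypothetical helper: returns None when the extension is not recognised,
    leaving the decision to the caller.
    """
    if override:
        return override
    path = urlparse(url).path.lower()
    if path.endswith(".pdf"):
        return "pdf"
    if path.endswith((".html", ".htm")):
        return "html"
    return None


def should_fetch(n_bytes: int) -> bool:
    """True when an artifact of n_bytes is within the per-artifact cap."""
    return n_bytes <= CAP_BYTES


print(detect_type("https://cr.yp.to/aes-speed.html"))  # html
print(should_fetch(1252341))                            # True
```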


@ -0,0 +1,14 @@
{
  "url": "https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf",
  "slug": "nist-fips-203",
  "title": "FIPS 203 — Module-Lattice-Based Key-Encapsulation Mechanism Standard",
  "type": "pdf",
  "artifact": "document.pdf",
  "sha256": "fe1f12f32a7e44ec9fdebbf400cda843a40b506dee676725234dc6f7923b6cac",
  "previous-sha256": null,
  "bytes": 1252341,
  "archived": "2026-05-22",
  "source-date": null,
  "snapshot-quality": "ok",
  "wayback": "http://web.archive.org/web/20260515100505/https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf"
}
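
A provenance record like the one above carries enough to re-verify its artifact. The following is a minimal sketch of such a check using only the Python standard library; `sha256_of` and `verify_entry` are illustrative names, not functions taken from `archive.py`, and comparing the byte count alongside the hash is an extra assumption.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunks so large PDFs are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_entry(entry_dir: Path) -> bool:
    """Check the artifact named in PROVENANCE.json against its recorded
    `bytes` and `sha256` fields (field names as in the record above)."""
    prov = json.loads((entry_dir / "PROVENANCE.json").read_text(encoding="utf-8"))
    artifact = entry_dir / prov["artifact"]
    return (artifact.stat().st_size == prov["bytes"]
            and sha256_of(artifact) == prov["sha256"])
```

Called as `verify_entry(Path("archive/nist-fips-203"))`, it would return `False` for a truncated or replaced `document.pdf`.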

Binary file not shown.

19
archive/removed.yaml Normal file

@ -0,0 +1,19 @@
# archive/removed.yaml — record of evicted archive entries.
#
# Append an entry here BEFORE deleting its line from manifest.yaml, then
# run `make archive-gc`. The GC deletes only archive/<slug>/ directories
# whose slug is recorded here; an orphaned directory absent from this file
# is reported, never deleted. See ARCHIVE.md § Eviction & removal.
#
# Schema (all fields but `note` required):
# url: original URL at time of removal
# slug: the archive/<slug>/ directory archive-gc may delete
# removed: ISO date of removal
# reason: takedown | author-request | legal | quality
# note: optional free-text context
#
# This is not a hostile-tracking list. It exists so that GC knows what is
# safe to delete, so that re-adding a removed URL is surfaced loudly, and so
# that the link-rot scanner and `archive-suggest` skip removed works.
[]
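
The eviction rule described in the header comment (delete only directories whose slug is recorded in `removed.yaml`, report any other orphan, never delete it) can be sketched as a pure planning function. `plan_gc` is a hypothetical name and this is not the implementation behind `make archive-gc`. Refusing to delete a slug that is still listed in the manifest is an added assumption.

```python
from typing import Iterable, List, Tuple


def plan_gc(dirs_on_disk: Iterable[str],
            manifest_slugs: Iterable[str],
            removed_slugs: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (to_delete, orphans) for the archive/<slug>/ directories.

    to_delete: on disk, recorded in removed.yaml, and no longer in the manifest.
    orphans:   on disk but in neither file; reported, never deleted.
    """
    on_disk = set(dirs_on_disk)
    live = set(manifest_slugs)
    removed = set(removed_slugs)
    to_delete = sorted((on_disk & removed) - live)
    orphans = sorted(on_disk - live - removed)
    return to_delete, orphans


to_delete, orphans = plan_gc(
    dirs_on_disk=["nist-fips-203", "old-paper", "stray-dir"],
    manifest_slugs=["nist-fips-203"],
    removed_slugs=["old-paper"],
)
print(to_delete)  # ['old-paper']
print(orphans)    # ['stray-dir']
```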

579
build/Archive.hs Normal file

@ -0,0 +1,579 @@
{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE OverloadedStrings #-}
-- | Archive section — the link-archiving system. Phases 1-2: PDF and HTML.
--
-- Authored input: archive/manifest.yaml (one line per archived link)
-- Generated, committed: archive/<slug>/{document.pdf | snapshot.html}
-- + PROVENANCE.json
-- Generated, gitignored: archive/<slug>/{document,snapshot}.txt
-- + data/archive-index.json
--
-- @tools/archive.py fetch@ runs before the Hakyll build: it downloads
-- PDFs / snapshots HTML pages with @monolith@, extracts text, and writes
-- each PROVENANCE.json. This module then routes the artifacts and renders
-- one @/archive/<slug>/@ page per entry plus the @/archive/@ index.
--
-- An entry that has never been fetched (no PROVENANCE.json) is skipped:
-- it produces no page. An entry whose PROVENANCE.json exists but whose
-- artifact file is missing from disk halts the build. An orphaned
-- @archive/<slug>/@ directory with no manifest line is inert (no page,
-- not deployed). Artifact-integrity (SHA-256) verification runs on both
-- sides: @archive.py fetch@ re-hashes before the Hakyll build, and
-- 'verifyArtifactSha' (below) re-hashes again in 'loadArchiveEntries',
-- so the guarantee holds even when @archive.py@ does not run first
-- (no @.venv@, a direct @cabal run site -- build@, or a deploy host
-- without the Python toolchain).
--
-- See @ARCHIVE.md@ at the repo root for the full design and phase plan.
module Archive (archiveRules, archiveBuildStats) where
import Control.Exception (SomeException, catch)
import Control.Monad (filterM, forM, when)
import Data.Function (on)
import Data.List (groupBy, intercalate, sort, sortBy)
import qualified Data.Map.Strict as Map
import Data.Maybe (catMaybes, fromMaybe)
import Data.Ord (Down (..), comparing)
import qualified Data.Set as Set
import qualified Data.Text as T
import Data.Time (Day, diffDays, fromGregorian,
                  getCurrentTime, utctDay)
import qualified Data.Aeson as A
import Data.Aeson ((.:), (.:?))
import qualified Data.Yaml as Y
import System.Directory (doesDirectoryExist, doesFileExist,
                         listDirectory)
import System.Exit (exitFailure)
import System.IO (hPutStrLn, readFile', stderr)
import System.Process (readProcess)
import Text.Read (readMaybe)
import Hakyll
import Contexts (siteCtx)
import Backlinks (referencedByField)
import SimilarLinks (similarLinksField)
import ArchiveIndex (ArchiveStatus (..), statusName,
                     archiveStatusForSlug, normalizeUrl)
-- ---------------------------------------------------------------------------
-- Data model
-- ---------------------------------------------------------------------------
-- | One authored entry in @archive/manifest.yaml@ — only the fields this
-- module consumes. @title:@, @type:@ and @tags:@ are read by
-- @tools/archive.py@ (title and type fold into PROVENANCE.json; tags are
-- Phase 4) and need no Haskell-side binding.
data ManifestEntry = ManifestEntry
    { meUrl        :: String
    , meNote       :: Maybe String
    , mePaywalled  :: Bool
    , meVisibility :: String  -- ^ "public" (default) | "private"
    }

instance A.FromJSON ManifestEntry where
    parseJSON = A.withObject "ManifestEntry" $ \o -> do
        url        <- o .:  "url"
        note       <- o .:? "note"
        paywalled  <- fromMaybe False    <$> o .:? "paywalled"
        visibility <- fromMaybe "public" <$> o .:? "visibility"
        -- A publication/privacy field must fail closed: an unknown value
        -- (e.g. a typo'd "privte") would otherwise be treated as public
        -- and publish an artifact the author intended to keep offline.
        when (visibility `notElem` ["public", "private"]) $ fail $
            "manifest entry " ++ url
            ++ ": visibility must be \"public\" or \"private\", got "
            ++ show visibility
        return (ManifestEntry url note paywalled visibility)

newtype RemovedEntry = RemovedEntry { reUrl :: String }
instance A.FromJSON RemovedEntry where
    parseJSON = A.withObject "RemovedEntry" $ \o ->
        RemovedEntry <$> o .: "url"

-- | One generated @archive/<slug>/PROVENANCE.json@ — the immutable
-- record of an archival event, written by @tools/archive.py@.
data Provenance = Provenance
    { pvUrl      :: String
    , pvSlug     :: String
    , pvTitle    :: String
    , pvType     :: String        -- ^ "pdf" | "html"
    , pvArtifact :: String        -- ^ "document.pdf" | "snapshot.html"
    , pvSha256   :: String
    , pvBytes    :: Integer
    , pvArchived :: String
    , pvQuality  :: String        -- ^ "ok" | "degraded" | "js-required"
    , pvWayback  :: Maybe String
    }

instance A.FromJSON Provenance where
    parseJSON = A.withObject "Provenance" $ \o -> Provenance
        <$> o .:  "url"
        <*> o .:  "slug"
        <*> o .:  "title"
        <*> o .:  "type"
        <*> o .:  "artifact"
        <*> o .:  "sha256"
        <*> o .:  "bytes"
        <*> o .:  "archived"
        <*> (fromMaybe "ok" <$> o .:? "snapshot-quality")
        <*> o .:? "wayback"

-- | A renderable archive entry: the authored manifest line joined with
-- its generated provenance and extracted full text. @aeTextId@ is the
-- on-disk path of the extracted-text sidecar when it exists (it is
-- gitignored, so a no-@.venv@ build may lack it).
data ArchiveEntry = ArchiveEntry
    { aeManifest :: ManifestEntry
    , aeProv     :: Provenance
    , aeFulltext :: String
    , aeTextId   :: Maybe FilePath
    , aeStatus   :: ArchiveStatus  -- ^ link-rot status of the original
    }

-- | The extracted-text sidecar name for an artifact type.
textFileFor :: Provenance -> String
textFileFor pv
    | pvType pv == "html" = "snapshot.txt"
    | otherwise           = "document.txt"

-- | True for a @visibility: private@ entry — kept in-repo as a local
-- preservation copy, but its artifact is never routed to @_site/@ and
-- its extracted text is never rendered into the page.
isPrivate :: ArchiveEntry -> Bool
isPrivate = (== "private") . meVisibility . aeManifest
-- ---------------------------------------------------------------------------
-- Rule-generation-time IO (runs inside 'preprocess')
-- ---------------------------------------------------------------------------
manifestPath, removedPath :: FilePath
manifestPath = "archive/manifest.yaml"
removedPath = "archive/removed.yaml"
-- | Read @archive/manifest.yaml@. An absent file yields an empty list
-- (the archive degrades to invisible, matching the @.venv@-gated
-- silent-skip convention). A *parse error on a present file* halts the
-- build: the file exists but is broken — degrading to invisible would
-- swallow real errors like a typo'd @visibility@ value or a malformed
-- entry, both of which are publication-relevant.
readManifest :: IO [ManifestEntry]
readManifest = do
    exists <- doesFileExist manifestPath
    if not exists
        then return []
        else do
            parsed <- Y.decodeFileEither manifestPath
            case parsed of
                Right es -> return es
                Left e   -> do
                    hPutStrLn stderr $
                        "[archive] FATAL: manifest.yaml: " ++ show e
                    exitFailure

readRemovedUrls :: IO (Set.Set T.Text)
readRemovedUrls = do
    exists <- doesFileExist removedPath
    if not exists
        then return Set.empty
        else do
            parsed <- Y.decodeFileEither removedPath
            case parsed of
                Right entries -> return . Set.fromList $
                    map (normalizeUrl . T.pack . reUrl) (entries :: [RemovedEntry])
                Left e -> do
                    hPutStrLn stderr $
                        "[archive] FATAL: removed.yaml: " ++ show e
                    exitFailure

-- | Fail the build if a manifest URL is also recorded in @removed.yaml@,
-- or if two manifest URLs normalise to the same archive target.
validateManifestEntries :: [ManifestEntry] -> Set.Set T.Text -> IO ()
validateManifestEntries manifest removed = go Map.empty manifest
  where
    go _ [] = return ()
    go seen (entry : rest) = do
        let url  = meUrl entry
            norm = normalizeUrl (T.pack url)
        when (norm `Set.member` removed) $ do
            hPutStrLn stderr $
                "[archive] FATAL: manifest URL " ++ show url
                ++ " is also recorded in removed.yaml; refusing to publish "
                ++ "a deliberately removed work."
            exitFailure
        case Map.lookup norm seen of
            Just prior -> do
                hPutStrLn stderr $
                    "[archive] FATAL: manifest URLs " ++ show prior ++ " and "
                    ++ show url ++ " normalise to the same archive target."
                exitFailure
            Nothing -> go (Map.insert norm url seen) rest

-- | Scan @archive/<slug>/PROVENANCE.json@ into a @url -> (slug, Provenance)@
-- map. The directory name is the slug; the join key is the URL.
readProvenances :: IO (Map.Map String (String, Provenance))
readProvenances = do
    exists <- doesDirectoryExist "archive"
    if not exists
        then return Map.empty
        else do
            names <- listDirectory "archive"
            entries <- forM names $ \name -> do
                let provPath = "archive/" ++ name ++ "/PROVENANCE.json"
                isFile <- doesFileExist provPath
                if not isFile
                    then return Nothing
                    else do
                        decoded <- A.eitherDecodeFileStrict' provPath
                        case decoded of
                            Right p -> return (Just (pvUrl p, (name, p)))
                            Left e  -> do
                                hPutStrLn stderr $
                                    "[archive] FATAL: " ++ provPath ++ ": " ++ show e
                                exitFailure
            return (Map.fromList (catMaybes entries))

-- | Read a file, returning "" on any error (e.g. an absent text sidecar).
readFileSafe :: FilePath -> IO String
readFileSafe path =
    catch (readFile' path) (\(_ :: SomeException) -> return "")

-- | Verify a committed artifact's SHA-256 against its recorded value.
-- The build halts with a clear message on mismatch — so the integrity
-- guarantee holds even when @tools/archive.py@ does not run first
-- (e.g. no @.venv@, or a direct @cabal run site -- build@), and a
-- tampered or corrupted artifact can never be deployed.
--
-- Shells out to @sha256sum@ (GNU coreutils — same toolchain the rest of
-- the build assumes); a missing or non-zero @sha256sum@ surfaces as an
-- exception that also halts the build.
verifyArtifactSha :: String -> FilePath -> String -> IO ()
verifyArtifactSha slug path expected = do
    out <- readProcess "sha256sum" [path] ""
    let actual = takeWhile (/= ' ') out
    when (actual /= expected) $ do
        hPutStrLn stderr $
            "[archive] FATAL: " ++ slug ++ ": " ++ path
            ++ " SHA-256 mismatch (recorded " ++ expected
            ++ ", found " ++ actual
            ++ "). The committed artifact is corrupt or was replaced; "
            ++ "halting build."
        exitFailure

-- | Join the authored manifest with generated provenance. A manifest
-- entry with no matching provenance is dropped, so it produces no page.
-- An entry whose PROVENANCE.json exists but whose artifact is not on
-- disk halts the build, as does a SHA-256 mismatch.
loadArchiveEntries :: IO [ArchiveEntry]
loadArchiveEntries = do
    manifest <- readManifest
    removed  <- readRemovedUrls
    validateManifestEntries manifest removed
    provByUrl <- readProvenances
    fmap catMaybes $ forM manifest $ \me ->
        case Map.lookup (meUrl me) provByUrl of
            Nothing -> return Nothing
            Just (slug, pv) -> do
                let dir     = "archive/" ++ slug
                    txtPath = dir ++ "/" ++ textFileFor pv
                    artPath = dir ++ "/" ++ pvArtifact pv
                artifactThere <- doesFileExist artPath
                if not artifactThere
                    then do
                        hPutStrLn stderr $
                            "[archive] FATAL: " ++ slug ++ ": " ++ artPath
                            ++ " is missing although PROVENANCE.json exists; "
                            ++ "restore the committed artifact before building."
                        exitFailure
                    else do
                        verifyArtifactSha slug artPath (pvSha256 pv)
                        txtThere <- doesFileExist txtPath
                        txt <- if txtThere then readFileSafe txtPath
                                           else return ""
                        return $ Just ArchiveEntry
                            { aeManifest = me
                            , aeProv     = pv
                            , aeFulltext = txt
                            , aeTextId   = if txtThere then Just txtPath
                                                       else Nothing
                            , aeStatus   = archiveStatusForSlug slug
                            }

-- ---------------------------------------------------------------------------
-- Rules
-- ---------------------------------------------------------------------------
-- | All archive rules. Called once from 'Site.rules'.
archiveRules :: Rules ()
archiveRules = do
    entries <- preprocess loadArchiveEntries

    -- Raw artifacts: the PDF / HTML snapshot of every *public* entry,
    -- served at its own path (/archive/<slug>/...). Routing this explicit
    -- list rather than a glob means a `visibility: private` entry's
    -- artifact is never deployed, and an orphan directory's artifact
    -- (no manifest line) is not deployed either.
    let publicArtifacts =
            [ fromFilePath ("archive/" ++ pvSlug (aeProv e)
                            ++ "/" ++ pvArtifact (aeProv e))
            | e <- entries, not (isPrivate e) ]
    match (fromList publicArtifacts) $ do
        route   idRoute
        compile copyFileCompiler

    -- Provenance, extracted text, and the manifest: matched (not routed)
    -- so the generated pages can `load` them as dependencies and recompile
    -- when they change.
    match "archive/*/PROVENANCE.json" $ compile getResourceBody
    match "archive/*/document.txt"    $ compile getResourceBody
    match "archive/*/snapshot.txt"    $ compile getResourceBody
    match "archive/manifest.yaml"     $ compile getResourceBody

    mapM_ archiveEntryRule entries
    archiveIndexRule entries

-- | One @/archive/<slug>/@ page.
archiveEntryRule :: ArchiveEntry -> Rules ()
archiveEntryRule ae =
    create [fromFilePath ("archive/" ++ slug ++ "/index.html")] $ do
        route idRoute
        compile $ do
            -- Dependency edges: recompile when provenance or the manifest
            -- changes. The extracted-text sidecar is gitignored and may be
            -- absent (no .venv / fetch never ran); load it as a dependency
            -- only when present, so the build never fails for a missing
            -- generated file.
            _ <- load provId     :: Compiler (Item String)
            _ <- load manifestId :: Compiler (Item String)
            case aeTextId ae of
                Just tp -> do
                    _ <- load (fromFilePath tp) :: Compiler (Item String)
                    return ()
                Nothing -> return ()
            makeItem ""
                >>= loadAndApplyTemplate "templates/archive.html" ctx
                >>= loadAndApplyTemplate "templates/default.html" ctx
                >>= relativizeUrls
  where
    slug       = pvSlug (aeProv ae)
    provId     = fromFilePath ("archive/" ++ slug ++ "/PROVENANCE.json")
    manifestId = fromFilePath manifestPath
    ctx        = archiveEntryCtx ae

-- | The @/archive/@ index — every archived work, newest snapshot first.
archiveIndexRule :: [ArchiveEntry] -> Rules ()
archiveIndexRule entries =
    create ["archive/index.html"] $ do
        route idRoute
        compile $ do
            -- Recompile when any provenance appears / changes, or the
            -- manifest changes.
            _ <- loadAll "archive/*/PROVENANCE.json" :: Compiler [Item String]
            _ <- load (fromFilePath manifestPath) :: Compiler (Item String)
            let sorted = sortBy (comparing (Down . pvArchived . aeProv)) entries
                items  = map (\e -> Item (fromFilePath ("archive/" ++ pvSlug (aeProv e))) e)
                             sorted
                ctx    = listField "entries" entryListCtx (return items)
                         <> constField "title" "Archive"
                         <> constField "archive" "true"
                         <> constField "noindex" "true"
                         <> (if null entries then mempty
                             else constField "has-entries" "true")
                         <> siteCtx
            makeItem ""
                >>= loadAndApplyTemplate "templates/archive-index.html" ctx
                >>= loadAndApplyTemplate "templates/default.html" ctx
                >>= relativizeUrls

-- ---------------------------------------------------------------------------
-- Contexts
-- ---------------------------------------------------------------------------
-- | Per-entry context for the @/archive/<slug>/@ page.
archiveEntryCtx :: ArchiveEntry -> Context String
archiveEntryCtx ae = mconcat
    [ constField "title"            (pvTitle pv)
    , constField "archive"          "true"
    , constField "noindex"          "true"
    , constField "original-url"     (meUrl me)
    , constField "archived"         (pvArchived pv)
    , constField "archive-type"     (pvType pv)
    , constField "sha-short"        (take 12 (pvSha256 pv))
    , constField "size"             (formatBytes (pvBytes pv))
    , constField "snapshot-quality" (pvQuality pv)
    , constField "status"           (statusName (aeStatus ae))
    , qualityFlag
    , maybeField "status-note" (statusNote (aeStatus ae))
    , maybeField "note"        (meNote me)
    , maybeField "wayback"     (pvWayback pv)
    , maybeField "paywalled"   (if mePaywalled me then Just "true" else Nothing)
    , visibilityFields
      -- "Referenced by" (the pages that cite this work) and "Related"
      -- (semantically near content). Both resolve by this page's route, so
      -- they need no archive-specific wiring; each is a $if(...)$-guarded
      -- section in archive.html.
    , referencedByField
    , similarLinksField
    , siteCtx
    ]
  where
    me     = aeManifest ae
    pv     = aeProv ae
    slug   = pvSlug pv
    artUrl = "/archive/" ++ slug ++ "/" ++ pvArtifact pv

    -- A non-'ok' snapshot raises a visible flag on the page.
    qualityFlag
        | pvQuality pv == "ok" = mempty
        | otherwise            = constField "degraded" "true"

    -- A private entry keeps a local preservation copy but publishes none
    -- of it: no embed, no extracted text, only the provenance metadata
    -- and a 'held offline' note. A public entry embeds the artifact raw
    -- (the browser renders the PDF natively, the snapshot loads directly;
    -- no PDF.js wrapper) and renders its extracted text into the page.
    -- The is-pdf / is-html flag drives only the iframe sandbox: a
    -- third-party HTML snapshot is sandboxed, our own committed PDF is not.
    visibilityFields
        | isPrivate ae = constField "private" "true"
        | otherwise    = typeField
                         <> constField "artifact-url"  artUrl
                         <> constField "artifact-name" (pvArtifact pv)
                         <> fulltextField (pvType pv) (aeFulltext ae)

    typeField
        | pvType pv == "html" = constField "is-html" "true"
        | otherwise           = constField "is-pdf"  "true"

-- | Renders the extracted full text into the page DOM so embed.py and
-- Pagefind index real text, not an opaque iframe. PDF text keeps its
-- pdftotext layout in a @<pre>@; HTML text is block-separated prose, so
-- it renders as escaped @<p>@ paragraphs. Absent when the text is empty
-- / whitespace, so the @$if(fulltext)$@ guard hides the section.
fulltextField :: String -> String -> Context String
fulltextField ftype txt
    | all isBlank txt = mempty
    | ftype == "html" = constField "fulltext" (htmlParagraphs txt)
    | otherwise       = constField "fulltext" preBlock
  where
    isBlank c = c == ' ' || c == '\n' || c == '\t' || c == '\r'
    preBlock  = "<pre class=\"archive-fulltext\">"
                ++ escapeHtml txt ++ "</pre>"

-- | Block-separated text (paragraphs delimited by blank lines, as
-- @archive.py@'s HTML extractor writes it) → escaped @<p>@ elements.
htmlParagraphs :: String -> String
htmlParagraphs = concatMap para . paragraphsOf
  where
    para p       = "<p>" ++ escapeHtml p ++ "</p>\n"
    paragraphsOf = map (unwords . concatMap words)
                 . filter (not . blankGroup)
                 . groupBy ((==) `on` blankLine)
                 . lines
    blankGroup g = null g || blankLine (head g)
    blankLine    = all (`elem` (" \t\r" :: String))

-- | List-item context for the @/archive/@ index.
entryListCtx :: Context ArchiveEntry
entryListCtx = mconcat
    [ field "entry-title"        (return . pvTitle    . aeProv . itemBody)
    , field "entry-archived"     (return . pvArchived . aeProv . itemBody)
    , field "entry-type"         (return . pvType     . aeProv . itemBody)
    , field "entry-quality"      (return . pvQuality  . aeProv . itemBody)
    , boolField "entry-degraded" ((/= "ok") . pvQuality . aeProv . itemBody)
    , boolField "entry-private"  (isPrivate . itemBody)
    , field "entry-status"       (return . statusName . aeStatus . itemBody)
    , boolField "entry-rotted"   ((== Rotted) . aeStatus . itemBody)
    , field "entry-url"          (\i -> return $
          "/archive/" ++ pvSlug (aeProv (itemBody i)) ++ "/")
    ]

-- | Provide a field only when the value is present; otherwise contribute
-- nothing, so the template's @$if(...)$@ guard is false.
maybeField :: String -> Maybe String -> Context String
maybeField k = maybe mempty (constField k)
-- | A prose note for a non-live link-rot status, shown on the archive
-- page; 'Nothing' for 'Live' / 'Error' (no note rendered).
statusNote :: ArchiveStatus -> Maybe String
statusNote Rotted = Just "The original is no longer reachable. This archived \
\copy is now the live link."
statusNote Moved = Just "The original page has moved since this snapshot was \
\taken; the link above may redirect."
statusNote _ = Nothing
-- ---------------------------------------------------------------------------
-- Formatting
-- ---------------------------------------------------------------------------
-- | Human-readable byte count (mirrors the helper in build/Stats.hs).
formatBytes :: Integer -> String
formatBytes b
    | b < 1024        = show b ++ " B"
    | b < 1024 * 1024 = showD (b * 10 `div` 1024) ++ " KB"
    | otherwise       = showD (b * 10 `div` (1024 * 1024)) ++ " MB"
  where
    showD n = show (n `div` 10) ++ "." ++ show (n `mod` 10)

-- ---------------------------------------------------------------------------
-- /build/ telemetry
-- ---------------------------------------------------------------------------
-- | Archive metrics for the @/build/@ telemetry page — count, total size,
-- median artifact age, breakdowns by link-rot status / snapshot quality
-- / visibility, the paywalled count, and any orphan directories.
-- Rendered by @Stats.hs@; an empty archive yields just the count.
archiveBuildStats :: IO [(String, String)]
archiveBuildStats = do
    entries <- loadArchiveEntries
    today   <- utctDay <$> getCurrentTime
    orphans <- findOrphanDirs entries
    let n         = length entries
        bytes     = sum (map (pvBytes . aeProv) entries)
        ages      = [ fromInteger (diffDays today d)
                    | e <- entries
                    , Just d <- [parseIsoDay (pvArchived (aeProv e))] ]
        paywalled = length (filter (mePaywalled . aeManifest) entries)
    return $
        [ ("Entries", show n) ]
        ++ (if n == 0 then [] else
            [ ("Total size",    formatBytes bytes)
            , ("Median age",    medianAge ages)
            , ("By status",     tallyOf (map (statusName . aeStatus) entries))
            , ("By quality",    tallyOf (map (pvQuality . aeProv) entries))
            , ("By visibility", tallyOf (map (meVisibility . aeManifest) entries))
            ])
        ++ [ ("Paywalled", show paywalled) | paywalled > 0 ]
        ++ [ ("Orphan directories", unwords orphans) | not (null orphans) ]

-- | Directory names under @archive/@ that hold a @PROVENANCE.json@ but are
-- not a live manifest entry — drift the @/build/@ page should surface.
findOrphanDirs :: [ArchiveEntry] -> IO [String]
findOrphanDirs entries = do
    exists <- doesDirectoryExist "archive"
    if not exists
        then return []
        else do
            names <- listDirectory "archive"
            let live = map (pvSlug . aeProv) entries
            filterM
                (\name -> do
                    hasProv <- doesFileExist
                        ("archive/" ++ name ++ "/PROVENANCE.json")
                    return (hasProv && name `notElem` live))
                (sort names)

-- | Format a multiset of string values as @"a 2 \183 b 1"@.
tallyOf :: [String] -> String
tallyOf xs = intercalate " \183 "
    [ k ++ " " ++ show c
    | (k, c) <- Map.toList (Map.fromListWith (+) [ (x, 1 :: Int) | x <- xs ]) ]

-- | The median of a list of ages, as @"N days"@ (for an even-length list,
-- the upper of the two middle values); an em dash when empty.
medianAge :: [Int] -> String
medianAge [] = "\8212"
medianAge xs =
let m = sort xs !! (length xs `div` 2)
in show m ++ if m == 1 then " day" else " days"
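-- | Parse a @YYYY-MM-DD@ date; 'Nothing' on malformed input. Out-of-range
-- month or day values are not rejected: 'fromGregorian' clips them to the
-- nearest valid date.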
parseIsoDay :: String -> Maybe Day
parseIsoDay s = case splitOnDash s of
[y, m, d] -> fromGregorian <$> readMaybe y <*> readMaybe m <*> readMaybe d
_ -> Nothing
where
splitOnDash str = case break (== '-') str of
(a, '-' : rest) -> a : splitOnDash rest
(a, _) -> [a]
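
The two formatting helpers above can be exercised on their own. The following is a minimal sketch that copies `tallyOf` and `medianAge` from the diff (with indentation restored) so they can be loaded into GHCi; the sample inputs in the comments are made up for illustration and are not taken from the archive.

```haskell
import Data.List (intercalate, sort)
import qualified Data.Map.Strict as Map

-- Copied from the diff: count each distinct value, join with a middle dot.
-- Keys come out in Map order, i.e. sorted alphabetically.
tallyOf :: [String] -> String
tallyOf xs = intercalate " \183 "
  [ k ++ " " ++ show c
  | (k, c) <- Map.toList (Map.fromListWith (+) [ (x, 1 :: Int) | x <- xs ]) ]

-- Copied from the diff: index (length `div` 2) of the sorted list, so an
-- even-length list yields the upper of its two middle values.
medianAge :: [Int] -> String
medianAge [] = "\8212"
medianAge xs =
  let m = sort xs !! (length xs `div` 2)
  in  show m ++ if m == 1 then " day" else " days"

-- Illustrative inputs (hypothetical):
--   tallyOf ["live", "rotted", "live"]  gives  "live 2 \183 rotted 1"
--   medianAge [10, 1, 3, 7]             gives  "7 days"
```

Because the tally goes through a `Map`, the order of the breakdown is alphabetical by label rather than by frequency.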

build/ArchiveIndex.hs Normal file

@ -0,0 +1,255 @@
{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE OverloadedStrings #-}
-- | ArchiveIndex — shared read-only access to the archive's two JSON
-- sidecars: @data/archive-index.json@ (the @url\/alias -> slug@ map
-- written by @archive.py fetch@) and @data/archive-state.json@ (the
-- per-URL link-rot status written by @archive.py check@).
--
-- Consumers:
--
-- * @Filters.Archive@ — appends the archive affordance to body links
-- whose target is archived, and flips a @rotted@ link to the local
-- copy.
-- * @Backlinks@ — keeps archived external links through pass 1 and
-- canonicalises them to their @/archive/<slug>/@ page in pass 2.
-- * @Archive@ — surfaces each entry's rot status on its page, the
-- @/archive/@ index, and the @/build/@ telemetry.
--
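-- Both files are loaded once per build via @unsafePerformIO@ CAFs. An
-- absent or malformed sidecar degrades safely: an empty index makes the
-- link consumers no-op; an absent state file makes every entry @Live@
-- (the safe default: no link flip). The index is additionally filtered
-- against @archive/manifest.yaml@ minus @archive/removed.yaml@; unlike
-- the sidecars, a malformed YAML file there aborts the build.
-- @archive.py check@ is decoupled from @make build@; a build consumes
-- whatever state file exists.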
module ArchiveIndex
( ArchiveStatus (..)
, statusName
, archiveSlugFor
, archiveStatusForSlug
, archiveIndexIsEmpty
, normalizeUrl
) where
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)
import Data.Set (Set)
import qualified Data.Set as Set
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Aeson as A
import Data.Aeson ((.!=), (.:), (.:?))
import qualified Data.Yaml as Y
import System.Directory (doesFileExist)
import System.IO.Unsafe (unsafePerformIO)
-- ---------------------------------------------------------------------------
-- Link-rot status
-- ---------------------------------------------------------------------------
-- | The link-rot status of an archived work's original URL, as set by
-- @archive.py check@. 'Live' is the safe default for an unscanned or
-- unknown entry.
data ArchiveStatus = Live | Moved | Rotted | Error
deriving (Eq, Show)
-- | The lower-case wire name, matching @archive-state.json@ and the
-- @status:@ Pagefind filter tag.
statusName :: ArchiveStatus -> String
statusName Live = "live"
statusName Moved = "moved"
statusName Rotted = "rotted"
statusName Error = "error"
parseStatus :: Text -> ArchiveStatus
parseStatus "moved" = Moved
parseStatus "rotted" = Rotted
parseStatus "error" = Error
parseStatus _ = Live
-- ---------------------------------------------------------------------------
-- JSON shapes
-- ---------------------------------------------------------------------------
-- | One @archive-index.json@ entry. Only @slug@ and @aliases@ are used.
data IdxEntry = IdxEntry
{ ieSlug :: String
, ieAliases :: [Text]
}
instance A.FromJSON IdxEntry where
parseJSON = A.withObject "IdxEntry" $ \o -> IdxEntry
<$> o .: "slug"
<*> (o .:? "aliases" .!= [])
-- | One @archive-state.json@ entry — only the @status@ is consumed here.
newtype StateEntry = StateEntry { seStatus :: ArchiveStatus }
instance A.FromJSON StateEntry where
parseJSON = A.withObject "StateEntry" $ \o ->
StateEntry . parseStatus <$> (o .:? "status" .!= "live")
newtype UrlEntry = UrlEntry { ueUrl :: Text }
instance A.FromJSON UrlEntry where
parseJSON = A.withObject "UrlEntry" $ \o ->
UrlEntry <$> o .: "url"
-- ---------------------------------------------------------------------------
-- Loaded-once CAFs
-- ---------------------------------------------------------------------------
indexPath, statePath, manifestPath, removedPath :: FilePath
indexPath = "data/archive-index.json"
statePath = "data/archive-state.json"
manifestPath = "archive/manifest.yaml"
removedPath = "archive/removed.yaml"
readUrlSet :: FilePath -> IO (Set Text)
readUrlSet path = do
exists <- doesFileExist path
if not exists
then return Set.empty
else do
decoded <- Y.decodeFileEither path
case decoded of
Right entries -> return . Set.fromList $
map (normalizeUrl . ueUrl) (entries :: [UrlEntry])
Left e -> ioError . userError $
"[archive] FATAL: " ++ path ++ ": " ++ show e
-- | Canonical URLs still permitted to participate in link annotation.
-- Filtering the generated index at build time makes a direct Hakyll build
-- respect authored manifest/removal state even when archive.py did not run.
{-# NOINLINE activeUrls #-}
activeUrls :: Set Text
activeUrls = unsafePerformIO $ do
manifest <- readUrlSet manifestPath
removed <- readUrlSet removedPath
return (manifest `Set.difference` removed)
-- | @canonical-url -> entry@. Absent/malformed file -> empty; entries no
-- longer permitted by the authored manifest/removal state are removed.
-- The existence check is needed because @A.eitherDecodeFileStrict'@ only
-- reports parse failures as 'Left'; a missing file raises an IOException.
{-# NOINLINE rawIndex #-}
rawIndex :: Map Text IdxEntry
rawIndex = unsafePerformIO $ do
    exists <- doesFileExist indexPath
    parsed <- if not exists
        then return Map.empty
        else either (const Map.empty) id
                 <$> A.eitherDecodeFileStrict' indexPath
    return $ Map.filterWithKey
        (\canon _ -> normalizeUrl canon `Set.member` activeUrls)
        parsed
-- | @url -> status@. Absent/malformed file -> empty (every entry 'Live').
{-# NOINLINE rawState #-}
rawState :: Map Text ArchiveStatus
rawState = unsafePerformIO $ do
    exists <- doesFileExist statePath
    if not exists
        then return Map.empty
        else either (const Map.empty) (Map.map seStatus)
                 <$> A.eitherDecodeFileStrict' statePath
-- | @normalised-url -> slug@: the canonical key and every alias from
-- @archive-index.json@, each fed through 'normalizeUrl'. Both keys and
-- lookups are normalised, so a citation form the alias set cannot
-- enumerate (e.g. an unbounded arXiv version, or any tracking-laden
-- variant of a clean manifest URL) still resolves.
{-# NOINLINE flatIndex #-}
flatIndex :: Map Text String
flatIndex = Map.fromList
[ (normalizeUrl key, ieSlug e)
| (canon, e) <- Map.toList rawIndex
, key <- canon : ieAliases e
]
-- | @slug -> status@: each entry's status, looked up by its canonical URL
-- in the state file (the two files share the manifest URL as key).
{-# NOINLINE slugStatus #-}
slugStatus :: Map String ArchiveStatus
slugStatus = Map.fromList
[ (ieSlug e, Map.findWithDefault Live canon rawState)
| (canon, e) <- Map.toList rawIndex
]
-- ---------------------------------------------------------------------------
-- Public lookups
-- ---------------------------------------------------------------------------
-- | True when the filtered index has no entries (file absent, malformed,
-- or every entry excluded by the manifest/removal state); the link
-- consumers then no-op.
archiveIndexIsEmpty :: Bool
archiveIndexIsEmpty = Map.null rawIndex
-- | The archive slug for an outbound URL, or 'Nothing'. Both the index
-- keys and the input go through 'normalizeUrl', so a citation form that
-- the alias set cannot enumerate — an unbounded arXiv version, or any
-- tracking-laden variant of a clean manifest URL — still resolves.
archiveSlugFor :: Text -> Maybe String
archiveSlugFor url = Map.lookup (normalizeUrl url) flatIndex
-- | The link-rot status of an archived entry, by slug. 'Live' for an
-- unknown slug or when no scan has run.
archiveStatusForSlug :: String -> ArchiveStatus
archiveStatusForSlug slug = Map.findWithDefault Live slug slugStatus
-- ---------------------------------------------------------------------------
-- URL normalisation (matching, not display)
-- ---------------------------------------------------------------------------
-- | Tracking-only query parameters: their presence or absence is
-- semantically irrelevant; the lookup strips them before matching.
-- Sync with @TRACKING_PARAMS@ in @tools/archive.py@.
trackingParams :: [Text]
trackingParams =
[ "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"
, "fbclid", "gclid", "mc_eid", "mc_cid", "ref", "igshid"
, "_hsenc", "_hsmi", "mkt_tok"
]
-- | Remove tracking-only query parameters; preserve every other parameter
-- in its original order.
stripTracking :: Text -> Text
stripTracking url = case T.breakOn "?" url of
(_, "") -> url
(path, q) ->
let kept = filter notTracking (T.splitOn "&" (T.drop 1 q))
in if null kept then path
else path <> "?" <> T.intercalate "&" kept
where
notTracking p = T.takeWhile (/= '=') p `notElem` trackingParams
-- | The canonical form of an arXiv URL: @https://arxiv.org/abs/<id>@ with
-- no version suffix and no @.pdf@. Maps every member of the
-- abs/pdf/versioned/@.pdf@ family to the same key. Non-arXiv passes through.
arxivCanonical :: Text -> Text
arxivCanonical url
| Just rest <- T.stripPrefix "https://arxiv.org/" url
, Just key <- arxivKey rest = key
| Just rest <- T.stripPrefix "http://arxiv.org/" url
, Just key <- arxivKey rest = key
| otherwise = url
where
arxivKey rest = case T.breakOn "/" rest of
(kind, slashId)
| kind `elem` ["abs", "pdf"], not (T.null slashId) ->
Just $ "https://arxiv.org/abs/"
<> stripVer (stripPdfSuf (T.tail slashId))
_ -> Nothing
stripPdfSuf t = fromMaybe t (T.stripSuffix ".pdf" t)
stripVer t = case T.breakOnEnd "v" t of
(before, ver)
| not (T.null before)
, not (T.null ver)
, T.all isAsciiDigit ver
-> T.dropEnd 1 before
_ -> t
isAsciiDigit c = c >= '0' && c <= '9'
-- | The full normalisation: drop fragment, strip tracking, fold
-- @http://@→@https://@, arXiv-canonicalise, trim any trailing slashes.
-- Both 'flatIndex' keys and 'archiveSlugFor' inputs go through this so
-- the index never misses a citation form the design promises to match.
normalizeUrl :: Text -> Text
normalizeUrl url =
let noFrag = T.takeWhile (/= '#') url
clean = stripTracking noFrag
https = case T.stripPrefix "http://" clean of
Just rest -> "https://" <> rest
Nothing -> clean
arxiv = arxivCanonical https
in T.dropWhileEnd (== '/') arxiv
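
To see what the normalisation pipeline does to concrete URLs, here is a self-contained sketch of the same steps. It follows the functions in the diff, with two simplifications that are assumptions of this sketch: `trackingParams` is abbreviated, and `arxivCanonical` handles only the `https://` prefix (which suffices inside `normalizeUrl`, because the scheme is folded first). The example URLs are invented.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import qualified Data.Text as T

-- Abbreviated for the sketch; the source lists more parameters.
trackingParams :: [Text]
trackingParams = ["utm_source", "utm_medium", "fbclid", "gclid", "ref"]

-- Drop tracking-only query parameters, keep the rest in order.
stripTracking :: Text -> Text
stripTracking url = case T.breakOn "?" url of
  (_, "")   -> url
  (path, q) ->
    let kept = filter notTracking (T.splitOn "&" (T.drop 1 q))
    in  if null kept then path else path <> "?" <> T.intercalate "&" kept
  where
    notTracking p = T.takeWhile (/= '=') p `notElem` trackingParams

-- Map abs/pdf/versioned arXiv URLs to https://arxiv.org/abs/<id>.
arxivCanonical :: Text -> Text
arxivCanonical url
  | Just rest <- T.stripPrefix "https://arxiv.org/" url
  , Just key  <- arxivKey rest = key
  | otherwise = url
  where
    arxivKey rest = case T.breakOn "/" rest of
      (kind, slashId)
        | kind `elem` ["abs", "pdf"], not (T.null slashId) ->
            Just ("https://arxiv.org/abs/"
                    <> stripVer (stripPdfSuf (T.drop 1 slashId)))
      _ -> Nothing
    stripPdfSuf t = fromMaybe t (T.stripSuffix ".pdf" t)
    -- breakOnEnd keeps the "v" at the end of the first component.
    stripVer t = case T.breakOnEnd "v" t of
      (before, ver)
        | not (T.null before), not (T.null ver), T.all isAsciiDigit ver
            -> T.dropEnd 1 before
      _ -> t
    isAsciiDigit c = c >= '0' && c <= '9'

normalizeUrl :: Text -> Text
normalizeUrl url =
  let noFrag = T.takeWhile (/= '#') url
      clean  = stripTracking noFrag
      https  = maybe clean ("https://" <>) (T.stripPrefix "http://" clean)
  in  T.dropWhileEnd (== '/') (arxivCanonical https)

-- Illustrative inputs (hypothetical URLs):
--   "http://arxiv.org/pdf/2301.00001v3.pdf?utm_source=x#page=4"
--       normalises to "https://arxiv.org/abs/2301.00001"
--   "https://example.com/post/?ref=hn&id=7"
--       normalises to "https://example.com/post/?id=7"
--   "https://example.com/post/"
--       normalises to "https://example.com/post"
```

The second example shows a consequence of the step order: trailing slashes are trimmed only at the very end of the string, so a slash that precedes a surviving query string is kept.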


@ -25,9 +25,11 @@
module Backlinks
( backlinkRules
, backlinksField
, referencedByField
) where
import Data.List (nubBy, sortBy)
import Data.List (nubBy, partition, sortBy,
stripPrefix)
import Data.Ord (comparing)
import Data.Maybe (fromMaybe)
import qualified Data.Map.Strict as Map
@ -50,6 +52,7 @@ import Hakyll
import Compilers (readerOpts, writerOpts)
import Filters (preprocessSource)
import qualified Patterns as P
import ArchiveIndex (archiveSlugFor)
-- ---------------------------------------------------------------------------
-- Link-with-context entry (intermediate, saved by the "links" pass)
@ -85,6 +88,7 @@ data BacklinkSource = BacklinkSource
, blAbstract :: String
, blSentence :: String -- raw HTML of the sentence containing the link
, blParagraph :: String -- raw HTML of the full paragraph (hover popup)
, blFragment :: String -- archived-target fragment (no '#'), else ""
} deriving (Show, Eq, Ord)
instance Aeson.ToJSON BacklinkSource where
@ -94,16 +98,18 @@ instance Aeson.ToJSON BacklinkSource where
, "abstract" .= blAbstract bl
, "sentence" .= blSentence bl
, "paragraph" .= blParagraph bl
, "fragment" .= blFragment bl
]
instance Aeson.FromJSON BacklinkSource where
parseJSON = Aeson.withObject "BacklinkSource" $ \o ->
BacklinkSource
<$> o Aeson..: "url"
<*> o Aeson..: "title"
<*> o Aeson..: "abstract"
<*> o Aeson..: "sentence"
<*> o Aeson..: "paragraph"
<$> o Aeson..: "url"
<*> o Aeson..: "title"
<*> o Aeson..: "abstract"
<*> o Aeson..: "sentence"
<*> o Aeson..: "paragraph"
<*> o Aeson..:? "fragment" Aeson..!= ""
-- ---------------------------------------------------------------------------
-- Writer options for context rendering
@ -125,15 +131,22 @@ contextWriterOpts = writerOpts
-- | URL filter: skip external links, pseudo-schemes, anchor-only fragments,
-- and static-asset paths.
isPageLink :: T.Text -> Bool
isPageLink u =
not (T.isPrefixOf "http://" u) &&
not (T.isPrefixOf "https://" u) &&
not (T.isPrefixOf "#" u) &&
not (T.isPrefixOf "mailto:" u) &&
not (T.isPrefixOf "tel:" u) &&
not (T.null u) &&
not (hasStaticExt u)
isPageLink u
-- An archived external URL is kept regardless of scheme or extension:
-- pass 2 inverts it to its /archive/<slug>/ page.
| isArchived = True
| otherwise =
not (T.isPrefixOf "http://" u) &&
not (T.isPrefixOf "https://" u) &&
not (T.isPrefixOf "#" u) &&
not (T.isPrefixOf "mailto:" u) &&
not (T.isPrefixOf "tel:" u) &&
not (T.null u) &&
not (hasStaticExt u)
where
isArchived = case archiveSlugFor u of
Just _ -> True
Nothing -> False
staticExts = [".pdf",".svg",".png",".jpg",".jpeg",".webp",
".mp3",".mp4",".woff2",".woff",".ttf",".ico",
".json",".asc",".xml",".gz",".zip"]
@ -289,6 +302,28 @@ percentDecode = T.unpack . TE.decodeUtf8With lenientDecode . pack . go
pack = BS.pack
lenientDecode = TE.lenientDecode
-- ---------------------------------------------------------------------------
-- Archive-aware target keying
-- ---------------------------------------------------------------------------
-- | The @data/backlinks.json@ key an outbound URL inverts to. An archived
-- external URL canonicalises to its @/archive/<slug>/@ page key — computed
-- exactly as 'backlinksFieldWith' computes the archive page's own key (the
-- same string fed through 'normaliseUrl'), so the two always agree. Every
-- other URL is normalised as before.
targetKey :: T.Text -> T.Text
targetKey u = case archiveSlugFor u of
Just slug -> T.pack (normaliseUrl ("/archive/" ++ slug ++ "/index.html"))
Nothing -> T.pack (normaliseUrl (T.unpack u))
-- | The fragment (without @#@) of an archived URL, for granular grouping
-- of "Referenced by". Empty for a non-archived URL or one with no fragment
-- — so granular grouping stays an archive-only behaviour.
archiveFragment :: T.Text -> String
archiveFragment u = case archiveSlugFor u of
Just _ -> T.unpack (T.drop 1 (T.dropWhile (/= '#') u))
Nothing -> ""
-- ---------------------------------------------------------------------------
-- Content patterns (must match the rules in Site.hs — sourced from
-- Patterns.allContent so additions to the canonical list automatically
@ -337,10 +372,11 @@ toSourcePairs item = do
:: Maybe [LinkEntry] of
Nothing -> return []
Just entries ->
return [ ( T.pack (normaliseUrl (T.unpack (leUrl e)))
return [ ( targetKey (leUrl e)
, BacklinkSource srcUrl title abstract
(leSentence e)
(leParagraph e)
(archiveFragment (leUrl e))
)
| e <- entries ]
@ -352,7 +388,20 @@ toSourcePairs item = do
-- to the current page, each with its paragraph context.
-- Returns @noResult@ (so @$if(backlinks)$@ is false) when there are none.
backlinksField :: Context String
backlinksField = field "backlinks" $ \item -> do
backlinksField = backlinksFieldWith renderBacklinks "backlinks"
-- | "Referenced by" for archive pages. Same lookup as 'backlinksField',
-- but the sources are grouped by the fragment each citation targets, so an
-- archived work's page can show which section/page each citing essay points
-- at (granular backlinks).
referencedByField :: Context String
referencedByField = backlinksFieldWith renderReferencedBy "referenced-by"
-- | Shared machinery for 'backlinksField' and 'referencedByField': look the
-- page up in @data/backlinks.json@ by its normalised route, then hand the
-- sorted sources to the given renderer.
backlinksFieldWith :: ([BacklinkSource] -> String) -> String -> Context String
backlinksFieldWith renderSources name = field name $ \item -> do
blItem <- load (fromFilePath "data/backlinks.json") :: Compiler (Item String)
case Aeson.decodeStrict (TE.encodeUtf8 (T.pack (itemBody blItem)))
:: Maybe (Map T.Text [BacklinkSource]) of
@ -367,7 +416,7 @@ backlinksField = field "backlinks" $ \item -> do
sorted = sortBy (comparing blTitle) sources
in if null sorted
then fail "no backlinks"
else return (renderBacklinks sorted)
else return (renderSources sorted)
-- ---------------------------------------------------------------------------
-- HTML rendering
@ -384,25 +433,59 @@ backlinksField = field "backlinks" $ \item -> do
renderBacklinks :: [BacklinkSource] -> String
renderBacklinks sources =
"<ul class=\"backlinks-list\">\n"
++ concatMap renderOne sources
++ concatMap renderBacklinkItem sources
++ "</ul>"
where
renderOne bl =
"<li class=\"backlink-item\">"
++ "<a class=\"backlink-source\" href=\""
++ escapeHtml (blUrl bl) ++ "\">"
++ escapeHtml (blTitle bl) ++ "</a>"
++ ( if null (blSentence bl) then ""
else "<blockquote class=\"backlink-quote\">"
++ blSentence bl
++ paragraphAffordance bl
++ "</blockquote>" )
++ "</li>\n"
paragraphAffordance bl
| null (blParagraph bl) = ""
| blParagraph bl == blSentence bl = ""
| otherwise =
-- | "Referenced by", grouped by the fragment each citation targets.
-- Sources citing the work with no fragment render first as a plain list;
-- each distinct fragment then gets its own subheading. With no fragments
-- anywhere (the common case) this collapses to exactly the flat list.
renderReferencedBy :: [BacklinkSource] -> String
renderReferencedBy sources =
let (general, fragmented) = partition (null . blFragment) sources
groups = Map.toList $ Map.fromListWith (flip (++))
[ (blFragment s, [s]) | s <- fragmented ]
in renderList general ++ concatMap renderGroup groups
where
renderList [] = ""
renderList ss = "<ul class=\"backlinks-list\">\n"
++ concatMap renderBacklinkItem ss ++ "</ul>\n"
renderGroup (frag, ss) =
"<div class=\"referenced-by-group\">"
++ "<h3 class=\"referenced-by-fragment\">"
++ escapeHtml (fragmentLabel frag) ++ "</h3>"
++ renderList ss
++ "</div>\n"
-- | Human label for a cited fragment: a PDF @#page=N@ becomes "Page N";
-- any other @#anchor@ is shown verbatim behind a section mark.
fragmentLabel :: String -> String
fragmentLabel frag =
case stripPrefix "page=" frag of
Just n -> "Page " ++ n
Nothing -> "\x00A7 " ++ frag
-- | One backlink @<li>@: the source title as a link, the sentence of
-- context as a blockquote, and a hover affordance revealing the full
-- paragraph. 'blSentence' / 'blParagraph' are already HTML fragments from
-- the Pandoc writer, so they are emitted unescaped.
renderBacklinkItem :: BacklinkSource -> String
renderBacklinkItem bl =
"<li class=\"backlink-item\">"
++ "<a class=\"backlink-source\" href=\""
++ escapeHtml (blUrl bl) ++ "\">"
++ escapeHtml (blTitle bl) ++ "</a>"
++ ( if null (blSentence bl) then ""
else "<blockquote class=\"backlink-quote\">"
++ blSentence bl
++ paragraphAffordance
++ "</blockquote>" )
++ "</li>\n"
where
paragraphAffordance
| null (blParagraph bl) = ""
| blParagraph bl == blSentence bl = ""
| otherwise =
"<span class=\"backlink-full\">"
++ "<button type=\"button\" class=\"backlink-full-trigger\""
++ " aria-label=\"Show full paragraph\" tabindex=\"0\">\x00B6</button>"

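The fragment grouping in `renderReferencedBy` can be tried without Hakyll. The sketch below keeps the `partition` / `Map.fromListWith (flip (++))` logic and `fragmentLabel` from the diff, but replaces `BacklinkSource` with a hypothetical `(title, fragment)` pair and returns data instead of HTML; the sample titles are invented.

```haskell
import Data.List (partition, stripPrefix)
import qualified Data.Map.Strict as Map

-- Hypothetical, pared-down stand-in for BacklinkSource: (title, fragment).
type Source = (String, String)

-- Sources with no fragment first, then one group per distinct fragment.
-- `flip (++)` appends later sources after earlier ones, so each group
-- keeps the (already title-sorted) input order.
groupByFragment :: [Source] -> ([String], [(String, [String])])
groupByFragment sources =
  let (general, fragmented) = partition (null . snd) sources
      groups = Map.toList $ Map.fromListWith (flip (++))
                 [ (frag, [title]) | (title, frag) <- fragmented ]
  in  (map fst general, groups)

-- Copied from the diff: "page=N" becomes "Page N", anything else is
-- shown behind a section mark.
fragmentLabel :: String -> String
fragmentLabel frag = case stripPrefix "page=" frag of
  Just n  -> "Page " ++ n
  Nothing -> "\167 " ++ frag

-- Illustrative input (hypothetical):
--   groupByFragment [("A","page=2"), ("B",""), ("C","page=2"), ("D","intro")]
--   gives (["B"], [("intro",["D"]), ("page=2",["A","C"])])
```

Groups come out in `Map` key order, so fragments are sorted as strings; `page=10` would therefore sort before `page=2`.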

@ -48,7 +48,7 @@ import Text.Pandoc.Options (WriterOptions(..), HTMLMathMethod(..))
import Hakyll hiding (trim)
import Backlinks (backlinksField)
import Dingbat (dingbatField)
import Marks (monogramSvgField, epistemicSvgField)
import Marks (monogramSvgField, hasMonogramField, epistemicSvgField)
import SimilarLinks (similarLinksField)
import Stability (stabilityField, lastReviewedField, lastReviewedIsoField,
versionHistoryField,
@ -437,6 +437,7 @@ siteCtx =
<> summaryField
<> dingbatField
<> monogramSvgField
<> hasMonogramField
<> defaultContext
-- ---------------------------------------------------------------------------


@ -13,6 +13,7 @@ import qualified Filters.Typography as Typography
import qualified Filters.Links as Links
import qualified Filters.SourceRefs as SourceRefs
import qualified Filters.Smallcaps as Smallcaps
import qualified Filters.Archive as Archive
import qualified Filters.Dropcaps as Dropcaps
import qualified Filters.Math as Math
import qualified Filters.Wikilinks as Wikilinks
@ -40,6 +41,7 @@ applyAll srcDir doc = do
. Sidenotes.apply
. Typography.apply
. Links.apply
. Archive.apply
. Smallcaps.apply
. Dropcaps.apply
. Math.apply

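The position of `Archive.apply` in the chain matters because `(.)` composes right to left: the filter written lowest in the chain runs first. A minimal sketch with hypothetical stand-in filters that only record their own name:

```haskell
-- Hypothetical stand-ins for the real filters: each appends its name to
-- a log instead of transforming a Pandoc document.
smallcaps, archive, links :: [String] -> [String]
smallcaps = (++ ["Smallcaps"])
archive   = (++ ["Archive"])
links     = (++ ["Links"])

-- Same shape as the excerpt of applyAll shown in the hunk above.
pipeline :: [String] -> [String]
pipeline = links
         . archive
         . smallcaps

-- pipeline [] evaluates to ["Smallcaps", "Archive", "Links"]
```

So in the hunk above, `Archive.apply` sees the output of `Smallcaps.apply`, and `Links.apply` sees the output of `Archive.apply`.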
build/Filters/Archive.hs Normal file

@ -0,0 +1,82 @@
{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE OverloadedStrings #-}
-- | Filters.Archive — annotate (and, for dead links, redirect) body links
-- to archived works.
--
-- For every @Link@ whose URL matches an entry in @data/archive-index.json@
-- (the equivalent-URL alias set included):
--
-- * a 'live', 'moved' or (inconclusive) 'error' target keeps its
-- original link and gains a small superscript affordance pointing at
-- the local @/archive/<slug>/@ page — purely additive;
--
-- * a 'rotted' target (confirmed dead by @archive.py check@'s
-- hysteresis) has its primary link flipped to the archived copy, so
-- a reader of an old essay reaches a working snapshot instead of a
-- 404. An "archived" marker replaces the affordance.
--
-- Registered in 'Filters.applyAll' immediately after @Smallcaps@ and
-- before @Links@: it must see the smallcaps-rewritten text, and it emits
-- the affordance/marker as @RawInline@ so the downstream @Links@ pass
-- never re-classifies it.
--
-- No-op when @data/archive-index.json@ is absent. When no rot scan has
-- run, every entry is 'Live' — no link is ever flipped.
module Filters.Archive (apply) where
import qualified Data.Text as T
import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)
import ArchiveIndex (ArchiveStatus (..), archiveIndexIsEmpty,
archiveSlugFor, archiveStatusForSlug)
-- | Annotate body links. Headings are left alone — an affordance there
-- would be noise. Identity when the index is empty.
apply :: Pandoc -> Pandoc
apply doc@(Pandoc meta blocks)
| archiveIndexIsEmpty = doc
| otherwise = Pandoc meta (map annotateBlock blocks)
annotateBlock :: Block -> Block
annotateBlock h@Header{} = h
annotateBlock b = walk annotateInlines b
-- | For each archived @Link@: flip it if the target is 'Rotted', else
-- append the affordance. Non-archived links pass through untouched.
annotateInlines :: [Inline] -> [Inline]
annotateInlines = concatMap expand
where
expand l@(Link attr text (url, _)) =
case archiveSlugFor url of
Nothing -> [l]
Just slug -> case archiveStatusForSlug slug of
Rotted -> [flipped slug attr text, marker slug "rotted"
"The original is a dead link &mdash; \
\opens the local archived copy"]
_ -> [l, marker slug "" "Archived &mdash; \
\local preservation copy"]
expand x = [x]
-- | A 'Rotted' link, redirected to the local archived copy. Keeps the
-- link text; the @archive-rotted@ class lets CSS mark it.
flipped :: String -> Attr -> [Inline] -> Inline
flipped slug (ident, classes, kvs) text =
Link (ident, "archive-rotted" : classes, kvs) text
( T.pack ("/archive/" ++ slug ++ "/")
, "Original link is dead \8212 opens the local archived copy" )
-- | The superscript marker after the link: "A" for a normal affordance,
-- "archived" for a flipped dead link. Emitted as raw HTML so the
-- downstream @Links@ filter (which classifies @Link@ nodes) leaves it
-- alone. Slugs are @[a-z0-9-]@ by construction in @archive.py@.
marker :: String -> String -> T.Text -> Inline
marker slug modifier title = RawInline "html" $ T.concat
[ "<sup class=\"archive-affordance", modifierClass, "\">"
, "<a href=\"/archive/", T.pack slug, "/\" title=\"", title, "\">"
, label, "</a></sup>"
]
where
modifierClass = if null modifier
then ""
else " archive-affordance--" <> T.pack modifier
label = if null modifier then "A" else "archived"
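
The per-status behaviour of the filter reduces to one decision: only `Rotted` changes where the link points. The sketch below restates that decision without Pandoc types; `linkTreatment` is a hypothetical helper introduced here for illustration, and the URL and slug in the comments are invented.

```haskell
-- Same constructors as ArchiveIndex.ArchiveStatus.
data ArchiveStatus = Live | Moved | Rotted | Error
  deriving (Eq, Show)

-- | Hypothetical summary of the filter's decision: the href the primary
-- link ends up with, and the label of the superscript marker after it.
linkTreatment :: String -> String -> ArchiveStatus -> (String, String)
linkTreatment original slug status = case status of
  Rotted -> ("/archive/" ++ slug ++ "/", "archived")  -- link is flipped
  _      -> (original, "A")                           -- purely additive

-- Illustrative calls (hypothetical URL and slug):
--   linkTreatment "https://example.com/a" "example-a" Rotted
--     gives ("/archive/example-a/", "archived")
--   linkTreatment "https://example.com/a" "example-a" Error
--     gives ("https://example.com/a", "A")
```

Treating `Error` like `Live` matches the module comment: an inconclusive check never redirects a reader away from the original.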


@ -1,7 +1,23 @@
module Main where
import Hakyll (hakyll)
import Site (rules)
import Data.Time.Clock.POSIX (getPOSIXTime)
import System.Directory (createDirectoryIfMissing)
import Hakyll (hakyll)
import Site (rules)
-- | Stamp the start of this build into @data/build-stamp.txt@ before
-- Hakyll scans the provider directory. The file therefore always exists
-- and always differs from the previous run. The telemetry pages
-- (@/build/@, @/stats/@) @load@ it as a dependency so Hakyll recompiles
-- them on every build instead of serving a stale cached copy when no
-- tracked content changed. See build/Stats.hs and build/Site.hs.
writeBuildStamp :: IO ()
writeBuildStamp = do
createDirectoryIfMissing True "data"
t <- getPOSIXTime
writeFile "data/build-stamp.txt" (show t ++ "\n")
main :: IO ()
main = hakyll rules
main = do
writeBuildStamp
hakyll rules


@ -16,7 +16,11 @@
-- byte-identical SVGs, so the GPG signing pipeline is undisturbed.
module Marks
( monogramSvgField
, hasMonogramField
, monogramSvgFieldFor
, hasMonogramFieldFor
, epistemicSvgField
, hasMonogram
) where
import Control.Exception (IOException, try)
@ -52,6 +56,24 @@ monogramCandidates fp =
then [dir </> "mark.svg"]
else [dir </> takeBaseName fp ++ ".mark.svg"]
-- | Predicate form of 'resolveMonogramPath' — used by Stats.hs to
-- compute monogram coverage on @/build/@. Returns 'True' when at
-- least one of the dual-form candidate paths exists on disk.
hasMonogram :: Item a -> Compiler Bool
hasMonogram item = isJust <$> resolveMonogramPath item
-- | @$has-monogram$@ — present (renders as @"true"@) only when the
-- item has an actual @mark.svg@ on disk; 'noResult' for the
-- placeholder-roundel case. Templates that don't want to display
-- placeholder roundels (e.g. item-card listings, popup previews)
-- gate on this flag instead of @$monogramSvg$@, which the
-- frontmatter header relies on always rendering for symmetric
-- column layout.
hasMonogramField :: Context String
hasMonogramField = field "has-monogram" $ \item -> do
has <- hasMonogram item
if has then return "true" else noResult "no real monogram"
-- | Return the first candidate path that exists on disk, or 'Nothing'.
resolveMonogramPath :: Item a -> Compiler (Maybe FilePath)
resolveMonogramPath item =
@ -72,13 +94,17 @@ resolveMonogramPath item =
-- tools may produce hardcoded blacks; the contract still holds), strips
-- the @width@/@height@ presentation attributes from the root @<svg>@,
-- and wraps the result in @<figure class="frontmatter-mark
-- frontmatter-mark--monogram">@. Returns 'noResult' when no candidate
-- exists; warns and returns 'noResult' on read failure.
-- frontmatter-mark--monogram">@.
--
-- When no @mark.svg@ exists, returns the placeholder roundel — an
-- empty outer ring at lower opacity that visually balances the
-- epistemic-figure column and signals "monogram not yet authored".
-- Read failures fall back to the same placeholder.
monogramSvgField :: Context String
monogramSvgField = field "monogramSvg" $ \item -> do
mPath <- resolveMonogramPath item
case mPath of
Nothing -> noResult "no mark.svg"
Nothing -> return $ T.unpack monogramPlaceholder
Just path -> do
result <- unsafeCompiler $ try (TIO.readFile path)
:: Compiler (Either IOException T.Text)
@ -87,9 +113,24 @@ monogramSvgField = field "monogramSvg" $ \item -> do
unsafeCompiler $ hPutStrLn stderr $
"[Marks] " ++ toFilePath (itemIdentifier item) ++
": failed to read " ++ path ++ ": " ++ show e
noResult "monogram read failed"
return $ T.unpack monogramPlaceholder
Right svg -> return $ T.unpack $ wrapMonogram (processSvg svg)
-- | Empty-roundel placeholder used while a piece's monogram is still
-- to be authored (Phase 2 of MARKS.md). The @--placeholder@ modifier
-- class lets CSS render it at reduced opacity so it reads as a
-- neutral frame rather than a finished glyph.
monogramPlaceholder :: T.Text
monogramPlaceholder = T.concat
[ "<figure class=\"frontmatter-mark frontmatter-mark--monogram"
, " frontmatter-mark--placeholder\" aria-hidden=\"true\">"
, "<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 280 280\">"
, "<circle cx=\"140\" cy=\"140\" r=\"128\" fill=\"none\""
, " stroke=\"currentColor\" stroke-width=\"0.6\"/>"
, "</svg>"
, "</figure>"
]
-- | Wrap inlined monogram SVG in its outer figure element.
wrapMonogram :: T.Text -> T.Text
wrapMonogram svg = T.concat
@ -98,6 +139,33 @@ wrapMonogram svg = T.concat
, "</figure>"
]
-- | @$monogramSvg$@ override for synthesized pages whose item identifier
-- doesn't live under @content/@ (e.g. @/build/@, @/stats/@), so the
-- auto-resolver in 'monogramSvgField' can't find a co-located mark.
-- Reads from the supplied path; falls back to the placeholder roundel
-- when the file is absent or unreadable.
monogramSvgFieldFor :: FilePath -> Context a
monogramSvgFieldFor path = field "monogramSvg" $ \_ -> do
exists <- unsafeCompiler $ doesFileExist path
if not exists
then return $ T.unpack monogramPlaceholder
else do
result <- unsafeCompiler $ try (TIO.readFile path)
:: Compiler (Either IOException T.Text)
case result of
Left e -> do
unsafeCompiler $ hPutStrLn stderr $
"[Marks] failed to read " ++ path ++ ": " ++ show e
return $ T.unpack monogramPlaceholder
Right svg -> return $ T.unpack $ wrapMonogram (processSvg svg)
-- | @$has-monogram$@ override paired with 'monogramSvgFieldFor'. Present
-- (as @"true"@) only when the path exists; 'noResult' otherwise.
hasMonogramFieldFor :: FilePath -> Context a
hasMonogramFieldFor path = field "has-monogram" $ \_ -> do
exists <- unsafeCompiler $ doesFileExist path
if exists then return "true" else noResult "no real monogram"
-- | Replace hardcoded black fills/strokes with @currentColor@ and strip
-- the root @<svg>@'s @width@/@height@ attributes (presentation lives
-- in CSS via the @.frontmatter-mark svg@ selector). Mirrors the color


@ -19,6 +19,7 @@ import qualified Data.Aeson as Aeson
import qualified Data.ByteString.Lazy.Char8 as LBS
import qualified Data.Map.Strict as Map
import Hakyll
import Archive (archiveRules)
import Authors (buildAllAuthors, applyAuthorRules)
import Backlinks (backlinkRules)
import BibExtras (BibExtra (..), emptyBibExtra, firstAuthorSurname, parseBibExtras)
@ -265,6 +266,13 @@ rules = do
-- /current.html. Re-compiles current.html when the YAML changes.
match "data/now.yaml" $ compile getResourceBody
-- Per-build stamp — written by Main.main before Hakyll starts, so it
-- always exists and always differs from the previous run. Matched
-- (not routed) purely so the telemetry pages can `load` it as a
-- dependency and thus recompile every build instead of serving a
-- stale cached copy. See build/Stats.hs.
match "data/build-stamp.txt" $ compile getResourceBody
-- ---------------------------------------------------------------------------
-- Homepage
-- ---------------------------------------------------------------------------
@ -529,6 +537,13 @@ rules = do
-- ---------------------------------------------------------------------------
photographyRules
-- ---------------------------------------------------------------------------
-- Archive — link-archiving system: per-entry /archive/<slug>/ pages and
-- the /archive/ index, driven by archive/manifest.yaml + PROVENANCE.json.
-- See build/Archive.hs and ARCHIVE.md for the design.
-- ---------------------------------------------------------------------------
archiveRules
-- ---------------------------------------------------------------------------
-- Blog index (paginated)
-- ---------------------------------------------------------------------------
@ -926,6 +941,13 @@ rules = do
create ["robots.txt"] $ do
route idRoute
compile $ makeItem $ unlines
-- /archive/ is *deliberately not* disallowed. Crawlers must be
-- able to reach the wrapper pages (and snapshot.html) to see
-- their <meta name=robots content="noindex, noarchive">; a
-- robots.txt Disallow would block that and a URL blocked only
-- by robots.txt can still appear in results when linked. The
-- raw PDFs cannot carry meta — they need an `X-Robots-Tag`
-- HTTP header from the deploy webserver (see nginx/archive.conf).
[ "User-agent: *"
, "Allow: /"
, ""


@ -37,7 +37,10 @@ import qualified Text.Blaze.Html5.Attributes as A
import Text.Blaze.Html.Renderer.String (renderHtml)
import qualified Text.Blaze.Internal as BI
import Hakyll
import Archive (archiveBuildStats)
import Contexts (siteCtx, authorLinksField)
import Marks (hasMonogram, monogramSvgFieldFor,
hasMonogramFieldFor)
import qualified Patterns as P
import Utils (readingTime)
@ -675,6 +678,39 @@ renderEpistemic total ws wc wi we =
, txt (pctStr n total)
]
-- | Per-content-type counts feeding 'renderMarks'. @mrCount@ is the
-- denominator (total pieces of that type), @mrMonogram@ is the count
-- with a co-located @mark.svg@, and @mrFigure@ is the count with
-- @status:@ frontmatter (which is what triggers the epistemic figure
-- per MARKS.md §3.1).
data MarkRow = MarkRow
{ mrLabel :: String
, mrCount :: Int
, mrMonogram :: Int
, mrFigure :: Int
}
renderMarks :: [MarkRow] -> H.Html
renderMarks rows =
section "marks" "Marks coverage" $
table
["Type", "Pieces", "Monogram", "Epistemic figure"]
(map row rows)
(Just [ "Total"
, txt (commaInt totalCount)
, txt (commaInt totalMono ++ " (" ++ pctStr totalMono totalCount ++ ")")
, txt (commaInt totalFig ++ " (" ++ pctStr totalFig totalCount ++ ")")
])
where
totalCount = sum (map mrCount rows)
totalMono = sum (map mrMonogram rows)
totalFig = sum (map mrFigure rows)
row r = [ txt (mrLabel r)
, txt (commaInt (mrCount r))
, txt (commaInt (mrMonogram r) ++ " (" ++ pctStr (mrMonogram r) (mrCount r) ++ ")")
, txt (commaInt (mrFigure r) ++ " (" ++ pctStr (mrFigure r) (mrCount r) ++ ")")
]
renderOutput :: Map.Map String (Int, Integer) -> Int -> Integer -> H.Html
renderOutput grouped totalFiles totalSize =
section "output" "Output" $
@ -707,6 +743,14 @@ renderBuild ts dur =
, ("Last build duration", txt dur)
]
-- | Link-archive coverage and health. The metric rows are computed by
-- 'Archive.archiveBuildStats' (count, size, link-rot status breakdown,
-- snapshot quality, visibility, orphans); this only lays them out.
renderArchive :: [(String, String)] -> H.Html
renderArchive metrics =
section "archive" "Link archive" $
dl [ (k, txt v) | (k, v) <- metrics ]
-- ---------------------------------------------------------------------------
-- Static TOC (matches the nine h2 sections above)
-- ---------------------------------------------------------------------------
@ -725,7 +769,9 @@ pageTOC = H.ol $ mapM_ item sections
, ("tags", "Tags")
, ("links", "Links")
, ("epistemic", "Epistemic coverage")
, ("marks", "Marks coverage")
, ("output", "Output")
, ("archive", "Link archive")
, ("repository", "Repository")
, ("build", "Build")
]
@ -743,6 +789,16 @@ statsRules tags = do
create ["build/index.html"] $ do
route idRoute
compile $ do
-- ----------------------------------------------------------------
-- Per-build stamp dependency: data/build-stamp.txt is rewritten
-- by Main.main on every invocation, so loading it here forces
-- Hakyll to recompile this page each build. Without it the page
-- is served from cache whenever no tracked content changed, and
-- every unsafeCompiler-sourced figure below (timestamp, output
-- stats, git, LOC) goes stale. The value itself is unused.
-- ----------------------------------------------------------------
_ <- load (fromFilePath "data/build-stamp.txt") :: Compiler (Item String)
-- ----------------------------------------------------------------
-- Load all content items
-- ----------------------------------------------------------------
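
The stamp dependency in the hunk above relies on `Main.main` rewriting `data/build-stamp.txt` on every invocation. That writer is not part of this diff, so the following is only a minimal sketch of what it might look like. The function name `writeBuildStamp` and the ISO 8601 timestamp payload are assumptions; the rule above discards the value, so any content that changes between builds would serve.

```haskell
import Data.Time.Clock (getCurrentTime)
import Data.Time.Format.ISO8601 (iso8601Show)
import System.Directory (createDirectoryIfMissing)
import System.FilePath (takeDirectory)

-- | Hypothetical helper (not taken from the repository): overwrite the
-- stamp file with the current UTC time so that its contents differ on
-- every build. The payload format is an assumption; only the fact that
-- it changes matters to the dependency tracker.
writeBuildStamp :: FilePath -> IO ()
writeBuildStamp path = do
  -- Make sure the parent directory (e.g. data/) exists first.
  createDirectoryIfMissing True (takeDirectory path)
  now <- getCurrentTime
  writeFile path (iso8601Show now ++ "\n")
```

Calling `writeBuildStamp "data/build-stamp.txt"` before handing control to Hakyll would produce the per-build change the comment describes. For `load` to resolve, Hakyll also needs some rule that compiles the stamp file; that rule is not shown in this diff.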
@ -824,8 +880,11 @@ statsRules tags = do
-- ----------------------------------------------------------------
-- Epistemic coverage (essays + posts)
-- ----------------------------------------------------------------
essayMetas <- mapM (getMetadata . itemIdentifier) essays
postMetas <- mapM (getMetadata . itemIdentifier) posts
essayMetas <- mapM (getMetadata . itemIdentifier) essays
postMetas <- mapM (getMetadata . itemIdentifier) posts
poemMetas <- mapM (getMetadata . itemIdentifier) poems
fictionMetas <- mapM (getMetadata . itemIdentifier) fiction
compMetas <- mapM (getMetadata . itemIdentifier) comps
let epMetas = essayMetas ++ postMetas
epTotal = length epMetas
ep f = length (filter (isJust . f) epMetas)
@ -834,6 +893,38 @@ statsRules tags = do
withImp = ep (lookupString "importance")
withEv = ep (lookupString "evidence")
-- ----------------------------------------------------------------
-- Marks coverage (per-portal monogram + epistemic-figure counts)
--
-- Monogram presence is a disk lookup via Marks.hasMonogram;
-- epistemic-figure presence is the same trigger as the figure
-- generator itself (status: set in frontmatter).
-- ----------------------------------------------------------------
essayMonos <- mapM hasMonogram essays
postMonos <- mapM hasMonogram posts
poemMonos <- mapM hasMonogram poems
fictionMonos <- mapM hasMonogram fiction
compMonos <- mapM hasMonogram comps
let countTrue = length . filter id
countStat = length . filter (isJust . lookupString "status")
markRows =
[ MarkRow "Essays" (length essays)
(countTrue essayMonos)
(countStat essayMetas)
, MarkRow "Blog posts" (length posts)
(countTrue postMonos)
(countStat postMetas)
, MarkRow "Poems" (length poems)
(countTrue poemMonos)
(countStat poemMetas)
, MarkRow "Fiction" (length fiction)
(countTrue fictionMonos)
(countStat fictionMetas)
, MarkRow "Compositions" (length comps)
(countTrue compMonos)
(countStat compMetas)
]
-- ----------------------------------------------------------------
-- Output directory stats
-- ----------------------------------------------------------------
@ -846,6 +937,11 @@ statsRules tags = do
(hf, hl, cf, cl, jf, jl) <- unsafeCompiler getLocStats
(commits, firstDate) <- unsafeCompiler getGitStats
-- ----------------------------------------------------------------
-- Link-archive coverage + link-rot health
-- ----------------------------------------------------------------
archiveMetrics <- unsafeCompiler archiveBuildStats
-- ----------------------------------------------------------------
-- Build timestamp + last build duration
-- ----------------------------------------------------------------
@ -868,7 +964,9 @@ statsRules tags = do
renderTagsSection topTags uniqueTags
renderLinks mostLinkedInfo orphanCount (length allPIs)
renderEpistemic epTotal withStatus withConf withImp withEv
renderMarks markRows
renderOutput outputGrouped totalFiles totalSize
renderArchive archiveMetrics
renderRepository hf hl cf cl jf jl commits firstDate
renderBuild buildTimestamp lastBuildDur
contentString = renderHtml htmlContent
@ -883,6 +981,8 @@ statsRules tags = do
\link analysis, epistemic coverage, output metrics, \
\repository overview, and build timing."
<> constField "build" "true"
<> monogramSvgFieldFor "content/build.mark.svg"
<> hasMonogramFieldFor "content/build.mark.svg"
<> authorLinksField
<> siteCtx
@ -897,6 +997,11 @@ statsRules tags = do
create ["stats/index.html"] $ do
route idRoute
compile $ do
-- Per-build stamp dependency — forces a recompile every build
-- so the heatmap's "today" and all corpus figures stay current.
-- See the /build/ rule above for the full rationale.
_ <- load (fromFilePath "data/build-stamp.txt") :: Compiler (Item String)
essays <- loadAll (P.essayPattern .&&. hasNoVersion)
posts <- loadAll ("content/blog/*.md" .&&. hasNoVersion)
poems <- loadAll ("content/poetry/*.md" .&&. hasNoVersion)
@ -954,6 +1059,8 @@ statsRules tags = do
<> constField "abstract" "Writing activity, corpus breakdown, \
\and tag distribution computed at build time."
<> constField "build" "true"
<> monogramSvgFieldFor "content/stats.mark.svg"
<> hasMonogramFieldFor "content/stats.mark.svg"
<> authorLinksField
<> siteCtx
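
The marks-coverage table added in the hunks above pools its totals: the footer percentages are computed from the summed counts, not by averaging the per-row percentages. The sketch below isolates that arithmetic as plain strings. `MarkRow` is copied from the diff; `pctStr` and `commaInt` are defined elsewhere in the module and not shown here, so the versions below are hypothetical stand-ins (rounding to the nearest whole percent is an assumption), and `totalsRow` is a name introduced only for illustration.

```haskell
import Data.List (intercalate)

-- Record as declared in the diff.
data MarkRow = MarkRow
  { mrLabel    :: String
  , mrCount    :: Int
  , mrMonogram :: Int
  , mrFigure   :: Int
  }

-- Hypothetical stand-in for the module's pctStr: whole-number percentage,
-- rounded to nearest, with a zero denominator rendered as "0%".
pctStr :: Int -> Int -> String
pctStr _ 0 = "0%"
pctStr n d = show ((n * 100 + d `div` 2) `div` d) ++ "%"

-- Hypothetical stand-in for the module's commaInt: thousands separators.
commaInt :: Int -> String
commaInt n
  | n < 0     = '-' : commaInt (negate n)
  | otherwise = reverse (intercalate "," (chunksOf3 (reverse (show n))))
  where
    chunksOf3 [] = []
    chunksOf3 xs = take 3 xs : chunksOf3 (drop 3 xs)

-- Footer cells: every percentage uses the pooled piece count as its
-- denominator, mirroring the totalCount / totalMono / totalFig bindings
-- in renderMarks.
totalsRow :: [MarkRow] -> [String]
totalsRow rows =
  [ "Total", commaInt totalCount, cell totalMono, cell totalFig ]
  where
    totalCount = sum (map mrCount rows)
    totalMono  = sum (map mrMonogram rows)
    totalFig   = sum (map mrFigure rows)
    cell n     = commaInt n ++ " (" ++ pctStr n totalCount ++ ")"
```

Pooling means a content type with many pieces dominates the footer, which matches the "coverage across the whole corpus" reading of the table.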


@ -18,7 +18,7 @@ constraints: any.Glob ==0.10.2,
any.assoc ==1.1.1,
any.async ==2.2.6,
any.attoparsec ==0.14.4,
any.attoparsec-aeson ==2.2.0.0,
any.attoparsec-aeson ==2.2.0.1,
any.auto-update ==0.1.6,
any.base ==4.18.2.1,
any.base-compat ==0.14.1,
@ -99,7 +99,7 @@ constraints: any.Glob ==0.10.2,
http-conduit +aeson,
any.http-date ==0.0.11,
any.http-types ==0.12.4,
any.http2 ==5.1.1,
any.http2 ==5.1.2,
any.indexed-traversable ==0.1.4,
any.indexed-traversable-instances ==0.1.2.1,
any.integer-conversion ==0.1.1,
@ -131,7 +131,7 @@ constraints: any.Glob ==0.10.2,
any.pretty ==1.1.3.6,
any.pretty-show ==1.10,
any.prettyprinter ==1.7.1,
any.prettyprinter-ansi-terminal ==1.1.3,
any.prettyprinter-ansi-terminal ==1.1.4,
any.primitive ==0.9.1.0,
any.process ==1.6.19.0,
any.psqueues ==0.2.8.3,
@ -144,7 +144,7 @@ constraints: any.Glob ==0.10.2,
any.safe ==0.3.21,
any.safe-exceptions ==0.1.7.4,
any.scientific ==0.3.8.1,
any.semialign ==1.3.1,
any.semialign ==1.3.1.1,
any.semigroupoids ==6.0.2,
any.serialise ==0.2.6.1,
any.simple-sendfile ==0.2.32,
@ -215,5 +215,5 @@ constraints: any.Glob ==0.10.2,
any.xml-types ==0.3.8,
any.yaml ==0.11.11.2,
any.zip-archive ==0.4.3.2,
any.zlib ==0.7.0.0
any.zlib ==0.7.1.0
index-state: hackage.haskell.org 2026-04-30T12:51:47Z


@ -0,0 +1,60 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 280 280" role="img" aria-labelledby="mark-title-asymmetric-forgetting-a2">
<title id="mark-title-asymmetric-forgetting-a2">A vertical chain of links that thins and fades as it rises, beside a tall ladder-scaffold that stays uniform in weight throughout, with three inward arrows touching it and one empty rung-stub extending toward the chain</title>
<desc>A frontispiece mark for "Asymmetric Forgetting." Both figures rise from a heavy horizontal baseline (the moment of instruction). On the left, a vertical chain of links thins to hairline as it rises — the procedure, decaying without reactivation. On the right, a ladder-scaffold of rungs between two rails stays uniform in weight throughout — the concept, persistent. Three inward arrows touch the ladder from outside the figure at irregular heights — the world's analogues reaching in to refresh the schema. One empty rung extends from the ladder toward the chain side, capped with a small open circle: the docking slot where a procedure can re-attach when re-acquired. The chain has no such reach back; the absence is the point.</desc>
<circle cx="140" cy="140" r="128" stroke="currentColor" stroke-width="0.6" fill="none"/>
<line x1="40" y1="220" x2="240" y2="220" stroke="currentColor" stroke-width="1.4" stroke-linecap="round"/>
<g stroke="currentColor" fill="none" stroke-linecap="round" stroke-linejoin="round">
<ellipse cx="72" cy="210" rx="8" ry="5" stroke-width="1.3"/>
<ellipse cx="72" cy="197" rx="5" ry="8" stroke-width="1.3"/>
<ellipse cx="72" cy="182" rx="8" ry="5" stroke-width="1.1"/>
<ellipse cx="72" cy="169" rx="5" ry="8" stroke-width="1.1"/>
<ellipse cx="72" cy="154" rx="8" ry="5" stroke-width="0.85"/>
<ellipse cx="72" cy="141" rx="5" ry="8" stroke-width="0.85"/>
<ellipse cx="72" cy="126" rx="8" ry="5" stroke-width="0.55"/>
<ellipse cx="72" cy="113" rx="5" ry="8" stroke-width="0.45"/>
<ellipse cx="72" cy="98" rx="8" ry="5" stroke-width="0.3" opacity="0.5"/>
<ellipse cx="72" cy="86" rx="5" ry="8" stroke-width="0.25" opacity="0.3"/>
<ellipse cx="72" cy="72" rx="7" ry="4" stroke-width="0.2" opacity="0.15"/>
</g>
<g stroke="currentColor" fill="none" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.3">
<line x1="180" y1="220" x2="180" y2="72"/>
<line x1="222" y1="220" x2="222" y2="72"/>
<line x1="180" y1="206" x2="222" y2="206"/>
<line x1="180" y1="192" x2="222" y2="192"/>
<line x1="180" y1="178" x2="222" y2="178"/>
<line x1="180" y1="164" x2="222" y2="164"/>
<line x1="180" y1="150" x2="222" y2="150"/>
<line x1="180" y1="136" x2="222" y2="136"/>
<line x1="180" y1="122" x2="222" y2="122"/>
<line x1="180" y1="108" x2="222" y2="108"/>
<line x1="180" y1="94" x2="222" y2="94"/>
<line x1="180" y1="80" x2="222" y2="80"/>
</g>
<g stroke="currentColor" fill="none" stroke-linecap="round" stroke-linejoin="miter" stroke-width="0.9">
<line x1="158" y1="150" x2="180" y2="150"/>
<circle cx="155" cy="150" r="2.2" stroke-width="0.9" fill="none"/>
</g>
<g stroke="currentColor" fill="none" stroke-linecap="round" stroke-linejoin="miter" stroke-width="0.8" opacity="0.85">
<line x1="244" y1="94" x2="226" y2="94"/>
<path d="M 230 91 L 226 94 L 230 97"/>
<line x1="244" y1="178" x2="226" y2="178"/>
<path d="M 230 175 L 226 178 L 230 181"/>
<line x1="244" y1="206" x2="226" y2="206"/>
<path d="M 230 203 L 226 206 L 230 209"/>
</g>
</svg>



@ -0,0 +1,81 @@
---
title: "Asymmetric Forgetting"
date: 2026-05-26
abstract: >
Curricula in mathematics and the sciences optimize for procedural fluency — the half of what they teach that decays once the student stops being a student. What survives twenty years on is conceptual residue, generated only as an accidental byproduct of the curriculum's intended work. The asymmetry compounds across generations of teachers and produces a population unable to do the work that civic life requires of it.
tags:
- education
- nonfiction
- nonfiction/philosophy
- political
authors:
- "Levi Neuwirth | /me.html"
status: "Working model"
confidence: 80
importance: 4
evidence: 2
scope: broad
novelty: moderate
practicality: moderate
confidence-history:
- 80
---
If you ask the prototypical adult who took AP Calculus in high school what a "derivative" is, you'll generally get a half-decent answer, assuming they were reasonably engaged in class. You might hear that it's a slope, a rate of change, a measurement of how fast something moves. You'll get something actionable. If you ask the same adult to compute an elementary [derivative](https://en.wikipedia.org/wiki/Derivative) in front of you, they'll almost certainly fail, even if they aren't too far removed from their time in class. This is not a failure of their education by any means. In some ways, it is the only meaningful success that said education currently has.
Curricula can attempt to instill two things in a student. The first is procedural fluency: the ability to perform the steps of some algorithm on demand, to execute a technique, to perform a computation. We tend to think of this as the more important. But the second thing is more durable, even if less outwardly crisp: an epistemic residue, a working sense of what the concept being learned fundamentally is, what kinds of claims it can support and what would refute it, and where it lies within the bigger picture of knowledge that the student has accumulated over time. In the United States, curricula in mathematics, the sciences, and the limited curricula that exist in computation are designed and assessed almost exclusively against the first. This is in direct contrast to what survives in the graduates of these curricula ten, twenty, fifty years later: almost exclusively the second thing, delivered almost entirely by serendipitous accident, in the gaps between the procedures the curriculum was actually intended to teach.
Concepts and epistemic residues persist over time, while procedures without regular reactivation do not. The cognitive infrastructure that our curricula provide adults, the backbone by which they are intended to lead their lives and function in an increasingly technological society, bears almost no resemblance to the infrastructure for which the curriculum was ostensibly optimized. Worse, this asymmetric forgetting is not bidirectional in its effects. An adult who retained the concept can quickly re-acquire the related procedures on demand. The adult who, cramming for their examinations, once learned the procedure but knew nothing of the concept will be unable to run the reverse. There is nothing here that the procedure could have left behind, for the scaffolding to which it naturally should've attached was never built. Our curricula optimize for the thing that doesn't last, at the chief expense of the thing that does, and the thing that does is the thing from which all else follows.
## The Mechanism
The asymmetric forgetting rests on a distinction older than the cognitive science vocabulary widely adopted to describe it. Procedural knowledge and conceptual knowledge are not stored, retrieved, or reactivated in the same way, and they consequently do not decay in the same way.
Procedural knowledge is inherently sequential. To compute a derivative is to execute a series of moves in a particular order, conditioned only on the form of the input. To balance a chemical equation, to write a for-loop in a syntax that has not been recently used, or to manually run long division for the first time since fourth grade — these are all chains of steps whose only durable representation in the brain is the chain itself. Such chains inherently require reactivation to persist. The adult who has not balanced an equation in fifteen years[^1] has not forgotten because they were incapable or because they were poorly taught; they have forgotten because the sequential chain hasn't fired in fifteen years, and chains that are not run inevitably decay. This is not controversial. It is the easy direction, the half of the asymmetry that the curriculum tacitly acknowledges through myriad practice problems and lifeless examinations. What the curriculum fails to acknowledge is the expiration date: the day the student stops being a student.
[^1]: Or, sadly, the recent college graduate who hasn't done it in a mere six years... embarrassing, I know!
Conceptual knowledge is structured fundamentally differently. A concept is not a sequence but a schema. This allows for integration into the rest of what one understands about the world they inhabit. The adult who has retained the basic concept of the derivative has retained it because the schema gets reactivated incidentally, for the ordinary course of living provides suitable analogues[^2]. The schema is constantly reactivated: by every news article mentioning acceleration, every casual remark about stock curves becoming steeper, every passing thought about how quickly something is growing. The procedural skill has no such ambient reactivation. There is nothing in adult civic life that can incidentally re-run the steps of the [quotient rule](https://en.wikipedia.org/wiki/Quotient_rule). The schema persists by virtue of the world's analogues consistently reaching in and touching it; the procedure decays because it is left devoid of the interaction that the curriculum once served to forcibly provide.
[^2]: I would go further to hypothesize that any reasonable schema will with probability ~1 be reactivated, for [everything is correlated](https://gwern.net/everything).
The final component of this mechanism is the most severe. Consider two adults, one who has retained the concept and lost the procedure, and the other who has improbably (but for the sake of argument) retained the procedure without ever having the concept. We will continue with our example of differentiation. The first adult, on encountering a problem that requires a derivative, will find the procedure from a reference, say a web search, and re-acquire it in minutes. The concept's scaffolding has built a place to store the procedure when it returns. The second adult, faced with an identical problem, cannot recognize that it requires a derivative in the first place, for they never had such a scaffolding; the schema that would enable them to notice is markedly absent. Even if the procedure is entirely intact, it has nowhere to attach, no occasion on which to be deployed. A procedure cannot magically summon a concept that was never built.
From this final component follows the weight of the asymmetry as a design constraint. If both of these aspects of cognitive infrastructure were equally durable, or even equally recoverable in reduction, the question of which to prioritize would be a matter of taste, pedagogical convenience, and moderate pretentiousness. This is, of course, not the case. The conceptual component is the substrate within which the procedural becomes meaningful, the only component surviving long enough to matter for the adult life the retired student will lead. A curriculum that optimizes for students who can blindly execute procedures they will lose in five years has produced essentially nothing of lasting value. It has produced nothing more than a cacophonous credential that lacks any semblance of underlying understanding.
## Substantiation
Where do I land amidst all of this, and why do I care? I offer my own retention audit of sorts not as proof of the mechanism, but as a demonstration that it is at least observable in lived form. If nothing else, perhaps you, dear reader, can run such an audit on yourself and see if the results are the same.
I attended a rural public school in upstate New York. By every metric that the system has conceived, I was a success by the time of my graduation. I was in the top ten of my class, I had straight As on my state examinations in mathematics and the sciences, I had all 4s and 5s on my AP examinations (including some which were not even offered by my institution), and graduated with the highest honors possible in my district, heading outbound to an Ivy League university. The audit forces reframing: of what was taught to me with procedural intent, what has survived, and in what form?
What do I remember of [stoichiometry](https://en.wikipedia.org/wiki/Stoichiometry)? I got a perfect 100 on my chemistry examination in high school, converting between grams and molecules, running limiting-reagent problems. I was clearly good at it at the time, and yet I remember nothing of how to balance even a trivial equation without re-deriving it slowly from first principles. I have not computed a mole quantity since my examination in 2020. What I retain from my chemistry experience, other than the fact that providing troublemaking high school students with Bunsen burners is outright objectionable at best, is that chemical reactions are quantitative, that matter is conserved across them, and that the relationships between reactants and products are precise. I retained all of the concepts despite the curriculum, such that when I went on to take advanced courses in mathematics and physics at Brown, the scaffolding was laid; I could connect what I learned to form a bigger picture. The procedure has decayed because I have not balanced an equation since I was sixteen, but the structural concept has survived.
What of AP Biology, another course in which I earned a perfect score on the final examination? I will not bore the reader with another long-winded description. The procedures that I once mastered, say those for [Punnett squares](https://en.wikipedia.org/wiki/Punnett_square), have long left me. Yet the concepts are strong enough that in the years since, I have been able to do research work that is strictly integrated with medicine and the life sciences. When I need a procedure from these fields, which lie outside my expertise, I have the conceptual scaffolding to place what I derive into.
The pattern is identical across every subject that I can examine. What was taught with mere procedural intent has decayed, while what has survived is the epistemic and conceptual residues, the schemas. The success of my education, by my own retention audit, is a success that the curriculum was not optimizing for. It is not, therefore, a success that can be attributed to some quality of that curriculum or quality of the public institution at which I studied. It is the pattern of a curriculum that failed at what it tried to do and, accidentally, in the failure, left behind for me the only thing of value.
I do not believe that my case is unusual, and I invite you to perform such an audit on yourself if you feel open to it. Ask yourself: of any procedural unit that you remember being drilled on for exams, *what survives now?* Is it the procedure that was assessed or the concept that was incidentally surfaced alongside it? The mechanism predicts what your audit will find, and by virtue of my results, I'd place my money on the same prediction.
## Implication
So far we have been concerned with the individual. The individual graduate, twenty years on, retains the concept and loses the procedure by failure of the curriculum. At this scale, the consequence is little more than regrettable in a bounded way. The individual is poorer for the loss, and perhaps there has been some time squandered away by the system, but the promise of the residue may remain, providing enough to build on if they ever so choose to put in a bit of effort. One can see why we might just shrug our shoulders and keep on walking right past this, for people will muddle through, and the people who do care can refresh and derive for themselves.
The true consequence of this fact lives at the scale of the population. The population that emerges from the widespread adoption and delivery of such a curriculum is the population that has to operate the society all graduates inhabit. Let us consider, then, what the mechanism predicts at such a scale. The median adult, twenty years on, has neither working procedural fluency nor the robust conceptual scaffolding that would enable subsequent procedural acquisition. The shards of residue that they do retain are little more than incidental, surfaced in the gaps between the procedures that the curriculum tried to teach, never deliberately developed, never assessed, and never built into the structure that the curriculum optimized for. It is thin where it should be thick, accidental where it should be load-bearing.
This population is the one that we then ask to do the work that civic life requires. We ask this population to evaluate claims made by scientific institutions during a pandemic. We ask them to vote on the regulation of technologies they have never been taught to reason about, to navigate algorithmic and financial systems whose underlying structure and principles they have no schema for, to distinguish a credible statistical claim from a contrived and misleading one. The asymmetric forgetting mechanism predicts, and with high accuracy, that the population we have is widely unable to do these tasks at the level that is implicitly required. This is not due to incapability, for even a thin conceptual residue would provide meaningful opportunities, but it is rather because the curriculum is optimized for the wrong part of what it could instill and leave behind in the long term. We live inside the aggregate consequence of this fact.
This is made worse by the fact that it compounds. Each generation of teachers is drawn from a population produced by the curriculum of the previous generation. My rural hometown teachers had themselves passed through a curriculum optimized for procedural fluency. Whatever conceptual residue they retained was that from which their teaching stemmed. The conceptual depth that asymmetric forgetting calls for simply cannot be requested of teachers who were themselves never taught to that depth. It is not a reasonable expectation, and is thus not a personal failure of the teachers. The current teacher workforce that cannot provide a concept-first curriculum is rather the predictable product of a curriculum that did not build concepts deeply. Those who are interested in the concepts must find it within themselves to search further in the current setting. The path from where we are to a system that optimizes for the concepts runs through the teachers and their own educations, and that path is inevitably long. There is no magic solution that will resolve this in six months, nor within a glorious five-year plan. Recognizing this is not a counsel of despair, but rather a counsel of patience, a refusal to mistake or conflate the difficulty of the path for evidence against the destination.
I have deliberately chosen not to describe what a concept-first curriculum would look like. This is a separate piece of work that is owed its own diligent treatment, and I refuse to attempt to distill and collapse it into the closing of this one. What follows from the asymmetry is the subject of work to be done and essays yet to be written, by myself and by others.
## Coda
The adult who remembers what a derivative is at the conceptual level has been given something. The curriculum that gave it to them did so by accident, in the margins between what it was actually optimized to deliver, and at the cost of everything else that it could've deliberately built into the same student. The residue is real and is genuinely the only thing that survived. It is also catastrophically less than what twelve years of schooling could have left behind, if only the system had known what it was for.
The asymmetry is not subtle and it is not new. We have been running this experiment on every cohort of American students for as long as "American students" have existed. The results have now been replicated millions of times. Those results *are* the population that we now have. A population that retains the wrong half of what it was taught as a thin accident, subsequently tasked with operating a society whose questions and demands require the half that is absent. The curriculum may have optimized for what would not last, but what actually lasted is the accidental byproduct. We ask that byproduct to do the work, but the work is too large for such an accident to bear.
We can choose differently. The asymmetry shows us exactly what to reach for: the epistemic residue that persists, the schemas that the world's analogues continually reach in to refresh, and the conceptual scaffolding to which future learning can attach. We are choosing against this. We have been choosing against it for a very long time. The cost of that choice is borne not by the student who took the exams but by the adults they became, and by the society those adults now have to navigate without the infrastructure their schooling could have built.
The most valuable thing a curriculum can give a student is what the student will still have twenty years down the line. We are giving them everything else.


@ -13,6 +13,8 @@ executable site
hs-source-dirs: build
other-modules:
Site
Archive
ArchiveIndex
Authors
Catalog
Commonplace
@ -36,6 +38,7 @@ executable site
Filters.Sidenotes
Filters.Dropcaps
Filters.Smallcaps
Filters.Archive
Filters.Wikilinks
Filters.Transclusion
Filters.EmbedPdf

45
nginx/archive.conf Normal file

@ -0,0 +1,45 @@
# archive.conf — `X-Robots-Tag: noindex, noarchive` for the link archive.
#
# Place at /etc/nginx/snippets/archive.conf and `include` it inside the
# levineuwirth.org server { } block, *after* security-headers.conf:
#
# server {
# server_name levineuwirth.org;
# root /var/www/levineuwirth.org;
# ...
# include snippets/security-headers.conf;
# include snippets/static-assets.conf;
# include snippets/popup-proxy.conf;
# include snippets/archive.conf;
# }
#
# Why a location header rather than robots.txt: a URL blocked by
# robots.txt can still appear in results when externally linked, and the
# noindex directive must be reachable. Wrapper pages carry the meta in
# HTML, and the HTML snapshots have the same meta injected at fetch
# time. But raw PDFs cannot carry meta directives — and a robots.txt
# Disallow on /archive/ would prevent crawlers from reading the wrapper
# meta in the first place. The header form is the right control for the
# whole tree: crawlers honour it for any resource, HTML or PDF.
#
# `^~` makes this prefix-match take priority over any regex location
# that might match the same path.
location ^~ /archive/ {
# nginx's add_header chain is inherited from a parent context ONLY
# when the current context declares no add_header directives — see
# nginx.org/en/docs/http/ngx_http_headers_module.html. Adding any
# header inside this location would silently drop the baseline
# security headers within the /archive/ subtree, so we re-include
# security-headers.conf to keep HSTS, CSP, X-Frame-Options, etc.
# intact for archive pages and raw artifacts.
include snippets/security-headers.conf;
# `always` so the header is emitted even on 4xx/5xx responses (the
# default add_header only sets on 2xx/3xx; without `always` a 404
# under /archive/ could be indexed).
add_header X-Robots-Tag "noindex, noarchive" always;
# Hand off to the same static-file fallback as the rest of the site.
try_files $uri $uri/index.html $uri.html =404;
}
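
The inheritance rule that motivates the re-include above can be modelled in a few lines. This is only an illustrative sketch of the rule as the snippet's comments state it (a context inherits its parent's `add_header` set only when it declares none of its own); it is not nginx code. `effectiveHeaders`, `baseline`, and the placeholder header values are assumptions, since the contents of `security-headers.conf` are not shown in this diff.

```haskell
type Header = (String, String)

-- | Headers in effect for a child context: the parent's set is inherited
-- only when the child declares no headers at all; otherwise the child's
-- own declarations replace the parent's entirely.
effectiveHeaders :: [Header] -> [Header] -> [Header]
effectiveHeaders parent []    = parent
effectiveHeaders _      child = child

-- Placeholder baseline standing in for security-headers.conf
-- (header values here are hypothetical).
baseline :: [Header]
baseline =
  [ ("Strict-Transport-Security", "<hsts-policy>")
  , ("Content-Security-Policy",   "<csp-policy>")
  , ("X-Frame-Options",           "<frame-policy>")
  ]

robotsTag :: Header
robotsTag = ("X-Robots-Tag", "noindex, noarchive")

-- Declaring only the robots header in the location drops the baseline.
withoutReinclude :: [Header]
withoutReinclude = effectiveHeaders baseline [robotsTag]

-- Re-including the baseline inside the location keeps both sets.
withReinclude :: [Header]
withReinclude = effectiveHeaders baseline (baseline ++ [robotsTag])
```

Under this model `withoutReinclude` contains only the `X-Robots-Tag` entry, which is the silent loss of security headers the comment warns about, while `withReinclude` carries the baseline plus the robots directive.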


@ -42,6 +42,12 @@ server {
include snippets/security-headers.conf;
include snippets/static-assets.conf;
include snippets/popup-proxy.conf;
# archive.conf must come *after* security-headers.conf — it declares
# its own add_header inside `location ^~ /archive/`, which (per the
# nginx add_header inheritance rules) would otherwise drop the
# baseline headers within that subtree. The snippet re-includes
# security-headers.conf inside its location to compensate.
include snippets/archive.conf;
# Static-site fallback. Pretty URLs first (foo/index.html, foo.html),
# then 404.

static/css/archive.css Normal file

@ -0,0 +1,463 @@
/* archive.css: the link archive at /archive/ and /archive/<slug>/.
*
* Gated in head.html via $if(archive)$ (build/Archive.hs sets the flag on
* the index and every entry page). The archive pages are structured
* surfaces rather than prose, but they render inside #markdownBody so
* every rule here is scoped under #markdownBody to clear the id-specificity
* prose rules in typography.css (heading scales, figure framing, paragraph
* indent) that would otherwise win over a bare class.
*
* Treatment: "framed / structured". The archival chrome (banner,
* provenance panel, the embedded artifact viewer) is given visible borders
* so a reader is never in doubt that this is a preservation copy, not the
* original. All colour comes from tokens, so dark mode follows for free;
* the embedded artifact itself is shown raw and is deliberately not themed.
*/
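
The scoping argument in the header comment comes down to selector specificity. The sketch below is a deliberately simplified calculator (ids, classes, and type names only, no `:is()` or attribute selectors), and `#markdownBody h2` is a hypothetical stand-in for one of the typography.css prose rules, whose exact selectors are not shown here.

```javascript
// Simplified specificity for selectors built only from #ids, .classes and
// type names joined by descendant or child combinators. Not a full CSS parser.
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const types = selector
    .split(/[\s>]+/)
    .filter((part) => /^[a-zA-Z]/.test(part)).length;
  return [ids, classes, types];
}

// True when selector a strictly outranks selector b.
function outranks(a, b) {
  const sa = specificity(a);
  const sb = specificity(b);
  for (let i = 0; i < 3; i++) {
    if (sa[i] !== sb[i]) return sa[i] > sb[i];
  }
  return false;
}

// Hypothetical prose rule versus a bare class and its scoped form.
console.log(outranks(".archive-panel-title", "#markdownBody h2"));
console.log(outranks("#markdownBody .archive-panel-title", "#markdownBody h2"));
```

Under these assumptions the bare class loses to the id-scoped prose rule, while the `#markdownBody`-scoped form wins, which is why every rule in this file carries the prefix.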
/* Structured pages, not essays — no first-line indent on any paragraph. */
#markdownBody :is(.archive-banner-text, .archive-degraded, .archive-note,
.archive-private, .archive-status-note, .archive-index-intro,
.archive-removal, .archive-empty),
#markdownBody .archive-fulltext-wrap > p {
text-indent: 0;
}
/* ============================================================
ENTRY HEADER + ARCHIVAL BANNER
The banner is a bordered callout, stacked: a small-caps label,
one plain-language line, and the original link given real
weight: the original is the hero, never the archived copy.
============================================================ */
#markdownBody .archive-header {
margin-bottom: 0.5rem;
}
#markdownBody .archive-header .page-title {
margin-bottom: 0;
}
#markdownBody .archive-banner {
margin-top: 1.4rem;
padding: 0.9rem 1.1rem;
display: flex;
flex-direction: column;
gap: 0.3rem;
border: 1px solid var(--border-muted);
border-radius: 2px;
background: var(--bg-subtle);
}
#markdownBody .archive-banner-label {
margin: 0;
font-family: var(--font-sans);
font-size: 0.7rem;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.13em;
color: var(--text-muted);
}
#markdownBody .archive-banner-text {
margin: 0;
font-family: var(--font-serif);
font-size: 0.95rem;
line-height: 1.5;
color: var(--text);
}
#markdownBody .archive-banner-original {
align-self: flex-start;
font-family: var(--font-sans);
font-size: 0.85rem;
font-weight: 600;
}
/* Degraded / js-required snapshots: a dashed-border note. Restrained:
the monochrome palette has no alarm colour and wants none. */
#markdownBody .archive-degraded {
margin: 1rem 0 0;
padding: 0.7rem 1rem;
border: 1px dashed var(--border-muted);
border-radius: 2px;
font-family: var(--font-serif);
font-size: 0.9rem;
line-height: 1.55;
color: var(--text-muted);
}
#markdownBody .archive-degraded-label {
margin-right: 0.4rem;
font-family: var(--font-sans);
font-size: 0.7rem;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.1em;
color: var(--text);
}
/* Private entry: the artifact is held offline, not published; a calm
informational panel in place of the artifact viewer. */
#markdownBody .archive-private {
margin: 1.8rem 0;
padding: 1rem 1.2rem;
border: 1px solid var(--border);
border-radius: 2px;
background: var(--bg-subtle);
font-family: var(--font-serif);
font-size: 0.95rem;
line-height: 1.6;
color: var(--text-muted);
}
/* Link-rot status: a header note for non-live states (archive.py check),
and the status word in the provenance panel. The palette is monochrome,
so a `rotted` entry is marked by weight and a heavier left rule, never
colour. */
#markdownBody .archive-status-note {
margin: 1rem 0 0;
padding: 0.7rem 1rem;
border: 1px solid var(--border-muted);
border-left-width: 3px;
border-radius: 2px;
font-family: var(--font-serif);
font-size: 0.92rem;
line-height: 1.55;
color: var(--text);
}
#markdownBody .archive-status-note--rotted {
border-left-color: var(--text);
}
#markdownBody .archive-status-note--moved {
color: var(--text-muted);
}
#markdownBody .archive-status {
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.04em;
}
#markdownBody .archive-status--live {
color: var(--text-muted);
}
#markdownBody .archive-status--rotted {
font-weight: 600;
}
/* ============================================================
PROVENANCE PANEL
A bordered box with a small-caps label; the metadata is a
two-column key/value grid: labels auto-sized, values take
the rest, long URLs and hashes wrap rather than overflow.
============================================================ */
#markdownBody .archive-provenance {
margin: 1.8rem 0;
padding: 1rem 1.2rem 1.1rem;
border: 1px solid var(--border);
border-radius: 2px;
}
#markdownBody .archive-panel-title {
margin: 0 0 0.7rem;
font-family: var(--font-sans);
font-size: 0.72rem;
font-weight: 600;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.12em;
color: var(--text-faint);
}
#markdownBody .archive-meta {
margin: 0;
display: grid;
grid-template-columns: max-content 1fr;
gap: 0.34rem 1.1rem;
}
#markdownBody .archive-meta dt {
font-family: var(--font-sans);
font-size: 0.78rem;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.05em;
color: var(--text-faint);
}
#markdownBody .archive-meta dd {
margin: 0;
font-family: var(--font-serif);
font-size: 0.92rem;
color: var(--text);
overflow-wrap: anywhere;
}
#markdownBody .archive-meta dd code {
font-family: var(--font-mono);
font-size: 0.82rem;
}
/* The author's reason-for-archiving note, set in the page measure. */
#markdownBody .archive-note {
margin: 1.6rem 0;
font-family: var(--font-serif);
font-size: 0.97rem;
font-style: italic;
line-height: 1.6;
color: var(--text-muted);
}
/* ============================================================
ARTIFACT VIEWER
A <div> (not a <figure> that carries prose framing) with a
mono caption bar that names the raw artifact and links to it,
and the artifact embedded raw beneath: the PDF renders in the
browser's native viewer, the HTML snapshot loads sandboxed.
============================================================ */
#markdownBody .archive-viewer {
margin: 1.8rem 0;
border: 1px solid var(--border-muted);
border-radius: 2px;
overflow: hidden;
box-shadow: 0 4px 12px rgba(0, 0, 0, 0.03);
}
#markdownBody .archive-viewer-bar {
display: flex;
align-items: baseline;
justify-content: space-between;
gap: 1rem;
padding: 0.45rem 0.75rem;
border-bottom: 1px solid var(--border-muted);
background: var(--bg-subtle);
}
#markdownBody .archive-viewer-name {
font-family: var(--font-mono);
font-size: 0.78rem;
color: var(--text-muted);
}
#markdownBody .archive-viewer-open {
font-family: var(--font-sans);
font-size: 0.76rem;
white-space: nowrap;
}
#markdownBody .archive-frame {
display: block;
width: 100%;
height: 80vh;
border: 0;
background: var(--bg);
}
/* ============================================================
EXTRACTED FULL TEXT
Always in the DOM, for embed.py / Pagefind. PDF text is
collapsed in a <details> and keeps its pdftotext layout in a
scrollable mono block; HTML text shows as serif paragraphs.
============================================================ */
#markdownBody .archive-fulltext-wrap {
margin: 1.8rem 0 0;
}
#markdownBody .archive-fulltext-title,
#markdownBody .archive-section-title {
margin: 0 0 0.6rem;
padding-bottom: 0.4rem;
border-bottom: 1px solid var(--border);
font-family: var(--font-sans);
font-size: 0.78rem;
font-weight: 600;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.1em;
color: var(--text-muted);
}
#markdownBody summary.archive-fulltext-title {
cursor: pointer;
}
#markdownBody .archive-fulltext-wrap > p {
margin: 0 0 0.85rem;
font-family: var(--font-serif);
font-size: 0.95rem;
line-height: 1.6;
color: var(--text);
}
/* The pdftotext block: scroll-capped so it never dominates the page. */
#markdownBody .archive-fulltext {
margin: 0.8rem 0 0;
padding: 0.9rem 1rem;
max-height: 60vh;
overflow: auto;
border: 1px solid var(--border);
border-radius: 2px;
background: var(--bg-subtle);
font-family: var(--font-mono);
font-size: 0.8rem;
line-height: 1.5;
color: var(--text-muted);
white-space: pre-wrap;
overflow-wrap: anywhere;
}
/* ============================================================
REFERENCED BY / RELATED
The site-wide .backlinks-list / .similar-links-list styles
(components.css) carry the lists themselves; these rules add
only the section framing and the granular fragment groups.
============================================================ */
#markdownBody .archive-backlinks,
#markdownBody .archive-related {
margin: 1.8rem 0 0;
}
#markdownBody .referenced-by-group {
margin-top: 0.9rem;
}
#markdownBody .referenced-by-fragment {
margin: 0 0 0.3rem;
font-family: var(--font-sans);
font-size: 0.72rem;
font-weight: 600;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.08em;
color: var(--text-faint);
}
/* ============================================================
REMOVAL NOTICE
A quiet italic footer line, set off by a top rule; present
on every archive page and on the index.
============================================================ */
#markdownBody .archive-removal {
margin: 2.4rem 0 0;
padding-top: 1rem;
border-top: 1px solid var(--border);
font-family: var(--font-serif);
font-size: 0.85rem;
font-style: italic;
line-height: 1.55;
color: var(--text-faint);
}
/* ============================================================
INDEX PAGE: /archive/
A text list in the catalog idiom: one hairline between rows,
the title in serif, type + date + any quality flag in quiet
sans pushed to the row's end.
============================================================ */
#markdownBody .archive-index-header {
margin-bottom: 1.8rem;
}
#markdownBody .archive-index-intro {
margin: 0.6rem 0 0;
font-family: var(--font-serif);
font-size: 1rem;
line-height: 1.6;
color: var(--text-muted);
}
#markdownBody .archive-list {
margin: 0;
padding: 0;
list-style: none;
}
#markdownBody .archive-list-item {
display: flex;
align-items: baseline;
justify-content: space-between;
gap: 0.4rem 1rem;
flex-wrap: wrap;
padding: 0.7rem 0;
border-bottom: 1px solid var(--border);
}
#markdownBody .archive-list-item:last-child {
border-bottom: none;
}
#markdownBody .archive-list-link {
font-family: var(--font-serif);
font-size: 1.05rem;
color: var(--text);
text-decoration: none;
}
#markdownBody .archive-list-link:hover {
text-decoration: underline;
text-underline-offset: 2px;
}
#markdownBody .archive-list-meta {
font-family: var(--font-sans);
font-size: 0.78rem;
color: var(--text-faint);
white-space: nowrap;
}
/* Non-'ok' capture flag — a dashed chip, echoing the entry-page note. */
#markdownBody .archive-quality-flag {
padding: 0.05em 0.4em;
border: 1px dashed var(--border-muted);
border-radius: 2px;
font-variant: all-small-caps;
font-feature-settings: "smcp" 1;
letter-spacing: 0.04em;
color: var(--text-muted);
}
/* A rotted entry is the one health state worth a solid, inked flag. */
#markdownBody .archive-quality-flag--rotted {
border-style: solid;
border-color: var(--text);
color: var(--text);
}
#markdownBody .archive-empty {
font-family: var(--font-serif);
font-style: italic;
color: var(--text-muted);
}
/* ============================================================
MOBILE
Collapse the provenance grid to stacked rows; trim the frame.
============================================================ */
@media (max-width: 540px) {
#markdownBody .archive-meta {
grid-template-columns: 1fr;
gap: 0;
}
#markdownBody .archive-meta dt {
margin-top: 0.55rem;
}
#markdownBody .archive-meta dt:first-of-type {
margin-top: 0;
}
#markdownBody .archive-frame {
height: 70vh;
}
}


@ -278,15 +278,11 @@ html {
line-height: var(--line-height);
-webkit-text-size-adjust: 100%;
scroll-behavior: smooth;
/* clip (not hidden) prevents horizontal scroll at the viewport level
without creating a scroll container, so position:sticky still works. */
overflow-x: clip;
}
body {
margin: 0;
padding: 0;
overflow-x: clip;
background-color: var(--bg);
color: var(--text);
transition: background-color var(--transition-fast),


@ -543,12 +543,28 @@ nav.site-nav {
using `aria-hidden="true"` (set by toc.js). The transition still
works because we keep `max-height: 0` for the visual collapse. */
.toc-nav {
overflow: hidden;
max-height: 80vh;
/* Fill the sticky sidebar's remaining height and scroll when the
outline is taller than the viewport. The subtracted ~2.6rem
mirrors #toc's own max-height budget (layout.css) minus the
.toc-header row (label + progress rule + its bottom margin). */
max-height: calc(100vh - var(--nav-height, 4rem) - 3rem - 2.6rem);
overflow-y: auto;
overflow-x: hidden;
overscroll-behavior: contain;
transition: max-height 0.3s ease;
scrollbar-width: thin;
scrollbar-color: var(--border) transparent;
}
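
The `max-height` budget above is easier to check with numbers plugged in. A small sketch, assuming 1rem = 16px and `--nav-height` at its 4rem fallback; `tocMaxHeightPx` is a hypothetical helper, not part of the site's scripts.

```javascript
// Assumption for illustration: 1rem resolves to 16px.
const REM_PX = 16;

// Mirrors calc(100vh - var(--nav-height, 4rem) - 3rem - 2.6rem) in pixels.
function tocMaxHeightPx(viewportHeightPx, navHeightRem = 4) {
  const reservedRem = navHeightRem + 3 + 2.6;
  return viewportHeightPx - reservedRem * REM_PX;
}

// Space left for the scrolling outline on a few viewport heights.
for (const vh of [600, 900, 1200]) {
  console.log(vh, Math.round(tocMaxHeightPx(vh)));
}
```

With these assumptions the reserved chrome is roughly 154px, so the outline scrolls once it exceeds the viewport height minus that amount.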
.toc-nav::-webkit-scrollbar { width: 6px; }
.toc-nav::-webkit-scrollbar-thumb {
background: var(--border);
border-radius: 3px;
}
#toc.is-collapsed .toc-nav {
max-height: 0;
/* overflow:hidden so the outline is clipped (not scrolled) while
the max-height transition runs down to 0. */
overflow: hidden;
}
#toc.is-collapsed .toc-nav a,
#toc.is-collapsed .toc-nav button {
@ -1849,3 +1865,50 @@ pre:hover .copy-btn,
min-height: 300px;
}
}
/* Archive affordance
The superscript "A" appended after a body link whose target is preserved
in the local archive (build/Filters/Archive.hs). Loaded site-wide because
the marker appears in essay/prose content, not on archive pages. */
.archive-affordance {
font-size: 0.7em;
margin-left: 0.15em;
line-height: 0;
}
.archive-affordance a {
font-family: var(--font-sans);
font-weight: 600;
text-decoration: none;
color: var(--text-faint);
border: 1px solid var(--border-muted);
border-radius: 2px;
padding: 0 0.25em;
}
.archive-affordance a:hover {
color: var(--text);
border-color: var(--text-muted);
background: var(--bg-subtle);
}
/* Dead-link flip: a body link whose archived target is `rotted` has its
href redirected to the local copy (build/Filters/Archive.hs). A dotted
underline marks the link as redirected; its marker becomes a solid chip
reading "archived" rather than the quiet bordered "A". */
.archive-rotted {
text-decoration-style: dotted;
}
.archive-affordance--rotted a {
color: var(--bg);
background: var(--text-muted);
border-color: var(--text-muted);
}
.archive-affordance--rotted a:hover {
color: var(--bg);
background: var(--text);
border-color: var(--text);
}


@ -5,10 +5,16 @@
The outer shell. Wide enough for TOC + body + sidenotes.
============================================================ */
body {
/* Body is plain block, which keeps the sticky nav header (a direct body
child) working reliably on iOS WebKit, where position: sticky on a
direct flex/grid child silently degrades to static. The sticky-footer
math (full-viewport min-height + flex column to push footer down)
moves to .page-shell, which wraps everything below the nav. */
.page-shell {
display: flex;
flex-direction: column;
min-height: 100vh;
min-height: calc(100dvh - var(--nav-height, 4rem));
}
/* ============================================================
@ -17,7 +23,14 @@ body {
(Nav styles live in components.css)
============================================================ */
body > header {
/* Site-nav header only: exclude the essay-frontmatter <header> that
essay/reading/blog templates emit as a body-level sibling so its
monogram and figure columns can span full viewport width. Without
the :not() guard, the essay header inherits the sticky / nav-bg /
border-bottom chrome meant only for the top navigation, painting
the wrong color band over the page and pinning the frontmatter to
the viewport top. */
body > header:not(.essay-frontmatter) {
width: 100%;
border-bottom: 1px solid var(--border);
background-color: var(--bg-nav);
@ -78,18 +91,20 @@ body > header {
/* ============================================================
STANDALONE PAGES (no #content wrapper)
essay-index, blog-index, tag-index, page, blog-post, search:
these emit #markdownBody as a direct child of <body>. Without
the #content flex-row wrapper there is no centering; fix it here.
these emit #markdownBody as a direct child of .page-shell.
Without the #content flex-row wrapper there is no centering;
fix it here. (Was body > #markdownBody before the page-shell
wrapper was introduced to keep iOS sticky working.)
============================================================ */
body > #markdownBody {
.page-shell > #markdownBody {
align-self: center;
padding: 2rem var(--page-padding);
flex: 1 0 auto;
}
@media (max-width: 680px) {
body > #markdownBody {
.page-shell > #markdownBody {
padding: 1.25rem var(--page-padding);
}
}
@ -99,7 +114,7 @@ body > #markdownBody {
FOOTER
============================================================ */
body > footer {
.page-shell > footer {
width: 100%;
border-top: 1px solid var(--border);
padding: 1.5rem var(--page-padding);
@ -109,6 +124,7 @@ body > footer {
display: flex;
justify-content: space-between;
align-items: center;
margin-top: auto;
}
.footer-left {
@ -217,7 +233,7 @@ body > footer {
}
/* Footer: stack vertically so three sections don't fight for width. */
body > footer {
.page-shell > footer {
flex-direction: column;
align-items: center;
gap: 0.3rem;


@ -9,40 +9,106 @@
suppress the slot div entirely when its SVG is empty).
============================================================ */
/* Three-column grid:
/* Three-column grid spanning the full viewport width:
[ monogram ] [ title block 1fr ] [ epistemic figure ]
The 1fr column absorbs all extra width. Mark slots are
sized to their content via grid auto-placement. When the
template guard suppresses one or both slot divs, the column
simply does not exist for layout purposes. */
The header lives outside #content so the grid stretches
edge-to-edge; the side columns are pinned to the same
`clamp()` width as the marks themselves so the middle
column stays symmetric in the viewport even when only one
slot has content (e.g. essays without status: render only
the monogram, but the right column still reserves its
width; without this, the title block would centre in an
off-axis 1fr cell and visibly slide right). The padding
mirrors #content's `2rem var(--page-padding)` so the
header aligns with the body below. */
.essay-frontmatter {
display: grid;
grid-template-columns: auto minmax(0, 1fr) auto;
grid-template-columns:
clamp(170px, 17vw, 280px)
minmax(0, 1fr)
clamp(170px, 17vw, 280px);
column-gap: clamp(0.75rem, 2vw, 1.75rem);
row-gap: 0.75rem;
align-items: center;
margin-bottom: 1rem;
padding: 2rem var(--page-padding);
width: 100%;
}
/* Reading variant (poetry / fiction) and blog variant share the
same grid; declared explicitly so future tweaks can diverge. */
.essay-frontmatter--reading,
.essay-frontmatter--blog {
grid-template-columns: auto minmax(0, 1fr) auto;
grid-template-columns:
clamp(170px, 17vw, 280px)
minmax(0, 1fr)
clamp(170px, 17vw, 280px);
}
/* Title block stays in the centre; never shrinks below 0. */
/* Title block: centred in its grid column, capped to roughly
the body column's measure so prose lines stay readable on
ultrawide viewports. `justify-self: center` is the explicit
override of grid items' default `stretch`; without it auto
margins do not centre because the item is already filling
the cell at max-width. */
.frontmatter-title {
min-width: 0;
width: 100%;
max-width: var(--body-max-width);
justify-self: center;
text-align: center;
}
/* Centre the title and metadata under the H1; this matches the
visual rhythm of the reference mockup, where the byline,
abstract, and compact strip sit in a stacked column. */
.frontmatter-title > .page-title {
margin-bottom: 0.25rem;
}
/* Abstract paragraph reads better left-aligned even inside a
centred title block: multi-line prose with ragged-right
on a centred axis becomes hard to track. */
.frontmatter-title .meta-description {
text-align: left;
}
/* Compact-strip chips sit comfortably wider than the abstract;
centre them so the row balances visually under the title. */
.frontmatter-title .meta-epistemic-strip {
justify-content: center;
}
/* Tailmatter: the body-level wrapper that hosts the metadata-tail
row (tags + keywords + affiliation + pagelinks). Constrained to
the body column's measure and centred so its contents read at the
same width they did before the layout split, while sitting
*above* #content so the TOC sidebar starts right under the
frontmatter divider rather than competing with it for the top
of the page. */
.essay-tailmatter {
width: min(var(--body-max-width), 100%);
margin: 0 auto;
padding: 0 var(--page-padding);
box-sizing: border-box;
}
/* The cursive-L frontmatter divider runs edge-to-edge of the
viewport: the dashed lines on either side of the L use the
existing `flex: 1` rule on `.content-divider::before` and
`::after` to fill whatever container they sit inside, so as a
body-level child the divider naturally spans the full page
width. The page-padding keeps the dashes off the literal edge
while still reading as a full-width separator. */
.content-divider--frontmatter {
padding: 0 var(--page-padding);
box-sizing: border-box;
}
/* Monogram placeholder (rendered when no mark.svg exists for the
piece; see Marks.monogramPlaceholder). Lower opacity so it reads
as a neutral frame, balancing the epistemic-figure column without
being mistaken for an authored glyph. */
.frontmatter-mark--placeholder svg {
opacity: 0.35;
}
/* Subtitle: a short secondary line, lighter than the H1, never
competing with it. Kept restrained so existing essays without
a subtitle render unchanged. */
@ -81,8 +147,6 @@
#markdownBody .frontmatter-mark {
margin: 0;
padding: 0;
width: 170px;
height: 170px;
max-width: none;
background: none;
border: none;
@ -91,6 +155,30 @@
color: var(--text);
}
/* Frontmatter header context: scale with viewport. 170 px floor on
narrow desktops, 280 px cap on ultrawide displays (matches the
monogram viewBox so the placeholder roundel reaches its native
edge). 17vw is the slope between the two: at ~1000 px it equals
the floor, at ~1650 px it equals the cap. */
.essay-frontmatter .frontmatter-mark {
width: clamp(170px, 17vw, 280px);
height: clamp(170px, 17vw, 280px);
}
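
The breakpoints quoted in the comment above follow from the `clamp()` arguments. A short sketch of that arithmetic; `clampPx` and `markSize` are hypothetical helper names used only for illustration.

```javascript
// CSS clamp(min, preferred, max) resolves to max(min, min(preferred, max)).
function clampPx(min, preferred, max) {
  return Math.max(min, Math.min(preferred, max));
}

// Mark size in px for a viewport width, mirroring clamp(170px, 17vw, 280px).
function markSize(viewportWidthPx) {
  return clampPx(170, 0.17 * viewportWidthPx, 280);
}

// Viewport widths where 17vw meets the floor and the cap.
const floorBreakpoint = 170 / 0.17; // about 1000
const capBreakpoint = 280 / 0.17; // about 1647

console.log(Math.round(floorBreakpoint), Math.round(capBreakpoint));
for (const w of [800, 1300, 1920]) {
  console.log(w, Math.round(markSize(w)));
}
```

Below the first breakpoint the mark stays at 170px, above the second it stays at 280px, and in between it scales linearly with the viewport.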
/* Item-card context: small inline glyph beside the kind badge.
Sized to read as a marker, not a competing figure. */
.item-card-monogram {
flex-shrink: 0;
line-height: 0;
color: var(--text-muted);
margin-top: 0.15em;
}
.item-card-monogram .frontmatter-mark {
width: 72px;
height: 72px;
}
/* SVG fills its parent figure exactly. !important defeats the
global `img, video, svg { max-width: 100%; height: auto }` in
base.css for our specific case (which would otherwise leave the


@ -36,6 +36,40 @@
SHARED POPUP CONTENT
============================================================ */
/* Internal-page popups with a monogram render in two columns:
the monogram on the left, title + abstract + meta on the right.
Without a monogram, the body fills normally (default block flow). */
.popup-internal.has-monogram {
display: grid;
grid-template-columns: 56px minmax(0, 1fr);
gap: 0.65rem;
align-items: start;
}
/* Source label spans both columns when a monogram is present so it
reads as the popup's source attribution, not as a column entry. */
.popup-internal.has-monogram > .popup-source {
grid-column: 1 / -1;
}
.popup-monogram {
grid-column: 1;
line-height: 0;
color: var(--text);
}
.popup-monogram svg {
display: block;
width: 100%;
height: auto;
color: inherit;
}
.popup-internal-body {
grid-column: 2;
min-width: 0;
}
.popup-title {
font-weight: 600;
color: var(--text);


@ -1,5 +1,10 @@
/* print.css: clean paper output.
Loaded on every page via <link media="print">.
Loaded LAST in head.html via <link media="print"> so its rules win
the cascade at print page widths (~595 to 820 CSS px on A4/Letter),
which otherwise trip every screen breakpoint below 900px (mobile
TOC bar, body bottom padding) and below 1499px (sidenote footnote
fallback).
Hides chrome, expands body full-width, renders in black on white. */
@media print {
@ -13,7 +18,8 @@
[data-theme="dark"],
[data-theme="cappuccino"] {
--bg: #ffffff;
--bg-offset: #f5f5f5;
--bg-nav: #ffffff;
--bg-offset: #ffffff;
--bg-subtle: #f9f9f9;
--text: #000000;
--text-muted: #333333;
@ -23,36 +29,71 @@
--rule: #cccccc;
}
/* Reading-mode body warm tints would otherwise repaint pages cream. */
body.reading-mode { --bg: #ffffff; }
/* ----------------------------------------------------------------
Hide chrome entirely
Hide chrome.
Site nav is `body > header` (templates/partials/nav.html); the
essay frontmatter is `.page-shell > header.essay-frontmatter`
(essay/blog-post/reading templates) and MUST stay visible: its
<h1> is the page title.
The mark slots (monogram + epistemic figure) and frontmatter
divider are decorative; suppressing them lets the title block
collapse to a clean masthead.
---------------------------------------------------------------- */
header,
footer,
body > header,
.page-shell > footer,
#toc,
#toc-mobile-bar,
#reading-progress,
.skip-link,
.settings-wrap,
.selection-popup,
.link-popup,
.ann-tooltip,
.ann-picker,
.sidenote-popup-overlay,
.toc-toggle,
.section-toggle,
.nav-portals,
.nav-portal-toggle,
.footer-ornament,
.content-divider,
.aftermatter-divider,
.frontmatter-mark-slot,
.metadata .meta-pagelinks,
.page-meta-footer #backlinks,
.page-meta-footer #similar-links,
.nav-portals {
.version-history-more {
display: none !important;
}
/* The mobile TOC bar's screen rule also adds `body { padding-bottom: 2.5rem }`
to clear the fixed strip; undo it on paper. */
body {
padding-bottom: 0 !important;
}
/* ----------------------------------------------------------------
Layout: single full-width column
---------------------------------------------------------------- */
body {
font-size: 11pt;
line-height: 1.6;
line-height: 1.55;
background: var(--bg);
color: var(--text);
margin: 0;
padding: 0;
}
.page-shell {
display: block !important;
min-height: 0 !important;
}
#content {
display: block !important;
width: 100% !important;
@ -60,7 +101,11 @@
margin: 0 !important;
}
#markdownBody {
#markdownBody,
.page-shell > #markdownBody,
body.reading-mode .page-shell > #markdownBody,
body.reading-mode.poetry > #markdownBody,
body.reading-mode.fiction > #markdownBody {
width: 100% !important;
max-width: 100% !important;
grid-column: unset !important;
@ -68,29 +113,157 @@
padding: 0 !important;
}
/* Sidenotes: pull inline as footnote-like blocks */
.sidenote-ref {
display: none;
/* ----------------------------------------------------------------
Essay frontmatter: collapse the 3-column viewport-spanning grid
to a single linear masthead. Mark slots already hidden above.
---------------------------------------------------------------- */
.essay-frontmatter {
display: block !important;
padding: 0 0 0.6em 0 !important;
margin: 0 0 1em 0 !important;
border-bottom: 1px solid var(--border);
}
.frontmatter-title {
max-width: 100% !important;
width: 100% !important;
text-align: left !important;
justify-self: stretch !important;
}
.frontmatter-title > .page-title {
font-size: 22pt;
margin: 0 0 0.2em 0;
line-height: 1.15;
}
.essay-subtitle {
font-size: 12pt;
margin: 0 0 0.4em 0;
}
.frontmatter-title .meta-description {
text-align: left !important;
}
.frontmatter-title .meta-epistemic-strip {
justify-content: flex-start !important;
}
.essay-tailmatter {
max-width: 100% !important;
width: 100% !important;
padding: 0 !important;
margin: 0 0 1em 0 !important;
}
.essay-summary {
background: transparent !important;
border: 1px solid var(--border-muted);
padding: 0.6em 0.8em !important;
margin: 0.5em 0 1em 0 !important;
}
/* ----------------------------------------------------------------
Sidenotes render inline as parenthetical asides.
The build's Sidenotes filter replaces Pandoc's
`section.footnotes` entirely with `<span class="sidenote">`
siblings of the ref, so there is no end-of-document footnote
section to fall back to. Sidenotes.css hides .sidenote at
narrow widths (which print triggers); print.css promotes them
back inline as small italic parentheticals so the surrounding
sentence flow stays intact; block-style footnotes mid-sentence
leave dangling clauses on the next line.
---------------------------------------------------------------- */
.sidenote {
display: block;
display: inline !important;
position: static !important;
width: auto !important;
margin: 0.5em 2em;
padding: 0.4em 0.8em;
border-left: 2px solid var(--border);
font-size: 9pt;
max-width: none !important;
margin: 0;
padding: 0;
border: none;
font-size: 0.85em;
font-style: italic;
color: var(--text-muted);
}
.sidenote::before { content: " ["; font-style: normal; }
.sidenote::after { content: "]"; font-style: normal; }
.sidenote-para {
display: inline !important;
margin: 0 !important;
}
.sidenote-para + .sidenote-para::before {
content: " / ";
font-style: normal;
color: var(--text-faint);
}
.sidenote-num {
display: none !important;
}
.sidenote-ref a {
color: var(--text);
text-decoration: none;
}
/* ----------------------------------------------------------------
User annotations: strip on-screen highlight backgrounds; keep
a faint underline so the marker still surfaces on paper.
---------------------------------------------------------------- */
mark.user-annotation {
background: transparent !important;
color: inherit !important;
padding: 0 !important;
text-decoration: underline;
text-decoration-color: var(--text-faint);
text-decoration-thickness: 0.5pt;
text-underline-offset: 0.18em;
}
/* ----------------------------------------------------------------
Figures: strip on-screen card chrome (bg-offset card, inner
image border, drop shadow). Plain image + caption reads more
naturally on paper.
---------------------------------------------------------------- */
#markdownBody figure {
background: transparent !important;
border: none !important;
box-shadow: none !important;
padding: 0 !important;
margin: 1em auto !important;
max-width: 100% !important;
}
#markdownBody figure img {
border: none !important;
}
#markdownBody figcaption {
font-size: 9pt;
margin-top: 0.3em;
}
/* ----------------------------------------------------------------
Drop cap: quieten the magazine flourish; at 3.8em it prints as
a giant black blob that wastes the first half page.
---------------------------------------------------------------- */
#markdownBody > p:first-of-type::first-letter,
#markdownBody .dropcap p::first-letter,
body.reading-mode.fiction > #markdownBody h2 + p::first-letter {
font-size: 2.4em;
line-height: 0.9;
}
/* ----------------------------------------------------------------
Page setup
---------------------------------------------------------------- */
@page {
margin: 2cm 2.5cm;
}
@page :first {
margin-top: 3cm;
margin: 2cm 2cm;
}
/* ----------------------------------------------------------------
@ -106,41 +279,49 @@
widows: 3;
}
pre, figure, .exhibit {
pre, figure, .exhibit, table {
page-break-inside: avoid;
break-inside: avoid;
}
/* Show href after external links */
a[href^="http"]::after {
content: " (" attr(href) ")";
font-size: 0.8em;
color: var(--text-faint);
word-break: break-all;
}
/* But not for nav or obvious UI links */
.cite-link::after,
.meta-tag::after,
a[href^="#"]::after {
/* Decorative inline link-icon glyphs (wikipedia W, arxiv X, github,
etc.): they render as an inline-block 0.75em × 0.75em masked
SVG, fine on screen but on paper they print as an opaque black
speckle next to every external link. Suppress them entirely.
Critical: this also unsets the fixed width/height/mask so any
`::after` content from another rule renders as plain inline text. */
a[data-link-icon-type="svg"]::after,
a[data-link-icon]::after {
content: none !important;
display: none !important;
}
/* External links keep their default underline; readers can follow
the live URL via the PDF's preserved link metadata. We don't
inline the href as printed text because (a) it duplicates the
link, (b) `word-break: break-all` URLs interact badly with the
link-icon ::after we just suppressed, and (c) it makes the page
visually noisy. */
/* ----------------------------------------------------------------
Code blocks: strip background, border only
---------------------------------------------------------------- */
pre, code {
background: var(--bg-subtle) !important;
background: transparent !important;
border: 1px solid var(--border-muted) !important;
box-shadow: none !important;
}
/* ----------------------------------------------------------------
Bibliography / footer: keep but compact
Page meta footer: keep epistemic + version history compact;
collapse the auto-fit grid so columns don't get pushed onto a
new page when the body ended near a break.
---------------------------------------------------------------- */
.page-meta-footer {
margin-top: 1.5em;
padding-top: 1em;
padding: 1em 0 0 0 !important;
border-top: 1px solid var(--border);
gap: 0.8em !important;
}
.meta-footer-full,
@ -148,4 +329,12 @@
width: 100% !important;
max-width: 100% !important;
}
.meta-footer-grid {
display: block !important;
}
.meta-footer-section {
margin-bottom: 0.8em;
}
}

View File

@ -48,9 +48,10 @@ body.reading-mode {
}
/* Reading body: narrower than the essay default (800px ~62ch).
Since reading.html emits body > #markdownBody (no #content grid),
the centering is handled by the existing layout.css rule. */
body.reading-mode > #markdownBody {
reading.html emits #markdownBody as a direct child of .page-shell
(no #content grid); centering is handled by the matching
.page-shell > #markdownBody rule in layout.css. */
body.reading-mode .page-shell > #markdownBody {
max-width: 62ch;
}

75
static/js/now.js Normal file
View File

@ -0,0 +1,75 @@
/* now.js: Keep the Current page's "Last updated" relative phrase
honest.
build/Now.hs renders `.now-stamp-relative` ("3 days ago") at build
time, relative to the build machine's clock. A page served days
later from cache/CDN would then lie. We recompute the phrase in the
browser from the `<time datetime>` attribute (an unambiguous
YYYY-MM-DD), against the visitor's own clock.
The bucket thresholds below mirror `relativeTime` in build/Now.hs
exactly; keep the two in sync. The server-rendered text remains the
no-JS fallback and is only replaced once we've recomputed. */
(function () {
'use strict';
function relative(days) {
if (days < 0) return ''; // future / clock skew
if (days === 0) return 'today';
if (days === 1) return 'yesterday';
if (days < 7) return days + ' days ago';
var n, unit;
if (days < 28) { n = Math.floor(days / 7); unit = 'week'; }
else if (days < 365) { n = Math.floor(days / 30); unit = 'month'; }
else { n = Math.floor(days / 365); unit = 'year'; }
return n === 1 ? ('1 ' + unit + ' ago')
: (n + ' ' + unit + 's ago');
}
function update() {
var stamp = document.querySelector('.now-stamp');
if (!stamp) return;
var timeEl = stamp.querySelector('.now-stamp-date');
if (!timeEl) return;
var iso = timeEl.getAttribute('datetime');
var m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(iso || '');
if (!m) return; // unparseable — leave the SSR fallback as-is
// Calendar-day difference, computed via UTC epoch days so DST
// transitions can't add or drop a day. "today" uses the
// visitor's *local* date components, matching what they'd
// read off a wall calendar.
var then = Date.UTC(+m[1], +m[2] - 1, +m[3]);
var local = new Date();
var today = Date.UTC(
local.getFullYear(),
local.getMonth(),
local.getDate()
);
var days = Math.round((today - then) / 86400000);
var text = relative(days);
var rel = stamp.querySelector('.now-stamp-relative');
if (!text) {
// No meaningful relative phrase (e.g. dated in the future):
// drop any stale server-rendered one rather than keep a lie.
if (rel) rel.remove();
return;
}
if (!rel) {
rel = document.createElement('span');
rel.className = 'now-stamp-relative';
stamp.appendChild(rel);
}
rel.textContent = text;
}
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', update);
} else {
update();
}
})();
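
The bucket thresholds in relative() are easy to sanity-check outside the browser. The sketch below is a hypothetical Python port (the names days_since and relative_phrase are illustrative and not part of the commit), assuming the thresholds exactly as written above. It also makes one edge visible: with these thresholds, 28 and 29 days fall into the month bucket with n = 0 and so read "0 months ago". Whether build/Now.hs behaves the same way cannot be confirmed from this diff.

```python
from datetime import date


def days_since(iso: str, today: date) -> int:
    """Calendar-day difference between a YYYY-MM-DD stamp and `today`."""
    return (today - date.fromisoformat(iso)).days


def relative_phrase(days: int) -> str:
    """Hypothetical port of relative() from static/js/now.js."""
    if days < 0:
        return ""  # future date or clock skew
    if days == 0:
        return "today"
    if days == 1:
        return "yesterday"
    if days < 7:
        return f"{days} days ago"
    if days < 28:
        n, unit = days // 7, "week"
    elif days < 365:
        n, unit = days // 30, "month"
    else:
        n, unit = days // 365, "year"
    return f"1 {unit} ago" if n == 1 else f"{n} {unit}s ago"


if __name__ == "__main__":
    # Print the phrase at each bucket boundary.
    for d in (0, 1, 6, 7, 27, 28, 29, 30, 364, 365):
        print(f"{d:>3} -> {relative_phrase(d)!r}")
```

Working on whole calendar dates, as days_since does, sidesteps the DST concern that the JavaScript handles by going through Date.UTC.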

View File

@ -348,6 +348,16 @@
var titleEl = doc.querySelector('h1.page-title');
if (!titleEl) return null;
/* Monogram \u2014 only when the source page renders a real
authored mark.svg (not the placeholder roundel that
the empty-frontmatter slot uses for symmetric layout).
Serialised as an outerHTML string and trusted as
already-sanitised SVG produced by our own build. */
var monoEl = doc.querySelector(
'figure.frontmatter-mark--monogram:not(.frontmatter-mark--placeholder) svg'
);
var mono = monoEl ? monoEl.outerHTML : '';
/* Abstract */
var abstrEl = doc.querySelector('.meta-description');
var abstract = abstrEl ? abstrEl.textContent.trim() : '';
@ -375,13 +385,16 @@
].filter(Boolean).join(' · ');
return store(href,
'<div class="popup-internal">'
'<div class="popup-internal' + (mono ? ' has-monogram' : '') + '">'
+ srcHtml('internal', 'levineuwirth.org')
+ (mono ? '<div class="popup-monogram" aria-hidden="true">' + mono + '</div>' : '')
+ '<div class="popup-internal-body">'
+ (tags ? '<div class="popup-tags">' + esc(tags) + '</div>' : '')
+ '<div class="popup-title">' + esc(titleEl.textContent.trim()) + '</div>'
+ (authors ? '<div class="popup-authors">' + esc(authors) + '</div>' : '')
+ (abstract ? '<div class="popup-abstract">' + esc(abstract) + '</div>' : '')
+ (stats ? '<div class="popup-meta">' + esc(stats) + '</div>' : '')
+ '</div>'
+ '</div>');
})
.catch(function () { return null; });

View File

@ -0,0 +1,23 @@
<div id="content">
<main id="markdownBody" data-pagefind-body>
<header class="archive-index-header">
<h1 class="page-title">$title$</h1>
<p class="archive-index-intro">Local snapshots of works referenced across the site, preserved against link rot. Each is an archived copy; the original is linked prominently from its page.</p>
</header>
$if(has-entries)$
<ul class="archive-list">
$for(entries)$
<li class="archive-list-item">
<a class="archive-list-link" href="$entry-url$">$entry-title$</a>
<span class="archive-list-meta">$entry-type$ &middot; archived $entry-archived$$if(entry-degraded)$ &middot; <span class="archive-quality-flag">$entry-quality$ capture</span>$endif$$if(entry-private)$ &middot; <span class="archive-quality-flag">private</span>$endif$$if(entry-rotted)$ &middot; <span class="archive-quality-flag archive-quality-flag--rotted">link rotted</span>$endif$</span>
</li>
$endfor$
</ul>
$else$
<p class="archive-empty">Nothing archived yet.</p>
$endif$
$partial("templates/partials/archive-removal-notice.html")$
</main>
</div>

109
templates/archive.html Normal file
View File

@ -0,0 +1,109 @@
<div id="content">
<main id="markdownBody" data-pagefind-body data-pagefind-filter="type:archive, status:$status$">
<article class="archive-entry">
<header class="archive-header">
<h1 class="page-title">$title$</h1>
$partial("templates/partials/archive-banner.html")$
$if(status-note)$
<p class="archive-status-note archive-status-note--$status$" role="note">
$status-note$
</p>
$endif$
$if(degraded)$
<p class="archive-degraded" role="note">
<span class="archive-degraded-label">Capture: $snapshot-quality$</span>
Some of the original's content (images, scripted elements)
may be missing or incomplete in this snapshot. The original
is linked above.
</p>
$endif$
</header>
<section class="archive-provenance" aria-label="Provenance">
<h2 class="archive-panel-title">Provenance</h2>
<dl class="archive-meta">
<dt>Original</dt>
<dd><a href="$original-url$" rel="noopener noreferrer" target="_blank">$original-url$</a></dd>
<dt>Link status</dt>
<dd class="archive-status archive-status--$status$">$status$</dd>
<dt>Archived</dt>
<dd>$archived$</dd>
<dt>Type</dt>
<dd>$archive-type$</dd>
<dt>Snapshot quality</dt>
<dd>$snapshot-quality$</dd>
<dt>Size</dt>
<dd>$size$</dd>
<dt>SHA-256</dt>
<dd><code>$sha-short$&hellip;</code></dd>
$if(wayback)$
<dt>Wayback</dt>
<dd><a href="$wayback$" rel="noopener noreferrer" target="_blank">web.archive.org copy</a></dd>
$endif$
$if(paywalled)$
<dt>Access</dt>
<dd>The original sits behind a paywall.</dd>
$endif$
$if(private)$
<dt>Visibility</dt>
<dd>private &mdash; held offline</dd>
$endif$
</dl>
</section>
$if(note)$<p class="archive-note">$note$</p>$endif$
$if(private)$
<p class="archive-private" role="note">
This work is archived <strong>privately</strong>: a local
preservation copy is kept against link rot, but the artifact
is not published here. Use the original link above to read it.
</p>
$else$
<div class="archive-viewer">
<div class="archive-viewer-bar">
<span class="archive-viewer-name">$artifact-name$</span>
<a class="archive-viewer-open" href="$artifact-url$" target="_blank" rel="noopener noreferrer">Open raw&nbsp;&#8599;</a>
</div>
$if(is-pdf)$
<iframe class="archive-frame" src="$artifact-url$" title="$title$ &mdash; archived document" loading="lazy"></iframe>
$endif$
$if(is-html)$
<iframe class="archive-frame" src="$artifact-url$" title="$title$ &mdash; archived snapshot" sandbox referrerpolicy="no-referrer" loading="lazy"></iframe>
$endif$
</div>
$endif$
$if(fulltext)$
$if(is-pdf)$
<details class="archive-fulltext-wrap">
<summary class="archive-fulltext-title">Full text (extracted)</summary>
$fulltext$
</details>
$endif$
$if(is-html)$
<section class="archive-fulltext-wrap">
<h2 class="archive-fulltext-title">Readable text (extracted)</h2>
$fulltext$
</section>
$endif$
$endif$
$if(referenced-by)$
<section class="archive-backlinks">
<h2 class="archive-section-title">Referenced by</h2>
$referenced-by$
</section>
$endif$
$if(similar-links)$
<section class="archive-related">
<h2 class="archive-section-title">Related</h2>
$similar-links$
</section>
$endif$
$partial("templates/partials/archive-removal-notice.html")$
</article>
</main>
</div>

View File

@ -1,21 +1,19 @@
<header class="essay-frontmatter essay-frontmatter--blog">
<div class="frontmatter-mark-slot frontmatter-mark-slot--left">$monogramSvg$</div>
<div class="frontmatter-title">
<h1 class="page-title">$title$</h1>
$if(subtitle)$<p class="essay-subtitle">$subtitle$</p>$endif$
$if(date)$
<p class="post-date"><time class="date-hover" datetime="$date-iso$" data-date-start="$date-iso$">$date$</time></p>
$endif$
</div>
$if(epistemicSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--right">
<a href="#epistemic" aria-label="Jump to epistemic profile">$epistemicSvg$</a>
</div>
$endif$
</header>
<main id="markdownBody" data-pagefind-body>
<header class="essay-frontmatter essay-frontmatter--blog">
$if(monogramSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--left">$monogramSvg$</div>
$endif$
<div class="frontmatter-title">
<h1 class="page-title">$title$</h1>
$if(subtitle)$<p class="essay-subtitle">$subtitle$</p>$endif$
$if(date)$
<p class="post-date"><time class="date-hover" datetime="$date-iso$" data-date-start="$date-iso$">$date$</time></p>
$endif$
</div>
$if(epistemicSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--right">
<a href="#epistemic" aria-label="Jump to epistemic profile">$epistemicSvg$</a>
</div>
$endif$
</header>
$body$
$if(backlinks)$
<footer class="page-meta-footer">

View File

@ -12,8 +12,10 @@ $if(search)$
<script src="/js/semantic-search.js" defer></script>
<script src="/js/search-filters.js" defer></script>
$endif$
<div class="page-shell">
$body$
$partial("templates/partials/footer.html")$
</div>
<!-- JS — all deferred -->
<script src="/js/popups.js" defer></script>
<script src="/js/annotations.js" defer></script>
@ -27,6 +29,7 @@ $partial("templates/partials/footer.html")$
<script src="/js/lightbox.js" defer></script>
$if(home)$<script src="/js/random.js" defer></script>$endif$
$if(reading)$<script src="/js/reading.js" defer></script>$endif$
$if(now)$<script src="/js/now.js" defer></script>$endif$
$if(photography)$<script src="/js/photography-modes.js" defer></script>$endif$
$if(photography-map)$<script src="/leaflet/leaflet.js" defer></script>$endif$
$if(photography-map)$<script src="/leaflet/leaflet.markercluster.js" defer></script>$endif$

View File

@ -1,3 +1,28 @@
<header class="essay-frontmatter">
<div class="frontmatter-mark-slot frontmatter-mark-slot--left">$monogramSvg$</div>
<div class="frontmatter-title">
<h1 class="page-title">$title$</h1>
$if(subtitle)$<p class="essay-subtitle">$subtitle$</p>$endif$
$partial("templates/partials/metadata-header.html")$
</div>
$if(epistemicSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--right">
<a href="#epistemic" aria-label="Jump to epistemic profile">$epistemicSvg$</a>
</div>
$endif$
</header>
<div class="essay-tailmatter">
$partial("templates/partials/metadata-tail.html")$
$if(summary)$
<div class="essay-summary" data-pagefind-ignore="all">
<div class="essay-summary-label">Summary</div>
$summary$
</div>
$endif$
</div>
<div class="content-divider content-divider--frontmatter" aria-hidden="true">
<a href="/new.html" class="content-divider-logo" aria-label="New"></a>
</div>
<div id="content">
<aside id="toc" aria-label="Table of contents" data-pagefind-ignore="all">
<div class="toc-header">
@ -9,31 +34,6 @@
</nav>
</aside>
<main id="markdownBody" data-pagefind-body$if(no-collapse)$ data-no-collapse$endif$>
<header class="essay-frontmatter">
$if(monogramSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--left">$monogramSvg$</div>
$endif$
<div class="frontmatter-title">
<h1 class="page-title">$title$</h1>
$if(subtitle)$<p class="essay-subtitle">$subtitle$</p>$endif$
$partial("templates/partials/metadata-header.html")$
</div>
$if(epistemicSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--right">
<a href="#epistemic" aria-label="Jump to epistemic profile">$epistemicSvg$</a>
</div>
$endif$
</header>
$partial("templates/partials/metadata-tail.html")$
$if(summary)$
<div class="essay-summary" data-pagefind-ignore="all">
<div class="essay-summary-label">Summary</div>
$summary$
</div>
$endif$
<div class="content-divider" aria-hidden="true">
<a href="/new.html" class="content-divider-logo" aria-label="New"></a>
</div>
$body$
</main>
</div>

View File

@ -0,0 +1,5 @@
<div class="archive-banner" role="note">
<p class="archive-banner-label">Archived copy</p>
<p class="archive-banner-text">A local preservation snapshot taken $archived$ &mdash; this page is not the original.</p>
<a class="archive-banner-original" href="$original-url$" rel="noopener noreferrer" target="_blank">View the original&nbsp;&#8599;</a>
</div>

View File

@ -0,0 +1,5 @@
<p class="archive-removal">
This is an archived copy, preserved so that a work cited across the site
survives the original going dark. To request removal, email
<a href="mailto:ln@levineuwirth.org">ln@levineuwirth.org</a>.
</p>

View File

@ -7,7 +7,7 @@
<span class="footer-license"><a href="https://creativecommons.org/licenses/by-nc-sa/4.0/" rel="license">CC&nbsp;BY-NC-SA&nbsp;4.0</a> · <a href="https://git.levineuwirth.org/neuwirth/levineuwirth.org">MIT</a> · <a href="/memento-mori.html" class="footer-mm">MM</a></span>
</div>
<div class="footer-right">
<a href="/build/" class="footer-build-link" aria-label="Build telemetry">build</a> $build-time$
<a href="/build/" class="footer-build-link" aria-label="Build telemetry">build</a> <span class="footer-build-time" data-build-time>$build-time$</span>
· <a href="$url$.sig" class="footer-sig-link" aria-label="PGP signature for this page" title="Ed25519 signing subkey C9A42A6F AD444FBE 566FD738 531BDC1C C2707066 · public key at /gpg/pubkey.asc">sig</a>
</div>
</footer>

View File

@ -2,6 +2,7 @@
<meta name="viewport" content="width=device-width, initial-scale=1">
$if(home)$<title>Levi Neuwirth</title>$else$$if(title)$<title>$title$ — Levi Neuwirth</title>$else$<title>Levi Neuwirth</title>$endif$$endif$
$if(description)$<meta name="description" content="$description$">$endif$
$if(noindex)$<meta name="robots" content="noindex">$endif$
<link rel="canonical" href="$site-url$$url$">
<link rel="alternate" type="application/atom+xml" title="Levi Neuwirth" href="/feed.xml">
<link rel="alternate" type="application/atom+xml" title="Levi Neuwirth — music" href="/music/feed.xml">
@ -49,6 +50,7 @@ $if(build)$<link rel="stylesheet" href="/css/build.css">$endif$
$if(reading)$<link rel="stylesheet" href="/css/reading.css">$endif$
$if(composition)$<link rel="stylesheet" href="/css/score-reader.css">$endif$
$if(photography)$<link rel="stylesheet" href="/css/photography.css">$endif$
$if(archive)$<link rel="stylesheet" href="/css/archive.css">$endif$
$if(photography-map)$<link rel="stylesheet" href="/leaflet/leaflet.css">$endif$
$if(photography-map)$<link rel="stylesheet" href="/leaflet/MarkerCluster.css">$endif$
$if(photography-map)$<link rel="stylesheet" href="/leaflet/MarkerCluster.Default.css">$endif$

View File

@ -1,5 +1,8 @@
<li class="item-card">
<li class="item-card$if(has-monogram)$ item-card--has-monogram$endif$">
<span class="item-card-kind">$item-kind$</span>
$if(has-monogram)$
<span class="item-card-monogram" aria-hidden="true">$monogramSvg$</span>
$endif$
<div class="item-card-main">
<div class="item-card-header">
<a class="item-card-title" href="$url$">$title$</a>

View File

@ -1,15 +1,13 @@
<div id="reading-progress" aria-hidden="true"></div>
<main id="markdownBody" data-pagefind-body$if(no-collapse)$ data-no-collapse$endif$>
<header class="essay-frontmatter essay-frontmatter--reading">
$if(monogramSvg)$
<div class="frontmatter-mark-slot frontmatter-mark-slot--left">$monogramSvg$</div>
$endif$
<div class="frontmatter-title">
<h1 class="page-title">$title$</h1>
$if(subtitle)$<p class="essay-subtitle">$subtitle$</p>$endif$
$partial("templates/partials/metadata-header.html")$
</div>
</header>
<header class="essay-frontmatter essay-frontmatter--reading">
<div class="frontmatter-mark-slot frontmatter-mark-slot--left">$monogramSvg$</div>
<div class="frontmatter-title">
<h1 class="page-title">$title$</h1>
$if(subtitle)$<p class="essay-subtitle">$subtitle$</p>$endif$
$partial("templates/partials/metadata-header.html")$
</div>
</header>
<div class="essay-tailmatter">
$partial("templates/partials/metadata-tail.html")$
$if(summary)$
<div class="essay-summary" data-pagefind-ignore="all">
@ -17,9 +15,11 @@
$summary$
</div>
$endif$
<div class="content-divider" aria-hidden="true">
<a href="/new.html" class="content-divider-logo" aria-label="New"></a>
</div>
</div>
<div class="content-divider content-divider--frontmatter" aria-hidden="true">
<a href="/new.html" class="content-divider-logo" aria-label="New"></a>
</div>
<main id="markdownBody" data-pagefind-body$if(no-collapse)$ data-no-collapse$endif$>
$body$
</main>
$partial("templates/partials/page-footer.html")$

1151
tools/archive.py Normal file

File diff suppressed because it is too large

225
tools/audit-marks.py Executable file
View File

@ -0,0 +1,225 @@
#!/usr/bin/env python3
"""Audit frontmatter marks (monograms + epistemic figures).

Walks ``content/**/*.md``, resolves each piece's monogram candidate
path, checks whether ``mark.svg`` exists and whether ``status:`` is
set, and emits a table plus corpus-wide coverage percentages. Output
is pure ASCII so it pipes / scrolls cleanly.

Run as::

    make audit-marks

or directly via::

    uv run python tools/audit-marks.py

Exit code is 0 whatever the audit finds (1 only when ``content/`` is
missing); this is a report tool, not a gate.

The dual-form path resolver matches ``build/Marks.hs``:

* ``content/essays/foo.md`` -> ``content/essays/foo.mark.svg``
* ``content/essays/foo/index.md`` -> ``content/essays/foo/mark.svg``

Photography is excluded: visual content doesn't carry monograms or
epistemic figures by design (see PHOTOGRAPHY.md).
"""

from __future__ import annotations

import sys
from dataclasses import dataclass
from pathlib import Path

import yaml

CONTENT_ROOT = Path("content")

# Sections that ship marks by design. They lead the coverage summary,
# in this order; every other section follows alphabetically. A section
# with no pieces gets no line.
PRIMARY_SECTIONS = ("essays", "blog", "poetry", "fiction", "music")

# Excluded entirely: visual content (PHOTOGRAPHY.md), in-progress
# drafts, and the per-portal tag-meta sidecar tree (which is metadata
# infrastructure, not authored pieces).
SKIPPED_DIRS = ("photography", "drafts", "tag-meta")


@dataclass
class AuditRow:
    """One row of audit output for a single source file."""

    path: Path
    section: str
    has_monogram: bool
    has_status: bool

    @property
    def suggestion(self) -> str:
        actions = []
        if not self.has_monogram:
            actions.append("add mark.svg")
        if not self.has_status:
            actions.append("set status:")
        return ", ".join(actions)


def parse_frontmatter(md_path: Path) -> dict:
    """Extract the YAML frontmatter block from a Markdown file.

    Returns an empty dict on parse failure or when no frontmatter is
    present. Errors are non-fatal; the audit reports what it can."""
    try:
        text = md_path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return {}
    if not text.startswith("---"):
        return {}
    end = text.find("\n---", 3)
    if end == -1:
        return {}
    fm_block = text[3:end]
    try:
        data = yaml.safe_load(fm_block)
    except yaml.YAMLError:
        return {}
    return data if isinstance(data, dict) else {}


def monogram_path(md_path: Path) -> Path:
    """Resolve the candidate ``mark.svg`` path for a Markdown source.

    Mirrors ``Marks.monogramCandidates`` in build/Marks.hs."""
    if md_path.name == "index.md":
        return md_path.parent / "mark.svg"
    return md_path.with_suffix(".mark.svg")


def section_of(path: Path) -> str:
    """Bucket a content path under its top-level section name.

    Returns ``"standalone"`` for files directly under ``content/``."""
    rel = path.relative_to(CONTENT_ROOT)
    if len(rel.parts) == 1:
        return "standalone"
    return rel.parts[0]


def collect() -> list[AuditRow]:
    """Walk content/ and return one AuditRow per published source file."""
    rows: list[AuditRow] = []
    for md_path in CONTENT_ROOT.rglob("*.md"):
        rel = md_path.relative_to(CONTENT_ROOT)
        if rel.parts and rel.parts[0] in SKIPPED_DIRS:
            continue
        # Skip tag-meta sidecars (they're not authored pages).
        if md_path.name == "_tag-meta.md":
            continue
        fm = parse_frontmatter(md_path)
        # A bare `status:` parses to None; count it as unset rather
        # than letting str(None) == "None" pass the non-empty check.
        status = fm.get("status")
        rows.append(
            AuditRow(
                path=md_path,
                section=section_of(md_path),
                has_monogram=monogram_path(md_path).is_file(),
                has_status=status is not None and bool(str(status).strip()),
            )
        )
    rows.sort(
        key=lambda r: (
            r.section == "standalone",  # False sorts first: standalone last
            r.section,
            not r.has_status,
            not r.has_monogram,
            str(r.path),
        )
    )
    return rows


def fmt_check(present: bool) -> str:
    return "OK" if present else "--"


def render_table(rows: list[AuditRow]) -> None:
    if not rows:
        print("No content files found under content/.")
        return
    path_w = max(len(str(r.path)) for r in rows)
    path_w = min(path_w, 60)  # cap so suggestions stay on the same line
    header = f"{'PATH':<{path_w}} {'MONO':<5} {'EPIS':<5} SUGGESTION"
    print(header)
    print("-" * len(header))
    current_section = None
    for r in rows:
        if r.section != current_section:
            current_section = r.section
            print(f"\n# {current_section}")
        path_str = str(r.path)
        if len(path_str) > path_w:
            # Leave room for the three-character ellipsis so the
            # truncated path is exactly path_w wide.
            path_str = path_str[: path_w - 3] + "..."
        print(
            f"{path_str:<{path_w}} "
            f"{fmt_check(r.has_monogram):<5} "
            f"{fmt_check(r.has_status):<5} "
            f"{r.suggestion}"
        )


def render_summary(rows: list[AuditRow]) -> None:
    print()
    print("# Coverage")
    print("-" * 60)
    by_section: dict[str, list[AuditRow]] = {}
    for r in rows:
        by_section.setdefault(r.section, []).append(r)

    def line(label: str, group: list[AuditRow]) -> None:
        n = len(group)
        if n == 0:
            return
        m = sum(1 for r in group if r.has_monogram)
        e = sum(1 for r in group if r.has_status)
        print(
            f"{label:<14} {n:>3} pieces "
            f"monogram {m:>3}/{n:<3} ({m * 100 // n:>3}%) "
            f"epistemic {e:>3}/{n:<3} ({e * 100 // n:>3}%)"
        )

    rendered: set[str] = set()
    for section in PRIMARY_SECTIONS:
        if section in by_section:
            line(section, by_section[section])
            rendered.add(section)
    other_sections = sorted(s for s in by_section if s not in rendered)
    for section in other_sections:
        line(section, by_section[section])
    print("-" * 60)
    line("total", rows)


def main() -> int:
    if not CONTENT_ROOT.is_dir():
        print(f"error: {CONTENT_ROOT}/ not found (run from repo root)",
              file=sys.stderr)
        return 1
    rows = collect()
    render_table(rows)
    render_summary(rows)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
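
The dual-form resolver is the part of the audit that has to agree with build/Marks.hs and the pre-commit hook, so it is worth seeing in isolation. The sketch below copies monogram_path as it appears above and runs it on the two example paths from the module docstring; coverage_pct is a hypothetical helper added only to show the floor-division rounding the summary line uses.

```python
from pathlib import Path


def monogram_path(md_path: Path) -> Path:
    """Candidate mark.svg path for a Markdown source (both layouts)."""
    if md_path.name == "index.md":
        # Directory-form piece: the mark sits beside index.md.
        return md_path.parent / "mark.svg"
    # File-form piece: swap the .md suffix for .mark.svg.
    return md_path.with_suffix(".mark.svg")


def coverage_pct(have: int, total: int) -> int:
    """Hypothetical helper: whole-number percentage, rounded down."""
    return have * 100 // total


if __name__ == "__main__":
    for src in ("content/essays/foo.md", "content/essays/foo/index.md"):
        print(src, "->", monogram_path(Path(src)).as_posix())
    print("2 of 3 pieces:", coverage_pct(2, 3), "%")
```

Because the percentage is floored, a section only reads 100% when every piece is covered.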

BIN
tools/bin/monolith Executable file

Binary file not shown.

View File

@ -48,7 +48,16 @@ MIN_SCORE = 0.30 # similar-links: discard weak matches
MIN_PARA_CHARS = 80 # semantic: skip very short paragraphs
MAX_PARA_CHARS = 1000 # semantic: truncate before embedding
EXCLUDE_URLS = {"/search/", "/build/", "/404.html", "/feed.xml", "/music/feed.xml"}
# /archive/ is the archive index — a list page that would dominate every
# entry's "Related" set; the individual /archive/<slug>/ pages stay in.
EXCLUDE_URLS = {"/search/", "/build/", "/404.html", "/feed.xml",
"/music/feed.xml", "/archive/"}
# Whole subtrees kept out of the corpus. /source/ is the repository code
# mirror — source files, not content; left in, they pollute every page's
# "Related" set and semantic search (e.g. a template file surfacing as a
# neighbour, titled with its unrendered "$title$" placeholder).
EXCLUDE_PREFIXES = ("/source/",)
# Pages whose <body data-portal> are portal/landing pages — they aggregate
# excerpts from many entries and would otherwise dominate every page's
@ -122,7 +131,7 @@ def extract_page(html_path: Path) -> dict | None:
soup = BeautifulSoup(raw, "html.parser")
url = _url_from_path(html_path)
if url in EXCLUDE_URLS:
if url in EXCLUDE_URLS or url.startswith(EXCLUDE_PREFIXES):
return None
body_tag = soup.body
if body_tag is not None and body_tag.has_attr(PORTAL_BODY_ATTR):
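
The new check relies on str.startswith accepting a tuple of prefixes, which is why EXCLUDE_PREFIXES is a tuple and not a set. A minimal sketch of the combined rule, assuming the two constants as they appear in this hunk; is_excluded and the sample URLs are hypothetical names chosen for illustration:

```python
EXCLUDE_URLS = {"/search/", "/build/", "/404.html", "/feed.xml",
                "/music/feed.xml", "/archive/"}
EXCLUDE_PREFIXES = ("/source/",)


def is_excluded(url: str) -> bool:
    """True when a page should be kept out of the embedding corpus."""
    # Exact match for single pages, prefix match for whole subtrees.
    return url in EXCLUDE_URLS or url.startswith(EXCLUDE_PREFIXES)


if __name__ == "__main__":
    for url in ("/archive/", "/archive/some-slug/",
                "/source/templates/essay.html", "/essays/foo.html"):
        print(f"{url:<32} excluded={is_excluded(url)}")
```

The exact-match set drops only the archive index, so individual /archive/<slug>/ pages stay in the corpus, while everything under /source/ is dropped.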

72
tools/hooks/pre-commit-marks.sh Executable file
View File

@ -0,0 +1,72 @@
#!/usr/bin/env bash
# Pre-commit advisory: warn when newly-added essay files are missing a
# monogram (mark.svg) or the epistemic status field. Warning only —
# this hook never blocks a commit. The author is the one staging, and
# the audit table at `make audit-marks` is the canonical view; this
# hook just nudges at the moment of commit.
#
# Install (one-time):
#
# ln -s ../../tools/hooks/pre-commit-marks.sh .git/hooks/pre-commit
#
# Or chain into an existing pre-commit:
#
# bash tools/hooks/pre-commit-marks.sh
#
# Scope: newly-added (status `A`) .md files under content/essays/.
# Modified files are not warned about — the author has presumably made
# a deliberate choice about marks by then.
set -u
# Newly-added .md files under content/essays/ in this commit.
mapfile -t added < <(
git diff --cached --name-only --diff-filter=A -- 'content/essays/*.md'
)
if [[ ${#added[@]} -eq 0 ]]; then
exit 0
fi
warnings=0
for path in "${added[@]}"; do
# Resolve the dual-form mark.svg candidate path. Mirrors
# build/Marks.hs and tools/audit-marks.py.
if [[ "$(basename -- "$path")" == "index.md" ]]; then
mark="$(dirname -- "$path")/mark.svg"
else
mark="${path%.md}.mark.svg"
fi
has_mark=0
has_status=0
[[ -f "$mark" ]] && has_mark=1
# Best-effort frontmatter probe: does any line in the YAML head
# block start with `status:`? Avoids a YAML dependency in the
# hook, which has to run before the build environment is sourced.
if awk '/^---$/{f++; next} f==1 && /^status:[[:space:]]*[^[:space:]]/{print; exit}' \
"$path" \
| grep -q .; then
has_status=1
fi
if [[ $has_mark -eq 0 || $has_status -eq 0 ]]; then
if [[ $warnings -eq 0 ]]; then
echo "[marks] advisory: newly-added essays missing marks:" >&2
fi
msgs=()
[[ $has_mark -eq 0 ]] && msgs+=("no mark.svg at $mark")
[[ $has_status -eq 0 ]] && msgs+=("no status: in frontmatter")
printf ' %s — %s\n' "$path" "$(IFS=, ; echo "${msgs[*]}")" >&2
warnings=$((warnings + 1))
fi
done
if [[ $warnings -gt 0 ]]; then
echo "[marks] (advisory only — commit not blocked. \`make audit-marks\` for the full report.)" >&2
fi
exit 0
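
For comparison with the YAML-based check in tools/audit-marks.py, the hook's awk probe can be restated in a few lines of Python. This is a sketch of the same line-oriented heuristic, not code from the commit; has_status_line and the sample documents are hypothetical.

```python
import re

# Same shape as the awk pattern: `status:` followed by a non-blank value.
STATUS_RE = re.compile(r"^status:\s*\S")


def has_status_line(text: str) -> bool:
    """True if the first `---` fenced block has a non-empty status: line."""
    fences = 0
    for line in text.splitlines():
        if line == "---":
            fences += 1
            continue
        # Only lines between the first and second fence count.
        if fences == 1 and STATUS_RE.match(line):
            return True
    return False


if __name__ == "__main__":
    with_status = "---\ntitle: Foo\nstatus: draft\n---\nBody\n"
    empty_status = "---\ntitle: Foo\nstatus:\n---\nstatus: in body\n"
    print(has_status_line(with_status), has_status_line(empty_status))
```

Like the awk version, this ignores indented or quoted keys, which is acceptable for an advisory that never blocks a commit.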

View File

@ -0,0 +1,17 @@
# Pinned monolith binary — the HTML-snapshot tool for the link archive.
#
# Unlike PDF.js / Leaflet (servable assets downloaded at build time and
# gitignored), monolith is a build-time *executable*: the binary itself is
# committed at tools/bin/monolith so `git clone` -> `make build` needs no
# network fetch and stays reproducible from a bare clone. See ARCHIVE.md.
#
# To re-vendor (version bump, or a build host on a different architecture):
# 1. Download the matching asset from
# https://github.com/Y2Z/monolith/releases
# 2. Place it at tools/bin/monolith and `chmod +x`.
# 3. Update the three values below; verify `tools/bin/monolith --version`.
# 4. Commit the binary and this file together.
version = 2.10.1
asset = monolith-gnu-linux-x86_64
sha256 = 663ca914b078e91d5a854b4a07e913c613bbbcfe8fb11a24da1a6ab23c9205df
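
Step 3 of the re-vendoring list asks for the recorded values to be updated by hand, so a quick check that the committed binary still matches the recorded digest can catch a mismatch. The sketch below is one way to do that, assuming the `key = value` layout shown above; parse_pin, sha256_of and matches_pin are hypothetical helpers, and the path of the pin file is not visible in this diff, so the caller has to supply it.

```python
import hashlib
from pathlib import Path


def parse_pin(text: str) -> dict[str, str]:
    """Read `key = value` lines, skipping blanks and # comments."""
    pin: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            pin[key.strip()] = value.strip()
    return pin


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(binary: Path, pin_text: str) -> bool:
    """True when the binary's digest equals the pinned sha256 value."""
    return sha256_of(binary) == parse_pin(pin_text).get("sha256")
```

A caller would pass Path("tools/bin/monolith") together with the text of the pin file and treat a False result as a reason to re-vendor.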

77
tools/stamp-build-time.py Executable file
View File

@ -0,0 +1,77 @@
#!/usr/bin/env python3
"""Post-build sweep: stamp the site-wide build time into every footer.

Why this exists
---------------
build/Contexts.hs binds $build-time$ to getCurrentTime at item-compile
time, but Hakyll caches outputs. Pages whose dependencies have not
changed are not recompiled, so the previously-rendered timestamp stays
on disk and the footer drifts per page. We want one site-wide
"last built at" stamp, so this script walks _site/**/*.html after
Hakyll runs and rewrites the contents of every wrapped element.

Format must match build/Contexts.hs:buildTimeField exactly so a fresh
build (where Hakyll renders the timestamp itself) and the sweep agree.
"""

from __future__ import annotations

import os
import re
import sys
from datetime import datetime, timezone


def ordinal_suffix(day: int) -> str:
    if 11 <= day <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def format_now() -> str:
    now = datetime.now(timezone.utc)
    return (
        f"{now.strftime('%A, %B')} "
        f"{now.day}{ordinal_suffix(now.day)}, "
        f"{now.strftime('%Y %H:%M:%S')}"
    )


PATTERN = re.compile(
    rb'(<span class="footer-build-time" data-build-time>)[^<]*(</span>)'
)


def stamp_file(path: str, replacement_bytes: bytes) -> bool:
    with open(path, "rb") as f:
        data = f.read()
    new_data, count = PATTERN.subn(
        lambda m: m.group(1) + replacement_bytes + m.group(2),
        data,
    )
    if count and new_data != data:
        with open(path, "wb") as f:
            f.write(new_data)
        return True
    return False


def main(root: str) -> int:
    if not os.path.isdir(root):
        print(f"stamp-build-time: {root} not found", file=sys.stderr)
        return 1
    timestamp = format_now().encode("utf-8")
    rewritten = 0
    scanned = 0
    for dirpath, _, files in os.walk(root):
        for name in files:
            if not name.endswith(".html"):
                continue
            scanned += 1
            if stamp_file(os.path.join(dirpath, name), timestamp):
                rewritten += 1
    print(f"stamp-build-time: rewrote {rewritten}/{scanned} HTML files")
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "_site"))
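
The sweep depends on the footer partial wrapping $build-time$ in a span with exactly the attributes the regex expects. The sketch below exercises that substitution in memory, using the pattern and ordinal_suffix as they appear in the script. The restamp helper and the sample HTML and timestamps are hypothetical, made up for illustration.

```python
import re

PATTERN = re.compile(
    rb'(<span class="footer-build-time" data-build-time>)[^<]*(</span>)'
)


def ordinal_suffix(day: int) -> str:
    if 11 <= day <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def restamp(html: bytes, stamp: bytes) -> bytes:
    """Replace the contents of every build-time span with `stamp`."""
    # A function replacement keeps the stamp literal: no backslash or
    # group-reference processing is applied to it.
    return PATTERN.sub(lambda m: m.group(1) + stamp + m.group(2), html)


if __name__ == "__main__":
    page = (b'<a href="/build/">build</a> '
            b'<span class="footer-build-time" data-build-time>'
            b'Friday, May 22nd, 2026 09:00:00</span>')
    print(restamp(page, b"Saturday, May 23rd, 2026 12:07:02").decode())
    print([f"{d}{ordinal_suffix(d)}" for d in (1, 2, 3, 4, 11, 12, 13, 21, 22, 23)])
```

Restamping an already-current page returns identical bytes, which is why stamp_file can skip the write when new_data equals data.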