affiliation, cabal helper script

parent 5ec418b1b3
commit 728afd4c68
@ -16,6 +16,17 @@ data/backlinks.json
data/build-stats.json
data/build-start.txt
data/last-build-seconds.txt
data/semantic-index.bin
data/semantic-meta.json

# IGNORE.txt is for the local build and need not be synced.
IGNORE.txt

# Model files for client-side semantic search (~22 MB binary artifacts).
# Download with: make download-model
static/models/

# Generated WebP companions (produced by tools/convert-images.sh at build time).
# To intentionally commit a WebP, use: git add -f path/to/file.webp
static/**/*.webp
content/**/*.webp
@ -0,0 +1,129 @@
I think this is exactly right. Let me lay out the concrete plan.

The insight driving your approach is: **the personality is the filter, and that's intentional.** The kind of collaborators, advisors, and readers you want are precisely the ones who encounter a homepage that opens with intellectual warmth and curiosity and think *yes, this is someone I want to talk to*. The professional information is there for those who need it, but it's not the first thing — it's the second or third, placed where someone who's already interested can find it instantly.

Here's how I'd structure the page, top to bottom:

---

### The Plan

**1. Your name.** Large, in Spectral. Not centered — left-aligned, consistent with how essay titles appear on the rest of the site. No subtitle, no description line. Just the name. The site header and nav sit above it as usual.

**2. The introduction.** Two to four paragraphs of prose. This is the heart of the homepage and the thing that does the most work. It should be written the way you'd introduce yourself to someone interesting at a dinner party — not "I am a computer scientist" but something closer to what's already there, except tighter and more specific. The current introduction is good but it has two problems: the first sentence ("You have reached the website of...") is a web convention from 2003 that undersells you, and the middle veers into site-navigation explanation ("This website is organized broadly by Portals and Tags...") which is functional information masquerading as prose. Strip the navigational explanation entirely — readers will figure out portals and tags by using them, and if they don't, the search works. Replace it with something that communicates what you actually care about, what you're working on right now, what this place *is*. Your Me page demonstrates that you write about yourself with genuine flair. Channel that here, compressed.

The Latin welcome at the end of the current introduction (*Te accipio, hospes benignus*) is a lovely touch and should stay — it's exactly the kind of earned ornament that signals personality. But it should be the capstone of the introduction, not buried after a navigation guide.
**3. The professional row.** A single horizontal line of compact links, visually quiet, in Fira Sans at a slightly smaller size than body text. Something like:

> Biography · CV · Email · GitHub · ORCID · GPG

No labels explaining what these are. Anyone who needs your CV knows what "CV" means. This row is for the professor who just reviewed your paper and wants to know more, the potential collaborator who met you at a conference, the PhD committee member checking your background. They get what they need in a glance, without the homepage making a big deal about it.

Styled as a single line with middot separators, perhaps in smallcaps or with slightly muted color. It should be clearly present but visually subordinate to the prose above and the portals below. Think of it as a utility strip.

**4. The curiosities row.** A second horizontal line, same visual treatment as the professional row, linking to the unusual features of the site — the things that make levineuwirth.org distinctive and that signal to a certain kind of visitor that they're in the right place:

> Memento Mori · Build · Commonplace · Colophon · Library · Random

These are the easter eggs promoted to the front door. Someone who sees "Memento Mori" and "Build" on a personal homepage and clicks them is exactly the kind of person you want reading your site. These links serve as a personality signal as much as a navigation element — they say *this site has layers, and the author thinks about infrastructure, mortality, and curation*.

This row should be visually parallel to the professional row — same size, same treatment — so that together they read as two lines of a compact directory. The professional row is "here's how to reach me in the conventional world." The curiosities row is "here's what makes this place unusual."
**5. Portal links.** Below both rows, after some breathing room, the portals. Not as cards. As a simple vertical list or a wrapped horizontal line, each portal as a plain text link. If you want brief annotations (and I think you should, because several portal names are ambiguous to a first-time visitor), they should be very short — five to ten words, in muted text:

> Research — formal inquiry and open problems
> Nonfiction — essays, criticism, living documents
> Fiction — stories and a novel in progress
> Poetry — verse, formal and free
> Music — compositions, scores, and recordings
> AI — on intelligence, artificial and otherwise
> Tech — systems, tools, and craft
> Miscellany — everything that defies category

Each annotation is just enough to tell a new visitor what they'll find behind the door, without turning the portal list into a card grid. The annotations should be in your voice — "everything that defies category" is better than "blog posts and other content."

**6. A "Recently" section (optional, add when corpus supports it).** Below the portals, 3–5 most recently published or substantially revised items. Title, date, portal tag. Auto-populated by Hakyll `recentFirst`. This is Direction B from your HOMEPAGE.md — the heartbeat that rewards returning visitors. I'd defer this until you have enough content that the list changes meaningfully between visits (probably 8–10 published pieces). When it's ready, it goes here.
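The auto-population is small wiring in Hakyll terms. A minimal sketch, assuming the site's existing `postCtx` and Hakyll's standard combinators (the field name `recently` and the glob are hypothetical):

```haskell
-- Hypothetical "recently" context field: the 5 newest items across portals.
-- Assumes a postCtx like the one the site already defines.
recentlyField :: Context String
recentlyField = listField "recently" postCtx $ do
  items <- loadAll ("content/**.md" .&&. hasNoVersion)
  take 5 <$> recentFirst items
```

In the template it would then be a plain `$for(recently)$ … $endfor$` loop over `$title$`, `$url$`, and `$date$`.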
**7. Footer.** As it currently is — the build timestamp, the license, the sig. No change needed.

---

### What's removed

- The card grid. Gone entirely.
- The "How to navigate this site" collapsible (already removed, stays removed).
- The contact row as a separate visual element at the bottom. The contact links are absorbed into the professional row (item 3).
- The "Random Page" card. Random moves to the curiosities row, where it fits perfectly.
- The "About" card. The Me page is linked from the nav and from the Biography link in the professional row.

### What's new

- The professional row (compact, one line, visually quiet).
- The curiosities row (compact, one line, same treatment).
- Portal annotations (very short, in your voice).
- The introduction, rewritten to be tighter and more personal.

### What's unchanged

- Your name at the top.
- The nav bar and portal dropdown.
- The footer.
- The overall monochrome palette and typographic system.

---

### Visual rhythm of the page

```
[nav bar with portals dropdown and settings]

Levi Neuwirth                               ← large, Spectral

Two to four paragraphs of introduction      ← body text, Spectral
in your voice. What you care about,
what you're working on, what this place
is. Te accipio, hospes benignus.

Biography · CV · Email · GitHub ·           ← compact row, Fira Sans, muted
ORCID · GPG

Memento Mori · Build · Commonplace ·        ← compact row, same treatment
Colophon · Library · Random

                                            ← breathing room

Research — formal inquiry                   ← portal list, Fira Sans
Nonfiction — essays and living documents
Fiction — stories and a novel
Poetry — verse, formal and free
Music — compositions and scores
AI — on intelligence
Tech — systems and craft
Miscellany — everything else

                                            ← (future: Recently section)

[footer: license, build, sig]
```
The entire page fits on one screen at desktop width (assuming 3–4 paragraphs of introduction). There is nothing to scroll past, nothing to decode, no interface to learn. A reader arrives, encounters your voice, sees where to find professional information, notices some intriguing links, and chooses a portal. The page takes about 45 seconds to read, which is exactly right for a homepage.

---

### On the transition toward G

This design degrades gracefully into Direction G over time. As your reputation grows and the corpus deepens, you can:

1. Shorten the introduction to one paragraph, then to one sentence, then to nothing.
2. Drop the portal annotations as readers learn what each portal contains.
3. Eventually remove the professional row (it moves to the Me page permanently).
4. What remains: your name, the curiosities row (which becomes a signature element of the site), and the portal list.

Each step is a subtraction, and each subtraction signals growing confidence. The infrastructure supports all of these states without any engineering changes — it's just editing `index.md` and the homepage template.

---

### The one thing I'd encourage you to write first

The introduction. Not a draft — the real thing. Sit down, write it as if you're explaining to a curious stranger at a dinner party what this website is and why it exists, and what's on your mind right now. Don't worry about whether it's "good enough for a homepage." The Me page proves you can write this kind of thing with warmth and depth. The introduction is the same skill, compressed. Once that prose exists, the rest of the homepage design falls into place around it — it's just CSS and template work.
Makefile
@ -1,4 +1,4 @@
.PHONY: build deploy sign watch clean dev
.PHONY: build deploy sign download-model convert-images watch clean dev

# Source .env for GITHUB_TOKEN and GITHUB_REPO if it exists.
# .env format: KEY=value (one per line, no `export` prefix, no quotes needed).

@ -9,6 +9,7 @@ build:
@git add content/
@git diff --cached --quiet || git commit -m "auto: $$(date -u +%Y-%m-%dT%H:%M:%SZ)"
@date +%s > data/build-start.txt
@./tools/convert-images.sh
cabal run site -- build
pagefind --site _site
@if [ -d .venv ]; then \

@ -24,6 +25,16 @@ build:
sign:
@./tools/sign-site.sh

# Download the quantized ONNX model for client-side semantic search.
# Run once; files are gitignored. Safe to re-run (skips existing files).
download-model:
@./tools/download-model.sh

# Convert JPEG/PNG images to WebP companions (also runs automatically in build).
# Requires cwebp: pacman -S libwebp / apt install webp
convert-images:
@./tools/convert-images.sh

deploy: build sign
@if [ -z "$(GITHUB_TOKEN)" ] || [ -z "$(GITHUB_REPO)" ]; then \
echo "Skipping GitHub push: set GITHUB_TOKEN and GITHUB_REPO in .env"; \
WRITING.md
@ -47,6 +47,9 @@ authors: # optional; overrides the default "Levi Neuwirth" link
- "Levi Neuwirth | /me.html"
- "Collaborator | https://their.site"
- "Plain Name" # URL optional; omit for plain-text credit
affiliation: # optional; shown below author in metadata block
- "Brown University | https://cs.brown.edu"
- "Some Research Lab" # URL optional; scalar string also accepted
further-reading: # optional; see Citations section
- someKey
- anotherKey
@ -271,9 +274,14 @@ not derivable from the page title.

## Math

KaTeX renders client-side from raw LaTeX. CSS and JS are loaded conditionally
on pages that have `math: true` set in their context (all essays and posts have
this by default).
Pandoc parses LaTeX math and wraps it in `class="math inline"` / `class="math display"`
spans. KaTeX CSS is loaded conditionally on pages that contain math — this styles the
pre-rendered output. Client-side KaTeX JS rendering is not yet loaded; complex math
will appear as LaTeX source. Build-time server-side rendering is planned but not yet
implemented. Simple math currently renders through Pandoc's built-in KaTeX span output.

`math: true` is auto-set for all essays and blog posts. Standalone pages that use
math must set it explicitly in frontmatter to load the KaTeX CSS.

| Syntax | Usage |
|--------|-------|
@ -671,19 +679,19 @@ The [CPU]{.smallcaps} handles this.

Wrap any paragraph in a `::: dropcap` fenced div to get a drop cap regardless
of its position in the document. The first line is automatically rendered in
small caps.
small caps via `::first-line { font-variant-caps: small-caps }`.

```markdown
::: dropcap
COMPOSITION [IS]{.smallcaps} PERHAPS MORE THAN ANYTHING ELSE THE PRACTICE OF MY
LIFE. I say these strong words because I feel strongly about this process.
A personal website is not a publication. It is a position — something you
inhabit, argue from, and occasionally revise in public.
:::
```

The opening word (or words, before a space) should be written in ALL CAPS in
source — they will render as small caps via `::first-line`. The `[IS]{.smallcaps}`
span is not strictly necessary but can force specific words into the smallcaps
run if needed.
Write in normal mixed case. The CSS applies `font-variant-caps: small-caps` to
the entire first rendered line, converting lowercase letters to small-cap glyphs.
Use `[WORD]{.smallcaps}` spans to force specific words into small-caps anywhere
in the paragraph.

A paragraph that immediately follows a `::: dropcap` block will be indented
correctly (`text-indent: 1.5em`), matching the paragraph-after-paragraph rule.
@ -28,6 +28,38 @@ import SimilarLinks (similarLinksField)
import Stability (stabilityField, lastReviewedField, versionHistoryField)
import Tags (tagLinksField)

-- ---------------------------------------------------------------------------
-- Affiliation field
-- ---------------------------------------------------------------------------

-- | Parses the @affiliation@ frontmatter key and exposes each entry as
-- @affiliation-name@ / @affiliation-url@ pairs.
--
-- Accepts a scalar string or a YAML list. Each entry may use pipe syntax:
-- @"Brown University | https://cs.brown.edu"@
-- Entries without a URL still produce a row; @affiliation-url@ fails
-- (evaluates to noResult), so @$if(affiliation-url)$@ works in templates.
--
-- Usage:
--   $for(affiliation-links)$
--   $if(affiliation-url)$<a href="$affiliation-url$">$affiliation-name$</a>
--   $else$$affiliation-name$$endif$$sep$ · $endfor$
affiliationField :: Context a
affiliationField = listFieldWith "affiliation-links" ctx $ \item -> do
    meta <- getMetadata (itemIdentifier item)
    let entries = case lookupStringList "affiliation" meta of
          Just xs -> xs
          Nothing -> maybe [] (:[]) (lookupString "affiliation" meta)
    return $ map (Item (fromFilePath "") . parseEntry) entries
  where
    ctx = field "affiliation-name" (return . fst . itemBody)
       <> field "affiliation-url" (\i -> let u = snd (itemBody i)
                                         in if null u then noResult "no url" else return u)
    parseEntry s = case break (== '|') s of
      (name, '|' : url) -> (trim name, trim url)
      (name, _)         -> (trim name, "")
    trim = reverse . dropWhile (== ' ') . reverse . dropWhile (== ' ')
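For illustration, the pipe parsing can be exercised on its own. This is a standalone sketch that duplicates the local `parseEntry`/`trim` logic under a hypothetical name:

```haskell
-- Hypothetical standalone copy of the pipe parsing used by affiliationField.
splitAffiliation :: String -> (String, String)
splitAffiliation s = case break (== '|') s of
  (name, '|' : url) -> (trim name, trim url)
  (name, _)         -> (trim name, "")
  where trim = reverse . dropWhile (== ' ') . reverse . dropWhile (== ' ')

-- splitAffiliation "Brown University | https://cs.brown.edu"
--   == ("Brown University", "https://cs.brown.edu")
-- splitAffiliation "Some Research Lab"
--   == ("Some Research Lab", "")
```

The empty URL in the second case is what makes `affiliation-url` evaluate to `noResult`, so `$if(affiliation-url)$` branches correctly in templates.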
-- ---------------------------------------------------------------------------
-- Build time field
-- ---------------------------------------------------------------------------

@ -162,6 +194,7 @@ epistemicCtx =
essayCtx :: Context String
essayCtx =
  authorLinksField
    <> affiliationField
    <> snapshotField "toc" "toc"
    <> snapshotField "word-count" "word-count"
    <> snapshotField "reading-time" "reading-time"

@ -184,6 +217,7 @@ essayCtx =
postCtx :: Context String
postCtx =
  authorLinksField
    <> affiliationField
    <> backlinksField
    <> similarLinksField
    <> dateField "date" "%-d %B %Y"

@ -196,7 +230,7 @@ postCtx =
-- ---------------------------------------------------------------------------

pageCtx :: Context String
pageCtx = authorLinksField <> siteCtx
pageCtx = authorLinksField <> affiliationField <> siteCtx

-- ---------------------------------------------------------------------------
-- Reading contexts (fiction + poetry)
@ -14,7 +14,8 @@ import qualified Filters.Links as Links
import qualified Filters.Smallcaps as Smallcaps
import qualified Filters.Dropcaps as Dropcaps
import qualified Filters.Math as Math
import qualified Filters.Wikilinks as Wikilinks
import qualified Filters.Transclusion as Transclusion
import qualified Filters.Code as Code
import qualified Filters.Images as Images

@ -34,4 +35,4 @@ applyAll
-- | Apply source-level preprocessors to the raw Markdown string.
-- Run before 'readPandocWith'.
preprocessSource :: String -> String
preprocessSource = Wikilinks.preprocess
preprocessSource = Transclusion.preprocess . Wikilinks.preprocess
@ -1,23 +1,23 @@
{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE OverloadedStrings #-}
-- | Image attribute filter.
-- | Image filter: lazy loading, lightbox markers, and WebP <picture> wrappers.
--
-- Walks all @Image@ inlines and:
-- * Adds @loading="lazy"@ to every image.
-- * Adds @data-lightbox="true"@ to images that are NOT already wrapped in
--   a @Link@ inline (i.e. the image is not itself a hyperlink).
-- For local raster images (JPG, JPEG, PNG, GIF), emits a @<picture>@ element
-- with a WebP @<source>@ and the original format as the @<img>@ fallback.
-- tools/convert-images.sh produces the companion .webp files at build time.
--
-- The wrapping-link check is done by walking the document with two passes:
-- a block-level walk that handles the common @Link [Image …] …@ pattern,
-- and a plain image walk that stamps @loading="lazy"@ on everything else.
-- SVG files and external URLs are passed through with only lazy loading
-- (and lightbox markers for standalone images).
module Filters.Images (apply) where

import Data.Char (toLower)
import Data.Text (Text)
import qualified Data.Text as T
import System.FilePath (replaceExtension)
import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

-- | Apply image attribute injection to the entire document.
-- | Apply image attribute injection and WebP wrapping to the entire document.
apply :: Pandoc -> Pandoc
apply = walk transformInline
@ -25,32 +25,118 @@ apply = walk transformInline
-- Core transformation
-- ---------------------------------------------------------------------------

-- | Process a single inline node.
--
-- * @Link … [Image …] …@ — image inside a link: add only @loading="lazy"@.
-- * @Image …@ — standalone image: add both @loading="lazy"@ and
--   @data-lightbox="true"@.
-- * Anything else — pass through unchanged.
transformInline :: Inline -> Inline
transformInline (Link lAttr ils lTarget) =
  -- Recurse into link contents, but mark any images inside as linked
  -- (so they receive lazy loading only, no lightbox marker).
  Link lAttr (map addLazyOnly ils) lTarget
  -- Recurse into link contents; images inside a link get no lightbox marker.
  Link lAttr (map wrapLinkedImg ils) lTarget
  where
    addLazyOnly (Image iAttr alt iTarget) =
      Image (addAttr "loading" "lazy" iAttr) alt iTarget
    addLazyOnly x = x
    wrapLinkedImg (Image iAttr alt iTarget) = renderImg iAttr alt iTarget False
    wrapLinkedImg x = x
transformInline (Image attr alt target) =
  Image (addAttr "data-lightbox" "true" (addAttr "loading" "lazy" attr)) alt target
  renderImg attr alt target True
transformInline x = x

-- | Dispatch on image type:
-- * Local raster → @<picture>@ with WebP @<source>@
-- * Everything else → plain @<img>@ with loading/lightbox attrs
renderImg :: Attr -> [Inline] -> Target -> Bool -> Inline
renderImg attr alt target@(src, _) lightbox
  | isLocalRaster (T.unpack src) =
      RawInline (Format "html") (renderPicture attr alt target lightbox)
  | otherwise =
      Image (addLightbox lightbox (addAttr "loading" "lazy" attr)) alt target
  where
    addLightbox True a = addAttr "data-lightbox" "true" a
    addLightbox False a = a
-- ---------------------------------------------------------------------------
-- Attribute helpers
-- <picture> rendering
-- ---------------------------------------------------------------------------

-- | Prepend a key=value pair to an @Attr@'s key-value list (if not already
-- present, to avoid duplicating attributes that come from Markdown).
-- | Emit a @<picture>@ element with a WebP @<source>@ and an @<img>@ fallback.
renderPicture :: Attr -> [Inline] -> Target -> Bool -> Text
renderPicture (ident, classes, kvs) alt (src, title) lightbox =
  T.concat
    [ "<picture>"
    , "<source srcset=\"", T.pack webpSrc, "\" type=\"image/webp\">"
    , "<img"
    , attrId ident
    , attrClasses classes
    , " src=\"", esc src, "\""
    , attrAlt alt
    , attrTitle title
    , " loading=\"lazy\""
    , if lightbox then " data-lightbox=\"true\"" else ""
    , renderKvs passedKvs
    , ">"
    , "</picture>"
    ]
  where
    webpSrc = replaceExtension (T.unpack src) ".webp"
    -- Strip attrs we handle explicitly so they don't appear twice.
    passedKvs = filter (\(k, _) -> k `notElem` ["loading", "data-lightbox"]) kvs

attrId :: Text -> Text
attrId t = if T.null t then "" else " id=\"" <> esc t <> "\""

attrClasses :: [Text] -> Text
attrClasses [] = ""
attrClasses cs = " class=\"" <> T.intercalate " " (map esc cs) <> "\""

attrAlt :: [Inline] -> Text
attrAlt ils = let t = stringify ils
              in if T.null t then "" else " alt=\"" <> esc t <> "\""

attrTitle :: Text -> Text
attrTitle t = if T.null t then "" else " title=\"" <> esc t <> "\""

renderKvs :: [(Text, Text)] -> Text
renderKvs = T.concat . map (\(k, v) -> " " <> k <> "=\"" <> esc v <> "\"")

-- ---------------------------------------------------------------------------
-- Helpers
-- ---------------------------------------------------------------------------
-- | True for local (non-URL) images with a raster format we can convert.
isLocalRaster :: FilePath -> Bool
isLocalRaster src = not (isUrl src) && lowerExt src `elem` [".jpg", ".jpeg", ".png", ".gif"]

isUrl :: String -> Bool
isUrl s = any (`isPrefixOf` s) ["http://", "https://", "//", "data:"]
  where isPrefixOf pfx str = take (length pfx) str == pfx

-- | Extension of a path, lowercased (e.g. ".JPG" → ".jpg");
-- "" when the path has no extension, so it never matches a raster suffix.
lowerExt :: FilePath -> String
lowerExt p = case break (== '.') (reverse p) of
  (ext, '.' : _) -> map toLower ('.' : reverse ext)
  _              -> ""
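As a self-contained sketch of the dispatch rule (names are illustrative, not the module's exports): only local JPG/JPEG/PNG/GIF paths qualify for the `<picture>`/WebP wrapper, while SVGs and external URLs fall through to a plain `<img>`.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)

-- Sketch: which image sources would get the WebP <picture> treatment.
wouldWrap :: FilePath -> Bool
wouldWrap src = not (isRemote src) && ext src `elem` [".jpg", ".jpeg", ".png", ".gif"]
  where
    isRemote s = any (`isPrefixOf` s) ["http://", "https://", "//", "data:"]
    ext s = case break (== '.') (reverse s) of
      (e, '.' : _) -> map toLower ('.' : reverse e)
      _            -> ""

-- wouldWrap "images/photo.JPG"          == True
-- wouldWrap "diagram.svg"               == False
-- wouldWrap "https://example.com/a.png" == False
```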

-- | Prepend a key=value pair if not already present.
addAttr :: Text -> Text -> Attr -> Attr
addAttr k v (ident, classes, kvs)
  | any ((== k) . fst) kvs = (ident, classes, kvs)
  | otherwise = (ident, classes, (k, v) : kvs)
addAttr k v (i, cs, kvs)
  | any ((== k) . fst) kvs = (i, cs, kvs)
  | otherwise = (i, cs, (k, v) : kvs)

-- | Plain-text content of a list of inlines (for alt text).
stringify :: [Inline] -> Text
stringify = T.concat . map go
  where
    go (Str t) = t
    go Space = " "
    go SoftBreak = " "
    go LineBreak = " "
    go (Emph ils) = stringify ils
    go (Strong ils) = stringify ils
    go (Code _ t) = t
    go (Link _ ils _) = stringify ils
    go (Image _ ils _) = stringify ils
    go (Span _ ils) = stringify ils
    go _ = ""

-- | HTML-escape a text value for use in attribute values.
esc :: Text -> Text
esc = T.concatMap escChar
  where
    escChar '&' = "&amp;"
    escChar '<' = "&lt;"
    escChar '>' = "&gt;"
    escChar '"' = "&quot;"
    escChar c = T.singleton c
@ -0,0 +1,60 @@
{-# LANGUAGE GHC2021 #-}
-- | Source-level transclusion preprocessor.
--
-- Rewrites block-level {{slug}} and {{slug#section}} directives to raw
-- HTML placeholders that transclude.js resolves at runtime.
--
-- A directive must be the sole content of a line (after trimming) to be
-- replaced — this prevents accidental substitution inside prose or code.
--
-- Examples:
--   {{my-essay}}              → full-page transclusion of /my-essay.html
--   {{essays/deep-dive}}      → /essays/deep-dive.html (full body)
--   {{my-essay#introduction}} → section "introduction" of /my-essay.html
module Filters.Transclusion (preprocess) where

import Data.List (isSuffixOf, isPrefixOf, stripPrefix)

-- | Apply transclusion substitution to the raw Markdown source string.
preprocess :: String -> String
preprocess = unlines . map processLine . lines

processLine :: String -> String
processLine line =
  case parseDirective (trim line) of
    Nothing -> line
    Just (url, secAttr) ->
      "<div class=\"transclude\" data-src=\"" ++ url ++ "\""
        ++ secAttr ++ "></div>"

-- | Parse a {{slug}} or {{slug#section}} directive.
-- Returns (absolute-url, section-attribute-string) or Nothing.
parseDirective :: String -> Maybe (String, String)
parseDirective s = do
  inner <- stripPrefix "{{" s >>= stripSuffix "}}"
  case break (== '#') inner of
    ("", _) -> Nothing
    (slug, "") -> Just (slugToUrl slug, "")
    (slug, '#' : sec)
      | null sec -> Just (slugToUrl slug, "")
      | otherwise -> Just (slugToUrl slug,
                           " data-section=\"" ++ sec ++ "\"")
    _ -> Nothing

-- | Convert a slug (possibly with leading slash, possibly with path segments)
-- to a root-relative .html URL.
slugToUrl :: String -> String
slugToUrl slug
  | "/" `isPrefixOf` slug = slug ++ ".html"
  | otherwise = "/" ++ slug ++ ".html"

-- | Strip a suffix from a string, returning Nothing if not present.
stripSuffix :: String -> String -> Maybe String
stripSuffix suf str
  | suf `isSuffixOf` str = Just (take (length str - length suf) str)
  | otherwise = Nothing

-- | Strip leading and trailing spaces.
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile (== ' ')
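For illustration, the line-level rewrite condenses into a standalone sketch. This hypothetical `rewrite` omits the `#section` form and the leading-slash slug handling above, and is not the module's actual API:

```haskell
import Data.List (isSuffixOf, stripPrefix)

-- Sketch: a {{slug}} directive alone on a line becomes a placeholder <div>;
-- a directive embedded in prose is left untouched.
rewrite :: String -> String
rewrite line =
  case stripPrefix "{{" t >>= stripSuffix "}}" of
    Just inner | not (null inner) ->
      "<div class=\"transclude\" data-src=\"/" ++ inner ++ ".html\"></div>"
    _ -> line
  where
    t = dropWhile (== ' ') (reverse (dropWhile (== ' ') (reverse line)))
    stripSuffix suf s
      | suf `isSuffixOf` s = Just (take (length s - length suf) s)
      | otherwise          = Nothing

-- rewrite "{{my-essay}}"
--   == "<div class=\"transclude\" data-src=\"/my-essay.html\"></div>"
-- rewrite "see {{my-essay}} inline"  -- not alone on the line: unchanged
--   == "see {{my-essay}} inline"
```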
@ -18,6 +18,14 @@ import Tags (buildAllTags, applyTagRules)
import Pagination (blogPaginateRules)
import Stats (statsRules)

-- Poems inside collection subdirectories, excluding their index pages.
collectionPoems :: Pattern
collectionPoems = "content/poetry/*/*.md" .&&. complement "content/poetry/*/index.md"

-- All poetry content (flat + collection), excluding collection index pages.
allPoetry :: Pattern
allPoetry = "content/poetry/*.md" .||. collectionPoems

feedConfig :: FeedConfiguration
feedConfig = FeedConfiguration
  { feedTitle = "Levi Neuwirth"
@ -144,9 +152,17 @@ rules = do
      >>= loadAndApplyTemplate "templates/default.html" commonplaceCtx
      >>= relativizeUrls

  match "content/colophon.md" $ do
    route $ constRoute "colophon.html"
    compile $ essayCompiler
      >>= loadAndApplyTemplate "templates/essay.html" essayCtx
      >>= loadAndApplyTemplate "templates/default.html" essayCtx
      >>= relativizeUrls

  match ("content/*.md"
    .&&. complement "content/index.md"
    .&&. complement "content/commonplace.md") $ do
    .&&. complement "content/commonplace.md"
    .&&. complement "content/colophon.md") $ do
    route $ gsubRoute "content/" (const "")
      `composeRoutes` setExtension "html"
    compile $ pageCompiler
@ -181,6 +197,7 @@ rules = do
  -- -------------------------------------------------------------------------
  -- Poetry
  -- -------------------------------------------------------------------------
  -- Flat poems (e.g. content/poetry/sonnet-60.md)
  match "content/poetry/*.md" $ do
    route $ gsubRoute "content/poetry/" (const "poetry/")
      `composeRoutes` setExtension "html"
@ -190,6 +207,24 @@ rules = do
      >>= loadAndApplyTemplate "templates/default.html" poetryCtx
      >>= relativizeUrls

  -- Collection poems (e.g. content/poetry/shakespeare-sonnets/sonnet-1.md)
  match collectionPoems $ do
    route $ gsubRoute "content/poetry/" (const "poetry/")
      `composeRoutes` setExtension "html"
    compile $ poetryCompiler
      >>= saveSnapshot "content"
      >>= loadAndApplyTemplate "templates/reading.html" poetryCtx
      >>= loadAndApplyTemplate "templates/default.html" poetryCtx
      >>= relativizeUrls

  -- Collection index pages (e.g. content/poetry/shakespeare-sonnets/index.md)
  match "content/poetry/*/index.md" $ do
    route $ gsubRoute "content/poetry/" (const "poetry/")
      `composeRoutes` setExtension "html"
    compile $ pageCompiler
      >>= loadAndApplyTemplate "templates/default.html" pageCtx
      >>= relativizeUrls

  -- -------------------------------------------------------------------------
  -- Fiction
  -- -------------------------------------------------------------------------
@ -291,7 +326,7 @@ rules = do
      essays <- loadAll ("content/essays/*.md" .&&. hasNoVersion)
      posts <- loadAll ("content/blog/*.md" .&&. hasNoVersion)
      fiction <- loadAll ("content/fiction/*.md" .&&. hasNoVersion)
      poetry <- loadAll ("content/poetry/*.md" .&&. hasNoVersion)
      poetry <- loadAll (allPoetry .&&. hasNoVersion)
      filtered <- filterM (hasPortal p) (essays ++ posts ++ fiction ++ poetry)
      recentFirst filtered
@@ -337,7 +372,7 @@ rules = do
    ( ( "content/essays/*.md"
          .||. "content/blog/*.md"
          .||. "content/fiction/*.md"
          .||. "content/poetry/*.md"
          .||. allPoetry
          .||. "content/music/*/index.md"
      )
        .&&. hasNoVersion

275 build/Stats.hs
@@ -8,12 +8,14 @@ module Stats (statsRules) where

import Control.Exception (IOException, catch)
import Control.Monad (forM)
import Data.List (find, isSuffixOf, sortBy)
import Data.List (find, isSuffixOf, sort, sortBy)
import qualified Data.Map.Strict as Map
import Data.Maybe (catMaybes, fromMaybe, isJust, listToMaybe)
import Data.Ord (comparing, Down (..))
import qualified Data.Set as Set
import Data.Time (getCurrentTime, formatTime, defaultTimeLocale)
import Data.Time (getCurrentTime, formatTime, defaultTimeLocale,
                  Day, parseTimeM, utctDay, addDays, diffDays)
import Data.Time.Calendar (toGregorian, dayOfWeek)
import System.Directory (doesDirectoryExist, getFileSize, listDirectory)
import System.Exit (ExitCode (..))
import System.FilePath (takeExtension, (</>))
@@ -109,6 +111,197 @@ normUrl u
  | ".html" `isSuffixOf` u = take (length u - 5) u
  | otherwise = u

pad2 :: (Show a, Integral a) => a -> String
pad2 n = if n < 10 then "0" ++ show n else show n

-- | Median of a list; returns 0 for an empty list.
median :: [Int] -> Int
median [] = 0
median xs = let s = sort xs in s !! (length s `div` 2)

-- ---------------------------------------------------------------------------
-- Date helpers (for /stats/ page)
-- ---------------------------------------------------------------------------

parseDay :: String -> Maybe Day
parseDay = parseTimeM True defaultTimeLocale "%Y-%m-%d"

-- | First Monday on or before 'day' (start of its ISO week).
weekStart :: Day -> Day
weekStart day = addDays (fromIntegral (negate (fromEnum (dayOfWeek day)))) day

-- | Intensity class for the heatmap (hm0 … hm4).
heatClass :: Int -> String
heatClass 0 = "hm0"
heatClass n | n < 500 = "hm1"
heatClass n | n < 2000 = "hm2"
heatClass n | n < 5000 = "hm3"
heatClass _ = "hm4"

shortMonth :: Int -> String
shortMonth m = case m of
  1 -> "Jan"; 2 -> "Feb"; 3 -> "Mar"; 4 -> "Apr"
  5 -> "May"; 6 -> "Jun"; 7 -> "Jul"; 8 -> "Aug"
  9 -> "Sep"; 10 -> "Oct"; 11 -> "Nov"; 12 -> "Dec"
  _ -> ""

-- ---------------------------------------------------------------------------
-- Heatmap SVG
-- ---------------------------------------------------------------------------

-- | 52-week writing activity heatmap (inline SVG, CSS-variable colors).
renderHeatmap :: Map.Map Day Int -> Day -> String
renderHeatmap wordsByDay today =
  let cellSz = 10 :: Int
      gap    = 2 :: Int
      step   = cellSz + gap
      hdrH   = 22 :: Int  -- vertical space for month labels
      nWeeks = 52
      -- First Monday of the 52-week window
      startDay = addDays (fromIntegral (-(nWeeks - 1)) * 7) (weekStart today)
      nDays    = diffDays today startDay + 1
      allDays  = [addDays i startDay | i <- [0 .. nDays - 1]]
      weekOf d = fromIntegral (diffDays d startDay `div` 7) :: Int
      dowOf d  = fromEnum (dayOfWeek d)  -- Mon=0..Sun=6
      svgW = (nWeeks - 1) * step + cellSz
      svgH = 6 * step + cellSz + hdrH

      -- Month labels: one per first-of-month day
      monthLbls = concatMap (\d ->
        let (_, mo, da) = toGregorian d
        in if da == 1
             then "<text class=\"hm-lbl\" x=\"" ++ show (weekOf d * step)
                  ++ "\" y=\"14\">" ++ shortMonth mo ++ "</text>"
             else "") allDays

      -- One rect per day
      cells = concatMap (\d ->
        let wc = fromMaybe 0 (Map.lookup d wordsByDay)
            (yr, mo, da) = toGregorian d
            x = weekOf d * step
            y = dowOf d * step + hdrH
            tip = show yr ++ "-" ++ pad2 mo ++ "-" ++ pad2 da
                  ++ if wc > 0 then ": " ++ commaInt wc ++ " words" else ""
        in "<rect class=\"" ++ heatClass wc ++ "\""
           ++ " x=\"" ++ show x ++ "\" y=\"" ++ show y ++ "\""
           ++ " width=\"" ++ show cellSz ++ "\" height=\"" ++ show cellSz ++ "\""
           ++ " rx=\"2\"><title>" ++ tip ++ "</title></rect>") allDays

      -- Inline legend (five sample rects)
      legendW = 5 * step - gap
      legendSvg =
        "<svg width=\"" ++ show legendW ++ "\" height=\"" ++ show cellSz ++ "\""
        ++ " viewBox=\"0 0 " ++ show legendW ++ " " ++ show cellSz ++ "\""
        ++ " style=\"display:inline;vertical-align:middle\">"
        ++ concatMap (\i ->
             "<rect class=\"hm" ++ show i ++ "\""
             ++ " x=\"" ++ show (i * step) ++ "\" y=\"0\""
             ++ " width=\"" ++ show cellSz ++ "\" height=\"" ++ show cellSz ++ "\""
             ++ " rx=\"2\"/>") [0..4]
        ++ "</svg>"

  in "<figure class=\"stats-heatmap\">"
     ++ "<svg width=\"" ++ show svgW ++ "\" height=\"" ++ show svgH ++ "\""
     ++ " viewBox=\"0 0 " ++ show svgW ++ " " ++ show svgH ++ "\""
     ++ " class=\"heatmap-svg\" role=\"img\""
     ++ " aria-label=\"52-week writing activity heatmap\">"
     ++ "<style>"
     ++ ".hm0{fill:var(--hm-0)}.hm1{fill:var(--hm-1)}.hm2{fill:var(--hm-2)}"
     ++ ".hm3{fill:var(--hm-3)}.hm4{fill:var(--hm-4)}"
     ++ ".hm-lbl{font-size:9px;fill:var(--text-faint);font-family:sans-serif}"
     ++ "</style>"
     ++ monthLbls ++ cells
     ++ "</svg>"
     ++ "<figcaption class=\"heatmap-legend\">"
     ++ "Less\xA0" ++ legendSvg ++ "\xA0More"
     ++ "</figcaption>"
     ++ "</figure>"

-- ---------------------------------------------------------------------------
-- Stats page sections
-- ---------------------------------------------------------------------------

renderMonthlyVolume :: Map.Map Day Int -> String
renderMonthlyVolume wordsByDay =
  section "volume" "Monthly volume" $
    let byMonth = Map.fromListWith (+)
          [ ((y, m), wc)
          | (day, wc) <- Map.toList wordsByDay
          , let (y, m, _) = toGregorian day
          ]
    in if Map.null byMonth
         then "<p><em>No dated content yet.</em></p>"
         else
           let maxWC = max 1 $ maximum $ Map.elems byMonth
               bar (y, m) =
                 let wc  = fromMaybe 0 (Map.lookup (y, m) byMonth)
                     pct = if wc == 0 then 0 else max 2 (wc * 100 `div` maxWC)
                     lbl = shortMonth m ++ " \x2019" ++ drop 2 (show y)
                 in "<div class=\"build-bar-row\">"
                    ++ "<span class=\"build-bar-label\">" ++ lbl ++ "</span>"
                    ++ "<span class=\"build-bar-wrap\"><span class=\"build-bar\" style=\"width:"
                    ++ show pct ++ "%\"></span></span>"
                    ++ "<span class=\"build-bar-count\">"
                    ++ (if wc > 0 then commaInt wc else "") ++ "</span>"
                    ++ "</div>"
           in "<div class=\"build-bars\">" ++ concatMap bar (Map.keys byMonth) ++ "</div>"

renderCorpus :: [TypeRow] -> [PageInfo] -> String
renderCorpus typeRows allPIs =
  section "corpus" "Corpus" $ concat
    [ dl [ ("Total words", commaInt totalWords)
         , ("Total pages", commaInt (length allPIs))
         , ("Total reading time", rtStr totalWords)
         , ("Average length", commaInt avgWC ++ " words")
         , ("Median length", commaInt medWC ++ " words")
         ]
    , table ["Type", "Pages", "Words", "Reading time"]
        (map row typeRows)
        (Just ["Total", commaInt (sum (map trCount typeRows))
              , commaInt totalWords, rtStr totalWords])
    ]
  where
    hasSomeWC  = filter (\p -> piWC p > 0) allPIs
    totalWords = sum (map trWords typeRows)
    avgWC = if null hasSomeWC then 0 else totalWords `div` length hasSomeWC
    medWC = median (map piWC hasSomeWC)
    row r = [trLabel r, commaInt (trCount r), commaInt (trWords r), rtStr (trWords r)]

renderNotable :: [PageInfo] -> String
renderNotable allPIs =
  section "notable" "Notable" $ concat
    [ "<p><strong>Longest</strong></p>"
    , pageList (take 5 (sortBy (comparing (Down . piWC)) hasSomeWC))
    , "<p><strong>Shortest</strong></p>"
    , pageList (take 5 (sortBy (comparing piWC) hasSomeWC))
    ]
  where
    hasSomeWC = filter (\p -> piWC p > 50) allPIs
    pageList ps = "<ol class=\"build-page-list\">"
      ++ concatMap (\p -> "<li>" ++ link (piUrl p) (piTitle p)
           ++ " \x2014 " ++ commaInt (piWC p) ++ " words</li>") ps
      ++ "</ol>"

renderStatsTags :: [(String, Int)] -> Int -> String
renderStatsTags topTags uniqueCount =
  section "tags" "Tags" $ concat
    [ dl [("Unique tags", commaInt uniqueCount)]
    , table ["Tag", "Items"] (map row topTags) Nothing
    ]
  where row (t, n) = [link ("/" ++ t ++ "/") t, show n]

statsTOC :: String
statsTOC = "<ol>\n" ++ concatMap item entries ++ "</ol>\n"
  where
    item (i, t) = "<li><a href=\"#" ++ i ++ "\" data-target=\"" ++ i ++ "\">"
      ++ t ++ "</a></li>\n"
    entries = [ ("activity", "Writing activity")
              , ("volume", "Monthly volume")
              , ("corpus", "Corpus")
              , ("notable", "Notable")
              , ("tags", "Tags")
              ]

-- ---------------------------------------------------------------------------
-- IO: output directory walk
-- ---------------------------------------------------------------------------
@@ -380,8 +573,12 @@ pageTOC = "<ol>\n" ++ concatMap item sections ++ "</ol>\n"
-- ---------------------------------------------------------------------------

statsRules :: Tags -> Rules ()
statsRules tags =
  create ["build/index.html"] $ do
statsRules tags = do

  -- -------------------------------------------------------------------------
  -- Build telemetry page (/build/)
  -- -------------------------------------------------------------------------
  create ["build/index.html"] $ do
    route idRoute
    compile $ do
      -- ----------------------------------------------------------------
@@ -530,3 +727,73 @@ statsRules tags =
        >>= loadAndApplyTemplate "templates/essay.html" ctx
        >>= loadAndApplyTemplate "templates/default.html" ctx
        >>= relativizeUrls

  -- -------------------------------------------------------------------------
  -- Writing statistics page (/stats/)
  -- -------------------------------------------------------------------------
  create ["stats/index.html"] $ do
    route idRoute
    compile $ do
      essays  <- loadAll ("content/essays/*.md" .&&. hasNoVersion)
      posts   <- loadAll ("content/blog/*.md" .&&. hasNoVersion)
      poems   <- loadAll ("content/poetry/*.md" .&&. hasNoVersion)
      fiction <- loadAll ("content/fiction/*.md" .&&. hasNoVersion)
      comps   <- loadAll ("content/music/*/index.md" .&&. hasNoVersion)

      essayWCs   <- mapM loadWC essays
      postWCs    <- mapM loadWC posts
      poemWCs    <- mapM loadWC poems
      fictionWCs <- mapM loadWC fiction
      compWCs    <- mapM loadWC comps

      let allItems = essays ++ posts ++ poems ++ fiction ++ comps
          typeRows =
            [ TypeRow "Essays" (length essays) (sum essayWCs)
            , TypeRow "Blog posts" (length posts) (sum postWCs)
            , TypeRow "Poems" (length poems) (sum poemWCs)
            , TypeRow "Fiction" (length fiction) (sum fictionWCs)
            , TypeRow "Compositions" (length comps) (sum compWCs)
            ]

      allPIs <- catMaybes <$> mapM loadPI allItems

      -- Build wordsByDay: for each item with a parseable `date`, map that
      -- day to the item's word count (summing if multiple items share a date).
      datePairs <- fmap catMaybes $ forM allItems $ \item -> do
        meta <- getMetadata (itemIdentifier item)
        wc   <- loadWC item
        return $ case lookupString "date" meta >>= parseDay of
          Nothing -> Nothing
          Just d  -> Just (d, wc)
      let wordsByDay = Map.fromListWith (+) datePairs

      let tagFreqs   = map (\(t, ids) -> (t, length ids)) (tagsMap tags)
          topTags    = take 15 (sortBy (comparing (Down . snd)) tagFreqs)
          uniqueTags = length tagFreqs

      today <- unsafeCompiler (utctDay <$> getCurrentTime)

      let content = concat
            [ section "activity" "Writing activity" (renderHeatmap wordsByDay today)
            , renderMonthlyVolume wordsByDay
            , renderCorpus typeRows allPIs
            , renderNotable allPIs
            , renderStatsTags topTags uniqueTags
            ]
          plainText = stripHtmlTags content
          wc = length (words plainText)
          rt = readingTime plainText
          ctx = constField "toc" statsTOC
             <> constField "word-count" (show wc)
             <> constField "reading-time" (show rt)
             <> constField "title" "Writing Statistics"
             <> constField "abstract" "Writing activity, corpus breakdown, \
                           \and tag distribution — computed at build time."
             <> constField "build" "true"
             <> authorLinksField
             <> siteCtx

      makeItem content
        >>= loadAndApplyTemplate "templates/essay.html" ctx
        >>= loadAndApplyTemplate "templates/default.html" ctx
        >>= relativizeUrls

@@ -6,7 +6,7 @@ constraints: any.Glob ==0.10.2,
             any.Only ==0.1,
             any.QuickCheck ==2.15.0.1,
             any.StateVar ==1.2.2,
             any.aeson ==2.2.0.0,
             any.aeson ==2.2.1.0,
             any.aeson-pretty ==0.8.10,
             any.ansi-terminal ==1.1,
             any.ansi-terminal-types ==1.1,

@@ -0,0 +1,11 @@
---
title: "The Philosophical Legacy of Dostoevsky's Implicit Rejection of Logic and Science in Part I of <em>Notes from Underground</em>"
date: 2025-10-24
abstract: >
  *Notes from Underground* is widely admired as a cornerstone of literature, culture, and philosophy. This paper develops the argument that the primary philosophical undercurrent is a rejection of logic and science as end-alls in modern life, traced through comparison with the more explicitly articulated works of Dostoevsky's contemporaries and successors: Nietzsche, Heidegger, Shestov, Ellul, Sartre, Camus, Husserl, and Arendt.
tags:
  - nonfiction
  - nonfiction/philosophy
authors:
  - "Levi Neuwirth | /me.html"
---

@@ -32,6 +32,7 @@ executable site
    Filters.Dropcaps
    Filters.Smallcaps
    Filters.Wikilinks
    Filters.Transclusion
    Filters.Links
    Filters.Math
    Filters.Code

@@ -13,7 +13,6 @@ dependencies = [
  "faiss-cpu>=1.9",
  "numpy>=2.0",
  "beautifulsoup4>=4.12",
  "einops>=0.8",
  # CPU-only torch — avoids pulling ~3 GB of CUDA libraries
  "torch>=2.5",
]

71 spec.md
@@ -1,7 +1,7 @@
# levineuwirth.org — Design Specification v9
# levineuwirth.org — Design Specification v11

**Author:** Levi Neuwirth
**Date:** March 2026 (v9: 20 March 2026)
**Date:** March 2026 (v11: 21 March 2026)
**Status:** LIVING DOCUMENT — Updated as implementation progresses.

---
@@ -178,8 +178,12 @@ build:
	@git add content/
	@git diff --cached --quiet || git commit -m "auto: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
	@date +%s > data/build-start.txt
	@./tools/convert-images.sh  # WebP conversion (skipped if cwebp absent)
	cabal run site -- build
	pagefind --site _site
	@if [ -d .venv ]; then \
		uv run python tools/embed.py || echo "Warning: embedding failed"; \
	fi
	> IGNORE.txt  # clear stability pins after each build
	@BUILD_END=$(date +%s); BUILD_START=$(cat data/build-start.txt); \
		echo $((BUILD_END - BUILD_START)) > data/last-build-seconds.txt
@@ -195,6 +199,12 @@ watch:

clean:
	cabal run site -- clean

download-model:
	@./tools/download-model.sh  # fetch quantized ONNX model to static/models/ (once per machine)

convert-images:
	@./tools/convert-images.sh  # manual trigger; also runs in build
```

### Hosting Timeline
@@ -214,7 +224,9 @@ server {
    ssl_certificate /etc/letsencrypt/live/levineuwirth.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/levineuwirth.org/privkey.pem;

    add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self'; font-src 'self';" always;
    # cdn.jsdelivr.net required for transformers.js (semantic search library).
    # Model weights served same-origin from /models/ — connect-src stays 'self'.
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; img-src 'self'; font-src 'self';" always;

    gzip on;
    gzip_types text/html text/css application/javascript application/json image/svg+xml;
@@ -285,11 +297,14 @@ levineuwirth.org/
│   │   ├── annotations.js       # localStorage highlight/annotation engine (UI deferred)
│   │   ├── score-reader.js      # Score reader: page-turn, movement jumps, deep linking
│   │   ├── viz.js               # Vega-Lite render + dark mode re-render via MutationObserver
│   │   ├── semantic-search.js   # Client-side semantic search: transformers.js + Float32Array cosine ranking
│   │   ├── search.js            # Pagefind UI init + ?q= pre-fill + search timing (#search-timing)
│   │   └── prism.min.js         # Syntax highlighting
│   ├── fonts/                   # Self-hosted WOFF2 (subsetted with OT features)
│   ├── gpg/
│   │   └── pubkey.asc           # Ed25519 signing subkey public key (master: CD90AE96…; subkey: C9A42A6F…)
│   ├── models/                  # Self-hosted ONNX model (gitignored; run: make download-model)
│   │   └── all-MiniLM-L6-v2/    # ~22 MB quantized — served at /models/ for semantic-search.js
│   └── images/
│       └── link-icons/          # SVG icons for external link classification
│           ├── external.svg
@@ -331,7 +346,7 @@ levineuwirth.org/
│   │   ├── Links.hs             # External link classification + data-link-icon attributes
│   │   ├── Math.hs              # Simple LaTeX → Unicode conversion
│   │   ├── Code.hs              # Prepend language- prefix for Prism.js
│   │   ├── Images.hs            # Lazy loading, lightbox data-attributes
│   │   ├── Images.hs            # Lazy loading, lightbox data-attributes, WebP <picture> wrapper for local raster images
│   │   ├── Score.hs             # Score fragment SVG inlining + currentColor replacement
│   │   └── Viz.hs               # Visualization IO filter: runs Python scripts, inlines SVG / Vega-Lite JSON
│   ├── Authors.hs               # Author-as-tag system (slugify, authorLinksField, author pages)
@@ -350,7 +365,10 @@ levineuwirth.org/
│   ├── subset-fonts.sh
│   ├── viz_theme.py                     # Matplotlib monochrome helpers (apply_monochrome, save_svg, LINESTYLE_CYCLE)
│   ├── sign-site.sh                     # Detach-sign every _site/**/*.html → .html.sig (called by `make sign`)
│   └── preset-signing-passphrase.sh     # Cache signing subkey passphrase in gpg-agent (run once per boot)
│   ├── preset-signing-passphrase.sh     # Cache signing subkey passphrase in gpg-agent (run once per boot)
│   ├── download-model.sh                # Fetch quantized ONNX model to static/models/ (run once per machine)
│   ├── convert-images.sh                # Convert JPEG/PNG → WebP companions via cwebp (runs automatically in build)
│   └── embed.py                         # Build-time embedding pipeline: similar-links + semantic search index
├── levineuwirth.cabal
├── cabal.project
├── cabal.project.freeze
@@ -397,7 +415,7 @@ levineuwirth.org/
- [~] Annotations — `annotations.js` / `annotations.css`; localStorage infrastructure + highlight re-anchoring written; UI (button in selection popup) deferred

### Phase 4: Creative Content & Polish
- [x] Image handling (lazy load, lightbox, figures)
- [x] Image handling (lazy load, lightbox, figures, WebP `<picture>` wrapper for local raster images)
- [x] Homepage (replaces standalone index; gateway + curated recent content)
- [x] Poetry typesetting — codex reading mode (`reading.html`, `reading.css`, `reading.js`); `poetryCompiler` with `Ext_hard_line_breaks`; narrower measure, stanza spacing, drop-cap suppressed
- [x] Fiction reading mode — same codex layout; `fictionCompiler`; chapter drop caps + smallcaps lead-in via `h2 + p::first-letter`; reading mode infrastructure shared with poetry
@@ -409,7 +427,7 @@ levineuwirth.org/

### Phase 5: Infrastructure & Advanced
- [x] **Arch Linux VPS + nginx + certbot + DNS migration** — Hetzner VPS provisioned, Arch Linux installed, nginx configured (config in §III), TLS cert via certbot, DNS migrated from DreamHost. `make deploy` pushes to GitHub and rsyncs to VPS.
- [ ] **Semantic embedding pipeline** — Superseded by Phase 6 "Embedding-powered similar links" (local model, no API cost).
- [x] **Semantic embedding pipeline** — Implemented. See Phase 6 "Embedding-powered similar links" and "Full-text semantic search".
- [x] **Backlinks with context** — Two-pass build-time system (`build/Backlinks.hs`). Pass 1: `version "links"` compiles each page lightly (wikilinks preprocessed, links + context extracted, serialised as JSON). Pass 2: `create ["data/backlinks.json"]` inverts the map. `backlinksField` in `essayCtx` / `postCtx` loads the JSON and renders `<details>`-collapsible per-entry lists. `popups.js` excludes `.backlink-source` links from the preview popup. Context paragraph uses `runPure . writeHtml5String` on the surrounding `Para` block. See Implementation Notes.
- [ ] **Link archiving** — For all external links in `data/bibliography.bib` and in page bodies, check availability and save snapshots (Wayback Machine `save` API or local archivebox instance). Store archive URLs in `data/link-archive.json`; `Filters.Links` injects `data-archive-url` attributes; `popups.js` falls back to the archive if the live URL returns 404.
- [ ] **Self-hosted git (Forgejo)** — Run Forgejo on the VPS. Mirror the build repo. Link from the colophon. Not essential; can remain on GitHub indefinitely.
@@ -428,10 +446,12 @@ levineuwirth.org/
- [x] **Epistemic profile** — Replaces the old `certainty` / `importance` fields with a richer multi-axis system. **Compact** (always visible in footer): status chip · confidence % · importance dots · evidence dots. **Expanded** (`<details>`): stability (auto) · scope · novelty · practicality · last reviewed · confidence trend. Auto-calculation in `build/Stability.hs` via `git log --follow`; `IGNORE.txt` pins overrides. See Metadata section and Implementation Notes for full schema and vocabulary.
- [ ] **Writing statistics dashboard** — A `/stats` page computed entirely at build time from the corpus. Contents: total word count across all content types, essay/post/poem count, words written per month rendered as a GitHub-style contribution heatmap (SVG generated by Haskell or a Python script), average and median essay length, longest essay, most-cited essay (by backlink count), tag distribution as a treemap, reading-time histogram, site growth over time (cumulative word count by date). All data collected during the Hakyll build from compiled items and their snapshots; serialized to `data/stats.json` and rendered into a dedicated `stats.html` template.
- [x] **Memento mori** — Implemented at `/memento-mori/` as a full standalone page. 90×52 grid of weeks anchored to birthday anniversaries (nested year/week loop via `setFullYear`; week 52 stretched to eve of next birthday to absorb 365th/366th days). Week popup shows dynamic day-count and locale-derived day names. Score fragment (bassoon, `content/memento-mori/scores/bsn.svg`) inlined via `Filters.Score`. Linked from footer (MM).
- [ ] **Embedding-powered similar links** — Precompute dense vector embeddings for every page using a local model (e.g. `nomic-embed-text` or `gte-large` via `ollama` or `llama.cpp`) on personal hardware — no API dependency, no per-call cost. At build time, a Python script reads `_site/` HTML, embeds each page, computes top-N cosine neighbors, and writes `data/similar-links.json` (slug → [{slug, title, score}]). Hakyll injects this into each page's context (via `Metadata.hs` reading the JSON); template renders a "Related" section in the page footer. Analogous to gwern's `GenerateSimilar.hs` but model-agnostic and self-hosted. Note: supersedes the Phase 5 "Semantic embedding pipeline" stub — that stub should be replaced by this when implemented.
- [x] **Embedding-powered similar links** — `tools/embed.py` encodes every `#markdownBody` page with `all-MiniLM-L6-v2` (384 dims, unit-normalised), builds a FAISS `IndexFlatIP`, queries top-5 neighbours per page (cosine ≥ 0.30), writes `data/similar-links.json`. `build/SimilarLinks.hs` provides `similarLinksField` in `essayCtx`/`postCtx` with Hakyll dependency tracking; "Related" section rendered in `page-footer.html`. Staleness check skips re-embedding when JSON is newer than all HTML. Called by `make build` via `uv run`; non-fatal if `.venv` absent. See Implementation Notes.
- [x] **Bidirectional backlinks with context** — See Phase 5 above; implemented with full context-paragraph extraction. Merged with the Phase 5 stub.
- [x] **Signed pages / content integrity** — `make sign` (called by `make deploy`) runs `tools/sign-site.sh`: walks `_site/**/*.html`, produces a detached ASCII-armored `.sig` per page. Signing uses a dedicated Ed25519 subkey isolated in `~/.gnupg-signing/` (master `sec#` stub + `ssb` signing subkey). Passphrase cached 24 h in the signing agent via `tools/preset-signing-passphrase.sh` + `gpg-preset-passphrase`; `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase`. Footer "sig" link points to `$url$.sig`; hovering shows the ASCII armor via `popups.js` `sigContent` provider. Public key at `static/gpg/pubkey.asc` → served at `/gpg/pubkey.asc`. Fingerprints: master `CD90AE96…B5C9663`; signing subkey `C9A42A6F…2707066` (keygrip `619844703EC398E70B0045D7150F08179CFEEFE3`). See Implementation Notes.
- [ ] **Full-text semantic search** — A secondary search mode alongside Pagefind's keyword index. Precompute embeddings for every paragraph (same pipeline as similar links). Store as a compact binary or JSON index. At query time, either: (a) compute the query embedding client-side using a small WASM model (e.g. `transformers.js` with a quantized MiniLM) and run cosine similarity against the stored paragraph vectors, or (b) use a precomputed query-expansion table (top-K words → relevant slugs, offline). Surfaced as a "Semantic search" toggle on `/search.html`. Returns paragraphs rather than pages as the result unit, with the source page title and a link to the specific section. This finds conceptually related content even when exact keywords differ — searching "the relationship between music and mathematics" surfaces relevant essays regardless of vocabulary.
- [x] **Self-hosted semantic search model** — `tools/download-model.sh` fetches the quantized ONNX model (`all-MiniLM-L6-v2`, ~22 MB, 5 files) from HuggingFace into `static/models/all-MiniLM-L6-v2/` (gitignored). `semantic-search.js` sets `env.localModelPath = '/models/'` and `env.allowRemoteModels = false` before calling `pipeline()`, so all model weight fetches are same-origin. The CDN import of transformers.js itself still requires `cdn.jsdelivr.net` in `script-src`; `connect-src` stays `'self'`. `make download-model` is a one-time setup step per machine. See Implementation Notes.
- [x] **Responsive images (WebP)** — `tools/convert-images.sh` walks `static/` and `content/`, calls `cwebp -q 85` to produce `.webp` companions alongside every JPEG/PNG (skips existing; exits gracefully if `cwebp` absent). `make build` runs it before Hakyll so WebP files are present when `static/**` is copied. `build/Filters/Images.hs` detects local raster images and emits `RawInline "html"` `<picture>` elements with a `<source srcset="…webp" type="image/webp">` and an `<img>` fallback; SVG, external URLs, and `data:` URIs pass through as plain `<img>`. Generated `.webp` files gitignored via `static/**/*.webp` / `content/**/*.webp`. See Implementation Notes.
- [x] **Full-text semantic search** — `tools/embed.py` also produces paragraph-level embeddings (same `all-MiniLM-L6-v2` model, same pass): walks `<p>/<li>/<blockquote>` in `#markdownBody`, tracks nearest preceding heading for context, writes `data/semantic-index.bin` (raw Float32, N×384) + `data/semantic-meta.json` ([{url, title, heading, excerpt}]). Client: `static/js/semantic-search.js` dynamically imports `@xenova/transformers@2` from CDN, embeds the query, brute-force cosine-ranks all paragraph vectors (fast at <5k paragraphs in JS), renders top-8 results as title + section heading + excerpt. Surfaced on `/search.html` as a **Keyword / Semantic** tab strip; active tab persists in `localStorage` (keyword default on first visit). Both tabs on same page; `semantic-search.js` loaded within existing `$if(search)$` block. See Implementation Notes.

---
@@ -668,6 +688,39 @@ Implemented across `build/Stability.hs`, `build/Contexts.hs`, `templates/partial

**nginx:** `.sig` files need no special handling — they're served as static files alongside `.html`. The `try_files` directive handles `$uri` directly.

### Embedding pipeline + semantic search

**Model unification:** Both similar-links (page-level) and semantic search (paragraph-level) use `all-MiniLM-L6-v2` (384 dims). This is a deliberate simplification: the same model runs at build time (Python/sentence-transformers) and query time (browser/transformers.js `Xenova/all-MiniLM-L6-v2` quantized), guaranteeing that query vectors and corpus vectors are in the same embedding space.

**Build-time (`tools/embed.py`):** One HTML parse pass per file extracts both the full-page text (for similar-links) and individual paragraphs (for semantic search). The model is loaded once and both encoding jobs run sequentially. Outputs: `data/similar-links.json`, `data/semantic-index.bin` (raw `float32`, shape `[N_paragraphs, 384]`), `data/semantic-meta.json`. All three are gitignored (generated). Staleness check: the entire run is skipped if all three outputs are newer than all `_site/` HTML.

**Binary index format:** `para_vecs.tobytes()` writes a flat, little-endian `float32` array. In JS: `new Float32Array(arrayBuffer)`. No header, no framing — row `i` starts at byte offset `i × 384 × 4`. This is the simplest possible format and avoids a numpy/npy parser in the browser.
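
A minimal sketch of reading that format on the client (function names hypothetical; only the layout — flat little-endian `float32`, 384 floats per row — is from the spec):

```javascript
// Wrap the fetched ArrayBuffer directly; there is no header to skip.
const DIMS = 384;

function loadIndex(arrayBuffer) {
  const vecs = new Float32Array(arrayBuffer);
  if (vecs.length % DIMS !== 0) {
    throw new Error("index size is not a multiple of " + DIMS + " floats");
  }
  return { vecs, count: vecs.length / DIMS };
}

// Row i starts at element offset i * DIMS (byte offset i * DIMS * 4,
// matching numpy's para_vecs.tobytes() on a [N, 384] float32 array).
function row(index, i) {
  return index.vecs.subarray(i * DIMS, (i + 1) * DIMS);
}
```

`subarray` returns a view, not a copy, so per-row access during ranking allocates nothing.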

**Client-side search (`semantic-search.js`):** Dynamically imports `@xenova/transformers@2` from the jsDelivr CDN on first query (lazy — no load cost on pages that never use semantic search). Fetches the binary index + metadata JSON (also lazy, browser-cached). Brute-force dot product over a `Float32Array` in a tight JS loop — fast enough at <5k paragraphs; revisit with a WASM FAISS binding if the corpus grows beyond ~20k paragraphs. Vectors are unit-normalised at build time, so dot product = cosine similarity.
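
The ranking loop described above can be sketched as follows (names hypothetical; assumes unit-normalised vectors, so a plain dot product is the cosine score):

```javascript
const DIMS = 384;

// Brute-force top-K over a flat Float32Array of N x 384 unit-normalised
// paragraph vectors. Returns [{index, score}] sorted by descending score.
function topK(queryVec, paraVecs, k) {
  const n = paraVecs.length / DIMS;
  const scored = [];
  for (let i = 0; i < n; i++) {
    let dot = 0;
    const off = i * DIMS;
    for (let d = 0; d < DIMS; d++) dot += queryVec[d] * paraVecs[off + d];
    scored.push({ index: i, score: dot });
  }
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, k);
}
```

At the corpus sizes the spec anticipates, sorting all N scores is cheap; a partial-selection heap only pays off at much larger N.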
|
||||
|
||||
**Tab default + localStorage:** Keyword (Pagefind) is the default on first visit — zero cold-start. User's last-used tab is stored under `search-tab` in `localStorage` and restored on load, so returning users who prefer semantic always land there. If `localStorage` is unavailable (private browsing restrictions), falls back silently to keyword.

**Self-hosted model:** `semantic-search.js` sets `mod.env.localModelPath = '/models/'` and `mod.env.allowRemoteModels = false` immediately after the CDN import. transformers.js then resolves model files as `GET /models/all-MiniLM-L6-v2/{file}`, which are same-origin. `make download-model` (= `tools/download-model.sh`) fetches 5 files from HuggingFace: `config.json`, `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`, `onnx/model_quantized.onnx`. The files live in `static/models/` (gitignored) and are copied to `_site/models/` by the existing `static/**` Hakyll rule.

**CSP:** `script-src` requires `https://cdn.jsdelivr.net` for the transformers.js library import. `connect-src` stays `'self'` — all model weight fetches are same-origin after `allowRemoteModels = false`. The binary index and meta JSON are also same-origin.

### Responsive images — WebP `<picture>` wrapping

`build/Filters/Images.hs` inspects each `Image` inline's `src`. If it is a local raster (not starting with `http://`, `https://`, `//`, or `data:`; extension `.jpg`/`.jpeg`/`.png`/`.gif`), it emits `RawInline (Format "html")` containing a `<picture>` element:

```html
<picture>
  <source srcset="/images/foo.webp" type="image/webp">
  <img src="/images/foo.jpg" alt="…" loading="lazy" data-lightbox="true">
</picture>
```

The WebP `srcset` is computed at build time by `System.FilePath.replaceExtension`. No IO is needed in the filter — the `<source>` is always emitted; browsers silently ignore it if the file doesn't exist (falling back to `<img>`). SVG, external URLs, and `data:` URIs remain plain `<img>` tags. Images inside `<a>` links get no `data-lightbox` marker (same as before).

`tools/convert-images.sh` performs the actual conversion: `cwebp -q 85` per file. Runs before Hakyll in `make build` and is also available as a standalone `make convert-images` target. Exits 0 with a notice if `cwebp` is not installed, so the build never fails on machines without `libwebp`. Generated `.webp` files are gitignored; `git add -f` to commit an authored WebP.

**Quality note:** `-q 85` is a good default for photographic images. For pixel-art, diagrams, or images that are already highly compressed, `-lossless` or a higher quality setting may be appropriate (edit the script).

### Annotations (infrastructure only)

`annotations.js` stores annotations as JSON in `localStorage` under `site-annotations`, scoped per `location.pathname`. On `DOMContentLoaded`, `applyAll()` re-anchors saved annotations via a `TreeWalker` text-stream search (concatenates all text nodes in `#markdownBody`, finds exact match by index, builds a `Range`, wraps with `<mark>`). Cross-element ranges use `extractContents()` + `insertNode()` fallback. Four highlight colors (amber / sage / steel / rose) defined in `annotations.css` as `rgba` overlays with `box-decoration-break: clone`. Hover tooltip shows note, date, and delete button. Public API: `window.Annotations.add(text, color, note)` / `.remove(id)`. The selection-popup "Annotate" button is currently removed pending a UI revision.

@@ -120,6 +120,13 @@
  /* Transitions */
  --transition-fast: 0.15s ease;

  /* Writing activity heatmap (light mode) */
  --hm-0: #e8e8e4; /* empty cell */
  --hm-1: #b4b4b0; /* < 500 words */
  --hm-2: #787874; /* 500–1999 words */
  --hm-3: #424240; /* 2000–4999 words */
  --hm-4: #1a1a1a; /* 5000+ words */
}

@@ -146,6 +153,13 @@
  --selection-bg: #d4d0c8;
  --selection-text: #121212;

  /* Writing activity heatmap (dark mode) */
  --hm-0: #252524;
  --hm-1: #484844;
  --hm-2: #6e6e6a;
  --hm-3: #9e9e9a;
  --hm-4: #d4d0c8;
}

/* System dark mode fallback */

@@ -168,6 +182,12 @@
  --selection-bg: #d4d0c8;
  --selection-text: #121212;

  --hm-0: #252524;
  --hm-1: #484844;
  --hm-2: #6e6e6a;
  --hm-3: #9e9e9a;
  --hm-4: #d4d0c8;
  }
}

@@ -188,6 +208,9 @@ html {
  line-height: var(--line-height);
  -webkit-text-size-adjust: 100%;
  scroll-behavior: smooth;
  /* clip (not hidden) — prevents horizontal scroll at the viewport level
     without creating a scroll container, so position:sticky still works. */
  overflow-x: clip;
}

body {

@@ -107,3 +107,41 @@
  margin: 0;
  font-variant-numeric: tabular-nums;
}

/* ============================================================
   Writing statistics page (/stats/)
   ============================================================ */

/* Heatmap figure */
.stats-heatmap {
  margin: 1.5rem 0 1rem;
  overflow-x: auto;
}

.heatmap-svg {
  display: block;
  max-width: 100%;
}

/* Legend row below heatmap */
.heatmap-legend {
  display: flex;
  align-items: center;
  gap: 0.4rem;
  margin-top: 0.5rem;
  font-size: 0.75em;
  font-family: var(--font-ui, var(--font-sans));
  color: var(--text-faint);
}

/* Ordered lists on the Notable section */
.build-page-list {
  font-size: 0.95em;
  padding-left: 1.5rem;
  margin: 0.25rem 0 1rem;
}

.build-page-list li {
  margin: 0.3rem 0;
  font-variant-numeric: tabular-nums;
}

@@ -245,6 +245,63 @@ nav.site-nav {
  scroll-behavior: auto !important;
}

/* ── Mobile nav (≤540px) ─────────────────────────────────────────────
   Controls are position:absolute on desktop (out of flex flow, pinned
   right). On narrow viewports they collide with the primary links.
   Fix: bring them into flow, wrap the row, add a separator line.
   ──────────────────────────────────────────────────────────────────── */

@media (max-width: 540px) {
  .nav-row-primary {
    flex-wrap: wrap;
    justify-content: center;
    padding: 0.4rem 0.75rem;
    gap: 0.25rem 0;
  }

  .nav-primary {
    flex-wrap: wrap;
    justify-content: center;
    font-size: 0.72rem;
  }

  /* Larger vertical padding → usable tap targets without extra markup */
  .nav-primary a {
    padding: 0.35rem 0.55rem;
  }

  /* border-left separators look broken on wrapped rows (the first link
     of a new row still gets a left border). Replace with a gap. */
  .nav-primary a + a {
    border-left: none;
    margin-left: 0.1rem;
  }

  /* Pull controls out of absolute positioning, span full width, center */
  .nav-controls {
    position: static;
    width: 100%;
    justify-content: center;
    padding-top: 0.3rem;
    border-top: 1px solid var(--border);
    margin-top: 0.15rem;
  }

  .nav-portal-toggle {
    padding: 0.3rem 0;
  }

  /* Portal row: tighter side padding on narrow screens */
  .nav-portals {
    padding-left: 0.75rem;
    padding-right: 0.75rem;
  }

  .nav-portals a {
    padding: 0.3rem 0.55rem;
  }
}

/* Row 2: portal links — hidden until nav.js adds .is-open */
.nav-portals {
  display: none;

@@ -500,6 +557,26 @@ nav.site-nav {
  font-variant-caps: normal;
}

/* Affiliation: institution name below author, in sans to contrast the serif author row */
.meta-affiliation {
  font-family: var(--font-sans);
  font-size: 0.7rem;
  font-variant-caps: all-small-caps;
  letter-spacing: 0.05em;
  color: var(--text-faint);
  text-align: center;
}

.meta-affiliation a {
  color: var(--text-faint);
  text-decoration: none;
  transition: color var(--transition-fast);
}

.meta-affiliation a:hover {
  color: var(--text-muted);
}

/* Authors: "by" label + name */
.meta-authors {
  font-size: 0.78rem;

@@ -975,6 +1052,96 @@ h3:hover .section-toggle,
  min-height: 1.2em; /* reserve space to prevent layout shift */
}

/* Search tabs (Keyword / Semantic toggle) */
.search-tabs {
  display: flex;
  gap: 0;
  margin-bottom: 0.1rem;
  border-bottom: 1px solid var(--border);
}

.search-tab {
  background: none;
  border: none;
  border-bottom: 2px solid transparent;
  margin-bottom: -1px;
  padding: 0.45rem 1rem;
  font-family: var(--font-sans);
  font-size: var(--text-size-small);
  color: var(--text-faint);
  cursor: pointer;
  transition: color 0.1s, border-color 0.1s;
}

.search-tab:hover { color: var(--text-muted); }
.search-tab.is-active { color: var(--text); border-bottom-color: var(--text); }

/* Search panels (keyword / semantic) — only active panel visible */
.search-panel { display: none; }
.search-panel.is-active { display: block; }

/* Semantic query input — mirrors Pagefind UI style */
.semantic-query-input {
  width: 100%;
  margin-top: 1.65rem;
  padding: 0.5rem 0.75rem;
  font-family: var(--font-sans);
  font-size: 0.85rem;
  color: var(--text);
  background: var(--bg);
  border: 1px solid var(--border);
  border-radius: 2px;
  box-sizing: border-box;
  outline: none;
}

.semantic-query-input:focus { border-color: var(--text-muted); }

.semantic-status {
  font-family: var(--font-mono);
  font-size: var(--text-size-small);
  color: var(--text-faint);
  margin-top: 0.5rem;
  min-height: 1.2em;
}

/* Semantic results list */
.semantic-results-list {
  list-style: none;
  margin: 1.2rem 0 0;
  padding: 0;
}

.semantic-result {
  padding: 0.85rem 0;
  border-bottom: 1px solid var(--border);
}

.semantic-result:last-child { border-bottom: none; }

.semantic-result-title {
  font-family: var(--font-sans);
  font-weight: 600;
  font-size: 0.9rem;
  color: var(--text);
  text-decoration: none;
}

.semantic-result-title:hover { text-decoration: underline; }

.semantic-result-heading {
  font-family: var(--font-sans);
  font-size: var(--text-size-small);
  color: var(--text-faint);
}

.semantic-result-excerpt {
  margin: 0.3rem 0 0;
  font-size: var(--text-size-small);
  color: var(--text-muted);
  line-height: 1.5;
}

/* ============================================================
   COMPOSITION LANDING PAGE
   ============================================================ */

@@ -1081,3 +1248,148 @@ h3:hover .section-toggle,
  margin-top: 0.5rem;
  accent-color: var(--text-muted);
}

/* ============================================================
   TRANSCLUSION
   States for <div class="transclude"> placeholders injected by
   Filters.Transclusion and populated by transclude.js.
   ============================================================ */

/* Loading placeholder — subtle pulse so the page doesn't look broken
   while the fetch is in flight. */
.transclude--loading {
  min-height: 1.65rem;
  background: var(--bg-offset);
  border-radius: 2px;
  animation: transclude-pulse 1.2s ease-in-out infinite;
}

@keyframes transclude-pulse {
  0%, 100% { opacity: 1; }
  50% { opacity: 0.45; }
}

/* Loaded — transparent wrapper; content reads as native. */
.transclude--loaded {
  /* No visual treatment: transcluded content is visually native. */
}

/* Optional: uncomment to show a subtle left rule on loaded transclusions.
.transclude--loaded {
  border-left: 2px solid var(--border);
  padding-left: 1rem;
  margin-left: -1rem;
}
*/

/* Error state — faint inline notice. */
.transclude--error {
  font-family: var(--font-mono);
  font-size: var(--text-size-small);
  color: var(--text-faint);
  padding: 0.4rem 0;
}

/* Content wrapper — display:contents so the injected nodes participate
   in normal document flow without adding a structural block. */
.transclude--content {
  display: contents;
}

/* ============================================================
   COPY BUTTON
   Injected by copy.js into every <pre> block. Fades in on hover.
   ============================================================ */

.copy-btn {
  position: absolute;
  top: 0.5rem;
  right: 0.5rem;
  font-family: var(--font-sans);
  font-size: 0.72rem;
  font-weight: 500;
  letter-spacing: 0.03em;
  text-transform: uppercase;
  color: var(--text-faint);
  background: var(--bg);
  border: 1px solid var(--border-muted);
  border-radius: 3px;
  padding: 0.2em 0.55em;
  cursor: pointer;
  opacity: 0;
  transition: opacity 0.15s, color 0.15s, border-color 0.15s;
  user-select: none;
  line-height: 1.6;
}

pre:hover .copy-btn,
.copy-btn:focus-visible {
  opacity: 1;
}

.copy-btn:hover {
  color: var(--text-muted);
  border-color: var(--border);
}

.copy-btn[data-copied] {
  color: var(--text-muted);
  border-color: var(--border);
  opacity: 1;
}

/* ============================================================
   TOUCH DEVICES
   (hover: none) and (pointer: coarse) reliably targets
   touchscreen-primary devices (phones, tablets) without
   relying on viewport width or user-agent sniffing.

   Goals:
   1. Suppress :hover states that "stick" after a tap
   2. Ensure tap targets meet ~44px minimum
   3. (popup disable is handled in popups.js)
   ============================================================ */

@media (hover: none) and (pointer: coarse) {
  /* Nav: ensure adequate vertical tap area even on wider tablets
     where the ≤540px query doesn't fire */
  .nav-primary a {
    padding-top: 0.35rem;
    padding-bottom: 0.35rem;
  }

  .nav-portal-toggle {
    padding-top: 0.3rem;
    padding-bottom: 0.3rem;
  }

  /* Settings gear: expand hit area without visual change */
  .settings-toggle {
    padding: 0.5rem;
    margin: -0.5rem;
  }

  /* Portal row links */
  .nav-portals a {
    padding-top: 0.3rem;
    padding-bottom: 0.3rem;
  }

  /* Suppress stuck hover states — reset to their non-hover values */
  .nav-primary a:hover { color: var(--text-muted); }
  .nav-portals a:hover { color: var(--text-faint); }
  .nav-portal-toggle:hover { color: var(--text-faint); }
  .settings-toggle:hover { color: var(--text-faint); }
  .toc-nav a:hover { color: var(--text-faint); }
  .toc-nav > ol > li > a:hover { color: var(--text-muted); }
  .hp-pro-row a:hover { color: var(--text-muted); }
  .hp-curiosity-row a:hover { color: var(--text-muted); }
  .hp-portal-name:hover { text-decoration-color: var(--border-muted); }
  .meta-tag:hover { color: var(--text-faint); }
  .backlink-source:hover { color: var(--text-muted); }
  .section-toggle:hover { color: var(--text-faint); }

  /* Copy button: always visible on touch (no persistent hover) */
  .copy-btn { opacity: 1; }
}

@@ -1,191 +1,106 @@
/* home.css — Homepage-specific styles (loaded only on index.html) */

/* ============================================================
   CONTACT ROW
   Professional links: Email · CV · About · GitHub · GPG · ORCID
   UTILITY ROWS
   Professional row and curiosities row: compact Fira Sans
   link strips separated by middots. Two rows, visually equal.
   ============================================================ */

.contact-row {
.hp-pro-row,
.hp-curiosity-row {
  display: flex;
  flex-wrap: wrap;
  justify-content: space-evenly;
  gap: 0.35rem 1rem;
  margin: 1.75rem 0 0;
  align-items: center;
  justify-content: center;
  font-family: var(--font-sans);
  font-size: var(--text-size-small);
  margin-top: 0.75rem;
}

.contact-row a {
  display: inline-flex;
  align-items: center;
  gap: 0.3em;
/* First row: extra top margin since the hr is gone — breathing
   room between the intro block and the utility rows. */
.hp-pro-row {
  margin-top: 2rem;
}

.hp-pro-row a,
.hp-curiosity-row a {
  color: var(--text-muted);
  text-decoration: none;
  transition: color var(--transition-fast);
  padding: 0.1rem 0.15rem;
}

.contact-row a:hover {
.hp-pro-row a:hover,
.hp-curiosity-row a:hover {
  color: var(--text);
}

.contact-row a[data-contact-icon]::before {
  content: '';
  display: inline-block;
  width: 0.85em;
  height: 0.85em;
  flex-shrink: 0;
  background-color: currentColor;
  mask-size: contain;
  mask-repeat: no-repeat;
  mask-position: center;
  -webkit-mask-size: contain;
  -webkit-mask-repeat: no-repeat;
  -webkit-mask-position: center;
}

.contact-row a[data-contact-icon="email"]::before {
  mask-image: url('/images/link-icons/email.svg');
  -webkit-mask-image: url('/images/link-icons/email.svg');
}
.contact-row a[data-contact-icon="document"]::before {
  mask-image: url('/images/link-icons/document.svg');
  -webkit-mask-image: url('/images/link-icons/document.svg');
}
.contact-row a[data-contact-icon="person"]::before {
  mask-image: url('/images/link-icons/person.svg');
  -webkit-mask-image: url('/images/link-icons/person.svg');
}
.contact-row a[data-contact-icon="github"]::before {
  mask-image: url('/images/link-icons/github.svg');
  -webkit-mask-image: url('/images/link-icons/github.svg');
}
.contact-row a[data-contact-icon="key"]::before {
  mask-image: url('/images/link-icons/key.svg');
  -webkit-mask-image: url('/images/link-icons/key.svg');
}
.contact-row a[data-contact-icon="orcid"]::before {
  mask-image: url('/images/link-icons/orcid.svg');
  -webkit-mask-image: url('/images/link-icons/orcid.svg');
}

/* ============================================================
   SITE GUIDE (expandable <details>)
   ============================================================ */

.site-guide {
  margin: 1.5rem 0 0;
  border: 1px solid var(--border);
  border-radius: 3px;
}

.site-guide summary {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  padding: 0.6rem 0.9rem;
  font-family: var(--font-sans);
  font-size: var(--text-size-small);
  color: var(--text-muted);
  cursor: pointer;
  list-style: none;
  user-select: none;
  transition: color var(--transition-fast);
}

.site-guide summary::-webkit-details-marker { display: none; }

.site-guide summary::before {
  content: '▶';
  font-size: 0.6rem;
  transition: transform var(--transition-fast);
  flex-shrink: 0;
}

.site-guide[open] summary::before {
  transform: rotate(90deg);
}

.site-guide summary:hover {
  color: var(--text);
}

.site-guide-body {
  padding: 0.75rem 1rem 1rem;
  font-family: var(--font-sans);
  font-size: var(--text-size-small);
  color: var(--text-muted);
  line-height: 1.6;
  border-top: 1px solid var(--border);
}

.site-guide-body p { margin: 0 0 0.6rem; }
.site-guide-body p:last-child { margin-bottom: 0; }

/* ============================================================
   CURATED GRID
   Hand-picked entry points, one per portal.
   ============================================================ */

.curated-grid {
  display: grid;
  grid-template-columns: 1fr 1fr;
  gap: 1rem;
  margin-top: 2.5rem;
}

@media (max-width: 580px) {
  .curated-grid {
    grid-template-columns: 1fr;
  }
}

.curated-card {
  display: flex;
  flex-direction: column;
  gap: 0.3rem;
  padding: 0.9rem 1.1rem;
  border: 1px solid var(--border);
  border-radius: 3px;
  text-decoration: none;
  color: inherit;
  transition: border-color var(--transition-fast);
}

.curated-card:hover {
  border-color: var(--text-muted);
}

.curated-portal {
  font-family: var(--font-sans);
  font-size: 0.68rem;
  font-variant: small-caps;
  letter-spacing: 0.07em;
.hp-sep {
  color: var(--text-faint);
  padding: 0 0.2em;
  user-select: none;
  pointer-events: none;
}

.curated-title {
  font-family: var(--font-serif);
  font-size: 1rem;
  line-height: 1.35;
  color: var(--text);
/* ============================================================
   LATIN BENEDICTION
   Right-aligned closing salutation, set in lang="la" for
   correct hyphenation and screen-reader pronunciation.
   ============================================================ */

.hp-latin {
  text-align: right;
  margin-top: 1.5rem;
}

/* Reset <button> to look like a card */
button.curated-card {
  background: none;
  font: inherit;
  text-align: left;
  cursor: pointer;
  width: 100%;
}

button.curated-card:hover {
  border-color: var(--text-muted);
}

.curated-desc {
  font-family: var(--font-sans);
  font-size: var(--text-size-small);
.hp-latin p {
  margin: 0;
  color: var(--text-muted);
  line-height: 1.45;
}

/* ============================================================
   PORTAL LIST
   Annotated portal directory, eight items, one per line.
   Name link — em dash — short description in faint text.
   ============================================================ */

.hp-portals {
  margin-top: 2.5rem;
  font-family: var(--font-sans);
}

.hp-portal-item {
  display: flex;
  align-items: baseline;
  gap: 0.4em;
  padding: 0.2rem 0;
}

.hp-portal-name {
  color: var(--text);
  text-decoration: underline;
  text-decoration-color: var(--border-muted);
  text-decoration-thickness: 0.15em;
  text-underline-offset: 0.2em;
  text-decoration-skip-ink: auto;
  transition: text-decoration-color var(--transition-fast), color var(--transition-fast);
  flex-shrink: 0;
}

.hp-portal-name:hover {
  text-decoration-color: var(--link-hover-underline);
}

.hp-portal-dash {
  color: var(--text-faint);
  padding: 0 0.15em;
  user-select: none;
  flex-shrink: 0;
}

.hp-portal-desc {
  font-size: var(--text-size-small);
  color: var(--text-faint);
  line-height: 1.4;
}

@@ -221,6 +221,23 @@ body > footer {
  grid-column: 1;
  width: 100%;
}

/* Footer: stack vertically so three sections don't fight for width. */
body > footer {
  flex-direction: column;
  align-items: center;
  gap: 0.3rem;
  padding: 0.9rem 1rem;
  text-align: center;
}

.footer-left {
  justify-content: center;
}

.footer-right {
  text-align: center;
}
}

/* Below ~900px: body spans full width (TOC hidden, no phantom column). */

@@ -257,6 +257,7 @@ code, kbd, samp {
}

pre {
  position: relative;
  font-family: var(--font-mono);
  font-size: 0.88em;
  font-feature-settings: 'liga' 1, 'calt' 1;

@@ -2,12 +2,99 @@
   Self-guards via #markdownBody check; no-ops on non-essay pages.
   Persists collapsed state per heading in localStorage.
   Retriggers sidenote positioning after each transition via a window resize event.

   Exposes window.reinitCollapse(container) for use by transclude.js when
   newly injected content contains collapsible headings.
*/
(function () {
  'use strict';

  var PREFIX = 'section-collapsed:';

  function initHeading(heading) {
    var level = parseInt(heading.tagName[1], 10);
    var content = [];
    var node = heading.nextElementSibling;

    // Collect sibling elements until the next same-or-higher heading.
    while (node) {
      if (/^H[1-6]$/.test(node.tagName) &&
          parseInt(node.tagName[1], 10) <= level) break;
      content.push(node);
      node = node.nextElementSibling;
    }
    if (!content.length) return;

    // Wrap collected nodes in a .section-body div.
    var wrapper = document.createElement('div');
    wrapper.className = 'section-body';
    wrapper.id = 'section-body-' + heading.id;
    heading.parentNode.insertBefore(wrapper, content[0]);
    content.forEach(function (el) { wrapper.appendChild(el); });

    // Inject toggle button into the heading.
    var btn = document.createElement('button');
    btn.className = 'section-toggle';
    btn.setAttribute('aria-label', 'Toggle section');
    btn.setAttribute('aria-controls', wrapper.id);
    heading.appendChild(btn);

    // Restore persisted state without transition flash.
    var key = PREFIX + heading.id;
    var collapsed = localStorage.getItem(key) === '1';

    function setCollapsed(c, animate) {
      if (!animate) wrapper.style.transition = 'none';
      if (c) {
        wrapper.style.maxHeight = '0';
        wrapper.classList.add('is-collapsed');
        btn.setAttribute('aria-expanded', 'false');
      } else {
        // Animate: transition 0 → scrollHeight, then release to 'none'
        // in transitionend so late-rendering content (e.g. KaTeX) is
        // never clipped. No animation: go straight to 'none'.
        wrapper.style.maxHeight = animate
          ? wrapper.scrollHeight + 'px'
          : 'none';
        wrapper.classList.remove('is-collapsed');
        btn.setAttribute('aria-expanded', 'true');
      }
      if (!animate) {
        // Re-enable transition after layout pass.
        requestAnimationFrame(function () {
          requestAnimationFrame(function () {
            wrapper.style.transition = '';
          });
        });
      }
    }

    setCollapsed(collapsed, false);

    btn.addEventListener('click', function (e) {
      e.stopPropagation();
      var isCollapsed = wrapper.classList.contains('is-collapsed');
      if (!isCollapsed) {
        // Pin height before collapsing so CSS transition has a from-value.
        wrapper.style.maxHeight = wrapper.scrollHeight + 'px';
        void wrapper.offsetHeight; // force reflow
      }
      setCollapsed(!isCollapsed, true);
      localStorage.setItem(key, isCollapsed ? '0' : '1');
    });

    // After open animation: release the height cap so late-rendering
    // content (KaTeX, images) is never clipped.
    // After close animation: cap is already 0, nothing to do.
    // Also retrigger sidenote layout after each transition.
    wrapper.addEventListener('transitionend', function () {
      if (!wrapper.classList.contains('is-collapsed')) {
        wrapper.style.maxHeight = 'none';
      }
      window.dispatchEvent(new Event('resize'));
    });
  }

  document.addEventListener('DOMContentLoaded', function () {
    var body = document.getElementById('markdownBody');
    if (!body) return;

@@ -16,88 +103,13 @@
    var headings = Array.from(body.querySelectorAll('h2[id], h3[id]'));
    if (!headings.length) return;

    headings.forEach(function (heading) {
      var level = parseInt(heading.tagName[1], 10);
      var content = [];
      var node = heading.nextElementSibling;

      // Collect sibling elements until the next same-or-higher heading.
      while (node) {
        if (/^H[1-6]$/.test(node.tagName) &&
            parseInt(node.tagName[1], 10) <= level) break;
        content.push(node);
        node = node.nextElementSibling;
      }
      if (!content.length) return;

      // Wrap collected nodes in a .section-body div.
      var wrapper = document.createElement('div');
      wrapper.className = 'section-body';
      wrapper.id = 'section-body-' + heading.id;
      heading.parentNode.insertBefore(wrapper, content[0]);
      content.forEach(function (el) { wrapper.appendChild(el); });

      // Inject toggle button into the heading.
      var btn = document.createElement('button');
      btn.className = 'section-toggle';
      btn.setAttribute('aria-label', 'Toggle section');
      btn.setAttribute('aria-controls', wrapper.id);
      heading.appendChild(btn);

      // Restore persisted state without transition flash.
      var key = PREFIX + heading.id;
      var collapsed = localStorage.getItem(key) === '1';

      function setCollapsed(c, animate) {
        if (!animate) wrapper.style.transition = 'none';
        if (c) {
          wrapper.style.maxHeight = '0';
          wrapper.classList.add('is-collapsed');
          btn.setAttribute('aria-expanded', 'false');
        } else {
          // Animate: transition 0 → scrollHeight, then release to 'none'
          // in transitionend so late-rendering content (e.g. KaTeX) is
          // never clipped. No animation: go straight to 'none'.
          wrapper.style.maxHeight = animate
            ? wrapper.scrollHeight + 'px'
            : 'none';
          wrapper.classList.remove('is-collapsed');
          btn.setAttribute('aria-expanded', 'true');
        }
        if (!animate) {
          // Re-enable transition after layout pass.
          requestAnimationFrame(function () {
            requestAnimationFrame(function () {
              wrapper.style.transition = '';
            });
          });
        }
      }

      setCollapsed(collapsed, false);

      btn.addEventListener('click', function (e) {
        e.stopPropagation();
        var isCollapsed = wrapper.classList.contains('is-collapsed');
        if (!isCollapsed) {
          // Pin height before collapsing so CSS transition has a from-value.
          wrapper.style.maxHeight = wrapper.scrollHeight + 'px';
          void wrapper.offsetHeight; // force reflow
        }
        setCollapsed(!isCollapsed, true);
        localStorage.setItem(key, isCollapsed ? '0' : '1');
      });

      // After open animation: release the height cap so late-rendering
      // content (KaTeX, images) is never clipped.
      // After close animation: cap is already 0, nothing to do.
      // Also retrigger sidenote layout after each transition.
      wrapper.addEventListener('transitionend', function () {
        if (!wrapper.classList.contains('is-collapsed')) {
          wrapper.style.maxHeight = 'none';
        }
        window.dispatchEvent(new Event('resize'));
      });
    });
    headings.forEach(initHeading);
  });

  // Public entry point for transclude.js: initialize collapse toggles on
  // headings inside a newly injected fragment.
  window.reinitCollapse = function (container) {
    Array.from(container.querySelectorAll('h2[id], h3[id]'))
      .forEach(initHeading);
  };
}());
|
||||
|
|
|
|||
|
|
@@ -0,0 +1,45 @@
/* copy.js — Copy-to-clipboard button for <pre> code blocks.
 *
 * Injects a .copy-btn into every <pre> element on the page.
 * The button is visually hidden until the block is hovered (CSS handles this).
 * On click: copies the text content of the block, shows "copied" briefly.
 */

(function () {
  'use strict';

  var RESET_DELAY = 1800; /* ms before label reverts to "copy" */

  function attachButton(pre) {
    var btn = document.createElement('button');
    btn.className = 'copy-btn';
    btn.textContent = 'copy';
    btn.setAttribute('aria-label', 'Copy code to clipboard');

    btn.addEventListener('click', function () {
      var text = pre.querySelector('code')
        ? pre.querySelector('code').innerText
        : pre.innerText;

      navigator.clipboard.writeText(text).then(function () {
        btn.textContent = 'copied';
        btn.setAttribute('data-copied', '');
        setTimeout(function () {
          btn.textContent = 'copy';
          btn.removeAttribute('data-copied');
        }, RESET_DELAY);
      }).catch(function () {
        btn.textContent = 'error';
        setTimeout(function () {
          btn.textContent = 'copy';
        }, RESET_DELAY);
      });
    });

    pre.appendChild(btn);
  }

  document.addEventListener('DOMContentLoaded', function () {
    document.querySelectorAll('pre').forEach(attachButton);
  });
}());
@@ -38,6 +38,10 @@
     ------------------------------------------------------------------ */

  function init() {
    // Hover popups are meaningless on touch-primary devices and interfere
    // with tap navigation (first tap = hover, second tap = follow link).
    if (window.matchMedia('(hover: none) and (pointer: coarse)').matches) return;

    popup = document.createElement('div');
    popup.className = 'link-popup';
    popup.setAttribute('aria-live', 'polite');
@@ -1,21 +1,25 @@
/* random.js — "Random Page" button for the homepage.
/* random.js — "Random Page" for the homepage.
   Fetches /random-pages.json (essays + blog posts, generated at build time)
   and navigates to a uniformly random entry on click. */
   and navigates to a uniformly random entry on click.
   Attaches to: #random-page-btn (legacy button) or [data-random] (link row). */
(function () {
  'use strict';

  document.addEventListener('DOMContentLoaded', function () {
    var btn = document.getElementById('random-page-btn');
    if (!btn) return;
  function goRandom(e) {
    e.preventDefault();
    fetch('/random-pages.json')
      .then(function (r) { return r.json(); })
      .then(function (pages) {
        if (!pages.length) return;
        window.location.href = pages[Math.floor(Math.random() * pages.length)];
      })
      .catch(function () {});
  }

    btn.addEventListener('click', function () {
      fetch('/random-pages.json')
        .then(function (r) { return r.json(); })
        .then(function (pages) {
          if (!pages.length) return;
          window.location.href = pages[Math.floor(Math.random() * pages.length)];
        })
        .catch(function () {});
  document.addEventListener('DOMContentLoaded', function () {
    var els = document.querySelectorAll('#random-page-btn, [data-random]');
    els.forEach(function (el) {
      el.addEventListener('click', goRandom);
    });
  });
}());
@@ -0,0 +1,220 @@
/* semantic-search.js — Client-side semantic search using paragraph embeddings.
 *
 * At build time, tools/embed.py produces:
 *   /data/semantic-index.bin   raw Float32Array (N_paragraphs × 384 dims)
 *   /data/semantic-meta.json   [{url, title, heading, excerpt}, ...]
 *
 * At query time, transformers.js embeds the user's query with all-MiniLM-L6-v2
 * (same model used at build time) and ranks paragraphs by cosine similarity.
 * All computation is client-side; no server required.
 *
 * Model: Xenova/all-MiniLM-L6-v2 (~22 MB quantized, cached by browser after first load)
 * Model files served from /models/all-MiniLM-L6-v2/ (same-origin; run tools/download-model.sh)
 * Index format: raw little-endian Float32, shape [N, 384], unit-normalized
 *
 * CSP: requires cdn.jsdelivr.net in script-src (transformers.js library).
 * connect-src stays 'self' — model weights are served same-origin.
 */
(function () {
  'use strict';

  var MODEL = 'all-MiniLM-L6-v2'; /* local name, no Xenova/ prefix */
  var MODEL_PATH = '/models/';    /* served same-origin */
  var DIM = 384;
  var TOP_K = 8;
  var CDN = 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2';

  var extractor = null;   /* loaded lazily on first search */
  var vectors = null;     /* Float32Array, shape [N, DIM] */
  var meta = null;        /* [{url, title, heading, excerpt}] */
  var indexReady = false;

  var queryEl = document.getElementById('semantic-query');
  var statusEl = document.getElementById('semantic-status');
  var resultsEl = document.getElementById('semantic-results');

  if (!queryEl) return; /* not on the search page */

  /* ------------------------------------------------------------------
     Index loading — fetch once, lazily
     ------------------------------------------------------------------ */

  function loadIndex() {
    if (indexReady) return Promise.resolve();

    return Promise.all([
      fetch('/data/semantic-index.bin').then(function (r) {
        if (!r.ok) throw new Error('semantic-index.bin not found');
        return r.arrayBuffer();
      }),
      fetch('/data/semantic-meta.json').then(function (r) {
        if (!r.ok) throw new Error('semantic-meta.json not found');
        return r.json();
      }),
    ]).then(function (results) {
      vectors = new Float32Array(results[0]);
      meta = results[1];
      indexReady = true;
    });
  }

  /* ------------------------------------------------------------------
     Model loading — dynamic import from CDN, lazy
     ------------------------------------------------------------------ */

  function loadModel() {
    if (extractor) return Promise.resolve(extractor);
    setStatus('Loading model…');
    return import(CDN).then(function (mod) {
      /* Point transformers.js at our self-hosted model files. */
      mod.env.localModelPath = MODEL_PATH;
      mod.env.allowRemoteModels = false;
      return mod.pipeline('feature-extraction', MODEL, { quantized: true });
    }).then(function (pipe) {
      extractor = pipe;
      return extractor;
    });
  }

  /* ------------------------------------------------------------------
     Search
     ------------------------------------------------------------------ */

  function cosineSims(queryVec) {
    /* queryVec is already unit-normalized; dot product = cosine similarity */
    var N = meta.length;
    var scores = new Float32Array(N);
    for (var i = 0; i < N; i++) {
      var dot = 0;
      var off = i * DIM;
      for (var d = 0; d < DIM; d++) dot += queryVec[d] * vectors[off + d];
      scores[i] = dot;
    }
    return scores;
  }

  function topK(scores) {
    var indices = Array.from({ length: meta.length }, function (_, i) { return i; });
    indices.sort(function (a, b) { return scores[b] - scores[a]; });
    return indices.slice(0, TOP_K).map(function (i) {
      return { idx: i, score: scores[i] };
    });
  }

  function runSearch(query) {
    query = query.trim();
    if (!query) { clearResults(); return; }

    setStatus('Searching…');

    var indexPromise = loadIndex().catch(function (err) {
      setStatus('Semantic index not available — run make build first.');
      throw err;
    });
    var modelPromise = loadModel();

    Promise.all([indexPromise, modelPromise]).then(function (results) {
      var pipe = results[1];
      return pipe(query, { pooling: 'mean', normalize: true });
    }).then(function (output) {
      var queryVec = output.data; /* Float32Array, length 384 */
      var scores = cosineSims(queryVec);
      var hits = topK(scores);
      renderResults(hits);
      setStatus(hits.length ? '' : 'No results found.');
    }).catch(function (err) {
      if (err.message && err.message.indexOf('not available') === -1) {
        setStatus('Search error — see console for details.');
        console.error('semantic-search:', err);
      }
    });
  }

  /* ------------------------------------------------------------------
     Rendering
     ------------------------------------------------------------------ */

  function renderResults(hits) {
    if (!hits.length) { clearResults(); return; }

    var html = '<ol class="semantic-results-list">';
    for (var i = 0; i < hits.length; i++) {
      var h = hits[i];
      var m = meta[h.idx];
      var sameHeading = m.heading === m.title;
      html += '<li class="semantic-result">'
        + '<a class="semantic-result-title" href="' + esc(m.url) + '">'
        + esc(m.title) + '</a>';
      if (!sameHeading) {
        html += '<span class="semantic-result-heading"> § ' + esc(m.heading) + '</span>';
      }
      html += '<p class="semantic-result-excerpt">' + esc(m.excerpt) + '</p>'
        + '</li>';
    }
    html += '</ol>';
    resultsEl.innerHTML = html;
  }

  function clearResults() {
    resultsEl.innerHTML = '';
  }

  function setStatus(msg) {
    statusEl.textContent = msg;
  }

  function esc(s) {
    return String(s)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;');
  }

  /* ------------------------------------------------------------------
     Tab switching — persists choice in localStorage
     ------------------------------------------------------------------ */

  var STORAGE_KEY = 'search-tab';

  function activateTab(target) {
    document.querySelectorAll('.search-tab').forEach(function (b) {
      var active = b.dataset.tab === target;
      b.classList.toggle('is-active', active);
      b.setAttribute('aria-selected', active ? 'true' : 'false');
    });
    document.querySelectorAll('.search-panel').forEach(function (p) {
      p.classList.toggle('is-active', p.dataset.panel === target);
    });
    try { localStorage.setItem(STORAGE_KEY, target); } catch (e) {}
  }

  document.querySelectorAll('.search-tab').forEach(function (btn) {
    btn.addEventListener('click', function () { activateTab(btn.dataset.tab); });
  });

  /* Restore last-used tab (falls back to keyword if unset or unrecognised) */
  var saved = null;
  try { saved = localStorage.getItem(STORAGE_KEY); } catch (e) {}
  if (saved === 'semantic') activateTab('semantic');

  /* ------------------------------------------------------------------
     Input handling — debounced, 400 ms
     ------------------------------------------------------------------ */

  var debounceTimer = null;
  queryEl.addEventListener('input', function () {
    clearTimeout(debounceTimer);
    var q = queryEl.value.trim();
    if (!q) { clearResults(); setStatus(''); return; }
    debounceTimer = setTimeout(function () { runSearch(q); }, 400);
  });

  /* Pre-fill from ?q= on load — mirror keyword search behaviour */
  var params = new URLSearchParams(window.location.search);
  var initial = params.get('q');
  if (initial) {
    queryEl.value = initial;
    runSearch(initial);
  }
}());
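The on-disk contract this file depends on — raw little-endian Float32, shape [N, 384], unit-normalized rows — is easy to sanity-check outside the browser. Here is a minimal Python sketch of the same ranking that `cosineSims` and `topK` perform; it uses synthetic vectors in place of `/data/semantic-index.bin` (loading the real file is shown only in a comment, and the constants are taken from the header above):

```python
import numpy as np

DIM, TOP_K = 384, 8

# Synthetic stand-in for /data/semantic-index.bin: N unit-normalized rows.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(100, DIM)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# The real index would instead be loaded as:
#   vectors = np.fromfile("data/semantic-index.bin", dtype="<f4").reshape(-1, DIM)

def top_k(query_vec: np.ndarray, k: int = TOP_K) -> list[tuple[int, float]]:
    """Rank rows by dot product (== cosine, since every row is unit-length)."""
    scores = vectors @ query_vec
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

# Querying with a row of the index should rank that row first with score ≈ 1.
hits = top_k(vectors[42])
assert hits[0][0] == 42 and abs(hits[0][1] - 1.0) < 1e-4
```

Because both the build-time corpus and the query vector are normalized, the plain dot product is the full similarity computation — no division by norms at query time.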
@@ -0,0 +1,139 @@
/* transclude.js — Client-side lazy transclusion.
 *
 * Authored in Markdown as a standalone line:
 *   {{slug}}            — embed full body of /slug.html
 *   {{slug#section-id}} — embed one section by heading id
 *   {{path/to/page}}    — sub-path pages work the same way
 *
 * The Haskell preprocessor (Filters.Transclusion) converts these at build
 * time to placeholder divs:
 *   <div class="transclude" data-src="/slug.html"
 *        data-section="section-id"></div>
 *
 * This script finds those divs, fetches the target page, extracts the
 * requested content, rewrites cross-page fragment hrefs, injects the
 * content inline, and retriggers layout-dependent JS (sidenotes, collapse).
 */

(function () {
  'use strict';

  /* Shared fetch cache — one network request per URL regardless of how
   * many transclusions reference the same page. */
  var cache = {};

  function fetchPage(url) {
    if (!cache[url]) {
      cache[url] = fetch(url).then(function (r) {
        if (!r.ok) throw new Error('HTTP ' + r.status);
        return r.text();
      });
    }
    return cache[url];
  }

  function parseDoc(html) {
    return new DOMParser().parseFromString(html, 'text/html');
  }

  /* Extract a named section: the heading element with id=sectionId plus
   * all following siblings until the next heading at the same or higher
   * level (lower number), or end of parent. */
  function extractSection(doc, sectionId) {
    var anchor = doc.getElementById(sectionId);
    if (!anchor) return null;

    var level = parseInt(anchor.tagName[1], 10);
    if (!level) return null;

    var nodes = [anchor.cloneNode(true)];
    var el = anchor.nextElementSibling;

    while (el) {
      if (/^H[1-6]$/.test(el.tagName) &&
          parseInt(el.tagName[1], 10) <= level) break;
      nodes.push(el.cloneNode(true));
      el = el.nextElementSibling;
    }

    return nodes.length ? nodes : null;
  }

  /* Extract the full contents of #markdownBody. */
  function extractBody(doc) {
    var body = doc.getElementById('markdownBody');
    if (!body) return null;
    var nodes = Array.from(body.children).map(function (el) {
      return el.cloneNode(true);
    });
    return nodes.length ? nodes : null;
  }

  /* Rewrite href="#fragment" → href="srcUrl#fragment" so in-page anchor
   * links from the source page remain valid when embedded elsewhere. */
  function rewriteFragmentHrefs(nodes, srcUrl) {
    nodes.forEach(function (node) {
      node.querySelectorAll('a[href^="#"]').forEach(function (a) {
        a.setAttribute('href', srcUrl + a.getAttribute('href'));
      });
    });
  }

  /* After injection, retrigger layout-dependent subsystems. */
  function reinitFragment(container) {
    /* sidenotes.js repositions on resize — dispatch to trigger it. */
    window.dispatchEvent(new Event('resize'));

    /* collapse.js exposes reinitCollapse for newly added headings. */
    if (typeof window.reinitCollapse === 'function') {
      window.reinitCollapse(container);
    }

    /* gallery.js can expose reinitGallery when needed. */
    if (typeof window.reinitGallery === 'function') {
      window.reinitGallery(container);
    }
  }

  function loadTransclusion(el) {
    var src = el.dataset.src;
    var section = el.dataset.section || null;
    if (!src) return;

    el.classList.add('transclude--loading');

    fetchPage(src)
      .then(function (html) {
        var doc = parseDoc(html);
        var nodes = section
          ? extractSection(doc, section)
          : extractBody(doc);

        if (!nodes) {
          el.classList.replace('transclude--loading', 'transclude--error');
          el.textContent = '[transclusion not found: '
            + src + (section ? '#' + section : '') + ']';
          return;
        }

        rewriteFragmentHrefs(nodes, src);

        var wrapper = document.createElement('div');
        wrapper.className = 'transclude--content';
        nodes.forEach(function (n) { wrapper.appendChild(n); });

        el.classList.replace('transclude--loading', 'transclude--loaded');
        el.appendChild(wrapper);

        reinitFragment(el);
      })
      .catch(function (err) {
        el.classList.replace('transclude--loading', 'transclude--error');
        console.warn('transclude: failed to load', src, err);
      });
  }

  document.addEventListener('DOMContentLoaded', function () {
    document.querySelectorAll('div.transclude').forEach(loadTransclusion);
  });
}());
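The extraction rule `extractSection` implements — the anchor heading plus following siblings until the next heading of equal or higher rank — can be pinned down independent of the DOM. A Python sketch over a flat sibling list of `(tag, text)` pairs (a stand-in for the element list, not the site's actual code):

```python
import re

def extract_section(nodes: list[tuple[str, str]], idx: int) -> list[tuple[str, str]]:
    """nodes: flat sibling list of (tag, text), e.g. ("h2", "Intro"), ("p", "...").
    Returns the heading at idx plus the following nodes, stopping at the next
    heading whose level is <= the anchor's level (h2 outranks h3)."""
    level = int(nodes[idx][0][1])  # "h2" -> 2
    out = [nodes[idx]]
    for node in nodes[idx + 1:]:
        if re.fullmatch(r"h[1-6]", node[0]) and int(node[0][1]) <= level:
            break
        out.append(node)
    return out

doc = [("h2", "A"), ("p", "a1"), ("h3", "A.1"), ("p", "a2"), ("h2", "B"), ("p", "b1")]
assert extract_section(doc, 0) == doc[:4]   # h2 "A" swallows its h3 subsection
assert extract_section(doc, 2) == doc[2:4]  # h3 "A.1" stops at the next h2
```

The asymmetry is the point: transcluding a top-level section carries its subsections along, while transcluding a subsection stays within it.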
@@ -9,6 +9,7 @@ $partial("templates/partials/nav.html")$
$if(search)$
<script src="/pagefind/pagefind-ui.js"></script>
<script src="/js/search.js" defer></script>
<script src="/js/semantic-search.js" defer></script>
$endif$
$body$
$partial("templates/partials/footer.html")$
@@ -39,3 +39,5 @@ $if(viz)$
<script src="/js/viz.js" defer></script>
$endif$
<script src="/js/collapse.js" defer></script>
<script src="/js/transclude.js" defer></script>
<script src="/js/copy.js" defer></script>
@@ -12,6 +12,11 @@
<div class="meta-row meta-authors">
  <span class="meta-label">by</span>$if(poet)$$poet$$else$$for(author-links)$<a href="$author-url$">$author-name$</a>$sep$, $endfor$$endif$
</div>
$if(affiliation-links)$
<div class="meta-row meta-affiliation">
  $for(affiliation-links)$$if(affiliation-url)$<a href="$affiliation-url$">$affiliation-name$</a>$else$$affiliation-name$$endif$$sep$ · $endfor$
</div>
$endif$
<nav class="meta-row meta-pagelinks" aria-label="Page sections">
  <a href="#version-history">History</a>
  $if(status)$<a href="#epistemic">Epistemic</a>$endif$
Binary file not shown.
@@ -0,0 +1,39 @@
#!/usr/bin/env bash
# convert-images.sh — Produce WebP companions for every local raster image.
#
# Walks static/ and content/ for JPEG and PNG files and calls cwebp to produce
# a .webp file alongside each one. Existing .webp files are skipped (safe to
# re-run). If cwebp is not found the script exits 0 so the build continues.
#
# Requires: cwebp (libwebp) — pacman -S libwebp / apt install webp
#
# Quality: -q 85 is a good default for photographic content. For images that
# are already highly compressed, -lossless avoids further degradation.

set -euo pipefail

if ! command -v cwebp >/dev/null 2>&1; then
  echo "convert-images: cwebp not found — skipping WebP conversion." >&2
  echo "  Install: pacman -S libwebp (or: apt install webp)" >&2
  exit 0
fi

REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"

converted=0
skipped=0

while IFS= read -r -d '' img; do
  webp="${img%.*}.webp"
  if [ -f "$webp" ]; then
    skipped=$((skipped + 1))
  else
    echo "  webp ${img#"$REPO_ROOT/"}"
    cwebp -quiet -q 85 "$img" -o "$webp"
    converted=$((converted + 1))
  fi
done < <(find "$REPO_ROOT/static" "$REPO_ROOT/content" \
  \( -name "*.jpg" -o -name "*.jpeg" -o -name "*.png" \) \
  -print0 2>/dev/null)

echo "convert-images: ${converted} converted, ${skipped} already present."
@@ -0,0 +1,38 @@
#!/usr/bin/env bash
# download-model.sh — Download the quantized ONNX model for client-side semantic search.
#
# Downloads Xenova/all-MiniLM-L6-v2 (quantized ONNX, ~22 MB total) from HuggingFace
# into static/models/all-MiniLM-L6-v2/ so it can be served same-origin, keeping the
# nginx CSP to 'self' for connect-src.
#
# Run once before deploying. Files are gitignored (binary artifacts).
# Re-running is safe — existing files are skipped.

set -euo pipefail

REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
MODEL_DIR="$REPO_ROOT/static/models/all-MiniLM-L6-v2"
BASE_URL="https://huggingface.co/Xenova/all-MiniLM-L6-v2/resolve/main"

mkdir -p "$MODEL_DIR/onnx"

fetch() {
  local src="$1" dst="$2"
  if [ -f "$dst" ]; then
    echo "  skip  $src (already present)"
    return
  fi
  echo "  fetch $src"
  curl -fsSL --progress-bar "$BASE_URL/$src" -o "$dst"
}

echo "Downloading all-MiniLM-L6-v2 to $MODEL_DIR ..."

fetch "config.json"               "$MODEL_DIR/config.json"
fetch "tokenizer.json"            "$MODEL_DIR/tokenizer.json"
fetch "tokenizer_config.json"     "$MODEL_DIR/tokenizer_config.json"
fetch "special_tokens_map.json"   "$MODEL_DIR/special_tokens_map.json"
fetch "onnx/model_quantized.onnx" "$MODEL_DIR/onnx/model_quantized.onnx"

echo "Done. static/models/all-MiniLM-L6-v2/ is ready."
echo "Run 'make build' to copy the model files into _site/."

255 tools/embed.py
@@ -1,22 +1,21 @@
#!/usr/bin/env python3
"""
embed.py — Build-time similar-links generator.
embed.py — Build-time embedding pipeline.

Reads _site/**/*.html, embeds each page with nomic-embed-text-v1.5,
builds a FAISS IndexFlatIP, and writes data/similar-links.json:
Produces two outputs from _site/**/*.html:

{ "/path/to/page/": [{"url": "...", "title": "...", "score": 0.87}, ...] }
data/similar-links.json    Page-level similarity (for "Related" footer section)
data/semantic-index.bin    Paragraph vectors as raw Float32 array (N × DIM)
data/semantic-meta.json    Paragraph metadata: [{url, title, heading, excerpt}]

Called by `make build` when .venv exists. Failures are non-fatal (make prints
a warning and continues). Run `uv sync` first to provision the environment.
Both use all-MiniLM-L6-v2 (384 dims) — the same model shipped to the browser
via transformers.js for query-time semantic search.

Staleness check: skips re-embedding if data/similar-links.json is newer than
every HTML file in _site/ — so content-only rebuilds that don't touch HTML
won't re-embed.
Called by `make build` when .venv exists. Failures are non-fatal.
Staleness check: skips if all output files are newer than every HTML in _site/.
"""

import json
import os
import re
import sys
from pathlib import Path
@@ -30,85 +29,118 @@ from sentence_transformers import SentenceTransformer
# Configuration
# ---------------------------------------------------------------------------

REPO_ROOT = Path(__file__).parent.parent
SITE_DIR = REPO_ROOT / "_site"
OUT_FILE = REPO_ROOT / "data" / "similar-links.json"
MODEL_NAME = "nomic-ai/nomic-embed-text-v1.5"
TOP_N = 5
MIN_SCORE = 0.30  # cosine similarity threshold; discard weak matches
# Pages to exclude from both indexing and results (exact URL paths)
REPO_ROOT = Path(__file__).parent.parent
SITE_DIR = REPO_ROOT / "_site"
SIMILAR_OUT = REPO_ROOT / "data" / "similar-links.json"
SEMANTIC_BIN = REPO_ROOT / "data" / "semantic-index.bin"
SEMANTIC_META = REPO_ROOT / "data" / "semantic-meta.json"

MODEL_NAME = "all-MiniLM-L6-v2"
DIM = 384

TOP_N = 5         # similar-links: neighbours per page
MIN_SCORE = 0.30  # similar-links: discard weak matches
MIN_PARA_CHARS = 80    # semantic: skip very short paragraphs
MAX_PARA_CHARS = 1000  # semantic: truncate before embedding

EXCLUDE_URLS = {"/search/", "/build/", "/404.html", "/feed.xml", "/music/feed.xml"}

STRIP_SELECTORS = [
    "nav", "footer", "#toc", ".link-popup", "script", "style",
    ".page-meta-footer", ".metadata", "[data-pagefind-ignore]",
]

# ---------------------------------------------------------------------------
# Staleness check
# ---------------------------------------------------------------------------

def needs_update() -> bool:
    """Return True if similar-links.json is missing or older than any _site HTML."""
    if not OUT_FILE.exists():
    outputs = [SIMILAR_OUT, SEMANTIC_BIN, SEMANTIC_META]
    if not all(p.exists() for p in outputs):
        return True
    json_mtime = OUT_FILE.stat().st_mtime
    for html in SITE_DIR.rglob("*.html"):
        if html.stat().st_mtime > json_mtime:
            return True
    return False
    oldest = min(p.stat().st_mtime for p in outputs)
    return any(html.stat().st_mtime > oldest for html in SITE_DIR.rglob("*.html"))

# ---------------------------------------------------------------------------
# HTML → text extraction
# HTML parsing helpers
# ---------------------------------------------------------------------------

def extract(html_path: Path) -> dict | None:
    """
    Parse an HTML file and extract:
      - url: root-relative URL path (e.g. "/essays/my-essay/")
      - title: page <title> text
      - text: plain text of the page body (nav/footer/TOC stripped)
    Returns None for pages that should not be indexed.
    """
    raw = html_path.read_text(encoding="utf-8", errors="replace")
    soup = BeautifulSoup(raw, "html.parser")

    # Derive root-relative URL from file path
def _url_from_path(html_path: Path) -> str:
    rel = html_path.relative_to(SITE_DIR)
    if rel.name == "index.html":
        url = "/" + str(rel.parent) + "/"
        url = url.replace("//", "/")  # root index.html → "/"
    else:
        url = "/" + str(rel)
    return url.replace("//", "/")
    return "/" + str(rel)

def _clean_soup(soup: BeautifulSoup) -> None:
    for sel in STRIP_SELECTORS:
        for el in soup.select(sel):
            el.decompose()

def _title(soup: BeautifulSoup, url: str) -> str:
    h1 = soup.find("h1")
    if h1:
        return h1.get_text(" ", strip=True)
    tag = soup.find("title")
    raw = tag.get_text(" ", strip=True) if tag else url
    return re.split(r"\s+[—–-]\s+", raw)[0].strip()

# ---------------------------------------------------------------------------
# Page-level extraction (for similar-links)
# ---------------------------------------------------------------------------

def extract_page(html_path: Path) -> dict | None:
    raw = html_path.read_text(encoding="utf-8", errors="replace")
    soup = BeautifulSoup(raw, "html.parser")
    url = _url_from_path(html_path)

    if url in EXCLUDE_URLS:
        return None

    # Only index actual content pages — skip index/tag/feed/author pages
    # that have no prose body.
    body = soup.select_one("#markdownBody")
    if body is None:
        return None

    # Title: prefer <h1>, fall back to <title> (strip " — Site Name" suffix)
    h1 = soup.find("h1")
    if h1:
        title = h1.get_text(" ", strip=True)
    else:
        title_tag = soup.find("title")
        raw_title = title_tag.get_text(" ", strip=True) if title_tag else url
        title = re.split(r"\s+[—–-]\s+", raw_title)[0].strip()
    title = _title(soup, url)
    _clean_soup(soup)

    # Remove elements that aren't content
    for sel in ["nav", "footer", "#toc", ".link-popup", "script", "style",
                ".page-meta-footer", ".metadata", "[data-pagefind-ignore]"]:
        for el in soup.select(sel):
            el.decompose()

    text = body.get_text(" ", strip=True)
    # Collapse runs of whitespace
    text = re.sub(r"\s+", " ", text).strip()

    if len(text) < 100:  # too short to embed meaningfully
    text = re.sub(r"\s+", " ", body.get_text(" ", strip=True)).strip()
    if len(text) < 100:
        return None

    # Feed title + text to the model so title is part of the representation
    return {"url": url, "title": title, "text": f"search_document: {title}\n\n{text}"}
    return {"url": url, "title": title, "text": text}

# ---------------------------------------------------------------------------
# Paragraph-level extraction (for semantic search)
# ---------------------------------------------------------------------------

def extract_paragraphs(html_path: Path, url: str, title: str) -> list[dict]:
    raw = html_path.read_text(encoding="utf-8", errors="replace")
    soup = BeautifulSoup(raw, "html.parser")
    body = soup.select_one("#markdownBody")
    if body is None:
        return []

    _clean_soup(soup)

    paras = []
    heading = title  # track current section heading

    for el in body.find_all(["h1", "h2", "h3", "h4", "p", "li", "blockquote"]):
        if el.name in ("h1", "h2", "h3", "h4"):
            heading = el.get_text(" ", strip=True)
            continue
        text = re.sub(r"\s+", " ", el.get_text(" ", strip=True)).strip()
        if len(text) < MIN_PARA_CHARS:
            continue
        paras.append({
            "url": url,
            "title": title,
            "heading": heading,
            "excerpt": text[:200] + ("…" if len(text) > 200 else ""),
            "text": text[:MAX_PARA_CHARS],
        })

    return paras

# ---------------------------------------------------------------------------
# Main
@@ -120,67 +152,82 @@ def main() -> int:
         return 0
 
     if not needs_update():
-        print("embed.py: similar-links.json is up to date — skipping")
+        print("embed.py: all outputs up to date — skipping")
         return 0
 
+    # --- Extract pages + paragraphs in one pass ---
     print("embed.py: extracting pages…")
     pages = []
+    paragraphs = []
 
     for html in sorted(SITE_DIR.rglob("*.html")):
-        page = extract(html)
-        if page:
-            pages.append(page)
+        page = extract_page(html)
+        if page is None:
+            continue
+        pages.append(page)
+        paragraphs.extend(extract_paragraphs(html, page["url"], page["title"]))
 
     if not pages:
         print("embed.py: no indexable pages found", file=sys.stderr)
         return 0
 
-    print(f"embed.py: embedding {len(pages)} pages with {MODEL_NAME}…")
-    model = SentenceTransformer(MODEL_NAME, trust_remote_code=True)
+    # --- Load model once for both tasks ---
+    print(f"embed.py: loading {MODEL_NAME}…")
+    model = SentenceTransformer(MODEL_NAME)
 
-    texts = [p["text"] for p in pages]
-    # nomic requires a task prefix; we used "search_document:" above for the
-    # corpus. For queries we'd use "search_query:" — but here both corpus and
-    # query are the same documents, so we use "search_document:" throughout.
-    embeddings = model.encode(
-        texts,
-        normalize_embeddings=True,  # unit vectors → inner product == cosine
+    # --- Similar-links (page level) ---
+    print(f"embed.py: embedding {len(pages)} pages…")
+    page_vecs = model.encode(
+        [p["text"] for p in pages],
+        normalize_embeddings=True,
         show_progress_bar=True,
-        batch_size=32,
-    )
-    embeddings = np.array(embeddings, dtype=np.float32)
+        batch_size=64,
+    ).astype(np.float32)
 
     print("embed.py: building FAISS index…")
-    dim = embeddings.shape[1]
-    index = faiss.IndexFlatIP(dim)  # exact inner product; fine for < 10k pages
-    index.add(embeddings)
+    index = faiss.IndexFlatIP(page_vecs.shape[1])
+    index.add(page_vecs)
+    scores_all, indices_all = index.search(page_vecs, TOP_N + 1)
 
-    print("embed.py: querying nearest neighbours…")
-    # Query all at once: returns (n_pages, TOP_N+1) — +1 because self is #1
-    scores_all, indices_all = index.search(embeddings, TOP_N + 1)
-
-    result: dict[str, list] = {}
+    similar: dict[str, list] = {}
     for i, page in enumerate(pages):
         neighbours = []
         for rank in range(TOP_N + 1):
-            j = int(indices_all[i, rank])
-            score = float(scores_all[i, rank])
-            if j == i:
-                continue  # skip self
-            if score < MIN_SCORE:
-                continue  # skip weak matches
-            neighbours.append({
-                "url": pages[j]["url"],
-                "title": pages[j]["title"],
-                "score": round(score, 4),
-            })
+            j, score = int(indices_all[i, rank]), float(scores_all[i, rank])
+            if j == i or score < MIN_SCORE:
+                continue
+            neighbours.append({"url": pages[j]["url"], "title": pages[j]["title"],
+                               "score": round(score, 4)})
             if len(neighbours) == TOP_N:
                 break
         if neighbours:
-            result[page["url"]] = neighbours
+            similar[page["url"]] = neighbours
 
-    OUT_FILE.parent.mkdir(parents=True, exist_ok=True)
-    OUT_FILE.write_text(json.dumps(result, ensure_ascii=False, indent=2))
-    print(f"embed.py: wrote {len(result)} entries to {OUT_FILE.relative_to(REPO_ROOT)}")
+    SIMILAR_OUT.parent.mkdir(parents=True, exist_ok=True)
+    SIMILAR_OUT.write_text(json.dumps(similar, ensure_ascii=False, indent=2))
+    print(f"embed.py: wrote {len(similar)} similar-links entries")
+
+    # --- Semantic index (paragraph level) ---
+    if not paragraphs:
+        print("embed.py: no paragraphs extracted — skipping semantic index")
+        return 0
+
+    print(f"embed.py: embedding {len(paragraphs)} paragraphs…")
+    para_vecs = model.encode(
+        [p["text"] for p in paragraphs],
+        normalize_embeddings=True,
+        show_progress_bar=True,
+        batch_size=64,
+    ).astype(np.float32)
+
+    SEMANTIC_BIN.write_bytes(para_vecs.tobytes())
+
+    meta = [{"url": p["url"], "title": p["title"],
+             "heading": p["heading"], "excerpt": p["excerpt"]}
+            for p in paragraphs]
+    SEMANTIC_META.write_text(json.dumps(meta, ensure_ascii=False))
+
+    print(f"embed.py: wrote {len(paragraphs)} paragraphs to semantic index "
+          f"({SEMANTIC_BIN.stat().st_size // 1024} KB)")
     return 0
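The paragraph index is stored as raw row-major float32 bytes (`para_vecs.tobytes()`) alongside a parallel JSON metadata list, so a consumer only needs NumPy to query it. A minimal sketch of the client side — the embedding dimension is an assumption here (it depends on `MODEL_NAME`, which this hunk doesn't pin down), as is the `top_k` helper name:

```python
import json
import numpy as np

def load_semantic_index(bin_path: str, meta_path: str, dim: int):
    """Load row-major float32 embeddings plus the parallel metadata list.

    `dim` must match the embedding model's output dimension (assumed known).
    """
    with open(meta_path, encoding="utf-8") as f:
        meta = json.load(f)
    vecs = np.fromfile(bin_path, dtype=np.float32).reshape(len(meta), dim)
    return vecs, meta

def top_k(query_vec: np.ndarray, vecs: np.ndarray, k: int = 5):
    """Cosine top-k. Stored rows are already unit vectors (normalize_embeddings=True),
    so after normalizing the query, inner product equals cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = vecs @ q
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]
```

The query vector would come from the same model with `normalize_embeddings=True`; the ranking step itself is just a matrix-vector product.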
@@ -0,0 +1,401 @@
#!/usr/bin/env python3
"""
import-poetry.py — Import a poetry collection from a Project Gutenberg plain-text file.

Produces:
    content/poetry/{collection-slug}/index.md        Collection index page
    content/poetry/{collection-slug}/{poem-slug}.md  One file per poem

Usage:
    python tools/import-poetry.py gutenberg.txt \\
        --poet "William Shakespeare" \\
        --collection "Sonnets" \\
        --date 1609 \\
        --title-prefix "Sonnet" \\
        --tags poetry,english \\
        [--slug shakespeare-sonnets] \\
        [--interactive] \\
        [--dry-run] \\
        [--overwrite]

The --title-prefix controls per-poem title generation:
    "Sonnet" → "Sonnet 1", "Sonnet 2", ..., slug "sonnet-1", "sonnet-2"
If omitted, defaults to the singular of --collection (strips trailing 's').
"""

import argparse
import re
import sys
from pathlib import Path
from typing import Optional

REPO_ROOT = Path(__file__).parent.parent
POETRY_DIR = REPO_ROOT / "content" / "poetry"

# ---------------------------------------------------------------------------
# Roman numeral conversion
# ---------------------------------------------------------------------------

_ROMAN_VALS = [
    ("M", 1000), ("CM", 900), ("D", 500), ("CD", 400),
    ("C", 100), ("XC", 90), ("L", 50), ("XL", 40),
    ("X", 10), ("IX", 9), ("V", 5), ("IV", 4), ("I", 1),
]


def roman_to_int(s: str) -> Optional[int]:
    s = s.upper().strip()
    i, result = 0, 0
    for numeral, value in _ROMAN_VALS:
        while s[i : i + len(numeral)] == numeral:
            result += value
            i += len(numeral)
    return result if i == len(s) and result > 0 else None


# Matches a line that is *solely* a Roman numeral (with an optional trailing period).
# Anchored; leading/trailing whitespace stripped by caller.
_ROMAN_RE = re.compile(
    r"^(M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3}))\.?$",
    re.IGNORECASE,
)

# ---------------------------------------------------------------------------
# Slug generation
# ---------------------------------------------------------------------------


def slugify(s: str) -> str:
    s = s.lower()
    s = re.sub(r"[^\w\s-]", "", s)
    s = re.sub(r"[\s_]+", "-", s)
    s = re.sub(r"-+", "-", s)
    return s.strip("-")

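The converter walks the value table greedily from the largest numeral down, which also validates the input: any unconsumed residue means the string wasn't a well-formed numeral. A standalone mirror of that logic (illustrative only, not the script's own function) behaves like:

```python
# Subtractive pairs (CM, XC, ...) must precede their components in the table
# so the greedy scan consumes them first.
_VALS = [("M", 1000), ("CM", 900), ("D", 500), ("CD", 400),
         ("C", 100), ("XC", 90), ("L", 50), ("XL", 40),
         ("X", 10), ("IX", 9), ("V", 5), ("IV", 4), ("I", 1)]

def roman_to_int_demo(s: str):
    """Greedy left-to-right match; returns None if any characters are left over."""
    s = s.upper().strip()
    i = result = 0
    for numeral, value in _VALS:
        while s[i:i + len(numeral)] == numeral:
            result += value
            i += len(numeral)
    return result if i == len(s) and result > 0 else None
```

Note the function alone is permissive (it accepts "IIII"); the stricter `_ROMAN_RE` is what rejects malformed headings before conversion.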
# ---------------------------------------------------------------------------
# Gutenberg parsing
# ---------------------------------------------------------------------------

_START_RE = re.compile(r"\*\*\* START OF THE PROJECT GUTENBERG", re.IGNORECASE)
_END_RE = re.compile(r"\*\*\* END OF THE PROJECT GUTENBERG", re.IGNORECASE)


def strip_gutenberg(text: str) -> tuple[str, str]:
    """Return (header, body) where body is the text between the PG markers."""
    lines = text.splitlines()
    start = 0
    end = len(lines)
    for i, line in enumerate(lines):
        if _START_RE.search(line):
            start = i + 1
            break
    for i, line in enumerate(lines):
        if _END_RE.search(line):
            end = i
            break
    header = "\n".join(lines[:start])
    body = "\n".join(lines[start:end])
    return header, body


def parse_gutenberg_meta(header: str) -> dict:
    meta: dict = {}
    for line in header.splitlines():
        for field in ("Title", "Author", "Release date", "Release Date"):
            if line.startswith(field + ":"):
                meta[field.lower().replace(" ", "-")] = line.split(":", 1)[1].strip()
    return meta


# ---------------------------------------------------------------------------
# Poem splitting
# ---------------------------------------------------------------------------


def split_poems(body: str) -> list[dict]:
    """
    Split body text into individual poems using Roman-numeral headings as
    boundaries. Returns a list of dicts:
        { number: int, roman: str, lines: list[str] }

    Lines are raw — call normalize_stanzas() before writing.
    """
    lines = body.splitlines()
    poems: list[dict] = []
    current: Optional[dict] = None

    for line in lines:
        stripped = line.strip()
        m = _ROMAN_RE.match(stripped)
        if m and stripped:  # empty stripped means blank line, not a heading
            number = roman_to_int(m.group(1))
            if number is not None:
                if current is not None and _has_content(current["lines"]):
                    poems.append(current)
                current = {"number": number, "roman": m.group(1).upper(), "lines": []}
                continue
        if current is not None:
            current["lines"].append(line)

    if current is not None and _has_content(current["lines"]):
        poems.append(current)

    return poems


def _has_content(lines: list[str], min_words: int = 4) -> bool:
    text = " ".join(l.strip() for l in lines if l.strip())
    return len(text.split()) >= min_words

# ---------------------------------------------------------------------------
# Stanza normalization
# ---------------------------------------------------------------------------


def normalize_stanzas(raw: list[str]) -> list[str]:
    """
    Strip common indentation, remove leading/trailing blank lines, collapse
    runs of more than one blank line to a single blank line (stanza break).
    """
    lines = [l.rstrip() for l in raw]

    # Trim leading/trailing blank lines
    while lines and not lines[0].strip():
        lines.pop(0)
    while lines and not lines[-1].strip():
        lines.pop()

    # Determine and strip common leading whitespace on content lines
    content = [l for l in lines if l.strip()]
    if content:
        indent = min(len(l) - len(l.lstrip()) for l in content)
        lines = [l[indent:] if len(l) >= indent else l for l in lines]

    # Collapse multiple consecutive blank lines to one
    out: list[str] = []
    prev_blank = False
    for l in lines:
        blank = not l.strip()
        if blank and prev_blank:
            continue
        out.append(l)
        prev_blank = blank

    # Final trim
    while out and not out[0].strip():
        out.pop(0)
    while out and not out[-1].strip():
        out.pop()

    return out


def first_content_line(lines: list[str]) -> str:
    for l in lines:
        if l.strip():
            return l.strip()
    return ""

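The heart of the normalization is the blank-run collapse, which turns however many blank lines the Gutenberg source used into exactly one stanza break. A minimal standalone sketch of just that pass (the dedent and trim passes are analogous; this helper name is illustrative, not part of the script):

```python
def collapse_blanks(lines: list[str]) -> list[str]:
    """Collapse each run of consecutive blank lines to a single blank line."""
    out: list[str] = []
    prev_blank = False
    for line in lines:
        blank = not line.strip()
        if blank and prev_blank:
            continue  # drop the 2nd, 3rd, ... blank in a run
        out.append(line)
        prev_blank = blank
    return out
```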
# ---------------------------------------------------------------------------
# YAML helpers
# ---------------------------------------------------------------------------


def yaml_str(s: str) -> str:
    """Quote a string for YAML if it needs it."""
    needs_quote = (
        not s
        or s[0] in " \t"
        or s[-1] in " \t"
        or any(c in s for c in ':{}[]|>&*!,#?@`\'"')
    )
    if needs_quote:
        return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return s


# ---------------------------------------------------------------------------
# File generation
# ---------------------------------------------------------------------------


def make_poem_file(
    poem: dict,
    title_prefix: str,
    poet: str,
    collection: str,
    collection_slug: str,
    date: str,
    tags: list[str],
) -> tuple[str, str]:
    """Return (filename_stem, markdown_content)."""
    title = f"{title_prefix} {poem['number']}"
    slug = slugify(title)
    norm = normalize_stanzas(poem["lines"])
    abstract = first_content_line(norm)
    tag_yaml = "[" + ", ".join(tags) + "]"
    col_url = f"/poetry/{collection_slug}/"

    fm = f"""\
---
title: {yaml_str(title)}
number: {poem['number']}
poet: {yaml_str(poet)}
collection: {yaml_str(collection)}
collection-url: {col_url}
date: {date}
tags: {tag_yaml}
abstract: {yaml_str(abstract)}
---

"""
    content = fm + "\n".join(norm) + "\n"
    return slug, content


def make_collection_index(
    collection: str,
    poet: str,
    date: str,
    tags: list[str],
    collection_slug: str,
    title_prefix: str,
    poems: list[dict],
) -> str:
    tag_yaml = "[" + ", ".join(tags) + "]"
    count = len(poems)
    abstract = f"{count} poem{'s' if count != 1 else ''}"

    poem_links = "\n".join(
        f"- [{title_prefix} {p['number']}](./{slugify(title_prefix + ' ' + str(p['number']))}.html)"
        for p in sorted(poems, key=lambda p: p["number"])
    )

    return f"""\
---
title: {yaml_str(collection)}
poet: {yaml_str(poet)}
date: {date}
tags: {tag_yaml}
abstract: {yaml_str(abstract)}
---

*{poet}* · {date}

{poem_links}
"""

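The quoting rule is deliberately conservative: any YAML-special character (or leading/trailing whitespace, or an empty string) forces double quotes, with backslashes escaped before quotes. A standalone mirror of the helper (illustrative only) shows the effect:

```python
def yaml_quote(s: str) -> str:
    """Double-quote a YAML scalar when it could be misparsed as plain style."""
    specials = ':{}[]|>&*!,#?@`\'"'
    needs = (
        not s
        or s[0] in " \t"
        or s[-1] in " \t"
        or any(c in s for c in specials)
    )
    if needs:
        # Escape order matters: backslashes first, then embedded double quotes.
        return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return s
```

A plain title like `Sonnet 18` passes through untouched, while a first line such as `Shall I compare thee: no` gets quoted because of the colon.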
# ---------------------------------------------------------------------------
# Interactive review
# ---------------------------------------------------------------------------


def interactive_review(poems: list[dict], title_prefix: str) -> list[dict]:
    approved: list[dict] = []
    total = len(poems)
    for idx, poem in enumerate(poems, 1):
        title = f"{title_prefix} {poem['number']}"
        preview = first_content_line(normalize_stanzas(poem["lines"]))
        n_lines = sum(1 for l in poem["lines"] if l.strip())

        print(f"\n{'─' * 60}")
        print(f" [{idx}/{total}] {poem['roman']}. → {title}")
        print(f" First line : {preview}")
        print(f" Body lines : {n_lines}")
        print()
        resp = input(" [Enter] include   s skip   q quit: ").strip().lower()
        if resp == "q":
            print("Stopped at user request.")
            break
        elif resp == "s":
            print(f" Skipped {title}.")
            continue
        approved.append(poem)

    return approved

# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Import a Gutenberg poetry collection into content/poetry/."
    )
    parser.add_argument("source", help="Path to the Gutenberg .txt file")
    parser.add_argument("--poet", required=True, help='e.g. "William Shakespeare"')
    parser.add_argument("--collection", required=True, help='e.g. "Sonnets"')
    parser.add_argument("--date", required=True, help="Publication year, e.g. 1609")
    parser.add_argument("--title-prefix", help='Per-poem title prefix, e.g. "Sonnet". Defaults to singular of --collection.')
    parser.add_argument("--tags", default="poetry", help="Comma-separated tags (default: poetry)")
    parser.add_argument("--slug", help="Override collection directory slug")
    parser.add_argument("--interactive", action="store_true", help="Review each poem before writing")
    parser.add_argument("--dry-run", action="store_true", help="Show what would be written; write nothing")
    parser.add_argument("--overwrite", action="store_true", help="Overwrite existing files")
    args = parser.parse_args()

    source = Path(args.source)
    if not source.exists():
        print(f"error: file not found: {source}", file=sys.stderr)
        sys.exit(1)

    # Defaults
    title_prefix = args.title_prefix or args.collection.rstrip("s")
    collection_slug = args.slug or slugify(f"{args.poet}-{args.collection}")
    tags = [t.strip() for t in args.tags.split(",")]
    out_dir = POETRY_DIR / collection_slug

    text = source.read_text(encoding="utf-8", errors="replace")
    header, body = strip_gutenberg(text)

    if not body.strip():
        print("warning: Gutenberg markers not found — treating entire file as body", file=sys.stderr)
        body = text

    poems = split_poems(body)

    if not poems:
        print("No poems detected. The file may not use Roman-numeral headings.", file=sys.stderr)
        print("First 50 lines of body:", file=sys.stderr)
        for ln in body.splitlines()[:50]:
            print(f"  {repr(ln)}", file=sys.stderr)
        sys.exit(1)

    print(f"Detected {len(poems)} poems · collection: {args.collection} · poet: {args.poet}")

    if args.interactive:
        poems = interactive_review(poems, title_prefix)
        print(f"\n{len(poems)} poem(s) approved for import.")

    if not poems:
        print("Nothing to write.")
        return

    # Build file map
    files: dict[Path, str] = {}
    for poem in poems:
        slug, content = make_poem_file(
            poem, title_prefix, args.poet, args.collection,
            collection_slug, args.date, tags,
        )
        files[out_dir / f"{slug}.md"] = content

    files[out_dir / "index.md"] = make_collection_index(
        args.collection, args.poet, args.date, tags,
        collection_slug, title_prefix, poems,
    )

    # Dry run
    if args.dry_run:
        print(f"\nDry run — {len(files)} file(s) → {out_dir.relative_to(REPO_ROOT)}/")
        for path in sorted(files):
            marker = " (exists)" if path.exists() else ""
            print(f"  {path.name}{marker}")
        print(f"\nSample — first poem:\n{'─' * 60}")
        first_content = next(v for k, v in files.items() if k.name != "index.md")
        print(first_content[:800])
        return

    # Write
    out_dir.mkdir(parents=True, exist_ok=True)
    written = skipped = 0
    for path, content in sorted(files.items()):
        if path.exists() and not args.overwrite:
            print(f"  skip  {path.name}")
            skipped += 1
        else:
            path.write_text(content, encoding="utf-8")
            print(f"  write {path.name}")
            written += 1

    print(f"\n{written} written, {skipped} skipped → {out_dir.relative_to(REPO_ROOT)}/")
    print("Next: make clean && make build")


if __name__ == "__main__":
    main()

@@ -0,0 +1,21 @@
#!/usr/bin/env bash
# refreeze.sh — Regenerate cabal.project.freeze after a pacman -Syu updates
# Haskell libraries. Run from anywhere inside the repo.
set -euo pipefail

REPO_ROOT="$(git -C "$(dirname "$0")" rev-parse --show-toplevel)"
FREEZE="$REPO_ROOT/cabal.project.freeze"

cd "$REPO_ROOT"

echo "==> Removing stale freeze file..."
rm -f "$FREEZE"

echo "==> Resolving dependencies and writing new freeze file..."
cabal freeze

echo "==> Verifying build..."
cabal build

echo ""
echo "Done. cabal.project.freeze updated."
uv.lock (11 lines removed)

@@ -171,15 +171,6 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl", hash = "sha256:85cef7cff222d8644161529808465972e51340599459b8ac3ccbac5a854e0d30", size = 8321, upload-time = "2023-10-07T05:32:16.783Z" },
 ]
 
-[[package]]
-name = "einops"
-version = "0.8.2"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/2c/77/850bef8d72ffb9219f0b1aac23fbc1bf7d038ee6ea666f331fa273031aa2/einops-0.8.2.tar.gz", hash = "sha256:609da665570e5e265e27283aab09e7f279ade90c4f01bcfca111f3d3e13f2827", size = 56261, upload-time = "2026-01-26T04:13:17.638Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/2a/09/f8d8f8f31e4483c10a906437b4ce31bdf3d6d417b73fe33f1a8b59e34228/einops-0.8.2-py3-none-any.whl", hash = "sha256:54058201ac7087911181bfec4af6091bb59380360f069276601256a76af08193", size = 65638, upload-time = "2026-01-26T04:13:18.546Z" },
-]
-
 [[package]]
 name = "faiss-cpu"
 version = "1.13.2"

@@ -501,7 +492,6 @@ source = { virtual = "." }
 dependencies = [
     { name = "altair" },
     { name = "beautifulsoup4" },
-    { name = "einops" },
     { name = "faiss-cpu" },
     { name = "matplotlib" },
     { name = "numpy" },

@@ -514,7 +504,6 @@ dependencies = [
 requires-dist = [
     { name = "altair", specifier = ">=5.4" },
     { name = "beautifulsoup4", specifier = ">=4.12" },
-    { name = "einops", specifier = ">=0.8" },
     { name = "faiss-cpu", specifier = ">=1.9" },
     { name = "matplotlib", specifier = ">=3.9" },
     { name = "numpy", specifier = ">=2.0" },