GPG signing, embedding pipeline, visualization filter, search timing, sig popups
- GPG page signing: dedicated signing subkey in ~/.gnupg-signing, sign-site.sh walks _site/**/*.html producing .sig files, preset-signing-passphrase.sh caches passphrase via gpg-preset-passphrase; make sign target; make deploy chains it
- Footer sig link: $url$.sig with hover popup showing ASCII armor (popups.js sigContent provider; .footer-sig-link bound explicitly to bypass footer exclusion)
- Public key at static/gpg/pubkey.asc
- Embedding pipeline: tools/embed.py encodes _site pages with nomic-embed-text-v1.5 + FAISS IndexFlatIP, writes data/similar-links.json; staleness check skips when JSON is newer than all HTML; make build invokes via uv, skips gracefully if .venv absent
- SimilarLinks.hs: similarLinksField loads similar-links.json with Hakyll dependency tracking; renders Related section in page-footer.html
- uv environment: pyproject.toml + uv.lock (CPU-only torch via pytorch-cpu index)
- Visualization filter: Filters/Viz.hs runs Python scripts for .figure (SVG) and .visualization (Vega-Lite JSON) fenced divs; viz.js renders with monochrome config and MutationObserver dark-mode re-render; viz.css layout
- Search timing: #search-timing element shows elapsed ms via MutationObserver
- Build telemetry timestamps removed from git tracking (now in .gitignore)
- spec.md updated to v9; WRITING.md updated with viz, related, signing, build docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent a94e82db98
commit 5cfbfbc0ef
@@ -9,11 +9,13 @@ _cache/
 *.swp
 *.swo
 
-# Data files that are generated or external (not version-controlled)
+# Data files that are generated at build time (not version-controlled)
 data/embeddings.json
 data/similar-links.json
 data/backlinks.json
 data/build-stats.json
+data/build-start.txt
+data/last-build-seconds.txt
 
 # IGNORE.txt is for the local build and need not be synced.
 IGNORE.txt
|
||||
|
|
@ -0,0 +1 @@
|
|||
3.14
|
||||
12
Makefile
|
|
@@ -1,4 +1,4 @@
-.PHONY: build deploy watch clean dev
+.PHONY: build deploy sign watch clean dev
 
 # Source .env for GITHUB_TOKEN and GITHUB_REPO if it exists.
 # .env format: KEY=value (one per line, no `export` prefix, no quotes needed).
|
||||
|
|
@@ -11,12 +11,20 @@ build:
 	@date +%s > data/build-start.txt
 	cabal run site -- build
 	pagefind --site _site
+	@if [ -d .venv ]; then \
+	    uv run python tools/embed.py || echo "Warning: embedding failed — data/similar-links.json not updated (build continues)"; \
+	else \
+	    echo "Embedding skipped: run 'uv sync' to enable similar-links (build continues)"; \
+	fi
 	> IGNORE.txt
 	@BUILD_END=$$(date +%s); \
 	BUILD_START=$$(cat data/build-start.txt); \
 	echo $$((BUILD_END - BUILD_START)) > data/last-build-seconds.txt
 
-deploy: build
+sign:
+	@./tools/sign-site.sh
+
+deploy: build sign
 	@if [ -z "$(GITHUB_TOKEN)" ] || [ -z "$(GITHUB_REPO)" ]; then \
 	    echo "Skipping GitHub push: set GITHUB_TOKEN and GITHUB_REPO in .env"; \
 	else \
|
||||
|
|
|
|||
96
WRITING.md
|
|
@ -708,6 +708,80 @@ Wiktionary. **Here** opens the Pagefind search page pre-filled with the selectio
|
|||
|
||||
---
|
||||
|
||||
## Visualizations
|
||||
|
||||
Two types of figure are supported, authored as fenced divs. The Python script
|
||||
runs at build time via the Pandoc filter; no client-side computation is needed
|
||||
for static figures.
|
||||
|
||||
### Static figures (matplotlib)
|
||||
|
||||
```markdown
|
||||
::: {.figure script="figures/my-plot.py" caption="Caption text."}
|
||||
:::
|
||||
```
|
||||
|
||||
The script path is resolved relative to the source file's directory. It should
|
||||
import `viz_theme` from `tools/` and write SVG to stdout:
|
||||
|
||||
```python
|
||||
import sys
|
||||
sys.path.insert(0, 'tools')
|
||||
from viz_theme import apply_monochrome, save_svg
|
||||
import matplotlib.pyplot as plt
|
||||
|
||||
apply_monochrome()
|
||||
fig, ax = plt.subplots()
|
||||
ax.plot([1, 2, 3], [1, 4, 9])
|
||||
save_svg(fig)
|
||||
```
|
||||
|
||||
`apply_monochrome()` sets transparent backgrounds and pure black elements so
|
||||
the figure inherits the page's dark/light mode via CSS `currentColor`.
|
||||
Multi-series charts should use `LINESTYLE_CYCLE` instead of color:
|
||||
|
||||
```python
|
||||
from viz_theme import apply_monochrome, save_svg, LINESTYLE_CYCLE
|
||||
apply_monochrome()
|
||||
fig, ax = plt.subplots()
|
||||
for i, style in enumerate(LINESTYLE_CYCLE[:3]):
|
||||
ax.plot(x, data[i], **style, label=f"Series {i+1}")
|
||||
save_svg(fig)
|
||||
```
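For orientation, the static-figure path in `Filters.Viz` can be approximated in Python: run the script, capture stdout, and rewrite black fills and strokes to `currentColor`. This is a sketch under stated assumptions, not the filter itself: `run_figure_script` and `process_colors` are illustrative names, and it uses `sys.executable` where the filter invokes `python3`.

```python
import subprocess
import sys
from pathlib import Path

# Spellings of black that get rewritten; the six-digit form comes first so
# that "fill:#000" cannot match the front of "fill:#000000".
_BLACKS = ("#000000", "#000", "black")


def process_colors(svg: str) -> str:
    """Replace hardcoded black fill/stroke values with currentColor."""
    for prop in ("fill", "stroke"):
        for black in _BLACKS:
            svg = svg.replace(f'{prop}="{black}"', f'{prop}="currentColor"')
            svg = svg.replace(f"{prop}:{black}", f"{prop}:currentColor")
    return svg


def run_figure_script(base_dir: Path, script: str) -> str:
    """Run a figure script (path relative to base_dir) and return themed SVG."""
    proc = subprocess.run(
        [sys.executable, str(base_dir / script)],  # assumption: same interpreter
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        # Mirror the filter's behaviour of surfacing stderr as the error text.
        raise RuntimeError(proc.stderr or "non-zero exit")
    return process_colors(proc.stdout)
```

Running a script by hand this way is a quick check that it prints a complete SVG document before it is wired into a page.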
|
||||
|
||||
### Interactive figures (Altair / Vega-Lite)
|
||||
|
||||
```markdown
|
||||
::: {.visualization script="figures/my-chart.py" caption="Caption text."}
|
||||
:::
|
||||
```
|
||||
|
||||
The script outputs Vega-Lite JSON to stdout:
|
||||
|
||||
```python
|
||||
import sys, json
|
||||
import altair as alt
|
||||
import pandas as pd
|
||||
|
||||
df = pd.read_csv('figures/data.csv')
|
||||
chart = alt.Chart(df).mark_line().encode(x='year:O', y='value:Q')
|
||||
print(json.dumps(chart.to_dict()))
|
||||
```
|
||||
|
||||
The site's monochrome Vega config is applied automatically, overriding the
|
||||
spec's own `config`. Dark mode re-renders automatically.
|
||||
|
||||
Add `viz: true` to the frontmatter of any page using `.visualization` divs —
|
||||
this loads the Vega CDN scripts:
|
||||
|
||||
```yaml
|
||||
viz: true
|
||||
```
|
||||
|
||||
Pages with only static `.figure` divs do not need `viz: true`.
|
||||
|
||||
---
|
||||
|
||||
## Page footer sections
|
||||
|
||||
Essays get a structured footer. Sections with no data are hidden.
|
||||
|
|
@ -719,18 +793,24 @@ Essays get a structured footer. Sections with no data are hidden.
|
|||
| Bibliography | At least one inline citation |
|
||||
| Further Reading | `further-reading` key is present |
|
||||
| Backlinks | Other pages link to this page |
|
||||
| Related | Similar pages exist (embedding-based; computed at build time) |
|
||||
|
||||
Backlinks are auto-generated at build time. No markup needed — any internal
|
||||
link from another page creates an entry here, showing the source title and the
|
||||
surrounding paragraph as context.
|
||||
|
||||
Related pages are computed by `tools/embed.py` using semantic embeddings
|
||||
(`nomic-embed-text-v1.5` + FAISS). Runs automatically during `make build`
|
||||
when the Python environment is set up (`uv sync`). No markup needed.
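
As a rough illustration of the ranking step: FAISS `IndexFlatIP` performs exact inner-product search, which equals cosine similarity when the vectors are unit-normalised. The sketch below replaces both the embedding model and FAISS with plain Python lists, so it shows only how neighbours are ranked. `similar_links`, the `k` default, and the input shape are assumptions, and whether `tools/embed.py` normalises its vectors is not stated here. The output mirrors the shape of `data/similar-links.json` (`url`, `title`, `score`).

```python
import math


def normalise(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def similar_links(pages, k=5):
    """Map each page URL to its k nearest neighbours by inner product.

    pages: list of dicts with 'url', 'title', and 'vector' keys
    (hypothetical input shape for this sketch).
    """
    unit = [normalise(p["vector"]) for p in pages]
    result = {}
    for i, page in enumerate(pages):
        scored = [
            (sum(a * b for a, b in zip(unit[i], unit[j])), other)
            for j, other in enumerate(pages)
            if j != i  # a page is never its own neighbour
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        result[page["url"]] = [
            {"url": o["url"], "title": o["title"], "score": round(s, 4)}
            for s, o in scored[:k]
        ]
    return result
```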
|
||||
|
||||
---
|
||||
|
||||
## Build
|
||||
|
||||
```bash
|
||||
make build # auto-commit content/, compile, run pagefind, clear IGNORE.txt
|
||||
make deploy # build + optional GitHub push + rsync to VPS
|
||||
make build # auto-commit content/, compile, run pagefind + embeddings, clear IGNORE.txt
|
||||
make sign # GPG detach-sign every _site/**/*.html → .html.sig (requires passphrase cached)
|
||||
make deploy # build + sign + optional GitHub push + rsync to VPS
|
||||
make watch # Hakyll live-reload dev server at http://localhost:8000
|
||||
make dev # clean build + python HTTP server at http://localhost:8000
|
||||
make clean # wipe _site/ and _cache/
|
||||
|
|
@ -743,3 +823,15 @@ Haskell-side changes.
|
|||
|
||||
`make deploy` pushes to GitHub if `GITHUB_TOKEN` and `GITHUB_REPO` are set in
|
||||
`.env` (see `.env.example`), then rsyncs `_site/` to the VPS.
|
||||
|
||||
**GPG signing:** `make sign` and `make deploy` require the signing subkey
|
||||
passphrase to be cached. Run once per boot (or per 24h expiry):
|
||||
|
||||
```bash
|
||||
./tools/preset-signing-passphrase.sh
|
||||
```
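
To make the signing step concrete, here is a hedged Python sketch of what a site-signing pass might look like: find every HTML page, then produce an ASCII-armored detached signature next to it. It is not `tools/sign-site.sh`; the function names, the `homedir` argument, and the `key_id` placeholder are assumptions. The `gpg` options used (`--homedir`, `--batch`, `--yes`, `--local-user`, `--armor`, `--detach-sign`, `--output`) are standard GnuPG flags.

```python
import subprocess
from pathlib import Path


def html_pages(site_dir):
    """Every .html file under site_dir, recursively, in a stable order."""
    return sorted(Path(site_dir).rglob("*.html"))


def signature_path(page):
    """page.html -> page.html.sig, written next to the page."""
    return page.with_name(page.name + ".sig")


def gpg_command(page, homedir, key_id):
    """Argument list for one ASCII-armored detached signature."""
    return [
        "gpg", "--homedir", str(homedir),
        "--batch", "--yes",          # non-interactive; overwrite old .sig files
        "--local-user", key_id,      # hypothetical key identifier
        "--armor", "--detach-sign",
        "--output", str(signature_path(page)),
        str(page),
    ]


def sign_site(site_dir, homedir, key_id):
    """Sign every page; relies on the passphrase already being cached."""
    for page in html_pages(site_dir):
        subprocess.run(gpg_command(page, homedir, key_id), check=True)
```

Because `--batch` disables prompting, a run like this fails fast when the passphrase is not cached, which is why the preset script has to be run first.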
|
||||
|
||||
**Python environment:** the embedding pipeline requires `uv sync` to be run
|
||||
once. After that, `make build` invokes `uv run python tools/embed.py`
|
||||
automatically. If `.venv` is absent, the step is skipped with a warning and
|
||||
the build continues normally.
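
The embedding step also skips itself when its output is already newer than every built page. A minimal sketch of such a staleness check follows; `is_stale` is a hypothetical name and the real logic in `tools/embed.py` may differ in detail.

```python
from pathlib import Path


def is_stale(output_json, site_dir):
    """True when output_json is missing or not newer than every HTML page."""
    out = Path(output_json)
    if not out.exists():
        return True  # nothing generated yet, so the step must run
    out_mtime = out.stat().st_mtime
    # Stale as soon as any page was written at or after the JSON.
    return any(
        page.stat().st_mtime >= out_mtime
        for page in Path(site_dir).rglob("*.html")
    )
```

A caller would run the encoder only when `is_stale("data/similar-links.json", "_site")` returns true.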
|
||||
|
|
|
|||
|
|
@ -25,6 +25,7 @@ import Utils (wordCount, readingTime, escapeHtml)
|
|||
import Filters (applyAll, preprocessSource)
|
||||
import qualified Citations
|
||||
import qualified Filters.Score as Score
|
||||
import qualified Filters.Viz as Viz
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Reader / writer options
|
||||
|
|
@@ -146,13 +147,17 @@ essayCompilerWith rOpts = do
     (pandocWithCites, bibHtml, furtherHtml) <- unsafeCompiler $
         Citations.applyCitations frKeys (itemBody pandocItem)
 
-    -- Inline SVG score fragments (reads SVG files relative to the source file).
+    -- Inline SVG score fragments and data visualizations (both read files
+    -- relative to the source file's directory).
     filePath <- getResourceFilePath
+    let srcDir = takeDirectory filePath
     pandocWithScores <- unsafeCompiler $
-        Score.inlineScores (takeDirectory filePath) pandocWithCites
+        Score.inlineScores srcDir pandocWithCites
+    pandocWithViz <- unsafeCompiler $
+        Viz.inlineViz srcDir pandocWithScores
 
     -- Apply remaining AST-level filters (sidenotes, smallcaps, links, etc.).
-    let pandocFiltered = applyAll pandocWithScores
+    let pandocFiltered = applyAll pandocWithViz
     let pandocItem' = itemSetBody pandocFiltered pandocItem
 
     -- Build TOC from the filtered AST.
||||
|
|
|
|||
|
|
@@ -22,9 +22,10 @@ import Text.Read (readMaybe)
 import qualified Data.Text as T
 import Hakyll
 import Hakyll.Core.Metadata (lookupStringList)
-import Authors (authorLinksField)
-import Backlinks (backlinksField)
-import Stability (stabilityField, lastReviewedField, versionHistoryField)
+import Authors (authorLinksField)
+import Backlinks (backlinksField)
+import SimilarLinks (similarLinksField)
+import Stability (stabilityField, lastReviewedField, versionHistoryField)
 import Tags (tagLinksField)
 
 -- ---------------------------------------------------------------------------
|
||||
|
|
@ -167,6 +168,7 @@ essayCtx =
|
|||
<> bibliographyField
|
||||
<> furtherReadingField
|
||||
<> backlinksField
|
||||
<> similarLinksField
|
||||
<> epistemicCtx
|
||||
<> versionHistoryField
|
||||
<> dateField "date-created" "%-d %B %Y"
|
||||
|
|
@ -183,6 +185,7 @@ postCtx :: Context String
|
|||
postCtx =
|
||||
authorLinksField
|
||||
<> backlinksField
|
||||
<> similarLinksField
|
||||
<> dateField "date" "%-d %B %Y"
|
||||
<> dateField "date-iso" "%Y-%m-%d"
|
||||
<> constField "math" "true"
|
||||
|
|
|
|||
|
|
@ -0,0 +1,182 @@
|
|||
{-# LANGUAGE GHC2021 #-}
|
||||
{-# LANGUAGE OverloadedStrings #-}
|
||||
-- | Inline data visualizations into the Pandoc AST.
|
||||
--
|
||||
-- Two fenced-div classes are recognized in Markdown:
|
||||
--
|
||||
-- __Static figure__ (Matplotlib → SVG, no client-side JS required):
|
||||
--
|
||||
-- > ::: {.figure script="figures/myplot.py" caption="Caption text"}
|
||||
-- > :::
|
||||
--
|
||||
-- Runs the Python script; stdout must be an SVG document with a
|
||||
-- transparent background. Black fills and strokes are replaced with
|
||||
-- @currentColor@ so figures adapt to dark mode automatically.
|
||||
-- See @tools/viz_theme.py@ for the recommended matplotlib setup.
|
||||
--
|
||||
-- __Interactive figure__ (Altair/Vega-Lite → JSON spec):
|
||||
--
|
||||
-- > ::: {.visualization script="figures/myplot.py" caption="Caption text"}
|
||||
-- > :::
|
||||
--
|
||||
-- Runs the Python script; stdout must be a Vega-Lite JSON spec. The spec
|
||||
-- is embedded verbatim inside a @\<script type=\"application\/json\"\>@ tag;
|
||||
-- @viz.js@ picks it up and renders it via Vega-Embed, applying a
|
||||
-- monochrome theme that responds to the site\'s light/dark toggle.
|
||||
--
|
||||
-- __Authoring conventions:__
|
||||
--
|
||||
-- * Scripts run with the project root as their working directory, so paths used inside a script (data files, imports) are relative to the root.
|
||||
-- * @script=@ paths are resolved relative to the source file\'s directory.
|
||||
-- * For @.figure@ scripts: use pure black (@#000000@) for all drawn
|
||||
-- elements and transparent backgrounds so @processColors@ and CSS
|
||||
-- @currentColor@ handle dark mode.
|
||||
-- * For @.visualization@ scripts: set encoding colours to @\"black\"@;
|
||||
-- @viz.js@ applies the site palette via Vega-Lite @config@.
|
||||
-- * Set @viz: true@ in the page\'s YAML frontmatter to load Vega JS.
|
||||
module Filters.Viz (inlineViz) where
|
||||
|
||||
import Control.Exception (IOException, catch)
|
||||
import Data.Maybe (fromMaybe)
|
||||
import qualified Data.Text as T
|
||||
import System.Exit (ExitCode (..))
|
||||
import System.FilePath ((</>))
|
||||
import System.Process (readProcessWithExitCode)
|
||||
import Text.Pandoc.Definition
|
||||
import Text.Pandoc.Walk (walkM)
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Public entry point
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Walk the Pandoc AST and inline all @.figure@ and @.visualization@ divs.
|
||||
-- @baseDir@ is the directory of the source file; @script=@ paths are
|
||||
-- resolved relative to it.
|
||||
inlineViz :: FilePath -> Pandoc -> IO Pandoc
|
||||
inlineViz baseDir = walkM (transformBlock baseDir)
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Block transformation
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
transformBlock :: FilePath -> Block -> IO Block
|
||||
transformBlock baseDir blk@(Div (_, cls, attrs) _)
|
||||
| "figure" `elem` cls = do
|
||||
result <- runScript baseDir attrs
|
||||
case result of
|
||||
Left err ->
|
||||
warn "figure" err >> return (errorBlock err)
|
||||
Right out ->
|
||||
let caption = attr "caption" attrs
|
||||
in return $ RawBlock (Format "html")
|
||||
(staticFigureHtml (processColors out) caption)
|
||||
| "visualization" `elem` cls = do
|
||||
result <- runScript baseDir attrs
|
||||
case result of
|
||||
Left err ->
|
||||
warn "visualization" err >> return (errorBlock err)
|
||||
Right out ->
|
||||
let caption = attr "caption" attrs
|
||||
in return $ RawBlock (Format "html")
|
||||
(interactiveFigureHtml (escScriptTag out) caption)
|
||||
| otherwise = return blk
|
||||
transformBlock _ b = return b
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Script execution
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Run @python3 <script>@. Returns the script\'s stdout on success, or an
|
||||
-- error message on failure (non-zero exit or missing @script=@ attribute).
|
||||
runScript :: FilePath -> [(T.Text, T.Text)] -> IO (Either String T.Text)
|
||||
runScript baseDir attrs =
|
||||
case lookup "script" attrs of
|
||||
Nothing -> return (Left "missing script= attribute")
|
||||
Just p -> do
|
||||
let fullPath = baseDir </> T.unpack p
|
||||
(ec, out, err) <-
|
||||
readProcessWithExitCode "python3" [fullPath] ""
|
||||
`catch` (\e -> return (ExitFailure 1, "", show (e :: IOException)))
|
||||
return $ case ec of
|
||||
ExitSuccess -> Right (T.pack out)
|
||||
ExitFailure _ -> Left (if null err then "non-zero exit" else err)
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- SVG colour post-processing (mirrors Filters.Score.processColors)
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Replace hardcoded black fill/stroke values with @currentColor@ so the
|
||||
-- embedded SVG inherits the CSS text colour in both light and dark modes.
|
||||
processColors :: T.Text -> T.Text
|
||||
processColors
|
||||
= T.replace "fill=\"#000\"" "fill=\"currentColor\""
|
||||
. T.replace "fill=\"black\"" "fill=\"currentColor\""
|
||||
. T.replace "stroke=\"#000\"" "stroke=\"currentColor\""
|
||||
. T.replace "stroke=\"black\"" "stroke=\"currentColor\""
|
||||
. T.replace "fill:#000" "fill:currentColor"
|
||||
. T.replace "fill:black" "fill:currentColor"
|
||||
. T.replace "stroke:#000" "stroke:currentColor"
|
||||
. T.replace "stroke:black" "stroke:currentColor"
|
||||
. T.replace "fill=\"#000000\"" "fill=\"currentColor\""
|
||||
. T.replace "stroke=\"#000000\"" "stroke=\"currentColor\""
|
||||
. T.replace "fill:#000000" "fill:currentColor"
|
||||
. T.replace "stroke:#000000" "stroke:currentColor"
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- JSON safety for <script> embedding
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Replace @<\/@ with the JSON Unicode escape @\u003c\/@ so that Vega-Lite
|
||||
-- JSON embedded inside a @\<script\>@ tag cannot accidentally close it.
|
||||
-- JSON.parse decodes the escape back to @<\/@ transparently.
|
||||
escScriptTag :: T.Text -> T.Text
|
||||
escScriptTag = T.replace "</" "\\u003c/"
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- HTML output
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
staticFigureHtml :: T.Text -> T.Text -> T.Text
|
||||
staticFigureHtml svgContent caption = T.concat
|
||||
[ "<figure class=\"viz-figure\">"
|
||||
, svgContent
|
||||
, if T.null caption then ""
|
||||
else "<figcaption class=\"viz-caption\">" <> escHtml caption <> "</figcaption>"
|
||||
, "</figure>"
|
||||
]
|
||||
|
||||
interactiveFigureHtml :: T.Text -> T.Text -> T.Text
|
||||
interactiveFigureHtml jsonSpec caption = T.concat
|
||||
[ "<figure class=\"viz-interactive\">"
|
||||
, "<div class=\"vega-container\">"
|
||||
, "<script type=\"application/json\" class=\"vega-spec\">"
|
||||
, jsonSpec
|
||||
, "</script>"
|
||||
, "</div>"
|
||||
, if T.null caption then ""
|
||||
else "<figcaption class=\"viz-caption\">" <> escHtml caption <> "</figcaption>"
|
||||
, "</figure>"
|
||||
]
|
||||
|
||||
errorBlock :: String -> Block
|
||||
errorBlock msg = RawBlock (Format "html") $ T.concat
|
||||
[ "<div class=\"viz-error\"><strong>Visualization error:</strong> "
|
||||
, escHtml (T.pack msg)
|
||||
, "</div>"
|
||||
]
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Helpers
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
attr :: T.Text -> [(T.Text, T.Text)] -> T.Text
|
||||
attr key kvs = fromMaybe "" (lookup key kvs)
|
||||
|
||||
warn :: String -> String -> IO ()
|
||||
warn kind msg = putStrLn $ "[Viz] " ++ kind ++ " error: " ++ msg
|
||||
|
||||
escHtml :: T.Text -> T.Text
|
||||
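-- Composition runs right to left, so @&@ is escaped first; escaping it last
-- would re-escape the ampersands introduced by the other entities.
escHtml = T.replace "\"" "&quot;"
        . T.replace ">" "&gt;"
        . T.replace "<" "&lt;"
        . T.replace "&" "&amp;"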
|
||||
|
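The `escScriptTag` step above is easy to check in isolation. The Python sketch below (illustrative only; `script_safe_json` is a hypothetical name, not part of the site's tooling) applies the same `</` replacement to serialized JSON and confirms that a JSON parser restores the original text, since `\u003c` is an ordinary JSON string escape for `<`.

```python
import json


def script_safe_json(spec):
    """Serialize spec so it cannot close an enclosing <script> element."""
    # "</" can only occur inside JSON strings, where \u003c is a valid escape.
    return json.dumps(spec).replace("</", "\\u003c/")


spec = {"title": "before </script> after"}
embedded = script_safe_json(spec)

# The serialized form contains no "</", and parsing restores the original.
assert "</" not in embedded
assert json.loads(embedded) == spec
```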
|
@ -0,0 +1,103 @@
|
|||
{-# LANGUAGE GHC2021 #-}
|
||||
{-# LANGUAGE OverloadedStrings #-}
|
||||
-- | Similar-links field: injects a "Related" list into essay/page contexts.
|
||||
--
|
||||
-- @data/similar-links.json@ is produced by @tools/embed.py@ at build time
|
||||
-- (called from the Makefile after pagefind, before sign). It is a plain
|
||||
-- JSON object mapping root-relative URL paths to lists of similar pages:
|
||||
--
|
||||
-- { "/essays/my-essay/": [{"url": "...", "title": "...", "score": 0.87}] }
|
||||
--
|
||||
-- This module loads that file with dependency tracking (so pages recompile
|
||||
-- when embeddings change) and provides @similarLinksField@, which resolves
|
||||
-- to an HTML list for the current page's URL.
|
||||
--
|
||||
-- If the file is absent (e.g. @.venv@ not set up, or first build) the field
|
||||
-- returns @noResult@ — the @$if(similar-links)$@ guard in the template is
|
||||
-- false and no "Related" section is rendered.
|
||||
module SimilarLinks (similarLinksField) where
|
||||
|
||||
import Data.Maybe (fromMaybe)
|
||||
import qualified Data.Map.Strict as Map
|
||||
import Data.Map.Strict (Map)
|
||||
import qualified Data.Text as T
|
||||
import qualified Data.Text.Encoding as TE
|
||||
import qualified Data.Aeson as Aeson
|
||||
import Hakyll
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- JSON schema
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
data SimilarEntry = SimilarEntry
|
||||
{ seUrl :: String
|
||||
, seTitle :: String
|
||||
, seScore :: Double
|
||||
} deriving (Show)
|
||||
|
||||
instance Aeson.FromJSON SimilarEntry where
|
||||
parseJSON = Aeson.withObject "SimilarEntry" $ \o ->
|
||||
SimilarEntry
|
||||
<$> o Aeson..: "url"
|
||||
<*> o Aeson..: "title"
|
||||
<*> o Aeson..: "score"
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- Context field
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
-- | Provides @$similar-links$@ (HTML list), which templates test with an
-- @$if(similar-links)$@ guard.
-- The field fails (via @fail@), leaving the guard false, when the JSON file
-- is absent or unparseable, or the current page has no similar entries.
|
||||
similarLinksField :: Context String
|
||||
similarLinksField = field "similar-links" $ \item -> do
|
||||
-- Load with dependency tracking — pages recompile when the JSON changes.
|
||||
slItem <- load (fromFilePath "data/similar-links.json") :: Compiler (Item String)
|
||||
case Aeson.decodeStrict (TE.encodeUtf8 (T.pack (itemBody slItem)))
|
||||
:: Maybe (Map T.Text [SimilarEntry]) of
|
||||
Nothing -> fail "similar-links: could not parse data/similar-links.json"
|
||||
Just slMap -> do
|
||||
mRoute <- getRoute (itemIdentifier item)
|
||||
case mRoute of
|
||||
Nothing -> fail "similar-links: item has no route"
|
||||
Just r ->
|
||||
let key = T.pack (normaliseUrl ("/" ++ r))
|
||||
entries = fromMaybe [] (Map.lookup key slMap)
|
||||
in if null entries
|
||||
then fail "no similar links"
|
||||
else return (renderSimilarLinks entries)
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- URL normalisation (mirrors embed.py's URL derivation)
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
normaliseUrl :: String -> String
|
||||
normaliseUrl url =
|
||||
let t = T.pack url
|
||||
-- strip query + fragment
|
||||
t1 = fst (T.breakOn "?" (fst (T.breakOn "#" t)))
|
||||
-- ensure leading slash
|
||||
t2 = if T.isPrefixOf "/" t1 then t1 else "/" `T.append` t1
|
||||
-- strip trailing index.html → keep the directory slash
|
||||
t3 = fromMaybe t2 (T.stripSuffix "index.html" t2)
|
||||
-- strip bare .html extension only for non-index pages
|
||||
t4 = fromMaybe t3 (T.stripSuffix ".html" t3)
|
||||
in T.unpack t4
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- HTML rendering
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
renderSimilarLinks :: [SimilarEntry] -> String
|
||||
renderSimilarLinks entries =
|
||||
"<ul class=\"similar-links-list\">\n"
|
||||
++ concatMap renderOne entries
|
||||
++ "</ul>"
|
||||
where
|
||||
renderOne se =
|
||||
"<li class=\"similar-links-item\">"
|
||||
++ "<a href=\"" ++ escapeHtml (seUrl se) ++ "\">"
|
||||
++ escapeHtml (seTitle se)
|
||||
++ "</a>"
|
||||
++ "</li>\n"
|
||||
|
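Since `normaliseUrl` has to agree with the keys written by `tools/embed.py`, it can help to see the same steps transliterated into Python. This is a sketch derived from the Haskell above, not the code in `embed.py`; the name `normalise_url` is an assumption, and `str.removesuffix` needs Python 3.9 or later.

```python
def normalise_url(url):
    """Derive the lookup key for a page from its route or URL."""
    # Strip the fragment first, then the query string.
    path = url.split("#", 1)[0].split("?", 1)[0]
    # Ensure a leading slash.
    if not path.startswith("/"):
        path = "/" + path
    # Drop a trailing index.html but keep the directory slash.
    path = path.removesuffix("index.html")
    # Drop a bare .html extension on non-index pages.
    return path.removesuffix(".html")
```

For example, a route of `essays/my-essay/index.html` maps to the key `/essays/my-essay/`, matching the JSON example in the module header.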
|
@ -81,6 +81,10 @@ rules = do
|
|||
route idRoute
|
||||
compile copyFileCompiler
|
||||
|
||||
-- Similar links — produced by tools/embed.py; absent on first build or
|
||||
-- when .venv is not set up. Compiled as a raw string for similarLinksField.
|
||||
match "data/similar-links.json" $ compile getResourceBody
|
||||
|
||||
-- Commonplace YAML — compiled as a raw string so it can be loaded
|
||||
-- with dependency tracking by the commonplace page compiler.
|
||||
match "data/commonplace.yaml" $ compile getResourceBody
|
||||
|
|
|
|||
|
|
@ -1 +0,0 @@
|
|||
1773865704
|
||||
|
|
@ -1 +0,0 @@
|
|||
1
|
||||
|
|
@ -17,6 +17,7 @@ executable site
|
|||
Catalog
|
||||
Commonplace
|
||||
Backlinks
|
||||
SimilarLinks
|
||||
Compilers
|
||||
Contexts
|
||||
Stats
|
||||
|
|
@ -36,6 +37,7 @@ executable site
|
|||
Filters.Code
|
||||
Filters.Images
|
||||
Filters.Score
|
||||
Filters.Viz
|
||||
Utils
|
||||
build-depends:
|
||||
base >= 4.18 && < 5,
|
||||
|
|
|
|||
|
|
@ -0,0 +1,29 @@
|
|||
[project]
|
||||
name = "levineuwirth-tools"
|
||||
version = "0.1.0"
|
||||
description = "Build-time tooling for levineuwirth.org"
|
||||
requires-python = ">=3.12"
|
||||
dependencies = [
|
||||
# Visualization
|
||||
"matplotlib>=3.9",
|
||||
"altair>=5.4",
|
||||
|
||||
# Embedding pipeline
|
||||
"sentence-transformers>=3.4",
|
||||
"faiss-cpu>=1.9",
|
||||
"numpy>=2.0",
|
||||
"beautifulsoup4>=4.12",
|
||||
"einops>=0.8",
|
||||
# CPU-only torch — avoids pulling ~3 GB of CUDA libraries
|
||||
"torch>=2.5",
|
||||
]
|
||||
|
||||
[[tool.uv.index]]
|
||||
name = "pytorch-cpu"
|
||||
url = "https://download.pytorch.org/whl/cpu"
|
||||
explicit = true
|
||||
|
||||
[tool.uv.sources]
|
||||
torch = [{ index = "pytorch-cpu" }]
|
||||
|
||||
[tool.uv]
|
||||
67
spec.md
|
|
@ -1,7 +1,7 @@
|
|||
# levineuwirth.org — Design Specification v8
|
||||
# levineuwirth.org — Design Specification v9
|
||||
|
||||
**Author:** Levi Neuwirth
|
||||
**Date:** March 2026 (v8: 16 March 2026)
|
||||
**Date:** March 2026 (v9: 20 March 2026)
|
||||
**Status:** LIVING DOCUMENT — Updated as implementation progresses.
|
||||
|
||||
---
|
||||
|
|
@ -175,11 +175,19 @@ rsync -avz --delete \
|
|||
|
||||
```makefile
|
||||
build:
|
||||
@git add content/
|
||||
@git diff --cached --quiet || git commit -m "auto: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
@date +%s > data/build-start.txt
|
||||
cabal run site -- build
|
||||
pagefind --site _site
|
||||
> IGNORE.txt # clear stability pins after each build
|
||||
@BUILD_END=$(date +%s); BUILD_START=$(cat data/build-start.txt); \
|
||||
echo $((BUILD_END - BUILD_START)) > data/last-build-seconds.txt
|
||||
|
||||
deploy: build
|
||||
sign:
|
||||
@./tools/sign-site.sh # detach-sign every _site/**/*.html; requires passphrase cached via preset-signing-passphrase.sh
|
||||
|
||||
deploy: build sign
|
||||
rsync -avz --delete _site/ vps:/var/www/levineuwirth.org/
|
||||
|
||||
watch:
|
||||
|
|
@ -255,6 +263,7 @@ levineuwirth.org/
|
|||
│ │ ├── popups.css # Link preview popup styles
|
||||
│ │ ├── syntax.css # Monochrome code highlighting (JetBrains Mono)
|
||||
│ │ ├── components.css # Nav (incl. settings panel), TOC, metadata, citations, collapsibles
|
||||
│ │ ├── viz.css # Visualization figure layout (.viz-figure, .vega-container, .viz-caption)
|
||||
│ │ ├── gallery.css # Exhibit system + annotation callouts
|
||||
│ │ ├── selection-popup.css # Text-selection toolbar
|
||||
│ │ ├── annotations.css # User highlight marks + annotation tooltip
|
||||
|
|
@ -275,9 +284,12 @@ levineuwirth.org/
|
|||
│ │ ├── selection-popup.js # Context-aware text-selection toolbar
|
||||
│ │ ├── annotations.js # localStorage highlight/annotation engine (UI deferred)
|
||||
│ │ ├── score-reader.js # Score reader: page-turn, movement jumps, deep linking
|
||||
│ │ ├── search.js # Pagefind UI init + ?q= pre-fill
|
||||
│ │ ├── viz.js # Vega-Lite render + dark mode re-render via MutationObserver
|
||||
│ │ ├── search.js # Pagefind UI init + ?q= pre-fill + search timing (#search-timing)
|
||||
│ │ └── prism.min.js # Syntax highlighting
|
||||
│ ├── fonts/ # Self-hosted WOFF2 (subsetted with OT features)
|
||||
│ ├── gpg/
|
||||
│ │ └── pubkey.asc # Ed25519 signing subkey public key (master: CD90AE96…; subkey: C9A42A6F…)
|
||||
│ └── images/
|
||||
│ └── link-icons/ # SVG icons for external link classification
|
||||
│ ├── external.svg
|
||||
|
|
@ -320,7 +332,8 @@ levineuwirth.org/
|
|||
│ │ ├── Math.hs # Simple LaTeX → Unicode conversion
|
||||
│ │ ├── Code.hs # Prepend language- prefix for Prism.js
|
||||
│ │ ├── Images.hs # Lazy loading, lightbox data-attributes
|
||||
│ │ └── Score.hs # Score fragment SVG inlining + currentColor replacement
|
||||
│ │ ├── Score.hs # Score fragment SVG inlining + currentColor replacement
|
||||
│ │ └── Viz.hs # Visualization IO filter: runs Python scripts, inlines SVG / Vega-Lite JSON
|
||||
│ ├── Authors.hs # Author-as-tag system (slugify, authorLinksField, author pages)
|
||||
│ ├── Backlinks.hs # Two-pass build-time backlinks with context paragraph extraction
|
||||
│ ├── Catalog.hs # Music catalog: featured works + grouped-by-category HTML rendering
|
||||
|
|
@ -334,7 +347,10 @@ levineuwirth.org/
|
|||
│ ├── chicago-notes.csl # CSL style (in-text, Chicago Author-Date)
|
||||
│ └── (future: embeddings.json, similar-links.json)
|
||||
├── tools/
|
||||
│ └── subset-fonts.sh
|
||||
│ ├── subset-fonts.sh
|
||||
│ ├── viz_theme.py # Matplotlib monochrome helpers (apply_monochrome, save_svg, LINESTYLE_CYCLE)
|
||||
│ ├── sign-site.sh # Detach-sign every _site/**/*.html → .html.sig (called by `make sign`)
|
||||
│ └── preset-signing-passphrase.sh # Cache signing subkey passphrase in gpg-agent (run once per boot)
|
||||
├── levineuwirth.cabal
|
||||
├── cabal.project
|
||||
├── cabal.project.freeze
|
||||
|
|
@ -372,7 +388,7 @@ levineuwirth.org/
|
|||
|
||||
### Phase 3: Rich Interactions
|
||||
- [x] Link preview popups (`popups.js`) — internal page previews (title, abstract, authors, tags, reading time), Wikipedia excerpts, citation previews; relative-URL fix for index pages
|
||||
- [x] Pagefind search (`/search.html`) — `search.js` pre-fills from `?q=` param so selection popup "Here" button lands ready
|
||||
- [x] Pagefind search (`/search.html`) — `search.js` pre-fills from `?q=` param; `#search-timing` shows elapsed ms (mono, faint) via `MutationObserver` on search results subtree
|
||||
- [x] Author system — authors treated as tags; `build/Authors.hs`; author pages at `/authors/{slug}/`; `authorLinksField` in all contexts; defaults to Levi Neuwirth
|
||||
- [x] Settings panel — `settings.js` + `settings.css` section in `components.css`; theme, text size (3 steps), focus mode, reduce motion, print; all state in `localStorage`; `theme.js` restores all settings before first paint
|
||||
- [x] Selection popup — `selection-popup.js` / `selection-popup.css`; context-aware toolbar appears 450 ms after text selection; see Implementation Notes
|
||||
|
|
@ -388,7 +404,7 @@ levineuwirth.org/
|
|||
- [x] Music section — score fragment system (A): inline SVG excerpts (motifs, passages) integrated into the gallery/exhibit system; named, TOC-listed, focusable in the shared overlay alongside equations; authored via `{.score-fragment score-name="..." score-caption="..."}` fenced-div; SVG inlined at build time by `Filters.Score`; black fills/strokes replaced with `currentColor` for dark-mode; see Implementation Notes
|
||||
- [x] Music section — composition landing pages + full score reader (C): two-URL architecture per composition; `/music/{slug}/` (rich prose landing page with movement list, audio players, inline score fragments) and `/music/{slug}/score/` (minimal dedicated reader); Hakyll `version "score-reader"` mechanism; `compositionCtx` with `slug`, `score-url`, `has-score`, `score-page-count`, `score-pages` list, `has-movements`, `movements` list (Aeson-parsed nested YAML); `score-reader-default.html` minimal shell; `score-reader.js` (page navigation, movement jumps, `?p=` deep linking, preloading, keyboard); `score-reader.css`; dark mode via `filter: invert(1)`; see Implementation Notes
|
||||
- [x] Accessibility audit — skip link, TOC collapsed-link tabbing (`visibility: hidden`), section-toggle focus visibility, lightbox/gallery/settings focus restoration, popup `aria-hidden`, metadata nav wrapping, footer `onclick` removal; settings panel focus-steal bug fixed (focus only returns to toggle when it was inside the panel, preventing interference with text-selection popup)
|
||||
- [ ] Visualization pipeline — matplotlib / Altair figures generated at build time; each visualization lives in its own directory (e.g. `content/viz/my-chart/`) alongside a `generate.py` and a versioned dataset; Hakyll rule invokes `python generate.py` to produce SVG/HTML output and copies it into `_site/`; datasets can be updated independently and graphs regenerate on next build
|
||||
- [~] Visualization pipeline — Pandoc filter approach (`Filters.Viz`): `.figure` fenced divs run `python3 <script>`, capture SVG stdout, inline with `currentColor` replacement; `.visualization` fenced divs embed Vega-Lite JSON in a `<script type="application/json" class="vega-spec">` tag rendered by `viz.js`; `viz: true` frontmatter gates CDN Vega/Vega-Lite/Vega-Embed + `viz.js`; dark mode re-renders via `MutationObserver`; `tools/viz_theme.py` provides matplotlib monochrome helpers. Infrastructure complete; not yet used in production content.
|
||||
- [ ] Content migration — migrate existing essays, poems, fiction, and music landing pages from prior formats into `content/`
|
||||
|
||||
### Phase 5: Infrastructure & Advanced
|
||||
|
|
@ -401,7 +417,7 @@ levineuwirth.org/
|
|||
|
||||
### Phase 6: Deferred Features
|
||||
- [ ] **Annotation UI** — The `annotations.js` / `annotations.css` infrastructure exists (localStorage storage, re-anchoring on load, four highlight colors, hover tooltip). The selection popup "Annotate" button was removed pending a design decision on the color-picker and note-entry UX. Revisit: a popover with four color swatches and an optional text field, triggered from the selection popup.
|
||||
- [ ] **Visualization pipeline** — Each visualization lives in `content/viz/{slug}/` alongside `generate.py` and a versioned dataset CSV/JSON. Hakyll rule: `unsafeCompiler (callProcess "python" ["generate.py"])` writes SVG/HTML output into the item body. Output is embedded in the page or served as a static asset. Datasets can be updated independently; graphs regenerate on next `make build`. Matplotlib for static figures; Altair for interactive (Vega-Lite JSON embedded, rendered client-side by Vega-Lite JS — loaded conditionally).
|
||||
- [~] **Visualization pipeline** — Implemented as a Pandoc IO filter (`Filters.Viz`), not a per-slug Hakyll rule. See Phase 4 entry and Implementation Notes. Infrastructure complete; production content pending.
|
||||
- [x] **Music catalog page** — `/music/` index listing all compositions grouped by instrumentation category (orchestral → chamber → solo → vocal → choral → electronic → other), with an optional Featured section. Auto-generated from composition frontmatter by `build/Catalog.hs`; renders HTML in Haskell (same pattern as backlinks). Category, year, duration, instrumentation, and ◼/♫ indicators for score/recording availability. `content/music/index.md` provides prose intro + abstract. Template: `templates/music-catalog.html`. CSS: `static/css/catalog.css`. Context: `musicCatalogCtx` (provides `catalog: true` flag, `featured-works`, `has-featured`, `catalog-by-category`).
|
||||
- [x] **Score reader swipe gestures** — `touchstart`/`touchend` listeners on `#score-reader-stage` with passive: true. Threshold: ≥ 50 px horizontal, < 30 px vertical drift. Left swipe → next page; right swipe → previous page.
|
||||
- [x] **Full-piece audio on composition pages** — `recording` frontmatter key (path relative to the composition directory). Rendered as a full-width `<audio>` player in `composition.html`, above the per-movement list. Styled via `.comp-recording` / `.comp-recording-audio` in `components.css`. Per-movement `<audio>` players and `.comp-btn` / `.comp-movement-*` styles also added in the same pass.
|
||||
|
|
@ -414,7 +430,7 @@ levineuwirth.org/
|
|||
- [x] **Memento mori** — Implemented at `/memento-mori/` as a full standalone page. 90×52 grid of weeks anchored to birthday anniversaries (nested year/week loop via `setFullYear`; week 52 stretched to eve of next birthday to absorb 365th/366th days). Week popup shows dynamic day-count and locale-derived day names. Score fragment (bassoon, `content/memento-mori/scores/bsn.svg`) inlined via `Filters.Score`. Linked from footer (MM).
|
||||
- [ ] **Embedding-powered similar links** — Precompute dense vector embeddings for every page using a local model (e.g. `nomic-embed-text` or `gte-large` via `ollama` or `llama.cpp`) on personal hardware — no API dependency, no per-call cost. At build time, a Python script reads `_site/` HTML, embeds each page, computes top-N cosine neighbors, and writes `data/similar-links.json` (slug → [{slug, title, score}]). Hakyll injects this into each page's context (via `Metadata.hs` reading the JSON); template renders a "Related" section in the page footer. Analogous to gwern's `GenerateSimilar.hs` but model-agnostic and self-hosted. Note: supersedes the Phase 5 "Semantic embedding pipeline" stub — that stub should be replaced by this when implemented.
|
||||
- [x] **Bidirectional backlinks with context** — See Phase 5 above; implemented with full context-paragraph extraction. Merged with the Phase 5 stub.
|
||||
- [ ] **Signed pages / content integrity** — GPG-sign each HTML output file at build time using a detached ASCII-armored signature (`.sig` file per page). The signing step runs as a final Makefile target after Hakyll and Pagefind complete: `find _site -name '*.html' -exec gpg --batch --yes --detach-sign --armor {} \;`. Signatures are served alongside their pages (e.g. `/essays/my-essay.html.sig`). The page footer displays a verification block near the license: the signing key fingerprint, a link to `/gpg/` where the public key is published, and a link to the `.sig` file for that page — so readers can verify without hunting for the key. The public key is also available at the standard WKD location and published to keyservers. **Operational requirement:** a dedicated signing subkey (no passphrase) on the build machine; the master certifying key stays offline and passphrase-protected. A `tools/setup-signing.sh` script will walk through creating the signing subkey, exporting it, and configuring the build — so the setup is repeatable when moving between machines or provisioning the VPS. Philosophically consistent with the FOSS/privacy ethos and the "configuration is code" principle; extreme, but the site is already committed to doing things properly.
|
||||
- [x] **Signed pages / content integrity** — `make sign` (called by `make deploy`) runs `tools/sign-site.sh`: walks `_site/**/*.html`, produces a detached ASCII-armored `.sig` per page. Signing uses a dedicated Ed25519 subkey isolated in `~/.gnupg-signing/` (master `sec#` stub + `ssb` signing subkey). Passphrase cached 24 h in the signing agent via `tools/preset-signing-passphrase.sh` + `gpg-preset-passphrase`; `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase`. Footer "sig" link points to `$url$.sig`; hovering shows the ASCII armor via `popups.js` `sigContent` provider. Public key at `static/gpg/pubkey.asc` → served at `/gpg/pubkey.asc`. Fingerprints: master `CD90AE96…B5C9663`; signing subkey `C9A42A6F…2707066` (keygrip `619844703EC398E70B0045D7150F08179CFEEFE3`). See Implementation Notes.
|
||||
- [ ] **Full-text semantic search** — A secondary search mode alongside Pagefind's keyword index. Precompute embeddings for every paragraph (same pipeline as similar links). Store as a compact binary or JSON index. At query time, either: (a) compute the query embedding client-side using a small WASM model (e.g. `transformers.js` with a quantized MiniLM) and run cosine similarity against the stored paragraph vectors, or (b) use a precomputed query-expansion table (top-K words → relevant slugs, offline). Surfaced as a "Semantic search" toggle on `/search.html`. Returns paragraphs rather than pages as the result unit, with the source page title and a link to the specific section. This finds conceptually related content even when exact keywords differ — searching "the relationship between music and mathematics" surfaces relevant essays regardless of vocabulary.
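The neighbour lookup that both embedding items above depend on can be sketched in a few lines. This is a minimal pure-Python illustration, not the site's pipeline: the names `normalize` and `top_neighbours` and the toy vectors are hypothetical. It assumes each vector is scaled to unit length first, so that the inner product of two vectors equals their cosine similarity.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_neighbours(vectors, top_n=5, min_score=0.30):
    """Map each key to its top_n most similar other keys (hypothetical helper).

    `vectors` is a dict of key -> embedding. The search is exact and O(n^2),
    which is only reasonable for a small corpus.
    """
    unit = {key: normalize(vec) for key, vec in vectors.items()}
    result = {}
    for key, vec in unit.items():
        scored = []
        for other, other_vec in unit.items():
            if other == key:
                continue  # never list a page as related to itself
            score = sum(a * b for a, b in zip(vec, other_vec))
            if score >= min_score:  # drop weak matches
                scored.append((other, round(score, 4)))
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[key] = scored[:top_n]
    return result


if __name__ == "__main__":
    # Toy two-dimensional "embeddings"; real ones have hundreds of dimensions.
    toy = {
        "/essays/a/": [1.0, 0.0],
        "/essays/b/": [0.9, 0.1],
        "/essays/c/": [0.0, 1.0],
    }
    print(top_neighbours(toy, top_n=2))
```

For paragraph-level semantic search the same function would apply with paragraph identifiers as keys; beyond a few thousand vectors a dedicated index is the more realistic choice.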
|
||||
|
||||
---
|
||||
|
|
@ -621,6 +637,37 @@ Implemented across `build/Stability.hs`, `build/Contexts.hs`, `templates/partial
|
|||
|
||||
**Critical implementation note:** Fields that use `unsafeCompiler` must return `Maybe` from the IO block and call `fail` in the `Compiler` monad afterward — not inside the `IO` action. Calling `fail` inside `unsafeCompiler`'s IO block throws an `IOError` that Hakyll's `$if()$` template evaluation does not catch as `NoResult`, causing the entire item compilation to error silently.
|
||||
|
||||
### Visualization pipeline — Pandoc IO filter approach
|
||||
|
||||
`build/Filters/Viz.hs` walks the AST for `Div` blocks with class `figure` or `visualization`.
|
||||
|
||||
**Static figures (`.figure`):** reads the `script` attribute, runs `python3 <script>` with the source file's directory as cwd, captures stdout as SVG. Replaces hardcoded `#000000`/`black` fills/strokes with `currentColor` (same trick as `Filters.Score`). Wraps in `<figure class="viz-figure">`. Script is expected to import `tools/viz_theme` and call `save_svg()` which writes to stdout.
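The `currentColor` substitution described above can be illustrated with a short sketch. The filter itself is Haskell; this Python version is only an illustration of the technique, and the function name `use_current_color`, the regular expressions, and the set of values treated as black (`#000000`, `#000`, `black`) are assumptions rather than the filter's actual rules.

```python
import re

# Values treated as "black" (assumption: long hex, short hex, and the keyword).
_BLACK = r"(?:#000000|#000|black)"

# fill="black" / stroke="#000000" written as XML attributes.
_ATTR_RE = re.compile(r'\b(fill|stroke)="' + _BLACK + r'"', re.IGNORECASE)

# fill:#000 / stroke: black written inside a style="..." declaration.
# The lookahead stops "#000" from matching the start of e.g. "#000abc".
_STYLE_RE = re.compile(
    r"\b(fill|stroke)\s*:\s*" + _BLACK + r"(?![0-9a-fA-F])", re.IGNORECASE
)


def use_current_color(svg: str) -> str:
    """Replace hardcoded black fills/strokes so the SVG inherits the text colour."""
    svg = _ATTR_RE.sub(r'\1="currentColor"', svg)
    return _STYLE_RE.sub(r"\1: currentColor", svg)


if __name__ == "__main__":
    sample = '<path fill="#000000" stroke="black" style="fill:#000"/>'
    print(use_current_color(sample))
```

Because the inlined SVG then takes its colour from the surrounding text, a theme switch recolours static figures with no re-render.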
|
||||
|
||||
**Interactive figures (`.visualization`):** runs the script, expects Vega-Lite JSON on stdout. Embeds as `<script type="application/json" class="vega-spec">` inside a `.vega-container` div. `viz.js` finds all `.vega-spec` scripts, stores parsed spec on `container._vegaSpec`, calls `vegaEmbed`. Always applies the site's monochrome Vega config, ignoring the spec's own `config`. MutationObserver on `document.documentElement[data-theme]` triggers `reRenderAll()` on theme change.
|
||||
|
||||
**Frontmatter:** `viz: true` gates CDN loading of Vega/Vega-Lite/Vega-Embed and `viz.js` in `head.html` via `$if(viz)$`.
|
||||
|
||||
**`tools/viz_theme.py`:** `apply_monochrome()` sets matplotlib rcParams (transparent backgrounds, black lines); `save_svg(fig)` writes SVG to stdout via `io.StringIO`; `LINESTYLE_CYCLE` provides dash-pattern sequences for multi-series charts (no color distinction needed).
|
||||
|
||||
**Authoring syntax:**
|
||||
```markdown
::: {.figure script="figures/plot.py" caption="Caption text"}
:::

::: {.visualization script="figures/chart.py" caption="Caption text"}
:::
```
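As an illustration of what a `.visualization` script might look like, here is a minimal sketch of a hypothetical `figures/chart.py` that prints a Vega-Lite spec to stdout. The field names (`year`, `count`, `series`) and the data are invented for the example. It deliberately sets no `config` key, since the site's monochrome config replaces it anyway, and it separates series by dash pattern rather than hue.

```python
import json
import sys


def build_spec(rows):
    """Return a Vega-Lite spec as a plain dict (hypothetical example chart)."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "width": "container",
        "data": {"values": rows},
        "mark": {"type": "line", "point": True},
        "encoding": {
            "x": {"field": "year", "type": "ordinal"},
            "y": {"field": "count", "type": "quantitative"},
            # Distinguish series by dash pattern, not colour.
            "strokeDash": {"field": "series", "type": "nominal"},
        },
    }


def main():
    # Invented sample data; a real script would load its own dataset.
    rows = [
        {"year": 2022, "count": 3, "series": "essays"},
        {"year": 2023, "count": 7, "series": "essays"},
        {"year": 2022, "count": 1, "series": "music"},
        {"year": 2023, "count": 4, "series": "music"},
    ]
    # The filter captures stdout, so print the JSON there and nothing else.
    json.dump(build_spec(rows), sys.stdout)
    sys.stdout.write("\n")


if __name__ == "__main__":
    main()
```

Any diagnostics belong on stderr so they cannot corrupt the captured JSON.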
|
||||
|
||||
### GPG signing — dedicated subkey + preset passphrase
|
||||
|
||||
**Key architecture:** the master certifying key lives in `~/.gnupg` (passphrase-protected, used for email). A dedicated signing keyring at `~/.gnupg-signing/` holds `sec#` (master stub, no secret material) plus an `ssb` Ed25519 signing subkey (with its secret). To get this isolation, export with `gpg --export-secret-subkeys "FINGERPRINT!"`: the trailing `!` restricts the export to exactly that subkey, so only the subkey secret leaves the main keyring.
|
||||
|
||||
**Passphrase caching:** GPG 2.4's `passwd` in `--edit-key` requires the master secret to be present — it cannot change a subkey passphrase in a stub+subkey-only keyring. Instead, `gpg-preset-passphrase` (`/usr/lib/gnupg/gpg-preset-passphrase`) caches the passphrase by keygrip directly in the agent. `~/.gnupg-signing/gpg-agent.conf` sets `allow-preset-passphrase` and `max-cache-ttl 86400`. `tools/preset-signing-passphrase.sh` prompts via the terminal, calls `gpg-preset-passphrase --preset <keygrip>`. Must be run once per boot (or when the 24h cache expires).
|
||||
|
||||
**Popup preview:** `popups.js` `sigContent` provider fetches the `.sig` URL (same-origin), renders the ASCII armor in a `<pre>` inside a `.popup-sig` div. Bound to `a.footer-sig-link` explicitly in `bindTargets`, bypassing the footer-exclusion guard on internal links. Result cached in the shared `cache` map.
|
||||
|
||||
**nginx:** `.sig` files need no special handling — they're served as static files alongside `.html`. The `try_files` directive handles `$uri` directly.
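The per-page signing walk can be sketched as follows. This is not `tools/sign-site.sh` (a shell script whose contents are not shown here); it is a Python illustration under stated assumptions. The function name `sign_commands` and its parameters are hypothetical, and it only builds the `gpg` argument lists without running them. The one detail worth noting is `--output`: without it, `gpg --armor --detach-sign` names the signature `page.html.asc` rather than `page.html.sig`.

```python
from pathlib import Path


def sign_commands(site_dir, homedir, signing_key):
    """Build one gpg argv per HTML page under site_dir (nothing is executed).

    site_dir    -- root of the generated site, e.g. "_site"
    homedir     -- the dedicated signing keyring directory
    signing_key -- key ID or fingerprint; a trailing "!" forces that exact subkey
    """
    commands = []
    for page in sorted(Path(site_dir).rglob("*.html")):
        # page.html -> page.html.sig, written next to the page it signs
        sig = page.with_name(page.name + ".sig")
        commands.append([
            "gpg",
            "--homedir", str(homedir),
            "--batch", "--yes",           # non-interactive; overwrite stale .sig files
            "--local-user", signing_key,
            "--armor", "--detach-sign",
            "--output", str(sig),
            str(page),
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them; "FINGERPRINT!" is a placeholder.
    for cmd in sign_commands("_site", Path.home() / ".gnupg-signing", "FINGERPRINT!"):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the passphrase is preset in the agent. A reader verifies a downloaded pair with `gpg --verify page.html.sig page.html`.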
|
||||
|
||||
### Annotations (infrastructure only)
|
||||
`annotations.js` stores annotations as JSON in `localStorage` under `site-annotations`, scoped per `location.pathname`. On `DOMContentLoaded`, `applyAll()` re-anchors saved annotations via a `TreeWalker` text-stream search (concatenates all text nodes in `#markdownBody`, finds exact match by index, builds a `Range`, wraps with `<mark>`). Cross-element ranges use `extractContents()` + `insertNode()` fallback. Four highlight colors (amber / sage / steel / rose) defined in `annotations.css` as `rgba` overlays with `box-decoration-break: clone`. Hover tooltip shows note, date, and delete button. Public API: `window.Annotations.add(text, color, note)` / `.remove(id)`. The selection-popup "Annotate" button is currently removed pending a UI revision.
|
||||
|
||||
|
|
|
|||
|
|
@ -967,6 +967,14 @@ h3:hover .section-toggle,
|
|||
--pagefind-ui-font: var(--font-sans);
|
||||
}
|
||||
|
||||
#search-timing {
|
||||
font-family: var(--font-mono);
|
||||
font-size: var(--text-size-small);
|
||||
color: var(--text-faint);
|
||||
margin-top: 0.5rem;
|
||||
min-height: 1.2em; /* reserve space to prevent layout shift */
|
||||
}
|
||||
|
||||
/* ============================================================
|
||||
COMPOSITION LANDING PAGE
|
||||
============================================================ */
|
||||
|
|
|
|||
|
|
@ -103,3 +103,14 @@
|
|||
.popup-citation .ref-num {
|
||||
display: none;
|
||||
}
|
||||
|
||||
/* PGP signature popup */
|
||||
.popup-sig pre {
|
||||
margin: 0;
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.68rem;
|
||||
line-height: 1.45;
|
||||
white-space: pre;
|
||||
overflow-x: auto;
|
||||
color: var(--text-muted);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -0,0 +1,66 @@
|
|||
/* viz.css — Styles for inline data visualizations (.viz-figure, .viz-interactive) */
|
||||
|
||||
/* ============================================================
|
||||
Static figures (Matplotlib SVG)
|
||||
============================================================ */
|
||||
|
||||
.viz-figure {
|
||||
margin: 2rem 0;
|
||||
break-inside: avoid;
|
||||
}
|
||||
|
||||
.viz-figure svg {
|
||||
width: 100%;
|
||||
height: auto;
|
||||
display: block;
|
||||
}
|
||||
|
||||
/* ============================================================
|
||||
Interactive figures (Vega-Lite via vega-embed)
|
||||
============================================================ */
|
||||
|
||||
.viz-interactive {
|
||||
margin: 2rem 0;
|
||||
break-inside: avoid;
|
||||
}
|
||||
|
||||
.vega-container {
|
||||
width: 100%;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
/* vega-embed injects a <div> containing a <canvas> or <svg> */
|
||||
.vega-container > div {
|
||||
width: 100% !important;
|
||||
}
|
||||
|
||||
.vega-container canvas,
|
||||
.vega-container svg {
|
||||
max-width: 100%;
|
||||
display: block;
|
||||
}
|
||||
|
||||
/* ============================================================
|
||||
Captions (shared)
|
||||
============================================================ */
|
||||
|
||||
.viz-caption {
|
||||
font-size: var(--text-size-small);
|
||||
color: var(--text-muted);
|
||||
text-align: center;
|
||||
margin-top: 0.5rem;
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
/* ============================================================
|
||||
Error display (build-time script failures)
|
||||
============================================================ */
|
||||
|
||||
.viz-error {
|
||||
padding: 0.75rem 1rem;
|
||||
border: 1px solid var(--border);
|
||||
color: var(--text-muted);
|
||||
font-family: var(--font-mono);
|
||||
font-size: var(--text-size-small);
|
||||
margin: 1.5rem 0;
|
||||
}
|
||||
|
|
@ -1,6 +1,16 @@
|
|||
-----BEGIN PGP PUBLIC KEY BLOCK-----
|
||||
|
||||
[ Replace this entire block with your actual exported public key:
|
||||
gpg --armor --export ln@levineuwirth.org ]
|
||||
|
||||
mDMEYw0NIRYJKwYBBAHaRw8BAQdADR++EP+ENf143YCIYfJ9hsvvECChdqT/YrQi
|
||||
bcP5iti0KWxuQGxldmluZXV3aXJ0aC5vcmcgPGxuQGxldmluZXV3aXJ0aC5vcmc+
|
||||
iI8EEBYKACAFAmMNDSEGCwkHCAMCBBUICgIEFgIBAAIZAQIbAwIeAQAhCRCwGgGt
|
||||
G1yWYxYhBM2QrpY4O7r0FaHXQLAaAa0bXJZjqVMA/i6PkQSa3gk9Pr4c26BQwKZV
|
||||
67r6jfsaiEux6/wJMaBAAPwNAgJSeLCzIEYhFa/GTX8iHVixiNVKdrYXVtxsKbyK
|
||||
D7gzBGm9dD0WCSsGAQQB2kcPAQEHQC7h1p+9tCmC5KUqp7wJC/AbShfe9S4NcsBa
|
||||
2/guvOMoiO8EGBYKACAWIQTNkK6WODu69BWh10CwGgGtG1yWYwUCab10PQIbAgCB
|
||||
CRCwGgGtG1yWY3YgBBkWCgAdFiEEyaQqb61ET75Wb9c4UxvcHMJwcGYFAmm9dD0A
|
||||
CgkQUxvcHMJwcGZiXwD/ThKmvRJigUpPlIsWhTMufXbK1NOSf8a9Z8JhXxIwB8cB
|
||||
AJ7qC4wMBIaPy+9AwliWo7m8NKc/HkiVyoMxAIUBLHsPH/sA/2F1ACNZKJBVp9ze
|
||||
sxr9TPZnMMWSjLSAEoTsjXFKRwb+AP0XMJn+hiTSy2S7bEC3hsu3xJSKePHByFGD
|
||||
26I8RLM4BA==
|
||||
=Xwzi
|
||||
-----END PGP PUBLIC KEY BLOCK-----
|
||||
|
|
|
|||
|
|
@ -71,6 +71,11 @@
|
|||
bind(el, internalContent);
|
||||
});
|
||||
|
||||
/* PGP signature links in footer */
|
||||
root.querySelectorAll('a.footer-sig-link').forEach(function (el) {
|
||||
bind(el, sigContent);
|
||||
});
|
||||
|
||||
/* External links — single dispatcher handles all providers */
|
||||
root.querySelectorAll('a[href^="http"]').forEach(function (el) {
|
||||
if (el.closest('nav, #toc, footer, .page-meta-footer')) return;
|
||||
|
|
@ -541,6 +546,21 @@
|
|||
return html;
|
||||
}
|
||||
|
||||
    /* PGP signature: fetch the .sig file and display its ASCII armor */
    function sigContent(target) {
        var href = target.getAttribute('href');
        if (!href) return Promise.resolve(null);
        /* Reuse the rendered popup if this signature was already fetched. */
        if (cache[href]) return Promise.resolve(cache[href]);
        return fetch(href, { credentials: 'same-origin' })
            .then(function (r) { return r.ok ? r.text() : null; })
            .then(function (text) {
                if (!text) return null;
                var html = '<div class="popup-sig"><pre>' + esc(text.trim()) + '</pre></div>';
                cache[href] = html;
                return html;
            });
    }

    function esc(s) {
        return String(s)
            .replace(/&/g, '&amp;')
|
||||
|
|
|
|||
|
|
@ -1,7 +1,8 @@
|
|||
/* search.js — Pagefind UI initialisation for /search.html.
|
||||
Loaded only on pages with search: true in frontmatter.
|
||||
Pre-fills the search box from the ?q= query parameter so that
|
||||
the selection popup's "Here" button lands ready to go. */
|
||||
the selection popup's "Here" button lands ready to go.
|
||||
Also instruments search timing and displays elapsed ms. */
|
||||
(function () {
|
||||
'use strict';
|
||||
|
||||
|
|
@ -12,10 +13,41 @@
|
|||
excerptLength: 30,
|
||||
});
|
||||
|
||||
/* Pre-fill from URL parameter and trigger the search */
|
||||
/* Timing instrumentation ------------------------------------------ */
|
||||
var timingEl = document.getElementById('search-timing');
|
||||
var searchEl = document.getElementById('search');
|
||||
var startTime = null;
|
||||
|
||||
if (timingEl && searchEl) {
|
||||
/* Input field is created synchronously by PagefindUI above. */
|
||||
var input = searchEl.querySelector('.pagefind-ui__search-input');
|
||||
if (input) {
|
||||
input.addEventListener('input', function () {
|
||||
if (input.value.trim().length > 0) {
|
||||
startTime = performance.now();
|
||||
} else {
|
||||
startTime = null;
|
||||
timingEl.textContent = '';
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
/* Watch for Pagefind rebuilding the results area. */
|
||||
new MutationObserver(function () {
|
||||
if (startTime !== null) {
|
||||
var elapsed = Math.round(performance.now() - startTime);
|
||||
/* U+2009 thin space between number and unit */
|
||||
timingEl.textContent = elapsed + '\u2009ms';
|
||||
startTime = null;
|
||||
}
|
||||
}).observe(searchEl, { childList: true, subtree: true });
|
||||
}
|
||||
|
||||
/* Pre-fill from URL parameter and trigger the search -------------- */
|
||||
var params = new URLSearchParams(window.location.search);
|
||||
var q = params.get('q');
|
||||
if (q) {
|
||||
startTime = performance.now();
|
||||
ui.triggerSearch(q);
|
||||
}
|
||||
});
|
||||
|
|
|
|||
|
|
@ -0,0 +1,158 @@
|
|||
/* viz.js — Vega-Lite renderer with monochrome theming and dark-mode support.
|
||||
*
|
||||
* Finds every <script type="application/json" class="vega-spec"> embedded by
|
||||
* Filters.Viz, renders it into the parent .vega-container via Vega-Embed,
|
||||
* and re-renders whenever the site's light/dark theme changes.
|
||||
*
|
||||
* The site-level monochrome config is always applied; scripts should not
|
||||
* set colour-related Vega-Lite config keys — use encoding shape, strokeDash,
|
||||
* and opacity to distinguish series instead of hue.
|
||||
*/
|
||||
(function () {
|
||||
'use strict';
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Monochrome Vega-Lite configs (matched to base.css custom properties)
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
var LIGHT = {
|
||||
background: null,
|
||||
font: 'Spectral, Georgia, "Times New Roman", serif',
|
||||
mark: { color: '#1a1a1a' },
|
||||
axis: {
|
||||
gridColor: '#cccccc',
|
||||
gridOpacity: 0.6,
|
||||
domainColor: '#1a1a1a',
|
||||
tickColor: '#1a1a1a',
|
||||
labelColor: '#1a1a1a',
|
||||
titleColor: '#555555',
|
||||
labelFont: 'Spectral, Georgia, serif',
|
||||
titleFont: '"Fira Sans", "Helvetica Neue", Arial, sans-serif',
|
||||
titleFontWeight: 'normal',
|
||||
},
|
||||
legend: {
|
||||
labelColor: '#1a1a1a',
|
||||
titleColor: '#555555',
|
||||
labelFont: 'Spectral, Georgia, serif',
|
||||
titleFont: '"Fira Sans", "Helvetica Neue", Arial, sans-serif',
|
||||
titleFontWeight: 'normal',
|
||||
},
|
||||
title: {
|
||||
color: '#1a1a1a',
|
||||
subtitleColor: '#555555',
|
||||
font: '"Fira Sans", "Helvetica Neue", Arial, sans-serif',
|
||||
fontWeight: 'normal',
|
||||
},
|
||||
range: {
|
||||
category: ['#1a1a1a', '#555555', '#888888', '#aaaaaa', '#cccccc'],
|
||||
ordinal: { scheme: 'greys' },
|
||||
ramp: { scheme: 'greys' },
|
||||
},
|
||||
view: { stroke: null },
|
||||
};
|
||||
|
||||
var DARK = {
|
||||
background: null,
|
||||
font: 'Spectral, Georgia, "Times New Roman", serif',
|
||||
mark: { color: '#d4d0c8' },
|
||||
axis: {
|
||||
gridColor: '#333333',
|
||||
gridOpacity: 0.8,
|
||||
domainColor: '#d4d0c8',
|
||||
tickColor: '#d4d0c8',
|
||||
labelColor: '#d4d0c8',
|
||||
titleColor: '#8c8881',
|
||||
labelFont: 'Spectral, Georgia, serif',
|
||||
titleFont: '"Fira Sans", "Helvetica Neue", Arial, sans-serif',
|
||||
titleFontWeight: 'normal',
|
||||
},
|
||||
legend: {
|
||||
labelColor: '#d4d0c8',
|
||||
titleColor: '#8c8881',
|
||||
labelFont: 'Spectral, Georgia, serif',
|
||||
titleFont: '"Fira Sans", "Helvetica Neue", Arial, sans-serif',
|
||||
titleFontWeight: 'normal',
|
||||
},
|
||||
title: {
|
||||
color: '#d4d0c8',
|
||||
subtitleColor: '#8c8881',
|
||||
font: '"Fira Sans", "Helvetica Neue", Arial, sans-serif',
|
||||
fontWeight: 'normal',
|
||||
},
|
||||
range: {
|
||||
category: ['#d4d0c8', '#8c8881', '#6a6660', '#444444', '#333333'],
|
||||
ordinal: { scheme: 'greys' },
|
||||
ramp: { scheme: 'greys' },
|
||||
},
|
||||
view: { stroke: null },
|
||||
};
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Theme detection (matches theme.js logic)
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
function isDark() {
|
||||
var t = document.documentElement.dataset.theme;
|
||||
if (t === 'dark') return true;
|
||||
if (t === 'light') return false;
|
||||
return window.matchMedia('(prefers-color-scheme: dark)').matches;
|
||||
}
|
||||
|
||||
function themeConfig() {
|
||||
return isDark() ? DARK : LIGHT;
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Rendering
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
function renderOne(container) {
|
||||
var spec = container._vegaSpec;
|
||||
if (!spec) return;
|
||||
// Always apply site theme; ignore any config baked into the spec.
|
||||
var mergedSpec = Object.assign({}, spec, { config: themeConfig() });
|
||||
vegaEmbed(container, mergedSpec, { actions: false, renderer: 'svg' })
|
||||
.catch(function (err) { console.error('[viz]', err); });
|
||||
}
|
||||
|
||||
function renderAll() {
|
||||
var scripts = document.querySelectorAll('script.vega-spec');
|
||||
for (var i = 0; i < scripts.length; i++) {
|
||||
var scriptEl = scripts[i];
|
||||
var container = scriptEl.closest('.vega-container');
|
||||
if (!container) continue;
|
||||
try {
|
||||
// Store the parsed spec on the container before vegaEmbed replaces
|
||||
// the container's innerHTML (which removes the <script> element).
|
||||
container._vegaSpec = JSON.parse(scriptEl.textContent);
|
||||
} catch (e) {
|
||||
console.error('[viz] Failed to parse Vega-Lite spec:', e);
|
||||
continue;
|
||||
}
|
||||
renderOne(container);
|
||||
}
|
||||
}
|
||||
|
||||
function reRenderAll() {
|
||||
var containers = document.querySelectorAll('.vega-container');
|
||||
for (var i = 0; i < containers.length; i++) {
|
||||
renderOne(containers[i]);
|
||||
}
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Initialisation and theme-change listener
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
document.addEventListener('DOMContentLoaded', renderAll);
|
||||
|
||||
new MutationObserver(function (mutations) {
|
||||
for (var i = 0; i < mutations.length; i++) {
|
||||
if (mutations[i].attributeName === 'data-theme') {
|
||||
reRenderAll();
|
||||
return;
|
||||
}
|
||||
}
|
||||
}).observe(document.documentElement, { attributes: true });
|
||||
|
||||
}());
|
||||
|
|
@ -8,5 +8,6 @@
|
|||
</div>
|
||||
<div class="footer-right">
|
||||
<span class="footer-build">build $build-time$</span><a href="/build/" class="footer-build-link" aria-label="Build telemetry">→</a>
|
||||
· <a href="$url$.sig" class="footer-sig-link" aria-label="PGP signature for this page" title="Ed25519 signing subkey C9A42A6F AD444FBE 566FD738 531BDC1C C2707066 · public key at /gpg/pubkey.asc">sig</a>
|
||||
</div>
|
||||
</footer>
|
||||
|
|
|
|||
|
|
@ -31,4 +31,11 @@ $endif$
|
|||
$if(has-citations)$
|
||||
<script src="/js/citations.js" defer></script>
|
||||
$endif$
|
||||
$if(viz)$
|
||||
<link rel="stylesheet" href="/css/viz.css">
|
||||
<script src="https://cdn.jsdelivr.net/npm/vega@5" defer></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/vega-lite@5" defer></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/vega-embed@6" defer></script>
|
||||
<script src="/js/viz.js" defer></script>
|
||||
$endif$
|
||||
<script src="/js/collapse.js" defer></script>
|
||||
|
|
|
|||
|
|
@ -62,6 +62,13 @@
|
|||
</div>
|
||||
$endif$
|
||||
|
||||
$if(similar-links)$
|
||||
<div class="meta-footer-section" id="similar-links">
|
||||
<h3>Related</h3>
|
||||
$similar-links$
|
||||
</div>
|
||||
$endif$
|
||||
|
||||
</div>
|
||||
|
||||
</div>
|
||||
|
|
|
|||
|
|
@ -0,0 +1,188 @@
|
|||
#!/usr/bin/env python3
"""
embed.py: build-time similar-links generator.

Reads _site/**/*.html, embeds each page with nomic-embed-text-v1.5,
builds a FAISS IndexFlatIP, and writes data/similar-links.json:

    { "/path/to/page/": [{"url": "...", "title": "...", "score": 0.87}, ...] }

Called by `make build` when .venv exists. Failures are non-fatal (make prints
a warning and continues). Run `uv sync` first to provision the environment.

Staleness check: skips re-embedding if data/similar-links.json is newer than
every HTML file in _site/, so rebuilds that don't touch any HTML won't
re-embed.
"""

import json
import os
import re
import sys
from pathlib import Path

import faiss
import numpy as np
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer

# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------

REPO_ROOT = Path(__file__).parent.parent
SITE_DIR = REPO_ROOT / "_site"
OUT_FILE = REPO_ROOT / "data" / "similar-links.json"
MODEL_NAME = "nomic-ai/nomic-embed-text-v1.5"
TOP_N = 5
MIN_SCORE = 0.30  # cosine similarity threshold; discard weak matches
# Pages to exclude from both indexing and results (exact URL paths)
EXCLUDE_URLS = {"/search/", "/build/", "/404.html", "/feed.xml", "/music/feed.xml"}

# ---------------------------------------------------------------------------
# Staleness check
# ---------------------------------------------------------------------------

def needs_update() -> bool:
    """Return True if similar-links.json is missing or older than any _site HTML."""
    if not OUT_FILE.exists():
        return True
    json_mtime = OUT_FILE.stat().st_mtime
    for html in SITE_DIR.rglob("*.html"):
        if html.stat().st_mtime > json_mtime:
            return True
    return False

# ---------------------------------------------------------------------------
# HTML → text extraction
# ---------------------------------------------------------------------------

def extract(html_path: Path) -> dict | None:
    """
    Parse an HTML file and extract:
      - url:   root-relative URL path (e.g. "/essays/my-essay/")
      - title: first <h1> text, falling back to the page <title>
      - text:  plain text of the page body (nav/footer/TOC stripped)
    Returns None for pages that should not be indexed.
    """
    raw = html_path.read_text(encoding="utf-8", errors="replace")
    soup = BeautifulSoup(raw, "html.parser")

    # Derive root-relative URL from file path
    rel = html_path.relative_to(SITE_DIR)
    if rel.name == "index.html":
        # The parent of a top-level index.html is ".", which would otherwise
        # produce "/./", so map the root index explicitly to "/".
        parent = rel.parent.as_posix()
        url = "/" if parent == "." else f"/{parent}/"
    else:
        url = "/" + rel.as_posix()

    if url in EXCLUDE_URLS:
        return None

    # Only index actual content pages; skip index/tag/feed/author pages
    # that have no prose body.
    body = soup.select_one("#markdownBody")
    if body is None:
        return None

    # Title: prefer <h1>, fall back to <title> with any trailing
    # " <dash> Site Name" suffix stripped (em dash, en dash, or hyphen).
    h1 = soup.find("h1")
    if h1:
        title = h1.get_text(" ", strip=True)
    else:
        title_tag = soup.find("title")
        raw_title = title_tag.get_text(" ", strip=True) if title_tag else url
        title = re.split(r"\s+[\u2014\u2013-]\s+", raw_title)[0].strip()

    # Remove elements that aren't content
    for sel in ["nav", "footer", "#toc", ".link-popup", "script", "style",
                ".page-meta-footer", ".metadata", "[data-pagefind-ignore]"]:
        for el in soup.select(sel):
            el.decompose()

    text = body.get_text(" ", strip=True)
    # Collapse runs of whitespace
    text = re.sub(r"\s+", " ", text).strip()

    if len(text) < 100:  # too short to embed meaningfully
        return None

    # Feed title + text to the model so title is part of the representation
    return {"url": url, "title": title, "text": f"search_document: {title}\n\n{text}"}

# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main() -> int:
    if not SITE_DIR.exists():
        print("embed.py: _site/ not found, skipping", file=sys.stderr)
        return 0

    if not needs_update():
        print("embed.py: similar-links.json is up to date, skipping")
        return 0

    print("embed.py: extracting pages…")
    pages = []
    for html in sorted(SITE_DIR.rglob("*.html")):
        page = extract(html)
        if page:
            pages.append(page)

    if not pages:
        print("embed.py: no indexable pages found", file=sys.stderr)
        return 0

    print(f"embed.py: embedding {len(pages)} pages with {MODEL_NAME}…")
    model = SentenceTransformer(MODEL_NAME, trust_remote_code=True)

    texts = [p["text"] for p in pages]
    # nomic requires a task prefix; we used "search_document:" above for the
    # corpus. For queries we'd use "search_query:", but here both corpus and
    # query are the same documents, so we use "search_document:" throughout.
    embeddings = model.encode(
        texts,
        normalize_embeddings=True,  # unit vectors → inner product == cosine
        show_progress_bar=True,
        batch_size=32,
    )
    embeddings = np.array(embeddings, dtype=np.float32)

    print("embed.py: building FAISS index…")
    dim = embeddings.shape[1]
    index = faiss.IndexFlatIP(dim)  # exact inner product; fine for < 10k pages
    index.add(embeddings)

    print("embed.py: querying nearest neighbours…")
    # Query all at once: returns (n_pages, TOP_N+1); +1 because self is #1
    scores_all, indices_all = index.search(embeddings, TOP_N + 1)

    result: dict[str, list] = {}
    for i, page in enumerate(pages):
        neighbours = []
        for rank in range(TOP_N + 1):
            j = int(indices_all[i, rank])
            score = float(scores_all[i, rank])
            # FAISS pads with -1 when the index holds fewer than k vectors.
            if j < 0 or j == i:
                continue  # skip padding and self
            if score < MIN_SCORE:
                continue  # skip weak matches
            neighbours.append({
                "url": pages[j]["url"],
                "title": pages[j]["title"],
                "score": round(score, 4),
            })
            if len(neighbours) == TOP_N:
                break
        if neighbours:
            result[page["url"]] = neighbours

    OUT_FILE.parent.mkdir(parents=True, exist_ok=True)
    OUT_FILE.write_text(json.dumps(result, ensure_ascii=False, indent=2))
    print(f"embed.py: wrote {len(result)} entries to {OUT_FILE.relative_to(REPO_ROOT)}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
|
|
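The neighbour selection at the end of main() can be tried out without FAISS or the embedding model. The following is only a sketch and not part of the commit: it uses plain Python on made-up 2-D vectors, and the names `normalise`, `top_neighbours`, `top_n` and `min_score` are hypothetical stand-ins for the script's TOP_N and MIN_SCORE settings, whose real values are not shown in this diff.

```python
import math


def normalise(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_neighbours(vectors, top_n=2, min_score=0.3):
    """Return {row index: [(neighbour index, score), ...]} by cosine similarity."""
    unit = [normalise(v) for v in vectors]
    result = {}
    for i, a in enumerate(unit):
        # Exact all-pairs inner product, skipping the row itself.
        scored = [
            (sum(x * y for x, y in zip(a, b)), j)
            for j, b in enumerate(unit)
            if j != i
        ]
        scored.sort(key=lambda pair: -pair[0])  # best match first
        # Drop weak matches, then keep at most top_n.
        picked = [(j, round(s, 4)) for s, j in scored if s >= min_score][:top_n]
        if picked:  # rows with no neighbours are omitted, as in embed.py
            result[i] = picked
    return result


# Toy data: rows 0 and 1 are nearly parallel, row 2 is orthogonal to row 0.
demo = top_neighbours([[1.0, 0.0], [2.0, 0.1], [0.0, 3.0]])
```

With these toy vectors, rows 0 and 1 list each other and row 2 has no entry, because its best score falls below the assumed threshold. A flat inner-product index gives the same ranking on normalised vectors; it simply does the all-pairs product more efficiently.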
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
# preset-signing-passphrase.sh — Cache the signing subkey passphrase in the
# dedicated signing agent so that automated builds can sign without a prompt.
#
# Run this ONCE in an interactive terminal after system boot (or after the
# agent cache expires). The passphrase is cached for 24 h (see gpg-agent.conf).
#
# Usage:
#   ./tools/preset-signing-passphrase.sh
#
# The script will prompt for the passphrase via the terminal (not pinentry).

set -euo pipefail

GNUPGHOME="${GNUPGHOME:-$HOME/.gnupg-signing}"
KEYGRIP="619844703EC398E70B0045D7150F08179CFEEFE3"
GPG_PRESET="/usr/lib/gnupg/gpg-preset-passphrase"

if [ ! -x "$GPG_PRESET" ]; then
    echo "Error: gpg-preset-passphrase not found at $GPG_PRESET" >&2
    exit 1
fi

# Ensure the agent is running with our config.
GNUPGHOME="$GNUPGHOME" gpg-connect-agent --homedir "$GNUPGHOME" /bye >/dev/null 2>&1 || true

echo -n "Signing subkey passphrase: "
# IFS= keeps leading and trailing whitespace in the passphrase intact.
IFS= read -rs PASSPHRASE
echo

# printf instead of echo -n: echo can misread a passphrase that starts with "-".
printf '%s' "$PASSPHRASE" | GNUPGHOME="$GNUPGHOME" "$GPG_PRESET" --homedir "$GNUPGHOME" --preset "$KEYGRIP"
unset PASSPHRASE

echo "Passphrase cached for keygrip $KEYGRIP (24 h TTL)."
echo "Test: GNUPGHOME=$GNUPGHOME gpg --homedir $GNUPGHOME --batch --detach-sign --armor --output /dev/null /dev/null"
|
|
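The script above hardcodes KEYGRIP, which is the identifier gpg-agent uses for a cached secret key and differs from the key fingerprint. As a hedged sketch (not part of the commit), the keygrips in a GnuPG home can be read from gpg's colon-delimited listing, where `grp` records hold the keygrip in the tenth field. The function names are hypothetical, and SAMPLE is a hand-written, abbreviated record, not captured gpg output.

```python
import subprocess


def parse_keygrips(colon_output):
    """Extract keygrips from `gpg --with-colons --with-keygrip` output."""
    grips = []
    for line in colon_output.splitlines():
        fields = line.split(":")
        # "grp" records carry the keygrip in the tenth field.
        if fields[0] == "grp" and len(fields) > 9:
            grips.append(fields[9])
    return grips


def list_keygrips(homedir):
    """Ask gpg for the secret keys in *homedir* and return their keygrips."""
    out = subprocess.run(
        ["gpg", "--homedir", homedir, "--batch", "--with-colons",
         "--with-keygrip", "--list-secret-keys"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_keygrips(out)


# Hand-written record for illustration only (assumed shape of a "grp" line).
KEYGRIP = "619844703EC398E70B0045D7150F08179CFEEFE3"
SAMPLE = "grp" + ":" * 9 + KEYGRIP + ":\n"
```

Calling `list_keygrips` with the signing home directory would list one keygrip per secret key and subkey; the one belonging to the signing subkey is the value to preset.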
@@ -0,0 +1,57 @@
#!/usr/bin/env bash
# sign-site.sh — Detach-sign every HTML file in _site/ with the signing subkey.
#
# Requires the passphrase to be pre-cached via tools/preset-signing-passphrase.sh.
# Produces <file>.html.sig alongside each <file>.html.
#
# Usage (called by `make sign`):
#   ./tools/sign-site.sh

set -euo pipefail

GNUPGHOME="${GNUPGHOME:-$HOME/.gnupg-signing}"
SITE_DIR="${1:-_site}"
SIGNING_KEY="C9A42A6FAD444FBE566FD738531BDC1CC2707066"

if [ ! -d "$SITE_DIR" ]; then
    echo "Error: site directory '$SITE_DIR' not found. Run 'make build' first." >&2
    exit 1
fi

# Pre-flight: verify the signing key is available and the passphrase is cached
# by signing /dev/null. If this fails (e.g. passphrase not preset, key missing),
# abort before touching any .sig files.
echo "sign-site: pre-flight check..." >&2
if ! GNUPGHOME="$GNUPGHOME" gpg \
        --homedir "$GNUPGHOME" \
        --batch \
        --yes \
        --detach-sign \
        --armor \
        --local-user "$SIGNING_KEY" \
        --output /dev/null \
        /dev/null 2>/dev/null; then
    echo "" >&2
    echo "ERROR: GPG signing pre-flight failed." >&2
    echo " The signing key passphrase is probably not cached." >&2
    echo " Run: ./tools/preset-signing-passphrase.sh" >&2
    echo " Then retry: make sign (or make deploy)" >&2
    exit 1
fi
echo "sign-site: pre-flight OK — signing $SITE_DIR..." >&2

count=0
while IFS= read -r -d '' html; do
    GNUPGHOME="$GNUPGHOME" gpg \
        --homedir "$GNUPGHOME" \
        --batch \
        --yes \
        --detach-sign \
        --armor \
        --local-user "$SIGNING_KEY" \
        --output "${html}.sig" \
        "$html"
    count=$((count + 1))
done < <(find "$SITE_DIR" -name "*.html" -print0)

echo "Signed $count HTML files in $SITE_DIR."
|
|
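For the reader's side of this, a minimal sketch of checking the detached signatures that sign-site.sh produces follows. It is not part of the commit: `signed_pages` and `verify_command` are hypothetical helper names, and the sketch assumes the site's public key (published at static/gpg/pubkey.asc according to the commit message) has already been imported into the verifying keyring.

```python
from pathlib import Path


def signed_pages(site_dir):
    """Yield (html, sig) pairs for every page that has a detached signature."""
    for html in sorted(Path(site_dir).rglob("*.html")):
        sig = html.with_name(html.name + ".sig")  # page.html -> page.html.sig
        if sig.is_file():
            yield html, sig


def verify_command(html, sig):
    """Build the gpg invocation that checks one detached signature."""
    # gpg --verify takes the signature file first, then the signed file.
    return ["gpg", "--verify", str(sig), str(html)]
```

Each command can be handed to `subprocess.run(cmd, check=True)`; a non-zero exit status then surfaces as an exception for any page whose signature does not match.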
@@ -0,0 +1,114 @@
"""
viz_theme.py — Shared matplotlib setup for levineuwirth.org figures.

Usage in a figure script:

    import sys
    sys.path.insert(0, 'tools')  # relative to project root (where cabal runs)
    from viz_theme import apply_monochrome, save_svg

    apply_monochrome()

    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [4, 5, 6])
    ax.set_xlabel("x")
    ax.set_ylabel("y")

    save_svg(fig)  # writes SVG to stdout; Viz.hs captures it

Design constraints
------------------
- Use pure black (#000000) for all drawn elements (lines, markers, text,
  spines, ticks). Filters.Viz.processColors replaces these with
  `currentColor` so the SVG adapts to light/dark mode via CSS.
- Use transparent backgrounds (figure and axes). The page background
  shows through, so the figure integrates cleanly in both modes.
- For greyscale fills (bars, areas), use values in the range #333–#ccc.
  These do NOT get replaced by processColors, so choose mid-greys that
  remain legible in both light (#faf8f4) and dark (#121212) contexts.
- For multi-series charts, distinguish series by linestyle (solid, dashed,
  dotted, dash-dot) rather than colour.

Font note: matplotlib's SVG output uses the font names configured here, but
those fonts are not available in the browser SVG renderer — the browser falls
back to its default serif. Do not rely on font metrics for sizing.
"""

import sys
import io
import matplotlib as mpl
import matplotlib.pyplot as plt

# Greyscale linestyle cycle for multi-series charts.
# Each entry is a dict with 'color' and 'linestyle' keys. The first four are
# pure black and differ only by dash pattern; the last two repeat the solid
# and dashed patterns in dark grey (#555555) for charts with more series.
LINESTYLE_CYCLE = [
    {'color': '#000000', 'linestyle': 'solid'},
    {'color': '#000000', 'linestyle': 'dashed'},
    {'color': '#000000', 'linestyle': 'dotted'},
    {'color': '#000000', 'linestyle': (0, (5, 2, 1, 2))},  # dash-dot
    {'color': '#555555', 'linestyle': 'solid'},
    {'color': '#555555', 'linestyle': 'dashed'},
]


def apply_monochrome():
    """Configure matplotlib for monochrome, transparent, dark-mode-safe output.

    Call this before creating any figures. All element colours are set to
    pure black (#000000) so Filters.Viz.processColors can replace them with
    CSS currentColor. Backgrounds are transparent.
    """
    mpl.rcParams.update({
        # Transparent backgrounds — CSS page background shows through.
        'figure.facecolor': 'none',
        'axes.facecolor': 'none',
        'savefig.facecolor': 'none',
        'savefig.edgecolor': 'none',

        # All text and structural elements: pure black → currentColor.
        'text.color': 'black',
        'axes.labelcolor': 'black',
        'axes.edgecolor': 'black',
        'xtick.color': 'black',
        'ytick.color': 'black',

        # Grid: mid-grey, stays legible in both modes (not replaced).
        'axes.grid': False,
        'grid.color': '#cccccc',
        'grid.linewidth': 0.6,

        # Lines and patches: black → currentColor.
        'lines.color': 'black',
        'patch.edgecolor': 'black',

        # Legend: no box frame; background transparent.
        'legend.frameon': False,
        'legend.facecolor': 'none',
        'legend.edgecolor': 'none',

        # Use linestyle cycle instead of colour cycle for series distinction.
        'axes.prop_cycle': mpl.cycler(
            color=[c['color'] for c in LINESTYLE_CYCLE],
            linestyle=[c['linestyle'] for c in LINESTYLE_CYCLE],
        ),
    })


def save_svg(fig, tight=True):
    """Write *fig* as SVG to stdout and close it.

    Hakyll's Viz filter captures stdout and inlines the SVG.

    Parameters
    ----------
    fig : matplotlib.figure.Figure
    tight : bool
        If True (default), call fig.tight_layout() before saving.
    """
    if tight:
        fig.tight_layout()
    buf = io.StringIO()
    fig.savefig(buf, format='svg', bbox_inches='tight', transparent=True)
    plt.close(fig)
    sys.stdout.write(buf.getvalue())
|
||||
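The docstring above relies on Filters.Viz.processColors swapping pure black for `currentColor`. That Haskell function is not shown in this part of the diff, so the following is only an assumed approximation of its effect, written in Python for illustration; the name `replace_black` and the regular expression are hypothetical.

```python
import re

# Match the six-digit pure-black hex code only; mid-greys such as #555555
# and #cccccc are deliberately left alone.
_PURE_BLACK = re.compile(r"#000000\b", re.IGNORECASE)


def replace_black(svg_text):
    """Return svg_text with every pure-black colour replaced by currentColor."""
    return _PURE_BLACK.sub("currentColor", svg_text)


sample = '<path style="stroke: #000000; fill: #555555"/>'
themed = replace_black(sample)
```

After the substitution the stroke follows the CSS `color` of the surrounding element, which is how an inlined SVG can adapt to light and dark mode, while the grey fill keeps its fixed value.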