The root cause of 'PDF/arXiv previews simply do not work' was twofold:
1. nginx/popup-proxy.conf was never installed on the VPS — every
/proxy/* request (arXiv, PubMed, Internet Archive) returned nginx's
default 404. Now installed (snippets + http{}-context cache/limit
zones in conf.d, included in the vhost, nginx -t verified, reloaded).
2. The snippet itself had a latent bug that only surfaced once
installed: when proxy_pass names its upstream through a variable,
any URI part in the directive is sent as-is and replaces the
request URI, so every request hit the upstream's homepage
(archive.org HTML where JSON was expected, arXiv 429s, NCBI doc-page
redirects). Fixed with explicit prefix-strip rewrites; bad cached
responses purged. All three proxies verified returning real data,
including a live arXiv title resolve.
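
The proxy_pass behaviour in item 2 can be illustrated with a toy model.
This is a Python sketch of the URI that reaches the upstream, not the
nginx snippet itself; the host upstream.example, the /proxy/arxiv
prefix, and both function names are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit


def upstream_uri(expanded_proxy_pass, current_uri):
    """Model proxy_pass when the directive contains a variable.

    If the expanded directive carries a URI part, that part is sent
    as-is and replaces the request URI. Otherwise the current
    (possibly rewritten) request URI is sent.
    """
    uri_part = urlsplit(expanded_proxy_pass).path
    return uri_part if uri_part else current_uri


def strip_prefix(request_uri, prefix):
    """Model a prefix-strip rewrite such as ^/proxy/arxiv/(.*)$ -> /$1."""
    return re.sub(r"^" + re.escape(prefix) + r"/(.*)$", r"/\1", request_uri)


# Buggy form: a trailing "/" on the directive wins, so the homepage is fetched.
buggy = upstream_uri("https://upstream.example/", "/proxy/arxiv/api/query")

# Fixed form: no URI part on the directive, prefix stripped by a rewrite first.
fixed = upstream_uri(
    "https://upstream.example",
    strip_prefix("/proxy/arxiv/api/query", "/proxy/arxiv"),
)
```

In this model, buggy is "/" and fixed is "/api/query", which matches
the symptom described above (every request landing on the homepage).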
Client-side improvements:
- arXiv match covers old-style IDs (cs/9901002, math.GT/0309136,
cond-mat/...v1) alongside new-style, and .pdf-suffixed /pdf/ URLs
(regex verified against six forms)
- Wikipedia popups show the article's lead image: pageimages rides
along the existing extracts call (pithumbsize=320), rendered via a
new https-only image slot in renderPopup with float styling;
upload.wikimedia.org added to the CSP's img-src
- pdf-thumbs now walks all of static/ (pdfjs pruned), so /cv.pdf and
/resume.pdf — the most-linked internal PDFs, previously thumbnail-less
and therefore popup-less — get hover previews
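
For reference, a minimal sketch of an arXiv matcher covering the ID
forms listed above. It is written in Python for illustration and is
not the regex in popups.js; the pattern, the arxiv_id helper, and the
new-style example ID 2301.12345 are assumptions.

```python
import re

# Hypothetical pattern: /abs/ or /pdf/ URLs, old- or new-style IDs,
# optional version suffix, optional .pdf suffix.
ARXIV_URL = re.compile(
    r"""^https?://(?:www\.)?arxiv\.org/(?:abs|pdf)/
        (?P<id>
            \d{4}\.\d{4,5}                  # new-style, e.g. 2301.12345
          | [a-z-]+(?:\.[A-Z]{2})?/\d{7}    # old-style, e.g. math.GT/0309136
        )
        (?P<version>v\d+)?
        (?:\.pdf)?$""",
    re.VERBOSE,
)


def arxiv_id(url):
    """Return the arXiv ID (with version, if present) or None."""
    m = ARXIV_URL.match(url)
    if m is None:
        return None
    return m.group("id") + (m.group("version") or "")


examples = [
    "https://arxiv.org/abs/2301.12345",
    "https://arxiv.org/abs/cs/9901002",
    "https://arxiv.org/abs/math.GT/0309136",
    "https://arxiv.org/abs/cs/9901002v1",
    "https://arxiv.org/pdf/2301.12345v2.pdf",
]
ids = [arxiv_id(u) for u in examples]
```

The .pdf suffix is consumed outside the id group, so a /pdf/ URL and
its /abs/ counterpart resolve to the same ID.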
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- embed.py: pin nomic's auto_map modeling repo via code_revision —
revision= alone left nomic-bert-2048 unpinned under
trust_remote_code (AUDIT §1.3; verified loadable with
HF_HUB_OFFLINE=1). Catch BadZipFile/EOFError when loading the page
cache so a half-written npz is discarded, not fatal (§4.2), and
unlink the tmp file on a failed save (§4.1)
- nginx: collapse the CSP to one physical line — nginx has no line
continuation in quoted strings, so the old value embedded literal
backslash+LF bytes, illegal in HTTP/2 (§8.1). Add the externals the
site actually uses: KaTeX webfonts + onnxruntime wasm via jsdelivr,
and the popup provider APIs popups.js documents (§8.2)
- Makefile: pathspec-limit the auto-commit to content/ so pre-staged
unrelated work is no longer swept into auto: commits (§8.3)
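
The cache handling in the embed.py item (§4.1, §4.2) follows a common
pattern, sketched here with the standard library only. A plain zip
archive holding JSON stands in for the npz file, since an npz is a zip
container; load_cache, save_cache, and the pages.json member name are
hypothetical and do not come from embed.py.

```python
import json
import os
import tempfile
import zipfile


def load_cache(path):
    """Return the cached dict, or None if the file is missing or corrupt."""
    try:
        with zipfile.ZipFile(path) as zf:
            return json.loads(zf.read("pages.json"))
    except FileNotFoundError:
        return None
    except (zipfile.BadZipFile, EOFError):
        # A half-written archive is ignored; the caller rebuilds the cache.
        return None


def save_cache(path, pages):
    """Write to a temp file in the target directory, then rename into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        with zipfile.ZipFile(tmp, "w") as zf:
            zf.writestr("pages.json", json.dumps(pages))
        os.replace(tmp, path)  # atomic when both paths share a filesystem
    except BaseException:
        # Do not leave the temp file behind on a failed save.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Writing to a sibling temp file and renaming means a reader sees either
the old cache or the new one; the except clause in load_cache covers
files left truncated by other means, such as an interrupted copy.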
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Add nginx/security-headers.conf — server_tokens off, HSTS (1y +
preload), X-Content-Type-Options, X-Frame-Options DENY,
Referrer-Policy, Permissions-Policy, and a usage-scoped CSP. CSP
ships in Report-Only mode; promote to enforcing once the report
stream is clean for a week. CSP allowlists are derived from actual
usage (cdn.jsdelivr.net for KaTeX/Vega, *.basemaps.cartocdn.com for
Leaflet tiles); 'unsafe-inline' and 'unsafe-eval' are documented
inline.
- Add nginx/vhost.conf.example — reference vhost showing the canonical
include order. The live vhost on the VPS remains the source of
truth; this file documents the structure so the VPS config can be
reproduced or audited from the repo.
- Shorten unfingerprinted CSS/JS cache from 24h to 1h. Bug fixes ship
to warm clients within an hour; if assets are ever fingerprinted,
this can move to immutable.
- Refresh README repo layout — add nginx/ entry, drop stale paper/
and spec.md references that never existed in the working tree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>