3 Commits

Author SHA1 Message Date
Levi Neuwirth 23250d8782 Fix popup previews: proxy prefix-strip bug, arXiv IDs, Wikipedia images
The root cause of 'PDF/arXiv previews simply do not work' was twofold:

1. nginx/popup-proxy.conf was never installed on the VPS — every
   /proxy/* request (arXiv, PubMed, Internet Archive) returned nginx's
   default 404. Now installed (snippets + http{}-context cache/limit
   zones in conf.d, included in the vhost, nginx -t verified, reloaded).
2. The snippet itself had a latent bug that only surfaced once
   installed: when proxy_pass uses a variable upstream, any URI part
   on the directive is sent as-is and replaces the request URI, so
   every request hit the upstream's homepage (archive.org HTML where
   JSON was expected, arXiv 429s, NCBI doc-page redirects). Fixed with
   explicit prefix-strip rewrites; bad cached responses purged. All
   three proxies verified returning real data, including a live arXiv
   title resolve.
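
The prefix-strip bug in item 2 can be illustrated with a small Python
model of how nginx picks the path it forwards. This is a sketch of the
documented proxy_pass behaviour, not the snippet from the repo: the
/proxy/arxiv/ location, the /api/query path, and the function name are
hypothetical, and query strings are ignored.

```python
import re
from typing import Optional, Tuple


def upstream_uri(
    request_uri: str,
    location: str,
    proxy_pass_uri: Optional[str],
    variable_upstream: bool,
    rewrite_break: Optional[Tuple[str, str]] = None,
) -> str:
    """Model which path nginx forwards to the upstream (query strings ignored)."""
    if rewrite_break is not None:
        pattern, replacement = rewrite_break
        rewritten, count = re.subn(pattern, replacement, request_uri, count=1)
        if count:
            # After `rewrite ... break`, the rewritten URI is forwarded and
            # any URI part on proxy_pass is ignored.
            return rewritten
    if proxy_pass_uri is None:
        # No URI part on proxy_pass: the request URI is forwarded unchanged.
        return request_uri
    if variable_upstream:
        # Variable upstream: the URI part is sent as-is, replacing the request URI.
        return proxy_pass_uri
    # Static upstream: the matched location prefix is swapped for the URI part.
    return proxy_pass_uri + request_uri[len(location):]


REQUEST = "/proxy/arxiv/api/query"  # hypothetical request path
LOCATION = "/proxy/arxiv/"          # hypothetical location prefix

static = upstream_uri(REQUEST, LOCATION, "/", variable_upstream=False)
buggy = upstream_uri(REQUEST, LOCATION, "/", variable_upstream=True)
fixed = upstream_uri(
    REQUEST, LOCATION, None, variable_upstream=True,
    rewrite_break=(r"^/proxy/arxiv/(.*)$", r"/\1"),  # Python-style backreference
)
print(static, buggy, fixed)  # /api/query / /api/query
```

The middle value is the failure mode described above: every request
collapses to the upstream's root path until an explicit rewrite strips
the prefix.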

Client-side improvements:
- arXiv match covers old-style IDs (cs/9901002, math.GT/0309136,
  cond-mat/...v1) alongside new-style, and .pdf-suffixed /pdf/ URLs
  (regex verified against six forms)
- Wikipedia popups show the article's lead image: pageimages rides
  along the existing extracts call (pithumbsize=320), rendered via a
  new https-only image slot in renderPopup with float styling;
  upload.wikimedia.org added to the CSP's img-src
- pdf-thumbs now walks all of static/ (pdfjs pruned), so /cv.pdf and
  /resume.pdf — the most-linked internal PDFs, previously thumbnail-less
  and therefore popup-less — get hover previews
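
As a rough illustration of the arXiv matching described in the first
bullet, the sketch below accepts new-style and old-style IDs on both
/abs/ and /pdf/ URLs, with an optional version and .pdf suffix. It is
an assumption-laden Python stand-in, not the client-side regex from
the commit; ARXIV_RE and arxiv_id are hypothetical names.

```python
import re
from typing import Optional

ARXIV_RE = re.compile(
    r"""^https?://(?:www\.)?arxiv\.org/(?:abs|pdf)/
        (?P<id>
            \d{4}\.\d{4,5}                       # new-style: YYMM.NNNN or YYMM.NNNNN
          | [a-z][a-z-]*(?:\.[A-Z]{2})?/\d{7}    # old-style: archive[.SC]/YYMMNNN
        )
        (?P<version>v\d+)?                       # optional version suffix
        (?:\.pdf)?                               # optional .pdf on /pdf/ links
        (?:[?\#].*)?$                            # optional query string or fragment
    """,
    re.VERBOSE,
)


def arxiv_id(url: str) -> Optional[str]:
    """Return the arXiv ID (with version, if present) or None if the URL does not match."""
    match = ARXIV_RE.match(url)
    if match is None:
        return None
    return match.group("id") + (match.group("version") or "")


print(arxiv_id("https://arxiv.org/abs/cs/9901002"))       # cs/9901002
print(arxiv_id("https://arxiv.org/abs/math.GT/0309136"))  # math.GT/0309136
print(arxiv_id("https://example.com/abs/cs/9901002"))     # None
```

Hyphenated archives such as cond-mat are covered by the `[a-z-]*`
part of the old-style branch.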

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:06:13 -04:00
Levi Neuwirth f11495ff9a Fix audit tooling/infra findings
- embed.py: pin nomic's auto_map modeling repo via code_revision —
  revision= alone left nomic-bert-2048 unpinned under
  trust_remote_code (AUDIT §1.3; verified loadable with
  HF_HUB_OFFLINE=1). Catch BadZipFile/EOFError when loading the page
  cache so a half-written npz is discarded, not fatal (§4.2), and
  unlink the tmp file on a failed save (§4.1)
- nginx: collapse the CSP to one physical line — nginx has no line
  continuation in quoted strings, so the old value embedded literal
  backslash+LF bytes, illegal in HTTP/2 (§8.1). Add the externals the
  site actually uses: KaTeX webfonts + onnxruntime wasm via jsdelivr,
  and the popup provider APIs popups.js documents (§8.2)
- Makefile: pathspec-limit the auto-commit to content/ so pre-staged
  unrelated work is no longer swept into auto: commits (§8.3)
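
The cache handling in the embed.py bullet (discard a half-written
archive on load, unlink the tmp file on a failed save) might look
roughly like the sketch below. It is not the code from embed.py: it
uses the stdlib zipfile module in place of numpy's npz helpers (an npz
file is a zip container), and load_cache / save_cache are hypothetical
names.

```python
import os
import tempfile
import zipfile
from typing import Dict


def load_cache(path: str) -> Dict[str, bytes]:
    """Return cached entries, or an empty cache if the file is missing or corrupt."""
    try:
        with zipfile.ZipFile(path) as zf:
            return {name: zf.read(name) for name in zf.namelist()}
    except FileNotFoundError:
        return {}
    except (zipfile.BadZipFile, EOFError):
        # A truncated or half-written archive is treated as "no cache"
        # so the caller rebuilds it instead of crashing.
        return {}


def save_cache(path: str, entries: Dict[str, bytes]) -> None:
    """Write to a tmp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        with zipfile.ZipFile(tmp, "w") as zf:
            for name, data in entries.items():
                zf.writestr(name, data)
        os.replace(tmp, path)  # the final path only ever holds a complete archive
    except BaseException:
        # Failed save: remove the tmp file so it does not linger, then re-raise.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Writing to a sibling tmp file and renaming keeps a reader from ever
seeing a partial archive under the final name; the except branch in
load_cache covers files left over from before that guarantee existed.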

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 09:21:47 -04:00
Levi Neuwirth 87819501a5 nginx: ship security baseline, reference vhost, and tighter cache
- Add nginx/security-headers.conf — server_tokens off, HSTS (1y +
  preload), X-Content-Type-Options, X-Frame-Options DENY,
  Referrer-Policy, Permissions-Policy, and a usage-scoped CSP. CSP
  ships in Report-Only mode; promote to enforcing once the report
  stream is clean for a week. CSP allowlists are derived from actual
  usage (cdn.jsdelivr.net for KaTeX/Vega, *.basemaps.cartocdn.com for
  Leaflet tiles); 'unsafe-inline' and 'unsafe-eval' are documented
  inline.
- Add nginx/vhost.conf.example — reference vhost showing the canonical
  include order. The live vhost on the VPS remains the source of
  truth; this file documents the structure so the VPS config can be
  reproduced or audited from the repo.
- Shorten unfingerprinted CSS/JS cache from 24h to 1h. Bug fixes ship
  to warm clients within an hour; if assets are ever fingerprinted,
  this can move to immutable.
- Refresh README repo layout — add nginx/ entry, drop stale paper/
  and spec.md references that never existed in the working tree.
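
One way to keep a usage-scoped CSP auditable is to assemble it from an
allowlist and emit it as a single header line. The sketch below is
illustrative only: the mapping of hosts to directives is an assumption
(the commit names the hosts, not the directives), and build_csp and
add_header_line are hypothetical helpers, not part of the repo.

```python
from typing import Dict, List

# Assumed directive layout; only the two hostnames come from the commit message.
POLICY: Dict[str, List[str]] = {
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://cdn.jsdelivr.net"],
    "img-src": ["'self'", "https://*.basemaps.cartocdn.com"],
}


def build_csp(directives: Dict[str, List[str]]) -> str:
    """Join directives into one physical line, rejecting control characters."""
    value = "; ".join(" ".join([name, *sources]) for name, sources in directives.items())
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in value):
        raise ValueError("CSP value must be a single line without control characters")
    return value


def add_header_line(value: str, report_only: bool = True) -> str:
    """Render an nginx add_header directive for the policy."""
    name = "Content-Security-Policy-Report-Only" if report_only else "Content-Security-Policy"
    return f'add_header {name} "{value}" always;'


csp = build_csp(POLICY)
print(add_header_line(csp))
```

Switching report_only to False corresponds to promoting the policy
from Report-Only to enforcing.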

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:08:03 -04:00