With the Paper A sprint starting ahead of the public release,
reserving the PyPI name now avoids the risk of a name-squatter
grabbing it once the project surfaces on arXiv or JOSS. The
0.1.0.dev0 version is a PEP 440 dev release — pip install neuropose
won't resolve to it without --pre, so no accidental installs until
0.1.0 final.
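The PEP 440 semantics can be illustrated with the packaging library (which implements the spec pip follows); this is a sketch of the behaviour, not project code:

```python
from packaging.version import Version

# PEP 440: a .dev segment marks a developmental release. Installers
# treat dev releases as pre-releases, so a plain `pip install` skips
# them unless --pre is passed or the version is pinned explicitly.
v = Version("0.1.0.dev0")
print(v.is_devrelease)        # dev segment present
print(v.is_prerelease)        # dev releases count as pre-releases
print(v < Version("0.1.0"))   # sorts before the final release
```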
PYPI_README.md is a minimal PyPI-specific project page: a
name-reservation notice, a pointer to the source URL, and the
author list. Kept as a
separate file from the contributor-focused README so the two can
evolve independently — README.md speaks to someone cloning the
repo, PYPI_README.md speaks to someone landing on the PyPI page.
RESEARCH.md and TECHNICAL.md are living R&D / engineering roadmap
notes — pre-meeting drafts, speculative directions, and in-progress
thinking that should evolve freely without public-repo concerns.
Same for docs/research/, a new directory for pre-meeting scoping
artifacts (e.g. the MoCap data-needs spec being drafted for the
upcoming conversation with Dr. Shu).
Files stay on disk in every checkout — the gitignore just stops
them from entering the index. Anything that graduates to a
user-facing artifact moves into docs/ (which is tracked and feeds
mkdocs) rather than these files.
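A sketch of the corresponding .gitignore entries, using the file and directory names above (exact patterns in the repo may differ):

```
# Living R&D notes — stay on disk in every checkout, never enter the index
RESEARCH.md
TECHNICAL.md
docs/research/
```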
Three reference configs under examples/analysis/:
- minimal.yaml: full-trial DTW on raw coordinates, no alignment or
segmentation. Smallest working example; a starting template.
- paper_c_headline.yaml: the representative Paper C pipeline.
Bilateral gait-cycle segmentation, per-sequence Procrustes, and
joint-angle DTW on knee and hip flexion triplets.
- per_joint_debug.yaml: per-joint DTW breakdown for diagnosing
which joint drives an unexpected distance.
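A hypothetical sketch of the shape minimal.yaml could take, inferred from the schema fields described elsewhere in these notes (key names are illustrative, not verbatim from the file):

```yaml
# Full-trial DTW on raw coordinates: no segmentation, no alignment.
inputs:
  primary: predictions/subject_a.json
  reference: predictions/subject_b.json
analysis:
  kind: dtw
  representation: coords   # raw coordinates, the 0.1 behaviour
  align: none
```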
tests/integration/test_analyze_examples.py exercises each example
twice: load_config must accept the YAML (catches drift between the
examples and the current schema), and run_analysis must execute the
config end-to-end against synthetic predictions (catches drift
between the examples and the executor). The Paper C example has an
extra guard verifying the knee-flexion triplets haven't been edited
to something unexpected.
Also wires docs/api/pipeline.md into the mkdocs nav so mkdocstrings
surfaces the full schema and executor API.
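The nav wiring is roughly this shape in mkdocs.yml (surrounding entries omitted, exact nesting may differ):

```yaml
nav:
  - API:
      - Pipeline: api/pipeline.md   # rendered by mkdocstrings
```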
Replaces the placeholder stub that returned EXIT_PENDING with a real
analyze --config <yaml> [--output <json>] command. Loads the YAML,
validates through AnalysisConfig (so typos fail with a clear
ValidationError before any predictions load), runs the pipeline,
and writes the AnalysisReport atomically.
Surfaces YAML parse errors and schema violations as EXIT_USAGE=2
with messages pointing at the offending file. Missing predictions
files during execution also surface as EXIT_USAGE rather than a
bare traceback.
Prints a one-line summary after the run: segmentation counts, the
analysis kind, and — for DTW — the per-segment distance count and
mean. --output / -o overrides the report path declared in the
config, useful when sweeping a single config over multiple input
pairs from a shell loop.
run_analysis(config) loads the predictions files named in the config,
applies the configured segmentation, dispatches to the selected
analysis kind (DTW, stats, or none), and emits a fully populated
AnalysisReport. The report's Provenance inherits the inference-time
envelope from the primary input with analysis_config stamped in, so
the output is self-describing even if the source YAML is later lost.
For DTW runs with segmentation, segments are paired one-to-one by
index across primary and reference, truncating to the minimum of the two
counts. Bilateral segmentations emit per-side distances under
"left_heel_strikes[i]" / "right_heel_strikes[i]" labels. dtw_per_joint
stores its full per-unit breakdown in the per_joint_distances field
and reports the sum as the representative scalar distance.
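The index pairing, truncation, and labelling described above can be sketched in a few lines (segments here are plain (start, end) frame tuples and the helper is illustrative; the real code operates on richer objects):

```python
def pair_segments(primary_segments, reference_segments, side="heel_strikes"):
    """Pair segments one-to-one by index, truncating to the shorter
    list, and label each pair the way the report does: "<side>[i]"."""
    n = min(len(primary_segments), len(reference_segments))
    return [
        (f"{side}[{i}]", primary_segments[i], reference_segments[i])
        for i in range(n)
    ]

# Three primary cycles against two reference cycles: two pairs survive.
pairs = pair_segments(
    [(0, 40), (40, 78), (78, 120)],
    [(0, 45), (45, 90)],
    side="left_heel_strikes",
)
```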
Also ships load_config (YAML), save_report (atomic JSON write), and
load_report (rehydrate via the migration chain) so the executor can
be driven end-to-end from Python without the CLI. The CLI wiring
lands in the next commit.
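An atomic JSON write of the kind save_report performs can be sketched with a temp file plus os.replace (a generic helper, not the real implementation):

```python
import json
import os
import tempfile

def save_json_atomic(payload, path):
    """Write payload to path atomically: serialise into a temp file in
    the same directory, then os.replace() it into place. Readers never
    see a half-written report, even if the process dies mid-write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh)
        os.replace(tmp_path, path)  # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp_path)         # clean up the partial temp file
        raise
```

The temp file must live in the same directory as the target so the final rename stays within one filesystem.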
neuropose.analyzer.pipeline ships two top-level pydantic schemas:
AnalysisConfig — what a user writes in YAML. Inputs (primary plus
optional reference), preprocessing (person_index, room to grow),
optional segmentation as a discriminated union of gait_cycles,
gait_cycles_bilateral, and extractor, and a required analysis stage
as a discriminated union of dtw, stats, none.
AnalysisReport — runtime output with config, Provenance envelope,
per-input summaries, produced segmentations, and a results payload
whose shape mirrors the stage (DtwResults, StatsResults, NoResults).
schema_version defaults to CURRENT_VERSION.
Cross-field invariants enforced at parse time via model_validator:
method='dtw_relation' requires joint_i/joint_j and refuses
representation='angles'; representation='angles' requires non-empty
angle_triplets; analysis.kind='dtw' requires inputs.reference;
analysis.kind='stats' refuses a reference. Typos fail in
milliseconds instead of after a multi-minute predictions load.
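The cross-field invariant pattern looks roughly like this in pydantic v2 (a trimmed-down sketch with an invented subset of fields, not the real schema):

```python
from typing import Literal, Optional
from pydantic import BaseModel, model_validator

class Inputs(BaseModel):
    primary: str
    reference: Optional[str] = None

class DtwStage(BaseModel):
    kind: Literal["dtw"] = "dtw"
    representation: Literal["coords", "angles"] = "coords"
    angle_triplets: list[tuple[str, str, str]] = []

    @model_validator(mode="after")
    def _check_angles(self):
        # Cross-field invariant enforced at parse time.
        if self.representation == "angles" and not self.angle_triplets:
            raise ValueError("representation='angles' requires angle_triplets")
        return self

class AnalysisConfig(BaseModel):
    inputs: Inputs
    analysis: DtwStage

    @model_validator(mode="after")
    def _check_reference(self):
        if self.analysis.kind == "dtw" and self.inputs.reference is None:
            raise ValueError("analysis.kind='dtw' requires inputs.reference")
        return self
```

A bad config raises ValidationError at construction, before any predictions file is opened.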
neuropose.migrations gains a third registry for AnalysisReport
(_ANALYSIS_REPORT_MIGRATIONS + register_analysis_report_migration +
migrate_analysis_report), ready for future schema changes. No v1→v2
migration is registered because AnalysisReport first shipped at v2.
Execution, CLI wiring, and example configs land in follow-up commits.
representation: Literal["coords", "angles"] on dtw_all and
dtw_per_joint. "coords" preserves the 0.1 behaviour; "angles" runs
extract_joint_angles on caller-supplied angle_triplets before DTW,
giving translation-, rotation-, and scale-invariant distances that
are directly interpretable as clinical joint-range comparisons. Under
dtw_per_joint the "unit" becomes one angle column per triplet.
nan_policy: Literal["propagate", "interpolate", "drop"] on all three
entry points. "propagate" (default) lets NaN hit fastdtw, which raises
ValueError via numpy.asarray_chkfinite — the safest default because it
surfaces degenerate-vector problems rather than silently corrupting a
distance. "interpolate" runs 1D linear interpolation per feature
column; "drop" removes NaN frames before DTW.
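The "interpolate" and "drop" policies can be sketched with plain numpy (per-feature-column linear interpolation over the frame index; a hypothetical helper, not the shipped implementation):

```python
import numpy as np

def apply_nan_policy(frames, policy="propagate"):
    """frames: (n_frames, n_features) array, possibly containing NaN."""
    if policy == "propagate":
        return frames  # downstream DTW raises on non-finite input
    if policy == "drop":
        # Remove any frame with at least one NaN feature.
        return frames[~np.isnan(frames).any(axis=1)]
    if policy == "interpolate":
        out = frames.copy()
        idx = np.arange(len(frames))
        for col in range(frames.shape[1]):
            bad = np.isnan(out[:, col])
            if bad.any() and not bad.all():
                # Linear interpolation over frame index, per column.
                out[bad, col] = np.interp(idx[bad], idx[~bad], out[~bad, col])
        return out
    raise ValueError(f"unknown nan_policy: {policy}")
```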
dtw_relation stays a standalone convenience entry point. Paper C's
typical call becomes dtw_all(representation="angles",
align="procrustes_per_sequence"); see TECHNICAL.md Phase 0.
segment_gait_cycles wraps segment_predictions with a joint_axis
extractor and gait-appropriate defaults (joint="rhee", axis="y",
min_cycle_seconds=0.4). The joint name resolves through
joint_index; axis is an "x"/"y"/"z" string literal converted to the
numeric index internally. An invert flag flips peaks and valleys
for recording conventions where a heel-strike appears as a local
minimum.
segment_gait_cycles_bilateral composes the single-side function
twice and returns a {"left_heel_strikes", "right_heel_strikes"}
dict shape-compatible with VideoPredictions.segmentations, so the
caller can merge it directly into a predictions object.
Pathological gaits (shuffling, walker-assisted) degrade to an
empty segments list rather than raising, inherited from
segment_by_peaks' peak-not-found behaviour.
Closes the gait-cycle segmentation item in TECHNICAL.md Phase 0.
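A minimal sketch of the underlying idea in pure numpy: detect heel-strike peaks in a vertical heel trace and cut cycles between consecutive strikes. Simple local-maximum detection stands in for segment_by_peaks, and the helper names are invented:

```python
import numpy as np

def simple_peaks(signal, min_distance):
    """Indices that beat both neighbours, thinned so kept peaks are at
    least min_distance frames apart."""
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
    ]
    peaks = []
    for i in candidates:
        if not peaks or i - peaks[-1] >= min_distance:
            peaks.append(i)
    return peaks

def cycles_from_heel_trace(heel_y, fps=30.0, min_cycle_seconds=0.4, invert=False):
    """(start, end) frame pairs between consecutive heel-strike peaks.
    invert=True handles conventions where a strike is a local minimum.
    Fewer than two detected strikes yields an empty list, not an error."""
    signal = -np.asarray(heel_y) if invert else np.asarray(heel_y)
    strikes = simple_peaks(signal, min_distance=int(min_cycle_seconds * fps))
    return list(zip(strikes, strikes[1:]))
```

The empty-list degradation on flat or pathological traces mirrors the behaviour described above for shuffling and walker-assisted gaits.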
Pure formatter drift picked up after running ruff format across the
tree. No behavioural changes: line-length unwrapping where strings
fit on one line, plus one blank-line separator added to _model.py.
Three-layer module: find_neuropose_processes() scans the process
table via psutil for running watch/serve instances; terminate_processes()
SIGINTs with a configurable grace period before optional SIGKILL
escalation; wipe_state() clears $data_dir/in/, out/, failed/,
the .neuropose.lock file, and leftover .ingest_<uuid>/ staging dirs
while preserving the container directories themselves. reset_pipeline()
composes the three and refuses to wipe while any process survives
termination.
CLI wraps it with --yes/-y, --keep-failed, --force-kill,
--grace-seconds, and --dry-run/-n. Always prints a preview before
prompting; returns EXIT_USAGE=2 when survivors block the wipe.
Unblocks the Mac benchmark iteration loop where partially-complete
runs need to be cleared between experiments.
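The wipe step's contract, clearing contents while preserving the container directories, can be sketched with pathlib (directory and file names taken from the description above; the helper itself is illustrative):

```python
import shutil
from pathlib import Path

def wipe_state(data_dir, keep_failed=False):
    """Remove pipeline state under data_dir while keeping the container
    directories (in/, out/, failed/) themselves in place."""
    data_dir = Path(data_dir)
    subdirs = ["in", "out"] + ([] if keep_failed else ["failed"])
    for name in subdirs:
        container = data_dir / name
        if container.is_dir():
            for entry in container.iterdir():
                if entry.is_dir():
                    shutil.rmtree(entry)
                else:
                    entry.unlink()
    # Stale lock file and leftover staging directories.
    (data_dir / ".neuropose.lock").unlink(missing_ok=True)
    for staging in data_dir.glob(".ingest_*"):
        shutil.rmtree(staging, ignore_errors=True)
```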
procrustes_align in neuropose.analyzer.features — Kabsch closed-form
rigid alignment between two pose sequences, with per_frame and
per_sequence modes and an optional scale flag for cross-subject
comparisons. Returns aligned arrays plus an AlignmentDiagnostics
dataclass reporting rotation magnitude (mean and max), translation
magnitude (mean and max), and scale factor, so downstream code can
flag suspiciously large transforms.
Wired into every DTW entry point via a new keyword-only align
parameter — "none" (the default) preserves the 0.1 raw-coordinate
behaviour, while "procrustes_per_frame" and "procrustes_per_sequence"
route inputs through procrustes_align before DTW runs. Rejects
mismatched frame counts when alignment is requested (Procrustes
requires a 1:1 correspondence).
Phase 0 of TECHNICAL.md: closes one of the three methodological
gaps Paper C's pipeline is waiting on.
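The per-frame case reduces to the classic Kabsch construction, sketched here with numpy's SVD (scale and diagnostics left out for brevity; this is the textbook algorithm, not the shipped function):

```python
import numpy as np

def kabsch_align(source, target):
    """Rigidly align one frame of joints (n_joints, 3) onto another:
    centre both clouds, find the least-squares rotation via SVD
    (Kabsch), and return the transformed source."""
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    src = source - src_centroid
    tgt = target - tgt_centroid
    h = src.T @ tgt                          # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return src @ rotation.T + tgt_centroid
```

For noise-free rigid transforms the recovery is exact, which makes the rotation- and translation-magnitude diagnostics a cheap sanity check on real data.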
Captures the MeTRAbs SHA-256 and filename plus tensorflow /
tensorflow-metal / numpy / neuropose / python versions, and reserves
slots for seed, deterministic, and analysis_config. Populated
automatically by Estimator.process_video when the model was loaded via
load_model; propagates into JobResults and BenchmarkResult via the
existing output path. The envelope stays None on the injected-model
test path, where no SHA is known.
_model.load_metrabs_model now returns a LoadedModel dataclass so the
estimator can bundle the TF handle with the pinned SHA without
re-hashing the tarball on every daemon startup. All test fakes and
the integration smoke tests updated to unwrap .model.
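The capture step amounts to a streamed file hash plus interpreter and package versions; a minimal sketch with stdlib tools (field names are illustrative, and the real envelope records far more):

```python
import hashlib
import platform

def file_sha256(path, chunk_size=1 << 20):
    """Stream the file through hashlib so large model tarballs never
    need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_provenance(model_path):
    """Minimal provenance envelope; the real one also records package
    versions (tensorflow, numpy, ...) and reserves seed / deterministic
    / analysis_config slots."""
    return {
        "model_sha256": file_sha256(model_path),
        "model_filename": str(model_path).rsplit("/", 1)[-1],
        "python": platform.python_version(),
    }
```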
Bumps the optional schema_version field on VideoPredictions and
BenchmarkResult to default=CURRENT_VERSION so fresh writes stamp the
latest version; legacy payloads without it are migrated on load via
the chain registered in the previous commit.
One shared CURRENT_VERSION across the three top-level serialised
payloads (VideoPredictions, JobResults, BenchmarkResult), with
per-schema registries populated via register_*_migration(from_version)
decorators. FutureSchemaError and MigrationNotFoundError surface bad
chains clearly. CURRENT_VERSION=2 with v1→v2 migrations registered
that add an optional provenance field to the payload dicts.
Tested standalone; io.py is wired through the migrator in a follow-up
commit that introduces the Provenance schema those migrations target.
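The registry-plus-decorator pattern can be sketched generically (names simplified to one registry; the real module keeps one per payload type):

```python
CURRENT_VERSION = 2
_MIGRATIONS = {}  # from_version -> function raising the payload one version

class MigrationNotFoundError(KeyError):
    pass

class FutureSchemaError(ValueError):
    pass

def register_migration(from_version):
    def decorator(fn):
        _MIGRATIONS[from_version] = fn
        return fn
    return decorator

def migrate(payload):
    """Walk the chain one step at a time until the payload is current."""
    version = payload.get("schema_version", 1)
    if version > CURRENT_VERSION:
        raise FutureSchemaError(
            f"payload is v{version}, newest known is v{CURRENT_VERSION}"
        )
    while version < CURRENT_VERSION:
        if version not in _MIGRATIONS:
            raise MigrationNotFoundError(version)
        payload = _MIGRATIONS[version](payload)
        version = payload["schema_version"]
    return payload

@register_migration(1)
def _v1_to_v2(payload):
    # v1 payloads predate provenance capture; add the optional field.
    return {**payload, "schema_version": 2, "provenance": None}
```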
Phase 0 (C-enabling pipeline work) → Phase 1 (Paper C clinical
validation) → Phase 2 (open-source release + Paper A), with Track 2
(clinical platform) as a contingent side track. Mirrors RESEARCH.md but
for engineering scope rather than methodology.