# Changelog
All notable changes to NeuroPose are recorded in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
## Unreleased
This section covers the ground-up rewrite of NeuroPose. The entries below describe the difference between the previous internal prototype and the state of the repository at the first tagged release, and will be split into per-release sections once tagging begins.
### Added
#### Package structure and tooling
- `src/neuropose/` package layout with a `py.typed` marker, MIT `LICENSE`, policy-enforcing `.gitignore`, pinned Python 3.11 (`.python-version`), and `pyproject.toml` with full project metadata, classifiers, and URL pointers. The runtime TensorFlow dependency is pinned to `tensorflow>=2.16,<2.19` — see Changed below for the rationale. `psutil>=5.9` is a runtime dependency used by the estimator's always-on `PerformanceMetrics` collection to sample peak RSS.
- `[project.optional-dependencies].analysis` extra for fastdtw, scipy, scikit-learn, and sktime — install via `pip install neuropose[analysis]`.
- `[project.optional-dependencies].metal` extra pulling `tensorflow-metal>=1.2,<2` under `sys_platform == 'darwin' and platform_machine == 'arm64'` environment markers. Opt-in only via `pip install 'neuropose[metal]'` or `uv sync --extra metal`; silently a no-op on every non-Apple-Silicon platform. The Metal path is not exercised in CI and is documented as experimental in `docs/getting-started.md` — users enabling it are expected to spot-check numerics against the CPU path before trusting results downstream.
- `[dependency-groups].dev` (PEP 735) with the full dev + docs + analyzer toolchain: pytest, pytest-cov, ruff, pyright, pre-commit, mkdocs-material, mkdocstrings, fastdtw, and scipy. `uv sync --group dev` gives contributors everything needed to run the whole suite.
- `AUTHORS.md`, `CITATION.cff` (with a MeTRAbs upstream `references:` entry), and an MIT-licensed `LICENSE` with an explicit MeTRAbs attribution paragraph.
- Pre-commit configuration (`.pre-commit-config.yaml`) running ruff, ruff-format, gitleaks (secret scanning), a 500 KB-limit large-files hook, end-of-file fixers, trailing-whitespace fixers, and YAML/TOML/JSON validators. Pyright is deliberately not in pre-commit — it runs in CI only, so pre-commit stays fast.
- Ruff configuration in `pyproject.toml` with a deliberately broad rule selection (pycodestyle, pyflakes, isort, bugbear, pyupgrade, simplify, ruff-specific, pep8-naming, comprehensions, pathlib, pytest-style, tidy-imports, numpy-specific, pydocstyle with numpy convention). Per-file ignores for tests and private modules.
- Pyright configuration in `standard` mode (not `strict` — TF/OpenCV stubs would otherwise drown the signal). Unknown-type reports are explicitly silenced until the TensorFlow version pin is settled.
- Pytest configuration with strict markers, an opt-in `slow` marker, and a `--runslow` CLI flag implemented in `tests/conftest.py::pytest_collection_modifyitems` so integration tests stay out of the default run.
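The `--runslow` opt-in described above follows the standard pytest pattern; a minimal sketch (the real `tests/conftest.py` may differ in detail):

```python
# Sketch of the --runslow opt-in gate for @pytest.mark.slow tests.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--runslow", action="store_true", default=False,
        help="also run tests marked @pytest.mark.slow",
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return  # flag given: leave slow tests in the run
    skip_slow = pytest.mark.skip(reason="needs --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

With this in place, `pytest` skips every `slow`-marked test by default and `pytest --runslow` runs the full suite.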
#### CI / infrastructure
- GitHub Actions workflow `.github/workflows/ci.yml` running three parallel jobs — lint (ruff), typecheck (pyright), and test (pytest) — on every push and PR to `main`. Uses `uv` with a pinned version (0.9.16) and cache-enabled setup for fast reruns. Concurrency control cancels superseded runs on the same branch.
- GitHub Actions workflow `.github/workflows/docs.yml` that builds the mkdocs-material site on every relevant push and uploads the rendered site as a 14-day workflow artifact. GitHub Pages deployment is intentionally not wired up yet; the workflow header comment describes what to add when the repo flips public.
#### Runtime modules
- `neuropose.config` — `Settings` class built on pydantic-settings. Field-level validation for `device`, `poll_interval_seconds`, and `default_fov_degrees`; explicit `from_yaml()` classmethod (no implicit config-file discovery); XDG defaults for `data_dir` and `model_cache_dir` (`~/.local/share/neuropose/…`) so runtime data never lives inside the repository; `ensure_dirs()` as an explicit method so construction remains filesystem-side-effect-free.
- `neuropose.io` — validated prediction schemas: `FramePrediction` (frozen), `VideoMetadata` (frame count, fps, width, height), `VideoPredictions` (metadata envelope + frames mapping + optional `segmentations` field), `JobResults`, `JobStatus` enum, `JobStatusEntry` (with a structured `error` field plus optional live-progress fields — `current_video`, `frames_processed`, `frames_total`, `videos_completed`, `videos_total`, `percent_complete`, `last_update` — populated by the interfacer during inference and consumed by `neuropose.monitor`), and `StatusFile`. Legacy status files written before the progress fields existed still load cleanly because every new field is optional with a `None` default. Performance schema: frozen `PerformanceMetrics` carrying per-call timings (`model_load_seconds`, `total_seconds`, `per_frame_latencies_ms`), `peak_rss_mb`, `active_device`, `tensorflow_metal_active`, and `tensorflow_version`; `BenchmarkResult` pairing a discarded `warmup_pass` with `measured_passes` and a `BenchmarkAggregate` (mean / p50 / p95 / p99 per-frame latency, mean throughput, max peak RSS); optional `CpuComparisonResult` nested inside `BenchmarkResult` for `--compare-cpu` runs, carrying both device aggregates, the throughput speedup, and the maximum element-wise `poses3d` divergence in millimetres. Segmentation schema: frozen `Segment` windows (`start`, `end`, `peak`), `SegmentationConfig` (with a `method` version literal, e.g. `valley_to_valley_v1`), a discriminated `ExtractorSpec` union over `JointAxisExtractor`, `JointPairDistanceExtractor`, `JointSpeedExtractor`, and `JointAngleExtractor`, and `Segmentation` pairing a config with its segments so on-disk results are self-describing.
- Load and save helpers with an atomic tmp-file-then-rename pattern for every state file. `load_benchmark_result` / `save_benchmark_result` follow the same atomic pattern. `load_status` is deliberately crash-resilient: missing, corrupt, or non-mapping JSON returns an empty `StatusFile` rather than raising. Legacy predictions files without the `segmentations` field deserialize cleanly to an empty mapping.
- `neuropose.estimator` — `Estimator` class that streams frames directly from OpenCV into the model, with no intermediate write-to-disk-then-read-back-as-PNG round trip. Returns a typed `ProcessVideoResult` containing a validated `VideoPredictions` object and an always-populated `PerformanceMetrics` bundle (per-frame latency in ms, total wall clock, peak RSS via `psutil`, active TF device string, `tensorflow-metal` detection, TF version, and model load time when the caller went through `load_model()`). Does not touch the filesystem. Constructor accepts an injected model for testability; `load_model()` delegates to `neuropose._model.load_metrabs_model()`. Typed exception hierarchy: `EstimatorError`, `ModelNotLoadedError`, `VideoDecodeError`. Optional per-frame `progress` callback for long videos. Frame identifier convention is `frame_000000` (six-digit zero-pad, no extension — no file is implied).
- `neuropose.visualize` — `visualize_predictions()` for per-frame 2D + 3D overlay rendering. `matplotlib.use("Agg")` is called inside the function rather than at module import, so `import neuropose.visualize` has no global side effect. Explicit deep-copy of `poses3d` before axis rotation to prevent the aliasing bug from the previous prototype. Supports `frame_indices` for rendering a subset of frames.
- `neuropose.interfacer` — `Interfacer` job-lifecycle daemon with dependency-injected `Settings` and `Estimator`. Single-instance enforcement via `fcntl.flock` on `data_dir/.neuropose.lock`. Crash-recovery `recover_stuck_jobs()` that marks any status entries left in `processing` state as failed with an "interrupted" message and quarantines their inputs.
  Graceful shutdown on SIGINT/SIGTERM with an interruptible sleep. Structured error fields on every failed job. `run_once()` factored out of the main loop so tests can drive single iterations without threading. Quarantine collision handling (`job_a.1`, `job_a.2`, …) and empty-directory silent-skip heuristic (mid-copy directories are not marked failed).
- `neuropose._model` — MeTRAbs model loader. Downloads the pinned tarball from the upstream RWTH Aachen URL (`metrabs_eff2l_y4_384px_800k_28ds.tar.gz`), verifies its SHA-256 checksum, atomically extracts to a staging directory and renames into place, and loads via `tf.saved_model.load`. Streams the download and hash computation in 1 MB chunks so memory is flat. One automatic retry on SHA-256 mismatch (in case the previous download was truncated). Post-load interface check for `detect_poses`, `per_skeleton_joint_names`, and `per_skeleton_joint_edges`.
- `neuropose.monitor` — localhost HTTP status dashboard. A small `http.server`-based HTTP server (pure stdlib, zero new runtime dependencies) that serves a plain HTML page at `GET /` with an auto-refresh meta tag, one row per tracked job, a `<progress>` bar, and a stale-entry warning badge for `processing` jobs whose `last_update` has not ticked in 60 s. `GET /status.json` returns the raw validated `StatusFile` as JSON for `curl`/scripted pipelines; `?job=<name>` filters to a single entry. `GET /health` is a simple liveness probe. Binds to `127.0.0.1:8765` by default — loopback-only, with an explicit `--host` override required to expose externally. Every request re-reads `status.json`, so the monitor has no in-memory cache, no sync protocol with the daemon, and stays useful even if the daemon is down (last-known state surfaced with the stale badge).
- Progress checkpointing in the interfacer.
  `Interfacer` now updates the currently-running job's `JobStatusEntry` every `settings.status_checkpoint_every_frames` frames (default 30, a new `Settings` field) during inference via the estimator's `progress` callback. Each checkpoint rewrites `status.json` atomically through the existing `save_status` helper; writes are best-effort and I/O failures are logged without interrupting inference. `_run_job_inner` seeds a `videos_total=N` checkpoint before calling the estimator so the monitor shows the job's scope from the first poll. Checkpoint cadence is knob-exposed for operators who want to tune the smoothness-vs-write-rate trade-off.
- `neuropose.ingest` — zip-archive intake utility. `ingest_zip()` extracts a zip of videos into one job directory per video under `$data_dir/in/`, with validation-before-write (path-traversal and absolute-path members rejected, oversize archives rejected at the 20 GB uncompressed cap), zip-internal and external collision detection reported in one shot, non-video members silently skipped (`.DS_Store`, `README.md`, etc.), and per-job atomic placement via a staging directory + `os.rename`. Nested paths are flattened into job names by joining components with underscores and sanitising unsafe characters — `patient_001/trial_01.mp4` becomes job `patient_001_trial_01`, preserving disambiguation against a sibling `patient_002/trial_01.mp4`. Typed exception hierarchy: `IngestError`, `ArchiveInvalidError`, `ArchiveEmptyError`, `ArchiveTooLargeError`, `JobCollisionError` (with a `.collisions` list of offending names). The running daemon needs no changes — ingested job dirs are picked up on the next poll.
- `neuropose.migrations` — schema-migration infrastructure for the three top-level serialised payloads (`VideoPredictions`, `JobResults`, `BenchmarkResult`). Every payload carries a `schema_version` field defaulting to `CURRENT_VERSION`; on load, the raw JSON dict is passed through `migrate_video_predictions` / `migrate_job_results` / `migrate_benchmark_result` before pydantic validation so files written by older NeuroPose versions upgrade transparently.
  One shared `CURRENT_VERSION` counter; per-schema migration registries populated via `register_video_predictions_migration(from_version)` and `register_benchmark_result_migration(from_version)` decorators. `JobResults` is a `RootModel` with no envelope of its own, so its migration runs per-entry across the root mapping. The driver raises `FutureSchemaError` for payloads newer than the current build (clear upgrade-NeuroPose message), `MigrationNotFoundError` for missing chain links (indicates a `CURRENT_VERSION` bump that forgot its migration), and logs at INFO on each version advance. Currently at `CURRENT_VERSION = 2`, with registered v1 → v2 migrations for `VideoPredictions` and `BenchmarkResult` that add the optional `provenance` field.
- `neuropose.analyzer.features.procrustes_align` — Kabsch rigid-alignment helper for pose sequences, plus a `ProcrustesMode` literal (`"per_frame" | "per_sequence"`) and a frozen `AlignmentDiagnostics` dataclass (`rotation_deg`, `rotation_deg_max`, `translation`, `translation_max`, `scale`, plus the mode that produced them). Per-sequence mode fits one rigid transform across the whole trial; per-frame fits an independent transform per frame. Optional `scale=True` fits a uniform scale factor for cross-subject comparisons. Wired into every DTW entry point in `neuropose.analyzer.dtw` via a new keyword-only `align: AlignMode = "none"` parameter — `"none"` preserves the 0.1 raw-coordinate behaviour, while `"procrustes_per_frame"` and `"procrustes_per_sequence"` route inputs through `procrustes_align` before DTW runs so the returned distance is rotation- and translation-invariant. Paper C's pipeline is expected to set `align="procrustes_per_sequence"`; see `TECHNICAL.md` Phase 0.
- `neuropose.analyzer.dtw.Representation` and `neuropose.analyzer.dtw.NanPolicy` — two new Literal types exposing orthogonal DTW preprocessing knobs on every entry point. `representation` (on `dtw_all` and `dtw_per_joint`) switches the per-frame feature vector between `"coords"` (the 0.1 default) and `"angles"`, which runs `extract_joint_angles` on the supplied `angle_triplets` first — yielding distances that are translation-, rotation-, and scale-invariant by construction, and directly interpretable in clinical terms. `nan_policy` (on all three entry points) selects `"propagate"` (surface fastdtw's ValueError on NaN — the default), `"interpolate"` (linear fill per feature column), or `"drop"` (remove NaN frames before DTW); the policy is applied consistently whether NaN originated from the angles pipeline or from corrupted upstream coordinates. `dtw_relation` stays a standalone convenience entry point for two-joint displacement DTW; users who prefer a unified API can express the same computation via `dtw_all` with an appropriate pair of angle triplets or run `dtw_relation` directly.
- `neuropose.analyzer.pipeline` (schemas) — declarative analysis-pipeline configuration and output artifact, parseable from YAML or JSON via pydantic. `AnalysisConfig` captures a full experiment: inputs (primary + optional reference predictions files), preprocessing (person index, with room to grow), optional segmentation (`gait_cycles` / `gait_cycles_bilateral` / `extractor` discriminated union), and a required analysis stage (`dtw` / `stats` / `none` discriminated union). `AnalysisReport` is the runtime output: carries the originating config, a `Provenance` envelope with `analysis_config` populated, per-input summaries, produced segmentations, and an analysis-result payload that mirrors the stage choice (`DtwResults`, `StatsResults`, or `NoResults`).
  Cross-field invariants — `method="dtw_relation"` requires `joint_i` / `joint_j`, `representation="angles"` requires non-empty `angle_triplets`, `analysis.kind="dtw"` requires `inputs.reference`, `analysis.kind="stats"` refuses a reference — are enforced at parse time via `model_validator` so typos fail in milliseconds instead of after a multi-minute predictions load. `AnalysisReport` carries a `schema_version` field defaulting to `CURRENT_VERSION = 2`, with a new `register_analysis_report_migration` decorator and `migrate_analysis_report` driver in `neuropose.migrations` ready for future schema changes. `run_analysis(config)` loads the named predictions files, applies the configured segmentation, dispatches to the selected analysis kind (DTW, stats, or none), and emits a fully populated `AnalysisReport` whose `Provenance` inherits the inference-time envelope from the primary input with `analysis_config` stamped in, so the report is self-describing even if the source YAML is lost. For DTW runs with segmentation, segments are paired one-to-one by index across primary and reference, truncating to `min(len_primary, len_reference)`; bilateral segmentations emit per-side distances under `"left_heel_strikes[i]"` / `"right_heel_strikes[i]"` labels. `load_config(path)` parses YAML, `save_report(path, report)` writes atomically, and `load_report(path)` rehydrates via the migration chain. Wired to the CLI as `neuropose analyze --config <yaml> [--output <json>]` — replaces the placeholder stub that previously returned `EXIT_PENDING`. The CLI surfaces schema violations and YAML parse errors as `EXIT_USAGE=2` with a clear message pointing at the offending file, prints a one-line summary of the run (segmentation counts, analysis kind, per-segment distance count + mean for DTW), and supports `--output` / `-o` to override the report path declared in the config (useful for sweeping a single config over multiple input pairs from a shell loop). Example configs land in a follow-up commit.
- `neuropose.analyzer.segment.segment_gait_cycles` and `segment_gait_cycles_bilateral` — clinical convenience wrappers over `segment_predictions` that pre-fill a `joint_axis` extractor with gait-appropriate defaults (`joint="rhee"`, `axis="y"`, `min_cycle_seconds=0.4`). The single-side entry point accepts any berkeley_mhad_43 joint name and any spatial axis as a string literal `"x" | "y" | "z"`, plus an `invert` flag for recordings whose vertical axis runs opposite to MeTRAbs's Y-down world-coordinate convention. The bilateral wrapper runs the detection on both `lhee` and `rhee` and returns the two results under `"left_heel_strikes"` / `"right_heel_strikes"` keys — shape-compatible with `VideoPredictions.segmentations` so the dict can be merged in directly. Degrades gracefully on pathological gaits (shuffling, walker-assisted) by returning an empty segments list rather than raising. Closes the gait-cycle segmentation item in `TECHNICAL.md` Phase 0.
- `neuropose.io.Provenance` — reproducibility envelope for every inference run. Populated automatically by `Estimator.process_video` when the model was loaded via `load_model` (the production path) and attached to the output `VideoPredictions`; propagates from there into `JobResults` (per-video) and `BenchmarkResult` (via the benchmark loop). Captures the MeTRAbs artifact SHA-256 and filename, tensorflow / tensorflow-metal / numpy / neuropose / Python versions, and reserved slots for a `seed`, `deterministic` flag (Track 2), and `analysis_config` (Phase 0 YAML pipeline). `None` on the injected-model test path where NeuroPose has no way to fingerprint the supplied artifact.
  Frozen pydantic model with `extra="forbid"` and `protected_namespaces=()` so the `model_*` field names do not collide with pydantic v2's internal namespace. `_model.load_metrabs_model` now returns a `LoadedModel` dataclass bundling the TF handle with the pinned SHA and filename so the estimator can build the `Provenance` without re-hashing the tarball.
- `neuropose.reset` — pipeline-wide reset utility for the benchmark / iteration loop. `find_neuropose_processes()` scans the OS process table (via `psutil`) for running `neuropose watch` and `neuropose serve` instances and classifies each as `daemon` or `monitor`. `terminate_processes()` SIGINTs them, polls for graceful exit up to a configurable grace period, and optionally escalates to SIGKILL with `force_kill=True`. `wipe_state()` removes the contents of `$data_dir/in/`, `$data_dir/out/` (including `status.json`), `$data_dir/failed/` (unless `keep_failed=True`), the `.neuropose.lock` file, and any leftover `.ingest_<uuid>/` staging dirs from interrupted ingests; container directories themselves are preserved so the daemon does not need to recreate them on next startup. `reset_pipeline()` composes the three with one safety guard: if any process survives termination, the wipe phase is skipped and the returned `ResetReport` flags `wipe_skipped_due_to_survivors`, because removing `$data_dir` out from under an active daemon would corrupt its in-flight writes. Surfaced as `neuropose reset` in the CLI with `--yes` / `-y`, `--keep-failed`, `--force-kill`, `--grace-seconds`, and `--dry-run` / `-n` flags; the command always prints a preview before prompting (skipped under `--yes`) and returns `EXIT_USAGE=2` when survivors block the wipe.
- `neuropose.benchmark` — multi-pass inference benchmarking for a single video. `run_benchmark()` runs `process_video` N times (default 5), always discards the first pass as warmup (graph compilation, file-system cache warmup), and aggregates the remaining `PerformanceMetrics` into a `BenchmarkAggregate` with mean / p50 / p95 / p99 per-frame latency, mean throughput, and max peak RSS. `capture_reference=True` additionally preserves the last measured pass's `VideoPredictions` in memory so the `--compare-cpu` CLI flow can diff the `poses3d` arrays between a GPU and CPU run. `compute_poses3d_divergence()` computes the maximum element-wise absolute difference (in millimetres) between two prediction sets, skipping frames with mismatched detection counts and surfacing the `frame_count_compared` so callers can tell if the number is trustworthy. `format_benchmark_report()` renders a human-readable summary for CLI stdout.
- `neuropose.analyzer` — post-processing subpackage with lazy imports for the heavy dependencies:
  - `analyzer.dtw` — three DTW entry points (`dtw_all`, `dtw_per_joint`, `dtw_relation`) over fastdtw, with a frozen `DTWResult` dataclass and three orthogonal preprocessing knobs (`align`, `representation`, `nan_policy`). See `RESEARCH.md` for the ongoing methodology investigation.
  - `analyzer.features` — `predictions_to_numpy`, `normalize_pose_sequence` (uniform and axis-wise), `pad_sequences` (edge-padding), `procrustes_align` (Kabsch rigid alignment, per-frame or per-sequence, optional uniform scaling), `extract_joint_angles` (NaN on degenerate vectors), `extract_feature_statistics` (`FeatureStatistics` frozen dataclass), and a `find_peaks` thin wrapper around `scipy.signal.find_peaks`.
  - `analyzer.segment` — repetition segmentation for trials in which a subject performs the same movement several times.
    A three-layer API: `segment_by_peaks` (pure 1D valley-to-valley peak detection on a generic signal), `segment_predictions` (top-level entry point taking a `VideoPredictions` plus an `ExtractorSpec`, converting time-based parameters to frame counts via `metadata.fps`), and `slice_predictions` (split a `VideoPredictions` into one per detected repetition with re-keyed frame names and a rewritten `frame_count`). Gait-specific convenience wrappers `segment_gait_cycles` (single heel) and `segment_gait_cycles_bilateral` (both heels, returning a dict keyed by `"left_heel_strikes"` / `"right_heel_strikes"`) sit above `segment_predictions` with clinical defaults. Ships four extractor factories — `joint_axis`, `joint_pair_distance`, `joint_speed`, and `joint_angle` — plus a `JOINT_NAMES` constant for the berkeley_mhad_43 skeleton with a `joint_index(name)` lookup, so post-processing callers can resolve `"rwri"` → integer without loading the MeTRAbs SavedModel. A matching integration test (`tests/integration/test_joint_names_drift.py`, marked `slow`) loads the real model and asserts the constant still matches, so any upstream skeleton drift fails CI.
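The valley-to-valley idea behind `segment_by_peaks` can be illustrated with a pure-Python sketch; this is hypothetical code, not the shipped implementation, which builds on `scipy.signal.find_peaks` and converts `min_cycle_seconds` to a frame gap via `metadata.fps`:

```python
# Illustrative valley-to-valley segmentation of a 1D signal (e.g. heel
# height over frames): each window between consecutive valleys is one
# repetition. Names here are illustrative, not NeuroPose's API.

def valleys(signal, min_gap=1):
    """Indices of strict local minima, at least min_gap frames apart."""
    found = []
    for i in range(1, len(signal) - 1):
        if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]:
            if not found or i - found[-1] >= min_gap:
                found.append(i)
    return found

def valley_to_valley_segments(signal, min_gap=1):
    """(start, end) frame windows between consecutive valleys."""
    v = valleys(signal, min_gap)
    return list(zip(v, v[1:]))

# A trace with three dips yields two full cycles:
trace = [5, 3, 1, 3, 5, 3, 1, 3, 5, 3, 1, 3, 5]
# valleys at indices 2, 6, 10 → segments [(2, 6), (6, 10)]
```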
- `neuropose.cli` — Typer-based command-line interface with eight subcommands: `watch` (run the daemon), `process <video>` (run the estimator on a single video), `ingest <archive>` (unzip a video archive into per-video job directories under `$data_dir/in/` with validation-before-write and atomic placement; `--force` overwrites collisions, otherwise the whole operation refuses if any target name already exists), `serve` (start the localhost HTTP monitor at `127.0.0.1:8765` by default — `--host` and `--port` are the two overrides; KeyboardInterrupt exits with the standard shell-interruption code and an `OSError` at bind time is translated to a clean usage error with the bind target in the message), `reset` (stop the daemon and monitor, then wipe pipeline state for a clean restart — wraps `neuropose.reset` with a confirmation prompt, `--dry-run` preview, `--keep-failed` to preserve the forensic quarantine, `--force-kill` to escalate to SIGKILL after the SIGINT grace period, and `--grace-seconds` to tune the wait; refuses to wipe state while any process survives termination so active writes cannot be corrupted), `segment <results>` (post-hoc repetition segmentation — loads a `JobResults` or a single `VideoPredictions`, runs `neuropose.analyzer.segment.segment_predictions` with the chosen extractor and thresholds, and atomically writes the file back with the new segmentation attached under `--name`), `benchmark <video>` (multi-pass inference benchmark — runs `--repeats N` passes with a discarded first pass and `--warmup-frames M` excluded from the head of each measured pass, reports aggregates to stdout, and optionally writes a structured `BenchmarkResult` to `--output`; supports `--compare-cpu`, which spawns a `--force-cpu` subprocess, diffs the resulting `poses3d` arrays, and reports throughput speedup and max divergence in mm — the missing Apple Silicon numerical-verification answer from `RESEARCH.md`), and `analyze --config <yaml>` (run the declarative analysis pipeline — see the dedicated entry above for scope). The `segment` subcommand accepts joint specifiers as either berkeley_mhad_43 names (`lwri`, `rwri`, …) or integer indices, and refuses to overwrite an existing segmentation of the same name without `--force`. Global options `--config` / `-c`, `--verbose` / `-v`, `--quiet` / `-q`, `--version`. Structured error handling turns expected exceptions (`FileNotFoundError` on config, `ValidationError`, `AlreadyRunningError`, `NotImplementedError`, `KeyboardInterrupt`) into clear stderr messages and distinct exit codes (`EXIT_OK=0`, `EXIT_USAGE=2`, `EXIT_PENDING=3`, `EXIT_INTERRUPTED=130`). The CLI entry point is wired in `[project.scripts]` as `neuropose = "neuropose.cli:run"`.
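The atomic tmp-file-then-rename pattern used for every state file above can be sketched as follows; this is a minimal illustration, not the literal `neuropose.io` helper, which layers pydantic serialisation on top:

```python
# Minimal sketch of the atomic tmp-file-then-rename save pattern.
import json
import os
from pathlib import Path


def atomic_save_json(path: Path, payload: dict) -> None:
    """Write payload next to path, then atomically swap it into place."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(payload, indent=2))
    # os.replace is atomic on POSIX: concurrent readers see either the
    # old file or the new one, never a partially written file.
    os.replace(tmp, path)
```

A crash between the write and the rename leaves at worst a stale `.tmp` file; the destination is never corrupted, which is why the crash-resilient `load_status` can treat any unreadable `status.json` as simply absent.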
#### Documentation
- mkdocs-material documentation site under `docs/` with the full theme configuration (light/dark toggle, tabs navigation, search), `mkdocstrings` Python handler set to numpy docstring style with source links, and a `pymdownx` extension set for admonitions, tabbed content, collapsible details, and syntax-highlighted code blocks. Nav: Home → Getting Started → Architecture → API Reference (auto-generated from module docstrings) → Development → Deployment.
- Prose documentation pages: `docs/index.md` (public landing page), `docs/getting-started.md` (install, CLI, output schema, Python API, visualization, troubleshooting), `docs/architecture.md` (three-stage pipeline, data flow, runtime directory layout, design principles), `docs/development.md` (contributor setup, tests, lint/type, commit hygiene, release process stub), and `docs/deployment.md` (systemd user unit, Docker pointer, GPU notes, backup guidance).
- API reference stubs `docs/api/{config,estimator,interfacer,io,visualize}.md` — each is a two-line file containing a `:::` mkdocstrings directive, so the API documentation is generated from the source docstrings at build time and cannot drift out of sync.
- `RESEARCH.md` at the repo root: a living R&D log for DTW methodology alternatives and MeTRAbs self-hosting / fine-tuning plans. Not user-facing documentation; not linked from the mkdocs nav.
#### Tests
- `tests/unit/` covering configuration (defaults, validation, YAML loading, env overrides, `ensure_dirs`), IO schema and helpers (roundtrip, atomic save, frozen-model guarantees, corruption tolerance), the estimator (construction, model-guard, process path with fake MeTRAbs model, error paths), the visualize module (smoke tests + an anti-regression check for the audit §6 aliasing bug), the interfacer (construction, discovery, process-job happy and failure paths, stuck-job recovery, lock, run_once, interruptible sleep), the CLI (top-level options, config handling, each subcommand's error path), the analyzer DTW helpers, and the analyzer features helpers.
- `tests/conftest.py` with an autouse `_isolate_environment` fixture that redirects `$HOME` and `$XDG_DATA_HOME` to a per-test temp directory so no test can accidentally write to the developer's real machine, and clears any `NEUROPOSE_*` env vars. Adds a `synthetic_video` fixture (cv2-generated 5-frame MJPG AVI sized for most unit tests) and a `fake_metrabs_model` fixture.
- `tests/integration/test_estimator_smoke.py` — end-to-end model loader + estimator smoke test against the real MeTRAbs tarball, marked `@pytest.mark.slow`, skipped by default, opt-in via `--runslow`. Uses a session-scoped model cache so the download happens at most once per run.
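The isolation idea behind the autouse fixture can be sketched as below; the helper name and fixture details are illustrative, not the actual `tests/conftest.py` contents:

```python
# Sketch of per-test environment isolation: redirect $HOME and
# $XDG_DATA_HOME to a throwaway directory and drop NEUROPOSE_* vars.
import os

import pytest


def scrub_neuropose_env(environ: dict, home: str) -> dict:
    """Pure helper (hypothetical name): environ with HOME/XDG redirected
    and every NEUROPOSE_* variable removed."""
    clean = {k: v for k, v in environ.items() if not k.startswith("NEUROPOSE_")}
    clean["HOME"] = home
    clean["XDG_DATA_HOME"] = os.path.join(home, ".local", "share")
    return clean


@pytest.fixture(autouse=True)
def _isolate_environment(tmp_path, monkeypatch):
    # monkeypatch restores the real environment after each test.
    for key in list(os.environ):
        if key.startswith("NEUROPOSE_"):
            monkeypatch.delenv(key)
    monkeypatch.setenv("HOME", str(tmp_path))
    monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / ".local" / "share"))
```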
#### Operations
- `Dockerfile` — CPU image based on `python:3.11-slim-bookworm`. Installs the package with the `analysis` extra, runs as non-root user `neuropose` (UID 1000), exposes `/data` as a volume, sets `NEUROPOSE_DATA_DIR` and `NEUROPOSE_MODEL_CACHE_DIR` to point at the mounted volume, and uses `ENTRYPOINT ["neuropose"]` with `CMD ["watch"]` so the default is the daemon and overrides are ergonomic.
- `.dockerignore` that aggressively excludes developer tooling, caches, tests, documentation sources, research notes, and ancillary scripts from the build context.
- `scripts/download_model.py` — standalone pre-warm script that invokes `load_metrabs_model()` with an optional `--cache-dir` override. Useful for seeding a deployment's cache before cutting off network access.
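A pre-warm entry point in the spirit of `scripts/download_model.py` might look like the sketch below; the argument handling and the exact `load_metrabs_model` keyword are assumptions, not the shipped script:

```python
# Hypothetical sketch of a model pre-warm script with a --cache-dir
# override; the real scripts/download_model.py may differ.
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Download and cache the MeTRAbs model ahead of deployment."
    )
    parser.add_argument(
        "--cache-dir", type=Path, default=None,
        help="override the default model cache directory",
    )
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)
    # Real script would call (signature assumed):
    # from neuropose._model import load_metrabs_model
    # load_metrabs_model(cache_dir=args.cache_dir)
    print(f"would warm cache at: {args.cache_dir or '<default>'}")
```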
### Changed
- Relicensed from AGPL-3.0 (used in the prior internal prototype) to MIT. The prior license was copied from precedent rather than chosen deliberately; the MIT relicense better matches both the project's "research software others can build on" intent and the upstream MeTRAbs license.
- Reorganised from the prior `backend/` + runtime-data layout into a `src/neuropose/` Python package. Runtime data now lives outside the repository by default (under `$XDG_DATA_HOME/neuropose/`) so subject-identifying inputs cannot accidentally end up in a `git add`.
- Frame identifier convention changed from `frame_0000.png` (old, misleading — no PNG file exists) to `frame_000000` (six-digit zero-pad, no extension, pure identifier).
- Estimator API: `process_video()` now returns a typed `ProcessVideoResult` containing a validated `VideoPredictions` object, instead of a stringly-typed dict with `results_path` and `frame_count`. The estimator no longer owns filesystem destinations — the caller decides where to save.
- `VideoPredictions` schema now carries a `VideoMetadata` envelope (frame count, fps, width, height) alongside the per-frame predictions. Downstream analysis can convert frame indices to real time without needing access to the original video.
- Interfacer uses `datetime.now(UTC)` instead of the deprecated `datetime.utcnow()`, addresses the "no-videos"-vs-exception-path inconsistency (both now quarantine), and persists a structured `error` string on every failure for grep-friendly diagnostics.
- TensorFlow pin set to `tensorflow>=2.16,<2.19`. The 2.16 floor is the first release with native `darwin/arm64` wheels under the `tensorflow` package name on PyPI, so a single dependency line works across Linux x86_64, Linux arm64, and Apple Silicon macOS without platform markers or a separate `tensorflow-macos` package. The `<2.19` ceiling is a `tensorflow-metal` compatibility constraint: the latest Metal plugin (1.2.0, January 2025) advertises "TF 2.18+" but in practice fails on 2.19 and 2.20 with symbol-not-found errors and graph-execution `InvalidArgumentError`s (tensorflow/tensorflow#84167). The cap is global rather than darwin-only so dependency resolution stays identical across platforms. The MeTRAbs SavedModel itself (`metrabs_eff2l_y4_384px_800k_28ds`, serialized with TF 2.10) was separately verified to load and run `detect_poses` end-to-end on TF 2.21 + Keras 3 with no errors and zero custom ops, so the cap is purely an external-package constraint and can lift once Apple ships a Metal plugin that tracks mainline TensorFlow again. Full probe data and op inventory in `RESEARCH.md`.
- Operating-system classifiers in `pyproject.toml` extended from Linux-only to `POSIX` + `POSIX :: Linux` + `MacOS`, reflecting the Apple Silicon support that the TF 2.16 floor makes real.
### Removed
- The previous `backend/analyzer.py` and `backend/validator.py` stubs, which were non-functional and had never been run successfully. `analyzer.py` is reintroduced as a pure-function subpackage (`neuropose.analyzer`) rewritten from the prior code's design intent. `validator.py` is reintroduced as a real pytest suite (`tests/unit/` and `tests/integration/`).
- The previous `reconstruct_from_frames` helper on the `Estimator` — dead code, broken (dereferenced `self.OUTPUT_PATH`, which did not exist), hardcoded 10 fps, never called. ffmpeg is a better tool for this and can be invoked directly.
- The previous `__main__` placeholder (`print("in main"); sys.exit()`) on `estimator.py`. The real CLI now lives in `neuropose.cli`.
- Every file under `docs/` in the previous prototype. All of the pydoc-generated HTML, Org-mode sources, and handwritten markdown described an older version of the API with methods (`bind_and_block`, `construct_paths`, `toggle_visualization`, `propagate_fatal_error`, etc.) that no longer exist. The docs are now auto-generated from source docstrings via mkdocstrings so drift is mechanically impossible.
- The previous Dockerfile, which referenced a non-existent `backend/requirements.txt`, attempted to `COPY ./model /app/model` (no such directory), and set `CMD ["uvicorn", "main:app"]` for a FastAPI app that never existed.
- The previous `install/install.sh`, `install/#install.sh#` (an Emacs autosave file), `install/install.sh~` (an Emacs backup file), and `install/environment.yml`. The conda + `git+https` install story is replaced by `uv` + a single `pyproject.toml`.
- The previous `bit.ly/metrabs_1` URL shortener for the model download, replaced by a pinned canonical URL on the upstream RWTH Aachen "omnomnom" host, with SHA-256 verification on download. See `RESEARCH.md` for the plan to mirror to self-hosted storage.
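The streamed checksum verification mentioned above (and in the `neuropose._model` entry) follows a standard pattern; a minimal sketch, not the literal loader code:

```python
# Sketch of streamed SHA-256 verification in 1 MB chunks, so memory
# stays flat regardless of tarball size.
import hashlib
from pathlib import Path

CHUNK = 1024 * 1024  # 1 MB


def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        while chunk := fh.read(CHUNK):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_sha256: str) -> bool:
    """True iff the file on disk matches the pinned checksum."""
    return sha256_of(path) == expected_sha256
```

On mismatch, the real loader retries the download once (a truncated previous download being the common cause) before failing hard.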
### Security
- Large-files pre-commit hook (`check-added-large-files` with a 500 KB limit) blocks accidental commits of subject data or model weights.
- Gitleaks pre-commit hook scans every staged change for secret material.
- Dockerfile runs as a non-root user (UID 1000, `neuropose`) by default.
- Tarfile extraction uses the `filter="data"` option to block path traversal and other tar-bomb attacks during MeTRAbs model extraction.
- SHA-256 pinning of the MeTRAbs model artifact. A change to the upstream tarball contents fails the checksum verification and requires a human-reviewed diff before the new artifact is trusted.
### Known limitations
- Apple Silicon support is established by construction (TF 2.16+ publishes native `darwin/arm64` wheels and the MeTRAbs SavedModel uses only stock ops verified portable on TF 2.21) but has not yet been exercised on real Apple Silicon hardware. A `macos-14` CI matrix entry covering the unit tests is the cheapest way to catch any regression and is planned as a follow-up.
- Classification wrappers on top of sktime are deliberately not included in `neuropose.analyzer` for this release. See `RESEARCH.md` for the reasoning and the plan.
- GPU support in Docker is not yet shipped (`Dockerfile.gpu` is planned). The existing `Dockerfile` runs CPU-only.
- The data-handling policy referenced from `docs/deployment.md` and `docs/index.md` (`docs/data-policy.md`) is being authored separately and is not part of this changelog entry.