# Architecture

This page describes how NeuroPose is structured and why. It is the document to read if you are about to modify the estimator, the daemon, or the output schema and want to understand the constraints the existing design is trying to honour.

## Component overview

NeuroPose is a three-stage pipeline:

```text
┌───────────────────┐     ┌──────────────────┐     ┌───────────────────┐
│    interfacer     │     │    estimator     │     │     analyzer      │
│     (daemon)      │────▶│   (inference)    │────▶│  (post-process)   │
│                   │     │                  │     │                   │
│ watches filesystem│     │ MeTRAbs wrapper  │     │ DTW, features,    │
│ manages job state │     │ per-video worker │     │ classification    │
└───────────────────┘     └──────────────────┘     └───────────────────┘
          │                        │                        │
          ▼                        ▼                        ▼
  status.json +            VideoPredictions         analysis results
  job directories          (validated schema)      (pending commit 10)
```

Each stage is a separate module with one job, and the contracts between them are defined by validated pydantic schemas in [`neuropose.io`](api/io.md).

### estimator

**Role:** pure inference library. Given a video path and a MeTRAbs model, it produces a validated `VideoPredictions` object plus a `PerformanceMetrics` bundle.

**Does NOT handle:** job directories, status files, polling, locking, signal handling, visualization, or where-to-save decisions. It is a library, not a daemon.

The estimator streams frames directly from OpenCV into the model, with no intermediate write-to-disk-then-read-back-as-PNG round trip like the previous prototype had. `process_video()` returns a typed `ProcessVideoResult` containing the predictions and an always-populated `PerformanceMetrics` (per-frame latency, peak RSS, total wall clock, active TF device, TF version, `tensorflow-metal` detection, and model load time when the caller went through `load_model()`). It does not touch the filesystem unless the caller explicitly asks it to save the result.

See [`neuropose.estimator`](api/estimator.md) for the API reference.
### benchmark

**Role:** multi-pass inference benchmarking layered on top of the estimator.

`run_benchmark()` calls `process_video` N times, discards the first pass as warmup, and aggregates the remaining `PerformanceMetrics` into a `BenchmarkAggregate` with distributional statistics (mean / p50 / p95 / p99 per-frame latency, mean throughput, max peak RSS). The benchmark is exposed via the `neuropose benchmark
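The warmup-discard-then-aggregate step can be sketched in a few lines. This is a minimal illustration of the statistics described above, not the actual `run_benchmark()` implementation; the nearest-rank percentile helper and the field names in the returned dict are assumptions for the example:

```python
import statistics

def nearest_rank_percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile (p in [0, 100]) over a non-empty list."""
    ranked = sorted(values)
    idx = max(0, min(len(ranked) - 1, round(p / 100 * len(ranked)) - 1))
    return ranked[idx]

def aggregate(passes: list[list[float]]) -> dict:
    """Discard the first pass as warmup, pool per-frame latencies, summarize."""
    measured = passes[1:]  # pass 0 is warmup: caches cold, kernels uncompiled
    latencies = [t for run in measured for t in run]
    return {
        "mean": statistics.fmean(latencies),
        "p50": nearest_rank_percentile(latencies, 50),
        "p95": nearest_rank_percentile(latencies, 95),
        "p99": nearest_rank_percentile(latencies, 99),
        "mean_throughput_fps": len(latencies) / sum(latencies),
    }

# Three passes of per-frame latencies (seconds); note how much slower
# the warmup pass is -- including it would skew every statistic.
stats = aggregate([[0.10, 0.09], [0.030, 0.031], [0.029, 0.030]])
```

Discarding the first pass matters because it absorbs one-time costs (model compilation, filesystem caches, GPU kernel warmup) that would otherwise inflate the tail percentiles of every run.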