specification-dilemma/README.md

35 lines
1.4 KiB
Markdown

# Specification Dilemma Experiment
## Files
- `config.yaml` — LMStudio endpoint, model, generation and analysis parameters
- `prompts/sparse.json` — 30 sparse prompts
- `prompts/dense.json` — 30 dense prompts (matched to sparse by index)
- `smoke_test.py` — pre-flight: connectivity, seed-honoring, per-generation latency
- `generate.py` — runs completions against LMStudio
- `embed.py` — sentence embeddings
- `similarity.py` — pairwise cosine similarities
- `stats.py` — t-test, Mann-Whitney, bootstrap, Cohen's d
- `plot.py` — violin plot
- `run_all.py` — orchestrator (runs the five pipeline scripts in order)
- `pyproject.toml`, `uv.lock` — uv-managed environment
- `requirements.txt` — pip fallback
- `outputs/{sparse,dense}/NN.txt` — model completions (generated)
- `embeddings/{sparse,dense}.npy` — L2-normalized embedding matrices (generated)
- `results/pairwise.csv`, `results/stats.json`, `results/plot.png` — analysis artifacts (generated)
## Setup
1. Install LMStudio, load a strong instruction-tuned model, start the local server.
2. `uv sync`
3. Edit `config.yaml` for your LMStudio host, port, and model name.
4. `uv run python smoke_test.py` — verifies the endpoint and reports whether `seed` is honored.
## Run
```
uv run python run_all.py
```
Or step-by-step: `generate.py``embed.py``similarity.py``stats.py``plot.py`.