35 lines
1.4 KiB
Markdown
35 lines
1.4 KiB
Markdown
# Specification Dilemma Experiment
|
|
|
|
## Files
|
|
|
|
- `config.yaml` — LMStudio endpoint, model, generation and analysis parameters
|
|
- `prompts/sparse.json` — 30 sparse prompts
|
|
- `prompts/dense.json` — 30 dense prompts (matched to sparse by index)
|
|
- `smoke_test.py` — pre-flight: connectivity, seed-honoring, per-generation latency
|
|
- `generate.py` — runs completions against LMStudio
|
|
- `embed.py` — sentence embeddings
|
|
- `similarity.py` — pairwise cosine similarities
|
|
- `stats.py` — t-test, Mann-Whitney, bootstrap, Cohen's d
|
|
- `plot.py` — violin plot
|
|
- `run_all.py` — orchestrator (runs the five pipeline scripts in order)
|
|
- `pyproject.toml`, `uv.lock` — uv-managed environment
|
|
- `requirements.txt` — pip fallback
|
|
- `outputs/{sparse,dense}/NN.txt` — model completions (generated)
|
|
- `embeddings/{sparse,dense}.npy` — L2-normalized embedding matrices (generated)
|
|
- `results/pairwise.csv`, `results/stats.json`, `results/plot.png` — analysis artifacts (generated)
|
|
|
|
## Setup
|
|
|
|
1. Install LMStudio, load a strong instruction-tuned model, start the local server.
|
|
2. `uv sync`
|
|
3. Edit `config.yaml` for your LMStudio host, port, and model name.
|
|
4. `uv run python smoke_test.py` — verifies the endpoint and reports whether `seed` is honored.
|
|
|
|
## Run
|
|
|
|
```
|
|
uv run python run_all.py
|
|
```
|
|
|
|
Or step-by-step: `generate.py` → `embed.py` → `similarity.py` → `stats.py` → `plot.py`.
|