**Future work.** Planned extensions include: hardware performance counter profiles (IPC, cache miss rates) via PAPI to validate the mechanistic explanations in the corresponding section; energy measurement via Intel RAPL; extension to ML-DSA (Dilithium) and SLH-DSA (SPHINCS+) with the same harness; and cross-ISA comparison with ARM NEON/SVE (Graviton3) and RISC-V V. A compiler version sensitivity study (GCC 11–14, Clang 14–17) will characterize how stable the auto-vectorization gap is across compiler releases.

**Artifact.** The benchmark harness, SLURM job templates, raw cycle-count data, analysis pipeline, and this paper are released at <https://git.levineuwirth.org/neuwirth/where-simd-helps> under the MIT License.

## Supplementary: KEM-level end-to-end speedup