Methodology
Lock the rules before you play
Before running any experiment, we write tau.json:
- Exactly what we are testing
- Exactly how we measure success
- Exactly how many data points
- Exactly the random seeds
- Exactly the decision threshold
Then we compute a SHA-256 fingerprint and store it. The experiment script checks the fingerprint before running. Edited file → broken seal → experiment refuses to start.
Mock tau.json
Seal status
SHA-256 (truncated)
—
Pre-registration template
| Field | Description | Example |
|---|---|---|
claim |
Exactly what we are testing | G₄ r3 — uniform rank-16 layer compression |
metric |
Exactly how we measure success | activation-weighted Frobenius error |
data_points |
Number of evaluation samples | 1000 |
random_seeds |
Random seeds for τ baseline | 10 |
decision_threshold |
Pass/fail threshold (locked) | error < τ * (1 - 0.10) |
Kill-fast protocol — saving time by being brutal
Rule 1 — Analytical Pre-Filter (5 min)
5 min, pencil + paper. Byte budget, spectrum floor, random floor. If any check fails, kill immediately. No GPU needed.
Rule 2 — Hardest-Cell-First (4 min vs 60 min)
Run Layer 3 + Wikitext first. If it fails, stop. ~4 minutes instead of ~60.
Rule 3 — One-Shot Decisions (instant)
No ‘inconclusive’. The decision rule is locked. Ambiguity = protocol failure, not an invitation to keep trying.
When an idea dies — celebrate, then publish
A failed rung is a result. Every kill produces:
decision.json— machine-readable verdictpostmortem.md— human-readable analysistrap_results.json— evidence the traps fired correctly- Updated theorem tracker — Lean theorems proved, downgraded, or marked killed
Real postmortem excerpt — G₄ r6
| Task | NLL inflation |
|---|---|
| Code (humaneval) | +9.9% |
| Math (gsm8k) | +60.9% |
| Text (wikitext) | +110.5% |
Off-projection trap fired at +647% on wikitext — protocol working as intended.
Positioning and related work
The defensible contribution here is the falsification methodology and Lean kill-theorems, not any single compressor. The ladder does not propose a new compression algorithm; it systematically rules out mechanism families with machine-checked proofs of the ruling-out conditions.
What this work is not
Structured-rotation novelty is effectively zero relative to QuaRot (2024), SpinQuant (2024), and QuIP# (2024). The SRHT-vs-Haar ablation (R2 result: +1.767 pp, Hadamard·Rademacher beats Haar) is a controlled ablation within a known family, not a new compressor.
MoE compression prior art on the exact target models
| Paper | Target | Relationship |
|---|---|---|
| D2-MoE (arXiv:2502.17298) | MoE expert merging via weight averaging | Addresses expert redundancy at the merge level; orthogonal to activation-subspace kill |
| Sub-MoE (arXiv:2506.23266) | Task-conditional expert subset selection | Closest prior to G_4/R3f axis; Sub-MoE shows task-conditioned selection is viable at the expert-selection level, but does not address the rank-compression interior or cross-corpus generalization failure |
| MoBE (arXiv:2508.05257) | Mixture-of-Basis Experts decomposition | Basis-sharing across experts, directly in the G_7 (joint-basis) kill zone; this ladder’s Lean kill-theorem (olmoe_joint_basis_no_low_distortion) is the formal counterpart |
| DeRS (arXiv:2503.01359) | Dense-and-Residual MoE structure | Residual-routing design; orthogonal to the spectral kill mechanisms |
| MoE-I2 (arXiv:2411.01016) | INT2 MoE quantization | INT2 on MoE experts; R3a confirms the same inflation regime (+733%) for OLMoE FFN; mechanism gap holds but lever is insufficient |
| MiLo (arXiv:2504.02658) | Mixed-precision low-rank decomposition | Closest to R2/SRHT+INT4 axis; MiLo adds a learned low-rank correction; this ladder’s result is that data-free static rotation is the surviving lever |
| Collaborative Compression (arXiv:2509.25689) | Aggressive MoE edge-deployment (joint pruning + quantization) | Already demonstrates an aggressive composition escape for edge deployment; the sub-10GB failure at R6.5 is consistent with their finding that single-axis compression is insufficient, and their composition approach (pruning + quantization jointly) is the direction the R5+R2 anchor points toward |
What the Lean theorems add
The kill-theorems are not proofs that compression is impossible in general; they are proofs that specific mechanism families cannot satisfy specific gate conditions on the specific weight constants measured from OLMoE-1B-7B. This is useful because it rules out re-running the same dead mechanism with different hyperparameters, and it composes: if G_4 + G_7 are both Lean-killed, any proposed compressor that implicitly relies on either must either cite a new axiom or refute the measured constant.