Methodology

Lock the rules before you play

Before running any experiment, we write tau.json:

Then we compute a SHA-256 fingerprint and store it. The experiment script checks the fingerprint before running. Edited file → broken seal → experiment refuses to start.

Mock tau.json

Seal status

LOCKED

SHA-256 (truncated)

Pre-registration template

Field Description Example
claim Exactly what we are testing G₄ r3 — uniform rank-16 layer compression
metric Exactly how we measure success activation-weighted Frobenius error
data_points Number of evaluation samples 1000
random_seeds Random seeds for τ baseline 10
decision_threshold Pass/fail threshold (locked) error < τ * (1 - 0.10)

Kill-fast protocol — saving time by being brutal

Rule 1 — Analytical Pre-Filter (5 min)

5 min, pencil + paper. Byte budget, spectrum floor, random floor. If any check fails, kill immediately. No GPU needed.

Rule 2 — Hardest-Cell-First (4 min vs 60 min)

Run Layer 3 + Wikitext first. If it fails, stop. ~4 minutes instead of ~60.

Rule 3 — One-Shot Decisions (instant)

No ‘inconclusive’. The decision rule is locked. Ambiguity = protocol failure, not an invitation to keep trying.

When an idea dies — celebrate, then publish

A failed rung is a result. Every kill produces:

  1. decision.json — machine-readable verdict
  2. postmortem.md — human-readable analysis
  3. trap_results.json — evidence the traps fired correctly
  4. Updated theorem tracker — Lean theorems proved, downgraded, or marked killed

Real postmortem excerpt — G₄ r6

Task NLL inflation
Code (humaneval) +9.9%
Math (gsm8k) +60.9%
Text (wikitext) +110.5%

Off-projection trap fired at +647% on wikitext — protocol working as intended.

The defensible contribution here is the falsification methodology and Lean kill-theorems, not any single compressor. The ladder does not propose a new compression algorithm; it systematically rules out mechanism families with machine-checked proofs of the ruling-out conditions.

What this work is not

Structured-rotation novelty is effectively zero relative to QuaRot (2024), SpinQuant (2024), and QuIP# (2024). The SRHT-vs-Haar ablation (R2 result: +1.767 pp, Hadamard·Rademacher beats Haar) is a controlled ablation within a known family, not a new compressor.

MoE compression prior art on the exact target models

Paper Target Relationship
D2-MoE (arXiv:2502.17298) MoE expert merging via weight averaging Addresses expert redundancy at the merge level; orthogonal to activation-subspace kill
Sub-MoE (arXiv:2506.23266) Task-conditional expert subset selection Closest prior to G_4/R3f axis; Sub-MoE shows task-conditioned selection is viable at the expert-selection level, but does not address the rank-compression interior or cross-corpus generalization failure
MoBE (arXiv:2508.05257) Mixture-of-Basis Experts decomposition Basis-sharing across experts, directly in the G_7 (joint-basis) kill zone; this ladder’s Lean kill-theorem (olmoe_joint_basis_no_low_distortion) is the formal counterpart
DeRS (arXiv:2503.01359) Dense-and-Residual MoE structure Residual-routing design; orthogonal to the spectral kill mechanisms
MoE-I2 (arXiv:2411.01016) INT2 MoE quantization INT2 on MoE experts; R3a confirms the same inflation regime (+733%) for OLMoE FFN; mechanism gap holds but lever is insufficient
MiLo (arXiv:2504.02658) Mixed-precision low-rank decomposition Closest to R2/SRHT+INT4 axis; MiLo adds a learned low-rank correction; this ladder’s result is that data-free static rotation is the surviving lever
Collaborative Compression (arXiv:2509.25689) Aggressive MoE edge-deployment (joint pruning + quantization) Already demonstrates an aggressive composition escape for edge deployment; the sub-10GB failure at R6.5 is consistent with their finding that single-axis compression is insufficient, and their composition approach (pruning + quantization jointly) is the direction the R5+R2 anchor points toward

What the Lean theorems add

The kill-theorems are not proofs that compression is impossible in general; they are proofs that specific mechanism families cannot satisfy specific gate conditions on the specific weight constants measured from OLMoE-1B-7B. This is useful because it rules out re-running the same dead mechanism with different hyperparameters, and it composes: if G_4 + G_7 are both Lean-killed, any proposed compressor that implicitly relies on either must either cite a new axiom or refute the measured constant.