Structured Stochasticity: Beyond Single-Trajectory Reasoning

An experimental framework for testing whether injecting structured noise into LLM hidden states can mitigate reasoning collapse on complex algorithmic tasks.

Hypothesis

Contemporary reasoning models exhibit performance collapse on complex tasks not due to fundamental capacity limits, but because single-trajectory deterministic inference commits early to suboptimal representations with no recovery mechanism.

By introducing structured stochasticity—noise vectors injected into the inference flow—we can enable trajectory resampling analogous to how humans naturally reframe problems when encountering cognitive dead ends.

Core Idea

Standard (weak stochasticity):

  Input → [Deterministic h] → Sample Output

Proposed (strong stochasticity):

  Input + z → [Stochastic h] → Sample Output
          ↑
          z ~ P(z|X)  (latent noise)
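The mechanism can be pictured with a PyTorch forward hook, the access pattern named in hooks.py. This is a minimal illustrative sketch, not the framework's actual implementation: the hook adds a scaled Gaussian noise vector z to a layer's output on every forward pass, so repeated runs of the same input trace different trajectories.

```python
import torch
import torch.nn as nn

def make_noise_hook(noise_scale: float):
    """Return a forward hook that adds scaled Gaussian noise to a layer's output."""
    def hook(module, inputs, output):
        z = torch.randn_like(output)      # fresh z ~ N(0, I) each forward pass
        return output + noise_scale * z   # the returned tensor replaces the output
    return hook

# Attach to any nn.Module; an early layer stands in for "early injection"
layer = nn.Linear(16, 16)
handle = layer.register_forward_hook(make_noise_hook(noise_scale=0.1))

x = torch.zeros(1, 16)
y1, y2 = layer(x), layer(x)   # two stochastic trajectories from the same input
handle.remove()               # detaching the hook restores deterministic inference
```

Removing the hook handle is what lets the same wrapped model fall back to standard single-trajectory decoding.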

Project Structure

structured-stochasticity/
├── src/structured_stochasticity/
│   ├── __init__.py
│   ├── injection.py        # Noise injection strategies
│   ├── hooks.py            # PyTorch forward hooks for hidden state access
│   ├── tasks.py            # Benchmark tasks (Tower of Hanoi, etc.)
│   ├── evaluation.py       # Metrics and evaluation logic
│   └── experiment.py       # Main experiment runner
├── configs/
│   └── default.yaml        # Default experiment configuration
├── experiments/            # Saved experiment results
├── notebooks/              # Analysis notebooks
├── tests/
│   └── test_injection.py   # Unit tests
├── requirements.txt
├── setup.py
└── README.md

Installation

git clone https://github.com/isztldav/structured-stochasticity.git
cd structured-stochasticity
pip install -e .

Quick Start

from structured_stochasticity import NoisyInferenceWrapper, TowerOfHanoi
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Wrap with noise injection
noisy_model = NoisyInferenceWrapper(
    model,
    injection_layers=[0, 1, 2],  # Early layers
    noise_scale=0.1,
    injection_mode="continuous"  # or "once" 
)

# Run experiment
task = TowerOfHanoi(num_disks=4)
results = noisy_model.solve_with_trajectories(
    task,
    tokenizer,
    k_trajectories=5,
    selection_method="majority_vote"
)

Experiment Configuration

Edit configs/default.yaml:

model:
  name: "meta-llama/Llama-3.2-1B"
  device: "cuda"

injection:
  layers: [0, 1, 2, 3]      # Which layers to inject noise into
  scale: 0.1                 # Noise magnitude
  mode: "continuous"         # "once" | "continuous" | "annealed"
  anneal_factor: 0.95        # For annealed mode

task:
  name: "tower_of_hanoi"
  complexity_range: [3, 8]   # Min/max disks

evaluation:
  k_trajectories: [1, 3, 5, 10, 20]
  selection: "majority_vote"  # "majority_vote" | "verifier" | "best_of_k"
  num_trials: 50
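A plausible reading of anneal_factor under the annealed mode is a per-step geometric decay of the noise scale; this is an assumption about the schedule, since the exact behavior lives in injection.py:

```python
def annealed_scale(base_scale: float, anneal_factor: float, step: int) -> float:
    """Assumed annealing schedule: scale_t = base_scale * anneal_factor ** t,
    i.e. strong exploration early, near-deterministic decoding late."""
    return base_scale * anneal_factor ** step

# With the defaults above (scale: 0.1, anneal_factor: 0.95):
schedule = [annealed_scale(0.1, 0.95, t) for t in range(3)]
# step 0 -> 0.1, step 1 -> 0.095, step 2 -> 0.09025
```

This shape matches the hypothesis: noise matters most during early problem framing and should fade as the output is realized.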

Running Experiments

# Single experiment
python -m structured_stochasticity.experiment --config configs/default.yaml

# Sweep over noise scales
python -m structured_stochasticity.experiment \
    --config configs/default.yaml \
    --sweep injection.scale 0.01 0.05 0.1 0.2 0.5

Key Questions This Framework Tests

  1. Does K-trajectory sampling improve max solvable complexity?

    • Compare accuracy vs. complexity curves for K=1,5,10,20
  2. Where should noise be injected?

    • Early layers (problem framing) vs. late layers (output realization)
  3. When should noise be injected?

    • Once at start vs. continuous vs. annealed
  4. What noise magnitude works best?

    • Sweep over scales; the sweet spot is expected to vary with task difficulty
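Of the selection methods, majority_vote is the simplest to picture: run K noisy trajectories, normalize each final answer, and return the most frequent one. A sketch under that assumption (the framework's evaluation.py may canonicalize answers differently):

```python
from collections import Counter

def majority_vote(trajectories: list[str]) -> str:
    """Pick the most frequent final answer among K sampled trajectories.
    Counter.most_common is stable, so ties break toward the answer seen first."""
    counts = Counter(t.strip() for t in trajectories)
    return counts.most_common(1)[0][0]

# Three hypothetical trajectories for a 2-disk Tower of Hanoi; two agree
answers = ["A->C, A->B, C->B", "A->B, A->C, B->C", "A->C, A->B, C->B"]
majority_vote(answers)   # -> "A->C, A->B, C->B"
```

The verifier and best_of_k options would replace the frequency count with an external correctness check or a scoring model, respectively.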

Citation

If you use this framework, please cite:

@misc{isztl2025structured,
  author = {Isztl, Dávid},
  title = {Beyond Single-Trajectory Reasoning: Structured Stochasticity as a Remedy for Reasoning Collapse in Large Language Models},
  year = {2025}
}

License

MIT

References

  • Shojaee et al. (2025). "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity"
