## 🌳 Treescope Integration for Interactive Visualizations (#283)
Inseq now integrates `treescope`, Google DeepMind's library for interactive model and tensor visualization. Two new methods leverage this integration:

- `FeatureAttributionOutput.show_granular`: interactive visualization of multidimensional attribution tensors
- `FeatureAttributionSequenceOutput.show_tokens`: interactive token highlights
- The `visualize_attribute_context` method for the `inseq attribute-context` CLI command now produces an interactive output.
## 🔢 Enhanced Aggregation Capabilities (#282, #290)
### `SliceAggregator`
A new `SliceAggregator` (`"slices"`) allows slicing source (for encoder-decoder models) or target (for decoder-only models) tokens from attribution outputs. The `__getitem__` method provides a convenient `[start:stop]` syntax:
```python
import inseq
from inseq.data.aggregator import SliceAggregator

attrib_model = inseq.load_model("gpt2", "attention")

input_prompt = """Instruction: Summarize this article.
Input_text: In a quiet village nestled between rolling hills, an ancient tree whispered secrets to those who listened. One night, a curious child named Elara leaned close and heard tales of hidden treasures beneath the roots. As dawn broke, she unearthed a shimmering box, unlocking a forgotten world of wonder and magic.
Summary:"""
full_output_prompt = input_prompt + " Elara discovers a shimmering box under an ancient tree, unlocking a world of magic."

out = attrib_model.attribute(input_prompt, full_output_prompt)[0]

# These are all equivalent ways to slice only the input text contents
out_sliced = out.aggregate(SliceAggregator, target_spans=(13, 73))
out_sliced = out.aggregate("slices", target_spans=(13, 73))
out_sliced = out[13:73]
```
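Conceptually, the `[start:stop]` syntax just forwards the slice bounds to the aggregation call. A minimal standalone sketch of that pattern (toy class for illustration only, not Inseq's actual internals):

```python
class SliceableOutput:
    """Toy sketch: forwards [start:stop] to an aggregation-style call."""

    def __init__(self, tokens: list) -> None:
        self.tokens = tokens

    def aggregate_slices(self, target_spans: tuple) -> "SliceableOutput":
        # Keep only the tokens inside the requested span
        start, stop = target_spans
        return SliceableOutput(self.tokens[start:stop])

    def __getitem__(self, key: slice) -> "SliceableOutput":
        # out[13:73] becomes aggregate_slices(target_spans=(13, 73))
        return self.aggregate_slices(target_spans=(key.start, key.stop))


out_toy = SliceableOutput(list("abcdefgh"))
print(out_toy[2:5].tokens)  # → ['c', 'd', 'e']
```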
### `StringSplitAggregator`

A new `StringSplitAggregator` (`"split"`) supports complex aggregation procedures based on regex pattern matching:
```python
# Split on newlines. The default split_mode is "single".
out.aggregate("split", split_pattern="\n").aggregate("sum").show(do_aggregation=False)

# Split on whitespace-separated words of length 5
out.aggregate("split", split_pattern=r"\s(\w{5})(?=\s)", split_mode="end")
```
### `PairAggregator` Shortcut

The `__sub__` method now serves as a shortcut for `PairAggregator`, enabling intuitive comparison of attribution outputs:
```python
import inseq

attrib_model = inseq.load_model("gpt2", "saliency")

out_male = attrib_model.attribute(
    "The director went home because",
    "The director went home because he was tired",
    step_scores=["probability"],
)[0]
out_female = attrib_model.attribute(
    "The director went home because",
    "The director went home because she was tired",
    step_scores=["probability"],
)[0]

(out_male - out_female).show()
```
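The shortcut amounts to implementing `__sub__` as a pairwise comparison of scores. A toy sketch of the idea (simplified stand-in class, not Inseq's implementation, which delegates to `PairAggregator`):

```python
from dataclasses import dataclass


@dataclass
class Attribution:
    """Toy stand-in holding one score per token."""

    scores: list

    def __sub__(self, other: "Attribution") -> "Attribution":
        # Pairwise score difference, mirroring the PairAggregator shortcut
        return Attribution([a - b for a, b in zip(self.scores, other.scores)])


diff = Attribution([0.5, 0.2, 0.3]) - Attribution([0.4, 0.4, 0.2])
print([round(s, 2) for s in diff.scores])  # → [0.1, -0.2, 0.1]
```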
## 💾 Memory-Efficient Attribution Saving (#273)

A new `scores_precision` parameter in `FeatureAttributionOutput.save` enables memory-efficient saving in `float16` and `float8` formats:
```python
import inseq

attrib_model = inseq.load_model("gpt2", "attention")
out = attrib_model.attribute("Hello world", generation_kwargs={"max_new_tokens": 100})

# Previous usage, memory-inefficient
out.save("output.json")

# Memory-efficient saving
out.save("output_fp16.json", scores_precision="float16")  # or "float8"

# Scores are automatically converted back to float32 on load
out_loaded = inseq.FeatureAttributionOutput.load("output_fp16.json")
```
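The saving comes from narrower floats (2 bytes per `float16` score vs. 4 per `float32`), at the cost of some precision on the round trip. A standalone illustration using the stdlib `struct` module, which supports half precision via the `"e"` format (there is no stdlib equivalent for `float8`):

```python
import struct

scores = [0.123456789, 0.987654321, 0.5]

fp32 = struct.pack(f"{len(scores)}f", *scores)  # 4 bytes per score
fp16 = struct.pack(f"{len(scores)}e", *scores)  # 2 bytes per score
print(len(fp32), len(fp16))  # → 12 6

# Round-tripping through float16 loses a little precision, analogous to
# saving with scores_precision="float16" and upcasting to float32 on load
restored = struct.unpack(f"{len(scores)}e", fp16)
print(abs(restored[0] - scores[0]) < 1e-3)  # → True
```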
## 🐍 Python 3.13 Support

Added support for Python 3.13. Supported versions are now Python >= 3.10, <= 3.13.
## 🤖 New Model Support
Added configurations for new models:
- `DbrxForCausalLM`
- `OlmoForCausalLM`
- `Phi3ForCausalLM`
- `Qwen2MoeForCausalLM`
- `Gemma2ForCausalLM`
- `OlmoeForCausalLM`
- `GraniteForCausalLM`
- `GraniteMoeForCausalLM`
All new models based on the modular Hugging Face architecture should work out of the box if they follow standard architectures such as `LlamaForCausalLM`.
## 💥 Breaking Changes
- Dropped support for Python 3.9. Current support is Python >= 3.10, <= 3.13 (#283).
## All Merged PRs
### 🚀 Features
- Added `treescope` for interactive model and tensor visualization (#283) @gsarti
- New `treescope`-powered methods `FeatureAttributionOutput.show_granular` and `FeatureAttributionSequenceOutput.show_tokens` for interactive visualization (#283) @gsarti
- Added support for Python 3.13 @gsarti
- Added new models `DbrxForCausalLM`, `OlmoForCausalLM`, `Phi3ForCausalLM`, `Qwen2MoeForCausalLM`, `Gemma2ForCausalLM`, `OlmoeForCausalLM`, `GraniteForCausalLM`, `GraniteMoeForCausalLM` to model config @gsarti
- Add `rescale_attributions` to Inseq CLI commands for `rescale=True` (#280) @gsarti
- Rows and columns in the visualization now have indices alongside tokens (#282) @gsarti
- New parameter `clean_special_chars` in `model.attribute` to automatically clean special characters from output tokens (#289) @gsarti
- Added `scores_precision` to `FeatureAttributionOutput.save` for efficient `float16` and `float8` saving (#273) @gsarti
- New `SliceAggregator` for slicing tokens with `[start:stop]` syntax (#282) @gsarti
- New `StringSplitAggregator` for regex-based aggregation (#290) @gsarti
- `__sub__` method shortcut for `PairAggregator` (#282) @gsarti
### 🔧 Fixes & Refactoring
- Fix an issue in the attention implementation where non-terminal positions were set to `nan` if they were 0s (#269) @gsarti
- Fix the pad token for models where it is not specified by default (e.g. Qwen models) (#269) @gsarti
- Fix `value_zeroing` for SDPA attention, enabling its use on models like `GemmaForCausalLM` without workarounds (#267) @gsarti
- Fix multi-device support and duplicate BOS for chat template models (#280) @gsarti
- Clarified visualization directions using arrows instead of x/y (#282) @gsarti
- Fix support for multi-EOS tokens (e.g. LLaMA 3.2) (#287) @gsarti
- Fix copying of configuration parameters to aggregated `FeatureAttributionSequenceOutput` objects (#292) @gsarti
### 📝 Documentation and Tutorials
- Updated tutorial with `treescope` usage examples @gsarti
- New tutorial on advanced analyses of RAG and reasoning models, available in `reasoning_rag_attribution.ipynb`