
v0.7.0: Interactive Visualizations, New Aggregators and more

Released by @github-actions on 02 Feb 05:49

🌳 Treescope Integration for Interactive Visualizations (#283)

Inseq now integrates treescope, Google DeepMind's library for interactive model and tensor visualization. Two new methods and the attribute-context CLI output leverage this integration:

  • FeatureAttributionOutput.show_granular: Interactive visualization of multidimensional attribution tensors
  • FeatureAttributionSequenceOutput.show_tokens: Interactive token highlights
  • The visualize_attribute_context method for the inseq attribute-context CLI command now produces an interactive output.
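A minimal usage sketch of the two new methods (the model and attribution method are illustrative, and running this downloads GPT-2):

```python
import inseq

# Illustrative setup: any supported model and attribution method should work
attrib_model = inseq.load_model("gpt2", "attention")
out = attrib_model.attribute("Hello world,")

# Interactive treescope view of the multidimensional attribution tensors
out.show_granular()

# Interactive token highlights for the first attributed sequence
out[0].show_tokens()
```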

🔢 Enhanced Aggregation Capabilities (#282, #290)

SliceAggregator

A new SliceAggregator ("slices") allows slicing source (encoder-decoder) or target (decoder-only) tokens from attribution outputs. The __getitem__ method provides convenient [start:stop] syntax:

import inseq
from inseq.data.aggregator import SliceAggregator

attrib_model = inseq.load_model("gpt2", "attention")
input_prompt = """Instruction: Summarize this article.
Input_text: In a quiet village nestled between rolling hills, an ancient tree whispered secrets to those who listened. One night, a curious child named Elara leaned close and heard tales of hidden treasures beneath the roots. As dawn broke, she unearthed a shimmering box, unlocking a forgotten world of wonder and magic.
Summary:"""

full_output_prompt = input_prompt + " Elara discovers a shimmering box under an ancient tree, unlocking a world of magic."

out = attrib_model.attribute(input_prompt, full_output_prompt)[0]

# These are all equivalent ways to slice only the input text contents
out_sliced = out.aggregate(SliceAggregator, target_spans=(13,73))
out_sliced = out.aggregate("slices", target_spans=(13,73))
out_sliced = out[13:73]

StringSplitAggregator

A new StringSplitAggregator ("split") aggregates attributions over text spans matched by a regex pattern:

# Split on newlines. Default split_mode = "single".
out.aggregate("split", split_pattern="\n").aggregate("sum").show(do_aggregation=False)

# Split on whitespace-separated words of length 5.
out.aggregate("split", split_pattern=r"\s(\w{5})(?=\s)", split_mode="end")
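As an aside, the second pattern can be checked standalone with Python's re module (the sample sentence is arbitrary):

```python
import re

text = "The quick brown fox jumps over the lazy dog"
# Whitespace, then a captured 5-character word, followed by whitespace (lookahead)
pattern = r"\s(\w{5})(?=\s)"
print(re.findall(pattern, text))  # ['quick', 'brown', 'jumps']
```

With split_mode="end", the matched spans mark where each aggregated segment ends.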

PairAggregator Shortcut

The __sub__ method now serves as a shortcut for PairAggregator, enabling intuitive comparison of attribution outputs:

import inseq

attrib_model = inseq.load_model("gpt2", "saliency")

out_male = attrib_model.attribute(
    "The director went home because",
    "The director went home because he was tired",
    step_scores=["probability"]
)[0]
out_female = attrib_model.attribute(
    "The director went home because",
    "The director went home because she was tired",
    step_scores=["probability"]
)[0]
(out_male - out_female).show()

💾 Memory-Efficient Attribution Saving (#273)

New scores_precision parameter in FeatureAttributionOutput.save enables efficient saving in float16 and float8 formats:

import inseq

attrib_model = inseq.load_model("gpt2", "attention")
out = attrib_model.attribute("Hello world", generation_kwargs={'max_new_tokens': 100})

# Previous usage, memory inefficient
out.save("output.json")

# Memory-efficient saving
out.save("output_fp16.json", scores_precision="float16") # or "float8"

# Automatic conversion to float32
out_loaded = inseq.FeatureAttributionOutput.load("output_fp16.json")
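The saving comes from halving (or quartering) the bytes stored per score at the cost of precision. A quick standalone illustration of the float16 tradeoff (using numpy; inseq's actual serialization format may differ):

```python
import numpy as np

scores = np.array([0.12345679, 0.00012345], dtype=np.float32)

# float16 uses 2 bytes per value instead of 4
fp16 = scores.astype(np.float16)
assert fp16.nbytes == scores.nbytes // 2

# Round-tripping back to float32 keeps roughly 3 significant decimal digits
roundtrip = fp16.astype(np.float32)
print(np.max(np.abs(roundtrip - scores)))  # small, but nonzero
```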

🐍 Python 3.13 Support

Added support for Python 3.13. Current support is Python >= 3.10, <= 3.13.

🤖 New Model Support

Added configurations for new models:

  • DbrxForCausalLM
  • OlmoForCausalLM
  • Phi3ForCausalLM
  • Qwen2MoeForCausalLM
  • Gemma2ForCausalLM
  • OlmoeForCausalLM
  • GraniteForCausalLM
  • GraniteMoeForCausalLM

New models built on Hugging Face's modular architecture should work out of the box when they follow standard architectures such as LlamaForCausalLM.

💥 Breaking Changes

  • Dropped support for Python 3.9. Current support is Python >= 3.10, <= 3.13 (#283).

All Merged PRs

🚀 Features

  • Added treescope for interactive model and tensor visualization (#283) @gsarti
  • New treescope-powered methods FeatureAttributionOutput.show_granular and FeatureAttributionSequenceOutput.show_tokens for interactive visualization (#283) @gsarti
  • Added support for Python 3.13 @gsarti
  • Added new models DbrxForCausalLM, OlmoForCausalLM, Phi3ForCausalLM, Qwen2MoeForCausalLM, Gemma2ForCausalLM, OlmoeForCausalLM, GraniteForCausalLM, GraniteMoeForCausalLM to model config @gsarti
  • Added a rescale_attributions option to Inseq CLI commands to enable rescale=True (#280) @gsarti
  • Rows and columns in the visualization now have indices alongside tokens (#282) @gsarti
  • New parameter clean_special_chars in model.attribute to automatically clean special characters from output tokens (#289) @gsarti
  • Added scores_precision to FeatureAttributionOutput.save for efficient float16 and float8 saving (#273) @gsarti
  • New SliceAggregator for slicing tokens with [start:stop] syntax (#282) @gsarti
  • New StringSplitAggregator for regex-based aggregation (#290) @gsarti
  • __sub__ method shortcut for PairAggregator (#282) @gsarti

🔧 Fixes & Refactoring

  • Fixed an issue in the attention implementation where non-terminal positions were set to nan if their value was 0 (#269) @gsarti
  • Fixed the pad token for models where it is not specified by default (e.g. Qwen models) (#269) @gsarti
  • Fixed value_zeroing for SDPA attention, enabling use on models like GemmaForCausalLM without workarounds (#267) @gsarti
  • Fixed multi-device support and duplicate BOS for chat template models (#280) @gsarti
  • Clarified visualization directions using arrows instead of x/y (#282) @gsarti
  • Fixed support for multi-EOS tokens (e.g. LLaMA 3.2) (#287) @gsarti
  • Fixed copying of configuration parameters to aggregated FeatureAttributionSequenceOutput objects (#292) @gsarti

👥 List of Contributors

@LuukSuurmeijer, @gsarti and @niggoo