Ztrimus/llm-sensitivity

LLM Sensitivity

Documentation

Final Experimental Dataset

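from datasets import load_dataset
import pandas as pd

# Load the "full" split of the flip dataset from the Hugging Face Hub
ds = load_dataset("Ztrimus/llm-safety-flip-dataset", split="full")

# Preview a sample
print(ds[0])

# Analyze flip rate: the share of all rows whose original response is
# labeled "safe" and whose perturbed response is labeled "unsafe"
df = ds.to_pandas()
flip_rate = (
    (df.original_response_safety == "safe")
    & (df.perturbed_response_safety == "unsafe")
).mean()
print(f"Safe → Unsafe flip rate: {flip_rate:.2%}")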
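The flip rate above is measured over all rows. As a rough sketch, the same two columns can also give the rate among rows that started out safe, which is often the easier number to interpret. The helper name flip_summary and the toy rows are hypothetical and only illustrate the calculation; they are not part of the repository or the dataset.

```python
import pandas as pd


def flip_summary(df: pd.DataFrame) -> dict:
    """Summarize safe -> unsafe flips using the two safety label columns."""
    originally_safe = df["original_response_safety"] == "safe"
    flipped = originally_safe & (df["perturbed_response_safety"] == "unsafe")
    n_safe = int(originally_safe.sum())
    return {
        # Share of all rows that flipped (same quantity as flip_rate above)
        "overall_rate": float(flipped.mean()),
        # Share of originally safe rows that flipped
        "conditional_rate": float(flipped.sum() / n_safe) if n_safe else float("nan"),
        "n_flipped": int(flipped.sum()),
    }


# Toy rows for illustration only; not drawn from the real dataset.
toy = pd.DataFrame({
    "original_response_safety": ["safe", "safe", "unsafe", "safe"],
    "perturbed_response_safety": ["unsafe", "safe", "unsafe", "unsafe"],
})
summary = flip_summary(toy)
print(summary)
```

With the real data, flip_summary(ds.to_pandas()) would report both rates side by side, assuming the column names match those used above.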

Setup

module load mamba/latest
source activate llm_safety_39
  • Create credentials.py in src/config with your personal credentials:
ASURITE_ID = "YOUR_ASURITE_ID"
HF_TOKEN = "PUT_HF_TOKEN_HERE"
  • Make the contents of src importable:
cd llm-sensitivity
export PYTHONPATH="$(pwd)/src"
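
Before running experiments, it may help to confirm that the credentials file exists and no longer holds the placeholder values. The following is a minimal sketch, not part of the repository: check_credentials is a hypothetical helper, and it assumes the file sits at config/credentials.py under the directory you pass in, as described in the setup steps.

```python
import importlib.util
from pathlib import Path

# Placeholder strings taken from the setup instructions above
PLACEHOLDERS = {"YOUR_ASURITE_ID", "PUT_HF_TOKEN_HERE"}


def check_credentials(src_dir: str) -> list:
    """Return a list of problems found in <src_dir>/config/credentials.py."""
    path = Path(src_dir) / "config" / "credentials.py"
    if not path.is_file():
        return [f"missing file: {path}"]

    # Load the file directly by path, so PYTHONPATH does not need to be set
    spec = importlib.util.spec_from_file_location("credentials", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    problems = []
    for name in ("ASURITE_ID", "HF_TOKEN"):
        value = getattr(module, name, None)
        if not value:
            problems.append(f"{name} is not set")
        elif isinstance(value, str) and value in PLACEHOLDERS:
            problems.append(f"{name} still holds the placeholder value")
    return problems
```

From the repository root, check_credentials("src") should return an empty list once both values are filled in.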
