MBZUAI NLP department

55 repositories

SemEval-2026-Task13
Public
Jupyter Notebook
•
Apache License 2.0
•16•33•0•0•Updated Feb 19, 2026
ImageCLEF-MultimodalReasoning
Public
Multimodal reasoning shared task
HTML
•1•3•0•0•Updated Feb 13, 2026
PAN-CLEF2026-Reasoning-Trajectory-Detection
Public
PAN CLEF 2026 Shared Task: Reasoning Trajectory Detection
Other
•0•0•0•0•Updated Feb 12, 2026
corporate-bias
Public
0•0•0•0•Updated Feb 9, 2026
CLEF-2026-FinMMEval-Lab
Public
1•6•0•0•Updated Feb 4, 2026
FAID
Public
Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning.
Python
•
MIT License
•2•0•0•0•Updated Jan 18, 2026
CASA
Public
Clinical Annotations for Stuttering Assessment
Python
•1•2•0•0•Updated Jan 15, 2026
finchain
Public
A symbolic benchmark for verifiable chain-of-thought financial reasoning. Includes executable templates, 58 topics across 12 domains, and ChainEval metrics.
benchmark symbolic-reasoning financial-nlp llm chain-of-thought financial-reasoning
Python
•4•25•2•1•Updated Dec 26, 2025
SAHM
Public
Python
•
Apache License 2.0
•0•1•0•0•Updated Dec 1, 2025
llm-tad-uncertainty
Public
Jupyter Notebook
•0•4•0•0•Updated Nov 1, 2025
Personalized_MGT_Detect
Public
Official Repository for paper "When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection"
Python
•0•5•0•0•Updated Oct 15, 2025
llm-media-profiling
Public
This repository contains the code, dataset, and resources for our ACL 2025 paper: "Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking…
Jupyter Notebook
•
MIT License
•0•9•0•0•Updated Oct 13, 2025
stutterbank
Public
JavaScript
•0•0•0•0•Updated Oct 7, 2025
AudioJailbreak
Public
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models
Python
•3•30•1•0•Updated Oct 6, 2025
spirit-breaking
Public
Python
•1•1•0•0•Updated Oct 5, 2025
qorgau-kaz-ru-safety
Public
A benchmark and evaluation framework for assessing the safety of language models in Kazakh and Russian.
Jupyter Notebook
•1•1•0•0•Updated Sep 28, 2025
qraft
Public
Python
•0•1•0•0•Updated Sep 17, 2025
SPECS
Public
SPECS: Specificity-Enhanced CLIP-Score for Long Image Caption Evaluation (Accepted by EMNLP 2025 Main)
Python
•
Apache License 2.0
•1•8•1•1•Updated Sep 2, 2025
arabic-aes-bea25
Public
Repo for ARWI_generate_data (Arabic Read, Write and Improve)
Python
•0•0•0•0•Updated Aug 18, 2025
OpenFactCheck
Public
An Open-source Factuality Evaluation Demo for LLMs
natural-language-processing artificial-intelligence fact-checking
Python
•
GNU General Public License v3.0
•3•23•1•0•Updated Aug 10, 2025
UnsafeChain
Public
Python
•
Apache License 2.0
•0•4•0•0•Updated Jul 30, 2025
ArTST
Public
Python
•8•65•1•0•Updated Jul 10, 2025
NADI-2025---Subatsk-3-Diacritic-Restoration
Public
Python
•0•0•0•0•Updated Jun 12, 2025
Arabic_safety_evaluation
Public
A Benchmark and Evaluation framework for evaluating Arabic LLM safeguards
Jupyter Notebook
•2•5•0•0•Updated Jun 11, 2025
fire
Public
A lightweight, agent-style framework for fact-checking atomic claims using iterative retrieval and verification. Reduces LLM and search cost while maintaining s…
framework retrieval verification factchecking factuality llm llm-agent hallucination-detection
Python
•3•14•0•0•Updated Jun 4, 2025
UrduFactCheck
Public
An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking
Jupyter Notebook
•0•2•0•0•Updated May 30, 2025
PAN-CLEF2025GenAIDetection-Subtask2
Public
Python
•1•4•0•0•Updated May 29, 2025
entity-framing
Public
JavaScript
•0•0•0•0•Updated May 28, 2025
Multilingual-ST
Public
Multilingual Statement Tuning
Jupyter Notebook
•0•2•0•0•Updated May 28, 2025
arab_culture
Public
Other
•1•4•0•0•Updated May 26, 2025