Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning


FAID

Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning.

Built with the tools and technologies: GNU Bash, Python


The dataset for this project, FAIDSet, is available on Hugging Face as ngocminhta/FAIDSet.

📍 Overview

FAID detects AI-generated text at a fine-grained level. It provides tools for generating, managing, and evaluating text embeddings, which are then used to classify content as human-written, AI-generated, or mixed. It is aimed at technology companies and cybersecurity practitioners who need to assess the provenance of text.
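
To make the idea of classifying text from its embedding more concrete, here is a minimal sketch of nearest-neighbour voting over a small labelled embedding store. It is an illustration only, not the code in this repository: the function names, the toy 2-D vectors, and the choice of cosine similarity with majority voting are all assumptions.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_label(query, database, k=3):
    """Return the majority label among the k stored embeddings most similar to `query`.

    `database` is a list of (embedding, label) pairs; this layout is hypothetical.
    """
    ranked = sorted(
        database,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for real text embeddings (assumed data).
toy_database = [
    ([1.0, 0.0], "human"),
    ([0.9, 0.1], "human"),
    ([0.0, 1.0], "ai"),
    ([0.1, 0.9], "ai"),
    ([0.7, 0.7], "mixed"),
    ([0.6, 0.8], "mixed"),
]

print(knn_label([0.95, 0.05], toy_database, k=3))
```

In a real setting the vectors would come from a trained text encoder and the store would be an indexed vector database rather than a Python list.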


📁 Project Structure

└── FAID/
    ├── README.md
    ├── algorithm
    │   ├── gen_database.py
    │   ├── infer.py
    │   ├── requirements.txt
    │   ├── src
    │   │   ├── index.py
    │   │   ├── simclr.py
    │   │   └── text_embedding.py
    │   ├── test_from_database.py
    │   ├── train_classifier.py
    │   └── utils
    │       ├── load_dataset.py
    │       └── utils.py
    └── data
        ├── FAIDSet
        ├── Unseen_Domain
        ├── Unseen_Domain_and_Unseen_Generator
        └── Unseen_Generator
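
After cloning, a quick way to confirm that a checkout matches the layout above is a small script like the following. This is a convenience sketch, not part of the repository; the helper name `missing_paths` and the selection of paths to check are assumptions taken from the tree shown here.

```python
from pathlib import Path

# Paths taken from the project tree above; adjust if the layout changes.
EXPECTED_PATHS = [
    "algorithm/train_classifier.py",
    "algorithm/gen_database.py",
    "algorithm/test_from_database.py",
    "algorithm/requirements.txt",
    "data/FAIDSet",
    "data/Unseen_Domain",
    "data/Unseen_Generator",
    "data/Unseen_Domain_and_Unseen_Generator",
]


def missing_paths(repo_root):
    """Return the expected relative paths that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root (the FAID/ directory).
    missing = missing_paths(".")
    if missing:
        print("Missing paths:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Layout looks complete.")
```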

🚀 Getting Started

☑️ Prerequisites

Before getting started with FAID, ensure your runtime environment meets the following requirements:

  • Programming Language: Python
  • Package Manager: Pip

⚙️ Installation

Build FAID from source:

  1. Clone the FAID repository:
❯ git clone https://github.com/ngocminhta/FAID
  2. Navigate to the project directory:
❯ cd FAID
  3. Install the project dependencies with pip:
❯ pip install -r algorithm/requirements.txt

🤖 Usage

Run FAID with the commands below, replacing <your parameter goes here> with your own arguments.

To train the model

❯ python algorithm/train_classifier.py <your parameter goes here>
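
The project title and the `simclr.py` module suggest that training relies on contrastive objectives. As a rough illustration of that family of losses, the sketch below implements the standard NT-Xent (SimCLR-style) loss in plain Python. It is not the FAID training loss, which the title describes as multi-level and multi-task; the function names, the temperature value, and the toy inputs are assumptions.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Average NT-Xent loss over a batch of positive pairs.

    view_a[i] and view_b[i] are embeddings of two views of the same sample;
    every other embedding in the batch serves as a negative.
    """
    if len(view_a) != len(view_b) or not view_a:
        raise ValueError("view_a and view_b must be non-empty and the same length")
    embeddings = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i, anchor in enumerate(embeddings):
        positive = (i + n) % (2 * n)  # index of the paired view
        # Temperature-scaled similarities to every embedding except the anchor itself.
        logits = {
            j: _cosine(anchor, other) / temperature
            for j, other in enumerate(embeddings)
            if j != i
        }
        log_denominator = math.log(sum(math.exp(v) for v in logits.values()))
        total += log_denominator - logits[positive]
    return total / (2 * n)


# Toy check: matching pairs should give a lower loss than swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = nt_xent_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss falls as each embedding moves closer to its paired view and further from the rest of the batch, which is what makes the learned embeddings separable by label.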

To generate the vector database after training

❯ python algorithm/gen_database.py <your parameter goes here>

🧪 Testing

Run the tests using the following command:

❯ python algorithm/test_from_database.py <your parameter goes here>

📌 News

[2026.01.04] Our research paper has been accepted to the EACL 2026 Main Conference!

[2025.05.20] Our research paper is now publicly accessible on arXiv.

[2025.05.06] Our project is publicly accessible.


🎗 License

This project is released under the MIT License.


🙌 Acknowledgments

This research was carried out at:

  • BKAI Research Center, Hanoi University of Science and Technology.
  • Natural Language Processing Department, Mohamed bin Zayed University of Artificial Intelligence.

🔬 Citation

@misc{ta2025faidfinegrainedaigeneratedtext,
      title={FAID: Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning}, 
      author={Minh Ngoc Ta and Dong Cao Van and Duc-Anh Hoang and Minh Le-Anh and Truong Nguyen and My Anh Tran Nguyen and Yuxia Wang and Preslav Nakov and Sang Dinh},
      year={2025},
      eprint={2505.14271},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.14271}, 
}
