The rapid advancement of generative AI has led to a significant rise in deepfake and manipulated media, including images, videos, and audio. These are increasingly used in fraud, impersonation, misinformation, and social engineering attacks. Elderly and digitally less-aware users are especially vulnerable to such threats.
This project presents an end-to-end, AI-powered system that verifies the authenticity of images and videos using pretrained deepfake detection models, explainable AI techniques, and asynchronous processing for heavy workloads.
The system is designed as a deployable software product, not a simulation.
- Image deepfake detection using pretrained models
- Video deepfake detection with temporal aggregation
- Confidence score with human-readable verdicts:
  - Real
  - Suspicious
  - Likely Fake
- Asynchronous video analysis using Redis + Celery
- Explainable AI support for images
- Privacy-first processing (no media permanently stored)
- API-driven architecture ready for web or mobile frontends
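The confidence-to-verdict mapping can be sketched in a few lines. The thresholds below (0.4 and 0.7) are illustrative placeholders, not the project's actual cut-offs:

```python
def verdict(prob: float) -> str:
    """Map a manipulation probability (0..1) to a human-readable verdict.

    Thresholds are illustrative assumptions, not the project's real values.
    """
    if prob < 0.4:
        return "Real"
    if prob < 0.7:
        return "Suspicious"
    return "Likely Fake"
```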
| Media Type | Status |
|---|---|
| Images (JPG, PNG) | Supported |
| Videos (MP4) | Supported |
| Audio | Planned |
| URLs / Links | Planned |
- Model: XceptionNet (pretrained on FaceForensics++)
- Input: RGB image
- Output: Manipulation probability
- Reason: Widely used research baseline with strong performance on compressed media
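The image-level interface can be sketched as follows. The tiny linear model below is only a stand-in for the pretrained XceptionNet (the real system loads the FaceForensics++ `xception_c23.p` weights); the input/output contract — an RGB tensor in, a manipulation probability out — is the same:

```python
import torch

# Stand-in classifier: any model emitting (N, 2) logits (real vs. fake)
# fits this interface; the project uses pretrained XceptionNet instead.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 299 * 299, 2),  # Xception expects 299x299 RGB
)
model.eval()

@torch.no_grad()
def manipulation_probability(image: torch.Tensor) -> float:
    """image: (3, 299, 299) RGB tensor, normalized as the model expects."""
    logits = model(image.unsqueeze(0))    # (1, 2) logits
    probs = torch.softmax(logits, dim=1)  # per-class probabilities
    return probs[0, 1].item()             # P(manipulated)
```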
- Frame sampling from video
- Image-level inference on selected frames
- Top-K temporal aggregation to focus on highly manipulated frames
- Final video-level confidence score
This approach follows standard practices used in deepfake detection research and industry systems.
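The Top-K aggregation step above can be sketched as a small NumPy function; `k=5` is an illustrative default, not the project's actual setting:

```python
import numpy as np

def video_confidence(frame_scores, k=5):
    """Aggregate per-frame manipulation scores into one video-level score.

    Top-K mean: average the K most suspicious frames, so a short
    manipulated segment is not diluted by many clean frames.
    (k=5 is an assumed default for illustration.)
    """
    scores = np.sort(np.asarray(frame_scores, dtype=float))
    top_k = scores[-min(k, len(scores)):]  # K highest scores
    return float(top_k.mean())
```

With 20 clean frames (score 0.1) and 5 manipulated ones (score 0.9), the plain mean is about 0.26, while the Top-5 mean is 0.9 — which is why the pipeline focuses on the most manipulated frames.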
- Python
- FastAPI
- PyTorch
- OpenCV
- Redis
- Celery
- XceptionNet (pretrained)
- Torchvision
- timm (PyTorch Image Models)
- NumPy
- Scikit-learn (evaluation and metrics)
- Asynchronous task queue (Celery)
- Message broker (Redis)
- CPU-based inference (GPU optional)
```
media-detector/
├── backend/
│   ├── api/              # FastAPI routes
│   ├── ml/               # ML models and evaluation
│   ├── tasks/            # Celery background tasks
│   ├── utils/            # Helpers and utilities
│   ├── main.py           # FastAPI entry point
│   └── celery_app.py     # Celery configuration
├── pretrained/           # Pretrained model weights
│   └── xception_c23.p
└── README.md
```
```
git clone <repository-url>
cd AI-Deepfake-Manipulated-Content-Verification-Tool/media-detector
pip install -r requirements.txt
```

Required packages include:
- torch
- torchvision
- timm
- fastapi
- uvicorn
- celery
- redis
- opencv-python
- pillow
- scikit-learn
Download the FaceForensics++ pretrained XceptionNet weights:
https://github.com/ondyari/FaceForensics/releases/download/v1.0/xception_c23.p
Place the file here:
media-detector/pretrained/xception_c23.p
Start Redis using Docker:

```
docker run -d -p 6379:6379 redis
```

Start the Celery worker:

```
celery -A backend.celery_app.celery_app worker --loglevel=info --pool=solo
```

Start the API server:

```
uvicorn backend.main:app
```

API documentation will be available at:
http://127.0.0.1:8000/docs
The system is evaluated using the FaceForensics++ dataset (C23 compression).
Metrics reported:
- Accuracy
- Precision
- Recall
- F1 Score
- ROC-AUC
Evaluation is performed using real inference, not simulated outputs.
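The reported metrics can be computed with scikit-learn as sketched below. The labels and scores here are illustrative toy data, not FaceForensics++ results; the 0.5 decision threshold is an assumption:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Toy ground truth (1 = fake) and model scores, for illustration only.
y_true = [0, 0, 1, 1, 1, 0]
y_score = [0.2, 0.4, 0.8, 0.9, 0.3, 0.1]   # model P(fake)
y_pred = [int(s >= 0.5) for s in y_score]  # assumed 0.5 threshold

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "roc_auc": roc_auc_score(y_true, y_score),  # threshold-free
}
```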
- Media files are processed in memory or temporary storage only
- No user content is permanently stored
- No personal data collection
- Designed to be compliant with privacy-first principles
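The temporary-storage pattern described above can be sketched as follows; `analyze` is a hypothetical stand-in for the real inference pipeline:

```python
import os
import tempfile

def process_upload(data: bytes) -> dict:
    """Write an upload to a temp file for analysis, then always delete it.

    Sketch of the privacy-first pattern: media exists on disk only for
    the duration of the analysis.
    """
    fd, path = tempfile.mkstemp(suffix=".mp4")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Placeholder result; the real system would call analyze(path).
        return {"bytes": len(data)}
    finally:
        os.remove(path)  # no media is permanently stored
```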
- Audio and live call analysis are not yet implemented
- Video detection relies on image-based models with temporal aggregation
- Performance depends on hardware (CPU vs GPU)