Clinical AI research repository for chest X-ray (CXR) analysis: multi-label pathology prediction and tuberculosis (TB) detection. Includes a FastAPI service for inference and Jupyter experiments for model development and comparison.
```
CXR/
├── app/                          # FastAPI application (inference API)
│   ├── main.py                   # App entry, config, router registration
│   ├── requirements.txt          # Python dependencies
│   ├── endpoints/
│   │   └── predict.py            # /predict endpoint
│   ├── schemas/
│   │   └── prediction_schema.py
│   ├── services/
│   │   ├── cxr_service.py        # Chest pathology (TorchXrayVision + Grad-CAM)
│   │   └── tb_service.py         # TB detection (DINOv2 + classifier)
│   └── utils/
│       ├── image_utils.py        # CXR preprocessing
│       └── gradcam_utils.py      # Bounding boxes from Grad-CAM
│
└── Experiments/                  # Jupyter notebooks and experiment docs
    ├── Chest pathology prediction/
    │   ├── README.md             # Full description of 5 experiment tracks
    │   └── *.ipynb               # DINOv2, CXR Foundation, TorchXrayVision, etc.
    └── TB detection/
        ├── README.md             # TB pipelines and results
        └── *.ipynb               # DenseNet, DINOv2, Rad-DINO MAIRA 2
```
The `app/` folder contains a FastAPI service that exposes a single prediction endpoint combining:
- Chest pathology (TorchXrayVision DenseNet121): multi-label probabilities for 14 pathologies (e.g. Atelectasis, Cardiomegaly, Effusion, Pneumonia, Pneumothorax) and Grad-CAM-derived bounding boxes for top findings.
- TB detection: DINOv2 (Stanford AIMI) backbone + a trained classifier (e.g. logistic regression) to output a binary TB finding.
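The endpoint reports boxes only for the top findings. A minimal sketch of how such a selection might work (the `top_findings` helper, the `k` limit, and the 0.3 threshold are illustrative assumptions, not the repo's actual logic):

```python
from typing import Dict, List

def top_findings(probs: Dict[str, float], k: int = 3, threshold: float = 0.3) -> List[str]:
    """Return up to k pathology names with probability >= threshold,
    sorted by descending probability (hypothetical selection rule)."""
    kept = [(name, p) for name, p in probs.items() if p >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept[:k]]

# Example multi-label output in the style of the API's prediction_result
probs = {"Atelectasis": 0.12, "Cardiomegaly": 0.08, "Effusion": 0.45, "Pneumonia": 0.22}
print(top_findings(probs))  # ['Effusion']
```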
| Path | Purpose |
|---|---|
| `main.py` | FastAPI app; sets `MPLCONFIGDIR` and `TORCHXRAYVISION_CACHE` under temp; mounts the predict router. |
| `endpoints/predict.py` | `GET /predict?image_url=<url>` — downloads the image, runs the CXR + TB pipelines, returns JSON. |
| `schemas/prediction_schema.py` | `PredictionResponse`: `prediction_result`, `bounding_box`, `tb_finding`. |
| `services/cxr_service.py` | Loads the TorchXrayVision DenseNet, runs inference, gets top predictions and Grad-CAM boxes. |
| `services/tb_service.py` | Loads DINOv2 + `logreg_model.joblib`, extracts embeddings, returns the TB class. |
| `utils/image_utils.py` | CXR normalization, center crop, resize to 224×224 for TorchXrayVision. |
| `utils/gradcam_utils.py` | Grad-CAM heatmaps and bounding boxes for top pathologies. |
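The `gradcam_utils.py` step can be pictured as thresholding a heatmap and taking the bounding box of the activated region. A minimal numpy sketch, assuming a heatmap normalized to [0, 1] and a single box per pathology (the function name and the 0.5 threshold are illustrative, not the repo's implementation):

```python
import numpy as np

def heatmap_to_box(heatmap: np.ndarray, threshold: float = 0.5):
    """Return [[x1, y1], [x2, y2]] enclosing all pixels whose activation
    exceeds threshold, or None if nothing activates (assumed convention)."""
    ys, xs = np.where(heatmap >= threshold)
    if ys.size == 0:
        return None
    return [[int(xs.min()), int(ys.min())], [int(xs.max()), int(ys.max())]]

# Synthetic 224x224 heatmap with one hot patch
hm = np.zeros((224, 224))
hm[50:100, 30:70] = 0.9
print(heatmap_to_box(hm))  # [[30, 50], [69, 99]]
```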
- `prediction_result`: dict of pathology name → probability (float).
- `bounding_box`: dict of pathology name → box `[[x1, y1], [x2, y2]]` (224×224 space).
- `tb_finding`: `1` (TB) or `0` (no TB).
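The three fields map naturally onto a typed model. A stdlib-dataclass sketch of the same shape (the actual `PredictionResponse` in `prediction_schema.py` is a Pydantic model; this only illustrates the field types):

```python
from dataclasses import dataclass
from typing import Dict, List

Box = List[List[int]]  # [[x1, y1], [x2, y2]] in 224x224 space

@dataclass
class PredictionResponseSketch:
    prediction_result: Dict[str, float]  # pathology -> probability
    bounding_box: Dict[str, Box]         # pathology -> box
    tb_finding: int                      # 1 (TB) or 0 (no TB)

resp = PredictionResponseSketch(
    prediction_result={"Effusion": 0.45},
    bounding_box={"Effusion": [[10, 50], [180, 200]]},
    tb_finding=0,
)
print(resp.tb_finding)  # 0
```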
Two experiment groups live under `Experiments/`. Each has its own README with objectives, pipelines, and results.

**Chest pathology prediction**

- Goal: Multi-label classification of 8 thoracic findings (Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax).
- Tracks:
- DL on DINOv2 — 768-D Stanford AIMI DINOv2 embeddings → MLP.
- ML on DINOv2 — Same embeddings → RF, XGBoost, SVM.
- Multi-model (ensemble) — DL + ML on DINOv2; separate models per class → ~79–81% val accuracy.
- Google CXR Foundation — CXR Foundation embeddings → RF, SVM, XGBoost, custom Dense NN.
- TorchXrayVision — Direct DenseNet121 inference + Grad-CAM.
- See `Experiments/Chest pathology prediction/README.md` for pipelines, metrics, and Mermaid diagrams.
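The ensemble track's "separate models per class" idea can be sketched as one binary scorer per pathology whose outputs are stitched into a multi-label prediction. The dummy scorers below stand in for the trained DL/ML models; the wiring, not the models, is the point:

```python
from typing import Callable, Dict, Sequence

def multilabel_predict(
    models: Dict[str, Callable[[Sequence[float]], float]],
    embedding: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, int]:
    """Run one binary model per class on a shared embedding and
    binarize each score independently (illustrative ensemble wiring)."""
    return {name: int(model(embedding) >= threshold) for name, model in models.items()}

# Dummy per-class scorers standing in for trained classifiers
models = {
    "Effusion": lambda e: 0.8,
    "Nodule": lambda e: 0.2,
}
print(multilabel_predict(models, [0.0] * 768))  # {'Effusion': 1, 'Nodule': 0}
```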
**TB detection**

- Goal: Binary TB vs. no-TB classification from CXR (e.g. TBX11k, HuggingFace TB datasets).
- Pipelines:
- DenseNet (TorchXrayVision) — Fine-tuned for TB; ~96.6% AUC, ~91.7% accuracy on TBX11k.
- DINO v2 (Stanford AIMI) — Frozen ViT → embeddings → linear/MLP head.
- Rad-DINO MAIRA 2 (Microsoft) — Radiology foundation model → embeddings → classifier.
- See `Experiments/TB detection/README.md` for details and diagrams.
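The frozen-backbone pipelines all reduce to "precomputed embedding → small trainable head". A self-contained numpy sketch of training a logistic-regression head on stand-in embeddings (random 2-D clusters play the role of the 768-D DINOv2/Rad-DINO features; the real experiments use scikit-learn on actual extracted embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in "embeddings": two well-separated clusters instead of real DINOv2 features
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)  # 0 = no TB, 1 = TB

w, b = np.zeros(2), 0.0
for _ in range(500):  # plain gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
print((pred == y).mean())  # training accuracy on these separable clusters
```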
The app uses models/approaches aligned with these experiments (TorchXrayVision for pathology, DINOv2 + a trained classifier for TB). The TB classifier file `logreg_model.joblib` is expected to be produced (or exported) from the TB experiments.
- Python 3.8+ (3.10+ recommended)
- pip
- Optional: CUDA for GPU (PyTorch, DINOv2, TorchXrayVision)
1. Clone and enter the repo:

   ```
   cd CXR
   ```

2. Create and activate a virtual environment:

   ```
   python -m venv venv
   # Windows
   venv\Scripts\activate
   # Linux/macOS
   source venv/bin/activate
   ```

3. Install app dependencies:

   ```
   cd app
   pip install -r requirements.txt
   ```

   The first run may download TorchXrayVision assets and HuggingFace models (DINOv2); cache dirs are set in `main.py` (e.g. under the system temp directory).

4. Provide the TB model for the API (required for `/predict` TB output). The TB service loads a classifier from disk:

   - Ensure `logreg_model.joblib` is in the current working directory when you start the server (i.e. place it inside `app/` or the directory from which you run the app), or
   - Train/export this model using one of the TB detection notebooks (e.g. DINOv2 + logistic regression) and copy the saved file to `app/`.
   - If you use a different path or filename, update `tb_service.py`: `joblib.load("your_path/logreg_model.joblib")`.

5. Run the API:

   ```
   # From app/ (so that logreg_model.joblib is in cwd if stored there)
   uvicorn main:app --reload --host 0.0.0.0 --port 8000
   ```
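Because `tb_service.py` loads `logreg_model.joblib` from the current working directory, starting uvicorn from another directory will fail. One way to make the lookup cwd-independent is to resolve the path relative to the module file instead; this is a sketch of an alternative, not the repo's current code (`SERVICE_DIR`, `MODEL_PATH`, and `load_tb_classifier` are illustrative names):

```python
from pathlib import Path

# Anchor the model path to this module's directory rather than the cwd,
# so `uvicorn main:app` works no matter where it is launched from.
SERVICE_DIR = Path(__file__).resolve().parent
MODEL_PATH = SERVICE_DIR / "logreg_model.joblib"

def load_tb_classifier():
    """Load the trained TB head, with a clear error if it is missing."""
    if not MODEL_PATH.exists():
        raise FileNotFoundError(
            f"TB classifier not found at {MODEL_PATH}; "
            "export it from the TB detection notebooks first."
        )
    import joblib  # deferred so the path check runs even without joblib installed
    return joblib.load(MODEL_PATH)

print(MODEL_PATH.name)  # logreg_model.joblib
```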
- Install the same dependencies (or use the same venv); add Jupyter if needed: `pip install jupyter`.
- Open the desired notebook under `Experiments/Chest pathology prediction/` or `Experiments/TB detection/` and run the cells (datasets and paths may need to be set in the notebook).
```
curl "http://localhost:8000/predict?image_url=https://example.com/path/to/cxr.png"
```

Example response shape:

```json
{
  "prediction_result": {
    "Atelectasis": 0.12,
    "Cardiomegaly": 0.08,
    "Effusion": 0.45,
    "Pneumonia": 0.22,
    ...
  },
  "bounding_box": {
    "Effusion": [[10, 50], [180, 200]],
    ...
  },
  "tb_finding": 0
}
```

Key dependencies:

- fastapi, uvicorn — API server
- torch, torchvision — PyTorch
- torchxrayvision — CXR pathology model
- transformers — DINOv2
- joblib — TB classifier
- pytorch-grad-cam — Grad-CAM
- scikit-image, opencv-python, pillow — image I/O and processing
- pydantic — request/response schemas
Full list: `app/requirements.txt`.