Clinical AI research repository for chest X-ray (CXR) analysis: multi-label pathology prediction and tuberculosis (TB) detection. Includes a FastAPI service for inference and Jupyter experiments for model development and comparison.
```
CXR/
├── app/                          # FastAPI application (inference API)
│   ├── main.py                   # App entry, config, router registration
│   ├── requirements.txt          # Python dependencies
│   ├── endpoints/
│   │   └── predict.py            # /predict endpoint
│   ├── schemas/
│   │   └── prediction_schema.py
│   ├── services/
│   │   ├── cxr_service.py        # Chest pathology (TorchXrayVision + Grad-CAM)
│   │   └── tb_service.py         # TB detection (DINOv2 + classifier)
│   └── utils/
│       ├── image_utils.py        # CXR preprocessing
│       └── gradcam_utils.py      # Bounding boxes from Grad-CAM
│
└── Experiments/                  # Jupyter notebooks and experiment docs
    ├── Chest pathology prediction/
    │   ├── README.md             # Full description of 5 experiment tracks
    │   └── *.ipynb               # DINOv2, CXR Foundation, TorchXrayVision, etc.
    └── TB detection/
        ├── README.md             # TB pipelines and results
        └── *.ipynb               # DenseNet, DINOv2, Rad-DINO MAIRA 2
```
The `app/` folder contains a FastAPI service that exposes a single prediction endpoint combining:
- Chest pathology (TorchXrayVision DenseNet121): multi-label probabilities for 14 pathologies (e.g. Atelectasis, Cardiomegaly, Effusion, Pneumonia, Pneumothorax) and Grad-CAM-derived bounding boxes for top findings.
- TB detection: DINOv2 (Stanford AIMI) backbone + a trained classifier (e.g. logistic regression) to output a binary TB finding.
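The endpoint reports boxes only for the top findings. A minimal sketch of how such a selection might work (the `top_findings` helper, the `k` limit, and the 0.3 threshold are illustrative assumptions, not the repo's actual logic):

```python
from typing import Dict, List

def top_findings(probs: Dict[str, float], k: int = 3, threshold: float = 0.3) -> List[str]:
    """Return up to k pathology names with probability >= threshold,
    sorted by descending probability (hypothetical selection rule)."""
    kept = [(name, p) for name, p in probs.items() if p >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept[:k]]

# Example multi-label output in the style of the API's prediction_result
probs = {"Atelectasis": 0.12, "Cardiomegaly": 0.08, "Effusion": 0.45, "Pneumonia": 0.22}
print(top_findings(probs))  # ['Effusion']
```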
| Path | Purpose |
|---|---|
| `main.py` | FastAPI app; sets `MPLCONFIGDIR` and `TORCHXRAYVISION_CACHE` under temp; mounts the predict router. |
| `endpoints/predict.py` | `GET /predict?image_url=<url>` — downloads the image, runs the CXR + TB pipelines, returns JSON. |
| `schemas/prediction_schema.py` | `PredictionResponse`: `prediction_result`, `bounding_box`, `tb_finding`. |
| `services/cxr_service.py` | Loads the TorchXrayVision DenseNet, runs inference, gets top predictions and Grad-CAM boxes. |
| `services/tb_service.py` | Loads DINOv2 + `logreg_model.joblib`, extracts embeddings, returns the TB class. |
| `utils/image_utils.py` | CXR normalization, center crop, resize to 224×224 for TorchXrayVision. |
| `utils/gradcam_utils.py` | Grad-CAM heatmaps and bounding boxes for top pathologies. |
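The `gradcam_utils.py` step can be pictured as thresholding a heatmap and taking the bounding box of the activated region. A minimal numpy sketch, assuming a heatmap normalized to [0, 1] and a single box per pathology (the function name and the 0.5 threshold are illustrative, not the repo's implementation):

```python
import numpy as np

def heatmap_to_box(heatmap: np.ndarray, threshold: float = 0.5):
    """Return [[x1, y1], [x2, y2]] enclosing all pixels whose activation
    exceeds threshold, or None if nothing activates (assumed convention)."""
    ys, xs = np.where(heatmap >= threshold)
    if ys.size == 0:
        return None
    return [[int(xs.min()), int(ys.min())], [int(xs.max()), int(ys.max())]]

# Synthetic 224x224 heatmap with one hot patch
hm = np.zeros((224, 224))
hm[50:100, 30:70] = 0.9
print(heatmap_to_box(hm))  # [[30, 50], [69, 99]]
```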
- `prediction_result`: dict of pathology name → probability (float).
- `bounding_box`: dict of pathology name → box `[[x1, y1], [x2, y2]]` (224×224 space).
- `tb_finding`: `1` (TB) or `0` (no TB).
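The three fields map naturally onto a typed model. A stdlib-dataclass sketch of the same shape (the actual `PredictionResponse` in `prediction_schema.py` is a Pydantic model; this only illustrates the field types):

```python
from dataclasses import dataclass
from typing import Dict, List

Box = List[List[int]]  # [[x1, y1], [x2, y2]] in 224x224 space

@dataclass
class PredictionResponseSketch:
    prediction_result: Dict[str, float]  # pathology -> probability
    bounding_box: Dict[str, Box]         # pathology -> box
    tb_finding: int                      # 1 (TB) or 0 (no TB)

resp = PredictionResponseSketch(
    prediction_result={"Effusion": 0.45},
    bounding_box={"Effusion": [[10, 50], [180, 200]]},
    tb_finding=0,
)
print(resp.tb_finding)  # 0
```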
Two experiment groups live under `Experiments/`. Each has its own README with objectives, pipelines, and results.

**Chest pathology prediction**

- Goal: Multi-label classification of 8 thoracic findings (Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax).
- Tracks:
- DL on DINOv2 — 768-D Stanford AIMI DINOv2 embeddings → MLP.
- ML on DINOv2 — Same embeddings → RF, XGBoost, SVM.
- Multi-model (ensemble) — DL + ML on DINOv2; separate models per class → ~79–81% val accuracy.
- Google CXR Foundation — CXR Foundation embeddings → RF, SVM, XGBoost, custom Dense NN.
- TorchXrayVision — Direct DenseNet121 inference + Grad-CAM.
- See `Experiments/Chest pathology prediction/README.md` for pipelines, metrics, and Mermaid diagrams.
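The ensemble track's "separate models per class" idea can be sketched as one binary scorer per pathology whose outputs are stitched into a multi-label prediction. The dummy scorers below stand in for the trained DL/ML models; the wiring, not the models, is the point:

```python
from typing import Callable, Dict, Sequence

def multilabel_predict(
    models: Dict[str, Callable[[Sequence[float]], float]],
    embedding: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, int]:
    """Run one binary model per class on a shared embedding and
    binarize each score independently (illustrative ensemble wiring)."""
    return {name: int(model(embedding) >= threshold) for name, model in models.items()}

# Dummy per-class scorers standing in for trained classifiers
models = {
    "Effusion": lambda e: 0.8,
    "Nodule": lambda e: 0.2,
}
print(multilabel_predict(models, [0.0] * 768))  # {'Effusion': 1, 'Nodule': 0}
```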
**TB detection**

- Goal: Binary TB vs. no-TB classification from CXR (e.g. TBX11k, HuggingFace TB datasets).
- Pipelines:
- DenseNet (TorchXrayVision) — Fine-tuned for TB; ~96.6% AUC, ~91.7% accuracy on TBX11k.
- DINO v2 (Stanford AIMI) — Frozen ViT → embeddings → linear/MLP head.
- Rad-DINO MAIRA 2 (Microsoft) — Radiology foundation model → embeddings → classifier.
- See `Experiments/TB detection/README.md` for details and diagrams.
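The frozen-backbone pipelines all reduce to "precomputed embedding → small trainable head". A self-contained numpy sketch of training a logistic-regression head on stand-in embeddings (random 2-D clusters play the role of the 768-D DINOv2/Rad-DINO features; the real experiments use scikit-learn on actual extracted embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in "embeddings": two well-separated clusters instead of real DINOv2 features
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)  # 0 = no TB, 1 = TB

w, b = np.zeros(2), 0.0
for _ in range(500):  # plain gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
print((pred == y).mean())  # training accuracy on these separable clusters
```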
The app uses models/approaches aligned with these experiments (TorchXrayVision for pathology, DINOv2 + a trained classifier for TB). The TB classifier file `logreg_model.joblib` is expected to be produced (or exported) from the TB experiments.
- Python 3.8+ (3.10+ recommended)
- pip
- Optional: CUDA for GPU (PyTorch, DINOv2, TorchXrayVision)
1. Clone and enter the repo:

   ```
   cd CXR
   ```

2. Create and activate a virtual environment:

   ```
   python -m venv venv
   # Windows
   venv\Scripts\activate
   # Linux/macOS
   source venv/bin/activate
   ```

3. Install app dependencies:

   ```
   cd app
   pip install -r requirements.txt
   ```

   The first run may download TorchXrayVision assets and HuggingFace models (DINOv2); cache dirs are set in `main.py` (e.g. under the system temp directory).

4. Provide the TB model for the API (required for `/predict` TB output). The TB service loads a classifier from disk:

   - Ensure `logreg_model.joblib` is in the current working directory when you start the server (i.e. place it inside `app/` or the directory from which you run the app), or
   - Train/export this model using one of the TB detection notebooks (e.g. DINOv2 + logistic regression) and copy the saved file to `app/`.
   - If you use a different path or filename, update `tb_service.py`: `joblib.load("your_path/logreg_model.joblib")`.

5. Run the API:

   ```
   # From app/ (so that logreg_model.joblib is in cwd if stored there)
   uvicorn main:app --reload --host 0.0.0.0 --port 8000
   ```
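Because `tb_service.py` loads `logreg_model.joblib` from the current working directory, starting uvicorn from another directory will fail. One way to make the lookup cwd-independent is to resolve the path relative to the module file instead; this is a sketch of an alternative, not the repo's current code (`SERVICE_DIR`, `MODEL_PATH`, and `load_tb_classifier` are illustrative names):

```python
from pathlib import Path

# Anchor the model path to this module's directory rather than the cwd,
# so `uvicorn main:app` works no matter where it is launched from.
SERVICE_DIR = Path(__file__).resolve().parent
MODEL_PATH = SERVICE_DIR / "logreg_model.joblib"

def load_tb_classifier():
    """Load the trained TB head, with a clear error if it is missing."""
    if not MODEL_PATH.exists():
        raise FileNotFoundError(
            f"TB classifier not found at {MODEL_PATH}; "
            "export it from the TB detection notebooks first."
        )
    import joblib  # deferred so the path check runs even without joblib installed
    return joblib.load(MODEL_PATH)

print(MODEL_PATH.name)  # logreg_model.joblib
```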
- Install the same dependencies (or use the same venv); add Jupyter if needed: `pip install jupyter`.
- Open the desired notebook under `Experiments/Chest pathology prediction/` or `Experiments/TB detection/` and run the cells (datasets and paths may need to be set in the notebook).
```
curl "http://localhost:8000/predict?image_url=https://example.com/path/to/cxr.png"
```

Example response shape:

```json
{
  "prediction_result": {
    "Atelectasis": 0.12,
    "Cardiomegaly": 0.08,
    "Effusion": 0.45,
    "Pneumonia": 0.22,
    ...
  },
  "bounding_box": {
    "Effusion": [[10, 50], [180, 200]],
    ...
  },
  "tb_finding": 0
}
```

Key dependencies:

- fastapi, uvicorn — API server
- torch, torchvision — PyTorch
- torchxrayvision — CXR pathology model
- transformers — DINOv2
- joblib — TB classifier
- pytorch-grad-cam — Grad-CAM
- scikit-image, opencv-python, pillow — image I/O and processing
- pydantic — request/response schemas
Full list: `app/requirements.txt`.