Sharing installation process, and question about evaluating on other NuScenes splits

Hi there, thanks for sharing your work! Below are the steps I took to install the repo and evaluate DriveVLA. Further below, **I have a question about evaluating the model on other splits of the NuScenes dataset**.

# Installation Process and Evaluation

I am working on a cluster running Red Hat 9.4 with CUDA 12.8 and an NVIDIA A40 GPU.

After cloning the repo, I first edited the train packages in `pyproject.toml` as follows:
```
train = [
    "llava[standalone]",
    #"numpy==1.26.1",
    "open_clip_torch",
    "fastapi",
    "gradio==3.35.2",
    "markdown2[all]",
    #"numpy",
    "requests",
    "sentencepiece",
    #"torch==2.1.2",
    #"torchvision==0.16.2",
    "uvicorn",
    "wandb",
    "deepspeed==0.14.2",
    "peft==0.4.0",
    "accelerate==0.29.3",
    "tokenizers~=0.15.2",
    "transformers@git+https://github.com/huggingface/transformers.git@1c39974a4c4036fd641bc1191cc32799f85715a4",
    "flash-attn==2.5.7",
    "bitsandbytes==0.41.0",
    #"scikit-learn==1.2.2",
    "sentencepiece~=0.1.99",
    "einops==0.6.1",
    "einops-exts==0.0.4",
    "gradio_client==0.2.9",
    "urllib3<=2.0.0",
    "datasets==2.16.1",
    "pydantic==1.10.8",
    "timm",
    "hf_transfer",
    "opencv-python",
    "av",
    "decord",
    "tyro",
    #"scipy",
    "pytorch-lightning==1.2.5",
    "torchmetrics==0.11.4",
]
```

Then I was able to install the repo as follows:
```
conda create -n drivevla python=3.10
conda activate drivevla
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121
pip install numpy==1.26.4
cd third_party/mmcv_1_7_2/
pip install -r requirements/optional.txt
MMCV_WITH_OPS=1 MMCV_WITH_CUDA=1 pip install -v -e . --no-build-isolation
pip install mmdet==2.26.0 mmsegmentation==0.29.1 mmengine==0.9.0 motmetrics==1.4.0 casadi==3.6.0
cd ../mmdetection3d_1_0_0rc6/
pip install scipy==1.10.1 scikit-image==0.19.3 fsspec
pip install -v -e . --no-build-isolation
cd ../..
pip install -e ".[train]" --no-build-isolation --no-cache-dir
pip install "numpy<2"
```

Before evaluating the model, I had to set the `num-workers` value in `scripts/eval_drivevla.sh` to `0`; otherwise the dataloader failed with a multiprocessing spawn error.
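
If you would rather script that edit than change the file by hand, something like the following could work. This is a hedged sketch: it assumes the script passes the value as `--num-workers <N>` or `--num-workers=<N>`, which I have not verified against the actual script, and `force_zero_workers` is a hypothetical helper name.

```python
import re


def force_zero_workers(script_text):
    """Replace the value of `--num-workers <N>` / `--num-workers=<N>` with 0.

    Assumption: the flag is spelled `--num-workers` in the script text.
    Text without that flag is returned unchanged.
    """
    pattern = re.compile(r"(--num-workers[ =])\d+")
    # \g<1> keeps the flag and its separator; the trailing 0 is the new value.
    return pattern.sub(r"\g<1>0", script_text)


# Example on an in-memory string (not the real script contents):
example = "python eval.py --num-workers 8 --batch-size 1"
print(force_zero_workers(example))
```

To apply it to the real file, read `scripts/eval_drivevla.sh` as text, pass the contents through the function, and write the result back after checking the diff.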

Then I could evaluate the model with:
`bash scripts/eval_drivevla.sh checkpoints/DriveVLA-Qwen2.5-0.5B-Instruct/ 1`

This evaluation took about 5 hours on one A40 GPU and produced the following results on the `v1.0-trainval` split (the numbers are close to those in the paper):
```
-------------------------------------------------------------------
Processed total 6019 samples
gt collision: 36

-------------------------------------------------------------------
UniAD evaluation:
Method                  L2 (m)                 Collision (%)       
                1s    2s    3s    Avg.     1s    2s    3s    Avg. 
DriveAgent      0.21  0.60  1.23  0.68     0.00  0.12  0.58  0.23 

-------------------------------------------------------------------
STP-3 evaluation:
Method                  L2 (m)                 Collision (%)       
                1s    2s    3s    Avg.     1s    2s    3s    Avg. 
DriveAgent      0.15  0.32  0.57  0.35     0.01  0.05  0.19  0.09 
```
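
As a small sanity check on the table above, the `Avg.` column can be recomputed from the three horizons. The sketch below assumes `Avg.` is the plain mean of the 1s, 2s, and 3s values, which matches the printed numbers but which I have not confirmed in the evaluation code. The values are copied from the output above, and the function names are illustrative only.

```python
# (1s, 2s, 3s, reported Avg.) copied from the evaluation output above.
RESULTS = {
    "UniAD L2 (m)": (0.21, 0.60, 1.23, 0.68),
    "UniAD Collision (%)": (0.00, 0.12, 0.58, 0.23),
    "STP-3 L2 (m)": (0.15, 0.32, 0.57, 0.35),
    "STP-3 Collision (%)": (0.01, 0.05, 0.19, 0.09),
}


def horizon_mean(values):
    """Arithmetic mean of the per-horizon values."""
    return sum(values) / len(values)


def check_averages(results, tolerance=0.01):
    """Return {name: (recomputed_mean, reported_avg, within_tolerance)}.

    The tolerance allows for the printed values being rounded to two decimals.
    """
    report = {}
    for name, (h1, h2, h3, reported) in results.items():
        mean = horizon_mean((h1, h2, h3))
        report[name] = (mean, reported, abs(mean - reported) <= tolerance)
    return report


if __name__ == "__main__":
    for name, (mean, reported, ok) in check_averages(RESULTS).items():
        print(f"{name:22s} recomputed={mean:.4f} reported={reported:.2f} consistent={ok}")
```

All four rows agree to within 0.01. The STP-3 collision average recomputes to roughly 0.083 against the reported 0.09, presumably because the printed per-horizon values are themselves already rounded.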

# Evaluating on Other Splits

I believe #27 was closed without explicitly answering the OP's question. Could you explain how to evaluate the model on other NuScenes splits, such as `mini` and `test`? Thank you for your time!
