This paper proposes a Quantization-Aware Training (QAT) framework enhanced with object-scale-aware regularization to mitigate accuracy degradation in small-object detection caused by quantization noise, specifically targeting resource-constrained FPGA deployments for geoscience and remote sensing applications.
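As a rough illustration (not the paper's exact formulation), object-scale-aware regularization can be thought of as re-weighting per-object loss terms by object size, so that small objects are not drowned out by quantization noise. A toy PyTorch sketch:

```python
import torch

def scale_aware_weights(boxes, img_size=640.0, alpha=2.0):
    # Toy illustration only: weight each ground-truth box by its relative
    # size so that small objects contribute more to the training objective.
    # `boxes` is an (N, 4) tensor in [x, y, w, h] format; all names and
    # values here are assumptions, not the paper's exact formulation.
    rel_scale = torch.sqrt(boxes[:, 2] * boxes[:, 3]) / img_size  # in (0, 1]
    return (1.0 - rel_scale).clamp(min=0.0) ** alpha

# Usage: scale per-object losses before reduction.
boxes = torch.tensor([[10., 10., 20., 15.], [50., 60., 300., 280.]])
per_object_loss = torch.tensor([0.8, 0.5])
reg_loss = (scale_aware_weights(boxes) * per_object_loss).mean()
```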
Visualization of the building detection task with ground-truth labels (above) and detected building footprints (below). Several small cabins (white rectangles) are newly detected and are absent from the dataset (BBD).
A 3D-printed case is designed to accommodate the Xilinx Kria KV260 FPGA with waterproof sealing, cable management, and air cooling. The board connects to an IP camera for video streaming through the RTSP protocol.
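A client can read such an RTSP stream with standard tools; a minimal OpenCV sketch (the URL is a placeholder for your camera's actual endpoint):

```python
import cv2

# Placeholder URL: substitute your IP camera's actual RTSP endpoint.
cap = cv2.VideoCapture("rtsp://<CAMERA_IP>:554/stream")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # ... feed `frame` to the detector ...
cap.release()
```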
Some birds detected at Oberpfaffenhofen Airport, Weßling, Germany.
- Docker env

  docker run -e UID=$(id -u) -e GID=$(id -g) --name qatdet --gpus device=0 -d -it --shm-size 32G --mount source=$(pwd),target=/workspace,type=bind tumbgd/vai-pt-cuda
  docker exec -it qatdet bash

- Inside the docker container

  python -m pip install --user -r requirements.txt
  cd code
  python -m pip install --user -v -e .
  cd ..
In this way, the yolox library is installed. The installation is successful if you see the following output:

Installed /workspace/code
Successfully installed yolox

All following steps should be executed in this docker environment until we obtain a compiled .xmodel.
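Optionally, you can sanity-check the installation from inside the container:

python -c "import yolox; print(yolox.__file__)"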
Download the datasets and transform them into COCO format.
Download the dataset from here. We use bbd2k5-images-image.tar.bz2 and bbd2k5-images-umring.tar.bz2 in this project. Unzip them and put them into ./bbd/data. The resulting directory tree should look like this:
.
|-- LICENSE
|-- README.md
`-- bbd
    `-- data
        |-- bbd2k5-images-image
        `-- bbd2k5-images-umring
Then we can generate COCO-format JSON files by

python ./bbd/Mask2COCO.py

For a detailed dataset description, please check here.
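For orientation, the generated file follows the standard COCO detection layout; a minimal sketch of assembling such a file (file and category names here are illustrative, and Mask2COCO.py may differ in detail):

```python
import json

# Skeleton of a COCO-format detection file; the actual output of
# Mask2COCO.py may differ in details. Names and values are illustrative.
coco = {
    "images": [
        {"id": 1, "file_name": "tile_0001.png", "width": 500, "height": 500},
    ],
    "annotations": [
        {
            "id": 1, "image_id": 1, "category_id": 1,
            "bbox": [120.0, 80.0, 40.0, 35.0],  # [x, y, width, height]
            "area": 40.0 * 35.0,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "building"}],
}

with open("annotations.json", "w") as f:
    json.dump(coco, f)
```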
Citation:
@inproceedings{10.1145/3589132.3625658,
author = {Werner, Martin and Li, Hao and Zollner, Johann Maximilian and Teuscher, Balthasar and Deuser, Fabian},
title = {Bavaria Buildings - A Novel Dataset for Building Footprint Extraction, Instance Segmentation, and Data Quality Estimation},
year = {2023},
isbn = {9798400701689},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3589132.3625658},
doi = {10.1145/3589132.3625658},
booktitle = {Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems},
articleno = {108},
numpages = {4},
location = {Hamburg, Germany},
series = {SIGSPATIAL '23}
}
You can find the download link in the dataset repository here. We only use the drone2021 (~63 GB) part of this dataset. Put the unzipped files into ./bird/data. The resulting directory tree should look like this:
.
|-- LICENSE
|-- README.md
`-- bird
    |-- data
    |   |-- annotations
    |   |   `-- *.json
    |   `-- images
    |       |-- 1
    |       |   `-- *.jpg
    |       |-- 2
    |       `-- ...
    `-- usable_images_updater.py
The annotations provided in the original dataset are already in COCO format. We remove images that contain no birds, or only birds that are too small (less than 40 px wide for 4K images), using bird/Filter.py (invoked below).
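For reference, the core of this size-based filtering might look like the following sketch (paths and field handling are assumptions; see bird/Filter.py for the actual logic):

```python
import json

MIN_WIDTH = 40  # minimum bbox width in pixels for 4K images

# The annotation path is an assumption; see bird/Filter.py for the real one.
with open("bird/data/annotations/train.json") as f:
    coco = json.load(f)

# Keep only bird annotations whose boxes are wide enough.
anns = [a for a in coco["annotations"] if a["bbox"][2] >= MIN_WIDTH]
kept = {a["image_id"] for a in anns}

# Drop images that no longer contain any usable bird.
coco["annotations"] = anns
coco["images"] = [im for im in coco["images"] if im["id"] in kept]

with open("bird/data/annotations/train_filtered.json", "w") as f:
    json.dump(coco, f)
```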
python bird/Filter.py

Citation:
@inproceedings{mva2023_sod_challenge,
title={{MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results}},
author={Yuki Kondo and Norimichi Ukita and Takayuki Yamaguchi and Hao-Yu Hou and Mu-Yi Shen and Chia-Chi Hsu and En-Ming Huang and Yu-Chen Huang and Yu-Cheng Xia and Chien-Yao Wang and Chun-Yi Lee and Da Huo and Marc A. Kastner and Tingwei Liu and Yasutomo Kawanishi and Takatsugu Hirayama and Takahiro Komamizu and Ichiro Ide and Yosuke Shinya and Xinyao Liu and Guang Liang and Syusuke Yasui},
booktitle={2023 18th International Conference on Machine Vision and Applications (MVA)},
note={\url{https://www.mva-org.jp/mva2023/challenge}},
year={2023}
}
Perform QAT and get .xmodel.
- For BBD:

  bash code/bbd.sh

  After training, you should find YOLOX_0_int.xmodel at ./YOLOX_outputs/bbd/convert_qat_results.

- For MVA2023:

  bash code/bird.sh

  After training, you should find YOLOX_0_int.xmodel at ./YOLOX_outputs/bird/convert_qat_results.
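Under the hood, these scripts follow the Vitis AI PyTorch QAT workflow. A condensed sketch of that flow (a stand-in network replaces the full YOLOX model here, and exact pytorch_nndct arguments may vary across Vitis AI versions):

```python
import torch
import torch.nn as nn
from pytorch_nndct import QatProcessor  # Vitis AI PyTorch QAT API

# Stand-in network; the scripts use the full YOLOX model instead.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(8, 1, 1))
dummy_input = torch.randn(1, 3, 640, 640)  # input resolution is an assumption

# Wrap the model with fake-quantization ops for 8-bit QAT.
qat_processor = QatProcessor(model, dummy_input, bitwidth=8)
quantized_model = qat_processor.trainable_model()

# ... run the usual training loop on `quantized_model` here ...

# Convert to a deployable model, then export the quantized xmodel that
# vai_c_xir consumes in the next step.
qat_processor.to_deployable(quantized_model, "convert_qat_results")
qat_processor.export_xmodel("convert_qat_results")
```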
The target deployment platform in this project is the AMD (Xilinx) Kria KV260 FPGA. After QAT, we compile the model obtained in the previous step for the KV260.
vai_c_xir -x <PATH_TO/YOUR.xmodel> -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json -o <EXPORT_PATH> -n <NEWNAME>

If you are working with another kind of board, you need to change the value of the -a option. Moreover, you can export the computation graph of an xmodel to default.svg by

xdputil xmodel <PATH_TO_COMPILED.xmodel> -s

You can find more information in the official documentation here.
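For example, to compile the BBD model from the previous step and inspect it (the output directory and name below are arbitrary choices):

vai_c_xir -x ./YOLOX_outputs/bbd/convert_qat_results/YOLOX_0_int.xmodel -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json -o ./compiled -n yolox_bbd
xdputil xmodel ./compiled/yolox_bbd.xmodel -s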
Switch to the README in onboard.
tbd
This repository is built on pt_yolox-nano_3.5 from the Vitis AI Model Zoo. Keep an eye on the COCO path specifications in ./code/yolox/exp/yolox_base.py.
Copyright 2022-2023 Advanced Micro Devices Inc.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
See also YOLOX.
Licensed under the MPL-2.0 license (LICENSE or https://opensource.org/license/mpl-2-0).


