
Quantization-Aware Training for Efficient On-Board Object Detection on FPGAs: Case Studies

This paper proposes a Quantization-Aware Training (QAT) framework enhanced with object-scale-aware regularization to mitigate the accuracy degradation that quantization noise causes in small-object detection. The framework specifically targets resource-constrained FPGA deployments for geoscience and remote sensing applications.

Case studies

Aerial-view building detection

Visualization of the building detection task with true labels (above) and detected building footprints (below). Several small cabins, marked with white rectangles, are newly detected and are absent from the Bavaria Building Dataset (BBD).

Bird detection system for birdstrike prevention at airports

A 3D-printed case is designed to accommodate the Xilinx Kria KV260 FPGA with waterproof sealing, cable management, and air cooling. The board connects to an IP camera for video streaming through the RTSP protocol.

Birds detected at Oberpfaffenhofen Airport, Weßling, Germany.

Usage

Environment

  1. Docker env

    docker run -e UID=$(id -u) -e GID=$(id -g) --name qatdet --gpus device=0 -d -it --shm-size 32G --mount source=$(pwd),target=/workspace,type=bind tumbgd/vai-pt-cuda
    docker exec -it qatdet bash
  2. Inside docker container qatdet

    python -m pip install --user -r requirements.txt
    cd code
    python -m pip install --user -v -e .
    cd ..

    This installs the yolox library. The installation was successful if you see the following output:

    Installed /workspace/code
    Successfully installed yolox
    

    All of the following steps should be executed inside this Docker environment until we obtain a compiled .xmodel. A quick import check is sketched right after this list.
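A minimal sanity check, assuming only that the editable install above succeeded, that yolox is importable inside the qatdet container (not part of the repository's scripts):

# Hypothetical check, run inside the qatdet container:
# confirms that the editable yolox install is visible to Python.
import yolox

print("yolox imported from:", yolox.__file__)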

Dataset preparation

Download the datasets and transform them into COCO format.

Bavaria Building Dataset (BBD)

Download the dataset from here. We use bbd2k5-images-image.tar.bz2 and bbd2k5-images-umring.tar.bz2 in this project. Unzip them and put them into ./bbd/data. The resulting directory tree should look like

.
|-- LICENSE
|-- README.md
`-- bbd
    `-- data
        |-- bbd2k5-images-image
        `-- bbd2k5-images-umring

Then we can generate the COCO-format JSON files by

python ./bbd/Mask2COCO.py

For a detailed dataset description, please check here.
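For orientation, the sketch below shows the general shape of a COCO-format annotation file such a conversion produces. All file names, sizes, coordinates, and the output path are illustrative assumptions; ./bbd/Mask2COCO.py is the authoritative implementation.

# Illustrative sketch of the COCO-format structure for building footprints.
# File names, sizes, and coordinates below are hypothetical placeholders.
import json

coco = {
    "images": [
        # one entry per aerial image tile
        {"id": 1, "file_name": "example_tile.png", "width": 1024, "height": 1024},
    ],
    "annotations": [
        # one entry per building instance; bbox is [x, y, width, height] in pixels
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [120.0, 340.0, 55.0, 48.0],
            "area": 55.0 * 48.0,
            "segmentation": [[120.0, 340.0, 175.0, 340.0, 175.0, 388.0, 120.0, 388.0]],
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "building"}],
}

with open("example_coco_annotations.json", "w") as f:
    json.dump(coco, f)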

Citation:

@inproceedings{10.1145/3589132.3625658,
    author = {Werner, Martin and Li, Hao and Zollner, Johann Maximilian and Teuscher, Balthasar and Deuser, Fabian},
    title = {Bavaria Buildings - A Novel Dataset for Building Footprint Extraction, Instance Segmentation, and Data Quality Estimation},
    year = {2023},
    isbn = {9798400701689},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3589132.3625658},
    doi = {10.1145/3589132.3625658},
    booktitle = {Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems},
    articleno = {108},
    numpages = {4},
    location = {Hamburg, Germany},
    series = {SIGSPATIAL '23}
}

MVA2023: Small Object Detection Challenge for Spotting Birds

You can find the download link in the dataset repository here. We only use the drone2021 (~63 GB) part of this dataset. Put the unzipped files into bird/data. The resulting directory tree should look like

|-- LICENSE
|-- README.md
`-- bird
    |-- data
    |   |-- annotations
    |   |   |
    |   |   `--.json
    |   `-- images
    |       |--1
    |       |   `--.jpg
    |       |--2
    |       `--...
    `-- usable_images_updater.py

The annotations provided with the original dataset are already in COCO format. We use the following script to remove images that contain no birds, or only birds that are too small (< 40 px wide for 4K images).

python bird/Filter.py
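For orientation, the sketch below illustrates this kind of filtering on a COCO annotation file. The input/output paths and the exact keep/drop rule are assumptions for illustration; bird/Filter.py is the authoritative implementation.

# Illustrative sketch: drop images that contain no birds, or only birds whose
# bounding boxes are narrower than 40 px. Paths below are hypothetical.
import json

MIN_WIDTH_PX = 40  # minimum bird bounding-box width kept for 4K images

with open("bird/data/annotations/train_coco.json") as f:
    coco = json.load(f)

# Image ids that have at least one sufficiently large bird box.
keep_ids = {
    ann["image_id"]
    for ann in coco["annotations"]
    if ann["bbox"][2] >= MIN_WIDTH_PX  # bbox = [x, y, width, height]
}

filtered = {
    "images": [img for img in coco["images"] if img["id"] in keep_ids],
    "annotations": [a for a in coco["annotations"] if a["image_id"] in keep_ids],
    "categories": coco["categories"],
}

with open("bird/data/annotations/train_coco_filtered.json", "w") as f:
    json.dump(filtered, f)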

Citation:

@inproceedings{mva2023_sod_challenge,
  title={{MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results}},
  author={Yuki Kondo and Norimichi Ukita and Takayuki Yamaguchi and Hao-Yu Hou and Mu-Yi Shen and Chia-Chi Hsu and En-Ming Huang and Yu-Chen Huang and Yu-Cheng Xia and Chien-Yao Wang and Chun-Yi Lee and Da Huo and Marc A. Kastner and Tingwei Liu and Yasutomo Kawanishi and Takatsugu Hirayama and Takahiro Komamizu and Ichiro Ide and Yosuke Shinya and Xinyao Liu and Guang Liang and Syusuke Yasui},
  booktitle={2023 18th International Conference on Machine Vision and Applications (MVA)},
  note={\url{https://www.mva-org.jp/mva2023/challenge}},
  year={2023}
}

Training

Perform QAT and obtain the .xmodel.

  • For BBD:

    bash code/bbd.sh

    After training, you should find YOLOX_0_int.xmodel at ./YOLOX_outputs/bbd/convert_qat_results.

  • For MVA2023:

    bash code/bird.sh

    After training, you should find YOLOX_0_int.xmodel at ./YOLOX_outputs/bird/convert_qat_results.
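Conceptually, these scripts drive the Vitis AI PyTorch quantization-aware training flow. The sketch below outlines that flow with the pytorch_nndct QatProcessor API, under the assumption that it behaves as in the Vitis AI examples; method names can vary between Vitis AI releases, the model and training loop here are stand-ins, and the paper's object-scale-aware regularization is not shown. Treat code/bbd.sh and code/bird.sh as the authoritative pipeline.

# Simplified, hypothetical sketch of the Vitis AI QAT flow; the real model,
# losses, and export logic live in this repository's training scripts.
import torch
import torch.nn as nn
from pytorch_nndct import QatProcessor  # available inside the vai-pt-cuda container

# Tiny stand-in network; in this repo the float model is the YOLOX detector.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1),
)
dummy_input = torch.randn(1, 3, 640, 640)

# Wrap the float model for 8-bit quantization-aware training.
qat_processor = QatProcessor(model, dummy_input, bitwidth=8)
quantized_model = qat_processor.trainable_model()

# ... run the usual training loop on quantized_model here ...

# Convert the trained QAT model to a deployable graph and export the .xmodel
# that vai_c_xir compiles in the next step.
deployable_model = qat_processor.to_deployable(quantized_model, "qat_results")
qat_processor.export_xmodel("qat_results")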

Compilation for FPGA

The target deployment platform in this project is the AMD (Xilinx) Kria KV260 FPGA. After QAT, we compile the model obtained in the previous step for the KV260.

vai_c_xir -x <PATH_TO/YOUR.xmodel> -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json -o <EXPORT_PATH> -n <NEWNAME>

If you are working with a different kind of board, you need to change the value of the -a option accordingly. You can also export the computation graph of the .xmodel to default.svg by

xdputil xmodel <PATH_TO_COMPILED.xmodel> -s

You can find more information in the official documentation here.

On-board setup

Switch to the README in the onboard directory.

Citation

tbd

Acknowledgement

This repository is built on pt_yolox-nano_3.5 from the Vitis AI Model Zoo. Keep an eye on the COCO path specifications in ./code/yolox/exp/yolox_base.py.

Copyright 2022-2023 Advanced Micro Devices Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

See also YOLOX.

License

Licensed under the MPL-2.0 license (LICENSE or https://opensource.org/license/mpl-2-0).
