A deep learning-based approach for multi-modal 3D object detection using LiDAR and RGB images.
This project implements an advanced 3D object detection framework leveraging depth completion techniques. It is based on the methodology outlined in the master's thesis:
"Depth Completion for 3D Object Detection from Sparse Point Cloud and Color Image" by Rahul Selvaraj
The system improves object detection by densifying sparse LiDAR point clouds using RGB images, a self-supervised learning approach, and multi-modal fusion techniques.
- LiDAR & RGB Fusion: Combines depth information from LiDAR with visual semantics from RGB images.
- Depth Completion: Uses Simple Linear Iterative Clustering (SLIC) superpixel segmentation of the RGB image together with deep learning models to fill in missing depth values.
- Self-Supervised Learning: Trains without requiring fully labeled ground-truth depth maps.
- Improved 3D Object Detection: Uses denser depth maps to enhance detection accuracy.
- Multi-Modal Input Handling: Supports LiDAR, RGB, and depth maps.
- Benchmarking Against State-of-the-Art: Compared with leading models on the KITTI dataset.
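
To make the idea of superpixel-guided depth filling concrete, here is a minimal NumPy sketch. It is an illustration only and not the thesis method: it assumes a superpixel label map is already available (for example from a SLIC implementation) and simply assigns each empty pixel the mean LiDAR depth of its superpixel. The function name `fill_depth_by_segment` and the toy arrays are hypothetical.

```python
import numpy as np

def fill_depth_by_segment(sparse_depth, labels):
    """Fill empty pixels with the mean valid depth of their superpixel.

    sparse_depth: (H, W) float array, 0 where no LiDAR return exists.
    labels:       (H, W) int array of superpixel ids in [0, K).
    Hypothetical helper for illustration; not part of this repository.
    """
    flat_depth = sparse_depth.ravel()
    flat_labels = labels.ravel()
    valid = flat_depth > 0
    n_segments = int(flat_labels.max()) + 1

    # Per-segment sum and count of the valid depth samples.
    depth_sum = np.bincount(flat_labels[valid], weights=flat_depth[valid],
                            minlength=n_segments)
    count = np.bincount(flat_labels[valid], minlength=n_segments)

    # Segments without any LiDAR return keep a mean of 0 (unknown).
    seg_mean = np.divide(depth_sum, count,
                         out=np.zeros(n_segments), where=count > 0)

    # Keep measured depths; fill the rest from the segment mean.
    filled = np.where(valid, flat_depth, seg_mean[flat_labels])
    return filled.reshape(sparse_depth.shape)

# Toy 3x3 example: three superpixels, three LiDAR returns.
labels = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [2, 2, 2]])
sparse = np.array([[10.0, 0.0, 0.0],
                   [12.0, 0.0, 30.0],
                   [0.0, 0.0, 0.0]])
filled = fill_depth_by_segment(sparse, labels)
print(filled)
```

Segment 2 contains no LiDAR return, so it stays at 0. A learned model is still needed for regions like this, which is where the deep learning part of the pipeline comes in.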
DepthCompletion-3DObjectDetection/
│── src/ # Core source code
│ ├── basic.py
│ ├── CoordConv.py
│ ├── criteria.py
│ ├── Cropping.py
│ ├── generate_lidar_from_depth.py
│ ├── helper.py
│ ├── inverse_warp.py
│ ├── metrics.py
│ ├── model.py
│ ├── RGB_to_SLIC.py
│ ├── vis_utils.py
│── dataloaders/ # Data loading and processing
│ ├── calib_cam_to_cam.txt # Camera calibration parameters
│ ├── kitti_loader.py # KITTI dataset loader
│ ├── transforms.py # Data transformations
│── scripts/ # Scripts to run the project
│ ├── main.py # Entry point of the project
│── data/ # Placeholder for dataset files (if needed)
│── notebooks/ # Jupyter Notebooks for visualization & analysis
│── docs/ # Documentation (research papers, architecture)
│── tests/ # Unit tests (to ensure code correctness)
│── README.md # Project documentation
│── .gitignore # Ignore unnecessary files
│── LICENSE # License for usage
This project follows the depth completion pipeline:
- Preprocessing
  - Convert sparse LiDAR points into depth maps.
  - Apply SLIC segmentation on RGB images.
  - Align LiDAR depth with segmented RGB regions.
- Depth Completion
  - Use self-supervised learning to train a model for predicting dense depth maps.
  - Incorporate warping techniques to refine depth accuracy.
- 3D Object Detection
  - Convert dense depth maps back to point clouds.
  - Feed the enhanced point clouds into state-of-the-art 3D object detection networks.
  - Benchmark detection performance against KITTI.
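
The two geometric conversions in this pipeline (points to depth map, and depth map back to points) can be sketched with a plain pinhole camera model. This is a simplified illustration, not the code in `generate_lidar_from_depth.py`: it assumes the points are already expressed in the camera frame (the LiDAR-to-camera transform from the KITTI calibration files is omitted), that the intrinsic matrix `K` has no skew, and the function names and toy values are hypothetical.

```python
import numpy as np

def points_to_depth_map(points_cam, K, height, width):
    """Project (N, 3) camera-frame points into a sparse (H, W) depth map."""
    pts = points_cam[points_cam[:, 2] > 0]        # drop points behind the camera
    uvw = pts @ K.T                                # pinhole projection
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], pts[inside, 2]

    # When several points land on one pixel, keep the nearest one.
    depth_map = np.full((height, width), np.inf)
    np.minimum.at(depth_map, (v, u), z)
    depth_map[np.isinf(depth_map)] = 0.0           # 0 marks "no measurement"
    return depth_map

def depth_map_to_points(depth_map, K):
    """Back-project every pixel with depth > 0 to a 3D camera-frame point."""
    v, u = np.nonzero(depth_map > 0)
    z = depth_map[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)

# Toy intrinsics and points (illustrative values, not KITTI calibration).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 10.0],    # lands on pixel (u=32, v=24)
                   [0.0, 0.0, 30.0],    # same pixel, farther away: discarded
                   [1.0, 0.4, 20.0],    # lands on pixel (u=37, v=26)
                   [0.0, 0.0, -5.0],    # behind the camera: discarded
                   [10.0, 0.0, 10.0]])  # outside the image: discarded
depth = points_to_depth_map(points, K, height=48, width=64)
recovered = depth_map_to_points(depth, K)
print(np.count_nonzero(depth), recovered)
```

Applying `depth_map_to_points` to a predicted dense depth map instead of the sparse one yields the denser point cloud that is passed on to the 3D detector.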
git clone git@github.com:Laihu08/DepthCompletion-3DObjectDetection.git
cd DepthCompletion-3DObjectDetection

python -m venv venv
source venv/bin/activate # For macOS/Linux
venv\Scripts\activate # For Windows

pip install -r requirements.txt

To process depth completion and run object detection:

python scripts/main.py

- (a) RGB Image → The raw input image from the camera.
- (b) Sparse Depth Map → LiDAR depth values, usually incomplete.
- (c) Predicted Dense Depth Map → Model-generated dense depth from (b).
- (d) Ground Truth Dense Depth Map → The actual full depth map (for evaluation).
- Top row → RGB images with predicted 3D bounding boxes.
- Bottom row → Corresponding LiDAR point cloud with bounding boxes.
| Method | RMSE (mm) ↓ | MAE (mm) ↓ | IRMSE (1/km) ↓ | IMAE (1/km) ↓ |
|---|---|---|---|---|
| SparseConvs | 1601.33 | 481.27 | 4.94 | 1.78 |
| Self-supervised | 954.36 | 288.64 | 3.21 | 1.35 |
| CSPN | 1019.64 | 279.46 | 2.93 | 1.15 |
| STD | 814.73 | 249.95 | 2.80 | 1.21 |
| Spade-RGBsD | 917.64 | 279.46 | 2.93 | 1.15 |
| DFuseNet | 1353.65 | 446.64 | 3.78 | 1.80 |
| NLSPN | 761.68 | 299.59 | 2.79 | 1.04 |
| PENet | 730.08 | 210.55 | 2.17 | 0.94 |
| Ours | 728.79 | 204.32 | 2.13 | 0.96 |
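
For readers unfamiliar with the four columns, the following is a minimal sketch of how these error metrics are commonly computed for depth completion. It is not the code in `metrics.py`; it assumes depths are given in meters, that pixels without ground truth are stored as 0, and that predictions are positive wherever ground truth exists. The function name `depth_metrics` and the toy arrays are hypothetical.

```python
import numpy as np

def depth_metrics(pred_m, gt_m):
    """Return RMSE/MAE in mm and iRMSE/iMAE in 1/km for depths in meters."""
    valid = gt_m > 0                        # score only pixels with ground truth
    pred, gt = pred_m[valid], gt_m[valid]

    err_mm = (pred - gt) * 1000.0           # meters -> millimeters
    inv_err = 1000.0 / pred - 1000.0 / gt   # inverse depth in 1/km

    return {
        "rmse_mm": float(np.sqrt(np.mean(err_mm ** 2))),
        "mae_mm": float(np.mean(np.abs(err_mm))),
        "irmse_1_per_km": float(np.sqrt(np.mean(inv_err ** 2))),
        "imae_1_per_km": float(np.mean(np.abs(inv_err))),
    }

# Toy 2x2 example: the top-right pixel has no ground truth and is ignored.
gt = np.array([[10.0, 0.0],
               [20.0, 40.0]])
pred = np.array([[11.0, 5.0],
                 [18.0, 40.0]])
metrics = depth_metrics(pred, gt)
print(metrics)
```

RMSE and MAE weight an error of one meter equally at any range, while the inverse-depth variants (iRMSE, iMAE) penalize errors on nearby surfaces more heavily. Lower is better for all four.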
If you use this work in your research, please cite:
@mastersthesis{selvaraj2021depthcompletion,
title={Depth Completion for 3D Object Detection from Sparse Point Cloud and Color Image},
author={Rahul Selvaraj},
year={2021},
school={National Chung Cheng University}
}

This project is licensed under the MIT License – see the LICENSE file for details.

Rahul Selvaraj
🚀 Senior Software Engineer, R&D at Karma Medical Products
🔥 If you like this project, give it a star! ⭐

