Depth Completion for 3D Object Detection


A deep learning-based approach for multi-modal 3D object detection using LiDAR and RGB images.

📌 Project Overview

This project implements an advanced 3D object detection framework leveraging depth completion techniques. It is based on the methodology outlined in the master's thesis:

"Depth Completion for 3D Object Detection from Sparse Point Cloud and Color Image" by Rahul Selvaraj

The system improves detection accuracy by densifying sparse LiDAR point clouds with guidance from RGB images, combining a self-supervised learning approach with multi-modal fusion techniques.


🚀 Features

  • LiDAR & RGB Fusion: Combines depth information from LiDAR with visual semantics from RGB images.
  • Depth Completion: Uses Simple Linear Iterative Clustering (SLIC) and deep learning models to fill in missing depth values.
  • Self-Supervised Learning: Trains without requiring fully labeled ground-truth depth maps.
  • Improved 3D Object Detection: Uses denser depth maps to enhance detection accuracy.
  • Multi-Modal Input Handling: Supports LiDAR, RGB, and depth maps.
  • Benchmarking Against State-of-the-Art: Compared with leading models on the KITTI dataset.
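The SLIC superpixels mentioned above can be generated with scikit-image's `slic` function. The snippet below is a minimal illustrative sketch (using a synthetic image, not KITTI data, and scikit-image's public API rather than this repository's `RGB_to_SLIC.py`):

```python
# Illustrative sketch: SLIC superpixel segmentation of an RGB image,
# used in this pipeline to group pixels before propagating sparse
# LiDAR depth within each segment. Synthetic image for demonstration.
import numpy as np
from skimage.segmentation import slic

rgb = np.random.rand(64, 64, 3)  # stand-in for a KITTI RGB frame
labels = slic(rgb, n_segments=50, compactness=10, start_label=0)

print(labels.shape)  # one superpixel label per pixel, e.g. (64, 64)
```

Each superpixel then acts as a region over which sparse LiDAR depth values can be interpolated, since pixels within a segment tend to share similar depth.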

📂 Repository Structure

DepthCompletion-3DObjectDetection/
│── src/                        # Core source code
│   ├── basic.py
│   ├── CoordConv.py
│   ├── criteria.py
│   ├── Cropping.py
│   ├── generate_lidar_from_depth.py
│   ├── helper.py
│   ├── inverse_warp.py
│   ├── metrics.py
│   ├── model.py
│   ├── RGB_to_SLIC.py
│   ├── vis_utils.py
│── dataloaders/                 # Data loading and processing
│   ├── calib_cam_to_cam.txt      # Camera calibration parameters
│   ├── kitti_loader.py           # KITTI dataset loader
│   ├── transforms.py             # Data transformations
│── scripts/                      # Scripts to run the project
│   ├── main.py                    # Entry point of the project
│── data/                          # Placeholder for dataset files (if needed)
│── notebooks/                     # Jupyter Notebooks for visualization & analysis
│── docs/                          # Documentation (research papers, architecture)
│── tests/                         # Unit tests (to ensure code correctness)
│── README.md                      # Project documentation
│── .gitignore                      # Ignore unnecessary files
│── LICENSE                         # License for usage

📖 Methodology

This project follows the depth completion pipeline:

  1. Preprocessing
    • Convert sparse LiDAR points into depth maps.
    • Apply SLIC segmentation on RGB images.
    • Align LiDAR depth with segmented RGB regions.
  2. Depth Completion
    • Use self-supervised learning to train a model for predicting dense depth maps.
    • Incorporate warping techniques to refine depth accuracy.
  3. 3D Object Detection
    • Convert dense depth maps back to point clouds.
    • Feed the enhanced point clouds into state-of-the-art 3D object detection networks.
    • Benchmark detection performance against KITTI.
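The preprocessing in step 1 can be sketched as a pinhole projection of LiDAR points into the image plane. This is a simplified illustration with a toy 3x4 projection matrix `P`, not the actual KITTI calibration (see `dataloaders/calib_cam_to_cam.txt` for the real parameters):

```python
# Minimal sketch of step 1: project LiDAR points (camera coordinates)
# into the image plane to form a sparse depth map.
import numpy as np

def lidar_to_depth_map(points, P, h, w):
    """points: (N, 3) XYZ in camera coordinates; P: (3, 4) projection matrix."""
    depth = np.zeros((h, w), dtype=np.float32)
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # homogeneous
    proj = pts_h @ P.T                       # (N, 3) image-plane coordinates
    z = proj[:, 2]
    valid = z > 0                            # keep points in front of the camera
    u = (proj[valid, 0] / z[valid]).astype(int)
    v = (proj[valid, 1] / z[valid]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth[v[inside], u[inside]] = z[valid][inside]
    return depth

# Toy calibration: focal length 100 px, principal point at (32, 32)
P = np.array([[100., 0., 32., 0.],
              [0., 100., 32., 0.],
              [0., 0., 1., 0.]])
pt = np.array([[0.0, 0.0, 5.0]])             # one point 5 m ahead on the axis
d = lidar_to_depth_map(pt, P, 64, 64)
print(d[32, 32])                             # 5.0
```

The resulting map is mostly zeros (no LiDAR return), which is exactly the sparsity the depth-completion model in step 2 is trained to fill.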

💻 Installation & Setup

1️⃣ Clone the Repository

git clone git@github.com:Laihu08/DepthCompletion-3DObjectDetection.git
cd DepthCompletion-3DObjectDetection

2️⃣ Create & Activate a Virtual Environment

python -m venv venv
source venv/bin/activate  # For macOS/Linux
venv\Scripts\activate     # For Windows

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Run the Project

To process depth completion and run object detection:

python scripts/main.py

🔬 Example Results

Depth Completion Results

  • (a) RGB Image → The raw input image from the camera.
  • (b) Sparse Depth Map → LiDAR depth values, usually incomplete.
  • (c) Predicted Dense Depth Map → Model-generated dense depth from (b).
  • (d) Ground Truth Dense Depth Map → The actual full depth map (for evaluation).


3D Object Detection Results

  • Top row → RGB images with predicted 3D bounding boxes.
  • Bottom row → Corresponding LiDAR point cloud with bounding boxes.



📊 Performance Benchmarks

| Method          | RMSE (mm) ↓ | MAE (mm) ↓ | iRMSE (1/km) ↓ | iMAE (1/km) ↓ |
|-----------------|-------------|------------|----------------|---------------|
| SparseConvs     | 1601.33     | 481.27     | 4.94           | 1.78          |
| Self-supervised | 954.36      | 288.64     | 3.21           | 1.35          |
| CSPN            | 1019.64     | 279.46     | 2.93           | 1.15          |
| STD             | 814.73      | 249.95     | 2.80           | 1.21          |
| Spade-RGBsD     | 917.64      | 279.46     | 2.93           | 1.15          |
| DFuseNet        | 1353.65     | 446.64     | 3.78           | 1.80          |
| NLSPN           | 761.68      | 299.59     | 2.79           | 1.04          |
| PENet           | 730.08      | 210.55     | 2.17           | 0.94          |
| Ours            | 728.79      | 204.32     | 2.13           | 0.96          |
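The four metrics in the table follow the KITTI depth-completion convention: errors in millimetres, inverse-depth errors in 1/km, evaluated only on pixels with valid ground truth. The sketch below is an illustrative re-implementation of those formulas, not the repository's `metrics.py`:

```python
# Illustrative computation of RMSE, MAE, iRMSE, and iMAE for depth maps.
# Depth values in millimetres; inverse depth converted to 1/km.
import numpy as np

def depth_metrics(pred_mm, gt_mm):
    mask = gt_mm > 0                        # evaluate only where GT exists
    p, g = pred_mm[mask], gt_mm[mask]
    rmse = np.sqrt(np.mean((p - g) ** 2))   # mm
    mae = np.mean(np.abs(p - g))            # mm
    # inverse depth in 1/km: 1 km = 1e6 mm, so 1/d_mm -> 1e6/d_mm per km
    ip, ig = 1e6 / p, 1e6 / g
    irmse = np.sqrt(np.mean((ip - ig) ** 2))
    imae = np.mean(np.abs(ip - ig))
    return rmse, mae, irmse, imae

gt = np.array([[1000.0, 2000.0], [0.0, 4000.0]])    # 0 marks missing GT
pred = np.array([[1100.0, 1900.0], [500.0, 4100.0]])
print(depth_metrics(pred, gt))
```

Note that the inverse metrics weight near-range errors more heavily, which is why a method can lead on iRMSE while trailing slightly on iMAE, as PENet and Ours do above.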

📜 Citation

If you use this work in your research, please cite:

@mastersthesis{selvaraj2021depthcompletion,
  title={Depth Completion for 3D Object Detection from Sparse Point Cloud and Color Image},
  author={Selvaraj, Rahul},
  year={2021},
  school={National Chung Cheng University}
}

📄 License

This project is licensed under the MIT License – see the LICENSE file for details.


🎯 Author

Rahul Selvaraj
🚀 Senior Software Engineer, R&D at Karma Medical Products
🔗 GitHub | LinkedIn


🔥 If you like this project, give it a star!
