Chen Zhang1,* | Wencheng Han2,* | Yang Zhou1 | Jianbing Shen2,† | Cheng-zhong Xu2 | Wentao Liu1,†
1 SenseTime Research and Tetras.AI, 2 SKL-IOTSC, CIS, University of Macau
* Equal Contribution. † Corresponding Authors.
- We propose a new architecture for RAW video de-rendering. It can efficiently de-render a RAW video sequence using only a single RAW frame and the sRGB video as input, which significantly improves both the storage and computation efficiency of RAW video capture.
- We propose a new benchmark to comprehensively evaluate methods for RAW video de-rendering. To our knowledge, this is the first benchmark specifically designed for this task.
The framework consists of two main stages (a schematic sketch follows this list):
- Temporal Affinity Prior Extraction: This stage generates a reference RAW image by leveraging motion information between adjacent frames.
- Spatial Feature Fusion and Mapping: Using the reference RAW as the initial state, a pixel-level mapping function is learned to refine inaccurately predicted pixels from the first stage. This process incorporates guidance from the sRGB image and preceding frames.
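To make the two-stage design concrete, here is a minimal PyTorch-style sketch. The function and module names (`warp_by_flow`, `SpatialFusionMapping`), the channel counts, the tiny conv network, and the backward-flow convention are all illustrative assumptions, not the released implementation:

```python
# Schematic sketch of the two-stage de-rendering pipeline (illustrative only;
# names, shapes, and the flow convention are assumptions, not the released code).
import torch
import torch.nn as nn
import torch.nn.functional as F


def warp_by_flow(raw_prev, flow):
    """Stage 1 (temporal affinity prior): warp the stored previous RAW frame to
    the current time step using a backward optical flow of shape (B, 2, H, W)."""
    b, _, h, w = raw_prev.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(raw_prev.device)   # (2, H, W)
    coords = grid.unsqueeze(0) + flow                                 # (B, 2, H, W)
    # normalize sampling coordinates to [-1, 1] for grid_sample
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid_n = torch.stack((coords_x, coords_y), dim=-1)                # (B, H, W, 2)
    return F.grid_sample(raw_prev, grid_n, align_corners=True)


class SpatialFusionMapping(nn.Module):
    """Stage 2: refine the warped reference RAW with guidance from the current
    sRGB frame (a tiny conv net standing in for the learned mapping function)."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, reference_raw, srgb):
        residual = self.net(torch.cat([reference_raw, srgb], dim=1))
        return reference_raw + residual


if __name__ == "__main__":
    raw_prev = torch.rand(1, 3, 256, 256)   # previously stored RAW frame (placeholder channels)
    srgb_cur = torch.rand(1, 3, 256, 256)   # current sRGB frame
    flow = torch.zeros(1, 2, 256, 256)      # precomputed optical flow
    reference_raw = warp_by_flow(raw_prev, flow)                 # stage 1
    raw_pred = SpatialFusionMapping()(reference_raw, srgb_cur)   # stage 2
    print(raw_pred.shape)
```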
- Dataset Release
- Model Release
- Code Release
Our model does not rely on any hard-to-configure packages; you only need PyTorch and a few lightweight dependencies (such as numpy and opencv-python). You can set up the environment with the following steps:
# git clone this repository
git clone https://github.com/zhangchen98/RAW_CVPR24.git
cd RAW_CVPR24
# create an environment
conda create -n videoRaw python=3.8
conda activate videoRaw
pip install -r requirements.txt
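After installation, you can run a quick sanity check for the dependencies mentioned above (torch, numpy, opencv-python):

```python
# Quick sanity check for the core dependencies.
import torch
import numpy as np
import cv2

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("numpy:", np.__version__)
print("opencv:", cv2.__version__)
```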
You can download our RVD dataset from here (code: dk5h). Then put the dataset into the folder ./RVD.
🚩 If you have trouble unzipping, you can use the following command:
sudo apt update
sudo apt install p7zip-full
7z x RVD.zip

The folder structure is as follows:
RVD
├── Part1
│ ├── test
│ │ ├── data.json
│ │ ├── DNG
│ │ ├── flow
│ │ ├── RAW
│ │ ├── sRGB
│ │ └── tags.json
│ └── train
│ ├── data.json
│ ├── DNG
│ ├── flow
│ ├── RAW
│ └── sRGB
└── Part2
├── test
│ ├── data.json
│ ├── flow
│ ├── RAW
│ └── sRGB
└── train
├── data.json
├── flow
├── RAW
└── sRGB
For both subsets, we provide optical flow computed with the unimatch method. In addition, we also provide the original '.DNG' files for RVD-Part1.
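If you want to inspect a split before training, the sketch below assumes only the folder layout shown above; the internal structure of `data.json` and the file naming inside the subfolders are not specified here, so it simply enumerates what is present:

```python
# Minimal sketch for inspecting one RVD split; assumes only the folder layout above.
import json
from pathlib import Path

split = Path("./RVD/Part1/train")

with open(split / "data.json") as f:
    meta = json.load(f)  # exact structure depends on the release
print("data.json loaded:", type(meta).__name__)

for sub in ("sRGB", "RAW", "flow", "DNG"):  # DNG is provided for Part1 only
    folder = split / sub
    if folder.is_dir():
        n_files = sum(f.is_file() for f in folder.rglob("*"))
        print(f"{sub}: {n_files} files")
```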
Since each camera's ISP pipeline is specific, we train the de-rendering model on each sub-dataset separately.
Train the model on the RVD-Part1 dataset:
python3 -u main.py \
--trainset_root='./RVD/Part1/train' \
--testset_root='./RVD/Part1/test' \
--input_size="900,1600" \
--save_dir='./checkpoints/RVD_Part1' \
--batch_size=2 \
--test_freq=20 \
--patch_size=256 \
--load_from='' \
--port=12355 \
--max_epoch=60 \
--num_worker=8 \
--init_lr=0.002 \
--lr_decay_epoch=20 \
--aux_loss_weight=0.5 \
--ssim_loss_weight=1.0 \
--local

Train the model on the RVD-Part2 dataset:
python3 -u main.py \
--trainset_root='./RVD/Part2/train' \
--testset_root='./RVD/Part2/test' \
--input_size="640,1440" \
--save_dir='./checkpoints/RVD_Part2' \
--batch_size=2 \
--test_freq=20 \
--patch_size=256 \
--load_from='' \
--port=12347 \
--max_epoch=60 \
--num_worker=8 \
--init_lr=0.002 \
--lr_decay_epoch=20 \
--aux_loss_weight=0.5 \
--ssim_loss_weight=1.0 \
--local

You can also modify the startup scripts in the 'scripts' folder to use multi-GPU training.
- Download the pretrained models (RVD_Part1.pth, RVD_Part2.pth) from BaiduYun (code: axh6).
- Put the pretrained models in the './pretrain' folder.
- Run the test script:
# test on RVD-Part1
python3 -u main.py \
--trainset_root='./RVD/Part1/train' \
--testset_root='./RVD/Part1/test' \
--input_size="900,1600" \
--save_dir='./checkpoints/RVD_Part1' \
--batch_size=8 \
--test_freq=20 \
--patch_size=256 \
--load_from='./pretrain/RVD_Part1.pth' \
--port=12355 \
--max_epoch=60 \
--num_worker=8 \
--init_lr=0.002 \
--lr_decay_epoch=20 \
--aux_loss_weight=0.5 \
--ssim_loss_weight=1.0 \
--local \
--test_only \
# --save_predict_raw # add this option to save the predicted raw images

# test on RVD-Part2
python3 -u main.py \
--trainset_root='./RVD/Part2/train' \
--testset_root='./RVD/Part2/test' \
--input_size="640,1440" \
--save_dir='./checkpoints/RVD_Part2' \
--batch_size=8 \
--test_freq=20 \
--patch_size=256 \
--load_from='./pretrain/RVD_Part2.pth' \
--port=12347 \
--max_epoch=60 \
--num_worker=8 \
--init_lr=0.002 \
--lr_decay_epoch=20 \
--aux_loss_weight=0.5 \
--ssim_loss_weight=1.0 \
--local \
--test_only \
# --save_predict_raw # add this option to save the predicted raw images

You can find the testing results in the ./checkpoints/RVD_Part1 and ./checkpoints/RVD_Part2 directories.
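If you enable `--save_predict_raw`, you can compare the saved predictions against the ground-truth RAW frames offline. The sketch below is only an illustration: the serialization format of the saved predictions is not documented here, so loading is left to you, and both arrays are assumed to share the same shape and a [0, 1] value range:

```python
# Illustrative PSNR helper for predicted vs. ground-truth RAW frames.
# (Loading of the saved predictions is left to the reader; values are
# assumed to be normalized to [0, 1].)
import numpy as np

def raw_psnr(pred: np.ndarray, gt: np.ndarray, peak: float = 1.0) -> float:
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# random placeholders standing in for loaded RAW frames
pred = np.random.rand(900, 1600)
gt = np.random.rand(900, 1600)
print(f"PSNR: {raw_psnr(pred, gt):.2f} dB")
```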
Our dataset contains part of the data from the Real-RawVSR Dataset (https://github.com/zmzhang1998/Real-RawVSR); thanks to Yue et al. for their excellent work.
@inproceedings{zhang2024leveraging,
title={Leveraging Frame Affinity for sRGB-to-RAW Video De-rendering},
author={Zhang, Chen and Han, Wencheng and Zhou, Yang and Shen, Jianbing and Xu, Cheng-zhong and Liu, Wentao},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={25659--25668},
year={2024}
}
If you have any questions, please contact: zhangchen2@tetras.ai


