X-veon: neural network demosaicing for Bayer and X-Trans sensors.
This project consists of two parts: the neural net itself, with a set of scripts for dataset building and training, and a web application with a full RAW development pipeline.
The demosaicing model is a U-Net (encoder-decoder with skip connections) that takes a 4-channel input — the raw CFA mosaic value plus 3 binary masks indicating which color filter covers each pixel — and outputs a full-color 3-channel RGB image.
The encoder has 4 downsampling stages (64 → 128 → 256 → 512 → 1024 channels at the bottleneck for full-width model), each consisting of two 3×3 convolutions with BatchNorm and ReLU, followed by 2×2 max pooling. The decoder mirrors this with transposed convolutions for upsampling and skip connections from the corresponding encoder stage.
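As a minimal PyTorch sketch of the encoder side described above (class and variable names are my own, not the repo's actual code):

```python
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 conv + BatchNorm + ReLU layers, one block per U-Net stage."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

# Full-width channel progression: stem at 64, bottleneck at 1024
widths = [64, 128, 256, 512, 1024]
stem = DoubleConv(4, widths[0])  # 4-channel input: mosaic + 3 masks
downs = nn.ModuleList(DoubleConv(widths[i], widths[i + 1]) for i in range(4))
pool = nn.MaxPool2d(2)           # 2x2 max pooling between stages
```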
A key design choice is the residual CFA skip: the single-channel mosaic value is broadcast to all 3 output channels as a baseline, and the network only learns the color correction deltas on top of it. This makes the model largely exposure-agnostic — it doesn't need to reproduce absolute brightness, just fill in the missing color information.
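In code, the residual CFA skip amounts to one line in the forward pass; this is an illustrative sketch (the `unet` attribute is assumed), not the project's actual module:

```python
import torch

def forward(self, x: torch.Tensor) -> torch.Tensor:
    mosaic = x[:, :1]     # (N, 1, H, W): the raw CFA values
    delta = self.unet(x)  # (N, 3, H, W): learned color correction deltas
    # Broadcast the mosaic to all 3 channels as a baseline, add deltas on top
    return mosaic.expand(-1, 3, -1, -1) + delta
```

Since the baseline already carries the absolute brightness, the deltas the network has to learn stay small and roughly centered on zero regardless of exposure.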
The architecture is fully CFA-agnostic: the same model currently works for both 6×6 X-Trans and 2×2 Bayer patterns. It should be trivial to add Quad HDR support if necessary.
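For illustration, here is how such a 4-channel input could be packed for an arbitrary pattern (a hypothetical helper, shown with Bayer RGGB; a 6×6 X-Trans index map drops in unchanged):

```python
import numpy as np

# Color indices: 0 = R, 1 = G, 2 = B
BAYER_RGGB = np.array([[0, 1],
                       [1, 2]])

def pack_input(mosaic: np.ndarray, pattern: np.ndarray) -> np.ndarray:
    """mosaic: (H, W) raw CFA values; pattern: (ph, pw) color-index map.
    Returns a (4, H, W) array: mosaic plus 3 binary R/G/B masks."""
    h, w = mosaic.shape
    ph, pw = pattern.shape
    # Tile the pattern across the full image, then crop to size
    cfa = np.tile(pattern, (h // ph + 1, w // pw + 1))[:h, :w]
    masks = np.stack([(cfa == c).astype(np.float32) for c in range(3)])
    return np.concatenate([mosaic[None].astype(np.float32), masks], axis=0)
```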
The network is trained on synthetic input/target pairs generated from real RAW photos. The build process works as follows:
- Ground truth generation: RAW files (RAF, ARW, CR2, etc.) are demosaiced using traditional algorithms — DHT for X-Trans, AHD for Bayer — in linear sensor space with no white balance or color correction applied. The results are downscaled 4× via area averaging to produce clean, alias-free reference images stored as float32 .npy files.
- Synthetic re-mosaicing: During training, patches are randomly cropped from the ground truth and re-mosaiced through the appropriate CFA pattern to create the network's input (see the sketch after this list). This means the model never sees the original noisy RAW data — it learns from a clean demosaic that has been "re-captured" through the CFA.
- Augmentations: Each patch gets random flips, additive Gaussian noise, exposure shifts (pushing toward clipping), white balance perturbation in log space, and optional OLPF (anti-aliasing filter) blur simulation. These help the model generalize across cameras and shooting conditions.
- Torture patterns: A fraction of synthetic gradient and edge patterns can be mixed into the training set to improve performance on worst-case inputs like fine diagonal lines and color fringes near Nyquist.
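A hedged sketch of the re-mosaicing step and two of the augmentations; function names, parameter values, and the bias toward clipping are illustrative, not the project's actual scripts. The `cfa` argument is the tiled (H, W) color-index map from the packing sketch above:

```python
import numpy as np

def remosaic(rgb: np.ndarray, cfa: np.ndarray) -> np.ndarray:
    """Re-capture a clean (3, H, W) demosaic through a CFA: at each pixel,
    keep only the channel that color filter would have seen."""
    h, w = rgb.shape[1:]
    rows, cols = np.indices((h, w))
    return rgb[cfa, rows, cols]  # (H, W) single-channel mosaic

def perturb_white_balance(rgb: np.ndarray, sigma: float = 0.1, rng=None) -> np.ndarray:
    """White balance perturbation in log space: per-channel gains drawn as
    exp(N(0, sigma)), so they are multiplicative and symmetric around 1.0."""
    rng = np.random.default_rng() if rng is None else rng
    gains = np.exp(rng.normal(0.0, sigma, size=3)).astype(rgb.dtype)
    gains[1] = 1.0  # keep green as the reference channel
    return rgb * gains[:, None, None]

def exposure_shift(rgb: np.ndarray, lo: float = -1.0, hi: float = 2.0, rng=None) -> np.ndarray:
    """Random exposure shift in stops; an asymmetric range (hi > |lo|) is one
    way to push samples toward clipping, as the augmentation list mentions."""
    rng = np.random.default_rng() if rng is None else rng
    return rgb * 2.0 ** rng.uniform(lo, hi)
```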
A small, fully offline web application (all processing happens in the browser) was built alongside the model. It runs inference through ONNX Runtime's WebGPU backend, so a decent GPU is required. Processing times on an M1 MacBook Pro are in the tens of seconds at worst.
https://naorunaoru.github.io/x-veon
What it can do:
- open RAW files from different cameras, tested mainly on Fujifilm RAFs and Sony ARWs
- perform neural net or traditional numeric demosaicing for comparison
- preview and save HDR photos
Supported output formats:
- Ultra HDR JPEG: 3-channel gain map, works best
- AVIF: super slow and has incorrect gamma, which can be solved by moving from HLG to PQ
- TIFF: most likely broken and clipping, but it's there
What it can't do yet:
- perform any adjustments apart from the OpenDRT tone mapping preset
- export as DNG
- pass through full EXIF metadata
- do batch operations
Parts of the code were adapted piecemeal from various open-source projects, to name a few:
- darktable (segmentation-based highlight reconstruction, reference image pipeline)
- Jed Smith's OpenDRT and ART CTL by agriggio (tone mapping)