Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
[paper (NeurIPS 2025)] [paper (arXiv)] [website] [code] [models]
Authors: Zinan Lin, Enshu Liu, Xuefei Ning, Junyi Zhu, Wenyu Wang, Sergey Yekhanin
Correspondence to: Zinan Lin (zinanlin AT microsoft DOT com)
Abstract: Generative modeling, representation learning, and classification are three core problems in machine learning (ML), yet their state-of-the-art (SoTA) solutions remain largely disjoint. In this paper, we ask: Can a unified principle address all three? Such unification could simplify ML pipelines and foster greater synergy across tasks. We introduce Latent Zoning Network (LZN) as a step toward this goal.
At its core, LZN creates a shared Gaussian latent space that encodes information across all tasks. Each data type (e.g., images, text, labels) is equipped with an encoder that maps samples to disjoint latent zones, and a decoder that maps latents back to data. ML tasks are expressed as compositions of these encoders and decoders: for example, label-conditional image generation uses a label encoder and image decoder; image embedding uses an image encoder; classification uses an image encoder and label decoder.
We demonstrate the promise of LZN in three increasingly complex scenarios: (1) LZN can enhance existing models (image generation): When combined with the SoTA Rectified Flow model, LZN improves FID on CIFAR10 from 2.76 to 2.59—without modifying the training objective. (2) LZN can solve tasks independently (representation learning): LZN can implement unsupervised representation learning without auxiliary loss functions, outperforming the seminal MoCo and SimCLR methods by 9.3% and 0.2%, respectively, on downstream linear classification on ImageNet. (3) LZN can solve multiple tasks simultaneously (joint generation and classification): With image and label encoders/decoders, LZN performs both tasks jointly by design, improving FID and achieving SoTA classification accuracy on CIFAR10.
9/21/2025: 🚀 The models and training/inference code for image generation on AFHQ-Cat and image embedding trained on ImageNet have been released! Due to the sensitive nature of the datasets, the remaining models and code are undergoing an internal review process and will be released at a later date. Stay tuned!
9/21/2025: The paper is released here.
We provide a Dockerfile with the necessary dependencies in docker/Dockerfile. Alternatively, you can install PyTorch and the libraries listed in docker/requirements.txt to set up the environment.
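For example, assuming standard Docker and pip usage (the image tag lzn below is only a placeholder), either of the following should set up the environment:

# Option 1: build and use the provided Docker image
docker build -t lzn -f docker/Dockerfile .

# Option 2: install the dependencies directly (install PyTorch first, following the instructions for your CUDA version)
pip install -r docker/requirements.txt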
The training code uses Distributed Data Parallel (DDP) for multi-GPU and multi-node training. In the configuration files, specify the number of GPUs per node via config.distributed.num_gpus_per_node.
For single-node training, this is all you need—the training script will automatically launch processes on each GPU.
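For example, a single-node run on 4 GPUs (the GPU count and config file below are placeholders, and we assume the same --config.&lt;field&gt;=&lt;value&gt; override syntax shown in the evaluation commands later in this README) can be launched with:

# Launch single-node DDP training; one process is spawned per GPU automatically
python train.py --config=configs/lzn1/case_study_1_afhqcat.py --config.distributed.num_gpus_per_node=4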
For multi-node training, the training command must be executed once on each node, and the following environment variables need to be set (see the example after the list):
MASTER_ADDR: The IP address of the master node.
MASTER_PORT: The port number on the master node for inter-node communication.
WORLD_SIZE: The total number of nodes.
NODE_RANK: The rank of the current node, starting from 0 on the master node.
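For example, a hypothetical two-node run (the IP address, port, and config file below are placeholders) would execute the following on the master node:

# Node 0 (master): both nodes must point at the master's address and port
export MASTER_ADDR=10.0.0.1
export MASTER_PORT=29500
export WORLD_SIZE=2
export NODE_RANK=0
python train.py --config=configs/lzn1/case_study_1_afhqcat.py

and the same commands, with NODE_RANK=1, on the second node.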
Download the AFHQ-Cat dataset and put the images in the following folder structure:
📦/tmp/data/AFHQ
┣ 📂train
┃ ┣ 📂cat
┃ ┃ ┣ 📜flickr_cat_000002.jpg
┃ ┃ ┗ ... (more images)
┗ 📂val
┃ ┣ 📂cat
┃ ┃ ┣ 📜flickr_cat_000008.jpg
┃ ┃ ┗ ... (more images)
Note that the root folder /tmp/data/AFHQ can be moved to any location, as long as it matches config.data.params.afhq_root in the configuration file configs/lzn1/case_study_1_afhqcat.py.
python train.py --config=configs/lzn1/case_study_1_afhqcat.py
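If your copy of the dataset lives elsewhere, you can either edit config.data.params.afhq_root in the config file or, assuming the same --config.&lt;field&gt;=&lt;value&gt; override syntax used by the evaluation command below, pass the path on the command line:

# Train with a custom dataset root (the path is a placeholder)
python train.py --config=configs/lzn1/case_study_1_afhqcat.py --config.data.params.afhq_root=/path/to/AFHQ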
The results (checkpoints, logs, metrics, generated images, etc.) will be saved in the folder results/case_study_1_afhqcat. For example, the images generated with the RK45 sampler, in NumPy format, can be found in results/case_study_1_afhqcat/ema-rk45-random_samples_array/ and results/case_study_1_afhqcat/ema-rk45-random_samples_more_array.
python evaluate.py --config=configs/lzn1/case_study_1_afhqcat.py --config.checkpoint.load_checkpoint=manual --config.checkpoint.path="<path to the checkpoint>"
where the checkpoint lzn1/case_study_1_afhqcat/000003000-000060000.pt can be downloaded from here.
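For instance, if the downloaded checkpoint is saved under /tmp/checkpoints (a placeholder location), the evaluation command becomes:

# Evaluate from the downloaded AFHQ-Cat checkpoint
python evaluate.py --config=configs/lzn1/case_study_1_afhqcat.py --config.checkpoint.load_checkpoint=manual --config.checkpoint.path=/tmp/checkpoints/lzn1/case_study_1_afhqcat/000003000-000060000.pt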
Similarly to model training, the results (logs, metrics, generated images, etc.) will be saved in the folder results/case_study_1_afhqcat.
Please download the ImageNet dataset and place ILSVRC2012_devkit_t12.tar.gz, ILSVRC2012_img_train.tar, and ILSVRC2012_img_val.tar in /tmp/data/ImageNet.
Note that the root folder /tmp/data/ImageNet can be moved to any location, as long as it matches config.data.params.root in the configuration file configs/lzn1/case_study_2.py.
python train.py --config=configs/lzn1/case_study_2.py
The results (checkpoints, logs, image representations, linear classification accuracies, etc.) will be saved in the folder results/case_study_2. For example, the image representations of the validation set can be found in results/case_study_2/ema-representation_no_head_representations/.
python evaluate.py --config=configs/lzn1/case_study_2.py --config.checkpoint.load_checkpoint=manual --config.checkpoint.path="<path to the checkpoint>"
where the checkpoint lzn1/case_study_2/000032051-005000000.pt can be downloaded from here.
Similarly to model training, the results (logs, image representations, linear classification accuracies, etc.) will be saved in the folder results/case_study_2.
Please cite the paper if you use this code:
@article{lin2025latent,
title={Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification},
author={Lin, Zinan and Liu, Enshu and Ning, Xuefei and Zhu, Junyi and Wang, Wenyu and Yekhanin, Sergey},
journal={arXiv preprint arXiv:2509.15591},
year={2025}
}
See Microsoft Privacy Statements.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.