Towards Graph-Based Privacy-Preserving Federated Learning: ModelNet -- A ResNet-based Model Classification Dataset
Aarhus University, Denmark
⭐ If you find "ModelNet" helpful to your research, don't forget to give this repo a star. Thanks! 🤗
- ✅ 2025-05-31: Release the first version of the paper on arXiv.
- ✅ 2025-06-09: Release the code and results of ModelNet generation.
- ✅ 2025-06-11: Upload the image variants of ModelNet.
- [To Do] 2025-06-11: Upload the parameter variants of ModelNet.
Abstract: Federated Learning (FL) has emerged as a powerful paradigm for training machine learning models across distributed data sources while preserving data locality. However, the privacy of local data remains a central concern and has received considerable attention in recent FL research. Moreover, the lack of domain heterogeneity and client-specific segregation in existing benchmarks remains a critical bottleneck for rigorous evaluation. In this paper, we introduce ModelNet, a novel image classification dataset constructed from the embeddings extracted from a pre-trained ResNet50 model. First, we modify the CIFAR100 dataset into three client-specific variants, one for each of three domain heterogeneity settings (homogeneous, heterogeneous, and random). Subsequently, we train a pre-trained ResNet50 model on each client-specific subset of all three variants and save the resulting model parameters. In addition to the multi-domain image data, we propose a new hypothesis for defining an FL algorithm that accesses only anonymized model parameters, with the aim of preserving local privacy more effectively than existing approaches. ModelNet is designed to simulate realistic FL settings by building non-IID data distributions and client diversity into its design, for both conventional and future graph-driven FL algorithms. The three variants are ModelNet-S, ModelNet-D, and ModelNet-R, which are based on homogeneous, heterogeneous, and random data settings, respectively. To the best of our knowledge, we are the first to propose a cross-environment client-specific FL dataset along with a graph-based variant. Extensive experiments on domain shifts and aggregation strategies show the effectiveness of these variants, making ModelNet a practical benchmark for classical and graph-based FL research. The dataset and related code are available online.
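To make the three data settings concrete, here is a minimal sketch of how class labels could be assigned to clients under homogeneous, heterogeneous, and random settings. It is an illustration only, not the generation code in this repo: the function names, the `group_of` mapping (contiguous groups of 5 labels standing in for domains, which is not the real CIFAR-100 superclass map), and the interpretation of each setting are all assumptions.

```python
import random

NUM_CLASSES = 100  # number of fine labels in CIFAR-100
GROUP_SIZE = 5     # assumption: 20 hypothetical "domains" of 5 labels each


def group_of(label: int) -> int:
    """Hypothetical domain id of a class (not the real CIFAR-100 superclass map)."""
    return label // GROUP_SIZE


def assign_classes(mode: str, num_clients: int, classes_per_client: int, seed: int = 0):
    """Return one sorted list of class labels per client for the given setting.

    Assumed meanings: "homogeneous" draws all of a client's classes from one
    group, "heterogeneous" draws each class from a different group, and
    "random" samples classes uniformly without regard to groups.
    """
    rng = random.Random(seed)
    num_groups = NUM_CLASSES // GROUP_SIZE
    clients = []
    for _ in range(num_clients):
        if mode == "homogeneous":
            # Requires classes_per_client <= GROUP_SIZE.
            g = rng.randrange(num_groups)
            pool = [c for c in range(NUM_CLASSES) if group_of(c) == g]
            chosen = rng.sample(pool, classes_per_client)
        elif mode == "heterogeneous":
            # Requires classes_per_client <= num_groups.
            groups = rng.sample(range(num_groups), classes_per_client)
            chosen = [g * GROUP_SIZE + rng.randrange(GROUP_SIZE) for g in groups]
        elif mode == "random":
            chosen = rng.sample(range(NUM_CLASSES), classes_per_client)
        else:
            raise ValueError(f"unknown mode: {mode}")
        clients.append(sorted(chosen))
    return clients


if __name__ == "__main__":
    for mode in ("homogeneous", "heterogeneous", "random"):
        print(mode, assign_classes(mode, num_clients=3, classes_per_client=5))
```

Under these assumptions, the three modes would loosely correspond to ModelNet-S, ModelNet-D, and ModelNet-R; the actual subsets should be taken from the released dataset rather than regenerated this way.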
- Subset Diversity
Fig. Subset diversity of ModelNet-R, ModelNet-D, and ModelNet-S.
- Class Occurrence Histogram
Fig. Class occurrence histogram of ModelNet-R, ModelNet-D, and ModelNet-S.
- t-SNE Plot
Fig. t-SNE plot of ModelNet-R, ModelNet-D, and ModelNet-S.
- Jaccard Similarity
Fig. Jaccard similarity of ModelNet-R, ModelNet-D, and ModelNet-S.
- Subset Redundancy
Fig. Subset redundancy of the ModelNet datasets.
- Intra-subset Variance
Fig. Intra-subset variance of the ModelNet datasets.
- Feature Space Coverage
Fig. Feature space coverage of the ModelNet datasets.
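As a rough illustration of two of the statistics listed above, the sketch below computes pairwise Jaccard similarity between client class sets and a class occurrence count. It assumes both are defined over the set of class labels held by each client; the figures in this repo may be computed differently, and the function names and toy client lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two label collections."""
    a, b = set(a), set(b)
    union = a | b
    # Convention chosen here: two empty sets count as identical.
    return len(a & b) / len(union) if union else 1.0


def pairwise_jaccard(client_classes):
    """Map each client pair (i, j), i < j, to the Jaccard similarity of their class sets."""
    pairs = combinations(range(len(client_classes)), 2)
    return {(i, j): jaccard(client_classes[i], client_classes[j]) for i, j in pairs}


def class_occurrence(client_classes):
    """Count in how many client subsets each class label appears."""
    return Counter(c for classes in client_classes for c in set(classes))


# Toy client subsets (made-up labels, not taken from ModelNet).
clients = [[0, 1, 2], [1, 2, 3], [7, 8, 9]]
sims = pairwise_jaccard(clients)
occ = class_occurrence(clients)
print(sims)
print(occ)
```

With the toy input, clients 0 and 1 share two of four distinct labels, so their similarity is 0.5, while client 2 shares nothing with the others.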
Download the image variants:
Download the parameter variants:
@article{ray2025towards,
title={Towards Graph-Based Privacy-Preserving Federated Learning: ModelNet-A ResNet-based Model Classification Dataset},
author={Ray, Abhisek and Esterle, Lukas},
journal={arXiv preprint arXiv:2506.00476},
year={2025}
}
This project is licensed under the MIT License and was originally developed by @abhisek-ray.
If you have any questions, please email rayabhisek0610@gmail.com to discuss with the authors.

