
swapUniba/datared-green-recsys


Comparing Data Reduction Strategies for Energy-efficient and Green Recommender Systems

This is the repository for the paper "Comparing Data Reduction Strategies for Energy-efficient and Green Recommender Systems".

This code tracks the emissions of a given recommendation model on a given dataset. It runs the model either with its default parameters or with hyperparameter tuning via grid search, and it saves the metrics and the parameter configuration obtained in each run.

Recommendation models, datasets, and metrics refer to the @Recbole implementation.

Emission tracking is done by means of the @CodeCarbon library.

Dataset

We considered two state-of-the-art datasets, MovieLens-1M and AmazonBooks.

We applied the following data reduction strategies:

  • considered the k newest user ratings
  • considered all the ratings after a certain date
  • stratified random user sampling
  • stratified random item sampling

Moreover, each of these strategies has been applied with different values. More details can be found in our paper.

All datasets (full and reduced) can be found in our data folder. The src folder also contains the script we used to perform the data reduction (src/split.py).
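As an illustration of the first two strategies listed above, here is a minimal sketch in plain Python. It is not the code in src/split.py: the record layout (user, item, rating, timestamp), the sample data, and the function names are assumptions made for this example, and treating "after a certain date" as strictly later than the cutoff is also an assumption.

```python
from collections import defaultdict

# Hypothetical rating records: (user_id, item_id, rating, timestamp).
ratings = [
    ("u1", "i1", 5, 100),
    ("u1", "i2", 3, 200),
    ("u1", "i3", 4, 300),
    ("u2", "i1", 2, 150),
    ("u2", "i4", 5, 250),
]


def k_newest_per_user(rows, k):
    """Keep, for each user, the k ratings with the largest timestamps."""
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)
    kept = []
    for user_rows in by_user.values():
        # Newest first, then truncate to at most k ratings.
        user_rows.sort(key=lambda r: r[3], reverse=True)
        kept.extend(user_rows[:k])
    return kept


def ratings_after(rows, cutoff):
    """Keep only ratings whose timestamp is strictly later than the cutoff."""
    return [row for row in rows if row[3] > cutoff]


print(k_newest_per_user(ratings, k=2))
print(ratings_after(ratings, cutoff=200))
```

Users with fewer than k ratings keep all of them, which is one possible convention; the paper may handle that case differently.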

Running the code

To run our code, use Python 3.8 or later. We suggest creating a new virtual environment, activating it, and then installing the libraries listed in the requirements.txt file, as follows:

virtualenv -p python3.8 env
source env/bin/activate
pip install -r requirements.txt

The core script of our work is src/default_tracker.py, which tracks the emissions of a given recommendation model, run with default, statically defined parameters, on a given dataset (both passed as script arguments).

The results are saved in the results_shared folder and consist of three files:

  • emissions.csv: output of CodeCarbon (consumption related data)
  • metrics.csv: output of the RecBole evaluation
  • params.csv: parameters used to train the recommendation model.
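To make the workflow above concrete, the following is a rough sketch of how a tracked run could be wired together. It is not the contents of src/default_tracker.py. It assumes CodeCarbon's `EmissionsTracker` and RecBole's `run_recbole` quick-start helper; the function names `track_run` and `save_row` and the placeholder `params` dictionary are hypothetical. RecBole must also be able to locate the dataset files (for example through its `data_path` setting), which this sketch leaves at the library default.

```python
import csv
import os


def save_row(path, row):
    """Append one dict as a CSV row, writing the header if the file is new."""
    is_new = not os.path.exists(path)
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def track_run(model, dataset, output_dir="results_shared"):
    """Train and evaluate one model while CodeCarbon measures emissions."""
    # Imported lazily so the CSV helper above works without these packages.
    from codecarbon import EmissionsTracker
    from recbole.quick_start import run_recbole

    os.makedirs(output_dir, exist_ok=True)
    # Placeholder parameter record (assumption): the real script may log more.
    params = {"model": model, "dataset": dataset}

    tracker = EmissionsTracker(output_dir=output_dir, output_file="emissions.csv")
    tracker.start()
    try:
        result = run_recbole(model=model, dataset=dataset)
    finally:
        emissions_kg = tracker.stop()  # kg of CO2-equivalent

    save_row(os.path.join(output_dir, "metrics.csv"), dict(result["test_result"]))
    save_row(os.path.join(output_dir, "params.csv"), params)
    return emissions_kg
```

Stopping the tracker in a `finally` block means CodeCarbon still writes emissions.csv if training fails partway through.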

Parameter names are case-insensitive, while parameter values are case-sensitive.
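One way such behaviour could be obtained with the standard argparse module is sketched below. This is only an illustration; the actual argument handling in src/default_tracker.py may differ, and the helper name `parse_cli` is hypothetical.

```python
import argparse


def parse_cli(argv):
    """Parse --dataset/--model, lowercasing option names but not their values."""
    normalized = []
    for token in argv:
        if token.startswith("--"):
            # Split "--Name=Value" and lowercase only the name part.
            name, sep, value = token.partition("=")
            token = name.lower() + sep + value
        normalized.append(token)

    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--model", required=True)
    return parser.parse_args(normalized)


args = parse_cli(["--DATASET=movielens_1m", "--Model=LightGCN"])
print(args.dataset, args.model)
```

Because values are left untouched, `--model=lightgcn` and `--model=LightGCN` would be passed on as two different model names.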

Examples

$ python3 src/default_tracker.py --dataset=movielens_1m --model=LightGCN

This command trains LightGCN on the full version of MovieLens-1M.

$ python3 src/default_tracker.py --dataset=movielens_1m_train_200_newest_ratings_each_user --model=BPR

This command trains BPR on a reduced version of MovieLens-1M, obtained by keeping only the latest 200 ratings provided by each user.

Plot graphs

The graphs folder contains all the graphs we produced. To reproduce them (or to change values or sizes), refer to the produce_grapg.ipynb Python notebook.

Experimental settings

Experiments were conducted with the following resources:

  • GPU: 1 x NVIDIA NC4as T4 v3.

The selected datasets and models are as follows:

  • Datasets: AmazonBooks, MovieLens-1M.
  • Models: DMF, LINE, BPR, CFKG, CKE, KGCN, KGNNLS, MultiDAE, LightGCN, DGCF, NGCF.
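To run several of these combinations in a row, the command lines can be generated programmatically. The sketch below only builds and prints the commands. It uses the model names listed above and the two dataset names that appear in the Examples section; the helper name `build_commands` is hypothetical, and the names of the other dataset folders are not assumed here.

```python
import itertools

MODELS = ["DMF", "LINE", "BPR", "CFKG", "CKE", "KGCN", "KGNNLS",
          "MultiDAE", "LightGCN", "DGCF", "NGCF"]
# Only the dataset names shown in the Examples section are used here.
DATASETS = ["movielens_1m", "movielens_1m_train_200_newest_ratings_each_user"]


def build_commands(datasets, models):
    """Return one argument list per (dataset, model) pair."""
    return [
        ["python3", "src/default_tracker.py", f"--dataset={d}", f"--model={m}"]
        for d, m in itertools.product(datasets, models)
    ]


for command in build_commands(DATASETS, MODELS):
    print(" ".join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` to execute the runs sequentially.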

Acknowledgments

This work has been carried out by the Computer Science Bachelor Degree student Michele Matteucci, University of Bari Aldo Moro.
