3. Flow


Arif Agustyawan edited this page Dec 24, 2023 · 2 revisions

Data Loading and Processing:

The engine starts by loading data from the designated CSV file and sorting it by purchase ID.
It then groups the rows by customer ID and, for each group, extracts the product IDs as a list, giving one sequence of product purchases per customer.
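
The sort-then-group step above can be sketched with the Python standard library alone. This is a minimal illustration, not the engine's actual code: the column names `purchase_id`, `customer_id`, and `product_id`, the sample rows, and the helper `load_sequences` are all assumptions, since this page does not show the CSV layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the real file and its column names are not given here.
SAMPLE_CSV = """purchase_id,customer_id,product_id
3,c1,milk
1,c1,bread
2,c2,eggs
4,c2,milk
5,c1,eggs
"""


def load_sequences(csv_file):
    """Sort rows by purchase ID, then build one product list per customer."""
    rows = list(csv.DictReader(csv_file))
    # Sorting first means each customer's list ends up in purchase order.
    rows.sort(key=lambda row: int(row["purchase_id"]))
    sequences = defaultdict(list)
    for row in rows:
        sequences[row["customer_id"]].append(row["product_id"])
    return dict(sequences)


sequences = load_sequences(io.StringIO(SAMPLE_CSV))
print(sequences)
```

Sorting before grouping is what preserves purchase order inside each customer's list; grouping an unsorted file would give lists in file order instead.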

Data Preprocessing:

The purchase sequences are flattened into a single list of products.
A Tokenizer is initialized and fitted to the flattened data, assigning an integer to each distinct product.
Each sequence of product names is then converted into a sequence of these integer tokens.
Input sequences (X) and target values (y) are separated and padded to ensure uniform length.
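
To make the preprocessing steps concrete, here is a dependency-free sketch. It is a stand-in, not the Tokenizer class the engine uses: the helpers `fit_word_index`, `make_training_pairs`, and `pad_left` are hypothetical names. Two further assumptions are labeled in the code: index 0 is reserved for padding, and each prefix of a sequence is used as an input with the next product as its target. The page does not state how X and y are actually split.

```python
from collections import Counter


def fit_word_index(flat_tokens):
    """Assign integer IDs starting at 1, most frequent first (0 is kept for padding)."""
    counts = Counter(flat_tokens)
    # Ties are broken alphabetically so the mapping is deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: i for i, tok in enumerate(ordered, start=1)}


def make_training_pairs(encoded_sequences):
    """Assumption: every prefix is an input and the item that follows it is the target."""
    X, y = [], []
    for seq in encoded_sequences:
        for end in range(1, len(seq)):
            X.append(seq[:end])
            y.append(seq[end])
    return X, y


def pad_left(seqs, maxlen, value=0):
    """Left-pad (and keep only the last maxlen items) so all rows have equal length."""
    return [[value] * (maxlen - len(s[-maxlen:])) + s[-maxlen:] for s in seqs]


sequences = [["bread", "milk", "eggs"], ["eggs", "milk"]]
flat = [product for seq in sequences for product in seq]  # flatten
word_index = fit_word_index(flat)                         # fit
encoded = [[word_index[p] for p in seq] for seq in sequences]  # tokenize
X, y = make_training_pairs(encoded)                       # split into X and y
X_padded = pad_left(X, maxlen=max(len(x) for x in X))     # pad to uniform length
```

Padding on the left keeps the most recent purchases next to the end of the input, which is the position an LSTM reads last.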

Model Building:

A neural network model is constructed, consisting of an embedding layer, an LSTM layer, and a dense layer.
The model parameters are read from the settings in the config.conf file.
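
As an illustration of how such settings might be read, the sketch below parses an INI-style file with Python's standard `configparser`. The section names (`model`, `training`) and keys (`embedding_dim`, `lstm_units`, `epochs`, `batch_size`) are hypothetical; the actual contents of config.conf are not shown on this page, and the file may use a different format.

```python
import configparser

# Hypothetical contents of config.conf; real section and key names may differ.
EXAMPLE_CONF = """
[model]
embedding_dim = 64
lstm_units = 128

[training]
epochs = 10
batch_size = 32
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONF)

# getint() converts the stored strings to integers and raises if a key is missing.
settings = {
    "embedding_dim": parser.getint("model", "embedding_dim"),
    "lstm_units": parser.getint("model", "lstm_units"),
    "epochs": parser.getint("training", "epochs"),
    "batch_size": parser.getint("training", "batch_size"),
}
print(settings)
```

Under these assumptions, the first two values would size the embedding and LSTM layers, while the last two would be passed to the training step.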

Model Training:

The model is trained with the specified settings, including the number of epochs and the batch size.
Training progress and metrics are logged using Weights & Biases (W&B).

Model Evaluation:

The trained model is evaluated by making predictions on the data.
Metrics such as accuracy, precision, recall, and F1 score are calculated and logged.
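
For reference, the four metrics can be computed from true and predicted token IDs as sketched below. Because next-product prediction is a multi-class problem, precision, recall, and F1 need an averaging rule; macro averaging is used here as an assumption, since the page does not say which rule the engine applies. The function name `macro_metrics` and the sample labels are illustrative only.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all seen labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # A class with no predictions (or no true samples) scores 0 by convention.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


metrics = macro_metrics(y_true=[1, 2, 2, 3], y_pred=[1, 2, 3, 3])
print(metrics)
```

Macro averaging weights every product equally regardless of how often it is bought; a weighted or micro average would let frequent products dominate the score.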

Model Saving:

The trained model and tokenizer are saved for later use.