
3. Flow

Arif Agustyawan edited this page Dec 24, 2023 · 2 revisions

Data Loading and Processing:

  • The engine starts by loading data from the designated CSV file and sorting it by purchase ID.
  • It then groups the rows by customer ID and, for each group, collects the product IDs into a list, giving one purchase sequence per customer.
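
The loading and grouping steps above can be sketched as follows. This is a minimal illustration using only the standard library, not the engine's actual code. The column names (`purchase_id`, `customer_id`, `product_id`), the sample rows, and the `load_sequences` helper are all hypothetical, and purchase IDs are assumed to be numeric.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the real column names are not given on this page.
SAMPLE_CSV = """purchase_id,customer_id,product_id
3,c1,p9
1,c1,p2
2,c2,p5
4,c2,p2
5,c1,p5
"""


def load_sequences(csv_file):
    """Return {customer_id: [product_id, ...]} with products ordered by purchase ID."""
    rows = list(csv.DictReader(csv_file))
    # Sort by purchase ID (assumed numeric) so each sequence follows purchase order.
    rows.sort(key=lambda row: int(row["purchase_id"]))
    sequences = defaultdict(list)
    for row in rows:
        # Group by customer and collect that customer's product IDs in order.
        sequences[row["customer_id"]].append(row["product_id"])
    return dict(sequences)


sequences = load_sequences(io.StringIO(SAMPLE_CSV))
print(sequences)
```

To read a file on disk instead of the in-memory sample, pass an open file handle, for example `load_sequences(open(path, newline=""))`.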

Data Preprocessing:

  • The purchase sequences are flattened into a single list so the tokenizer can be fitted on every product at once.
  • A Tokenizer is initialized and fitted to the flattened data.
  • Product names are converted into sequences of integer tokens.
  • Input sequences (X) and target values (y) are separated and padded to ensure uniform length.
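
To make the preprocessing steps concrete, here is a library-free sketch of what fitting a tokenizer, splitting inputs from targets, and padding amount to. It does not reproduce the engine's actual Tokenizer. The helper names are hypothetical, and two choices are assumptions not stated on this page: each target is taken to be the product that follows a prefix of the sequence, and padding is added on the left with the reserved value 0.

```python
from collections import Counter


def fit_token_index(sequences):
    """Map each product to an integer ID, most frequent first; 0 is reserved for padding."""
    # Flatten all sequences into one stream of products and count occurrences.
    counts = Counter(item for seq in sequences for item in seq)
    # Break frequency ties alphabetically so the mapping is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {item: idx for idx, (item, _) in enumerate(ordered, start=1)}


def make_training_pairs(sequences, token_index):
    """Build (prefix, next product) pairs, then left-pad the prefixes with 0."""
    inputs, targets = [], []
    for seq in sequences:
        tokens = [token_index[item] for item in seq]
        for end in range(1, len(tokens)):
            inputs.append(tokens[:end])   # everything purchased so far
            targets.append(tokens[end])   # the next purchase (assumed target)
    max_len = max((len(x) for x in inputs), default=0)
    padded = [[0] * (max_len - len(x)) + x for x in inputs]
    return padded, targets


sequences = [["p2", "p9", "p5"], ["p5", "p2"]]
token_index = fit_token_index(sequences)
X, y = make_training_pairs(sequences, token_index)
print(token_index)
print(X, y)
```

Every row of `X` has the same length after padding, which is what lets the rows be stacked into a single array for the model.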

Model Building:

  • A neural network model is constructed from an embedding layer, an LSTM layer, and a dense layer.
  • Model parameters are read from the settings in the config.conf file.
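
As an illustration of driving the model from config.conf, the sketch below reads settings with Python's standard `configparser`. It assumes the file uses INI syntax. The section names (`model`, `training`) and keys (`embedding_dim`, `lstm_units`, `epochs`, `batch_size`) are hypothetical, since the real file layout is not shown on this page.

```python
import configparser

# Hypothetical contents; the real section and key names in config.conf may differ.
EXAMPLE_CONFIG = """
[model]
embedding_dim = 64
lstm_units = 128

[training]
epochs = 10
batch_size = 32
"""

parser = configparser.ConfigParser()
# For a file on disk, use parser.read("config.conf") instead.
parser.read_string(EXAMPLE_CONFIG)

# getint converts the stored strings to integers and raises if a key is missing.
model_settings = {
    "embedding_dim": parser.getint("model", "embedding_dim"),
    "lstm_units": parser.getint("model", "lstm_units"),
}
training_settings = {
    "epochs": parser.getint("training", "epochs"),
    "batch_size": parser.getint("training", "batch_size"),
}
print(model_settings, training_settings)
```

Keeping these values in a file rather than in code means layer sizes and training length can be changed without touching the engine itself.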

Model Training:

  • The model is trained with the configured settings, including the number of epochs and the batch size.
  • Training progress and metrics are logged using Weights & Biases (W&B).

Model Evaluation:

  • The trained model is evaluated by making predictions on the data.
  • Metrics such as accuracy, precision, recall, and F1 score are calculated and logged.
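
For reference, the four metrics can be computed from true and predicted labels as sketched below. This is a plain-Python illustration, not the engine's evaluation code. Because next-product prediction is a multi-class problem, precision, recall, and F1 need an averaging scheme. Macro averaging is used here as an assumption, since this page does not say which scheme the engine applies. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Treat undefined ratios (zero denominator) as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n_labels = len(precisions)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_labels,
        "recall": sum(recalls) / n_labels,
        "f1": sum(f1_scores) / n_labels,
    }


metrics = classification_metrics([1, 2, 2, 3], [1, 2, 3, 3])
print(metrics)
```

Macro averaging weights every product equally regardless of how often it is purchased. A weighted average would instead favor frequent products, so the choice affects how the reported numbers should be read.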

Model Saving:

  • The trained model and tokenizer are saved for later use.
