AutoARIMAProphet actually slower than native Prophet #630

@vesran

What happened + What you expected to happen

The documentation suggests that AutoARIMAProphet is faster and more accurate than Prophet (LINK).

In my experiment, however, it is neither faster nor more accurate than the native Prophet library.

Versions / Dependencies

Python 3.10.11
statsforecast 1.5.0
prophet 1.1.4

Reproduction script

Data

Data from [Kaggle]
Filename: train.csv

import pandas as pd

family = 'GROCERY I'
store_nbr = 44

# Load the training file and keep a single store/family series
df = pd.read_csv("data_input/train.csv").drop('id', axis=1)
df['date'] = pd.to_datetime(df['date'])
df_train = df.query('(family == @family) & (store_nbr == @store_nbr)').reset_index(drop=True)

# Split at the cutoff: earlier rows for training, later rows for testing
cutoff = '2016-01-01'
df_test = df_train.query('date >= @cutoff').reset_index(drop=True)
df_train = df_train.query('date < @cutoff').reset_index(drop=True)

Experiment with AutoARIMAProphet:

import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from statsforecast.adapters.prophet import AutoARIMAProphet

def process_df_train(df):
    df = df.rename({'sales': "y", 'date': 'ds'}, axis=1)
    return df.sort_values('ds', ascending=True)

start = time.time()
aaprophet = AutoARIMAProphet()
aaprophet.fit(process_df_train(df_train), disable_seasonal_features=False)
print("Train:", time.time() - start)

aap_pred = aaprophet.make_future_dataframe(periods=593)
aap_pred = aaprophet.predict(aap_pred)
print("Pred:", time.time() - start)

Output:

Train: 8.129928588867188  # Training time, in seconds
Pred: 8.15820026397705  # Cumulative time since start (training + prediction), in seconds
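
Because `start` is never reset in the script above, the `Pred:` figure includes the training time. A minimal sketch of timing each phase on its own is shown here. It assumes a hypothetical helper `timed` (not part of statsforecast or Prophet) and uses stand-in workloads in place of the real `fit` and `predict` calls:

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper for illustration only.
    """
    start = time.perf_counter()  # monotonic clock, better suited to benchmarks than time.time()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workloads; in the real script these would be model.fit(...) and model.predict(...)
def fake_fit(n):
    return sum(range(n))


def fake_predict(n):
    return [2 * i for i in range(n)]


_, fit_seconds = timed(fake_fit, 100_000)
pred, predict_seconds = timed(fake_predict, 1_000)

print("Train:", fit_seconds)
print("Pred:", predict_seconds)  # prediction only, training time excluded
```

With the real models, the same pattern would wrap `aaprophet.fit` and `aaprophet.predict` so that the two numbers can be compared independently.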

FA score:

df_plot = (aap_pred
           .rename({'ds': 'date'}, axis=1)
           .merge(df_train[['date', 'sales']].rename({'sales': 'sales_train'}, axis=1), on='date', how='outer')
           .merge(df_test[['date', 'sales']].rename({'sales': 'sales_test'}, axis=1), on='date', how='outer')
          )

def compute_fa(pred):
    return 1 - np.abs(np.sum(pred['yhat'] - pred['sales_test'])) / np.sum(pred['sales_test'])

compute_fa(df_plot)

Output:

0.8975427859368323
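
The `compute_fa` function above sums the signed errors before taking the absolute value, so over- and under-forecasts can cancel out. The sketch below shows that behaviour on made-up numbers and compares it with a variant that sums absolute errors. The second function and the toy data are assumptions added for illustration and are not part of the original experiment:

```python
import numpy as np


def fa_signed(y_true, y_pred):
    """Same formula as compute_fa: 1 - |sum(pred - actual)| / sum(actual)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1 - np.abs(np.sum(y_pred - y_true)) / np.sum(y_true)


def fa_absolute(y_true, y_pred):
    """Illustrative variant: 1 - sum(|pred - actual|) / sum(actual)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1 - np.sum(np.abs(y_pred - y_true)) / np.sum(y_true)


# Made-up data: every forecast is off by 50, alternating above and below
y_true = [100, 100, 100, 100]
y_pred = [150, 50, 150, 50]

signed = fa_signed(y_true, y_pred)      # signed errors sum to zero
absolute = fa_absolute(y_true, y_pred)  # absolute errors sum to 200

print("signed FA:", signed)
print("absolute FA:", absolute)
```

The signed version measures aggregate bias over the test period, so two models with similar scores may still differ in per-day error.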

Experiment with native Prophet:

from prophet import Prophet

def process_df_train(df):
    df = df.rename({'sales': "y", 'date': 'ds'}, axis=1)
    return df.sort_values('ds', ascending=True)

start = time.time()
aaprophet = Prophet()
aaprophet.fit(process_df_train(df_train))
print("Train:", time.time() - start)

aap_pred = aaprophet.make_future_dataframe(periods=593)
aap_pred = aaprophet.predict(aap_pred)
print("Pred:", time.time() - start)

Output:

Train: 0.15532231330871582
Pred: 0.7572097778320312

FA score:

df_plot = (aap_pred
           .rename({'ds': 'date'}, axis=1)
           .merge(df_train[['date', 'sales']].rename({'sales': 'sales_train'}, axis=1), on='date', how='outer')
           .merge(df_test[['date', 'sales']].rename({'sales': 'sales_test'}, axis=1), on='date', how='outer')
          )
1 - np.abs(np.sum(df_plot['yhat'] - df_plot['sales_test'])) / np.sum(df_plot['sales_test'])

Output:

0.9119373623344371

Issue Severity

High: It blocks me from completing my task.
