What happened + What you expected to happen
AutoARIMAProphet is presented as faster than Prophet, with higher accuracy (LINK).
In my experiment, however, it is neither faster nor more accurate than the native Prophet library.
Versions / Dependencies
Python 3.10.11
statsforecast 1.5.0
prophet 1.1.4
Reproduction script
Data
Data from [Kaggle]
Filename: train.csv
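import pandas as pd

family = 'GROCERY I'
store_nbr = 44
# Load the Kaggle training data and drop the unused id column
df = pd.read_csv("data_input/train.csv").drop('id', axis=1)
df['date'] = pd.to_datetime(df['date'])
# Keep a single series: one product family in one store
df_train = df.query('(family == @family) & (store_nbr == @store_nbr)').reset_index(drop=True)
# Split at the cutoff date: earlier rows for training, later rows for testing
cutoff = '2016-01-01'
df_test = df_train.query('date >= @cutoff').reset_index(drop=True)
df_train = df_train.query('date < @cutoff').reset_index(drop=True)
Experiment with AutoARIMAProphet: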
import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsforecast.adapters.prophet import AutoARIMAProphet
def process_df_train(df):
    df = df.rename({'sales': "y", 'date': 'ds'}, axis=1)
    return df.sort_values('ds', ascending=True)
start = time.time()
aaprophet = AutoARIMAProphet()
aaprophet.fit(process_df_train(df_train), disable_seasonal_features=False)
print("Train:", time.time() - start)
aap_pred = aaprophet.make_future_dataframe(periods=593)
aap_pred = aaprophet.predict(aap_pred)
print("Pred:", time.time() - start)
Output:
Train: 8.129928588867188 # Time for training
Pred: 8.15820026397705 # Cumulative time since start (training + prediction)
FA score
df_plot = (aap_pred
    .rename({'ds': 'date'}, axis=1)
    .merge(df_train[['date', 'sales']].rename({'sales': 'sales_train'}, axis=1), on='date', how='outer')
    .merge(df_test[['date', 'sales']].rename({'sales': 'sales_test'}, axis=1), on='date', how='outer')
)
def compute_fa(pred):
    return 1 - np.abs(np.sum(pred['yhat'] - pred['sales_test'])) / np.sum(pred['sales_test'])
compute_fa(df_plot)
Output:
0.8975427859368323
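One thing worth noting about the FA score above: it takes the absolute value of the summed error, so over- and under-forecasts cancel each other out. The sketch below illustrates this on made-up toy numbers (not the Kaggle data). `compute_wape_accuracy` is a hypothetical helper added only for comparison; it sums absolute errors per row instead. The row with a missing `sales_test` mimics the training-period rows produced by the outer merge, which pandas skips when summing.

```python
import numpy as np
import pandas as pd


def compute_fa(pred: pd.DataFrame) -> float:
    """Aggregate forecast accuracy, same formula as in this issue."""
    total_error = np.abs(np.sum(pred["yhat"] - pred["sales_test"]))
    return 1 - total_error / np.sum(pred["sales_test"])


def compute_wape_accuracy(pred: pd.DataFrame) -> float:
    """Hypothetical alternative: 1 - sum(|error|) / sum(actual)."""
    abs_error = np.sum(np.abs(pred["yhat"] - pred["sales_test"]))
    return 1 - abs_error / np.sum(pred["sales_test"])


# Toy data (assumed values): errors of -50, +50 and 0 cancel in the sum.
# The last row has no test value, like a training-period row after the merge.
toy = pd.DataFrame({
    "yhat": [50.0, 150.0, 100.0, 80.0],
    "sales_test": [100.0, 100.0, 100.0, np.nan],
})

fa = compute_fa(toy)                    # errors cancel, so FA is 1.0
wape_acc = compute_wape_accuracy(toy)   # 1 - 100/300, roughly 0.667
print(fa, wape_acc)
```

If the two scores differ a lot on the real forecasts, the FA values may say more about overall bias than about day-by-day accuracy.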
Experiment with native Prophet:
from prophet import Prophet
def process_df_train(df):
    df = df.rename({'sales': "y", 'date': 'ds'}, axis=1)
    return df.sort_values('ds', ascending=True)
start = time.time()
aaprophet = Prophet()
aaprophet.fit(process_df_train(df_train))
print("Train:", time.time() - start)
aap_pred = aaprophet.make_future_dataframe(periods=593)
aap_pred = aaprophet.predict(aap_pred)
print("Pred:", time.time() - start)
Output:
Train: 0.15532231330871582
Pred: 0.7572097778320312
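In both scripts the `Pred:` value is measured from the same `start` as `Train:`, so it includes the training time. A minimal sketch of timing the two phases separately is shown here. `timed` and `DummyModel` are hypothetical names used only so the example runs without prophet or statsforecast installed; the same wrapper could be applied to the real `fit` and `predict` calls.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


class DummyModel:
    """Hypothetical stand-in for a forecasting model (not a real library class)."""

    def fit(self, data):
        time.sleep(0.02)  # pretend training work
        return self

    def predict(self, horizon):
        time.sleep(0.01)  # pretend prediction work
        return [0.0] * horizon


model = DummyModel()
_, train_seconds = timed(model.fit, [1.0, 2.0, 3.0])
forecast, predict_seconds = timed(model.predict, 5)
print(f"Train: {train_seconds:.3f}s, Pred: {predict_seconds:.3f}s")
```

Each phase gets its own clock, so the prediction figure no longer depends on how long training took.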
FA score:
df_plot = (aap_pred
    .rename({'ds': 'date'}, axis=1)
    .merge(df_train[['date', 'sales']].rename({'sales': 'sales_train'}, axis=1), on='date', how='outer')
    .merge(df_test[['date', 'sales']].rename({'sales': 'sales_test'}, axis=1), on='date', how='outer')
)
1 - np.abs(np.sum(df_plot['yhat'] - df_plot['sales_test'])) / np.sum(df_plot['sales_test'])
Output:
0.9119373623344371
Issue Severity
High: It blocks me from completing my task.