This Airflow provider lets you interact with the Kaggle API through the corresponding CLI tool. It is implemented in an abstract way to maximize backward and forward compatibility. As a result, you can use it to run any command supported by kaggle-api.
Pre-requisites: An environment running apache-airflow >= 2.0
pip install airflow-provider-kaggle
In order to use this provider, you have to create a Kaggle connection:
Conn ID: kaggle_default
Conn Type: Kaggle
Extra: <kaggle.json>
NOTE: More details on how to get your Kaggle credentials are available here
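As a rough illustration of the connection fields above, the sketch below builds the value of an Airflow connection environment variable from the two entries found in kaggle.json (username and key). It relies on Airflow's JSON format for AIRFLOW_CONN_* variables, which requires Airflow 2.3 or later. The connection type string "kaggle" and the way the provider reads the Extra field are assumptions, not confirmed by this README, and build_kaggle_conn_env is a hypothetical helper.

```python
import json
import os
from typing import Tuple


def build_kaggle_conn_env(username: str, key: str,
                          conn_id: str = "kaggle_default") -> Tuple[str, str]:
    """Return (environment variable name, JSON value) for the connection."""
    # Airflow looks up connections in variables named AIRFLOW_CONN_<CONN_ID>,
    # with the connection ID in upper case.
    env_name = f"AIRFLOW_CONN_{conn_id.upper()}"
    payload = {
        # Assumption: the provider registers its connection type as "kaggle".
        "conn_type": "kaggle",
        # Extra carries the contents of kaggle.json (username and API key).
        "extra": {"username": username, "key": key},
    }
    return env_name, json.dumps(payload)


if __name__ == "__main__":
    # Placeholder credentials; replace them with the values from your kaggle.json.
    name, value = build_kaggle_conn_env("my-user", "my-api-key")
    os.environ[name] = value
    print(name)
```

Creating the connection through the Airflow UI with the same three fields should be equivalent; the environment variable is only convenient for containerized or CI setups.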
This is the main operator that can be used to execute any kaggle cli command:
from kaggle_provider.operators.kaggle import KaggleOperator
list_competitions_op = KaggleOperator(
    task_id='foo',
    command='competitions_list',
    op_kwargs={'sort_by': 'prize'},
)
This is the Kaggle hook, which is used by the operator and can also be used directly in your own custom operators.
from kaggle_provider.hooks.kaggle import KaggleHook
hook = KaggleHook()
hook.run('datasets_list', sort_by="votes", user="sp1thas")
Supported commands:
competitions_list
competition_submit
competition_submissions
competition_list_files
competition_download_file
competition_download_files
competition_leaderboard_download
competition_leaderboard_view
dataset_list
dataset_metadata_prep
dataset_metadata_update
dataset_metadata
dataset_list_files
dataset_status
dataset_download_file
dataset_download_files
dataset_create_version
dataset_initialize
dataset_create_new
download_file
kernels_list
kernels_initialize
kernels_push
kernels_pull
kernels_output
kernels_status
model_get
model_list
model_initialize
model_create_new
model_delete
model_update
model_instance_get
model_instance_initialize
model_instance_create
model_instance_delete
model_instance_update
model_instance_version_create
model_instance_version_download
model_instance_version_delete
download_needed
Details regarding the command arguments can be found in the corresponding method docstring of this module
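To make the "any command by name" idea more concrete, here is a minimal sketch of how a command string plus keyword arguments could be dispatched to a method of an API client, and how that method's docstring could be looked up. This is not the provider's actual implementation: FakeKaggleApi, run_command and describe_command are hypothetical names used for illustration only, and the real Kaggle client is replaced by a stand-in so the example runs without credentials.

```python
import inspect


class FakeKaggleApi:
    """Hypothetical stand-in for the Kaggle API client; not the real class."""

    def competitions_list(self, sort_by=None):
        """List competitions, optionally sorted by `sort_by`."""
        return {"command": "competitions_list", "sort_by": sort_by}


def run_command(api, command, **kwargs):
    """Look up `command` on `api` by name and call it with `kwargs`."""
    method = getattr(api, command, None)
    # Reject unknown names and private attributes instead of calling them.
    if command.startswith("_") or not callable(method):
        raise ValueError(f"Unsupported command: {command!r}")
    return method(**kwargs)


def describe_command(api, command):
    """Return the signature and docstring of a command for quick reference."""
    method = getattr(api, command)
    return f"{command}{inspect.signature(method)}\n{inspect.getdoc(method)}"


if __name__ == "__main__":
    api = FakeKaggleApi()
    print(run_command(api, "competitions_list", sort_by="prize"))
    print(describe_command(api, "competitions_list"))
```

Because the lookup happens at call time, a dispatcher of this kind would pick up methods added to the client in newer releases without code changes, which fits the compatibility goal stated at the top of this README.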