Looks for new data in https://github.com/m-nolan/doge-scrape/tree/main/data and uploads any new data to a Big Local News (BLN) project. This is a way of archiving the data from Michael Nolan's doge-scrape.
This version has been completely rebuilt by @paigemoody for production use. The original notebook, mangled by @stucka, is available at https://github.com/biglocalnews/sync-doge-scrape/blob/master/notebooks/sync-doge-scrape.ipynb
- Python 3.8+
- Valid BLN API token with account access to both BLN DOGE claim archive projects (test and prod)
- Slack credentials for alerts
git clone git@github.com:biglocalnews/sync-doge-scrape.git
cd sync-doge-scrape
Copy the example files and fill in your credentials:
cp .env.test.example .env.test
cp .env.prod.example .env.prod
- Fill in values for all variables in the `.env` files
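The exact variable names are defined in the `.env.*.example` files in the repo; the fragment below is only an illustrative sketch of what a filled-in `.env.test` might look like (all names and values here are assumptions, not the real configuration):

```
# Hypothetical example values -- copy the real keys from .env.test.example
BLN_API_TOKEN=your-bln-api-token
BLN_PROJECT_ID=your-test-project-id
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/XXX/YYY/ZZZ
```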
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Run locally with `test` as the argument. Running with `prod` will publish updates to the real alert channels for production use.
python run.py test

The script:

- Loads environment-specific variables from `.env.test` or `.env.prod`.
- Fetches the list of files in the source GitHub repo, doge-scrape (with last-modified timestamps).
- Fetches the list of current files in the target BLN project.
- Compares the two to determine which GitHub files are new or updated since last run.
- Downloads and uploads new files to the BLN project.
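The comparison step in the list above can be sketched as a small pure function. This is an illustrative sketch only, not the actual `run.py` implementation; the function name, dict shapes, and file names are all assumptions:

```python
# Sketch of the "compare the two lists" step: given last-modified
# timestamps for files in the GitHub repo and in the BLN project,
# return the names of files that are new or updated since last run.
from datetime import datetime, timezone


def files_to_sync(github_files: dict, bln_files: dict) -> list:
    """Return GitHub file names that are missing from BLN or newer there."""
    to_sync = []
    for name, modified in github_files.items():
        if name not in bln_files or modified > bln_files[name]:
            to_sync.append(name)
    return sorted(to_sync)


# Hypothetical example data (file names invented for illustration):
github = {
    "doge-contracts.csv": datetime(2025, 2, 1, tzinfo=timezone.utc),
    "doge-grants.csv": datetime(2025, 2, 1, tzinfo=timezone.utc),
}
bln = {
    "doge-contracts.csv": datetime(2025, 1, 31, tzinfo=timezone.utc),
}

# One file is newer on GitHub, the other is missing from BLN,
# so both would be downloaded and re-uploaded.
print(files_to_sync(github, bln))
```

Everything returned by a function like this would then be downloaded from GitHub and uploaded to the BLN project.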
(Scrappy method for now.) If you want to test the script, go into the test BLN project and delete the most recent files before running the command above.