The early stages of the process use a separate notebook for each country; the resulting data sets are then pulled into a single joint notebook to run the models. The five types of notebooks submitted are listed below, along with how many versions of each to expect.
- Scraping: one notebook for each country to obtain the text data.
- Generating dictionaries: two notebooks, one for English and one for Spanish.
- Sentiment analysis: one notebook for each country to run the text data through the three approaches for obtaining sentiment values.
- Cleaning economic data: one notebook for each country to add features to the economic data.
- Modelling: a single notebook in which the data for all countries is joined, a baseline model is trained on the economic data, and an augmented model is trained on both the economic and text data.
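
As a rough illustration of the modelling notebook's first step, the sketch below stacks per-country rows into one table and builds two feature sets: economic columns only for the baseline model, and economic plus text columns for the augmented model. The column names (`gdp_growth`, `inflation`, `sentiment`, `target`), the country labels, and the helper functions are all hypothetical and the values are toy placeholders; the actual notebooks may structure this differently.

```python
# Hypothetical column names; the real notebooks may use different ones.
ECONOMIC_COLS = ["gdp_growth", "inflation"]
TEXT_COLS = ["sentiment"]
TARGET_COL = "target"


def join_countries(per_country):
    """Stack per-country row lists into one list, tagging each row with its country."""
    joined = []
    for country, rows in per_country.items():
        for row in rows:
            joined.append({"country": country, **row})
    return joined


def feature_matrix(rows, columns):
    """Return (X, y): X holds the selected feature columns, y the target."""
    X = [[row[c] for c in columns] for row in rows]
    y = [row[TARGET_COL] for row in rows]
    return X, y


# Toy placeholder values, not real data.
per_country = {
    "country_a": [{"gdp_growth": 1.0, "inflation": 2.0, "sentiment": 0.1, "target": 0}],
    "country_b": [{"gdp_growth": 0.5, "inflation": 3.0, "sentiment": -0.2, "target": 1}],
}

rows = join_countries(per_country)

# Baseline model input: economic features only.
X_base, y = feature_matrix(rows, ECONOMIC_COLS)

# Augmented model input: economic features plus text-derived features.
X_aug, _ = feature_matrix(rows, ECONOMIC_COLS + TEXT_COLS)
```

Both feature sets share the same rows and target, so any difference in performance between the two models can be attributed to the added text features rather than to a different sample.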