Description
First of all, thanks a lot for this incredible dataset.
However, I found a small flaw in it: the club_involved_name feature contains club names as written in the text of the corresponding Transfermarkt entry, and these names are often inconsistent with the names in club_name. Having the same names in both columns would ease analysis of the data, e.g. it would allow joining the involved club's name with its own league in order to study flows between leagues.
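To illustrate the kind of analysis I have in mind, here is a minimal Python sketch. It assumes club_involved_name already uses the same spelling as club_name. The rows below are made up, and the league_name and transfer_movement fields are assumptions for illustration, not necessarily the dataset's actual schema.

```python
from collections import Counter

# Hypothetical rows; field names and values are illustrative assumptions.
transfers = [
    {"club_name": "Club A", "league_name": "League X",
     "club_involved_name": "Club C", "transfer_movement": "in"},
    {"club_name": "Club C", "league_name": "League Y",
     "club_involved_name": "Club A", "transfer_movement": "out"},
    {"club_name": "Club B", "league_name": "League X",
     "club_involved_name": "Club D", "transfer_movement": "in"},
]


def league_flows(rows):
    """Count rows per (source league, destination league) pair."""
    # Map each club to its league using the club_name column.
    club_to_league = {r["club_name"]: r["league_name"] for r in rows}
    flows = Counter()
    for r in rows:
        # This lookup only works when both columns spell clubs identically.
        other = club_to_league.get(r["club_involved_name"])
        if other is None:
            continue  # involved club does not appear in club_name
        own = r["league_name"]
        if r["transfer_movement"] == "in":
            flows[(other, own)] += 1
        else:
            flows[(own, other)] += 1
    return flows


flows = league_flows(transfers)
```

Note that this counts rows, so a transfer recorded from both clubs' perspectives is counted twice. With the current inconsistent names, most lookups of this kind would simply miss.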
On Transfermarkt, the name to use is the title attribute of the very same <a> HTML tag, so this should be an easy fix. I'd love to help with a pull request, but I had a look at the source code and R is out of my league. In the future, I could propose a Python alternative for scraping the data.
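As a rough idea of what I mean, here is a small Python sketch using only the standard library. The HTML snippet is invented for illustration and is not actual Transfermarkt markup; the class name AnchorTitleParser is also hypothetical.

```python
from html.parser import HTMLParser


class AnchorTitleParser(HTMLParser):
    """Collect (link text, title attribute) pairs for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._in_anchor = False
        self._title = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            # title is None when the attribute is missing.
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self.pairs.append(("".join(self._text).strip(), self._title))
            self._in_anchor = False


# Made-up markup: the visible text is a short name, the title is the full one.
sample_html = '<td><a title="Full Club Name FC" href="#">Club Name</a></td>'

parser = AnchorTitleParser()
parser.feed(sample_html)
parser.close()
pairs = parser.pairs
```

Here the second element of each pair is the value I would expect to match club_name, while the first is what currently ends up in club_involved_name.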
Again, congratulations on such a useful repo.