Commit 05013fc

Merge pull request #21 from eurostat/dev
correction of errors in toc and documentation
2 parents 4ef3627 + 12b974d commit 05013fc

13 files changed, +33 -21 lines changed

DESCRIPTION

File mode changed: 100644 → 100755
Lines changed: 3 additions & 3 deletions
@@ -1,15 +1,15 @@
 Package: restatapi
 Type: Package
 Title: Search and Retrieve Data from Eurostat Database
-Date: 2025-01-27
-Version: 0.24.2
+Date: 2026-01-30
+Version: 0.24.4
 Encoding: UTF-8
 Authors@R: c(person("Mátyás", "Mészáros", email = "matyas.meszaros@ec.europa.eu", role = c("aut", "cre")),
 person("Sebastian", "Weinand", role = "ctb"))
 Description: Eurostat is the statistical office of the European Union and provides high quality statistics for Europe.
 Large set of the data is disseminated through the Eurostat database (<https://ec.europa.eu/eurostat/web/main/data/database>).
 The tools are using the REST API with the Statistical Data and Metadata eXchange (SDMX) Web Services
-(<https://wikis.ec.europa.eu/pages/viewpage.action?pageId=44165555>) to search and download data from
+(<https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-detailed-guidelines/sdmx2-1>) to search and download data from
 the Eurostat database using the SDMX standard.
 License: EUPL
 Imports: data.table, rjson, xml2

NEWS.md

File mode changed: 100644 → 100755
Lines changed: 10 additions & 1 deletion
@@ -1,8 +1,17 @@
+# restatapi 0.24.4
+
+- correction of links in the documentation
+
+# restatapi 0.24.3
+
+- correction to handle errors in the text version of the Table of Content
+- correction of test because the use of Euro in Bulgaria since 2026
+
 # restatapi 0.24.2
 
 - correction of test because of the changed Table of Content
 - CRAN release
--
+
 # restatapi 0.24.1
 
 - correction of outdated URLs and documentation

R/get_eurostat_bulk.R

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
 #' @param ... other parameter(s) to pass on the \code{\link{load_cfg}} function
 #' @export
 #'
-#' @details Data sets are downloaded from \href{https://wikis.ec.europa.eu/display/EUROSTATHELP/Transition+-+from+Eurostat+Bulk+Download+to+API}{the Eurostat bulk download facility}
+#' @details Data sets are downloaded from \href{https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-migrating/bulkdownload}{the Eurostat bulk download facility}
 #' in TSV format as in this case smaller file has to be downloaded and processed. If there is more then one frequency then
 #' the dataset is filtered for a unique time frequency.
 #' If no frequency is selected and there are multiple frequencies in the dataset, then the most common value is used used for frequency.

R/get_eurostat_codelist.R

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 #' @seealso \code{\link{get_eurostat_dsd}}.
 #' @details The codelist is downloaded from Eurostat's website, through the REST API in XML (SDMX-ML) format.
 #'
-#' @references For more information see the detailed documentation of the \href{https://wikis.ec.europa.eu/display/EUROSTATHELP/API+SDMX+2.1+-+metadata+query}{API}.
+#' @references For more information see the detailed documentation of the \href{https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access}{API}.
 #' @examples
 #' if (!(grepl("amzn|-aws|-azure ",Sys.info()['release']))) options(timeout=2)
 #' get_eurostat_codelist("freq",lang="de",cache=FALSE,verbose=TRUE)

R/get_eurostat_data.R

Lines changed: 2 additions & 2 deletions
@@ -63,8 +63,8 @@
 #' @export
 #'
 #' @details Data sets are downloaded from the Eurostat Web Services
-#' \href{https://wikis.ec.europa.eu/pages/viewpage.action?pageId=44165555}{SDMX API} if there is a filter otherwise the
-#' \href{https://wikis.ec.europa.eu/display/EUROSTATHELP/Transition+-+from+Eurostat+Bulk+Download+to+API}{the Eurostat bulk download facility} is used.
+#' \href{https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-detailed-guidelines/sdmx2-1}{SDMX API} if there is a filter otherwise the
+#' \href{https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-migrating/bulkdownload}{the Eurostat bulk download facility} is used.
 #' If only the table \code{id} is given, the whole table is downloaded from the
 #' bulk download facility. If also \code{filters} or \code{date_filter} is defined then the SDMX REST API is
 #' used. In case after filtering the dataset has more rows than the limitation of the SDMX REST API (1 million values at one time) then the bulk download is used to retrieve the whole dataset .
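
Note on the `@details` text in this hunk: the routing rule it describes can be sketched in a few lines of R. This is only an illustration of the documented behaviour, not the package's code. `choose_route()`, its arguments and `sdmx_limit` are hypothetical names introduced here; the 1 million figure is taken from the documentation text.

```r
# Hypothetical helper, NOT part of restatapi: picks a download route
# following the rule stated in the @details text of get_eurostat_data().
sdmx_limit <- 1e6  # "1 million values at one time", as quoted in the docs

choose_route <- function(has_filter, n_values_after_filter = NA) {
  # No filter at all: the whole table comes from the bulk download facility
  if (!has_filter) return("bulk")
  # Filtered result still above the documented limit: fall back to bulk
  if (!is.na(n_values_after_filter) && n_values_after_filter > sdmx_limit) {
    return("bulk")
  }
  # Otherwise the filtered request goes to the SDMX REST API
  "sdmx"
}

choose_route(has_filter = FALSE)
choose_route(has_filter = TRUE, n_values_after_filter = 500)
choose_route(has_filter = TRUE, n_values_after_filter = 2e6)
```

How the package actually estimates the size of a filtered result is not shown in this diff, so the second argument is an assumption made only to keep the sketch self-contained.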

R/get_eurostat_raw.R

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@
 #' @param ... further argument for the \code{\link{load_cfg}} function
 #' @export
 #'
-#' @details Data sets are downloaded from \href{https://wikis.ec.europa.eu/display/EUROSTATHELP/Transition+-+from+Eurostat+Bulk+Download+to+API}{the Eurostat bulk download facility}
+#' @details Data sets are downloaded from \href{https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-migrating/bulkdownload}{the Eurostat bulk download facility}
 #' in CSV, TSV or SDMX format.
 #'
 #'

R/get_eurostat_toc.R

File mode changed: 100644 → 100755
Lines changed: 7 additions & 4 deletions
@@ -79,7 +79,7 @@ get_eurostat_toc<-function(mode="xml",
 load_cfg()
 }
 }
-# if (verbose) {message("get_eurostat_toc - API version:",get("rav",envir=restatapi::.restatapi_env)," - number of cores:",getOption("restatapi_cores",1L))}
+
 if(any(grepl("get_eurostat_bulk|get_eurostat_data|get_eurostat_raw",as.character(sys.calls()),perl=TRUE))) {update_cache<-FALSE}
 
 if ((cache) & (!update_cache)) {
@@ -104,7 +104,7 @@ get_eurostat_toc<-function(mode="xml",
 tbc<-FALSE
 })
 if (tbc) {
-tryCatch({toc<-data.table::fread(temp,header=TRUE,sep="\t",stringsAsFactors=FALSE)},
+tryCatch({toc<-data.table::fread(temp,header=TRUE,sep="\t",stringsAsFactors=FALSE,fill=Inf)},
 error = function(e) {
 if (verbose) {message("get_eurostat_toc - Error during the reading of the tsv version of the TOC file:",'\n',paste(unlist(e),collapse="\n"))}
 else {message("There is an error by the reading of the downloaded txt TOC file. Run the same command with verbose=TRUE option to get more info on the issue.")}
@@ -117,8 +117,11 @@ get_eurostat_toc<-function(mode="xml",
 })
 if (tbc) {
 if (!is.null(toc)) {
-names(toc)<-c("title","code","type","lastUpdate","lastModified","dataStart","dataEnd","values")
-toc<-toc[toc$type!="folder",]
+cnames<-c("title","code","type","lastUpdate","lastModified","dataStart","dataEnd","values")
+if (ncol(toc)>8) cnames<-c(cnames,paste0("X",1:(ncol(toc)-8)))
+names(toc)<-cnames
+toc$code<-sub("^\\s*","",toc$code)
+toc<-toc[toc$type!="folder"&toc$code!="",1:8]
 toc$title<-sub("^\\s*","",toc$title)
 }
 }
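
Note on the last hunk: the new clean-up can be tried in isolation on a toy table. The sketch below uses a base `data.frame` rather than the `data.table` returned by `fread`, and every value in it is invented for illustration. It assumes that reading with `fill` enabled has produced two overflow columns beyond the eight expected ones.

```r
# Toy stand-in for a TOC read with fill enabled: 10 columns instead of 8.
# All values are invented; only the shape matters.
rows <- list(
  c("  Database by themes", "data",       "folder",  "", "", "",     "",     "",   "",  ""),
  c("    Toy dataset",      "  toy_code", "dataset", "", "", "2000", "2020", "10", "",  ""),
  c("    Broken row",       "",           "dataset", "", "", "",     "",     "",   "x", "y")
)
toc <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# Same steps as the added lines in the hunk
cnames <- c("title", "code", "type", "lastUpdate", "lastModified",
            "dataStart", "dataEnd", "values")
# Name any overflow columns X1, X2, ... so names() gets the right length
if (ncol(toc) > 8) cnames <- c(cnames, paste0("X", 1:(ncol(toc) - 8)))
names(toc) <- cnames
# Trim leading whitespace from the code before testing it for emptiness
toc$code <- sub("^\\s*", "", toc$code)
# Drop folders and rows without a code; keep only the 8 expected columns
toc <- toc[toc$type != "folder" & toc$code != "", 1:8]
toc$title <- sub("^\\s*", "", toc$title)

print(toc)
```

Only the "Toy dataset" row survives: the folder row is dropped by the `type` test and the row with an empty `code` by the new second condition, while the overflow columns are discarded by the `1:8` selection.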

README.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ remotes::install_github("eurostat/restatapi")
 ```
 
 ## background
-This package is similar to other packages like the [eurodata](https://github.com/alekrutkowski/eurodata), [eurostat](https://cran.r-project.org/package=eurostat), [rdbnomics](https://cran.r-project.org/package=rdbnomics), [RJSDMX](https://cran.r-project.org/package=RJSDMX) or [TSsdmx](https://cran.r-project.org/package=TSsdmx) which can be used to download data from Eurostat database. The difference is that `restatapi` is based on SDMX (Statistical Data and Metadata eXchange) and XML to search and retrieve filtered datasets and use the TSV (tab separated values) bulk download facility to get whole data tables. The code was written in a way that the number of dependencies on other packages should be very small. The `restatapi` package provides flexible filtering options, data caching, and uses the `parallel` and `data.table` package to handle large dataset in an efficient way.
+This package is similar to other packages like the [eurodata](https://cran.r-project.org/package=eurodata), [eurostat](https://cran.r-project.org/package=eurostat), [rdbnomics](https://cran.r-project.org/package=rdbnomics), [RJSDMX](https://cran.r-project.org/package=RJSDMX) or [TSsdmx](https://cran.r-project.org/package=TSsdmx) which can be used to download data from Eurostat database. The difference is that `restatapi` is based on SDMX (Statistical Data and Metadata eXchange) and XML to search and retrieve filtered datasets and use the TSV (tab separated values) bulk download facility to get whole data tables. The code was written in a way that the number of dependencies on other packages should be very small. The `restatapi` package provides flexible filtering options, data caching, and uses the `parallel` and `data.table` package to handle large dataset in an efficient way.
 
 ## content
 The package contains 8 main functions and several other sub functions in 3 areas.
@@ -95,7 +95,7 @@ options(restatapi_update=TRUE)
 options(restatapi_cache_dir=file.path(tempdir(),"restatapi"))
 ```
 <a name="updated-date-filter"></a>
-**Example 6:** First download the annual (`select_freq="A"`) air passenger transport data for the main airports of Montenegro (`avia_par_me`) and do not cache any of the data (`cache=FALSE`). Then from the same table download the monthly (`select_freq="M"`) and quarterly (`filters="Q...`) data for 2 specific airport pairs/routes (`filters=...ME_LYPG_HU_LHBP+ME_LYTV_UA_UKKK"`) in August 2016 and on 1 July 2017 (`date_filter=c("2016-08","2017-07-01")`). The filters are provided in the format how it is required by the [REST SDMX web service](https://wikis.ec.europa.eu/pages/viewpage.action?pageId=44165555). Under the old API, it returned the value for the selected routes for the month August 2016, July 2017 and the 3rd quarter of 2017. Meanwhile under the ***new API***, it returns all the quarterly and monthly value, as there is a single day in the `date_filter`.
+**Example 6:** First download the annual (`select_freq="A"`) air passenger transport data for the main airports of Montenegro (`avia_par_me`) and do not cache any of the data (`cache=FALSE`). Then from the same table download the monthly (`select_freq="M"`) and quarterly (`filters="Q...`) data for 2 specific airport pairs/routes (`filters=...ME_LYPG_HU_LHBP+ME_LYTV_UA_UKKK"`) in August 2016 and on 1 July 2017 (`date_filter=c("2016-08","2017-07-01")`). The filters are provided in the format how it is required by the [REST SDMX web service](https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-detailed-guidelines/sdmx2-1). Under the old API, it returned the value for the selected routes for the month August 2016, July 2017 and the 3rd quarter of 2017. Meanwhile under the ***new API***, it returns all the quarterly and monthly value, as there is a single day in the `date_filter`.
 Then download again the monthly and quarterly data (`filters=c("Quarterly","Monthly")`) where there is exact match in the DSD for "HU" for August 2016 and 1 March 2014 (`date_filter=c("2016-08","2014-03-01")`). This query will provide only monthly data for 2016, as the quarterly data is always assigned to the first month of the quarter and there is no data for 2014. Since there is no exact match for the "HU" pattern, it returned all the monthly data for August 2016 and put the labels (like the name of the airports and units) so the data can be easier understood (`label=TRUE`) under the old API. Under the ***current API***, it returns all the quarterly and monthly data as there is a single day in the `date_filter`.
 Finally, download only the quarterly data (`select_freq="Q"`) for several time periods (`date_filter=c("2017-03",2016,"2017-07-01",2012:2014)`, the order of the dates does not matter) where the "HU" pattern can be found anywhere, but only in the `code` column of the DSD (`filters="HU",exact_match=FALSE,name=FALSE`). The result was all the statistics about flights from Montenegro to Hungary in the 3rd quarter of 2017, as there were no information for the other time periods under the old API. Under the ***current API***, it gives back all the quarterly data in the dataset for flights from Montenegro to Hungary because in the `date_filter` there is a single day.
 Before 2022, in the old dissemination chain the value was assigned to *the first day* of the month, quarter and year, so it was enough to filter for one day to get the value for the whole period. Under the current API the value belongs to the full period. If a date range does not cover the whole period no value is returned. For example, to get the value of the whole quarter the date filter should start at least on the first date of the quarter and end at least on the last day of the quarter. With exact numerical example to get the value for 2022/Q3, the `startDate` should be 2022-07-01 or earlier and the `endDate` should be 2022-09-30 or later. In the old version of the API it was enough if the period included the day 2022-07-01 only.
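
Note on the README context above: the "whole period must be covered" rule can be expressed as a small check. The sketch below is illustrative only. `covers_period()` is a hypothetical helper, not a `restatapi` function, and it simply restates the 2022/Q3 example from the README text.

```r
# Hypothetical helper, NOT part of restatapi: TRUE when the requested range
# [start_date, end_date] contains the whole reference period.
covers_period <- function(start_date, end_date, period_start, period_end) {
  as.Date(start_date) <= as.Date(period_start) &&
    as.Date(end_date) >= as.Date(period_end)
}

# 2022/Q3 runs from 2022-07-01 to 2022-09-30
q3_start <- "2022-07-01"
q3_end   <- "2022-09-30"

covers_period("2022-07-01", "2022-09-30", q3_start, q3_end)  # TRUE: full quarter
covers_period("2022-06-15", "2022-10-15", q3_start, q3_end)  # TRUE: wider range
covers_period("2022-07-01", "2022-07-01", q3_start, q3_end)  # FALSE: single day
```

The last call corresponds to the single-day filter that was sufficient under the old API and no longer is.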

inst/tinytest/test_restatapi.R

File mode changed: 100644 → 100755
Lines changed: 1 addition & 1 deletion
@@ -301,7 +301,7 @@ if (grepl("\\.amzn|-aws|5.4.109+|-azure ",Sys.info()['release'])) {
 expect_true(system.time({get_eurostat_dsd(testid1)})[3]<system.time({get_eurostat_dsd(testid1,update_cache=TRUE,parallel=FALSE,api_version=api_version)})[3]) # a0
 
 #### additional test of the search_eurostat_dsd function
-expect_equal(nrow(search_eurostat_dsd(pattern,dsd,ignore.case=TRUE)),19) # a1
+expect_equal(nrow(search_eurostat_dsd(pattern,dsd,ignore.case=TRUE)),20) # a1
 expect_equal(nrow(search_eurostat_dsd(pattern,dsd)),15) # a2
 expect_equal(nrow(do.call(rbind,lapply(c(eu$EU15,eu$EA19),search_eurostat_dsd,dsd=dsd,name=FALSE,exact_match=TRUE))),34) # a3
 expect_equal(nrow(do.call(rbind,lapply(eu$NMS2,search_eurostat_dsd,dsd=dsd,exact_match=TRUE,ignore.case=TRUE))),2) # a4

man/get_eurostat_bulk.Rd

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.
