First, many thanks for (yet another) very helpful package!
I am trying to use the htmlunit package to scrape results from https://juris.ohchr.org/Search/Documents
(the site doesn’t show any search results unless you select at least one search option, e.g., a treaty).
While browsing the site and reading the results works (most of the time), I get an error message when trying to retrieve the results with the htmlunit package.
Is this error caused solely by the website, in the sense that it is not set up properly, or is there something in htmlunit that triggers it? If so, is there any way to circumvent the error with htmlunit?
If you think this would be better asked on SO, let me know.
Thanks again!
library(htmlunit)
#> Loading required package: rJava
#> Loading required package: htmlunitjars
#> Loading required package: rvest
#> Loading required package: xml2
library(tidyverse)
#> Warning: package 'dplyr' was built under R version 3.6.2
my_site2 <- "http://juris.ohchr.org/search/results/2?typeOfDecisionFilter=0&countryFilter=0&treatyFilter=0"
#my_site2 <- "https://juris.ohchr.org/search/results"
js_pg2 <- htmlunit::hu_read_html(my_site2)
js_pg2
#> {html_document}
#> <html>
#> [1] <head>\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8 ...
#> [2] <body bgcolor="white">\r\n <span>\r\n <h1>\r\n Server Erro ...
html_nodes(js_pg2, "td")
#> {xml_nodeset (2)}
#> [1] <td>\r\n <code>\r\n \n\nAn unhandled exceptio ...
#> [2] <td>\r\n <code>\r\n <pre>\r\n ...
Created on 2020-03-05 by the reprex package (v0.3.0)