Open
Description
Traceback (most recent call last):
  File "/Users/nsamarin/Projects/ccpa-compliance/scripts/scraper/main.py", line 159, in <module>
    scrape_policies(**kwargs)
  File "/Users/nsamarin/Projects/ccpa-compliance/scripts/scraper/main.py", line 132, in scrape_policies
    future.result()
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/concurrent/futures/_base.py", line 433, in result
    return self.__get_result()
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/polipy/polipy.py", line 292, in download_policy
    policy.extract(extractors=extractors)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/polipy/polipy.py", line 112, in extract
    content = extract(extractor, **vargs)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/polipy/extractors.py", line 11, in extract
    content = extract_text(**kwargs)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/polipy/extractors.py", line 18, in extract_text
    content = extract_pdf(static_source)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/polipy/extractors.py", line 28, in extract_pdf
    text = parse_pdf(f)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/pdfminer/high_level.py", line 114, in extract_text
    for page in PDFPage.get_pages(
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/pdfminer/pdfpage.py", line 128, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
  File "/opt/anaconda3/envs/ccpa/lib/python3.9/site-packages/pdfminer/pdfdocument.py", line 596, in __init__
    raise PDFSyntaxError('No /Root object! - Is this really a PDF?')
pdfminer.pdfparser.PDFSyntaxError: No /Root object! - Is this really a PDF?
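
The traceback shows the exception surfacing from extract_pdf and propagating through future.result() until it aborts the whole scrape. One possible cause, which is an assumption and not confirmed by the traceback, is that the downloaded file is not actually a PDF (for example, an HTML error page saved with a .pdf URL). A minimal sketch of a defensive guard follows. The names looks_like_pdf and extract_pdf_safely are hypothetical and are not part of polipy or pdfminer; the parser and the exception types to swallow are passed in by the caller.

```python
import io
from typing import BinaryIO, Callable, Optional, Tuple, Type

# Marker that begins the header of a PDF file.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(stream: BinaryIO, probe_size: int = 1024) -> bool:
    """Return True if the PDF header marker appears near the start of the stream."""
    start = stream.tell()
    head = stream.read(probe_size)
    stream.seek(start)  # restore the position so the parser sees the full stream
    return PDF_MAGIC in head


def extract_pdf_safely(
    stream: BinaryIO,
    parse: Callable[[BinaryIO], str],
    parse_errors: Tuple[Type[Exception], ...] = (Exception,),
) -> Optional[str]:
    """Hypothetical wrapper: return None instead of raising on non-PDF input."""
    if not looks_like_pdf(stream):
        return None
    try:
        return parse(stream)
    except parse_errors:
        # The header was present but the body could not be parsed.
        return None


# Example: an HTML page served in place of a PDF is skipped, not raised on.
html_body = io.BytesIO(b"<!DOCTYPE html><html>...</html>")
print(extract_pdf_safely(html_body, parse=lambda s: s.read().decode()))  # None
```

In the setting of this traceback, the caller could pass pdfminer's extract_text as parse and (PDFSyntaxError,) as parse_errors, so that a single malformed policy is skipped and the remaining futures still complete. Whether returning None is the right behavior for polipy is a design choice left to the maintainers.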