Convert documents to PDF "chow" "two" p.d.f. - Cloud, Yet Another Office 2 PDF
cyao2pdf is a proof of concept (POC) for converting office documents to PDF. The Docker image exposes a REST'ish service that gives users access to LibreOffice's "convert to" PDF functionality.
Using curl to convert a file to pdf
-
build the java app
cd topdf
mvn package
-
build the docker image
docker build -t mirsaes/cyao2pdf:beta ./
-
launch the docker image with process reaper
docker run --rm -d --init -p 8080:8080 mirsaes/cyao2pdf:beta
or run with process reaper and a local file used as application.properties to override app settings
docker run --init --rm -d -p 8080:8080 --mount 'type=bind,src=/full/path/to/sample.properties,dst=/topdf/application.properties' mirsaes/cyao2pdf:beta
-
check health or run basic tests
curl http://localhost:8080/live/health
curl 'http://localhost:8080/live/health?testConvert=true'
curl http://localhost:8080/live/test
-
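When scripting against the container, it can help to wait for the health endpoint before sending documents. The following is a minimal sketch, assuming /live/health answers with a success status once the service is ready; that behavior, the function name wait_for_cyao2pdf, the attempt count, and the delay are all illustrative assumptions rather than documented features.

```shell
# Poll the health endpoint until it answers with a success status.
# wait_for_cyao2pdf, the attempt count and the delay are illustrative choices.
wait_for_cyao2pdf() {
    base_url="${1:-http://localhost:8080}"
    attempts="${2:-30}"
    delay="${3:-1}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl exit non-zero on HTTP errors, -s silences progress output
        if curl -fs -o /dev/null "$base_url/live/health"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    # the service never answered successfully within the given attempts
    return 1
}
```

For example, `wait_for_cyao2pdf http://localhost:8080 60 && echo "service is up"` would block for up to about a minute before giving up.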
use curl to convert a file to pdf
curl -X POST -F "name=test.txt" -F "file=@/home/mirsaes/test.txt" http://localhost:8080/live/topdf
if the service is configured to require a password, add user:password as shown below; note that ssl is not configured on the service by default, so the credentials are sent unencrypted
curl -X POST -u user:password -F "name=test.txt" -F "file=@/home/mirsaes/test.txt" http://localhost:8080/live/topdf
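The calls above print the response to the terminal. A small wrapper can save the result to a file instead. This is a sketch only: it assumes the response body is the converted PDF (as the redirected output in the urltopdf examples suggests), and the helper name topdf_file and the CYAO2PDF_URL variable are hypothetical, not part of the project.

```shell
# Convert a local file and write the PDF to disk.
# topdf_file and CYAO2PDF_URL are illustrative names, not part of cyao2pdf.
topdf_file() {
    src="$1"
    # default output: same path with the extension replaced by .pdf
    out="${2:-${src%.*}.pdf}"
    base_url="${CYAO2PDF_URL:-http://localhost:8080}"
    # --fail: exit non-zero on HTTP errors; -o: write the response to $out
    curl --fail --silent --show-error -X POST \
        -F "name=$(basename "$src")" \
        -F "file=@$src" \
        -o "$out" \
        "$base_url/live/topdf"
}
```

Usage might look like `topdf_file /home/mirsaes/test.txt`, which would write /home/mirsaes/test.pdf if the conversion succeeds.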
-
use curl to convert a remote file to a pdf
curl -X POST -F "name=web.txt" -F "file=https://somesite.com/withatextfile" http://localhost:8080/live/urltopdf
odd, this blob share apparently had its permissions locked down, oops
https://interoperability.blob.core.windows.net/files/MS-DOCX/%5bMS-DOCX%5d-200219.docx
so an alternative test file has been specified in the example below
curl -X POST -F "name=test.docx" -F "file=https://msopenspecs.azureedge.net/files/MS-DOCX/%5bMS-DOCX%5d-230815.docx" http://localhost:8080/live/urltopdf > docx.pdf
and ... this also got moved.
Here is an example that uses the 308 redirect target returned by the public URL https://officeprotocoldocs-f5hpbjgea6b8gneq.b02.azurefd.net/files/MS-DOCX/%5bMS-DOCX%5d-251113.docx
curl -X POST -F "name=test.docx" -F "file=https://officeprotocoldoc.z19.web.core.windows.net/files/MS-DOCX/%5bMS-DOCX%5d-251113.docx" http://localhost:8080/live/urltopdf > docx.pdf
Point being, you might need to find your own valid url that has a docx file.
This might be useful when using Amazon S3 and Temporary Credentials via Query String Request Authentication - but that has not been tested.
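Since the example URLs above kept moving behind redirects, one option is to resolve the redirect locally with curl and pass the final URL to the service. This is a hedged sketch: it assumes the remote server answers HEAD requests, and the helper name urltopdf_resolved and the CYAO2PDF_URL variable are hypothetical.

```shell
# Resolve redirects locally, then hand the final URL to /live/urltopdf.
# urltopdf_resolved and CYAO2PDF_URL are illustrative names, not part of cyao2pdf.
urltopdf_resolved() {
    src_url="$1"   # URL that may redirect
    name="$2"      # file name sent to the service, e.g. test.docx
    out="$3"       # where to write the PDF
    base_url="${CYAO2PDF_URL:-http://localhost:8080}"
    # -I sends a HEAD request, -L follows redirects,
    # %{url_effective} prints the last URL that was fetched
    final_url="$(curl -sIL -o /dev/null -w '%{url_effective}' "$src_url")" || return 1
    curl --fail --silent --show-error -X POST \
        -F "name=$name" \
        -F "file=$final_url" \
        -o "$out" \
        "$base_url/live/urltopdf"
}
```

A call such as `urltopdf_resolved "$DOCX_URL" test.docx docx.pdf` would then post whatever URL the redirect chain ends at, where DOCX_URL is any reachable docx link you have found.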
- noted
# whether to use a user pool to convert documents
convertusers.enabled: true
# number of users to use (users must exist, up to max of 8 users)
convertusers.count: 4
# username prefix used to form converting username, e.g. cyao2pdf1, cyao2pdf2, etc
convertusers.username.prefix: cyao2pdf
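To tie these settings to the bind-mount launch shown earlier, the properties can be written to a local file and mounted over /topdf/application.properties. This is a sketch under assumptions: the helper names write_sample_properties and run_with_properties are hypothetical, and the values are simply the ones listed above.

```shell
# Write the user-pool settings listed above to a properties file.
# write_sample_properties is an illustrative helper name.
write_sample_properties() {
    cat > "$1" <<'EOF'
# whether to use a user pool to convert documents
convertusers.enabled: true
# number of users to use (users must exist, up to max of 8 users)
convertusers.count: 4
# username prefix used to form converting username, e.g. cyao2pdf1, cyao2pdf2, etc
convertusers.username.prefix: cyao2pdf
EOF
}

# Start the container with the file mounted as application.properties.
# run_with_properties is an illustrative helper name.
run_with_properties() {
    props="$1"   # bind mounts need an absolute path
    docker run --init --rm -d -p 8080:8080 \
        --mount "type=bind,src=$props,dst=/topdf/application.properties" \
        mirsaes/cyao2pdf:beta
}
```

For example, `write_sample_properties "$PWD/sample.properties" && run_with_properties "$PWD/sample.properties"`.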
-
0.0.14
- update spring boot from 3.4.2 to 3.5.3 - Support LifeCycle
- Ubuntu 24.04
- LibreOffice 24.2.7.2
- jre 21
- spring 3.5
-
0.0.13
- update java from 17 to 21, ubuntu from 22 to 24, and spring boot from 3.2 to 3.4.2 - Support LifeCycle
- Ubuntu 24.04
- LibreOffice 24.2.7.2
- jre 21
- spring 3.4
-
0.0.12
- update spring boot to 3.2.2 - Support LifeCycle
- Ubuntu 22.04
- LibreOffice 7.3
- jre 17
- spring 3.2
-
0.0.11
- update spring boot to 3.0.6 - Support LifeCycle
- Ubuntu 22.04
- LibreOffice 7.3
- jre 17
- spring 3.0
-
0.0.10
-
0.0.9
- update spring boot to 2.6.6 - LTS
- includes security fix for CVE-2022-22965
- however, the app was not vulnerable, as the build uses the default of generating an executable jar rather than a war
- Ubuntu 20.04
- LibreOffice 6.4.7.2
- jre 11
-
0.0.8d
- security update to include log4j 2.17.1
-
0.0.8c
- security update to include log4j 2.17.0
-
0.0.8b
- log4helle fix vengeance
- update to include log4j 2.16
-
0.0.8a
- log4helle fix, dunkel
- analysis showed the default implementation in 0.0.8 and 0.0.7 should have been unaffected; however, if users configured access log logging via property file overrides, then it would have been affected
-
0.0.8
- Reduced image size, ubuntu 20
- Ubuntu 20
- LibreOffice
- jre 11
-
0.0.7d
- security update to include log4j 2.17.1
-
0.0.7c
- update to include log4j 2.17
-
0.0.7b
- update to include log4j 2.16
-
0.0.7a
- includes updated log4j library to fix exploit
- update to include log4j 2.15
-
0.0.7
- Basic parallel document conversion support
- Ubuntu 18
- LibreOffice
- jre 11