pdf

Tools#

wkhtmltopdf#

Converts HTML to PDF. It can be used as a single binary or run in a Docker container. It uses a rather old WebKit engine, which means newer CSS syntax won’t get rendered. For example, I had problems when using tailwindcss.
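A minimal sketch of calling the binary directly from Go, assuming `wkhtmltopdf` is installed and on `PATH`. The helper names `wkhtmltopdfArgs` and `convert` are made up for this note, and the A4 page size is just an example value.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// wkhtmltopdfArgs builds the argument list for the wkhtmltopdf CLI:
// options first, then the input HTML file, then the output PDF file.
// (Hypothetical helper, not part of any library.)
func wkhtmltopdfArgs(pageSize, input, output string) []string {
	return []string{"--quiet", "--page-size", pageSize, input, output}
}

// convert runs the wkhtmltopdf binary, which is assumed to be on PATH.
func convert(input, output string) error {
	cmd := exec.Command("wkhtmltopdf", wkhtmltopdfArgs("A4", input, output)...)
	cmd.Stderr = os.Stderr // show rendering errors from the binary
	return cmd.Run()
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: html2pdf <input.html> <output.pdf>")
		return
	}
	if err := convert(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, "conversion failed:", err)
		os.Exit(1)
	}
}
```

Keeping the argument list in its own function makes it easy to check without having the binary installed.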

There is also a Go library which uses the binary to create PDFs:

import (
	wkhtml "github.com/SebastiaanKlippert/go-wkhtmltopdf"
)

Gotenberg#

Converts HTML to PDF. It runs in a Docker container and is accessible via an HTTP API. I don’t have much experience with it, but it looks promising.
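A rough sketch of talking to the API from Go. It assumes a Gotenberg 7 or newer container listening on `http://localhost:3000` and the Chromium route `/forms/chromium/convert/html`, which expects the HTML as a form file named `index.html`. Older versions use different routes, so check this against the version you run. `newConvertRequest` is a name made up for this note.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

// newConvertRequest builds a multipart POST that uploads the HTML as
// "index.html". The route is an assumption (Gotenberg 7+).
func newConvertRequest(baseURL string, html []byte) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile("files", "index.html")
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(html); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/forms/chromium/convert/html", &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newConvertRequest("http://localhost:3000", []byte("<h1>Hello</h1>"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request failed:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		fmt.Fprintln(os.Stderr, "unexpected status:", resp.Status)
		return
	}
	out, err := os.Create("out.pdf")
	if err != nil {
		fmt.Fprintln(os.Stderr, "creating output failed:", err)
		return
	}
	defer out.Close()
	// The response body is the rendered PDF.
	if _, err := io.Copy(out, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "writing output failed:", err)
	}
}
```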

ocrmypdf#

Applies Optical Character Recognition (OCR) to PDFs. It is written in Python and uses Tesseract as its backend. Because of its many dependencies, I recommend using the Docker image.

docker run --rm -i jbarlow83/ocrmypdf-alpine
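The command above can also be driven from Go. This is a sketch that assumes Docker is installed and that the image reads the input PDF from stdin and writes the result to stdout when given `- -` as input and output arguments. `dockerArgs` and `ocr` are names made up for this note.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// image is the Docker image named in the command above.
const image = "jbarlow83/ocrmypdf-alpine"

// dockerArgs builds the docker arguments; the trailing "-" "-" tell
// ocrmypdf to use stdin as input and stdout as output (assumption).
func dockerArgs(img string) []string {
	return []string{"run", "--rm", "-i", img, "-", "-"}
}

// ocr pipes inPath through the container and writes the result to outPath.
// Note that outPath is created even if the conversion fails.
func ocr(inPath, outPath string) error {
	in, err := os.Open(inPath)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	cmd := exec.Command("docker", dockerArgs(image)...)
	cmd.Stdin = in
	cmd.Stdout = out
	cmd.Stderr = os.Stderr // progress and error messages from ocrmypdf
	return cmd.Run()
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: ocr <input.pdf> <output.pdf>")
		return
	}
	if err := ocr(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, "ocr failed:", err)
		os.Exit(1)
	}
}
```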