Mastering data extraction from PDF tables - at scale

If you’ve ever tried to work with data that was handed to you as PDFs, you know how painful it is: you can’t easily copy and paste rows of data out of a PDF file. Tabula lets you extract that data as CSV. You install a Java program, and it gives you a simple, locally hosted web interface.
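
The web interface handles one file at a time. For a whole folder of PDFs, one option is to script the command-line tool from the tabula-java project. The sketch below is an illustration, not an official recipe. It assumes Java is installed and that a tabula-java jar has been downloaded to the hypothetical path `tabula.jar`. The helper names `build_command` and `extract_folder` are made up for this example, and the `--pages`, `--format` and `--outfile` flags should be checked against the help output of your tabula-java version.

```python
import subprocess
from pathlib import Path

# Hypothetical location of a downloaded tabula-java jar; adjust to your setup.
TABULA_JAR = Path("tabula.jar")


def build_command(pdf_path, csv_path, jar=TABULA_JAR):
    """Build a tabula-java command that writes the tables from all pages to one CSV."""
    return [
        "java", "-jar", str(jar),
        "--pages", "all",        # look for tables on every page
        "--format", "CSV",       # assumed flag spelling; verify with --help
        "--outfile", str(csv_path),
        str(pdf_path),
    ]


def extract_folder(pdf_dir, out_dir, jar=TABULA_JAR):
    """Run tabula-java once per PDF in pdf_dir and return the CSV paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        csv_path = out_dir / (pdf.stem + ".csv")
        # check=True raises CalledProcessError if Java or the jar fails.
        subprocess.run(build_command(pdf, csv_path, jar), check=True)
        written.append(csv_path)
    return written
```

Calling `extract_folder("pdfs", "csv")` would then produce one CSV per PDF in the `pdfs` folder. Automatic table detection is not perfect, so it is worth spot-checking a few of the outputs against the source PDFs.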

Tabula, Tabula on GitHub

This post is licensed under CC BY 4.0 by the author.