- Ashik Nesin
Extracting a table from a PDF is a tedious task.
If you're lucky, you can simply copy and paste it. But that rarely works for financial documents such as credit card or bank statements.
There are lots of apps out there that claim to extract tables from PDFs, but they charge hefty subscription fees, which doesn't make sense if you only need them once in a while.
That's where Tabula comes in.
It's an open-source tool for extracting tables from PDFs. In my personal experience, it's the best tool out there: it works almost perfectly, even better than the cloud providers that offer PDF table extraction.
Let's see how to use it:
- Head over to the Tabula website and download the app for your OS.
- Start the app. It doesn't open a window of its own. Instead, it starts a local server, and you use it from your browser on localhost.
- Now upload your PDF, select the region you want to extract, and click Preview & Export Extracted Data.
- Once the extraction is complete, you can export the result. If the current extraction method doesn't suit your case, you can change the Extraction Method as well.
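Once you have exported the data as CSV, you'll often want to tidy it up a bit. Here is a minimal Python sketch of that, using only the standard library. The column names (`Date`, `Description`, `Amount`) and the sample rows are made up for illustration, so adjust them to whatever your own statement contains. The helper names `parse_amount` and `load_transactions` are my own and not part of Tabula.

```python
import csv
import io
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Turn a cell like '1,234.50' or '(45.00)' into a Decimal, or None if it isn't a number."""
    cleaned = text.strip().replace(",", "")
    # Some statements show negative amounts in parentheses.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -value if negative else value


def load_transactions(csv_file, amount_column="Amount"):
    """Read rows from an exported CSV, skipping rows whose amount is not numeric."""
    rows = []
    for row in csv.DictReader(csv_file):
        amount = parse_amount(row.get(amount_column) or "")
        if amount is None:
            continue  # e.g. a header row repeated on a later page
        row[amount_column] = amount
        rows.append(row)
    return rows


# Hypothetical sample standing in for an exported CSV file.
SAMPLE = """Date,Description,Amount
2024-01-03,Grocery store,"1,234.50"
Date,Description,Amount
2024-01-05,Refund,(45.00)
"""

transactions = load_transactions(io.StringIO(SAMPLE))
total = sum(t["Amount"] for t in transactions)
print(len(transactions), total)
```

To use it on a real export, replace `io.StringIO(SAMPLE)` with `open("your-export.csv", newline="")`. `Decimal` is used instead of `float` so that money amounts don't pick up rounding errors.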
Happy table extracting!