
Desktop app that makes data trapped in PDF files editable again.

 

Main functions
  • Extract data from tables in PDF documents.
  • Export the extracted table data to editable formats such as CSV or Microsoft Excel.
Description



 
This tool was developed by Manuel Aristarán, Mike Tigas and Jeremy B. Merrill with the support of ProPublica, La Nación DATA, Knight-Mozilla OpenNews, and The New York Times. Tabula was designed by Jason Das.  


 
If you’ve ever tried to use data in tables in a PDF and realized that there’s no easy way to copy and paste the rows from that format to a spreadsheet, you’ll find Tabula very useful. 


 
Through a friendly interface, Tabula lets you extract this data into an editable format such as CSV or a Microsoft Excel spreadsheet.
 

[Image: column recognition]

 

Researchers from diverse areas of expertise use this tool to convert PDF documents into spreadsheets and other formats for use in analysis and databases. 

 


Tabula is used to support investigative reporting at news organizations of all sizes, including ProPublica, The Times of London, Foreign Policy, La Nación (Argentina), The New York Times and the St. Paul (MN) Pioneer Press.

 

Language
Java
HTML
CSS
JavaScript
Knowledge areas
Science and Technology