SocialCops

Posts by tag

PDF

  • Engineering
  • Technology

Announcing Camelot, a Python Library to Extract Tabular Data from PDFs

  • 6 minute read
  • October 3, 2018
  • by Vinayak Mehta
We use GitHub issues to keep track of all issues. Please do not report bugs or issues in this blog’s comments. Instead, post them on GitHub as an issue. Before submitting a comment with an issue, please use GitHub search to look for existing issues (both open and closed) that may be similar.…
  • 8 minute read
  • Data Science
  • Editors' Picks
  • Technology

How to Create a Workflow in Apache Airflow to Track Disease Outbreaks in India

  • June 18, 2018
  • by Vinayak Mehta
What is the first thing that comes to your mind upon hearing the word ‘Airflow’? Data engineering, right? For good reason, I suppose. You are likely to find Airflow mentioned in every other blog post that talks about data engineering. Apache Airflow is a workflow management platform. To oversimplify, you…
  • 5 minute read
  • Inside SocialCops
  • Team & Culture

A Day in the Life of a SocialCops Data Analyst

  • November 17, 2015
  • by Lilianna Bagnoli
I did not have a traditional college experience. While studying at Grinnell College, I spent a year living in Pune, Delhi and Mumbai, where I completed two internships, learned Hindi, traveled the country, and studied Economics at St. Stephen’s College. This year inspired me to leave my home in Kentucky…
  • 4 minute read
  • Engineering
  • Technology

PDF Is Evil: Extracting Tabular Data From PDFs

  • January 22, 2015
  • by Vaishak Salin
Update: As this post explains, getting data out of PDFs is a nightmare, even with tools like PDFTables and Tabula. To solve this problem, we created and released Camelot, an open-source Python library and command-line tool that makes it easy for anyone to extract data tables trapped inside PDF files. Read…
About us

We are a data intelligence company on a mission to tackle the world’s most critical problems with data.
