Tools to scrape data from a website

This is another one of those notes-to-self for later, and perhaps to inspire others to try. Putting the log back into blog. While I’d love to learn enough Python or what-have-you to scrape the data from a website, the following tools got the job done.

  • Import.io to do the heavy lifting of scraping. The best option I found in an exhaustive half hour of searching and testing.
  • OpenRefine to split columns where I wanted, though that’s only a part of its power.
  • A spreadsheet, used as a crowbar, to make sure the data was in the right columns. OpenRefine is probably the right tool for that too, but good ol’ LibreOffice Calc got the job done.
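
For anyone who does take the Python route, the column-splitting step could be sketched roughly as below. This is not what the tools above do internally; it is a minimal illustration using only the standard library, and the sample data, the column names, and the `split_column` helper are all made up for the example.

```python
import csv
import io


def split_column(rows, column, separator, new_names):
    """Split one column of each row (a dict) into several new columns."""
    out = []
    for row in rows:
        # Split at most len(new_names) - 1 times so extra separators stay in the last part.
        parts = [p.strip() for p in row[column].split(separator, len(new_names) - 1)]
        # Pad short rows so every new column exists, even if empty.
        parts += [""] * (len(new_names) - len(parts))
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update(zip(new_names, parts))
        out.append(new_row)
    return out


# Hypothetical sample standing in for a scraped export.
sample = 'name,location\nFirst Church,"Springfield, IL"\nSecond Church,Dover\n'
rows = list(csv.DictReader(io.StringIO(sample)))

result = split_column(rows, "location", ",", ["city", "state"])
for row in result:
    print(row)
```

Rows with no separator keep their value in the first new column and get an empty second one, which makes the misaligned rows easy to spot afterwards.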

And pen and index cards, to note what I did, so that the next time I scrape data from a site I’ll do a better job.

Published by

Scott Wells

Scott Wells, 46, is a Universalist Christian minister doing Universalist theology and church administration hacks in Washington, D.C.
