Tom Meagher bio photo

Tom Meagher

News nerd.

Twitter LinkedIn Github

More tips for using OpenRefine

At IRE’s computer-assisted reporting conference this week in Louisville, I am once again teaching a course on using OpenRefine to clean data. Although the program has changed names over the last few months, its features are pretty much the same, and it’s still an amazingly powerful and free tool for cleaning and standardizing difficult data. If you ever find yourself frustrated with typos, misspellings or any number of other mistakes in the data you get from government agencies, OpenRefine will change your life.

I won’t repeat verbatim my Refine tutorial, which you can find here, but I will share some of the resources again and point you to a few other tutorials.

First, check out the updated slides for my class. If you’d like to follow along with the class, you can download our data for the lottery winners, the hospital report cards, and the campaign finance data.

If you’re just looking for a good cheat sheet of Refine’s key functions, try this.

If you thought my class was easy, and you’re looking for more help with Refine, I can recommend these tutorials, from Refine creator David Huynh, developer Dan Nguyen and journalism educator Paul Bradshaw.

What’s your favorite use for OpenRefine? Leave a comment here, drop me an email or mention me on Twitter. I’d love to hear what you think.