Scraping with Python
in the Newsroom

Day Two - GIJC 2015, Lillehammer

By Tom Meagher / @ultracasual
and Adriana Homolova / @naberacka

The goal

To start thinking about
how to break problems down
into the smallest tasks
that can be programmed.

So far, we've gone from this...

...to this.

Hour Three

Now, we want to expand our code to scrape a similar, but bigger page.
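
As a rough illustration of what "similar, but bigger" can mean in practice, here is a minimal sketch that reads every row of a table instead of just one and turns the result into CSV text. It is not the workshop's own script: the sample HTML, the names TableRowParser, scrape_rows and rows_to_csv, and the choice of the standard library's html.parser are all assumptions made for this example, so that it runs without a network connection or extra installs.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_rows(html):
    """Return a list of rows, one list of cell strings per <tr>."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up stand-in for a real page; a bigger page just has more <tr> rows.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>City</th></tr>
  <tr><td>Ada</td><td>Oslo</td></tr>
  <tr><td>Bo</td><td>Lillehammer</td></tr>
</table>
"""

if __name__ == "__main__":
    print(rows_to_csv(scrape_rows(SAMPLE_HTML)))
```

The point of the sketch is the breakdown into small tasks: find a row, find its cells, clean the text, save the row. Once those steps work for one row, a bigger page only adds more trips through the same loop.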

We probably won't have time to get to these.
But if you want to keep working,
try the extra, extra credit project here.

And you can find the working scripts in the completed directory.

Keep learning

Excellent post on the ethics of scraping

"Web Scraping With Python"

Python Journos

More resources for learning Python

NICAR-L

Source

GitHub, Stack Overflow, Google

--30--

Clone the source code


Email me or ping me on Twitter