We are a community group of journalists and others who work in the media, or are planning to in the future. Together, we aim to learn and share ✨ technical skills for use in our reporting in a supportive environment.

That might be 🔎 data analysis as part of an investigation, 💻 scraping data from government websites, 📉 building data visualisations to better tell a story, or something else entirely.

We meet monthly to learn and network. All sessions are appropriate for complete beginners and above.

Automatically scraping the web with GitHub Actions

This month we'll be learning how to automatically scrape data from the web using GitHub Actions.

The internet is full of data that changes over time, and tracking these changes can sometimes be more interesting than looking at the static data. For example, the @nyt_diff Twitter account tracks changes made to New York Times headlines, giving us an insight into their editorial process.

GitHub is a site for sharing and collaborating on code, and keeping track of how it changes. A few years ago they launched Actions, which lets you run that code on their servers automatically. It's remarkably simple to set up, and for public code repositories, it's free. It's especially powerful for web scraping, as it means we can easily run a scraper on a schedule and then store the output data alongside the code that did the scraping.

We'll look at how to approach extracting data from a website, set up a simple automatic scraper using GitHub Actions, and then do some basic analysis on the output, including how to identify changes over time.
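
To give a flavour of the "changes over time" step, here is a minimal sketch in Python that compares two saved copies of a scraped page and lists the lines that were added or removed. It only uses Python's built-in difflib module. The function name and the sample headlines are made up for illustration, and the workshop itself may well use different tools.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return lines removed ("- ") or added ("+ ") between two snapshots.

    Hypothetical helper for illustration only.
    """
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? ");
    # keep only the real additions and removals.
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Made-up snapshots standing in for two runs of a scraper.
    previous = "Headline A\nStory about cats\n"
    latest = "Headline B\nStory about cats\n"
    for line in changed_lines(previous, latest):
        print(line)
```

Because a scheduled scraper saves each new copy alongside the code, the "previous" and "latest" text in a real project would come from two versions of the same output file rather than from hard-coded strings.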

All of our events are suitable for beginners, and no programming experience is required. Bring a laptop along, as this is a practical, hands-on workshop. Before you arrive, please also sign up for a GitHub account if you don't already have one.

Before the event, check out the shared doc we'll be using, and sign up for a Dropbox account if you don't already have one so you can edit it. Then please add links to the show and tell section! These can be great data stories you've seen, new tools, jobs you’re hiring for, announcements -- anything others might be interested in. We'll check out what everyone has shared at the start of the event.

7:00 🚪 Doors open
7:30 🗣 Show and tell
7:40 💻 Tutorial
9:00 🍺 Drinks at the George

