Web scraping & relational data models with R
Welcome back after the summer break! Join us at Alte Börse for the September edition of the RUG meetup, sponsored by Wüest Partner / Datahouse.
Schedule:
06.15 pm Doors open
06.30 pm Introduction / Welcome
06.40 pm Talks (see below)
07.40 pm Are you hiring? Looking for an R-related job? Or organising an interesting event? Take the stage for one minute. (DM us if you need a slot!)
ca. 07.45-09.00 pm Apéro
Relational data models with the dm package
Kirill Müller (Cynkra)
Kirill Müller will present {dm}, a new package (https://krlmlr.github.io/dm) that facilitates working with multiple tables. He will explain the motivation for using multiple tables in the first place, outline the features of the package, and discuss future development. The presentation will be interactive, with live coding and a script that attendees can run during or after the presentation.
Tracking Real Estate Project Web Sites with R
Thomas Maier (Datahouse AG)
Hundreds of real estate project web sites list newly available condominiums and rental flats. This data, however, is not available in an aggregated and chronological format. Our aim was to develop a tool to observe these projects and regularly store their data.
We set up a tool consisting of a front-end to register projects and analyze data, and a back-end with an intelligent background crawling service.
The user interface is powered by R Shiny and connected to a Postgres database. Crawling is handled by an automated R daemon via a Splash server. The project is deployed with Docker and ShinyProxy.
The resulting tool enables efficient collection and observation of relevant projects, automates the parsing of target web sites to a high degree, and builds up an aggregated, chronological database of newly available condominiums and rental flats.

