Anybody in? An Open Data Project

Details
Maria Kondili will give a presentation about an ongoing Open Data project that is looking for ideas and contributors to move to the next level.
Many cities in Germany provide a website with data and geolocation for properties that are uninhabited or demolished. These properties belong either to individuals or to the state. Main page: https://www.leerstandsmelder.de
As part of an Open Data project with a committed team, we can extract all the available data, as HTML files and as one large JSON file, for every German city where data is available. Python is the preferred language, but knowledge of JavaScript is also useful, since the updated web page is written in JavaScript and contains a map showing each vacant space. The information can be extracted with BeautifulSoup (bs4), Scrapy, HTMLParser, or any other web scraper you find handy.
The challenge is that text data embedded in CSS or PHP needs to be extracted in addition to the HTML. Preliminary work started with the Frankfurt data, downloaded as JSON (https://github.com/mariakondili/LEERSTAND_FRA). That work used the BeautifulSoup (bs4) package, but the same extraction code no longer works on the updated web page because the HTML structure has changed. For each city, the important information about each registered vacant building (address, type of usage, owner) sits inside different HTML tabs.
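To give a feel for the extraction step, here is a minimal sketch using Python's built-in HTMLParser. The markup, the class names (entry, address, usage, owner) and the sample addresses are invented for this example; the real Leerstandsmelder pages are structured differently and would need their own rules.

```python
import json
from html.parser import HTMLParser

# Hypothetical markup, invented for illustration only.
SAMPLE_HTML = """
<div class="entry">
  <span class="address">Beispielstrasse 1, Frankfurt</span>
  <span class="usage">residential</span>
  <span class="owner">private</span>
</div>
<div class="entry">
  <span class="address">Musterweg 5, Frankfurt</span>
  <span class="usage">commercial</span>
  <span class="owner">state</span>
</div>
"""

# Assumed field names; the real site may label these differently.
FIELDS = {"address", "usage", "owner"}


class EntryParser(HTMLParser):
    """Collect one dict per <div class="entry"> block."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._current = None  # entry currently being filled
        self._field = None    # name of the field whose <span> is open

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "entry":
            self._current = {}
        elif tag == "span" and css_class in FIELDS and self._current is not None:
            self._field = css_class

    def handle_data(self, data):
        # Accumulate text, since the parser may deliver it in pieces.
        if self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            self._current[self._field] = self._current.get(self._field, "").strip()
            self._field = None
        elif tag == "div" and self._current is not None:
            self.entries.append(self._current)
            self._current = None


def extract_entries(html):
    """Return a list of dicts, one per vacant building found in the HTML."""
    parser = EntryParser()
    parser.feed(html)
    parser.close()
    return parser.entries


entries = extract_entries(SAMPLE_HTML)
print(json.dumps(entries, indent=2, ensure_ascii=False))
```

The same idea carries over to BeautifulSoup or Scrapy; only the way elements are selected changes.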
The goal is to automate the extraction for every city, since the web pages share the same format. The suggested next step is to reorganise the information in a new web page, using e.g. Django, and to use Pandas to show statistics on the collected data. The data can then be presented in a machine-readable format and saved as easily downloadable files (tables, images). Many communities could benefit from the newly formatted website, but priority will be given to the needs of homeless people and refugees.
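As an idea of what the statistics step could look like, here is a small Pandas sketch. The records and column names (city, usage, owner) are made up for the example and are not taken from the real data set.

```python
import pandas as pd

# Invented sample records; real data would come from the extraction step.
records = [
    {"city": "Frankfurt", "usage": "residential", "owner": "private"},
    {"city": "Frankfurt", "usage": "commercial", "owner": "state"},
    {"city": "Berlin", "usage": "residential", "owner": "private"},
]

df = pd.DataFrame(records)

# Number of registered vacancies per city.
per_city = df.groupby("city").size()

# Vacancies per city, broken down by type of usage.
usage_by_city = pd.crosstab(df["city"], df["usage"])

# Machine-readable export; passing a file path instead would write a CSV file.
csv_text = df.to_csv(index=False)

print(per_city)
print(usage_by_city)
```

Tables like these could then be offered for download or rendered in the new web page.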
Lightning Talks
Usually there will also be the opportunity to give lightning talks (http://en.wikipedia.org/wiki/Lightning_talk). Please consider giving one and register yours on this EtherPad (https://pad.freitagsrunde.org/qrJelFzDyO)!
Restaurant
At about 8:45 pm we’ll move to a nearby restaurant.
