The focus of this Meetup group is to provide free monthly community data events, leading towards the London Data Science Festival.
What is the London Data Science Festival?
Data Science Festival Live - Monday, April 16th to Saturday, April 21st, 2018.
The Data Science Festival LIVE is a free, week-long celebration of all things data science. The festival consists of lectures, workshops, demos, code sprints, panel discussions and social events spread across London, culminating in a day-long MainStage event. Monday to Thursday evenings will each be held at a different venue and feature a range of speakers. Friday night will be a Data Science Networking event, and Saturday is our MainStage conference from 9 AM to 6 PM.
Who is this meetup and the Data Science Festival for?
• Data engineers, analysts, scientists, and other practitioners
• R, Python and other software engineers who work with data or want to learn
• Data visualisation developers and designers
• Non-technical team leads, executives, and other decision makers from data-centric start-ups and large companies looking to utilise open source tools
Join Data Science Festival - London in partnership with Secret Escapes. On June 11th, we will be featuring 6 new and upcoming companies at our Start-up Showcase. Come and hear how these new companies use data science to solve real-world problems, the issues their teams have encountered, and the mistakes and successes to look out for when you start your own projects.
Due to the popularity of Data Science Festival events, we are now allocating event tickets via a random ballot. Registering here enters you into the ticket ballot for the Data Science Festival event at Secret Escapes on June 11th 2019. The ballot will be drawn on June 4th 2019. Those randomly selected will then be emailed a Universe ticket for the event, with the joining details.
If you are allocated a Universe ticket, please bring a paper copy of your ticket, or the ticket on your phone, so that you can check in with your QR code. Tickets are non-transferable.
PLEASE NOTE REGISTERING ON MEETUP DOES NOT GUARANTEE YOU ENTRY TO THIS EVENT.
Please click here to apply for a ticket: https://www.datasciencefestival.com/event/dsf-startup-showcase-with-secret-escapes/
6:00pm - Doors open
[masked]:15pm talk - 3 Short Sharp Lightning Talks
7:15-7:45pm - Refreshments
7:45-8:30pm talk - 3 Short Sharp Lightning Talks
8:[masked]pm - Close
Talk 1 - Secret Escapes' challenges scaling Airflow to run hundreds of dynamically generated DAGs
The jobs in our data pipeline are either self-describing or built dynamically from config files. As Airflow DAGs are simple Python objects, we thought an elegant solution that is loosely coupled with Airflow would be to generate DAGs dynamically from our job config files and metadata. We now have a couple of hundred jobs running with this implementation, and we are seeing performance issues where the scheduler is slow to assign new work. We anticipate hundreds, if not thousands, of jobs running in production when we reach maturity, so answering the high-risk questions "Can Airflow scale?" and "Can Airflow scale micro-batch pipelines with jobs running every X mins?" is a priority for us. We are therefore taking an experiment-driven approach to these questions, and we would like to share that journey and our findings with the community.
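For readers who have not seen the pattern, here is a minimal sketch of generating one DAG per job config entry. It is not Secret Escapes' implementation: `JOB_CONFIGS`, `JobSpec`, `parse_job`, `register_dags` and `describe` are hypothetical names, the config fields are invented for illustration, and a plain dictionary stands in for the Airflow DAG object so that the sketch runs without Airflow installed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical job metadata; in practice this might be loaded from config files.
JOB_CONFIGS: List[Dict[str, Any]] = [
    {"name": "orders_load", "schedule": "*/15 * * * *",
     "tasks": ["extract", "load"]},
    {"name": "users_load", "schedule": "@hourly",
     "tasks": ["extract", "transform", "load"]},
]


@dataclass(frozen=True)
class JobSpec:
    """Validated, immutable description of one job (hypothetical type)."""
    dag_id: str
    schedule: str
    tasks: Tuple[str, ...]


def parse_job(config: Dict[str, Any]) -> JobSpec:
    """Validate one config entry and turn it into a JobSpec."""
    if not config.get("tasks"):
        raise ValueError(f"job {config.get('name')!r} has no tasks")
    return JobSpec(
        dag_id=f"job_{config['name']}",
        schedule=config["schedule"],
        tasks=tuple(config["tasks"]),
    )


def register_dags(
    configs: List[Dict[str, Any]],
    build_dag: Callable[[JobSpec], Any],
    namespace: Dict[str, Any],
) -> List[str]:
    """Build one DAG-like object per config and bind it to a name in namespace."""
    registered = []
    for config in configs:
        spec = parse_job(config)
        if spec.dag_id in namespace:
            raise ValueError(f"duplicate dag_id: {spec.dag_id}")
        namespace[spec.dag_id] = build_dag(spec)
        registered.append(spec.dag_id)
    return registered


def describe(spec: JobSpec) -> Dict[str, Any]:
    """Stand-in builder: chains the tasks linearly and returns a plain dict."""
    edges = list(zip(spec.tasks, spec.tasks[1:]))
    return {"dag_id": spec.dag_id, "schedule": spec.schedule, "edges": edges}


dags: Dict[str, Any] = {}
dag_ids = register_dags(JOB_CONFIGS, describe, dags)
```

In a real Airflow DAG file, the builder would construct an Airflow DAG and its operators instead of a dictionary, and `namespace` would typically be `globals()`, since Airflow picks up DAG objects bound at module level. A loop like this runs again each time the file is parsed, which is one plausible place to look when the number of generated DAGs grows.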