What we're about

The focus of this Meetup group is to provide free monthly community data events, leading towards the London Data Science Festival.

What is the London Data Science Festival?

Data Science Festival - Monday, November 2nd to Saturday, November 7th, 2020.

The Data Science Festival LIVE is a free, week-long celebration of all things data science. The festival consists of lectures, workshops, demos, code sprints, panel discussions and social events spread across London, culminating in a day-long MainStage event. From Monday to Thursday, events take place at a different venue each evening and feature a range of speakers. Friday night is a Data Science Networking event, and Saturday is our MainStage conference from 9 AM to 6 PM.

About us

The Data Science Festival is a global data community. We aim to connect the data world and foster the sharing of knowledge, inspiration and ideas.

The global network is dedicated to free education through grassroots technical events. We cover the latest topics that matter most to data scientists and data engineers. There will be no demos of material you could learn from a book or video. Instead, our speakers discuss real-world problems: what works, what doesn’t, and why they’ve implemented the solutions they have. The sessions aim to generate lively discussion and debate, offering real-world takeaways to help you in your job.

Who is the Data Science Festival for?

• Data engineers, analysts, scientists, and other practitioners

• Academics, founders, researchers, authors

• R, Python and other software engineers who work with data or want to learn

• Data visualisation developers and designers

• Non-technical team leads, executives, and other decision makers from data-centric startups and large companies looking to utilise open source tools

How can I get involved?

Visit the DSF site to get involved! (http://www.datasciencefestival.com/)

We are actively looking for community-minded individuals to help build and grow our group. Please feel free to get in touch if you would like to:

• Host an event

• Sponsor an event

• Present a session

• Volunteer to help organise the festival

Upcoming events (5)

Lunch & Learn - Streamlining feature engineering pipelines with Feature-engine

Welcome to the Data Science Festival. A month of free virtual events. 20K community members. Unlimited access. November 2020. #DSFthegreatIndoors

In this Lunch & Learn session on November 2nd, we will discuss streamlining feature engineering pipelines with Soledad Galli from Train In Data.

Ticket Allocation Process: Once you have registered through Heysummit, our new event platform, you will be sent your schedule via email, which you can add to your calendar. You will be sent a reminder the day before with your URL link to join the event. Don't forget you can create your own bespoke schedule throughout the whole month of November, adding any talks that catch your eye.

Sign up to this specific event: https://tickets.datasciencefestival.com/talks/scala-for-big-data-the-big-picture/
See ALL DSF 2020 talks: https://tickets.datasciencefestival.com/schedule/

SCHEDULE
1:00pm Intro
1:05pm Speaker: Soledad Galli
1:40pm Community Q&A
2:00pm Close

Talk Title: Streamlining feature engineering pipelines with Feature-engine

Talk Abstract: Machine learning models output predictions based on patterns learned from data. Before we can use the data to train a machine learning algorithm, we perform extensive transformations of the variables, which are commonly referred to as feature engineering. Feature engineering includes procedures to impute missing data, encode categorical variables, transform or discretise numerical variables, put features on the same scale, combine features into new variables, and extract information from dates, transaction data, time series, text and sometimes even images.

To use our models in production, we need to deploy both the machine learning models and the entire pipeline of data transformations and feature creation. We must also ensure that the deployed model is identical to the model developed in the research environment. Kludging together an ad hoc process for feature engineering is not efficient, debug-friendly or reproducible. Using well-established open source projects removes the task of coding from our hands, improving team performance while supporting reproducibility, and so reducing model research and deployment timelines.

Feature-engine is an open source Python library for feature engineering which smooths the building and deployment of feature engineering pipelines. Feature-engine supports multiple data transformation techniques, preserves fit() and transform() functionality, and can be used within a Scikit-learn pipeline, allowing organisations to build and deploy an entire machine learning pipeline by saving one object (.pkl).

In this talk, I will give a high-level overview of the main data transformations that we use in the industry, bring forward the challenges encountered while deploying machine learning pipelines, and highlight how Feature-engine can mitigate some of these challenges.

Speaker Bio: Soledad Galli is a Lead Data Scientist with 10+ years of experience in world-class academic institutions and renowned organisations. She has experience in finance and insurance, received a Data Science Leaders Award in 2018 and was selected as “LinkedIn’s voice” in data science and analytics in 2019. Sole has researched, developed and put into production machine learning models for Insurance Claims, Credit Risk Assessment and Fraud Prevention. Sole founded Train in Data with the idea of bringing practical knowledge of machine learning and AI software engineering to the community. She has created online courses on these topics which have enrolled about 20k students worldwide. Sole also created Feature-engine, an open source Python package to streamline feature engineering pipelines.

Please note the time zone when you book this event.
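For readers new to the idea in the abstract above, here is a minimal sketch of what "fit() and transform() steps chained into a pipeline and saved as one object" can look like. It is not taken from the talk and does not use Feature-engine's or Scikit-learn's actual API. The classes MeanImputer, SimpleScaler and Pipeline are hypothetical stand-ins written with the Python standard library only, and the data is made up.

```python
import pickle
from statistics import mean, pstdev


class MeanImputer:
    """Hypothetical transformer: replaces missing values (None) with the training mean."""

    def fit(self, values):
        # Learn the mean from the observed (non-missing) training values.
        self.mean_ = mean(v for v in values if v is not None)
        return self

    def transform(self, values):
        return [self.mean_ if v is None else v for v in values]


class SimpleScaler:
    """Hypothetical transformer: rescales values to zero mean and unit variance."""

    def fit(self, values):
        self.mean_ = mean(values)
        # Fall back to 1.0 if the training data is constant, to avoid dividing by zero.
        self.std_ = pstdev(values) or 1.0
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]


class Pipeline:
    """Hypothetical pipeline: applies each step's fit() and transform() in order."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, values):
        for step in self.steps:
            # Each step is fitted on the output of the previous step.
            values = step.fit(values).transform(values)
        return self

    def transform(self, values):
        for step in self.steps:
            values = step.transform(values)
        return values


# "Research environment": learn all parameters from training data.
train = [1.0, None, 3.0, 5.0]
pipeline = Pipeline([MeanImputer(), SimpleScaler()]).fit(train)

# Save the whole fitted pipeline as a single object (here, bytes in memory
# rather than a .pkl file on disk).
blob = pickle.dumps(pipeline)

# "Production": reload the object and apply the same transformations to new data.
restored = pickle.loads(blob)
new_data = [None, 4.0]
print(restored.transform(new_data))
```

Because the learned parameters travel with the object, the restored pipeline applies the same imputation and scaling that were fitted on the training data, which is the reproducibility point the abstract makes.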

DSF Keynote - Data Driven Approach to Creating Safer Communities

Welcome to the Data Science Festival. A month of free virtual events. 20K community members. Unlimited access. November 2020. #DSFthegreatIndoors

In this Keynote session on November 3rd, we will discuss a Data Driven Approach to Creating Safer Communities with Veronika Belokhvostova from Facebook.

Ticket Allocation Process: Once you have registered through Heysummit, our new event platform, you will be sent your schedule via email, which you can add to your calendar. You will be sent a reminder the day before with your URL link to join the event. Don't forget you can create your own bespoke schedule throughout the whole month of November, adding any talks that catch your eye.

Sign up to this specific event: https://tickets.datasciencefestival.com/talks/keynote/
See ALL DSF 2020 talks: https://tickets.datasciencefestival.com/schedule/

SCHEDULE
6:30pm Intro
6:35pm Speaker: Veronika Belokhvostova
7:10pm Community Q&A
7:30pm Close

Talk Title: Data Driven Approach to Creating Safer Communities

Talk Abstract: Ensuring the safety of its users and preventing abuse of the platform is paramount to Facebook. Facebook has more than tripled the size of its investments in Integrity over the last 4 years, ensuring that our policies, detection, and enforcement tools cover a broad range of account, content, and other violations. Each year Facebook takes down billions of posts and fake accounts, and warns millions of Facebook users about misinformation. In her presentation, Veronika will share insights from 4 years of managing Integrity teams at Facebook, working on issues ranging from Hate and Misinformation to Fake accounts. She will explain some of the nuances of the problems her teams work on and the critical role Data Science plays in informing and shaping Integrity work.

Speaker Bio: Veronika Belokhvostova is a Director of Data Science at Facebook. In this role she oversees Integrity and Social Impact teams in the US and Europe. Integrity teams focus on protecting more than 2 billion Facebook users from hate speech, terrorism propaganda, harassment, account compromise, election interference and other Community Standards violations through sophisticated algorithms and support tools. Social Impact teams build tools that enable millions of Facebook users to do good, such as volunteering and raising funds for non-profits.

Before Facebook, Veronika spent over a decade leading analytics efforts at other companies. She was the VP of Analytics at Hotwire (Expedia Inc), led an elite analytics and strategy team focused on projects sponsored by the company’s President at PayPal (eBay Inc.), and led litigation-focused analyses as part of the Deloitte Economic Consulting Group. Veronika graduated from Stanford with a BA with Honors in Economics and a minor in Computer Science. She holds an MBA from the Haas School of Business at UC Berkeley.

Please note the time zone when you book this event.

Author Interview: Brian Christian - The Alignment Problem: ML and Human Values

Welcome to the Data Science Festival. A month of free virtual events. 20K community members. Unlimited access. November 2020. #DSFthegreatIndoors

In this Author Interview session on November 4th, we will discuss the book "The Alignment Problem" with its author, Brian Christian.

Ticket Allocation Process: Once you have registered through Heysummit, our new event platform, you will be sent your schedule via email, which you can add to your calendar. You will be sent a reminder the day before with your URL link to join the event. Don't forget you can create your own bespoke schedule throughout the whole month of November, adding any talks that catch your eye.

Sign up to this specific event: https://tickets.datasciencefestival.com/talks/author-interview/
See ALL DSF 2020 talks: https://tickets.datasciencefestival.com/schedule/

SCHEDULE
6:30pm Intro
6:35pm Speaker: Brian Christian
7:05pm Community Q&A
7:30pm Close

Talk Title: Discussion of "The Alignment Problem" book by Brian Christian.

Summary: With the incredible growth of machine learning over recent years has come an increasing concern about whether ML systems' objectives truly capture their human designers' intent: the so-called "alignment problem." Over the last five years, these questions of both ethics and safety have moved from the margins of the field to become arguably its most central concerns. The result is something of a movement: a vibrant, multifaceted, interdisciplinary effort to address the alignment problem head on, which is producing some of the most exciting research happening today. Brian Christian, author of the acclaimed bestsellers The Most Human Human and Algorithms to Live By, will survey this landscape of recent progress and the frontier of open questions that remain.

Speaker Bio: Brian Christian is the author of The Most Human Human, which was named a Wall Street Journal bestseller, a New York Times Editors’ Choice, and a New Yorker favorite book of the year. He is the author, with Tom Griffiths, of Algorithms to Live By, a #1 Audible bestseller, Amazon best science book of the year and MIT Technology Review best book of the year. His third book, The Alignment Problem, is forthcoming in October of 2020.

Christian’s writing has been translated into nineteen languages, and has appeared in The New Yorker, The Atlantic, Wired, The Wall Street Journal, The Guardian, The Paris Review, and in scientific journals such as Cognitive Science. Christian has been featured on The Daily Show with Jon Stewart, Radiolab, and The Charlie Rose Show, and has lectured at Google, Facebook, Microsoft, the Santa Fe Institute, and the London School of Economics. His work has won several awards, including fellowships at Yaddo and the MacDowell Colony, publication in Best American Science & Nature Writing, and an award from the Academy of American Poets.

Born in Wilmington, Delaware, Christian holds degrees in philosophy, computer science, and poetry from Brown University and the University of Washington. A Visiting Scholar at the University of California, Berkeley, he lives in San Francisco.

Please note the time zone when you book this event.

DSF Panel - Tech + Academia: A match made for success.

Welcome to the Data Science Festival. A month of free virtual events. 20K community members. Unlimited access. November 2020. #DSFthegreatIndoors

In this Panel session on November 5th, we will discuss Tech + Academia: A match made for success.

Ticket Allocation Process: Once you have registered through Heysummit, our new event platform, you will be sent your schedule via email, which you can add to your calendar. You will be sent a reminder the day before with your URL link to join the event. Don't forget you can create your own bespoke schedule throughout the whole month of November, adding any talks that catch your eye.

Sign up to this specific event: https://tickets.datasciencefestival.com/talks/the-benefits-of-a-researched-lead-approach-in-a-commercial-ds-team/
See ALL DSF 2020 talks: https://tickets.datasciencefestival.com/schedule/

SCHEDULE
1:00pm Intro
1:05pm Lightning talks
1:45pm Panel Discussion and Community Q&A
2:30pm Close

Reda Kechouri: How ASOS’s Data Science teams adapted to the new COVID world.
Betty Schirrmeister: TBC
Speaker 3:
Speaker 4:

For speaker bios: https://tickets.datasciencefestival.com/talks/the-benefits-of-a-researched-lead-approach-in-a-commercial-ds-team/

Please note the time zone when you book this event.
