- Interactive Analytics in the Cloud with Presto and Alluxio
Location: Galvanize, 44 Tehama St, San Francisco, CA 94105 Data Council is promoting this meetup, organized by the Alluxio Meetup (https://www.meetup.com/Alluxio) and Silicon Valley Cloud Computing (https://www.meetup.com/cloudcomputing/). Agenda: 6:00pm: Happy Hour and networking 6:30pm: Building Fast SQL Analytics on Anything with Presto, Alluxio 7:10pm: Building Cloud-native Analytical Pipelines on AWS 7:30pm: Into the Cloud: Twitter's Presto Journey to GCP 8:00pm: Q&A & Mingle Talk 1: Building Fast SQL Analytics on Anything with Presto, Alluxio This talk describes a stack that combines Presto, Alluxio, and cloud object storage systems (e.g., AWS S3) for high-concurrency, low-latency SQL queries on big data in the cloud. Presto, an open-source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Alluxio is an open-source data orchestration system that brings data closer to compute and provides a unified data access layer at in-memory speeds. Presto can use Alluxio as a distributed caching tier on top of S3 for hot data, avoiding repeated reads from the cloud. This talk will cover: - the architecture of Presto, its separation of compute and storage, its cloud-readiness, and recent advancements in the project such as the Cost-Based Optimizer and Kubernetes support; - an overview of Alluxio’s key concepts, architecture, and data flow; - Presto and Alluxio production use cases running on hundreds of nodes, including ING Bank, JD.com, and NetEase Games. Bio: Kamil Bajda-Pawlikowski, CTO, Starburst Kamil is a technology leader in the large-scale data warehousing and analytics space. He is CTO of Starburst, the enterprise Presto company. Prior to co-founding Starburst, Kamil was the Chief Architect at the Teradata Center for Hadoop in Boston, focusing on the open source SQL engine Presto.
Previously, he was the co-founder and chief software architect of Hadapt, the first SQL-on-Hadoop company, acquired by Teradata in 2014. Bin Fan, founding engineer and VP of Community, Alluxio Bin Fan is a founding member of Alluxio, Inc. and a PMC maintainer of the Alluxio open source project. Prior to Alluxio, he worked for Google. Bin received his Ph.D. in CS from CMU. Talk 2: Building Cloud-native Analytical Pipelines on AWS With the ease and flexibility that the cloud brings, many data platform teams are building their data pipelines on AWS, leveraging many of the services it provides. For frameworks like Apache Spark and Hive, Amazon EMR, which includes the Hadoop stack, greatly simplifies and speeds up the installation and configuration of clusters. Amazon S3 also provides a cost-effective and easy way to store large amounts of data. However, data engineers still see challenges with workloads that are latency sensitive, need data sharing across pipelines, or need constant synchronization with S3. In this talk, Irene will share her experience building data pipelines on AWS and how Alluxio, a data orchestration layer, can greatly ease these challenges while eliminating problems caused by S3 throttling or slowdowns. Bio: Irene Cai is a software engineer at Google, working on the Google Brain team on TensorFlow and TFX Fleetwide metrics. She previously worked at Amazon for four years, where she worked on big data pipelines and applications that process hundreds of TBs of data daily. Talk 3: Into the Cloud: Twitter's Presto Journey to GCP Hao from Twitter will share Twitter's cloud journey from performance requirements to authentication and authorization. Hao Luo is a Sr. Software Engineer focusing on interactive query and real time computing @ Twitter.
- Building fast and scalable big data and ML platforms at Pinterest and JD.com
Location: 1825 South Grant St, San Mateo, CA 94402 Data Council is co-hosting this meetup with the Alluxio Meetup (www.meetup.com/Alluxio). We are excited to co-host with them for this event. Also make sure to check out the Presto Summit this month. https://www.starburstdata.com/technical-blog/presto-summit-2019/ Agenda: 6:00pm: Happy Hour and networking 6:30pm: Tao Huang from JD.com will share “Building ad hoc and real-time data platform using Presto + Alluxio in JD” 7:00pm: Yongsheng Wu from Pinterest will share “Big data and Machine Learning at Pinterest” 7:30pm: Calvin Jia from Alluxio will share “Scalable File System Metadata Services on RocksDB, gRPC and etc: A Story from Alluxio” 8:00pm: Q&A & Mingle Share this! Twitter: http://bit.ly/2WKIRzH LinkedIn: bit.ly/2Wg6b4j #1: How we accelerated queries by 10x using Alluxio and Presto for ad hoc and real-time stream computing at JD. As the world’s 2nd largest e-retailer, JD.com has built its Big Data Platform (BDP) to run more than 400K jobs daily, with more than 15K cluster nodes and a total capacity of 210 PB. For our BDP, we have been running Presto on Alluxio for ad hoc and real-time stream computing on more than 400 machines for 2+ years in production, and we have seen a 10x performance gain. In this talk I will share our BDP’s requirements, the challenges we have encountered, and how we leveraged JDPresto and Alluxio to solve those challenges. At JD, we leverage Alluxio’s HDFS-compatible API and use Alluxio to connect various frameworks, including JDPresto, Spark, and Hive. Bio: Tao Huang is a big data platform development engineer at JD.com, where he is mainly engaged in the development and maintenance of the company’s big data platform, using open source projects such as Hadoop, Spark and Alluxio.
#2: Big data and Machine Learning at Pinterest Yongsheng, the head of big data and ML platform at Pinterest, will share his journey to build a fast and scalable big data and ML platform on AWS for Pinterest to handle the requests and complexity in data at scale. In this talk, he will cover the requirements of the platform, the challenges encountered, the technologies chosen, and the tradeoffs that were made. Bio: Yongsheng Wu was one of the early engineers at Pinterest and was instrumental in making it possible for Pinterest to scale from 10M to 300M MAUs. Yongsheng leads a team of 70+ engineers working on core online infrastructure, big data platform and ML platform to enable Pinterest to quickly innovate on its product and scale its revenue generation capability. Prior to Pinterest, Yongsheng worked at Twitter, Salesforce, and Oracle. #3: Scalable File System Metadata Services on RocksDB, gRPC: A Story from Alluxio Alluxio is an open-source distributed virtual file system that provides a single namespace federating multiple external distributed storage systems. Therefore, it is critical for Alluxio to be able to store and serve the metadata of all files and directories from all mounted external storage systems both at scale and at speed. This talk shares our design, implementation, and optimization of the Alluxio metadata service to address the scalability challenges, focusing on how to apply and combine techniques including tiered metadata storage (based on the off-heap KV store RocksDB), a fine-grained file system inode tree locking scheme, an embedded replicated state machine (based on Raft), exploration and performance tuning to choose the right RPC framework (Thrift vs. gRPC), and more. As a result of combining these techniques, Alluxio 2.0 is able to store at least 1 billion files with a significantly reduced memory requirement, serving 3000 workers and 30K clients concurrently. Bio: Calvin Jia is the top contributor of the Alluxio project.
He has been involved as a core maintainer and release manager since the early days when the project was known as Tachyon. Calvin has a B.S. from UC Berkeley.
- Data Council San Francisco 2019
Data Council (https://www.datacouncil.ai) is coming to San Francisco. Will you join us? The main event was born out of a meetup group similar to this one, and we're excited to have become a cornerstone of the growing data community on Meetup. What you will get out of Data Council SF 2019 (https://www.datacouncil.ai/san-francisco-2019): - 2 days & 50+ insightful talks by leading data scientists and engineers from top companies like Facebook, Salesforce, IBM, Netflix, Google, WeWork, Lyft, Stitch Fix, Datadog, Segment, Datacoral, Stanford University and many more. - 6 unique tracks: Data Platforms & Pipelines, Databases & Tools, Data Analytics, Machine & Deep Learning, and our all-new tracks: Hero Engineering and AI Products. - All-new content including our brand new Founders Panel of top founders in the data space. - Extensive networking opportunities at the conference, or connect with speakers & attendees at our Wednesday night after-party between conference days. - Small group Speaker Office Hours following each talk with an opportunity to dive deeper into the subject matter 1:1 with the speaker. - Attendees that are highly technical data scientists, engineers, analysts & technical founders from top tech, media, and finance companies around the SF area. - Connect with our great partner companies at Sponsor Spotlight to discover their available data jobs and latest product developments. This year Data Council San Francisco ’19 takes place on April 17 & 18. Because you are members of this meetup group and our community, I wanted to extend you a sweet deal: tickets for $100 less than our lowest early bird pricing. To redeem your $100 discount, go to https://www.datacouncil.ai/san-francisco-2019 and use coupon code 100offeb.
Why should you join this year? If you believe in Quality Content > $ and would like to learn from companies like Facebook, Apache Foundation, Google, Netflix, Salesforce, Spotify, WeWork, Beeswax, Stitch Data, Capital One, Airbnb, Datadog, Lyft, Segment, Starburst, Datacoral, Columbia University, Uber, TapRecruit, Figure Eight, Dia&Co and many more, along with many awesome speakers, you should join! Cheers, -Pete
- Improv for Engineers - Become a Stronger Storyteller
You might not have realized it yet, but you can accelerate your career as an engineer or data scientist by learning to communicate more effectively and becoming a better storyteller. This especially applies when it comes to explaining the story behind a technical problem you solved - explaining the context, or *how* the problem came up in the business and *why* we're trying to solve it, is important. *What* you did and the *business impact* of a successful project are also key ingredients in communicating to stakeholders. This is an on-your-feet workshop. You will play games, do fun storytelling exercises and learn tools to help you connect to your audience. This class is for all levels of engineers who are looking to up their professional communication game. *Note: to expedite check-in, register here: https://www.eventbrite.com/e/improv-for-technologists-you-are-a-storyteller-tickets-59113468977 Why Improv? Forbes, the New York Times, Harvard Business Review and many other publications have written about how improv training is amazing business and startup training. How does improv help technology workers? Improv helps people: - Listen and be present - Be better communicators - Tap into creativity - Relieve anxiety/stress and release tension - Celebrate making mistakes and be more resilient - Gracefully deal with uncertainty - Make your partners look good and collaborate on teams more effectively - YES AND in addition to being useful, improv is fun! ***Remember, register here: https://www.eventbrite.com/e/improv-for-technologists-you-are-a-storyteller-tickets-59113468977 What Takeaways to Expect: - Develop deeper listening skills - Hone your audience awareness - Get in touch with your natural presence - Embrace body language that makes you open and flexible to ideas instead of closed off. - Be more comfortable going off-script (i.e. improvising) during presentations, in Q&A, and in life.
- Take the focus off yourself and serve your audience by “being interested, instead of being interesting” - Use “yes and” thinking to collaborate more creatively and effectively with your team - Take care of your teammates and notice when team members feel excluded or afraid to contribute their thoughts. Sometimes, great ideas go unspoken. Who Should Attend: - Data Scientists and Engineers who want to improve their communication and listening skills and be more compelling. - Industry professionals and bootcamp students who want to feel more relaxed and confident in job interviews. - Founders and executives leading teams - Product Managers - Designers - Creative Directors, Copywriters, and Marketers who want a creativity boost and new storytelling tools - Teachers, therapists, yoga instructors, health and wellness professionals - All kinds of people from all walks of life! Class Agenda: - 6:30 pm - 7 pm: drinks and snacks (sponsored by Galvanize) - 7 pm - 8 pm: joyful storytelling games to warm up spontaneity, ignite our imaginations, practice being kind to ourselves and practice celebrating mistakes - 8 pm - 8:15 pm: bathroom break - 8:15 pm - 9:15 pm: Additional storytelling games / exercises - 9:15 pm - 10:15 pm: Integration. Here’s where we put it all together and do group scene games, mock job interviews, improvised presentations, etc. in front of the class. Donations Encouraged: Donations of $30+ per person for this three-hour workshop are encouraged. This workshop usually costs $60+ per person. Donation Methods Accepted: - Donorbox: https://donorbox.org/you-are-a-storyteller-sf - Venmo: @jaredpolivka - Cash About the Teacher: Jared Polivka Jared previously served thousands of students around the world as Director of the School of AI at Udacity.com. With a background in product, design thinking, developer evangelism, storytelling and improvisation, Jared has found that improv skills are incredibly useful in working at organizations (i.e.
startups) dealing with uncertainty and change.
- Building Beyond MVP Data Infrastructure
Talk #1: Architecting Cloud Native Apache Airflow Apache Airflow is the most popular and effective open-source tool for managing workflows in Python and is used by startups and the Fortune 100 alike. But operationalizing this at scale for a growing team is easier said than done when questions around security, resource monitoring, system tolerance, testing, and deployment still linger. This talk will cover the associated stack necessary to run Airflow in a cloud native environment. Topics will include orchestration with Kubernetes, logging with Elasticsearch, monitoring with Prometheus and Grafana, service token creation and integration into CI services, and role-based authentication. Presenter: Greg Neiheisel - CTO & Co-Founder, Astronomer Greg started his career building apps for Great American Insurance Group before leaving to become a partner with Differential Dev Group, helping them to become one of the earliest adopters of Meteor. He left in 2015 to help launch Astronomer, The Airflow Company, and has been CTO ever since. Greg works in a mix of Node, Python, and Go and is an expert in Docker, Kubernetes, and, of course, Airflow. --- Talk #2: Monitoring the Data Lake: Detecting Problems in Data Pipelines The fundamental problem solved by the data engineer is to ensure that the data pipeline is working. They must answer questions like: Are data flows operating normally? Do my data tables contain the correct results? Are data apps able to access the data quickly? This talk will focus on best practices for monitoring data flowing through a data lake architecture. Topics will include performance monitoring, data quality monitoring, and end-user monitoring. We’ll also cover the metrics you need, and how to acquire those metrics. Presenter: Paul Lappas - CEO & Co-Founder, intermix.io Paul is the CEO and Co-Founder of intermix.io. Intermix.io is a single dashboard that lets data engineers monitor their mission-critical data flows.
Paul holds multiple patents for cloud computing and performance analytics. ----- Recording: The event will be recorded and distributed afterwards with copies of the slides. Depending on availability, a livestream may be available during the event itself for those registered. There will be an opportunity for up to two lightning talks of 5-10 minutes in length. If interested, please submit your topic to the event organizers.
- DataEngConf SF '18 (paid event w/ discount)
Join us for DataEngConf SF (http://dataengconf.com) - our own community-oriented technical conference that bridges the gap between data scientists, engineers & analysts - in San Francisco April 17 - 18, 2018. Members of our meetup can receive a 20% discount on tickets with code SFDE20 at http://dataengconf.com/tickets This year, our largest data event yet includes: • 2 days of insightful talks by 50+ leading data scientists and engineers from top teams at Lyft, Instacart, Facebook, Databricks, Stanford University, New Relic, WeWork, Pivotal, Citus Data, Clover Health, CoreOS & many more • Brand-new Data-oriented Startups Track (http://dataengconf.com/startups) featuring the technical stories of 20+ hand-picked startups doing novel things with data • Career networking opportunities at the conference, plus connect with speakers & attendees at our Tuesday night data community party between conference days • Small group Office Hours with speakers, and separate tracks that focus exclusively on data engineering and data science/analytics • Meet other attendees that are highly technical data scientists, engineers & analysts from top tech, media and finance companies around the Bay Area • Connect with companies at Sponsor Spotlight to discover their open data jobs and latest data tools Schedule: • Tue, Apr. 17: Day 1, opening keynote, track talks, Sponsor Spotlight gallery & conference after-party • Wed, Apr. 18: Day 2, opening & closing keynotes, track talks & Sponsor Spotlight gallery For full event information and to reserve your seat now visit: http://dataengconf.com
- AI in Recruiting & HR - Donut.ai, Mya, Spoke & Zenefits - Panel Discussion
Join us at New Relic on Thursday, April 12th, for a panel discussion about how AI is being used in recruiting and HR. Doors open at 6pm for networking, pizza, and soda. The panel, hosted by Rosalie Bartlett (Director of Community, H2O.ai), will start at 6:30pm and wrap up at 7:30pm, with audience Q&A to follow. Speakers: - James Maddox, CTO, Mya Mya (https://www.hiremya.com/) automates the process from resume to hire so you can cultivate and engage the best candidates. - Jay Srinivasan, Co-Founder & CEO, Spoke Spoke (https://www.askspoke.com/) is the simpler, smarter way to manage employee requests. Using design and AI, Spoke gives you more time to get things done and happier, more productive employees. - Stacey Nordwall, Sr. Global People Operations Manager, Culture Amp Culture Amp (https://www.cultureamp.com/) makes it easy to collect, understand and act on employee feedback. Stacey will be sharing how her team uses Donut.ai (https://www.donut.ai/). Donut encourages trust, collaboration, and goodwill across your team and organization. - Alyssa Kwan, Senior Data Engineer, Zenefits Zenefits (https://www.zenefits.com/) gives your team a single place to manage all of your HR needs - payroll, benefits, compliance, and more. Thank you to New Relic (https://www.newrelic.com/) and DataEngConf (http://www.dataengconf.com/) for their help with this event. Hope to see you there! Team H2O.ai https://www.h2o.ai - - DataEngConf (http://www.dataengconf.com/) is a 2-day conference that bridges the gap between data scientists, data engineers and data analysts. It features 3 tracks (Data Science, Data Engineering, Data Startups) and 50+ deeply technical talks from leading data scientists and engineers at companies like Facebook, Lyft, Instacart, Netflix, Airbnb, WeWork, Databricks, Stitch Fix, Stripe + 20 early-stage startups doing amazing things with data. Use code H2OVIP when you buy tickets for 25% OFF! 
New Relic (https://www.newrelic.com/) provides the real-time insights that software-driven businesses need to innovate faster. New Relic’s cloud platform makes every aspect of modern software and infrastructure observable, so companies can find and fix problems faster, build high-performing DevOps teams, and speed up transformation projects. Learn why more than 50% of the Fortune 100 trust New Relic at newrelic.com. - - - [Speakers] - Jay Srinivasan, Co-Founder and CEO, Spoke In 2014, Jay sold his previous company, Appurify, to Google, where he led product management for developer testing tools. - James Maddox, CTO, Mya Prior to Mya, James co-founded another HR startup, FirstJob, a job board specializing in entry-level jobs. - Alyssa Kwan, Senior Data Engineer, Zenefits Alyssa builds out data product, infrastructure, and teams at early-stage startups, establishing both technical and human processes. - Stacey Nordwall, Sr. Global People Operations Manager, Culture Amp Stacey has scaled Culture Amp’s onboarding program through 7x growth, and is responsible for Culture Amp’s people operations globally.
- Miners Aren't Your Friends + In-Depth Blockchain Analytics
Agenda 6:00 Networking 6:30 Speakers • Talk 1: Understanding The Ethereum Network – In-Depth Blockchain Metrics and Analytics (Shawn Douglass, Amberdata) • Talk 2: Miners Aren't Your Friends – Issues at the Mining and Consensus Layer (James Prestwich, Integral) 8:00 End ----------------------------------- This in-depth look at data in the blockchain is being done in conjunction with our friends at SF Ethereum Developers meetup and DataEngConf SF (April 17-18). See below for details on the conference. ----------------------------------- Talk 1 – Understanding The Ethereum Network – In-Depth Blockchain Metrics and Analytics Shawn Douglass from Amberdata will provide an analysis and share insights into the Ethereum Network with specific emphasis on network integrity, limitations, transaction throughput, and the most active DApps. About the Speaker Shawn Douglass is CEO of Amberdata, a platform for monitoring, searching, and analyzing public and private blockchains. Prior to founding Amberdata, Mr. Douglass was a cloud visionary and key contributor to the emerging enterprise cloud operating model for over a decade. He has held roles as Board Member, Operating Executive, Technologist, Advisor, and Investor. Mr. Douglass is a graduate of Harvard Business School. (http://amberdata.io/) ----------------------------------- Talk 2 – Miners Aren't Your Friends – Issues at the Mining and Consensus Layer James Prestwich from Integral will provide an in-depth look at how the consensus layer – and miners' actions and motivations in particular – can adversely affect decentralized applications. Developers like to believe in a consensus layer that takes care of all the hard distributed systems problems and lets them write applications. Miners live in the consensus layer, doing whatever it is miners do. But miners aren’t your friends. James will talk about how miners can use the EVM to take money from your DApps and from your users, why you can't stop them, and what we can do to fix it.
Coding experience recommended but not required. About the Speaker James Prestwich is a developer, ethical software activist, and curry enthusiast. With a BA in Japanese, he works on consensus system design, smart contract security, and cross-chain contracts. James co-founded Storj and served as COO/CFO for several years. Currently, he is the founder and CEO of Integral. ----------------------------------- DataEngConf SF The 3rd annual DataEngConf SF will be held on April 17-18. Join our Bay Area data community at a special discount rate. This year’s DataEngConf SF will feature: • Deeply technical talks by 50+ speakers from companies like Facebook, Instacart, Lyft and many more top Bay Area data teams • Partners like Clover Health and Pivotal Software who will be scouting for the best data talent • Fun networking, including everyone’s favorite data community after-party • A dedicated track featuring data-oriented blockchain companies such as Amberdata and NuCypher • Special Office Hours with dozens of speakers Members can get a special discount of 20% off conference tickets using code SFDE20 at http://dataengconf.com
- Measuring Product Launches and Deploying TensorFlow Models @ Facebook HQ
Please join us for our first SF Data Engineering Meetup of the year on March 8th from 6-8pm at Facebook HQ, 1 Hacker Way, Menlo Park 94025. The agenda and speaker information can be found below. I hope to see you there! Parking and Check-in Instructions: If you are driving to MPK, please park in any open parking spot around Building 12, which will be on your left as you enter our campus. If you are using Lyft/Uber, you may have your driver pull directly up to Building 12 to check in. Please check in with security at Building 12. Please tell them that you are here for the Data Engineering Meetup. When you sign in on the iPads, please report that you are here to see Kyle Schmidt. ------ Schedule: 6:00pm - Doors open for Networking, Food, and Drinks 6:30pm - Brad Ruderman 7:15pm - Chris Fregly ------ Speaker: Brad Ruderman Bio: Brad is a Data Engineering manager at Facebook working on Facebook’s core app & features, where he enjoys building on every part of the data stack. He most recently worked on the launch of Messenger Kids, a new app for families to connect. Previously Brad has worked in roles spanning from strategy consulting to software engineering. Most recently he worked on everything data at UpCounsel, a B2B startup in San Francisco, including streaming ETL to reporting + analytics to data visualization to product development, integrations, and most importantly growth. His experience includes working on all stacks of the warehouse from logging to insights. Title: Building a Product From 0 to 1, or how to tell if the startup you joined is going to survive Abstract: At Facebook we ship new features every day. We treat each feature as a product, ensuring the product has a fit within a market, the person using the product receives meaningful value, and it improves the experience of using Facebook.
This talk will focus on the key data & metrics to track when launching any product in any business, how to ensure your product is successful including monitoring your improvement over time, and finally some SQL tips & tricks to handle this type of analysis at scale. ------ Speaker: Chris Fregly Bio: Chris Fregly is Founder and Engineer at PipelineAI, a Streaming Machine Learning and Artificial Intelligence Startup based in San Francisco. He is also an Apache Spark Contributor, a Netflix Open Source Committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O’Reilly Training and Video Series titled "High Performance TensorFlow in Production." Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco. Title: Deploy Serverless TensorFlow Models using Kubernetes, OpenFaaS, GPUs, PipelineAI Abstract: Applying my Netflix experience to a real-world problem in the ML and AI world, I will demonstrate a full-featured, open-source, end-to-end TensorFlow Model Training and Deployment System using the latest advancements with TensorFlow, Kubernetes, OpenFaaS, GPUs, and PipelineAI. In addition to training and hyper-parameter tuning, our model deployment pipeline will include continuous canary deployments of our TensorFlow Models into a live, hybrid-cloud production environment. This is the holy grail of data science - rapid and safe experiments of ML / AI models directly in production. Following the famous Netflix Culture that encourages "Freedom and Responsibility", I use this talk to demonstrate how Data Scientists can use PipelineAI to safely deploy their ML / AI pipelines into production using live data. Offline, batch training and validation is for the slow and weak. Online, real-time training and validation on live production data is for the fast and strong.
------ Special thanks to Facebook (https://www.facebook.com/careers/) for hosting us as well as the Facebook data engineering coordinators: Justin Gourley, Prasad Bhandarkar, and Faviola Remulla.
- DataEngConf SF '17 (paid event w/ discount)
Join us for DataEngConf (http://bit.ly/2kMB1mW) - the highest quality technical conference that bridges the gap between data scientists, data engineers, & data analysts - in San Francisco April 26 - 28, 2017. Members of our meetup get a 20% discount on tickets with code SFDE20x (http://bit.ly/2kMB1mW). Our most exciting data event yet includes: • 2 days of insightful talks by 30+ leading data scientists and engineers from top teams at Netflix, Airbnb, Facebook, Instacart, Clover Health, Metamarkets, InfluxDB, Coinbase & many more • 1 day of hands-on, pre-conference workshops that provide expert instruction by top trainers from companies like Insight Data Science, Airbnb, Metis & others • Extensive networking opportunities at the conference, or connect with speakers & attendees at our Thursday night after-party between conference days • Small group Office Hours with speakers and separate tracks that focus exclusively on data engineering and data science • Attendees that are highly technical data scientists, engineers & analysts from top tech, media and finance companies around the SF Bay Area • Connect with companies at Sponsor Spotlight to discover their latest products and available data jobs Schedule: • Wednesday, April 26: Workshop Day • Thursday, April 27: Conference Day 1, opening keynote & conference after-party • Friday, April 28: Conference Day 2 & closing keynote For full event information, visit: http://dataengconf.com (http://bit.ly/2kMB1mW)