What we're about

Data Science DC is a non-profit professional group that meets monthly to discuss diverse topics in predictive analytics, applied machine learning, statistical modeling, open data, and data visualization. Our members are professionals, students, and others with a deep interest in these fields and related technologies. Meeting topics are varied and range from tutorials on basic concepts and their applications, to success stories from local practitioners, to discussions of tools, new technologies, and best practices. All are welcome -- to attend, to meet others, and to present their work!

Data Science DC is a Program of Data Community DC, Inc.

Upcoming events

Using Kafka and Pinot for Real-Time, User-Facing Analytics

Apache Kafka is the de facto standard for real-time event streaming, but what do you do if you want to perform user-facing, ad-hoc, real-time analytics too? That's a hard problem. Apache Pinot solves it, and the two together are like chocolate and peanut butter, peaches and cream, and Steve Rogers and Peggy Carter. Come to this talk for an introduction to both systems and a view of how they work together.

Tim Berglund is a teacher, author, and technology leader with Confluent, where he serves as the Senior Director of Developer Advocacy. He can frequently be found speaking at conferences in the United States and all over the world. He is the co-presenter of various training videos on topics ranging from Git to Distributed Systems to Apache Kafka. He tweets as @tlberglund, blogs very occasionally at http://timberglund.com, and lives in Littleton, CO, USA.

Neha Pawar is a Founding Engineer at a Stealth Mode Startup. Prior to this, she worked at LinkedIn as a Senior Software Engineer in the Data Analytics Infrastructure org. Neha is an Apache Pinot PMC member and committer and has made numerous impactful contributions to the Apache Pinot project. She actively fosters the growing Apache Pinot community and loves to evangelize Apache Pinot through blogs, video tutorials, and talks at meetups and conferences. You can find her on Twitter at @nehapawar18.

About Apache Pinot: Pinot is a real-time distributed OLAP datastore, built to deliver scalable real-time analytics with low latency. It can ingest from batch data sources (such as Hadoop HDFS, Amazon S3, Azure ADLS, and Google Cloud Storage) as well as stream data sources (such as Apache Kafka). Pinot was built by engineers at LinkedIn and Uber and is designed to scale up and out with no upper bound. Performance remains constant for a given cluster size and expected queries-per-second (QPS) threshold.

Brought to you by Confluent, Pinot, and Data Science DC, part of Data Community DC.

Come join us to learn how to give users real-time analytics! 🎉
