- Leveraging "Big Data" with Spark to Analyze Global News Events
**UPDATE** We're back in the very special CU Boulder space, CMCI Studio, located in downtown Boulder at 13th and Walnut. Sandwiches will be provided. Jason Hill, Senior Data Scientist at Rally Software, will give us a quick walkthrough of indexing and pulling insights from big data. Using a 200 GB corpus of global news event data (https://gdeltproject.org/), Jason will show us the speed and agility of a Spark-based approach.
- Cleaning and visualizing data the easy way using Tableau
**UPDATE** We just received approval to use a very special CU Boulder space, CMCI Studio, located in downtown Boulder at 13th and Walnut. We also have a sponsor, so we're excited to be able to offer more of a spread for dinner! Tableau is an industry leader in data visualization. The software (available on Mac and PC) offers an intuitive and powerful way to merge, segment and tell stories with all types of datasets. Trina Arnett is an adjunct instructor of advertising strategy at CU Boulder and founder of her own analytics firm, trinalytics. In this hands-on walkthrough, Trina will show how Tableau can conquer common data challenges and create visualizations beyond the traditional capacities of Excel and others. Please install Tableau and Tableau Prep before attending: https://www.tableau.com/products/desktop/download https://www.tableau.com/trial/tableau-prep Both versions come with a free 14-day trial for all users. Students and faculty will learn how to receive complimentary licensing for full functionality beyond the trial period. Sandwiches and drinks from Naked Lunch (https://www.nakedlunchcolorado.com/) in Boulder will be provided.
- Forecasting at Scale with Ben Letham from Facebook Data Science
Forecasting is a common data science task with many business applications, such as predicting sales, setting goals, and anomaly detection. However, it is also a specialized skill that lies outside the expertise of many data scientists. This leads to serious challenges in producing reliable forecasts when the number of forecasts to be made in an organization grows beyond the availability of time series experts. Ben Letham from Facebook will join us to talk about their recently open-sourced Prophet forecasting package, which is designed to be flexible enough to handle a range of business time series while still being configurable by non-experts. We will discuss the details of the Bayesian model implemented in the package and will then do a hands-on demo of constructing a forecasting model in Python. Ben Letham is a research scientist on the Core Data Science team at Facebook. He develops methods for using machine learning to design and analyze field experiments, and turns these methods into tools that are used across the company. He also works on Facebook's open source time series forecasting software, Prophet. He joined Facebook after completing his PhD in operations research at MIT.
- We’re Entering the Golden Age of Sports Analytics
Today, analytics plays an important role in nearly every industry – including sports. Leveraged for everything from player recruitment to ticket pricing, from play tactics to health and safety, analytics are helping teams make better decisions on and off the field. Join sports analytics expert Dr. Dave Schrader as he discusses:
• What's happening around the world to collect and analyze data for recruiting, player development, game planning, and injury prevention?
• What kind of “big data” helps improve sports decisions?
• How are analytics used to improve business operations – ticket pricing, sales, sponsorships?
• What analytics do leading pro teams and leagues use for basketball, baseball, football, and soccer tactics?
• What is happening to use data to improve athlete health and safety?
• How quickly are analytics being adopted at the college level? Who is leading? What are they doing?
• How can other parts of the university - like the business school or stats or computer science departments - collaborate with sports programs to provide analytics for their teams? What projects are underway to do “Moneyball on Campus?”
Dr. Dave Schrader retired after 31 years of experience at 3 high-tech database companies leading advanced development and marketing projects. He stayed on the Board of Directors for the Teradata University Network, the sponsor of his talks. While he continues to give business analytics and Big Data talks for schools, he’s finding sports analytics to be far more interesting to students. So far in 2017, he has given 45 talks at 20 schools to more than 1000 students, faculty, and coaches, and is currently sponsoring 5 student projects. In 2016 he gave 47 talks at 25 universities to 2700 students. He has presented at TDWI Munich and the INFORMS conference in Las Vegas on sports analytics. He holds a PhD in Computer Science from Purdue University.
- Scaling Up Data Visualization through Visual Cognition
Join us for some sandwiches at 5:00pm with Danielle's presentation starting at 5:30pm. Danielle is a knowledgeable colleague whose research has already affected industry visualization thinking. Large collections of data can provide valuable insight into problems across a wide range of domains. However, this data is often inaccessible by itself--there is simply too much information for people to process. This talk explores how information visualization can combine design, cognition, and computation to make data more accessible to people, discussing visualization techniques and systems that allow people to take advantage of their own expertise and visual capabilities to make sense of data at scale. I will discuss how people can make sense of visual information at scale, how we can capture these processes in actionable design metrics, and how these metrics enable new interactive systems for data visualization that support more accurate and scalable exploratory analyses. These findings have led to new systems in domains including literary studies, genetics, biochemistry, and defense that dramatically increase the scales of data people can leverage for knowledge generation and decision making. Danielle Albers Szafir is an Assistant Professor and member of the founding faculty of the Department of Information Science (http://colorado.edu/cmci/academics/information-science) at the University of Colorado Boulder. Her research sits at the intersection of information visualization, data science, and cognitive science. This work has been integrated into leading tools such as D3 and Tableau and has received best paper awards at IEEE VIS and IS&T Color and Imaging, an honorable mention for the VGTC Best Dissertation Award, and funding from the National Science Foundation and US Air Force. She received a B.S. in Computer Science at the University of Washington as a NASA Space Grant Scholar and a Ph.D. in Computer Sciences at the University of Wisconsin-Madison.
Also, she recently received the Best Paper award for her single-author paper, Modeling Color Difference for Visualization Design (http://cmci.colorado.edu/visualab/VisColors/), at InfoVis 2017. See her 'portrait' at EagerEyes (https://eagereyes.org/portrait/danielle-albers-szafir). Competitive ice hockey?
- Big Data Bootcamp - Sep 8th - 10th 2017 Denver CO
Global Big Data Conference is offering an extensive 3-day bootcamp on Big Data. This is a fast-paced, vendor-agnostic, technical overview of the Big Data landscape. No prior knowledge of databases or programming is assumed. Big Data Bootcamp is targeted towards both technical and non-technical people who want to understand the emerging world of Big Data, with a specific focus on Spark, NoSQL (Cassandra, MongoDB, etc.), Real Time Streaming (Spark Streaming, Kafka, etc.), Hadoop, Architecting Big Data Platform, Blockchain, Machine Learning, Deep Learning, NLP, TensorFlow & Data Science. Attendees will experience real Hadoop clusters and the latest Hadoop distributions. Attendees will receive the Big Data Certificate 2017 after attending the 3-day workshop and sessions. Meetup members will receive a $200 discount by using promotional code MEETUP and registering at the URL below: http://www.globalbigdataconference.com/denver/big-data-bootcamp/attendee-registration-90.html Agenda: http://www.globalbigdataconference.com/denver/big-data-bootcamp/schedule-90.html Confirmed Speakers: http://www.globalbigdataconference.com/denver/big-data-bootcamp/speakers-90.html Overview: http://www.globalbigdataconference.com/denver/big-data-bootcamp/event-90.html
- Hands-on Workshop: Predictive Analytics for Everyone
Dear members: We are doing a repeat of this workshop after the January meeting filled up in 48 hours and we received amazing feedback. Dan Becker of DataRobot did a great job placing analytics within a business process framework and taking attendees through a gentle tour-de-force of predictive analytics. It shouldn't matter what level of predictive analytics experience you have. You will learn a lot this night! We'll start with some food, networking, and account setup at 5pm. The workshop starts at 5:30pm sharp. Bring a laptop. Please do not sign up unless you are committed to attending for the whole evening as this will be a popular session. Best, Kai
Workshop description: Most aspiring data scientists are told they need to learn dozens of algorithms and years' worth of specialized math before they can gain hands-on experience. In this workshop, you will jump into practical real-world experience with an innovative data science tool that automates algorithm implementation. Both beginner and advanced data scientists will learn new ways to derive insights from even the most complex models, and you will build a better model in one night than what most data scientists could do with a month of manual effort.
Instructor: Dan is the Technical Product Director at DataRobot. He has broad data science expertise, with consulting experience for 6 companies from the Fortune 100, a 2nd place finish in Kaggle's $3 million Heritage Health Prize, and contributions to the Keras and TensorFlow libraries for deep learning. Dan developed the training materials for DataRobot University, and he has led DataRobot's Product team through a period of extreme growth. Dan has a PhD in Econometrics from the University of Virginia.
- Internet-of-Things Analytics - Building Complex Systems That Never Break!
Boulder's own industry analyst, Richard Hackathorn, is sharing his extensive knowledge of the Internet of Things (IoT). Richard provides an introduction to the topic as well as several case studies of how major companies are using IoT to improve their bottom lines. Looking forward to seeing you at the Leeds School again. Kai :-)
Abstract: The focus is: Why should business managers care about IoT? The use cases for IoT applications by eight companies across several industries illustrate that there is a spectrum of business motivations, along with trade-offs and challenges. By combining data warehousing and data analytics with IoT applications, companies can leverage existing IT technology to exploit new business opportunities. The study concludes with recommendations for managers on assimilating IoT applications into their IT architecture and corporate culture.
Biography: Richard Hackathorn is a well-known industry analyst in business intelligence, technology innovator, and international educator. For 13 years, Richard was a professor at the Wharton School and at the University of Colorado, Leeds School of Business. In 1982, he founded MicroDecisionware in Boulder, an early vendor of decision support and database connectivity products, which was acquired by Sybase (now SAP) in 1994. Since then, he has conducted numerous studies with large corporations into technology trends for database warehousing, business intelligence, and recently business analytics. He received his BS from California Institute of Technology and MS/PhD from University of California, Irvine. This talk is based on a recent study, sponsored by Teradata Corporation, titled "Business Values from the Analytics of Things," May 2016. It can be downloaded from http://bit.ly/BizValue-AoT , along with a Forbes article at http://bit.ly/AoT-Forbes .
- Hands-on Workshop: Site Location Analysis using Consumer Behavior Data
Dear members: Here is a workshop I requested from our local and not so local friends at Alteryx. Jessica will present through videoconference and Criston will be in the room to help everyone follow along. Some very cool pieces added together here. Our first teleconference workshop, so should be interesting. This is an intermediate topic. Participants should have a sense of joins, group bys, and at least a general sense of data science. Come at 5pm for food, networking, and lab logins. Workshop starts at 5:30pm. -- Kai :-)
We will be building out a use case to look for a new location for a Humane Society adoption center in the greater Denver area. Outline:
1. Start with each participant matching themselves in the Experian Database to explore the demographics and psychographics that Experian has recorded for them and get an overview of the data available. Focus on Experian’s Mosaic Segmentation and explore what information is associated with their segment.
2. Bring in a sample customer file for the Humane Society and aggregate the file to group by Mosaic Segment, then calculate metrics for each segment. For example, determine the count and percentage of customers that belong to each segment, then identify the average donation made by each segment and the total donations received from each segment:
   a. Do a quick analysis to determine which segments are the ‘top’ segments for the shelter, so that they can investigate locations in an area with high counts of their ‘top segments’
3. Run an association analysis on the target customers' attributes against the general population in Denver, to determine which other variables are most important to look for when selecting a new location.
4. Bring in a spatial file of the locations Humane Society is considering throughout Denver:
   a. Use Spatial Tools to create drive time around each location to simulate trade area
   b. Eliminate locations too close to existing shelters
   c. Use Allocate Append to append the Mosaic counts and variables identified in previous steps to each trade area
5. Analyze and rank the locations based on preferable counts of top Mosaic Segments and key variables.
Criston Schellenger and Jessica Silveri work on the Customer Support Engineering team at Alteryx. Their role is to ensure success within Alteryx by providing technical support and guidance, and to help customers find creative ways to implement their ideas. Criston has a degree in Mathematics, and a background in non-profit work, since she loves using her powers for good instead of evil. She has been using Alteryx for almost five years, and truly enjoys helping other users be successful with the software. Criston’s area of expertise around Alteryx is in predictive analytics. Jessica has a degree in Geographic Information Science, and comes from a background in spatial analytics and consulting. Jessica has been using Alteryx for 3 years and her area of expertise around Alteryx is in behavioral analysis.
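For anyone who wants a feel for step 2 of the outline before the session, here is a rough sketch of the group-by-segment aggregation in plain Python. The workshop itself uses Alteryx, not Python, so this is only an illustration of the idea. The customer records, the segment labels ("Segment A" and so on), and the function names are all made up for the example; they are not real Mosaic codes or Humane Society data. Ranking the 'top' segments by total donations is also just one assumed choice.

```python
from collections import defaultdict

# Hypothetical sample records: (customer_id, segment, donation in dollars).
# Segment labels are placeholders, not real Mosaic segment codes.
CUSTOMERS = [
    (1, "Segment A", 50.0),
    (2, "Segment B", 20.0),
    (3, "Segment A", 100.0),
    (4, "Segment C", 10.0),
    (5, "Segment A", 30.0),
    (6, "Segment B", 40.0),
]


def summarize_by_segment(customers):
    """Group customers by segment and compute per-segment metrics."""
    donations = defaultdict(list)
    for _customer_id, segment, donation in customers:
        donations[segment].append(donation)

    total_customers = len(customers)
    summary = {}
    for segment, amounts in donations.items():
        summary[segment] = {
            "count": len(amounts),
            "pct_of_customers": 100.0 * len(amounts) / total_customers,
            "avg_donation": sum(amounts) / len(amounts),
            "total_donations": sum(amounts),
        }
    return summary


def top_segments(summary, n=2):
    """Rank segments by total donations (an assumed criterion), highest first."""
    ranked = sorted(
        summary, key=lambda s: summary[s]["total_donations"], reverse=True
    )
    return ranked[:n]


summary = summarize_by_segment(CUSTOMERS)
for segment in top_segments(summary):
    print(segment, summary[segment])
```

Swapping the sort key to "count" or "avg_donation" gives a different notion of 'top' segment, which is the kind of trade-off the site selection in steps 4 and 5 depends on.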
- Hands-on Workshop: Using NLP to understand your text