• Joint meetup with Chicago Women in Big Data

    Join us for a joint meetup with Chicago Women in Big Data! https://www.meetup.com/Chicago-Women-in-Big-Data/

    Thank you to IBM for sponsoring drinks and food, and thank you to Gogo, Inc. for providing space so close to Union and Ogilvie stations!

    We will have scheduled talks on R, talks from Chicago Women in Big Data, and an Open Mic (or perhaps, as it will come to be known in the R community, kaRaoke). The evolving schedule includes:

    - Developing credit risk models for recommendation engines, by Helen Ristov, Manager of Data and Insights & Machine Learning Developer at Capgemini.
    - Analysis of Women's National Basketball Association (WNBA) shot distributions with R, by Jonathan Mizel, Demand Planning Analyst at Gogo, Inc.
    - Race for Data, by Amanda Mordacq, Analyst at Gogo, Inc.: a short analysis of race times from the recent JP Morgan Corporate Challenge, using R.

    We've had some really nice talks from 'walk ins' at the previous Open Mic and we are looking forward to the same! If you think you'd like to participate, bring your laptop or your talk on a thumb drive. Better yet, message us beforehand so we can make arrangements.

    Finally, to gain entrance to the building, you will need to register with both your first and last name.

  • Pre-R/Finance: Talks and an Open Mic!

    Jefferson Tap & Grille

    A night of talks from developers and users of the R statistical programming language, including potentially you! Food and drink, with various options, will be provided by IBM. Doors open at 5:00, and we'll start building the speaker queue to get started at 5:30ish.

    UPDATE: We have a few early submissions already!

    - "R for Explainable Stock Price Predictions", by Sou-Cheng Terrya Choi, PhD.
    - "James-Stein Estimation - An Unexpected Fact: Some R, some Stats, and some Finance", by Adam Ginensky, PhD.
    - "Try BLIS: Benchmarking competing Basic Linear Algebra Subprograms", by Justin Shea.
    - And more!

    We will have a laptop and projector, so you can bring slides rendered as HTML or PDF, or even publish them on https://rpubs.com/. Or bring your own laptop, but compatibility beyond HDMI might be an issue, so message us if you're planning on doing so.

    Examples of topics to talk about: single functions you find useful, statistical methods, a live coding exercise, infrastructure, productivity tips, R coding philosophy, hardware choices, and more! Peripheral topics such as DB applications and other languages are welcome too.

    R/Finance was born out of similar open source conversations 11 years ago, among founders who would go on to create some of the most used R packages in existence. But the way some tell it, it was first about collaboration among open source folks in an informal setting!

  • Big Data & Biking: Bike Lane Uprising | Chicago Women in Big Data + CRUG

    Big Data, Civics and Biking? YES! Join Chicago Women in Big Data and CRUG on March 20 to learn about Bike Lane Uprising, a cyclist-led civic tech platform. This is the same event as https://www.meetup.com/Chicago-Women-in-Big-Data/events/259390991/

    Speaker and BLU Founder Christina Whitehouse will share her founder story, how it works, and some visual examples of things they've been able to do with the data they're collecting. https://www.bikelaneuprising.com/

    Food, fun, networking, and an opportunity to volunteer with Bike Lane Uprising and buy some sweet gear! And a big Chicago Women in Big Data announcement!

    Directions & Agenda

    Meetup at WeWork, 1 W Monroe, on March 20, 2019 from 5:30-7:30pm.

    - 5:30pm: Networking; Bike Lane Uprising sales/volunteer sign up
    - 6pm: Christina's talk
    - 7pm: Q&A; Bike Lane Uprising sales/volunteer sign up; networking
    - 7:30pm: Over

    Directions: We are located at 1 W Monroe St, right across from the CIBC Theater (where Hamilton is playing), conveniently close to the Monroe Red Line and Monroe Blue Line stops. When you arrive at 1 W Monroe, ring the buzzer on the door that says "WeWork" on it, and from there take the elevator up to the 4th floor.

    Overview of Bike Lane Uprising:

    Bike Lane Uprising is a cyclist-led civic tech platform. Our goal is to make cycling safer by making it easy to report bike lane obstructions. While many miles of bike lanes exist, they’re often blocked by vehicles that use them as free parking. By creating a central database of bike lane obstructions uploaded by members, we are able to highlight problem areas and trends surrounding bike lane violations. Bike Lane Uprising works with local organizations, city departments, and companies in an effort to prevent future bike lane obstructions. Cyclists from over 60 cities have also signed up and our work has gained local and national headlines.

    Since Bike Lane Uprising launched in Chicago, over 7,000 bike lane obstructions have been submitted to our database in just over a year. 6,000 of those were recorded in the Chicagoland area alone. From these submissions, we’ve identified over 1,000 repeat bike lane obstructors. We’ve also reached out directly to many companies, including rideshare services, corporations, and even construction companies. Our communications have led to changes in city infrastructure, company policies, and driver education. Collectively, we have the tools to make cities safer for cyclists, pedestrians, and drivers.

  • Hardware for Data Science: A CRUG|ChiPy joint production

    Metis Data Science

    Whatever your language, at some point the code is processed on silicon and metal. Considering ever-growing data sets, what are the implications of working with the standard-issue 8 GB laptop, and what attributes should you consider in your next computer? Join us for a discussion on the latest in hardware for Data Science to find out! Hosted at Metis with pizza and drinks provided by the venerable IBM! Doors open at 5:00 pm, talks start at 5:30!

    - Parfait Gasana will kick off the meetup by analyzing hardware benchmarking results from R and Python data analytics simulations. He will show how processing time and memory usage vary with hardware specs: available RAM, virtual memory, 32/64-bit architecture, OS type and version, number of cores, and core speed.
    - Brian Peterson of Heymeyer trading + investments will crack open a desktop built for modeling high-frequency time series data and discuss its components. As someone who routinely processes more data than can be contained in RAM, he offers the practitioner's point of view on hardware choices.
    - Seth Carpenter of FHLBC will discuss how and when to use Amazon Web Services (renting someone else’s hardware) and how to do so using Python.
    - Justin Shea will discuss his recent custom build containing an AMD Threadripper 1950X processor. This newer chip contains 16 cores for parallel processing, offering greater performance than its Intel counterpart, at a fraction of the cost.

  • Single Function Lightning Talks

    WeWork

    Happy New YeaR! Come join us and dive into the functional programming of R. We have a lineup of multiple speakers who will present 5-10 minute talks on the usefulness and applicability of a favorite function in R.

    "Everything that exists in R is an object... Everything that happens in R is a function call..." - John Chambers, co-creator of the S programming language and R core member.

    - Chase Clark - lengths (5-10 mins.), not length().
    - Nathan Frey - cat (5-10 mins.)
    - Brandon Allen - merge & subset (10-15 mins.)
    - Parfait Gasana - by (5-10 mins.)
    - Dale Rosenthal - optim (5-10 mins.)
    - Justin Shea - data.table::dcast & melt (5-10 mins.)
    - Ray Buhr - future.apply::future_lapply (5-10 mins.)
    - Brian Burns - pkgnet::functions() (10-15 mins.)

    Beverages and pizza will be served thanks to IBM! Doors open at 6 pm, and talks begin at 6:30ish.

  • bRew and view: Trevor Hastie, Andrew Gelman, and Szilard Pafka.

    Come join us for the first ever R-themed Brew and View! We are going to curate a few of our favorite talks into 20-minute clips, and then highlight key points and discuss afterward. Food and a drink provided by IBM! After that, open bar.

    ## Adam Ginensky presents: 'Gradient Boosting Machine Learning' by Trevor Hastie at H2O.ai World, 2014

    Trevor Hastie discusses trees, random forests, and boosting and how to implement them in R. Adam likes this video because it quickly develops the idea for ensemble methods, formalizes it, and shows one how to code it up. For those who want a deeper understanding of one of the top machine learning approaches in practice today, this talk is for you.

    ## Irena Kaplan presents: 'But When You Call Me Bayesian, I Know I’m Not the Only One' by Andrew Gelman at the New York R Conference, 2015.

    Andrew Gelman's work on Bayesian statistics and the Stan project is well known. Sit back and watch as Gelman discusses different approaches to Bayesian statistics in the context of past political races. An excellent talk for those wishing to understand Bayesian statistics in an interesting and entertaining manner.

    ## Justin Shea presents: 'No-Bullshit Data Science' by Szilard Pafka at R/Finance, 2017.

    In his enlightening talks, Szilard has dispelled the myth of the superiority of "Big Data" tools as well as "Deep learning" for most Data Science problems. Inviting us to look beyond the hype and focus on the practical application, he illustrates how R has what it takes not only to keep up, but to outperform, with benchmarked examples ranging from loading data to running machine learning models. For those focused on the practical and getting the job done, you'll love this talk.

  • HacktobeRfest @Haymarket

    Haymarket Brewery

    We've teamed up with R-Ladies Chicago this month to bring you a joint meetup worth toasting to: *HacktobeRfest* at Haymarket Brewery (graphic above courtesy of Anne-Corinne Carroll)! Special thanks to the incredibly supportive IBM for sponsoring this event.

    # Schedule

    - 5:45-6:00 -- NEW! Rtour for beginners
    - 6:00-6:30 -- Food and drinks, Intros & Announcements
    - 6:30-7:00 -- Speakers: Stephen Ziliak & Eleanor Chodroff
    - 7:00-7:45 -- Beer Data Hack-a-thon
    - 7:45-8:00 -- Lightning Style Presentations by Small Groups [Optional]
    - 8:00-Until -- Q&A/Socializing/Networking/Hanging out

    Important to Know: This is a friendly joint meetup with R-Ladies, so please take a moment to review the R-Ladies Global code of conduct to get familiar with their organization: https://rladies.org/code-of-conduct/.

    # Speakers

    ## Eleanor Chodroff, Postdoctoral researcher at Northwestern University, Department of Linguistics

    Eleanor’s research focuses on the representation of speech and language in the mind. She became a beer enthusiast as an exchange student in the Czech Republic and an R enthusiast as a graduate student at Johns Hopkins. She will talk about the development of a Shiny app featuring Chicagoland breweries.

    Chicagoland Breweries app: https://eleanorchodroff.com/apps/brews.html
    Code: https://github.com/echodroff/chicago-breweries
    Personal Website: https://www.eleanorchodroff.com/index.html

    ## Stephen Ziliak will introduce us to the contribution brewing has made to statistics with his latest paper "How large are your G-values?", forthcoming in The American Statistician (Fall 2018).

    Home Page: https://blogs.roosevelt.edu/sziliak/
    Book: https://www.amazon.com/Cult-Statistical-Significance-Economics-Cognition/dp/0472050079/
    Guinnessometrics: Saving Science and Statistics With Beer http://www.chicagomag.com/Chicago-Magazine/The-312/February-2012/Guinnessometrics-Saving-Science-and-Statistics-With-Beer/

    A crisis of validity has emerged from three related crises of science, that is, the crises of statistical significance and complete randomization, of replication, and of reproducibility. The ten principles of Guinnessometrics or G-values outlined here can help. Originally developed and market-tested by William S. Gosset aka “Student” in his job as Head Experimental Brewer at the Guinness Brewery in Dublin, Gosset’s economic and common sense approach to statistical inference and scientific method has been unwisely neglected.

    Stephen T. Ziliak is probably best known for his best-selling and critically acclaimed book (with Deirdre McCloskey), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, and numerous essays on "Guinnessometrics," that is, the scientific and statistical legacy of William S. Gosset aka "Student", Guinness’s Oxford-educated brewmaster. Ziliak is Professor of Economics and Faculty Member of the Social Justice Studies Program at Roosevelt University, Conjoint Professor of Business and Law at the University of Newcastle (Australia), Faculty Affiliate in the Graduate Program of Economics at Colorado State University, and Faculty Member of The Angiogenesis Foundation (Cambridge). In 2016 he was a lead contributing author of the ASA Statement on Statistical Significance and P-values, and he is currently co-editing a special issue of The American Statistician on Statistical Inference and Scientific Method in the World Beyond P < 0.05.

  • R + Postgres Joint Meetup

    Microsoft Technology Center

    Capturing the synergy of the Open Source movement, we bring together the popular analytical language and relational database for a joint meetup of the Chicago R User Group (CRUG) and PostgreSQL User Group (PUG)! Doors open at 6 PM with talks starting at 6:30. Pizza will be provided by Microsoft developer advocates! Our lineup includes an introduction and two speakers who will shed light on using an application-layer language with a powerful back-end database engine:

    - Parfait Gasana, Co-Organizer and Data Analyst/Database Developer. Parfait will open the event with a brief introduction to R and databases, including the RPostgreSQL API, setting the stage for the speakers.
    - Gene Leynes, Data Scientist at City of Chicago. Gene's work focuses on predictive analytics to help inform business decisions with advanced statistical techniques in many areas for the City of Chicago. He will discuss the City's database use case with "Practical Tips for Operationalizing R with Oracle and Postgres".
    - Adam Dziedzic, PhD student at the University of Chicago, Department of Computer Science. Adam's research centers on data analysis and databases. He has worked on data loading and migration between diverse database systems, specializing in PostgreSQL. Adam has interned at Microsoft Research and Google in core database research and software development. His interests and coursework involve machine and deep learning. He will discuss "Database Loading and Migration for PostgreSQL and other databases".

  • CRUG & ChiPy: The cross-over meetup of the year!

    Braintree Payments

    CRUG and ChiPy are holding their first ever joint R and Python meetup! We've got an R talk, we've got a Python talk, we've even got R AND Python talks, and we'll host a round-table discussion about R and Python! Doors open at 6, talks begin at 6:30 pm.

    Pizza and meeting space kindly provided by our event host Braintree Payments: The Simple Way to Get Paid. Closed captioning and beer generously provided by IBM. Thank you to both for making this meetup possible!

    **IMPORTANT NOTE: Braintree will ask for a photo ID at the door and the name must match your name on their list. Upon registration, please type in the name on your ID in the pop-up box prompt.**

    Below is our current lineup, and we may have a few more late additions as well. Each talk will run about 20 minutes.

    ## Package mmap: Map Pages of Memory, by Jeff Ryan

    https://cran.r-project.org/web/packages/mmap/index.html

    The mmap package offers a cross-platform interface for R to information that resides on disk. As datasets grow, the finite limits of Random Access Memory constrain the ability to process larger-than-memory files efficiently. Memory mapped files ("mmap" files) leverage the operating system's demand-based paging infrastructure to move data from disk to memory as needed, and do it in a transparent and highly optimized way. This package implements a simple low-level interface to the related system calls, and provides a useful set of abstractions to make accessing data on disk consistent with R usage patterns. As new breakthroughs in data storage technologies crush disk read/write times, the mmap package stands to become even more valuable to analysts managing larger-than-memory data sets.

    ## Python EOgmaNeo Self-Driving Model Car: Powered by Raspberry Pi 3B+ and Arduino, by Yana Lustina

    https://github.com/ylustina/sdc-EOgmaNeo

    The program runs on the Pi, and performs all the online machine learning on the Pi's CPU. With the OgmaNeo library, there is no pre-training and no offloading to a more powerful machine. The network learns to predict the next command as you drive it. The car takes video from the camera and the steering angles as input, and uses a predictive hierarchy to predict the next desired steering angle. This greatly lowers the computational cost, and saves the time and work necessary for collecting data for other kinds of neural networks. Yana's ambitious work earned first place in ChiPy's competitive one-on-one Mentorship Program.

    ## Base R and Python Pandas, by Parfait Gasana

    https://stackoverflow.com/users/1422451/parfait

    This talk will compare Python Pandas and base R through import/wrangling/aggregation of XML, JSON, and SQL data. A glimpse of working with rpy2 (run R inside Python) will be offered as well. In this practical talk, well suited to a joint meetup, Parfait will show us how both R and Python are great additions to a data professional's toolkit!

    ## Round-table Discussion

    A panel of experienced R and Python developers will answer questions about programming with data from our moderator and field a few from the audience too!

  • Pre-R/Finance conference meetup: 5 awesome talks

    Haymarket Brewery

    In the spirit of the conference, this meetup will feature a series of short talks on practical applications with R! IBM will generously sponsor food and drinks at Haymarket Brewery, thank you IBM! Talks begin at about 6:00 pm with time for networking before and after.

    ## Ross Bennett: Predictive models and their applications in “high-ish” frequency finance.

    Ross will talk about how he approaches the overall process of building models, evaluating their performance, and integrating them into a trading strategy. While his process has been honed trading in the ultra-competitive futures markets, he will use cryptocurrency data for the application because it’s the hot topic right now and the data is freely accessible to anyone who wants to reproduce his examples. Ross is the quantitative analyst for a trading desk at a proprietary futures trading firm. He is the co-author of the PortfolioAnalytics R package, maintainer of the FinancialInstrument package, and contributes to several other R packages used in finance and trading, including the venerable xts.

    ## Joe Rickert, RStudio: Package Watching

    For over a year, Joe has been reviewing all of the new packages submitted to CRAN each month, selecting his "Top 40" and blogging about them on R Views. In this talk, he will discuss his criteria and methodology, describe some trends he has noticed, and offer a few ideas on what makes a great R package. He will finish with brief demos of two new tools for studying R packages: Ioannis Kosmidis' cranly package and pkgnet from Brian Burns, James Lamb, Patrick Boueri, and Jay Qi.

    ## Ray Buhr: Time Series graphing in practice

    Ray is going to show you best practices for quickly and repeatedly producing good time series graphs. This includes how to make your own custom themes and color palettes, align multiple charts, and a discussion of packages which add interactivity to time series charts (like dygraphs). Ray Buhr is the Manager of Data Science at Pangea Money Transfer. He has a Master of Information and Data Science from the University of California, Berkeley and is passionate about R programming, data architecture, and mentoring other data scientists.

    ## Troy Hernandez: Simulating March Madness in R

    March Madness has come and gone. Some won and some lost… Troy won. $500 to be precise. Using simulations to model potential risk is an interesting practice in finance, but using the concept to accurately predict March Madness outcomes is interesting too! Troy will show us how he used FiveThirtyEight.com's bracket estimations and discovered a significant discrepancy between their closed form solution and his simulations. The day after he posted the work on his personal blog*, FiveThirtyEight.com changed their estimates to match his and then rewrote their own history, erasing their previous support for Virginia as National Champ! Did Troy’s blog post expose an error in their closed form solution? While the world may never know, you can know how to write a tournament simulator in R after watching his talk. Troy is an Executive Architect for IBM and has a PhD in statistics from University of Illinois – Chicago. Versed in many programming languages, Troy continues to find R to be the language of choice for machine learning and statistical work.

    *https://troyhernandez.com/2018/03/13/simulating-march-madness-in-r/

    ## Dale Rosenthal: Preview of his upcoming book "A Quantitative Primer on Investments with R."

    In addition to interesting problems from the text, Dale will show us a current analysis of economic employment trends that promises to be eye-opening! Dale has been a co-organizer of the R/Finance conference since 2009. He holds a PhD in Statistics from the University of Chicago and has taught courses on market microstructure and high-frequency trading at U of C and UIC. He has held analyst, researcher, and proprietary trading positions at Goldman Sachs, LTCM, and Morgan Stanley. https://sites.google.com/site/dalerosenthal/
