Databasing Without the Database


Details
External registration required at nyhackr.org.
We're back in person and online, with base R acolyte Jeff Ryan talking about how much you can do on your own computer without needing a database.
Thank you to Two Sigma for hosting us.
Everybody attending must RSVP through the registration form at nyhackr.org. There is a charge for in-person tickets; virtual tickets are free.
After the talk we will randomly select two attendees (both in-person and virtual) to receive free tickets to The New York R Conference taking place July 11-14.
About the Talk:
Most data analysis doesn't require "boiling the ocean". From the beginning, data structures, file systems and memory mapping have allowed for low-cost and scalable ways to do fast analysis on domain-specific data. The "trick" is recognizing that the algorithms and structures most databases employ are already available in your favorite language. In this talk I will pull the curtain back on the important parts of a database, show how these are not magic but merely code and ideas, and then show how you can build high-performance data layers that are problem-specific, so you no longer have to rely on the limits of a general-purpose solution. I'll introduce basic database structure and design, how databases work and how they get their speed. I'll then expand on these ideas using R, including examples from cosmological data, a few examples of rolling your own custom structures, and an introduction to a new(er) R package called “indexing”, which is an out-of-core data.frame able to work with very large data sets efficiently outside of RAM using syntax that looks like R, because it is.
About Jeff:
Jeff has spent the last two decades working in quantitative finance. Beginning his career as an option market maker on the floor of the CBOE, he went on to build quantitative software for trading. He released a series of packages for the R programming language in the mid-2000s, and this software became the foundation of many future tools and efforts adopted across the industry. After giving a talk at an early quant conference in Switzerland, he decided that a conference in the United States was required. He co-founded R/Finance in Chicago in 2009, and with its success an amazing community around better software thrived. During this time, Jeff was actively consulting to technology-driven hedge funds and proprietary trading firms. He was also laying the groundwork for multiple startups around data and time series management. His work in the field is referenced in more than 100 books, journals and courses around the world. In 2013 he joined Citadel’s then-new Quantitative Strategies desk, where he was responsible for many of the early research tools around data processing, alpha validation, risk management and high-performance computing. He left Citadel in 2019, after watching the desk grow from its first trade to become one of the most successful quant teams in the world. Reemerging from garden leave in the summer of 2020, he is once again back to building the tools he hopes will power the next twenty years of quantitative trading.
The venue doors open at 5:30 PM (America/New_York), when we will resume enjoying pizza together (we encourage the virtual audience to have pizza as well). The talk and livestream begin at 6:00 PM (America/New_York).
Remember, register at nyhackr.org.
