Effective Data & Time Series Analysis with Deedle (from BlueMountain)
Deedle (http://bluemountaincapital.github.io/Deedle) is a new open-source library for data and time series manipulation. It supports a wide range of operations such as slicing, joining and aligning, handling of missing values, grouping and aggregation, statistics and more. Deedle is written in F#, but it also provides an easy-to-use C# interface.
In this talk, we’ll demonstrate Deedle by analyzing financial time series data from historical Yahoo stock prices. Then we’ll try to understand US government spending using data from Freebase and the World Bank, and finally we’ll explore the Titanic survival data set.
Although the main focus of the talk is the Deedle library, we will demonstrate the entire “data science” stack for F# along the way. We will:
· Use F# Data type providers to fetch data from the internet
· Align, process and explore data using Deedle
· Perform advanced statistical computations using the R type provider
· Visualize data using R’s ggplot and F# Charting
Tomas Petricek is a long-time F# enthusiast and a founding member of the F# Software Foundation. He has been a Microsoft C# MVP since 2004, and together with Jon Skeet wrote Real-World Functional Programming, which explains basic functional concepts using C# 3.0 (teaching F# alongside). He also contributed to the development of F# during two internships at Microsoft Research in Cambridge. Tomas is a PhD student working on functional programming languages, but he spent the last three months as an intern at BlueMountain Capital.
Howard Mansell is the head of Core Quantitative Strategies at BlueMountain Capital, and was previously the CTO of Quantitative Strategies at Credit Suisse and a Core Strategist at Goldman Sachs. Howard ran the F# initiative at Credit Suisse, rolling out F# to over 100 Quantitative Strategists. He has worked with a variety of functional and dysfunctional programming languages and frameworks. At BlueMountain, Howard is responsible for the research framework, FinLab, which includes the open-source technologies described in this talk.