Open Source & Large Scale Data Science


Details
Rob Vesse (Software Engineer, Cray Inc.) will be discussing open source technologies for data science on high-performance systems (Spark, Hadoop, the PyData ecosystem, containers, etc.), focusing on some of the implementation and scaling challenges they face.
Jacob Tomlinson (Lead Engineer, Met Office Informatics Lab) will be discussing large-scale distributed Python (using a combination of AWS and an in-house cluster).
About Cray Inc.:
https://www.cray.com
Cray is the global supercomputing technology leader and has its EMEA Headquarters here in Bristol. For over 40 years they have been developing highly advanced computing solutions for the world’s most complex science, engineering and analytics challenges. Ever since they introduced the world’s first supercomputer in 1976, their technologies have helped solve today’s problems and made tomorrow’s questions possible.
