MapReduce for Astronomical Applications
I will describe a distributed version of the Friends-of-Friends technique, which astronomers use to identify clusters of galaxies. This version is based on a map-reduce "wrapper" that distributes a set of galaxies among multiple cores, runs a sequential Friends-of-Friends algorithm on each core, and then merges the results. It can process tens of billions of galaxies, which is enough to handle all modern astronomy data sets.
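The distribute, run, and merge steps described in this abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation. The partitioning scheme (vertical strips padded by one linking length), the 2D coordinates, and every name here (`DisjointSet`, `sequential_fof`, `map_phase`, `distributed_fof`, `strip_width`) are assumptions made for illustration, and the strips are processed in a plain loop rather than on separate cores.

```python
from collections import defaultdict
from math import floor


class DisjointSet:
    """Union-find over arbitrary hashable ids."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def sequential_fof(items, linking_length):
    """Brute-force O(n^2) Friends-of-Friends on one partition.

    items is a list of (global_index, (x, y)); returns a list of sets
    of global indices, one set per locally connected group.
    """
    ds = DisjointSet()
    b2 = linking_length ** 2
    for i, (ia, (xa, ya)) in enumerate(items):
        ds.find(ia)  # register singletons too
        for ib, (xb, yb) in items[i + 1:]:
            if (xa - xb) ** 2 + (ya - yb) ** 2 <= b2:
                ds.union(ia, ib)
    groups = defaultdict(set)
    for idx, _ in items:
        groups[ds.find(idx)].add(idx)
    return list(groups.values())


def map_phase(points, linking_length, strip_width):
    """Assign each point to every strip whose padded x-range contains it.

    Padding each strip by one linking length (an assumed scheme) ensures
    that any two linked points appear together in at least one strip.
    """
    strips = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        first = floor((x - linking_length) / strip_width)
        last = floor((x + linking_length) / strip_width)
        for s in range(first, last + 1):
            strips[s].append((idx, (x, y)))
    return strips


def distributed_fof(points, linking_length, strip_width):
    """Run local FoF per strip, then merge groups that share a point."""
    strips = map_phase(points, linking_length, strip_width)
    # In a real map-reduce job each strip would be a separate task.
    local_groups = [
        group
        for items in strips.values()
        for group in sequential_fof(items, linking_length)
    ]
    ds = DisjointSet()
    for group in local_groups:  # reduce step: union overlapping groups
        members = iter(group)
        first = next(members)
        ds.find(first)
        for m in members:
            ds.union(first, m)
    clusters = defaultdict(set)
    for idx in range(len(points)):
        clusters[ds.find(idx)].add(idx)
    return sorted(sorted(c) for c in clusters.values())
```

For the six points `[(0.0, 0.0), (0.9, 0.0), (1.8, 0.0), (5.0, 5.0), (5.5, 5.0), (10.0, 0.0)]` with a linking length of 1.0 and a strip width of 2.0, `distributed_fof` returns `[[0, 1, 2], [3, 4], [5]]`, the same grouping that `sequential_fof` gives on the whole set. The padding duplicates points near strip boundaries, which is what lets the merge step join clusters that straddle two strips.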
Dmitriy Ryaboy and Ashutosh Chauhan
Apache Pig implements a data processing language for analyzing large datasets using Hadoop. In this talk we will introduce Pig and go over some advanced features such as streaming, a variety of available join algorithms, and columnar storage. We will also discuss ongoing work in query plan optimization, as well as some ideas for future research directions.