In this meetup we will integrate Storm and Apache Cassandra to build a distributed web crawling system.
Storm allows you to build long-running, distributed services that scale and offer processing guarantees. While it holds some simple state in process, Storm usually relies on a third-party datastore to store its results. Storm is used successfully by many companies (often with Cassandra) and was recently accepted into the Apache Incubator.
I will cover the basics of Storm in the context of a simple web crawling system that relies on Cassandra to store its metadata and the web content.
About the Presenter:
Jake Luciani is an Apache Cassandra Committer and PMC member. You can follow him at http://twitter.com/tjake