Cambridge Search Meetup - Crawling & Scraping
Details
Our first talk will be on 'Hidden Webs' by Harry Way, CTO of Arachnys (http://www.arachnys.com):
"For the majority of searches over the open web, Google and Bing are great at finding what users want, but for business intelligence there's a need to narrow the search domain and improve precision. Harry will show how emerging-market specialist Arachnys approach this problem with custom web crawling, processing and search."
Our second speaker is Shane Evans, former head of development at mydeco and http://lastminute.com, and now founder of Scrapinghub (http://scrapinghub.com/) and co-creator of Scrapy (http://www.scrapy.org). He'll be talking on 'Scrapy - a flexible crawler to power your search':
"Scrapy is a popular high-level scraping & web crawling framework. This talk will provide an overview of the Scrapy architecture, show how to write custom crawlers and discuss some of the challenges of scraping content for search engines." We're big fans of Scrapy at Flax and we're using it in several projects.
