
NodeJS: Building a fast web scraper and parallelized work queue with async

  • Jun 18, 2014 · 6:00 PM
  • This location is no longer available

Join us at 6:00pm this Wednesday (June 18) at the Center for Open Science as Michael Holroyd (Arqball and Knollop) walks through two of his recent projects built with Node.js and async.

This meetup combines two previously proposed node topics into one.

• 6:00pm   Pizza

• 7:00pm   Presentations


Async makes working with asynchronous JS a breeze, providing powerful functions like .auto, .series, .parallel and .map. 
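To give a feel for what functions like .parallel and .series do, here is a minimal sketch written without the library. The `parallel`, `series` and `delayed` helpers below are simplified stand-ins made up for this page, not async's actual code; with the real library the equivalent calls would be `async.parallel(tasks, callback)` and `async.series(tasks, callback)`.

```javascript
// Simplified stand-ins for illustration only; NOT the async library's code.

// Start every task at once and collect the results in task order.
function parallel(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let failed = false;
  if (pending === 0) return done(null, results);
  tasks.forEach((task, i) => {
    task((err, value) => {
      if (failed) return;                 // an earlier task already reported an error
      if (err) { failed = true; return done(err); }
      results[i] = value;
      if (--pending === 0) done(null, results);
    });
  });
}

// Run tasks one after another; stop at the first error.
function series(tasks, done) {
  const results = [];
  function next(i) {
    if (i === tasks.length) return done(null, results);
    tasks[i]((err, value) => {
      if (err) return done(err);
      results.push(value);
      next(i + 1);
    });
  }
  next(0);
}

// Hypothetical task factory: calls back with `value` after `ms` milliseconds.
function delayed(value, ms) {
  return (callback) => setTimeout(() => callback(null, value), ms);
}

// Both tasks start immediately; results keep task order, not finish order.
parallel([delayed('a', 30), delayed('b', 10)], (err, results) => {
  console.log('parallel:', err, results);
});
```

Each task takes a single Node-style callback, so a step can move between the parallel and serial versions without growing another level of nesting.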

We'll look at two different applications that use async to do as much work in parallel as possible (and reduce callback nesting):

Very fast web scrapers (async+request+cheerio)

When Knollop needed to use data from sites without a public API, it was time to get scraping. Request is a library for super-easy HTTP requests, and cheerio is a high-performance DOM parser that uses jQuery-style selectors.
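The parallel half of a scraper like this can be sketched without either library. In the rough example below, `fetchTitle` is a hypothetical stub standing in for a request call plus a cheerio lookup (no network is touched), and `mapLimit` is a hand-written helper that keeps a fixed number of fetches in flight. The async library offers a function of the same name, but this is not its code, and nothing here is taken from the Knollop scraper itself.

```javascript
// Hypothetical stand-in for an HTTP request plus a DOM lookup; no network is used.
function fetchTitle(url, callback) {
  setTimeout(() => callback(null, 'Title of ' + url), 10);
}

// Apply `worker` to every item with at most `limit` calls in flight.
// Results are returned in input order.
function mapLimit(items, limit, worker, done) {
  const results = new Array(items.length);
  let next = 0;          // index of the next item to start
  let active = 0;        // calls currently in flight
  let finished = false;  // set once `done` has been called

  function launch() {
    if (finished) return;
    if (next === items.length && active === 0) {
      finished = true;
      return done(null, results);
    }
    while (active < limit && next < items.length) {
      const i = next++;
      active++;
      worker(items[i], (err, value) => {
        if (finished) return;
        active--;
        if (err) { finished = true; return done(err); }
        results[i] = value;
        launch();        // a slot freed up: start the next item or finish
      });
    }
  }
  launch();
}

// Example URLs are placeholders.
const urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c'];
mapLimit(urls, 2, fetchTitle, (err, titles) => {
  console.log(err || titles);
});
```

Capping the number of simultaneous requests is one common way to stay fast without hammering the site being scraped; the limit of 2 here is arbitrary.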

Arqspin transcoder overhaul (async+beanstalkd+foreverjs+logging)

The new transcoder uses beanstalkd as a work queue and has several small worker programs that pop jobs off one tube, do some work, and put them in the next one. The whole thing is managed by foreverjs and a real-time logging system.
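The tube-to-tube flow can be pictured with a tiny in-memory model. This is only a sketch: the `Tubes` class stands in for a queue server (its `put` and `reserve` methods are named after the beanstalkd commands but do not talk to beanstalkd), and the stage names and job fields are invented for the example rather than taken from the Arqspin transcoder.

```javascript
// In-memory stand-in for a queue server; beanstalkd itself is NOT used here.
class Tubes {
  constructor() {
    this.tubes = new Map();
  }
  // Append a job to the named tube, creating the tube on first use.
  put(tube, job) {
    if (!this.tubes.has(tube)) this.tubes.set(tube, []);
    this.tubes.get(tube).push(job);
  }
  // Return the oldest job in the tube, or undefined when it is empty.
  reserve(tube) {
    const jobs = this.tubes.get(tube);
    return jobs && jobs.length ? jobs.shift() : undefined;
  }
}

// One worker: drain `from`, transform each job, hand it to `to`.
// Returns how many jobs were handled.
function runWorker(queue, from, to, work) {
  let handled = 0;
  let job;
  while ((job = queue.reserve(from)) !== undefined) {
    queue.put(to, work(job));
    handled++;
  }
  return handled;
}

// Hypothetical stage names and job fields, chosen only for this example.
const queue = new Tubes();
queue.put('uploaded', { id: 1 });
queue.put('uploaded', { id: 2 });
runWorker(queue, 'uploaded', 'resized', (job) => ({ ...job, resized: true }));
runWorker(queue, 'resized', 'done', (job) => ({ ...job, encoded: true }));
console.log(queue.reserve('done'));
```

In a real deployment each worker would be its own long-running process waiting on its tube, so a slow stage can be scaled by starting more copies of that one worker.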


Our Sponsors

  • Locus Health

    The key to transforming healthcare is patient understanding.

  • ChartIQ

    Charting and data visualization solutions for capital markets.

  • Arqball

    Arqball is a small research and development lab in Charlottesville, VA.

  • Tech Dynamism

    Expert-Driven IT Services Help Your Business Run Better.

  • ENSCO, Inc.

    Autonomous monitoring & web data management for the railroad industry.
