Java and Web Scraping: Collecting Internet Data

Java is often thought of as a stuffy enterprise language, while web scraping is the often-murky domain of scripting languages. By combining the robustness and extensibility of Java with the flexibility and power of web scraping, we can create immensely useful tools that can solve very difficult problems.

Instant Web Scraping with Java will guide you, step by step, through setting up your Java environment. You will also learn how to write simple web scrapers and distributed networks of crawlers. Throughout the book, we will provide useful tips, out-of-the-box working code, and additional resources to build expert knowledge.

Instant Web Scraping with Java will teach you how to build your own web scrapers using real-world scraping examples that collect and store data from Wikipedia, public records data sites, IP address geolocation services, and more. You will learn how to run scrapers across multiple servers, run them in parallel, and subvert common methods of anti-scraper security used on modern websites. This book will also provide you with detailed step-by-step instructions, out-of-the-box working code, and expert pointers to further resources on key topics.

Instant Web Scraping with Java will show you how to view and collect any Internet data at the speed of your processor!

See Ryan's book at http://www.packtpub.com/web-scraping-with-java/book

Ryan Mitchell

Ryan Mitchell has 10 years of programming experience, including Java, C, Perl, PHP, and Python. In addition to “traditional” programming, she specializes in web technologies, with 3 years of Drupal development experience, and is a certified Sitecore developer.

Ryan graduated from Olin College of Engineering and is currently studying at the Harvard University Extension School for a master's degree in Software Engineering. In addition to academic life, Ryan currently works at Velir Studios as a Web Systems Analyst, and has also worked as a developer for Harvard University and Abine Inc.



  • Ryan M.

    Hi All! Thanks for joining me tonight. As for the non-working "header changing" code: I tried it out at home and the test ran successfully. Unfortunately I don't have any error logs to examine from the run during the presentation, so it's hard to tell what went wrong. The zipped presentation and Scraper.jar file (containing all libraries and source code needed -- I make no guarantees. For information purposes only!) are at http://javasaur.com/JavaWebScraping.zip Thanks again! See you next month!

    September 10, 2013

    • Ryan M.

      P.S. Send me an email at [masked] if you have any problems unpacking this or finding/running things

      September 10, 2013

    • Jasmine B.

      Thank you! Great presentation!

      September 10, 2013

  • Bill H.

    Excellent. The presenter was knowledgeable and well prepared.

    September 10, 2013

  • Will J.

    Thanks Ryan for the informative instructions on the web scraping. Now we have our feet wet.

    September 10, 2013

  • Bruce W.

    Thanks to Ryan for the great presentation on web scraping. Good intro to jsoup. I briefly interrupted a session in TextMate, playing with jsoup, to write this comment!

    September 10, 2013

  • Rick U.

    Enjoyed the talk, Ryan!

    September 10, 2013

  • Steven B.

    He's right. You did a great job, Ryan!

    September 10, 2013

  • Mahesh A.

    Good presentation! I don't think I will ever have to do web scraping, but it was a great introduction.

    September 10, 2013

  • Bill H.

    Been doing web scraping in Java for about 6 years. It'll be interesting to learn more and share war stories.

    September 4, 2013

  • A former member

    Hi, I come from Thailand and I'm an English student in Boston. In Thailand I'm a software engineer and agile coach. Please let me know whether I may attend this meetup. Your kind advice is greatly appreciated.

    August 24, 2013
