Sent on: Friday, September 21, 2012 12:07 AM
I've been using Scrapy for my web scraping projects and it's very tunable. You can change user agents, set delays between requests, set or even disable cookies, and adjust a multitude of other options. You can export to CSV, JSON and other formats, and there are db connectors as well, though I've not gotten that fancy yet. It includes an interactive shell, so you can pretty easily work out the XPaths without having to recode, and the spider usually works the first time you run it. It's also Python, so it's super easy to set up.
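To make the tunables above a bit more concrete, here is a rough sketch of what a Scrapy project's settings.py might contain. The setting names (USER_AGENT, DOWNLOAD_DELAY, COOKIES_ENABLED, FEEDS) are real Scrapy settings, but every value, the bot name, the contact URL and the output file names are made up for illustration. FEEDS in particular only exists in newer Scrapy releases (2.1 and later), so treat this as an assumption about your installed version.

```python
# settings.py for a hypothetical Scrapy project (all values are illustrative)

# Identify the crawler; this string is a placeholder, not a recommendation.
USER_AGENT = "rate-research-bot/0.1 (+http://example.com/contact)"

# Seconds to wait between consecutive requests to the same site.
DOWNLOAD_DELAY = 2.0

# Set to False to disable cookie handling entirely.
COOKIES_ENABLED = True

# Feed exports: write the scraped items to both JSON and CSV.
# (Assumes Scrapy 2.1 or later, where the FEEDS setting is available.)
FEEDS = {
    "rates.json": {"format": "json"},
    "rates.csv": {"format": "csv"},
}
```

The interactive shell is started with `scrapy shell <url>`, and from there you can try expressions such as `response.xpath(...)` against the live page before putting them in a spider.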
Off Topic

Don't abuse the mailing list; it's a great resource and should be used only when absolutely needed.

Intention

I'm building a health insurance rate request application that requires the following parameters: zip, gender, age. With these args I want to retrieve near-accurate rates for various insurance companies.

Web Services?

Unfortunately we are still in the eve of the 21st century and people are still using file cabinets! Does anyone tap into health insurance web services? I've tried contacting:

eHealth
It seems they are pretty preoccupied with their corporate customers.

eligibleAPI
They require too many parameters and seem geared towards hospitals and insurance agents.

So if you know of a service that I can send the following parameters

Scrape time!

Additionally, let's say I wanted the rates from the following:

After having a gander at the source, there is CSRF protection in place. Using curl with an HTML DOM parser should get me to the rates I'm looking for. What is the best method of mimicking a legit user from a browser via curl to trick CSRF into thinking I'm real? I understand curl has a cookie jar, but I was curious if someone had a gist of an implementation they could point me to.

Thanks!
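For what it's worth, the cookie jar plus token flow asked about above can be sketched in a few lines. This one is in Python's standard library rather than PHP curl, to stay in line with the Scrapy suggestion, and it rests on assumptions: that the site puts its CSRF token in a hidden form input, that the matching session cookie is set when the form page is fetched, and that the function names, URLs and field names here (all hypothetical) are replaced with whatever the real site uses.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden


def make_session(user_agent):
    """Build an opener that stores cookies and sends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch_rates(form_url, submit_url, fields, user_agent):
    """Hypothetical flow: GET the form, then POST it back with its token."""
    opener = make_session(user_agent)
    # 1. GET the form page: this sets the session cookie and exposes the token.
    with opener.open(form_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    # 2. Merge the hidden fields (including any CSRF token) with our own values.
    data = extract_hidden_fields(html)
    data.update(fields)
    # 3. POST with the same opener so the cookie travels back with the token.
    body = urllib.parse.urlencode(data).encode("ascii")
    with opener.open(submit_url, data=body) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A call would look something like `fetch_rates(form_url, submit_url, {"zip": "32801", "gender": "M", "age": "30"}, "Mozilla/5.0")`, with both URLs and the field names taken from the actual form. The same three steps map onto PHP curl with a cookie jar file: fetch the form, parse out the hidden token, post it back with the stored cookies.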
This message was sent by Joseph Persie ([address removed]) from The Orlando PHP User Group.