March 7, 2013
About 90% of the effort on most Data Science projects will be spent on Data Logistics - its filthy underbelly. Get it wrong and you've got bad data, unacceptably high latencies, and unaffordable maintenance. Get it right and you're ready for the fun part of the job. Ignore it to focus only on the fun part and you're putting the cart before the horse.
I'm a data geek at ProtectWise, where I am building, feeding, and running a Hadoop Impala cluster for data analysis.