March 28, 2013 · 6:00 PM
6:00 - 6:30 PM - Food, Drinks & Networking
6:30 - 6:35 PM - Announcements
6:35 - 7:15 PM - Basic Concepts
7:15 - 7:20 PM - Break
7:20 - 8:40 PM - Main Speaker
8:40 - 8:55 PM - Door Prize Drawings
MAIN TOPIC ABSTRACT
Mining knowledge from unstructured text using the Python Natural Language Tool Kit
How to work with unstructured text and build ideas about what all that text means. Using the Python Natural Language Tool Kit, we will look at texts as bags of words, build histograms, and look at part-of-speech tagging. We will also briefly apply statistical methods to text to extract topics for document comparison and document grouping. Mark will demonstrate code that uses topic extraction to improve technical writing for an expert audience, with Google or Bing as a sounding board for your ideas.
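As a rough illustration of the bag-of-words and histogram ideas mentioned in the abstract, here is a minimal sketch using only the Python standard library. It is not the speaker's code; the sample sentence and the function names `bag_of_words` and `print_histogram` are made up for this example. NLTK's `FreqDist` class covers similar ground for real projects.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count each word; word order is discarded."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def print_histogram(counts, top=5):
    """Print a simple text histogram of the most frequent words."""
    for word, n in counts.most_common(top):
        print(f"{word:<12} {'#' * n} ({n})")


# Hypothetical sample text, chosen only to show repeated words.
sample = "the cat sat on the mat and the dog sat on the log"
counts = bag_of_words(sample)
print_histogram(counts)
```

Treating a document as a bag of words throws away grammar and order, but the resulting counts are often enough to compare or group documents.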
MAIN SPEAKER BIOGRAPHY
Mark Menkhus is a Software Engineer in HP's Software Security Response Team and uses Python for evaluating code quality.
BASIC CONCEPTS ABSTRACT
Using Python for Data Logistics
Big Data and Data Science projects looking to reduce the risks, costs, and nightmares associated with managing dozens of data feeds have discovered the ETL (Extract, Transform, Load) product category. But there's no such thing as a silver bullet, and while there are practices and lessons to be learned from ETL, the tools are mostly the legacy of early-'90s thinking, in which data feeds were fewer, the alternatives were COBOL or C, and writing code was deemed risky by DBAs and management. Ken will show how a high-level language like Python, when matched with certain practices and design patterns, can offer a very successful alternative to these diagram-driven development tools. The discussion will focus on concepts, designs, and patterns, and will include examples of successes and failures with a small amount of code.
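To make the Extract, Transform, Load idea concrete, here is a small sketch of one common pattern: each stage is a plain Python function, and generators pass rows between stages one at a time. This is an assumption about what such a design might look like, not Ken's actual code; the feed contents, the column names `name` and `amount`, and the in-memory `warehouse` list are all hypothetical.

```python
import csv
import io


def extract(source):
    """Extract: yield one dict per CSV row from a file-like object."""
    yield from csv.DictReader(source)


def transform(rows):
    """Transform: tidy names, convert amounts, and skip invalid rows."""
    for row in rows:
        try:
            amount = float(row["amount"])
        except (KeyError, ValueError, TypeError):
            continue  # a real feed would log or quarantine the reject
        name = (row.get("name") or "").strip().title()
        yield {"name": name, "amount": amount}


def load(rows, target):
    """Load: append rows to the target list and return how many were loaded."""
    count = 0
    for row in rows:
        target.append(row)
        count += 1
    return count


# Hypothetical feed: the second data row has an unparseable amount.
feed = io.StringIO("name,amount\n alice ,10.5\nbob,not-a-number\ncarol,3\n")
warehouse = []  # stands in for a real database table
loaded = load(transform(extract(feed)), warehouse)
print(loaded, warehouse)
```

Because each stage is an ordinary function, it can be unit tested on its own, and the generator chain keeps memory use flat even for large feeds.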
BASIC CONCEPTS SPEAKER BIOGRAPHY
Ken Farmer is a data architect at IBM responsible for their Security Data Warehouse, where he has used Python extensively for systems management, general data management, ETL, and analytics. He writes about data management at www.ken-far.com, and writes data analysis tools like DataGristle for fun on the side.
Website Sponsor: Homeland Security Careers
Food Sponsor: TEKSystems
Door Prize Sponsors: JetBrains - Software license (Several products to choose from)
Book Sponsor: O'Reilly Publishing - Technical books