December 4, 2012
Text analytics manager at Thomson Reuters IP&Science. Background in bioinformatics; a key challenge of interest is rapidly building a model for understanding a domain from an available corpus of information. Interested in term extraction algorithms, named entity recognition, document clustering, domain summarisation and more. Question answering systems are also of interest. Enjoy algorithms, but remember that the simplest solution is often the most elegant.
Python, NLTK and other libraries; Java and LingPipe; various commercial text analytics solutions. Some familiarity with other APIs and libraries.
Text analytics manager with a bioinformatics background. General interest in technology and a big fan of Python.
Good, looking forward to the next one!