
via OCAndroid: OCJUG(Java) Std Mtg 2013.11-Computational Linguistics, etc



Computational Linguistics, Machine Learning, and Text Mining with Groovy (aka Java++)


Jim White


This talk is a brief introduction to computational linguistics by way of some projects I've worked on over the last year. They demonstrate Information Extraction, Lexical Analysis (the linguistics rather than the compiler kind), an Ensemble Method for Statistical Parsing, and Corpus Construction (which includes parsing some English extracted from Javadoc).

In each case I'll give a very brief description of the motivating problem and then dive into code that deals with some part of it. We'll see how Groovy makes the most out of Java for text processing, simple web browser UIs, and cluster computing. This talk should be of interest both to those curious about what goes on in NLP and to those who would simply like to get some of their work done faster by using more powerful tools.

Technologies we'll see at work (all of which are Open Source Software):
* Stanford CoreNLP
* Semgrex
* GATE (General Architecture for Text Engineering)
* MALLET (MAchine Learning for LanguagE Toolkit)
* ERG (English Resource Grammar)
* Lucene
* Ratpack
* Gradle
* Condor


  • 6:30 - 7:00 pm - Networking
  • 7:00 - 8:30 pm - Presentation

Ray Tayek, OCJUG Co-Chair and original author of the above text, posted on:

Details derived from an OCJUG mailing list email sent by Ray Tayek 11/09/13


(OCJUG(Java) Regular Meeting's OCAndroid listing

  1. (LTNVQ2 track description=All regular meetings of (OCJUG=Orange County Java User Group http://OCJUG.Org) starting 2011.11.))
  2. (LTNW5G=We list all OCJUG events on OCAndroid because:
    1. MMIKPH: 1st: since both:
      1. (LTNVVT="native" & typical Android programming is done in Java.)
      2. (LTNVY0=Both groups serve exactly the same region: Orange County.)
    2. MMIKU1: 2nd:
      (LTNVYJ=SoCalAndroid Hack Nights, which OCAndroid re-lists, are scheduled not to be the same week as the OCJUG Regular Meeting.)
  3. (LTOOCC Occurrences in order, 3-or-more in a row, starting ideally at the last to take place=


  • Martin M.

    This presenter speaks my language ;) I have worked in the NLP (Natural Language Processing) field for many years, using some of the tools mentioned as well as others.

    November 10, 2013

3 went

  • Martin M.
    of Orange&Caltech;2012.02.15-,32+RSVP YES,23+attends; OCJUG+ Ldr, Assistant Organizer,
    Event Host
  • Dave G
    of Irvine; 2013.03.26-, 1+RSVP YES;coder
  • A former member
