Knowledge Extraction

Raymond describes his independent Lisp research project intended to extract knowledge from the Wikipedia website.

His project combines natural language processing techniques, knowledge representation paradigms and machine learning algorithms to create a semantic model of the information contained in Wikipedia.

The talk presents an algorithm for the automatic generation of topic taxonomies and suggests how such a model can be used to implement contextually relevant web searches. In doing so, Raymond provides a brief overview of the following topics and algorithms:

  • Natural Language Processing
  • Semantic Nets
  • Similarity Metrics
  • Clustering Algorithms
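
As a rough illustration of how a similarity metric and a clustering algorithm can work together to produce a topic hierarchy, here is a minimal Python sketch. It is not code from the talk (the project itself is written in Lisp), and it makes several assumptions: each page is reduced to a plain set of terms, similarity is the Jaccard index, and clustering is single-link hierarchical agglomerative clustering. The names `jaccard`, `agglomerate` and `pages`, and the sample term sets, are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index: |A intersect B| / |A union B| (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerate(items: dict) -> list:
    """Single-link hierarchical agglomerative clustering.

    `items` maps a page name to its set of terms. Returns the merges in the
    order they happen, as (cluster_a, cluster_b, similarity) tuples. Read
    from last to first, the merges describe a simple taxonomy tree.
    """

    def link(pair):
        # Single link: similarity of the closest pair of members.
        left, right = pair
        return max(jaccard(items[x], items[y]) for x in left for y in right)

    clusters = [frozenset([name]) for name in items]
    merges = []
    while len(clusters) > 1:
        # Pick the two most similar clusters and merge them.
        best = max(combinations(clusters, 2), key=link)
        merges.append((set(best[0]), set(best[1]), link(best)))
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return merges


# Hypothetical term sets standing in for terms extracted from four pages.
pages = {
    "Lisp": {"programming", "language", "macro", "list"},
    "Scheme": {"programming", "language", "macro", "lambda"},
    "Elephant": {"mammal", "trunk", "africa"},
    "Mammoth": {"mammal", "trunk", "extinct"},
}

merges = agglomerate(pages)
for left, right, score in merges:
    print(sorted(left), "+", sorted(right), "at similarity", round(score, 2))
```

In this toy data the two programming-language pages merge first, then the two animal pages, and only at the end are the two groups joined with no terms in common. A real system would work on far richer representations than bags of terms.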


  • Heow Goodman

    Standing room only, it was definitely one of our biggest meetings at 70 people! (see the photo)

    July 16, 2012

  • Raymond de Lacaze

    It was a lot of fun. The audience was dynamic, enthusiastic and attentive and endured two hours of presentation. This is a great group to give a talk to. LispNYC rocks!

    July 12, 2012

  • Geoffrey Knauth

    This was the best talk I've attended in a very long time, well worth the 400 mile round-trip. The speaker showed he'd worked hard and long on a difficult problem, he explained it in a way that the audience understood, and the audience too was full of very smart people. This is cutting edge stuff. I've been working on a similar, but smaller project. I found Ray's approach and his results to be very interesting. He obviously put a lot of work into this.

    July 11, 2012

  • A former member

    Excellent. Fascinating overview of the field, and even more interesting presentation. Very few people go with purely structured representations nowadays, and it's really good to see NLP that actually has a sense of understanding what is going on.

    July 11, 2012

  • A former member

    A good summary of components of and techniques used in a Knowledge Extraction system.

    July 11, 2012

  • Harry French

    NLP with a great corpus. Techniques are not cutting edge.

    July 11, 2012

  • Bennett Todd

    The most exciting talk I've heard in a while. I just tried to write a summary of what he described and got all excited again.

    July 11, 2012

  • Mike S

    It was awesome! Super interesting project, really enjoyed it. As a Lisp novice I would have liked to see maybe a bit of code, but I found the talk worthwhile, nonetheless.

    July 11, 2012

  • A former member

    It was a great presentation that used propositional calculus, hypergraphs and clustering algorithms to extract data from Wikipedia. Would definitely attend another LISP meetup.

    July 11, 2012

  • A former member

    Such a great presentation. Wonderful how you use LISP, Hypercubes, clustering algorithms and propositional calculus to extract information from Wikipedia pages.

    July 10, 2012

  • A former member

    Will the slides be available?

    July 10, 2012

  • romy

    elephant hineys == parentheses!!!!!! amazing last slide!!!!

    July 10, 2012

  • romy

    hahahahha!! crawling wikipedia: "out of memory" error!!!! hahahahaha!!!

    1 · July 10, 2012

  • romy

    THIS TALK IS SO AWESOMESAUCE!!!!!!!1

    r0x0rz my b0x0rzzzzzzz!!!!!!!!!!!!!!!!!1

    1 · July 10, 2012

  • Johan Bager

    How do we get in? Ten people in lobby now

    1 · July 10, 2012

  • Raymond de Lacaze

    In terms of topics:
    Natural Language Processing (NLP)
    Lexical, Syntactic & Semantic Analysis
    Knowledge Representation (KR)
    FOPC
    Semantic Nets
    Hypergraphs
    Machine Learning (ML)
    Clustering
    K-Means Clustering
    Hierarchical Agglomerative Clustering
    Recommender Systems
    Collaborative Filtering
    Similarity Metrics
    Jaccard Index
    Pearson Correlation Coefficient
    Elephants
    Babar

    There should be a Wiki page on each of these. That would probably be the best place to start.

    2 · July 7, 2012

  • Raymond de Lacaze

    In terms of books, I will be presenting material from the following:

    Speech and Language Processing (Jurafsky and Martin)
    Artificial Intelligence (Russell and Norvig)
    Principles of Semantic Networks (Sowa)
    Machine Learning (Mitchell)
    Pattern Classification and Scene Analysis (Duda and Hart)
    Algorithms of the Intelligent Web (Marmanis and Babenko)

    1 · July 7, 2012

  • romy

    WOW wow WOW wow WOWOWOWOWOWOWO

    2 · June 21, 2012

Our Sponsors

  • NYI.net

    Colocation and hosting in downtown Manhattan
