
LingPipe – Java Libraries for Linguistic Analysis of Human Language

LingPipe is one of the Java libraries that I have tried out and found quite useful.

LingPipe is a suite of Java libraries for the linguistic analysis of human language. Its features, as quoted from the website:

  • track mentions of entities (e.g. people or proteins);
  • link entity mentions to database entries;
  • uncover relations between entities and actions;
  • classify text passages by language, character encoding, genre, topic, or sentiment;
  • correct spelling with respect to a text collection;
  • cluster documents by implicit topic and discover significant trends over time; and
  • provide part-of-speech tagging and phrase chunking.

One of the features that I like is that it can do Chinese word segmentation. Chinese is written without spaces between the words, and it is not a simple task to break Chinese into words.
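To give a feel for why this is hard, here is a minimal sketch of a naive approach: greedy "longest match first" segmentation against a word list. This is not LingPipe's API or its algorithm; the class name `MaxMatchSegmenter`, the method `segment`, and the four-word toy dictionary are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: a greedy dictionary segmenter, NOT LingPipe's API or method.
// Assumes every character fits in one Java char (true for the example below).
final class MaxMatchSegmenter {

    // Repeatedly takes the longest dictionary entry that starts at the current
    // position; a character with no dictionary match becomes a one-character word.
    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        List<String> words = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxWordLength);
            // Shrink the candidate until it is a known word or a single character.
            while (end > start + 1 && !dictionary.contains(text.substring(start, end))) {
                end--;
            }
            words.add(text.substring(start, end));
            start = end;
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical toy dictionary (Set.of needs Java 9 or later).
        Set<String> dictionary = Set.of("研究", "研究生", "生命", "起源");
        System.out.println(segment("研究生命起源", dictionary, 3));
    }
}
```

Running this prints `[研究生, 命, 起源]` ("graduate student / fate / origin"), whereas the intended reading is 研究 / 生命 / 起源 ("study the origin of life"). The greedy match grabs 研究生 first and never recovers, which is why practical segmenters need something smarter than dictionary lookup.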

The libraries are used by quite a number of commercial, academic and government institutions. They are worth a look if your research area is linguistics or semantic analysis.


1 Comment(s)

  1. rohit | Jun 16, 2008


    I studied your website's contents; they are very nice.
    But I do not understand how to use the OpenNLP tools for parsing, tagging, etc.

    Please could you send me Java code for parsing and tagging a given string using Maxent or the OpenNLP tools?

    Waiting for your reply.
