Category: semantic web

Natural Language Toolkit »

NLTK is a set of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux. The NLTK project began when Steven Bird was teaching CIS-530 at the University of Pennsylvania in 2001, and hired his star [...]

Calais – Annotates Content with Rich Semantic Metadata »

As quoted from the website, Calais seeks to help make all the world's content more accessible, interoperable and valuable via the automated generation of rich semantic metadata, the incorporation of user-defined metadata, the transportation of those metadata resources throughout the content ecosystem and the extension of its capabilities by user-contributed components. At the core [...]

Natural Language Processing using OpenNLP Tools »

I used tools from OpenNLP as part of my research on natural language processing and the semantic web. OpenNLP is an organizational center for open source projects related to natural language processing. It hosts a variety of Java-based NLP tools which perform sentence detection, tokenization, POS tagging, chunking and parsing, named-entity detection, and coreference using the OpenNLP [...]

Java – Open Source Social Networking Applications »

Here is the link to a list of open source social networking applications in Java. http://www.manageability.org/blog/stuff/java-open-source-social-network This is an interesting area that I am currently looking at.

Build Domain Knowledge by Extracting Keywords from DMOZ »

Download Source This is an experiment I am currently doing: extracting keywords from categories in DMOZ to see how accurately they can be used for web page categorization. As described in my previous post, I load the DMOZ categories into a database. The Perl script also generates a pipe-delimited file for me, as shown below [...]

Load DMOZ RDF Structure and Content RDF »

The DMOZ Open Directory Project is the largest human-edited directory of the web. As part of my research, I need to load the structure and content RDF into a MySQL database. At first I tried to use Jena to parse the RDF. However, the DMOZ RDF files do not conform to the standard, and [...]

Java – Build Your Semantic Web Application using Jena »

In my previous articles, I talked about GATE, Protégé, Crunch and Lucene. I used all these tools together with Jena to build a semantic web application. If you have no idea what the semantic web is, here is a good introduction. Tim Berners-Lee originally expressed the vision of the semantic web as follows: I have a [...]

Protégé – Open Source Ontology Editor and Knowledge Acquisition System »

Protégé is a free, open source ontology editor and knowledge-base framework. I used it together with GATE, Crunch, Lucene, and other tools to create my knowledge-based system with automated ontology population. As quoted from the website, Protégé is a free, open-source platform that provides a growing user community with a suite of tools to construct [...]

GATE – A General Architecture for Text Engineering »

In my previous article, Combine Crunch and Lucene for Efficient Web Page Indexing, I mentioned that I used Crunch and Lucene in one of my projects. The project aims to build a semantic knowledge framework. As part of the project, I need to do content/knowledge extraction, semantic tagging, knowledge storage and knowledge representation. I [...]