CDP Categorizer is a content categorization engine for categorizing web pages. It is a complete rewrite of the old open source version at https://twit88.com/home/opensource/textguru.

The old version is based on the Reuters Corpus. Its main problem is that it cannot learn on its own, so as the number of new web pages grows every day, its categorization is no longer accurate.

  • Subproject of: CDP