I open sourced my Java KBtextmaster project
KBtextmaster reads a variety of document formats (Word, PowerPoint, PDF, OpenOffice.org, AbiWord) and performs categorization, summarization, part-of-speech tagging, document clustering, and indexing/search using Lucene.
You can get it here. It is released under the GPL, with alternative licenses available if the GPL does not work for your project.