mahout
Entity Extraction/Recognition with free tools while feeding Lucene Index
The problem you are facing in the ‘wicket’ example is called entity disambiguation, not entity extraction/recognition (NER). NER can be useful, but only when the categories are specific enough. Most NER systems don’t have enough granularity to distinguish between a sport and a software project (both types would fall outside the typically recognized types: person, …
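To make the distinction concrete, here is a minimal sketch of context-based disambiguation, loosely in the spirit of a simplified Lesk overlap score. It is not the method from the answer above. The sense inventory `SENSES`, its cue words, and the function name `disambiguate` are all hypothetical and chosen only for illustration; a real system would draw senses from a knowledge base.

```python
import re

# Hypothetical sense inventory: cue words are illustrative, not from a real knowledge base.
SENSES = {
    "cricket (sport)": {"bowler", "batsman", "stumps", "match", "innings"},
    "Apache Wicket (software)": {"java", "framework", "component", "web", "apache"},
}


def disambiguate(text, senses=SENSES):
    """Return the sense whose cue words overlap most with the text, or None if nothing overlaps."""
    # Lowercase and keep alphabetic tokens only, so punctuation does not block a match.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    # Score each candidate sense by how many of its cue words appear in the context.
    scores = {sense: len(tokens & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Calling `disambiguate("Wicket is a Java web framework from Apache.")` picks the software sense because four cue words match, while a bare `"wicket"` with no context returns `None`. That is the point of the distinction: the surface form alone does not decide the entity, the surrounding words do.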
What is the difference between Apache Mahout and Apache Spark’s MLlib?
The main difference comes from the underlying frameworks: for Mahout it is Hadoop MapReduce, and for MLlib it is Spark. More specifically, it comes from the difference in per-job overhead. If your ML algorithm maps to a single MR job, the main difference will be only startup overhead, which …
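The per-job overhead argument can be sketched as simple arithmetic. The model below is an assumption for illustration only: `total_time` and `overhead_gap` are hypothetical helpers, and the overhead figures used in the usage note are made-up placeholders, not measurements of Hadoop or Spark.

```python
def total_time(n_jobs, per_job_overhead, compute_per_job):
    """Toy cost model (assumption): every job pays a fixed overhead plus its compute time."""
    return n_jobs * (per_job_overhead + compute_per_job)


def overhead_gap(n_jobs, overhead_a, overhead_b, compute_per_job):
    """Difference in total time between two frameworks that differ only in per-job overhead."""
    return (total_time(n_jobs, overhead_a, compute_per_job)
            - total_time(n_jobs, overhead_b, compute_per_job))
```

With placeholder overheads of 30 and 1 time units and equal compute of 100 units per job, `overhead_gap(1, 30.0, 1.0, 100.0)` is 29.0 for a single job, and `overhead_gap(10, 30.0, 1.0, 100.0)` is 290.0 for ten jobs. Under this model, the gap for a single-job algorithm is just the one-time overhead difference, and it grows linearly with the number of jobs.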