NLP (Natural Language Processing) is a complex task. In the past it was only accessible to data scientists and linguistics experts. With the rising popularity of machine learning, it is now accessible to any developer willing to invest enough time and energy.
It is still not a simple process: it requires understanding the main concepts of NLP and, preferably, of machine learning.
The OpenNLP library wraps the open source Apache OpenNLP project: https://opennlp.apache.org/
The library includes the APIs to load models and use the various features. Training, when needed, should be done with the OpenNLP command line tools, which are quite convenient to use.
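
As an illustration of the command line training workflow, here is a minimal Java sketch that prepares a training file for the document categorizer. It assumes the default format described in the Apache OpenNLP manual: one document per line, with the category first, then whitespace, then the text. The class `DoccatTrainingData`, its methods, the sample reviews and the file name `en-doccat.train` are hypothetical and are not part of OpenNLP or of this library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP or the B4X library.
// It only writes a plain-text training file for the command line trainer.
class DoccatTrainingData {

    // Builds one training line: category, a single space, then the text.
    // Line breaks and repeated whitespace inside the text are collapsed,
    // because a newline would start a new training sample.
    static String toTrainingLine(String category, String text) {
        String cat = category.trim();
        if (cat.isEmpty() || cat.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("category must be a single token");
        }
        String cleaned = text.trim().replaceAll("\\s+", " ");
        return cat + " " + cleaned;
    }

    // Converts {category, text} pairs into training lines, keeping their order.
    static List<String> toTrainingLines(String[][] samples) {
        List<String> lines = new ArrayList<>();
        for (String[] sample : samples) {
            lines.add(toTrainingLine(sample[0], sample[1]));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Made-up samples, for illustration only.
        String[][] samples = {
            {"positive", "A wonderful film,\nI enjoyed every minute."},
            {"negative", "Dull plot   and flat characters."}
        };
        Path out = Path.of("en-doccat.train"); // assumed file name
        Files.write(out, toTrainingLines(samples), StandardCharsets.UTF_8);
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```

With such a file in place, the manual shows training with a command along the lines of `opennlp DoccatTrainer -model en-doccat.bin -lang en -data en-doccat.train -encoding UTF-8`. A real model needs far more samples than the two used here.
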
The currently exposed features are:
- Language detection
- Sentence detector
- Tokenizer
- Named entity recognition
- Document categorizer
- Part of speech tagger
- Lemmatizer
- Stemmer
- Chunker
Getting started:
- The Apache OpenNLP documentation is a must: https://opennlp.apache.org/docs/1.9.3/manual/opennlp.html
- Download the complete package: https://github.com/AnywhereSoftware/B4J-OpenNLP (zip)
- Copy OpenNLP.b4xlib and opennlp-tools-1.9.3.jar to the additional libraries folder.
- Run the OpenNLP Example project.
The library requires Java 11+. A download link is available here: https://www.b4x.com/b4j.html
Tutorials
Movie reviews sentiment analysis: https://www.b4x.com/android/forum/threads/nlp-sentiment-analysis.133922/