B4J Library [NLP] OpenNLP - Text analysis


NLP (Natural Language Processing) is a complex task. In the past it was only accessible to data scientists and linguistics experts. With the rising popularity of machine learning, it is now accessible to all developers who have enough time and energy.
It is still not a simple process, and it does require understanding the main concepts of NLP and, preferably, machine learning.

The OpenNLP library wraps the open source Apache OpenNLP project: https://opennlp.apache.org/
The library includes the APIs to load models and use the various features. Training, when needed, should be done with the OpenNLP command line tools. They are quite convenient to use.

The currently exposed features are:
  • Language detection
  • Sentence detector
  • Tokenizer
  • Name / entity recognition
  • Document categorizer
  • Part of speech tagger
  • Lemmatizer
  • Stemmer
  • Chunker
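
To make the first items in this list more concrete, here is a minimal Java sketch of what sentence detection and tokenization produce. It does not use OpenNLP or the B4J wrapper: it relies on the JDK's rule-based java.text.BreakIterator as a stand-in, whereas OpenNLP uses trained models for the same steps. The class and method names (TextSplitSketch, sentences, tokens) are made up for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: a rule-based stand-in for model-based sentence
// detection and tokenization. Not part of OpenNLP or the B4J library.
public class TextSplitSketch {

    // Splits text into sentences with the JDK sentence iterator.
    static List<String> sentences(String text) {
        return split(BreakIterator.getSentenceInstance(Locale.ENGLISH), text, false);
    }

    // Splits text into word tokens, dropping whitespace and punctuation.
    static List<String> tokens(String text) {
        return split(BreakIterator.getWordInstance(Locale.ENGLISH), text, true);
    }

    private static List<String> split(BreakIterator it, String text, boolean wordsOnly) {
        List<String> parts = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        // Walk the boundaries; each [start, end) range is one candidate piece.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String part = text.substring(start, end).trim();
            if (part.isEmpty()) {
                continue; // whitespace between words
            }
            if (wordsOnly && !Character.isLetterOrDigit(part.charAt(0))) {
                continue; // punctuation
            }
            parts.add(part);
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "OpenNLP is a Java library. It can split text into sentences.";
        System.out.println(sentences(text));
        System.out.println(tokens(text));
    }
}
```

The later features in the list (part of speech tagging, lemmatization, chunking and so on) typically operate on the sentences and tokens produced by steps like these.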

How to get started?

- The Apache OpenNLP documentation is a must: https://opennlp.apache.org/docs/1.9.3/manual/opennlp.html
- Download the complete package: https://github.com/AnywhereSoftware/B4J-OpenNLP (zip)
- Copy OpenNLP.b4xlib and opennlp-tools-1.9.3.jar to the additional libraries folder.
- Run the OpenNLP Example project.


The library depends on Java 11+. A download link is available here: https://www.b4x.com/b4j.html

Tutorials

Movie reviews sentiment analysis: https://www.b4x.com/android/forum/threads/nlp-sentiment-analysis.133922/
 

HAH

Well-Known Member
Licensed User
Please fix: where is the Shared Files folder? Error result: The system cannot find the file specified.
B4X:
#CustomBuildAction: folders ready, %WINDIR%\System32\Robocopy.exe,"..\..\Shared Files" "..\Files"
 

Hanz

Active Member
This is like Grammarly... But I am wondering whether it is relevant for business applications, such as an accounting system. Still, it's very interesting. I hope it can be used with B4A. FB's Messenger translates messages from non-English sentences into English.
 

HAH

Well-Known Member
Licensed User
The example gives a compilation error:
B4X:
Compiling generated Java code.    Error
B4J line: 63
End Sub
javac 1.8.0_121
src\b4j\example\opennlp.java:916: error: local variable target is accessed from within inner class; needs to be declared final
                jo.setObject(target);
                             ^
1 error
 

John Naylor

Active Member
Licensed User
This is amazing! Perfectly timed too. My middle daughter is studying linguistics at university and was asking if I had any access to NLP, as she has some ideas she'd like to explore as part of her course delving into forensics and language use. This may pave the way to a nice little project together. Awesome work @Erel, thank you.
 

Roberto P.

Well-Known Member
Licensed User
Hi,
Very interesting. Do you think it can also be used to automate a content management system (KB / LMS) for technical content?
 

fredo

Well-Known Member
Licensed User
...for technical content...

This would also be a good idea for helping laymen search extensive collections of specialized knowledge.

An example would be an app that searches medical texts for possible causes based on symptoms.

Searchers are guided to helpful information based on non-professional inputs that help them find the underlying cause.

This will not be able to replace professional medical support. But it can avoid unnecessary medication, for example, if you find out that your regular headaches are related to excessive sugar consumption or that you need proper vision care.
 

Roberto P.

Well-Known Member
Licensed User
I need more information... What exactly are you trying to automate?
The search for technical answers in a database, for example procedures, user manuals or FAQs.
 

fredo

Well-Known Member
Licensed User

Erel

B4X founder
Staff member
Licensed User
NLP, by itself, is less suitable for information retrieval / search engine tasks. It can help the search engine by adding more information to the text, but if you want to build a search engine then it is best to use a dedicated search engine such as Elasticsearch.

I did create an example with Elasticsearch: https://www.b4x.com/android/forum/t...arch-search-and-text-analytics.73335/#content
It is probably outdated by now.
 