Android Question [Request] Apache Tika Lib/Wrap, Extract structured text from different file types

fredo

For a future Android project, we need to extract structured text content from documents in any format.

Some parser solutions already exist here in the forum, but most of them are designed for one specific file format. What we are looking for is a solution that covers as many formats as possible.
Our research into the most suitable solution led to this product:

Apache Tika - a content analysis toolkit

The data is extracted from the source document using a parser API.
An initial search into feasibility on the Android platform showed that there is a lot of interest and that several solutions are being discussed.


Since our team lacks the expertise to implement this, we have the following questions for the community:

a) Am I right that it is possible, in principle, to use Apache Tika in Android apps?

b) Would it be possible to create a B4X wrapper for such a seemingly large package?

 