B4J Library [server] jElasticsearch - Search and Text Analytics

Discussion in 'B4J Libraries & Classes' started by Erel, Nov 22, 2016.

  1. Erel

    Erel Administrator Staff Member Licensed User

    The battle between a directory:

    [​IMG]

    And search:

    [​IMG]

    Has been won long ago.

    Providing a search feature is important for good user experience.
    Unfortunately it is not a simple task.
    Elasticsearch is a search framework built over Lucene. Lucene is an open source project for text search. (Note that the forum search and similar threads features are also powered by Lucene.)

    Elasticsearch simplifies many tasks required to build a search engine and adds many features over Lucene.
    Elasticsearch is quite similar to MongoDB. It is also a document store. However, the use cases and focus are different: Elasticsearch stores the data in a Lucene index and offers powerful text features.

    jElasticsearch is a wrapper for the REST client provided by Elasticsearch.
    It is expected to be used from a server solution.

    The attached example implements a simple search engine that builds an index from the B4X library files and allows searching them.

    [​IMG]

    Note that it uses the new Background Worker feature of jServer library: https://www.b4x.com/android/forum/threads/73269/#content
    It depends on jServer v2.70+.

    The worker checks for updates every 10 minutes.
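
    A minimal sketch of how such a periodic worker could look, assuming a class module named IndexUpdater and a placeholder sub UpdateIndex (both names are hypothetical; the attached example may be organized differently). It relies only on a Timer and StartMessageLoop:

    ```b4x
    'Class module: IndexUpdater (hypothetical name)
    Sub Class_Globals
        Private UpdateTimer As Timer
    End Sub

    Public Sub Initialize
        'Fire every 10 minutes.
        UpdateTimer.Initialize("UpdateTimer", 10 * DateTime.TicksPerMinute)
        UpdateTimer.Enabled = True
        UpdateIndex 'run once at startup
        StartMessageLoop 'keeps the worker thread alive so the timer can tick
    End Sub

    Private Sub UpdateTimer_Tick
        UpdateIndex
    End Sub

    Private Sub UpdateIndex
        'Placeholder: look for new or changed library files and re-index them here.
        Log("Checking for updated library files...")
    End Sub
    ```

    The worker would be registered in Main before the server starts, for example with srvr.AddBackgroundWorker("IndexUpdater"), where srvr is the Server object.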

    Building a good search engine is not simple. It requires learning the main search concepts and the many available options.
    I recommend starting with the guide: https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html
    And reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html

    Elasticsearch can be downloaded here: https://www.elastic.co/downloads/elasticsearch
    You need to start Elasticsearch (run elasticsearch.bat from its bin folder on Windows) and then run the example.

    jElasticsearch depends on several additional jars:
    www.b4x.com/b4j/files/jElasticsearch_additional.zip

    Copy them to the additional libraries folder.

    V1.10 - Fixes compatibility with ES6.0. Note that you need to download jElasticsearch_additional as well.
     

    Attached Files:

    Last edited: Dec 6, 2017 at 9:56 AM
  2. hibrid0

    hibrid0 Active Member Licensed User

    Hi, I have a server with 3 TB of documents.
    I want to build a search tool for them in B4J.
    Would this library work for searching files on a local disk?
     
  3. Erel

    Erel Administrator Staff Member Licensed User

    Yes, you can use Elasticsearch to search the documents. Indexing 3 TB of data is not a simple task. You will most probably need to use a cluster solution.

    I recommend starting with a smaller set of documents, or indexing only part of each document (titles and abstracts).
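
    As a rough sketch of indexing only the title and abstract: the index name "documents", the type "default" and the field names below are hypothetical, and the code assumes that the fourth argument of PerformRawRequest is the JSON request body. It uses JSONGenerator from the JSON library to build the payload.

    ```b4x
    'Indexes a reduced version of a document: only its title and abstract.
    Sub IndexSummary (es As ESClient, Id As String, Title As String, Abstract As String)
        Dim doc As Map = CreateMap("title": Title, "abstract": Abstract)
        Dim jg As JSONGenerator
        jg.Initialize(doc)
        'PUT <index>/<type>/<id> stores (or replaces) the document with the given id.
        'Assumption: the last parameter is sent as the request body.
        Dim res As Map = es.PerformRawRequest("PUT", "documents/default/" & Id, Null, jg.ToString).ResponseAsMap
        Log(res)
    End Sub
    ```

    The full file contents stay on disk; the index holds just enough text to find the document and point back to it by id.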
     
  4. Hugo Abrego Nava

    Hugo Abrego Nava New Member Licensed User

    Hi, I just installed Elasticsearch and it works fine on my machine. I then installed this jElasticsearch library, and the client works well for Get and a simple PerformRawRequest, but when I try to use Search I get this error:
    org.elasticsearch.client.ResponseException: GET http://127.0.0.1:9200/products/default/_search: HTTP/1.1 406 Not Acceptable
    {"error":"Content-Type header [text/plain; charset=UTF-8] is not supported","status":406}

    This is my test code:
    Code:
    Sub AppStart (Args() As String)
        Private esclient As ESClient
        Private tr As TextReader
        Private query As String
        Private qmap As Map
        Private jparse As JSONParser
        esclient.Initialize("", Array("127.0.0.1:9200"))
        Log(esclient.PerformRawRequest("GET", "_cluster/health", Null, "").ResponseAsMap)
        Log(esclient.PerformRawRequest("GET", "products/default/_search?q=name:prod1", Null, "").ResponseAsMap)
        Log(esclient.Get("products", "default", "1"))

        tr.Initialize(File.OpenInput(File.DirApp, "query.txt"))
        query = tr.ReadAll
        Log(query)
        jparse.Initialize(query)
        qmap = jparse.NextObject
        Log(qmap)
        Log(esclient.Search("products", "default", qmap).ResponseAsMap)
        esclient.Close
    End Sub
    And this is the log:
    Code:
    (MyMap) {number_of_pending_tasks=0, cluster_name=elasticsearch, active_shards=6, active_primary_shards=6, unassigned_shards=6, delayed_unassigned_shards=0, timed_out=false, relocating_shards=0, initializing_shards=0, task_max_waiting_in_queue_millis=0, number_of_data_nodes=1, number_of_in_flight_fetch=0, active_shards_percent_as_number=50.0, status=yellow, number_of_nodes=1}
    (MyMap) {_shards={total=5, failed=0, successful=5, skipped=0}, hits={hits=[{_index=products, _type=default, _source={quantity=1, name=prod1}, _id=1, _score=0.2876821}], total=1, max_score=0.2876821}, took=18, timed_out=false}
    (MyMap) {quantity=1, name=prod1}
    {"query":
      {"match":
        {"name": "prod1"}
      }
    }
    (MyMap) {query={match={name=prod1}}}
    org.elasticsearch.client.ResponseException: GET http://127.0.0.1:9200/products/default/_search: HTTP/1.1 406 Not Acceptable
    {"error":"Content-Type header [text/plain; charset=UTF-8] is not supported","status":406}
        at org.elasticsearch.client.RestClient$1.completed(RestClient.java:311)
        at org.elasticsearch.client.RestClient$1.completed(RestClient.java:300)
        at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119)
        at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)
        at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
        at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
        at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
        at java.lang.Thread.run(Thread.java:748)
    Can you help me to fix this error?
     
  5. Erel

    Erel Administrator Staff Member Licensed User

    You are probably running ES v6.0, right?

    Replace the existing jar with the one attached. It is actually elasticsearch-rest-client-6.0.0.

    Does it work?
     

    Attached Files:

  6. Hugo Abrego Nava

    Hugo Abrego Nava New Member Licensed User

    Hello, I replaced the existing jar with the attached rest-5.0.1.jar, but it doesn't work. I get the same error.
    And yes, I am running ES v6.0.
    Do I need to downgrade the version?
     
  7. Erel

    Erel Administrator Staff Member Licensed User

    Don't downgrade.

    Is the ES accessible over the internet? If so then I would be happy to try it myself.
     
  8. Hugo Abrego Nava

    Hugo Abrego Nava New Member Licensed User

    I tried to make ES reachable from the internet, but it was not possible: it does not start when I set an IP other than localhost on my VPS.
     
  9. Erel

    Erel Administrator Staff Member Licensed User

    I will install ES 6 here and try it.
     
  10. Hugo Abrego Nava

    Hugo Abrego Nava New Member Licensed User

    Hi Erel, I finally managed to start ES 6 using my IP. It is an undocumented trick: when the network is configured in the .yml file, the IP must be in brackets. If you still need it, I can send you my IP by mail.
     
  11. Erel

    Erel Administrator Staff Member Licensed User

    Sure. That will be helpful. Send me a private message.
     
  12. Erel

    Erel Administrator Staff Member Licensed User

    v1.10 is released. It fixes the issue discussed above.
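
    With the fix in place, the Search call from the test code above should return a map shaped like the URI search result in the earlier log (hits, then a list of hits, each with _id, _score and _source). A minimal sketch for walking that map, assuming the response has this shape and that the documents have a "name" field as in the test data; PrintHits is a hypothetical helper name:

    ```b4x
    'Logs the id, name and score of every hit in a search response map.
    Sub PrintHits (Response As Map)
        Dim outer As Map = Response.Get("hits")
        Dim hits As List = outer.Get("hits")
        For Each hit As Map In hits
            Dim source As Map = hit.Get("_source")
            Log(hit.Get("_id") & ": " & source.Get("name") & " (score " & hit.Get("_score") & ")")
        Next
    End Sub
    ```

    Usage: PrintHits(esclient.Search("products", "default", qmap).ResponseAsMap)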
     