B4J Library [B4X] Xml2Map - Simple way to parse XML documents

Discussion in 'B4J Libraries & Classes' started by Erel, Jan 4, 2017.

Thread Status:
Not open for further replies.
  1. Erel

    Erel Administrator Staff Member Licensed User

    Nobody likes to parse XML.

    Parsing JSON is simple and fun. Parsing XML is tedious and boring.

    That is the reason behind the Xml2Map class. It internally parses the XML document and returns a Map with the parsed data. It is similar to parsing JSON.
    Tip: You can use this tool to help you with parsing JSON: https://b4x.com:51041/json/index.html

    So instead of the code explained in the old tutorial: https://www.b4x.com/android/forum/threads/xml-parsing-with-the-xmlsax-library.6866/#content

    We can achieve the same thing with this code:
    Code:
    Sub Process_Globals
       Private ParsedData As Map
    End Sub

    Sub Globals
       Private ListView1 As ListView
    End Sub

    Sub Activity_Create(FirstTime As Boolean)
       If FirstTime Then
          Dim xm As Xml2Map
          xm.Initialize
          ParsedData = xm.Parse(File.ReadString(File.DirAssets, "rss.xml"))
       End If
       Activity.LoadLayout("1")
       ListView1.SingleLineLayout.ItemHeight = 60dip
       Dim rss As Map = ParsedData.Get("rss")
       Dim channel As Map = rss.Get("channel")
       Dim items As List = channel.Get("item")
       For Each item As Map In items
          Dim title As String = item.Get("title")
          Dim link As String = item.Get("link")
          ListView1.AddSingleLine2(title, link)
       Next
    End Sub

    Sub ListView1_ItemClick (Position As Int, Value As Object)
       Dim pi As PhoneIntents
       StartActivity(pi.OpenBrowser(Value))
    End Sub
    You can use the JSON library to convert the Map to a JSON string. This is useful for understanding how to access the data:
    Code:
    Dim jg As JSONGenerator
    jg.Initialize(ParsedData)
    Log(jg.ToPrettyString(4))
    The result in this case will look like:
    Note that attributes are added under the Attributes key. In such cases the text will be available under the Text key.
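    A minimal sketch of reading such an element, assuming a hypothetical item whose link element carries an href attribute, for example <link href="https://example.com/1">Read more</link>. The element, the attribute name and the LogLink sub are illustrative and not taken from rss.xml. Under that assumption the element is parsed to a Map instead of a plain string:

```b4x
'Hypothetical input: <item><link href="https://example.com/1">Read more</link></item>
Sub LogLink (item As Map)
   'An element with attributes is returned as a Map, not as a String.
   Dim link As Map = item.Get("link")
   'The attributes are grouped under the Attributes key.
   Dim attributes As Map = link.Get("Attributes")
   Dim href As String = attributes.Get("href")
   'The element text is stored under the Text key.
   Dim text As String = link.Get("Text")
   Log(href & ": " & text)
End Sub
```

    Logging the parsed map with JSONGenerator first is the easiest way to confirm which elements follow this layout.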

    This module is compatible with B4A, B4J and B4i.

    It depends on the XmlSax library (an internal library that is included in the IDE).


    Edit (October 2017):

    Common pitfall


    Consider this xml:
    Code:
    <root>
     <book>
       <title>Book 1</title>
     </book>
     <book>
       <title>Book 2</title>
     </book>
    </root>
    There could be any number of book elements.
    You can parse it with:
    Code:
    Dim root As Map = ParsedData.Get("root")
    For Each book As Map In root.Get("book")
       Dim title As String = book.Get("title")
    Next
    However this code will fail in two cases:
    1. There is only one book in the xml so root.Get("book") will return a Map instead of a List.
    2. There are no books at all so root.Get("book") will return Null.

    To solve this issue you can use this sub:
    Code:
    Sub GetElements (m As Map, key As String) As List
       Dim res As List
       If m.ContainsKey(key) = False Then
          res.Initialize
          Return res
       Else
          Dim value As Object = m.Get(key)
          If value Is List Then Return value
          res.Initialize
          res.Add(value)
          Return res
       End If
    End Sub
    It will return a list in all cases.
    You can safely use it with:
    Code:
    Dim root As Map = ParsedData.Get("root")
    For Each book As Map In GetElements(root, "book")
       Dim title As String = book.Get("title")
    Next

    Map2Xml - New class!

    Map2Xml converts the map created with Xml2Map to an XML string. It uses the XmlBuilder library and is compatible with B4A, B4i and B4J.
    It can be useful to modify existing XML documents. You read the document with Xml2Map, make the changes in the returned map and write it back with Map2Xml.
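    A rough sketch of that read, modify and write cycle. It assumes a small document such as <root><title>Old</title></root>, and it assumes that Map2Xml exposes a method named MapToXml that returns the XML string. Treat the method name as an assumption and check the class source for the exact name. ChangeTitle is a hypothetical helper:

```b4x
'Assumed input: <root><title>Old</title></root>
Sub ChangeTitle (xml As String) As String
   'Read the document into a Map.
   Dim xm As Xml2Map
   xm.Initialize
   Dim parsed As Map = xm.Parse(xml)
   'Make the change in the returned map.
   Dim root As Map = parsed.Get("root")
   root.Put("title", "New")
   'Write the map back to an XML string.
   Dim m2x As Map2Xml
   m2x.Initialize
   Return m2x.MapToXml(parsed) 'MapToXml: assumed method name
End Sub
```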

    The two classes are packed as a b4x lib. If you are using an older version that doesn't support b4x libs then unzip it and copy the modules to your project.
     


    Last edited: Jan 10, 2019
  2. Mahares

    Mahares Well Known Member Licensed User

    Where do you download the XmlSax library? I always seem to have trouble finding the download link for libraries, even with a forum search. You are always sent to the library documentation, but not to the library download itself.
     
  3. DonManfred

    DonManfred Expert Licensed User

  4. Mahares

    Mahares Well Known Member Licensed User

    I already checked both before posting. There is no library there, only B4A code. Did you verify yourself that there was a library there?
     
  5. DonManfred

    DonManfred Expert Licensed User

    Ohh. No, I didn't...
     
    Last edited: Jan 5, 2017
  6. Erel

    Erel Administrator Staff Member Licensed User

    It should never take more than a single search to find a library if it is in the "additional libraries" section.

    XmlSax / jXmlSax / iXmlSax are not in the forum because they are included in the IDE package (internal libraries).
     
  7. Mahares

    Mahares Well Known Member Licensed User

    When I saw that you mentioned that it depends on the XmlSax lib, I automatically thought it was part of the additional libraries folder. Otherwise, it is a moot point, and you did not have to mention it if it is an internal library.
    Thank you
     
  8. Erel

    Erel Administrator Staff Member Licensed User

    It is not a moot point as developers will need to select it after adding the module to their project. I've edited the post and clarified this point.
     
  9. corwin42

    corwin42 Expert Licensed User

    Unfortunately it is still synchronous because it depends on XmlSax, so it is only usable for relatively small XML data. For larger XML files it blocks the UI thread for too long. An asynchronous XmlSax library would be nice.
     
    Last edited: Jan 5, 2017
  10. Erel

    Erel Administrator Staff Member Licensed User

    It is indeed synchronous. What is the size of the XML document you are parsing? Can you upload it?
     
  11. corwin42

    corwin42 Expert Licensed User

    The maximum size of the uncompressed file is about 170kB.
    The example file is generated from this url.

    Because parsing with XmlSax takes a few seconds, I do the whole step of uncompressing the data, parsing the XML and saving the result into a SQLite database in a separate thread with the Threading library. Unfortunately this breaks the debugger, so I can't use it with the app anymore.

    I have thought about converting the code to warwound's XOM library to see if it works better, but I haven't had time to do the conversion yet.
     


  12. Erel

    Erel Administrator Staff Member Licensed User

    It takes 125ms to parse it in release mode (300ms in debug):
    Code:
    Dim xm As Xml2Map
    xm.Initialize
    Dim n As Long = DateTime.Now
    ParsedData = xm.Parse(File.ReadString(File.DirAssets, "data.xml"))
    Log(DateTime.Now - n)
    Inserting into SQLite is not relevant to this thread. However, make sure to create a single transaction, and if that is not fast enough, you can use SQL.ExecNonQueryBatch to insert the data in the background.
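    A minimal sketch of both options in B4A, assuming an already initialized SQL object and a hypothetical table created as items (title TEXT, link TEXT). The table, the sub names and the "SQL" event name are illustrative only:

```b4x
'Option 1: synchronous insert wrapped in a single transaction.
Sub InsertItems (sql1 As SQL, items As List)
   sql1.BeginTransaction
   Try
      For Each item As Map In items
         sql1.ExecNonQuery2("INSERT INTO items VALUES (?, ?)", _
            Array As Object(item.Get("title"), item.Get("link")))
      Next
      'Without this call the transaction is rolled back by EndTransaction.
      sql1.TransactionSuccessful
   Catch
      Log(LastException)
   End Try
   sql1.EndTransaction
End Sub

'Option 2: queue the statements and run them in the background.
Sub InsertItemsAsync (sql1 As SQL, items As List)
   For Each item As Map In items
      sql1.AddNonQueryToBatch("INSERT INTO items VALUES (?, ?)", _
         Array As Object(item.Get("title"), item.Get("link")))
   Next
   'Raises SQL_NonQueryComplete when the batch finishes.
   sql1.ExecNonQueryBatch("SQL")
End Sub

Sub SQL_NonQueryComplete (Success As Boolean)
   Log("Batch insert completed: " & Success)
End Sub
```

    The batch variant keeps the main thread free while the rows are written. The transaction variant is simpler when the insert is already fast enough.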
     
  13. corwin42

    corwin42 Expert Licensed User

    Hmm, on which device? I remember it took several seconds in the past.

    Erel said:
    "Inserting to SQLite is not relevant to this thread however make sure to create a single transaction and if it is not fast enough then you can use SQL.ExecNonQueryBatch to insert it in the background."
    Yes, I know. I think I will have to do some refactoring again.

    If I remember correctly, one of the main reasons I handle all of this in its own thread was that everything was done in the background by a service, and the UI was not very fluid when parsing and updating the database were also handled on the UI thread.
     
  14. Erel

    Erel Administrator Staff Member Licensed User

    Nexus 5X. Test it on your device. It should be fast.

    SQL already supports asynchronous operations. They will not affect the main thread.
     
  15. samikinikar

    samikinikar Member Licensed User

    My XML file

    Code:
    <id>1770</id>
    <image>http://Cityonline.com/custom/domain_1/image_files/sitemgr_photo_4681.png</image>
    <thumb>http://Cityonline.com/custom/domain_1/image_files/sitemgr_photo_4682.png</thumb>
    <updated>2017-01-26 16:41:19</updated>
    <entered>2017-01-26 16:41:08</entered>
    <renewal_date>0000-00-00</renewal_date>
    <title>City | Cheque in Kannada dishonoured, Customer drags bank to court</title>
    <seo_title>City | Cheque in Kannada dishonoured, Customer drags bank to court</seo_title>
    <friendly_url>City-cheque-in-kannada-dishonoured-customer-drags-bank-to-court</friendly_url>
    <author>www.dummyurl.com</author>
    <author_url></author_url>
    <publication_date>2017-01-26</publication_date>
    <abstract>A customer has dragged ICCI bank to court after his cheque was dishonoured on grounds that the information on it was written in Kannada.</abstract>
    <seo_abstract>A customer has dragged ICCI bank to court after his cheque was dishonoured on grounds that the information on it was written in Kannada.</seo_abstract>
    <keywords>City neews || news of City || belagavi || news news || news about City || City news || news baout City</keywords>
    <seo_keywords>City neews, news of City, belagavi, news news, news about City, City news, news baout City</seo_keywords>
    <content>&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;span style=&quot;font-size: small; font-family: verdana, geneva;&quot;&gt;City | Belagavi&lt;/span&gt;&lt;/p&gt;
    &lt;p style=&quot;text-align: justify;&quot;&gt;&lt;span style=&quot;font-size: small; font-family: verdana, geneva;&quot;&gt;A customer has dragged ICCI bank to court after his cheque was dishonoured on grounds that the information on it was written in Kannada.&lt;br /&gt;&lt;/span&gt;&lt;lt;br /&gt;&lt;span style=&quot;font-size: small; font-family: verdana, geneva;&quot;&gt;Anand Diwakar Garag has filed a case, alleging lack of service, with the district consumer redressal court in Belagavi. In November, Garag presented a cheque for Rs 17,220 to the Life Insurance Corporation of India (LIC), as premium for his insurance policy. The LIC handed the cheque over to Corporation Bank, which handles its accounts. However, when the cheque was presented to ICICI for payment, it was returned with a note &quot;present with document&quot;.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-size: small; font-family: verdana, geneva;&quot;&gt;Before he filed the case, Garag sought clarification from both ICICI bank and LIC as to why his cheque had been dishonoured. However, neither furnished a satisfactory explanation. Garag told TOI that he made Hescom payment in cheques, wherein all details were filled in Kannada. &quot;My bank told me that the reason they dishonoured my cheque was because the details were filled in Kannada. Also, in another incident that occurred after this one, ICICI bank dishonoured a cheque I had given to a private firm,&quot; he said. The consumer redressal court has issued notices to LIC, ICICI and Corporation Bank in connection with the case, which will be heard on February 28.&lt;/span&gt;&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;&lt;strong&gt;&lt;em&gt;&lt;strong&gt;Image is for representation only&lt;br /&gt;&lt;/strong&gt;&lt;/em&gt;&lt;em&gt;&lt;strong&gt;Source :TOI&lt;/strong&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;</content>
    <status>Active</status>
    <suspended_sitemgr>n</suspended_sitemgr>
    <level>article</level>
    <number_views>735</number_views>
    <avg_review>0</avg_review>
    </article_info>
    I have no issues retrieving data from tags other than the <content> tag. The <content> tag contains HTML tags, so the data is not fetched and there are no results in the list view.
    Can someone please help me get the contents of the <content> tag, without the HTML tags, using this XML parser?
     
  16. Erel

    Erel Administrator Staff Member Licensed User

    Can you upload an XML file as an example?
     
  17. samikinikar

    samikinikar Member Licensed User

    Attached is the xml file
     

    Attached Files: test.xml (53.1 KB)
  18. Erel

    Erel Administrator Staff Member Licensed User

    I don't see anything special here.
    Code:
    Dim xm As Xml2Map
    xm.Initialize
    Dim root As Map = xm.Parse(File.ReadString(File.DirAssets, "test.xml"))
    Dim m As Map = root.Get("eDirectoryData")
    m = m.Get("ObjectData")
    Dim entries As List = m.Get("entry")
    For Each m As Map In entries
       Log("********************")
       Log(m.Get("articleContent"))
    Next
     
  19. samikinikar

    samikinikar Member Licensed User

    Thank you for the update. Yes, without removing any formatting it retrieves the content, but if I remove the HTML tags and fetch the contents, it displays the following error. Attached is the formatted XML file (format.xml).

    Error :
    ** Activity (main) Create, isFirst = true **
    Error occurred on line: 84 (xml2map)
    org.apache.harmony.xml.ExpatParser$ParseException: At line 5, column 0: undefined entity
    at org.apache.harmony.xml.ExpatParser.parseFragment(ExpatParser.java:515)
    at org.apache.harmony.xml.ExpatParser.parseDocument(ExpatParser.java:474)
    at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:316)
    at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:279)
    at anywheresoftware.b4a.objects.SaxParser.parse(SaxParser.java:80)
    at anywheresoftware.b4a.objects.SaxParser.Parse(SaxParser.java:73)
    at anywheresoftware.b4a.samples.xmlsax.xml2map._parse2(xml2map.java:252)
    at anywheresoftware.b4a.samples.xmlsax.xml2map._parse(xml2map.java:90)
    at java.lang.reflect.Method.invoke(Native Method)
    at java.lang.reflect.Method.invoke(Method.java:372)
    at anywheresoftware.b4a.shell.Shell.runMethod(Shell.java:708)
    at anywheresoftware.b4a.shell.Shell.raiseEventImpl(Shell.java:340)
    at anywheresoftware.b4a.shell.Shell.raiseEvent(Shell.java:247)
    at java.lang.reflect.Method.invoke(Native Method)
    at java.lang.reflect.Method.invoke(Method.java:372)
    at anywheresoftware.b4a.ShellBA.raiseEvent2(ShellBA.java:134)
    at anywheresoftware.b4a.samples.xmlsax.main.afterFirstLayout(main.java:102)
    at anywheresoftware.b4a.samples.xmlsax.main.access$000(main.java:17)
    at anywheresoftware.b4a.samples.xmlsax.main$WaitForLayout.run(main.java:80)
    at android.os.Handler.handleCallback(Handler.java:815)
    at android.os.Handler.dispatchMessage(Handler.java:104)
    at android.os.Looper.loop(Looper.java:194)
    at android.app.ActivityThread.main(ActivityThread.java:5651)
    at java.lang.reflect.Method.invoke(Native Method)
    at java.lang.reflect.Method.invoke(Method.java:372)
    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:959)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:754)
     


  20. Erel

    Erel Administrator Staff Member Licensed User

    Please start a new thread and upload a complete project that demonstrates the error. I did test it with the file that you uploaded and didn't encounter any issue.
    Also make sure that the XML is valid: https://www.xmlvalidation.com/
     