Android Tutorial XML Parsing with the XmlSax library

It is simpler to parse XML with the Xml2Map class: https://www.b4x.com/android/forum/threads/b4x-xml2map-simple-way-to-parse-xml-documents.74848/

The XmlSax library provides an XML SAX parser.
This parser reads the stream sequentially and raises events at the beginning and end of each element.
It is up to the developer to do something useful with these events.

There are two events:
B4X:
StartElement (Uri As String, Name As String, Attributes As Attributes)
EndElement (Uri As String, Name As String, Text As StringBuilder)
StartElement is raised when an element begins. This event includes the element's attribute list.
EndElement is raised when an element ends. This event includes the element's text.

In this example we will parse the forum RSS feed. RSS is formatted using XML.
A simplified example of this RSS is:
B4X:
<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
    <channel>
        <title>Basic4ppc  / Basic4android - Android programming</title>
        <link>http://www.b4x.com/forum</link>
        <description>Basic4android - android programming and development</description>
        <ttl>60</ttl>
        <image>
            <url>http://www.b4x.com/forum/images/misc/rss.jpg</url>
            <title>Basic4ppc  / Basic4android - Android programming</title>
            <link>http://www.b4x.com/forum</link>
        </image>
        <item>
            <title>Phone library was updated - V1.10</title>
            <link>http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</link>
            <pubDate>Sun, 12 Dec 2010 09:27:38 GMT</pubDate>
            <guid isPermaLink="true">http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</guid>
        </item>
        ...MORE ITEMS HERE
    </channel>
</rss>
The first line is the XML declaration. It does not raise any event.
On the second line the StartElement event is raised with Name = "rss", and the attributes include the "version" field.
The EndElement event of the rss element is raised only on the last line: </rss>.
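As a hedged sketch of the attribute handling described above, the StartElement event of the rss element could read its version attribute like this. It assumes the attribute is not namespace-qualified, so an empty string is passed as the Uri argument of Attributes.GetValue2:
B4X:
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)
    If Name = "rss" Then
        'Assumption: "version" has no namespace, hence the empty Uri.
        Log("RSS version: " & Attributes.GetValue2("", "version"))
    End If
End Sub
GetValue2 returns an empty string when no matching attribute is found, so the result can be checked before it is used.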

We will populate a list view with all the items parsed from an offline file. When the user presses an item, we open the browser with the relevant link.
Every item represents a forum thread.

xmlsax_1.png


For each item we are interested in two values: the title and the link.
The SaxParser object includes a handy list that holds the names of all the current parent elements.
This is useful as it will help us find the "correct" 'title' and 'link' elements. The correct elements are the ones under the 'item' element.

The parsing code in this case is pretty simple:
B4X:
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)

End Sub
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    If parser.Parents.IndexOf("item") > -1 Then
        If Name = "title" Then
            Title = Text.ToString
        Else If Name = "link" Then
            Link = Text.ToString
        End If
    End If
    If Name = "item" Then
        ListView1.AddSingleLine2(Title, Link) 'add the title as the text and the link as the value
    End If
End Sub
Title and Link are global variables.
We are only using EndElement events in this program.
First we check whether we are inside an 'item' element. If this is the case, we check the actual element name and save its text if the name is 'title' or 'link'.

If the current element is 'item' it means that we are done parsing an item.
So we add the data collected to the list view.

We are using ListView.AddSingleLine2. This method receives two values. The first is the item text and the second is the value that is returned when the user clicks on this item. In this case we are storing the link as the return value.

Later we will use it to open the browser:
B4X:
Sub ListView1_ItemClick (Position As Int, Value As Object)
    StartActivity(PhoneIntents1.OpenBrowser(Value)) 'open the browser with the link
End Sub
The code that initiates the parsing is:
B4X:
    Dim in As InputStream
    in = File.OpenInput(File.DirAssets, "rss.xml") 'This file was added with the file manager.
    parser.Parse(in, "Parser") '"Parser" is the events subs prefix.
    in.Close
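For context, here is a minimal sketch of how these pieces might fit together in the activity. The declarations and the layout code are assumptions; only parser, Title, Link, ListView1 and PhoneIntents1 are names taken from the tutorial code above. PhoneIntents comes from the Phone library.
B4X:
Sub Process_Globals
    Dim parser As SaxParser
End Sub
Sub Globals
    Dim ListView1 As ListView
    Dim Title, Link As String 'filled in Parser_EndElement
    Dim PhoneIntents1 As PhoneIntents 'requires the Phone library
End Sub
Sub Activity_Create (FirstTime As Boolean)
    If FirstTime Then parser.Initialize
    'Assumption: the list view is created in code and fills the whole activity.
    ListView1.Initialize("ListView1")
    Activity.AddView(ListView1, 0, 0, 100%x, 100%y)
    Dim in As InputStream
    in = File.OpenInput(File.DirAssets, "rss.xml")
    parser.Parse(in, "Parser")
    in.Close
End Sub
The event subs (Parser_StartElement, Parser_EndElement and ListView1_ItemClick) stay as shown earlier.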
 

Attachments

  • XmlSax.zip

MikieK

Member
Licensed User
Longtime User
Sorry

Obviously I didn't take into consideration that the file may change when I unmount my SD card. Sorry for wasting your time.
 

yonson

Active Member
Licensed User
Longtime User
maximum file size for parser?

Hi, I've been using the parser to read in an XML file and write the contents to the app's database.

It's all been working fine; however, I've noticed that if the XML file goes over a critical file size, the parser fails at the line

parser.Parse2(reader,"Parser")

with 'LastException - java.io.IOException'

This happens when the size of the source XML file approaches 1 MB. I can't find anything about this limitation in the documentation, though. Can someone explain what I'm doing wrong?

Thanks
John
 

yonson

Active Member
Licensed User
Longtime User
Thanks for the prompt reply. If I understand correctly, the necessary step is to rename the file with a 'jpg' extension, and then it will be read correctly?

I saw that Erel has mentioned this on another forum.
 

taind

New Member
Hi, Erel!

I have this XML:
B4X:
<?xml version="1.0" encoding="utf-8" ?> 
 <string xmlns="http://tempuri.org/">test</string>

How do I get the string "test"?
 

Koushik

Member
Licensed User
Longtime User
List to XML

Hi Erel,

Is there any way to generate XML directly from a list of map objects?

Thanks,
Koushik
 

walterf25

Expert
Licensed User
Longtime User
XML parsing directly

Hi all, I'm new to Basic4Android. I need to figure out how to parse XML that is retrieved directly from a URI address. I've seen examples here that do this using a saved file, but I need to do it directly after receiving the data from the HTTP client. Any help will be greatly appreciated, guys.

Regards,
Walter Flores
 

AHilberink

Active Member
Licensed User
Longtime User
Trouble with dual TAG

Hello,

Using this fine library I have a little problem.

The main tag is also used within a subtag. This causes every line to appear twice in my ListView.

XML:
<?xml version="1.0" encoding="windows-1252"?>
<!--DVD Profiler Collection Export-->
<Collection>
<DVD>
<ProfileTimestamp>2008-12-26T18:18:53.000Z</ProfileTimestamp>
<ID>0044005373028.9</ID>
<MediaTypes>
<DVD>true</DVD>
<HDDVD>false</HDDVD>
<BluRay>false</BluRay>
</MediaTypes>
<UPC>0-044005-373028</UPC>
<CollectionNumber>658</CollectionNumber>
<CollectionType>Owned</CollectionType>
<Title>Het Meisje met het Rode Haar</Title>
<DistTrait/>
<OriginalTitle/>
<CountryOfOrigin>Netherlands</CountryOfOrigin>
<ProductionYear>1981</ProductionYear>
<Released>2001-09-05</Released>
<RunningTime>109</RunningTime>
<RatingSystem>Videovoorlichtingsysteem</RatingSystem>
<Rating>12</Rating>
</DVD>

Source:
B4X:
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
   If parser.Parents.IndexOf("DVD") > -1 Then
      If Name = "Title" Then
         Title = Text.ToString
      Else If Name = "Released" Then
         Link = Text.ToString
      Else If Name = "RunningTime" Then
         pubDate = Text.ToString
      End If
   End If
   If Name = "DVD" Then
      ListView1.AddSingleLine2(Title, Link) 'add the title as the text and the link as the value
   End If
End Sub
Can someone tell me how to prevent this behaviour?

Thanks,
André
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
Try this:
B4X:
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
   If parser.Parents.IndexOf("DVD") > -1 Then
      If Name = "Title" Then
         Title = Text.ToString
      Else If Name = "Released" Then
         Link = Text.ToString
      Else If Name = "RunningTime" Then
         pubDate = Text.ToString
      End If
   End If
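   'Add the item only when the outer DVD element ends; the nested <DVD> under <MediaTypes> still has a DVD parent.
   If Name = "DVD" AND parser.Parents.IndexOf("DVD") = -1 Then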
      ListView1.AddSingleLine2(Title, Link) 'add the title as the text and the link as the value
   End If
End Sub
 

ppousset

New Member
Licensed User
Longtime User
Issues Parsing XML from HTTP Response

Hello, I'm not sure where the issue is. This snippet is from a program that attempts to parse a Yahoo API response. Many posts have shown that this should work. However, I continue to get a null pointer exception:

java.lang.NullPointerException

B4X:
Sub find_Bars(lat,lon As String)
    Dim request As HttpRequest
    URL = "http://local.yahooapis.com/LocalSearchService/V3/localSearch?appid=YahooDemo"
    request.InitializeGet(URL & "&query=bar" & "&results=20" & "&radius=10" & "&latitude=42.53" & "&longitude=-83.73")
    request.Timeout = 10000
    If HttpClient1.Execute(request, 1) = False Then Return
End Sub
Sub HttpClient1_ResponseSuccess (Response As HttpResponse, TaskId As Int)
    Dim result As InputStream
    result = Response.GetInputStream
    parser.Parse(result, "Parser")
End Sub
'Sub HttpClient1_ResponseSuccess (Response As HttpResponse, TaskId As Int)
'    Dim result As String
'    result = Response.getstring("UTF8")
'    Log(result)
'End Sub

When the second sub HttpClient1_ResponseSuccess is substituted, the xml response is returned correctly.

Thanks for your help.
 

stevel05

Expert
Licensed User
Longtime User
I have not looked at parsing, so I probably can't help with that. But if you post your project (zipped), those who could help you would have a starting point without having to reinvent the wheel, and they may be able to find a problem with your code without too much work.

And you could have a solution in less time.
 

neelmon

Member
Licensed User
Longtime User
Thanks for your quick reply.

I have used the following code for testing:

B4X:
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    If Parser.Parents.IndexOf("item") > -1 Then
        Log(Name & "-->" & Text)
    End If
End Sub

After one iteration, it gives an error at the following line:

parser.Parse(resultString,"Parser")

Error Details:
org.apache.harmony.xml.ExpatParser$ParseException: At line 86, column 0: not well-formed (invalid token)

At this point LOG is showing:


title-->Senior ASP.NET/SQL Server Developer (I-10 &amp; Elliott)
link-->Senior ASP.NET/SQL Server Developer

I don't know if these details will help you to identify the problem.
 

walterf25

Expert
Licensed User
Longtime User
XML parsing

Hello guys, I was wondering if anyone here can help me figure out how to parse an XML fragment. Below is the exact fragment I'm working with. My code runs without errors, but I get nothing back.

B4X:
<result>
<rep name="Brad Sherman" party="D" state="CA" district="27" phone="202-225-5911" office="2242 Rayburn House Office Building" link="http://bradsherman.house.gov/"/>
<rep name="Howard Berman" party="D" state="CA" district="28" phone="202-225-4695" office="2221 Rayburn House Office Building" link="http://www.house.gov/berman/"/>
<rep name="Barbara Boxer" party="D" state="CA" district="Junior Seat" phone="202-224-3553" office="112 Hart Senate Office Building" link="http://boxer.senate.gov"/>
<rep name="Dianne Feinstein" party="D" state="CA" district="Senior Seat" phone="202-224-3841" office="331 Hart Senate Office Building" link="http://feinstein.senate.gov"/>
</result>

Here's the portion of my code that takes care of the parsing.
Can anyone look at it and maybe give me some pointers? I know I must be missing something. Any and all help will be greatly appreciated.

B4X:
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    Dim replistview1 As List
    replistview1.Initialize
    If parser.Parents.IndexOf("result") > -1 Then
        If Name = "rep" Then
            repname = Text.ToString
        End If
    End If
    If Name = "result" Then
        ToastMessageShow(repname, True)
    End If
End Sub

thank you guys
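A possible approach to the question above, as a hedged sketch: the rep elements carry their data in attributes and contain no text, so Text is empty when EndElement is raised for them. The attributes can be read in StartElement instead. The attribute names come from the XML above and repname is the poster's own variable; the sketch assumes the attributes are not namespace-qualified.
B4X:
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)
    If Name = "rep" Then
        'Assumption: no namespaces are used, hence the empty Uri argument.
        repname = Attributes.GetValue2("", "name")
        Log(repname & " (" & Attributes.GetValue2("", "party") & "), phone: " & Attributes.GetValue2("", "phone"))
    End If
End Sub
Because the document contains several rep elements, each value would need to be added to a list or list view inside this sub. A single repname variable only keeps the last one.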
 