Android Question HTML Tag variations?

GuyBooth

Active Member
Licensed User
Longtime User
I'm reading responses from UPnP Events which have the following format:

B4X:
NOTIFY / HTTP/1.1
HOST: 192.168.3.121:5656
CONTENT-TYPE: text/xml;charset="utf-8"
CONTENT-LENGTH: 1707
NT: upnp:event
NTS: upnp:propchange
SID: uuid:46b5bbcb-6b62-1e7d-9c38-da25b4b15c0f
SEQ: 0

<e:property>
    <LastChange>
        <Event xmlns="urn:schemas-upnp-org:metadata-1-0/AVT/">
            <InstanceID val="0">
                <TransportState val="NO_MEDIA_PRESENT"/>
                <CurrentTrackDuration val="0:03:54"/>
            </InstanceID>
        </Event>
    </LastChange>
</e:property>

A SAX parser doesn't seem to like this, giving me the following error:
(ParseException) org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 4: not well-formed (invalid token).
I notice that from <LastChange> until </LastChange> the lines are statements (?) enclosed in < and > characters.
Is this a particular format for which a parsing tool is already available, or do I need to write my own?

Guy
 

warwound

Expert
Licensed User
Longtime User
Following on from Erel's question...

Is the response you posted the actual body of the response or have you included the additional HTTP headers?
Obviously your posted response is not valid XML.
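
To illustrate the point about headers: an XML parser has to be given only the part after the empty line that ends the HTTP headers. A minimal sketch, assuming the whole event is already in a string; ExtractBody is a hypothetical helper name, not part of any library:

```b4x
' Hypothetical helper: returns everything after the first empty line,
' i.e. the body without the NOTIFY line and the HTTP headers.
Sub ExtractBody(Response As String) As String
    ' HTTP separates headers and body with CR LF CR LF; fall back to LF LF.
    Dim sep As String = Chr(13) & Chr(10) & Chr(13) & Chr(10)
    Dim i As Int = Response.IndexOf(sep)
    If i = -1 Then
        sep = Chr(10) & Chr(10)
        i = Response.IndexOf(sep)
    End If
    ' No empty line found: assume the text is already just the body.
    If i = -1 Then Return Response
    Return Response.SubString(i + sep.Length)
End Sub
```

Note that the body as posted uses an e: prefix with no xmlns:e declaration shown, so a namespace-aware parser may still object to it even with the headers removed.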

Assuming that you can make a request and receive a response that is a valid XML document (no headers etc) then you could look at my XOM library:
https://www.b4x.com/android/forum/threads/lib-xom.23551/#content

You could use the XOMBuilder BuildFromURL method to request your XML.
The XOMBuilder BuildDone event will then be raised and passed an XOMDocument which represents the parsed XML response.
You can then use the various XOM objects and methods to obtain the values you require from the response.

Martin.
 

GuyBooth

Active Member
Licensed User
Longtime User
Erel, the part I want is the part that doesn't quite look like XML, e.g. <CurrentTrackDuration val="0:03:54"/>

Martin, what I have posted is exactly what I receive including headers, except that there are often more values than I have shown in the "body".
Building a parser to extract the information I need isn't difficult - all the items I am looking for include a "Name" followed by "val=" and the "Value" ends at "/>". The guts of the parser I have built is shown below, but I am not very familiar with XML and similar formats so I wondered whether there was already a parser available for this. I can use the one I have written. Maybe there's a more efficient way.

For each line:
B4X:
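Sub Parse_Item(Item As String)
    Dim sItem, sValue As String
    ' Only handle self-closing elements that carry a val attribute
    If Item.Contains("<") And Item.Contains("val=") And Item.Contains("/>") Then
        ' Element name: the text between "<" and the space before val=
        sItem = Item.SubString2(Item.IndexOf("<") + 1, Item.IndexOf("val=") - 1)
        ' Value: the text between the quotes that follow val=
        sValue = Item.SubString2(Item.IndexOf("val=") + 5, Item.IndexOf("/>") - 1)
        Log(sItem & ": " & sValue)
    End If
End Sub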

Thanks for your input.
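
For context, the "for each line" step could look roughly like this. It is only a sketch: Parse_Body is a hypothetical name, and Body is assumed to hold just the XML part of the event with the headers already removed.

```b4x
' Hypothetical driver: splits the body into lines and hands each one to Parse_Item.
Sub Parse_Body(Body As String)
    ' Split on LF, tolerating an optional CR in front of it
    For Each textLine As String In Regex.Split("\r?\n", Body)
        ' Trim removes the indentation before Parse_Item looks at the line
        Parse_Item(textLine.Trim)
    Next
End Sub
```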
 

sorex

Expert
Licensed User
Longtime User
If you are sure the format is always like that, you can get the time like this:

B4X:
Dim t As String
t="NOTIFY / HTTP/1.1 " & CRLF & _
"HOST: 192.168.3.121:5656" & CRLF & _
"CONTENT-Type: text/xml;charset=""utf-8"" " & CRLF & _
"CONTENT-LENGTH: 1707" & CRLF & _
"NT: upnp:event" & CRLF & _
"NTS: upnp:propchange" & CRLF & _
"SID: uuid:46b5bbcb-6b62-1e7d-9c38-da25b4b15c0f" & CRLF & _
"SEQ: 0" & CRLF & _
"" & CRLF & _
"<e:property>" & CRLF & _
"    <LastChange>" & CRLF & _
"        <Event xmlns=""urn:schemas-upnp-org:metadata-1-0/AVT/"">" & CRLF & _
"            <InstanceID val=""0"">" & CRLF & _
"                <TransportState val=""NO_MEDIA_PRESENT""/>" & CRLF & _
"                <CurrentTrackDuration val=""0:03:54""/>" & CRLF & _
"            </InstanceID>" & CRLF & _
"        </Event>" & CRLF & _
"    </LastChange>" & CRLF & _
"</e:property>"


Log (t.SubString2(t.IndexOf("CurrentTrackDuration val=")+26,t.IndexOf2("/",t.IndexOf("CurrentTrackDuration val="))-1))


It logs "0:03:54" (without the quotes).
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
Use the new smart strings literal:
B4X:
  Dim t As String = $"NOTIFY / HTTP/1.1
HOST: 192.168.3.121:5656
CONTENT-TYPE: text/xml;charset="utf-8"
CONTENT-LENGTH: 1707
NT: upnp:event
NTS: upnp:propchange
SID: uuid:46b5bbcb-6b62-1e7d-9c38-da25b4b15c0f
SEQ: 0

<e:property>
  <LastChange>
  <Event xmlns="urn:schemas-upnp-org:metadata-1-0/AVT/">
  <InstanceID val="0">
  <TransportState val="NO_MEDIA_PRESENT"/>
  <CurrentTrackDuration val="0:03:54"/>
  </InstanceID>
  </Event>
  </LastChange>
</e:property>"$
   Dim m As Matcher = Regex.Matcher($"(\w+) val=\"([^"]+)""$, t)
   Do While m.Find
     Log($"Match found: ${m.Group(1)}: ${m.Group(2)}"$)
   Loop

Output:

Match found: InstanceID: 0
Match found: TransportState: NO_MEDIA_PRESENT
Match found: CurrentTrackDuration: 0:03:54
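
If the values are needed later rather than just logged, the same pattern can fill a Map keyed by element name. This is a sketch under assumptions: ParseVals is a hypothetical helper name, and if the same element name occurs more than once (for example under several InstanceID elements), the last value wins.

```b4x
' Hypothetical helper: collects every name/val pair from the event text into a Map.
Sub ParseVals(Text As String) As Map
    Dim result As Map
    result.Initialize
    ' Same pattern as above: group 1 is the element name, group 2 the val attribute
    Dim m As Matcher = Regex.Matcher($"(\w+) val=\"([^"]+)""$, Text)
    Do While m.Find
        result.Put(m.Group(1), m.Group(2))
    Loop
    Return result
End Sub
```

A caller could then read a single value with something like ParseVals(t).Get("TransportState").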
 

GuyBooth

Active Member
Licensed User
Longtime User
Erel said: "Use the new smart strings literal:"

Yes that worked for me once I learned how to use the placeholders.
The xml format changes "<" to &lt;, ">" to &gt;, and so on. Is there a format that goes the other way, changing &lt; back to "<", for example?

Excellent support as usual, thank you Erel.
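
For reference, the reverse direction can be sketched with plain string replacement. This is not a built-in smart string format; UnescapeXml is a hypothetical helper name, and it only handles the five predefined XML entities, not numeric references such as &#60;.

```b4x
' Hypothetical helper: turns the five predefined XML entities back into characters.
Sub UnescapeXml(Text As String) As String
    Dim s As String = Text
    s = s.Replace("&lt;", "<")
    s = s.Replace("&gt;", ">")
    s = s.Replace("&quot;", QUOTE)
    s = s.Replace("&apos;", "'")
    ' &amp; goes last so that "&amp;lt;" becomes "&lt;" and not "<"
    s = s.Replace("&amp;", "&")
    Return s
End Sub
```

Replacing &amp; last is the one design choice that matters here: doing it first would unescape doubly escaped text twice.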
 