B4J Question How can I parse HTML code with unlimited nodes with unlimited child nodes to a treeview? [solved]

Mashiane

Expert
Licensed User
Longtime User
Hi there

I need some help parsing HTML code into a tree. The main point is to capture the structure and some properties of each element. As an example, the 'form' below has several child elements:

B4X:
<form id="el47" class="clearfix">
<div id="el125" class="form-group has-success has-feedback">
<label id="el126" for="el127" class="form-control-label text-success">
Something cool</label>
<input id="el127" type="text" aria-describedby="el328" class="form-control form-control-sm form-control-success">
<span id="el329" class="fa fa-check form-control-feedback">
</span>
<span id="el328">
(Success)</span>
</div>
<button id="el129" type="submit" class="btn btn-primary">
Submit</button>
<p id="el327" class="form-control-static">
[email protected]</p>
</form>

I've written some JTidy code, including the SAX class, but I'm stuck on how to continue:

B4X:
If File.Exists(File.DirApp,"temp.xml") Then File.Delete(File.DirApp,"temp.xml")
    'save the contents first
    File.WriteString(File.DirTemp,"temp.html",sText)
    'parse with jtidy to convert to xml
    tid.Initialize
    'ensure it shows the output
    Dim jo As JavaObject = tid
    jo.GetFieldJO("tidy").RunMethod("setForceOutput", Array(True))
    'parse the Html page and create a new xml document.
    tid.Parse(File.OpenInput(File.DirTemp, "temp.html"), File.DirApp, "temp.xml")
    'does the file exist, then parse it
    Dim ParsedData As Map
    ParsedData.Initialize
    If File.Exists(File.DirApp,"temp.xml") Then
        Dim xm As Xml2Map
        xm.Initialize
        ParsedData = xm.Parse(File.ReadString(File.dirapp, "temp.xml"))
    End If
    If ParsedData.ContainsKey("html") Then
        Dim html As Map = ParsedData.Get("html")
        Dim body As Map = html.Get("body")
        
        ''''
    End If

Issue: I can't convert this to JSON with the JSONParser because the map is not returned in the same way it was saved. Is there a way to do this, please?

Thanks...
 

clarionero

Active Member
Licensed User
Longtime User
Hi. Do you know JSoup? It's better and easier to use than JTidy.

http://www.jsoup.org

and a wrapper already exists.

Rubén
 

DonManfred

Expert
Licensed User
Longtime User
The Xml2Map class is also worth considering.
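
As a rough illustration of what working with the Xml2Map result can look like, here is a minimal sketch that walks the parsed structure recursively and logs it with indentation. It assumes the result is built from nested Maps, Lists (for repeated tags) and plain values; WalkParsed is a hypothetical helper name, not part of the class.

B4X:
'Hypothetical helper: logs the parsed structure, one line per entry.
'Assumes values are Maps, Lists (repeated tags) or plain values.
Private Sub WalkParsed(Key As String, Value As Object, Depth As Int)
    'build the indentation for the current depth
    Dim indent As String = ""
    For i = 1 To Depth
        indent = indent & "  "
    Next
    If Value Is Map Then
        Log(indent & Key)
        Dim m As Map = Value
        For Each k As String In m.Keys
            WalkParsed(k, m.Get(k), Depth + 1)
        Next
    Else If Value Is List Then
        'repeated tags: visit each entry under the same key
        Dim l As List = Value
        For Each item As Object In l
            WalkParsed(Key, item, Depth)
        Next
    Else
        Log(indent & Key & " = " & Value)
    End If
End Sub

It could be called as WalkParsed("root", ParsedData, 0) once xm.Parse has returned.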
 

Mashiane

Expert
Licensed User
Longtime User
Thanks guys, I found it. My app is already using Xml2Map, and it does exactly what I need. I just had to understand how it works with the loops.

I exposed the 'elements' property by declaring it as Public and then added the code below. I'm starting from the body of the HTML document, which does not have a !doctype.

B4X:
'stub
    Dim xml As XmlElement = xm.elements.Get(0)
    'html
    Dim xml1 As XmlElement = xml.Children.Get(0)
    'body (clear the parent tree item)
    cogTI.Children.clear
    Dim xml2 As XmlElement = xml1.Children.Get(1)
    For Each xmlChild As XmlElement In xml2.Children
        Dim htmlNode As TreeItem
        htmlNode.Initialize("treeProject",xmlChild.Name)
        htmlNode.Image = cogImg
        htmlNode.Expanded = True
        cogTI.Children.Add(htmlNode)
        'add the children
        XML2TreeView(xmlChild,htmlNode)
    Next
End Sub

'This function is called recursively until all nodes are loaded
Private Sub XML2TreeView(xmlNode As XmlElement, treeNode As TreeItem)
    For Each xmlChild As XmlElement In xmlNode.children
        Dim htmlNode As TreeItem
        htmlNode.Initialize("treeProject",xmlChild.Name)
        htmlNode.Image = cogImg
        htmlNode.Expanded = True
        treeNode.Children.Add(htmlNode)
        XML2TreeView(xmlChild,htmlNode)
    Next
End Sub
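
Since the original goal was to capture some properties as well as the structure, the node text could also include an attribute such as the id. This is only a sketch: NodeLabel is a hypothetical helper, and it assumes XmlElement exposes an Attributes map that may be left uninitialized when a tag has no attributes.

B4X:
'Hypothetical helper: returns a label such as "div#el125" for an element.
'Assumes XmlElement has an Attributes map (possibly uninitialized).
Private Sub NodeLabel(el As XmlElement) As String
    Dim label As String = el.Name
    If el.Attributes.IsInitialized Then
        If el.Attributes.ContainsKey("id") Then
            label = label & "#" & el.Attributes.Get("id")
        End If
    End If
    Return label
End Sub

With that in place, htmlNode.Initialize("treeProject", NodeLabel(xmlChild)) could be used instead of passing xmlChild.Name directly.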
 