Android Question [Solved] How to use MiniHtmlParser to parse an HTML table?

AnandGupta

Expert
Licensed User
Longtime User
I have an HTML table like the one below, received from a company URL:
B4X:
<body>
<table>
<tr>
<th> name</th>
<th>age</th>
<th>sex</th>
</tr>
<tr>
<td>Jack</td>
<td>16</td>
<td>M</td>
</tr>
<tr>
<td>Jill</td>
<td>15</td>
<td>F</td>
</tr>
</table>
</body>

Until now I have been using .IndexOf() and .SubString2() without any complaints.

But after reading further, I understand that my logic will fail if the table, th or td tags are not well-formed.
So I tried using MiniHtmlParser, but I am unable to wrap my head around it. I searched the forum but did not find a similar question. I also read the "B4X-Pleroma-master" code, but still could not make it work.

Below is my code,
B4X:
    Dim html As String = $"
<body>
<table>
<tr>
<th> name</th>
<th>age</th>
<th>sex</th>
</tr>
<tr>
<td>Jack</td>
<td>16</td>
<td>M</td>
</tr>
<tr>
<td>Jill</td>
<td>15</td>
<td>F</td>
</tr>
</table>
</body>"$

    Dim parser As MiniHtmlParser
    parser.Initialize
    Dim root As HtmlNode = parser.Parse(html)
    Dim p As HtmlNode = parser.FindNode(root, "table", Null)
    Log(parser.GetTextFromNode(p, 0))
    Dim p1 As HtmlNode = parser.FindNode(p, "th", Null)
    Log(parser.GetTextFromNode(p1, 0))
    Dim p1 As HtmlNode = parser.FindNode(p, "th", Null)
    Log(parser.GetTextFromNode(p1, 0))

But I am unable to get all the th values and then the respective td values for each row.

I want to extract,
name,age,sex

and then

Jack,16,M
Jill,15,F

Any guidance on this is appreciated.

Regards,

Anand
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
B4X:
Sub AppStart (Args() As String)
    Dim s As String = $"
    <body>
<table>
<tr>
<th>name</th>
<th>age</th>
<th>sex</th>
</tr>
<tr>
<td>Jack</td>
<td>16</td>
<td>M</td>
</tr>
<tr>
<td>Jill</td>
<td>15</td>
<td>F</td>
</tr>
</table>
</body>"$
    Dim parser As MiniHtmlParser
    parser.Initialize
    Dim root As HtmlNode = parser.Parse(s)
    'Find the <table> node under the parsed root.
    Dim table As HtmlNode = parser.FindNode(root, "table", Null)
    'Each <tr> that is a direct child of the table is one row.
    Dim trs As List = parser.FindDirectNodes(table, "tr", Null)
    For i = 0 To trs.Size - 1
        'The first row holds the header cells (<th>); all other rows hold data cells (<td>).
        Dim childtag As String
        If i = 0 Then childtag = "th" Else childtag = "td"
        Dim tds As List = parser.FindDirectNodes(trs.Get(i), childtag, Null)
        Log("*****")
        'Log the text of every cell in this row.
        For Each td As HtmlNode In tds
            Log(parser.GetTextFromNode(td, 0))
        Next
    Next
End Sub
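
To get each row as a single comma-separated line, as the question asks, the inner loop could be wrapped in a small helper. This is only a sketch: RowToCsv is a hypothetical name, not part of MiniHtmlParser, and it assumes every cell's text is its first text child, as in the code above. The Trim call is there because the original table has a leading space in "<th> name</th>".

```b4x
'Hypothetical helper (not part of MiniHtmlParser):
'joins the text of one row's direct child cells with commas.
Sub RowToCsv (parser As MiniHtmlParser, row As HtmlNode, childtag As String) As String
    Dim sb As StringBuilder
    sb.Initialize
    Dim cells As List = parser.FindDirectNodes(row, childtag, Null)
    For i = 0 To cells.Size - 1
        'Put a comma before every cell except the first one.
        If i > 0 Then sb.Append(",")
        Dim cell As HtmlNode = cells.Get(i)
        'Trim removes stray whitespace around the cell text.
        sb.Append(parser.GetTextFromNode(cell, 0).Trim)
    Next
    Return sb.ToString
End Sub
```

Inside the For i loop, Log(RowToCsv(parser, trs.Get(i), childtag)) would then log one line per row instead of one line per cell.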
 

AnandGupta

Expert
Licensed User
Longtime User
Perfect!

Checked in both B4J and B4A, and it works smoothly. We are all grateful to you, @Erel, for MiniHtmlParser, which makes parsing HTML tags so easy.
Thanks a lot.

Regards,

Anand
 