iOS Question HTML/JSOUP parser

walterf25 · May 10, 2018

Hello all, I'm currently porting a B4A app to B4i. I am using a library written by a forum member, called jsoup, to parse HTML files; it can be found here. Is there an equivalent, or any other library in B4i, that can parse downloaded HTML strings?

Currently I only need to retrieve the descriptions and names of the soccer coaches on this page: https://www.titanssoccerclub.com/coaches/
I'm already able to retrieve the images using the Regex class without a problem, but I can't figure out how to retrieve the description and the title of each section.

Is there already a library that can do this easily, or will I have to parse the whole page myself and scrape the information I need?

Thanks in advance, all. If anyone has any ideas or suggestions, please feel free to throw them my way.

Regards,
Walter

Erel · May 10, 2018

Currently there is no similar library in B4i. You will need to use regex for this.

This seems to work:

B4X:

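' Logs the heading and description of each coach section found in the page source.
' lines: the downloaded HTML, one entry per line of the page.
Sub ListPersonnel(lines As List)
   For Each line As String In lines
       ' Group 1: text of the <h2> heading. Group 2: the markup that follows it.
       Dim m As Matcher = Regex.Matcher($"<div class=\"col sqs-col-9 span-9\">.*<h2>([^<]+)</h2>(.*)</div></div></div></div>"$, line)
       If m.Find Then
           Log(m.Group(1)) ' name / title
           Log(m.Group(2)) ' description (may still contain HTML tags)
       End If
   Next
End Sub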
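
For anyone wondering how the list of lines might be produced, here is a minimal sketch. It assumes the page is downloaded with HttpJob (iHttpUtils2 library) and split on line breaks; the sub name DownloadAndList is only illustrative and is not part of the code above. Since the pattern is matched one line at a time, this only works if each coach section sits on a single line of the page source.

```b4x
' Hypothetical helper: downloads the coaches page and passes its lines to ListPersonnel.
' Assumes the iHttpUtils2 library is referenced in the project.
Sub DownloadAndList
   Dim j As HttpJob
   j.Initialize("", Me)
   j.Download("https://www.titanssoccerclub.com/coaches/")
   Wait For (j) JobDone(j As HttpJob)
   If j.Success Then
       ' Split the downloaded HTML into lines and copy them into a List.
       Dim lines As List
       lines.Initialize
       For Each s As String In Regex.Split("\r?\n", j.GetString)
           lines.Add(s)
       Next
       ListPersonnel(lines)
   Else
       Log(j.ErrorMessage)
   End If
   j.Release
End Sub
```

If the page layout changes, the pattern in ListPersonnel will most likely need adjusting, as it depends on the exact class names and on the number of closing div tags.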
