Android Question: Need to get content line by line from an HTML page

chaiwatt

Member
Licensed User
Longtime User
Hi Erel, I just purchased B4A yesterday and have already fallen in love with it. I am very new to B4A :D but I have worked with VB6 for more than 10 years. My target project is about getting lines from an HTML page. I use HttpUtils2 to download the content shown below.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body><td>a,306902,52,242+077+149+510,16-01-2014,</td><br><td>a,180149,95,406+492+888+976,1-02-2014,</td><br><td>a,384245,1,074+397+521+530,16-02-2014,</td><br><td>a,906318,35,116+537+753+798,1-03-2014,</td><br><td>a,531404,79,250+305+400+904,16-03-2014,</td><br><td>a,28866,95,186+499+835+938,1-04-2014,</td><br><td>a,153406,26,013+344+355+634,16-04-2014,</td><br><td>a,103297,52,143+158+673+797,2-05-2014,</td><br><td>a,87523,20,150+505+112+246,16-05-2014,</td><br><td>a,781198,18,160+324+409+636,1-06-2014,</td><br><td>a,673920,95,140+158+576+639,16-06-2014,</td><br><td>a,378477,39,123+271+441+864,1-07-2014,</td><br><td>a,468728,45,104+117+205+944,10-07-2014,</td><br><td>a,766391,82,349+576+623+637,1-08-2014,</td><br><td>a,662842,91,187+633+639+912,16-08-2014,</td><br><td>a,856763,22,308+477+490+912,1-09-2014,</td><br><td>a,772269,35,112+257+342+790,16-09-2014,</td><br></body></html>

Next, I need to extract the strings inside the <td> tags into a ListView. I read in a related topic that I need the jTidy library, but I don't know how to proceed or even how to import jTidy into B4A. I need some advice. Thanks.
 

Beja

Expert
Licensed User
Longtime User
Hi,
Maybe you can use JSON, then parse the downloaded JSON, extract the text you want, and save it in a list.
 

DonManfred

Expert
Licensed User
Longtime User
By the way: the HTML code you've posted is not valid HTML...

The table you are using is missing the <table> and </table> tags...

It's also missing the <tr> and </tr> tags...

And <br> is not allowed between <td> tags...

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<table>
<tr><td>a,306902,52,242+077+149+510,16-01-2014,</td></tr>
<tr><td>a,180149,95,406+492+888+976,1-02-2014,</td></tr>
<tr><td>a,384245,1,074+397+521+530,16-02-2014,</td></tr>
<tr><td>a,906318,35,116+537+753+798,1-03-2014,</td></tr>
<tr><td>a,531404,79,250+305+400+904,16-03-2014,</td></tr>
<tr><td>a,28866,95,186+499+835+938,1-04-2014,</td></tr>
<tr><td>a,153406,26,013+344+355+634,16-04-2014,</td></tr>
<tr><td>a,103297,52,143+158+673+797,2-05-2014,</td></tr>
<tr><td>a,87523,20,150+505+112+246,16-05-2014,</td></tr>
<tr><td>a,781198,18,160+324+409+636,1-06-2014,</td></tr>
<tr><td>a,673920,95,140+158+576+639,16-06-2014,</td></tr>
<tr><td>a,378477,39,123+271+441+864,1-07-2014,</td></tr>
<tr><td>a,468728,45,104+117+205+944,10-07-2014,</td></tr>
<tr><td>a,766391,82,349+576+623+637,1-08-2014,</td></tr>
<tr><td>a,662842,91,187+633+639+912,16-08-2014,</td></tr>
<tr><td>a,856763,22,308+477+490+912,1-09-2014,</td></tr>
<tr><td>a,772269,35,112+257+342+790,16-09-2014,</td></tr>
</table>
</body>
</html>
 

udg

Expert
Licensed User
Longtime User
You may try with a Regex matcher.
Something like:
<td>a\,([\d\+\-]*\,){4}<\/td>

That should match each <td>...</td> element in the code you posted, i.e. the letter a followed by four comma-terminated fields made of digits, + and -.
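As a rough illustration of the regex idea, here is a minimal sketch in plain Java. It takes the pattern above and adds a capturing group around the cell text so that only the text between the tags is collected. The class name `TdExtractor` and the method `extractCells` are made up for this example. B4A's Regex object is, as far as I know, built on java.util.regex, so the same pattern should carry over to `Regex.Matcher`, but treat that as an assumption and test it in your project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of B4A or HttpUtils2.
class TdExtractor {
    // Same idea as the pattern in the post, with a capturing group (group 1)
    // around the text between <td> and </td>. The inner group is non-capturing.
    private static final Pattern CELL =
            Pattern.compile("<td>(a,(?:[\\d+\\-]*,){4})</td>");

    // Returns the text of every matching <td> cell, in document order.
    static List<String> extractCells(String html) {
        List<String> cells = new ArrayList<>();
        Matcher m = CELL.matcher(html);
        while (m.find()) {
            cells.add(m.group(1));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Two cells copied from the downloaded page, kept short for the demo.
        String html = "<body><td>a,306902,52,242+077+149+510,16-01-2014,</td><br>"
                + "<td>a,180149,95,406+492+888+976,1-02-2014,</td><br></body>";
        for (String cell : extractCells(html)) {
            System.out.println(cell);
        }
    }
}
```

Each string in the returned list can then be added to the ListView as one item. Because the pattern only looks for <td>...</td> pairs, it does not care that the <table> and <tr> tags are missing from the page.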

Anyway, Manfred already pointed out that your HTML code is not complete/correct.

udg
 