B4J Question Parsing HTML Tables

icakinser

Member
Licensed User
Longtime User
Is there any way to parse this HTML table to a CSV file?
B4X:
<h3 class="mobileH3"><i style="color:#c14e4e;margin-right: 5px;" class="fa fa-arrow-circle-down" aria-hidden="true"></i><a href="/topstocks/">Losers</a></h3>
<div id="ContentPlaceHolder1_tdLosers"><table id="tblMovers"><tr><td class="last"><div class="chgDown">-58.08%</div><div class="lastPrice">2.62</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=ICD">ICD</a><a class="company" href="/stock/?stock=ICD">Independence Contr </a><div class="volume"><span>volume:</span>4230</div></td></tr><tr><td class="last"><div class="chgDown">-28.45%</div><div class="lastPrice">2.44</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=ASTC">ASTC</a><a class="company" href="/stock/?stock=ASTC">Astrotech Corporat </a><div class="volume"><span>volume:</span>12300</div></td></tr><tr><td class="last"><div class="chgDown">-24.47%</div><div class="lastPrice">2.64</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=DZZ">DZZ</a><a class="company" href="/stock/?stock=DZZ">DB Gold Double Sho </a><div class="volume"><span>volume:</span>200</div></td></tr><tr><td class="last"><div class="chgDown">-24.29%</div><div class="lastPrice">37.50</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=KEX">KEX</a><a class="company" href="/stock/?stock=KEX">Kirby Corporation </a><div class="volume"><span>volume:</span>1</div></td></tr><tr><td class="last"><div class="chgDown">-23.42%</div><div class="lastPrice">2.06</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=AGS">AGS</a><a class="company" href="/stock/?stock=AGS">Playags Inc. 
</a><div class="volume"><span>volume:</span>25010</div></td></tr><tr><td class="last"><div class="chgDown">-20.94%</div><div class="lastPrice">6.72</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=MCFT">MCFT</a><a class="company" href="/stock/?stock=MCFT">Mastercraft Boat H </a><div class="volume"><span>volume:</span>598</div></td></tr><tr><td class="last"><div class="chgDown">-20.66%</div><div class="lastPrice">4.57</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=LOVE">LOVE</a><a class="company" href="/stock/?stock=LOVE">The Lovesac Company </a><div class="volume"><span>volume:</span>543</div></td></tr><tr><td class="last"><div class="chgDown">-20.48%</div><div class="lastPrice">3.30</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=IDN">IDN</a><a class="company" href="/stock/?stock=IDN">Intellicheck Inc. </a><div class="volume"><span>volume:</span>5150</div></td></tr><tr><td class="last"><div class="chgDown">-19.94%</div><div class="lastPrice">7.67</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=AINV">AINV</a><a class="company" href="/stock/?stock=AINV">Apollo Investment </a><div class="volume"><span>volume:</span>610</div></td></tr><tr><td class="last"><div class="chgDown">-18.18%</div><div class="lastPrice">4.50</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=BATL">BATL</a><a class="company" href="/stock/?stock=BATL">Battalion Oil Corp </a><div class="volume"><span>volume:</span>1</div></td></tr><tr><td class="last"><div class="chgDown">-14.26%</div><div class="lastPrice">271.95</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=CACC">CACC</a><a class="company" href="/stock/?stock=CACC">Credit Acceptance </a><div class="volume"><span>volume:</span>57</div></td></tr><tr><td 
class="last"><div class="chgDown">-11.70%</div><div class="lastPrice">81.00</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=CCF">CCF</a><a class="company" href="/stock/?stock=CCF">Chase Corporation </a><div class="volume"><span>volume:</span>11</div></td></tr><tr><td class="last"><div class="chgDown">-9.16%</div><div class="lastPrice">5.65</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=FSP">FSP</a><a class="company" href="/stock/?stock=FSP">Franklin Street Pr </a><div class="volume"><span>volume:</span>101</div></td></tr><tr><td class="last"><div class="chgDown">-9.15%</div><div class="lastPrice">230.00</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=COKE">COKE</a><a class="company" href="/stock/?stock=COKE">Coca-cola Consolid </a><div class="volume"><span>volume:</span>105</div></td></tr><tr><td class="last"><div class="chgDown">-8.96%</div><div class="lastPrice">356.63</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=GHC">GHC</a><a class="company" href="/stock/?stock=GHC">Graham Holdings Co </a><div class="volume"><span>volume:</span>51</div></td></tr><tr><td class="last"><div class="chgDown">-8.93%</div><div class="lastPrice">169.10</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=SIVB">SIVB</a><a class="company" href="/stock/?stock=SIVB">Svb Financial Group </a><div class="volume"><span>volume:</span>281</div></td></tr><tr><td class="last"><div class="chgDown">-7.59%</div><div class="lastPrice">474.31</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=TPL">TPL</a><a class="company" href="/stock/?stock=TPL">Texas Pacific Land </a><div class="volume"><span>volume:</span>374</div></td></tr><tr><td class="last"><div class="chgDown">-6.84%</div><div class="lastPrice">11.04</div></td><td 
class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=SDI">SDI</a><a class="company" href="/stock/?stock=SDI">Standard Diversifi </a><div class="volume"><span>volume:</span>15010</div></td></tr><tr><td class="last"><div class="chgDown">-6.83%</div><div class="lastPrice">197.80</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=RE">RE</a><a class="company" href="/stock/?stock=RE">Everest Re Group L </a><div class="volume"><span>volume:</span>408</div></td></tr><tr><td class="last"><div class="chgDown">-6.59%</div><div class="lastPrice">196.17</div></td><td class="tdSymbol" style="vertical-align:top"><a class="symbol" href="/stock/?stock=NWLI">NWLI</a><a class="company" href="/stock/?stock=NWLI">National Western L </a><div class="volume"><span>volume:</span>5</div></td></tr></table></div>
</td>
</tr>
</table>
I have tried to use jTidy and Regex with no success.
 

emexes

Expert
Licensed User
I would have thought this could be done with regex, as a multi-stage process using string arrays:

- first extract the lines <tr .. </tr>
- then for each line, extract the fields <td .. </td>
- then for each field, replace all tags < .. > with ""
- then output the fields to a CSV (or just use the data directly from the fields() string array)
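
The four steps above can be sketched as follows. This is a minimal sketch in Python rather than B4X, shown only to make the stages concrete; the function names `table_to_rows` and `rows_to_csv` are made up for illustration. One deliberate deviation: tags are replaced with a space instead of "", because in the posted table the change and the last price sit in adjacent divs and would otherwise fuse into "-58.08%2.62".

```python
import csv
import io
import re


def table_to_rows(html):
    """Return a list of rows, each row being a list of cell strings."""
    rows = []
    # Stage 1: pull out every <tr ...> ... </tr> block.
    for tr in re.findall(r"<tr\b[^>]*>(.*?)</tr>", html, flags=re.S | re.I):
        fields = []
        # Stage 2: within a row, pull out every <td ...> ... </td> cell.
        for td in re.findall(r"<td\b[^>]*>(.*?)</td>", tr, flags=re.S | re.I):
            # Stage 3: drop the remaining tags (replaced by a space, see note above),
            # then collapse runs of whitespace.
            text = re.sub(r"<[^>]*>", " ", td)
            fields.append(" ".join(text.split()))
        rows.append(fields)
    return rows


def rows_to_csv(rows):
    """Stage 4: write the fields out as CSV text, quoting where needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

With the posted table each row comes out as two fields (change plus last price, then symbol, company and volume), so a further split on spaces would be needed to get one value per column. The non-greedy patterns also assume the table contains no nested tables.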


Another way is to use just the standard B4X string functions and scan through the HTML:

- for printable text not within < >, just emit the text
- for </td, emit a comma
- for </tr, emit a newline
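
A rough sketch of that scanning approach, again in Python rather than B4X and using only plain string searches (no regex). The names `scan_table` and `to_csv` are hypothetical. Two small liberties are taken: fields are collected and then joined with commas, which avoids a trailing comma at the end of each line, and every other tag contributes a space so that values in neighbouring divs do not run together.

```python
def scan_table(html):
    """Walk the HTML once and return a list of rows (lists of cell strings)."""
    rows, row, cell = [], [], []
    i = 0
    while i < len(html):
        if html[i] == "<":
            end = html.find(">", i)
            if end == -1:
                break  # unterminated tag: stop scanning
            tag = html[i + 1:end].strip().lower()
            if tag.startswith("/td"):
                # End of a cell: store its text with whitespace collapsed.
                row.append(" ".join("".join(cell).split()))
                cell = []
            elif tag.startswith("/tr"):
                # End of a row.
                rows.append(row)
                row = []
            else:
                # Any other tag just separates pieces of text.
                cell.append(" ")
            i = end + 1
        else:
            # Printable text outside < >: copy it up to the next tag.
            nxt = html.find("<", i)
            if nxt == -1:
                nxt = len(html)
            cell.append(html[i:nxt])
            i = nxt
    return rows


def to_csv(rows):
    # No quoting is done here, so this assumes no field contains a comma.
    return "\n".join(",".join(r) for r in rows)
```

Stray closing tags outside the table, like the trailing `</td>` and `</tr>` in the posted snippet, would show up as an extra empty row, so the input should be trimmed to the table itself before scanning.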


For both methods, you might need to extract the table first, i.e., everything from <table .. to </table>
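
One possible way to cut the table out of the surrounding page, sketched in Python with plain string searches. The helper name `extract_table` is made up; it assumes the table carries an id (the posted one is `tblMovers`), that the attribute is written exactly as `id="..."` in a lower-case `<table` tag, and that the table has no nested tables.

```python
def extract_table(html, table_id):
    """Return the <table ...>...</table> text for the given id, or "" if not found."""
    marker = 'id="' + table_id + '"'
    id_pos = html.find(marker)
    if id_pos == -1:
        return ""
    # The opening tag starts at the nearest "<table" before the id attribute.
    start = html.rfind("<table", 0, id_pos)
    # The table ends at the first closing tag after it (assumes no nesting).
    end = html.find("</table>", id_pos)
    if start == -1 or end == -1:
        return ""
    return html[start:end + len("</table>")]
```

Calling `extract_table(page, "tblMovers")` on the posted snippet would leave out the heading before the table and the stray `</td>`, `</tr>` and `</table>` after it.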

If I have time tonight, I'll whip up some code.
 