Read content from HTML to string?


Hi guys,

Is it possible to read the content of a webpage into a String variable, which could then be used for data manipulation?




As posted you can download the webpage directly using various methods.
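
For example, here is a minimal sketch of a direct download. It assumes the OkHttpUtils2 library is checked in the Libraries tab and that your B4A version supports Wait For; the Sub name DownloadPage and the URL are placeholders for illustration only:

'Hypothetical example Sub, place it in the Activity module
Sub DownloadPage
   Dim j As HttpJob
   j.Initialize("", Me)
   '   placeholder URL, replace with the page you want to read
   j.Download("https://www.example.com")
   Wait For (j) JobDone(j As HttpJob)
   If j.Success Then
      '   the whole response body as a String
      Dim Html As String = j.GetString
      Log("Downloaded " & Html.Length & " characters")
   Else
      Log("Download failed: " & j.ErrorMessage)
   End If
   j.Release
End Sub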

Alternatively you can get the HTML that's already loaded using WebViewExtras:

'Activity module
Sub Process_Globals
   'These global variables will be declared once when the application starts.
   'These variables can be accessed from all modules.

End Sub

Sub Globals
   'These global variables will be redeclared each time the activity is created.
   'These variables can only be accessed from this module.
   Dim WebViewExtras1 As WebViewExtras
   Dim WebView1 As WebView
End Sub

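Sub Activity_Create(FirstTime As Boolean)
   '   create the WebView and add it to the Activity (skip these two lines if WebView1 comes from a designer layout)
   WebView1.Initialize("WebView1")
   Activity.AddView(WebView1, 0, 0, 100%x, 100%y)
   '   add the B4A javascript interface to the WebView
   WebViewExtras1.addJavascriptInterface(WebView1, "B4A")
   '   now load a web page (placeholder URL, replace with your own)
   WebView1.LoadUrl("https://www.example.com")
End Sub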

Sub Activity_Resume

End Sub

Sub Activity_Pause (UserClosed As Boolean)

End Sub

Sub WebView1_PageFinished (Url As String)
   '   Now that the web page has loaded we can get the page content as a String
   '   see the documentation for details of the second parameter callUIThread
   Dim Javascript As String
   Javascript="B4A.CallSub('ProcessHTML', false, document.documentElement.outerHTML)"

   Log("PageFinished: "&Javascript)
   WebViewExtras1.executeJavascript(WebView1, Javascript)
End Sub

Sub ProcessHTML(Html As String)
   '   This is the Sub that we'll get the web page to send its HTML content to
   '   Log may truncate a large page, so you may not see all of the HTML in the log,
   '   but the 'Html' String should still contain all of the web page HTML
   Log("ProcessHTML: "&Html)
End Sub
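
Once the HTML is in a String you can manipulate it like any other text. As a minimal sketch, assuming the page contains a <title> element, the helper below uses the core Regex object to pull out the page title. ExtractTitle is a hypothetical name used for illustration and is not part of WebViewExtras. You could call it from ProcessHTML, for example Log(ExtractTitle(Html)).

'Hypothetical helper, not part of WebViewExtras
Sub ExtractTitle(Html As String) As String
   '   match the text between <title> and </title>, ignoring case
   Dim m As Matcher = Regex.Matcher2("<title[^>]*>([^<]*)</title>", Regex.CASE_INSENSITIVE, Html)
   If m.Find Then
      Return m.Group(1).Trim
   End If
   '   no title element found
   Return ""
End Sub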