Android Question jTidy Outputs Empty XML file

mangojack

Well-Known Member
Licensed User
Longtime User
Hi, I cannot get the jTidy lib to convert a downloaded HTML file to XML.
It works with a small web page that I generated myself and host on my own server, and with a small HTML file in the Assets folder,
but all other attempts result in an empty XML file.

What am I doing wrong?

B4X:
Sub GetData 
    Okhc.Initialize("Okhc")
    req.InitializeGet("https://www.b4x.com/android/forum/")
    Okhc.Execute(req, 1) 
End Sub

Sub Okhc_ResponseSuccess (Response As OkHttpResponse, TaskId As Int)
    'Save the response body to page.html; True closes the output stream when the download ends.
    Response.GetAsynchronously("GetHTML", File.OpenOutput(File.DirDefaultExternal, "page.html", False), True, TaskId)
End Sub

Sub GetHTML_StreamFinish (Success As Boolean, TaskId As Int)
    'Do not parse a partial or missing download.
    If Success = False Then
        Log("Download failed: " & LastException.Message)
        Return
    End If
    tid.Initialize
    tid.Parse(File.OpenInput(File.DirDefaultExternal, "page.html"), File.DirDefaultExternal, "data.xml")
    sax.Initialize
    sax.Parse(File.OpenInput(File.DirDefaultExternal, "data.xml"), "sax")
End Sub

The sax.Parse line fails with: org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: no element found
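
A "no element found" error at line 1, column 0 is what the SAX parser reports when data.xml has no content at all, so the problem lies in the jTidy step rather than in the SAX step. A minimal sketch of a guard, assuming the same tid and sax globals as in the code above and a hypothetical helper name ParseIfNotEmpty (not part of the original post), could check the file size before handing the file to the parser:

```b4x
'Hypothetical helper (an assumption, not from the original post):
'skip the SAX step when jTidy wrote nothing to the output file.
Sub ParseIfNotEmpty (Dir As String, FileName As String)
    If File.Exists(Dir, FileName) = False Or File.Size(Dir, FileName) = 0 Then
        Log("jTidy produced no output for " & FileName)
        Return
    End If
    sax.Initialize
    sax.Parse(File.OpenInput(Dir, FileName), "sax")
End Sub
```

It would be called as ParseIfNotEmpty(File.DirDefaultExternal, "data.xml") right after tid.Parse. This only makes the empty-output case visible in the log; it does not make jTidy produce output.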

Many thanks and Regards
 

mangojack

I was using OkHttpUtils2 in the main project, but was working on an addition in a test project that uses OkHttp.

After changing over I am still having no success. The pages are definitely downloading, but jTidy will not convert them to an XML file.

B4X:
Sub GetData
    Dim getHTML As HttpJob
    getHTML.Initialize("", Me)
    getHTML.Download("https://www.b4x.com/android/forum/")
    Wait For (getHTML) JobDone(getHTML As HttpJob)
    If getHTML.Success Then
        Log(getHTML.GetString)

        'Save the downloaded page to page.html
        Dim out As OutputStream = File.OpenOutput(File.DirDefaultExternal, "page.html", False)
        File.Copy2(getHTML.GetInputStream, out)
        out.Close

        'Only parse when the download succeeded
        tid.Initialize
        tid.Parse(File.OpenInput(File.DirDefaultExternal, "page.html"), File.DirDefaultExternal, "data.xml")   'page.html all good
        sax.Initialize
        sax.Parse(File.OpenInput(File.DirDefaultExternal, "data.xml"), "sax")   'data.xml is empty
    End If
    getHTML.Release
End Sub


Output of Log(Job.GetString)...

 

mangojack

This works on a small test web page:
B4X:
    Dim getHTML As HttpJob
    getHTML.Initialize("", Me)
    getHTML.Download("http://icyg.net")
    Wait For (getHTML) JobDone(getHTML As HttpJob)
    If getHTML.Success Then
        'Feed the response stream straight to jTidy, then read the result with SAX
        tid.Initialize
        tid.Parse(getHTML.GetInputStream, File.DirDefaultExternal, "data.xml")
        sax.Initialize
        sax.Parse(File.OpenInput(File.DirDefaultExternal, "data.xml"), "sax")
    End If
    getHTML.Release


EDIT: After a lot of testing, getHTML.GetInputStream appears to be fine and its size changes with the URL, but jTidy still refuses to write the parsed result to the XML file.
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
I've tested it with this code:
B4X:
Sub Activity_Click
   Dim getHTML As HttpJob
   getHTML.Initialize("", Me)
   getHTML.Download("http://icyg.net")
   Wait For (getHTML) JobDone(getHTML As HttpJob)
   If getHTML.Success Then
     Dim tid As Tidy
     tid.Initialize
     tid.Parse(getHTML.GetInputStream,File.DirInternal, "data.xml")
     Log(File.ReadString(File.DirInternal, "data.xml"))
   End If
End Sub

It works properly. Are you using jTidy v1.10?
 

mangojack

@Erel: Yes, jTidy 1.10. That URL and a few others work, but not this one, for example (nor many others):

B4X:
   Dim getHTML As HttpJob
   getHTML.Initialize("", Me)
   getHTML.Download("https://www.b4x.com/android/forum/")
   Wait For (getHTML) JobDone(getHTML As HttpJob)
   If getHTML.Success Then
     Dim tid As Tidy
     tid.Initialize
     tid.Parse(getHTML.GetInputStream,File.DirInternal, "data.xml")
     Log(File.ReadString(File.DirInternal, "data.xml"))
   End If
 

mangojack

Thanks, that works.

I had read references online to setForceOutput.
As a bonus, I have been given a glimpse of how to understand and use JavaObject.
 