B4A Library jSoup HTML Parser

Jaames · Jan 9, 2015

Great work, keep it up, and thanks for sharing this! I made a jsoup wrapper myself, but only the portion I needed, so this is a great lib that you are sharing!

TheJinJ · Jan 12, 2015

Updated to v0.13 with a few small changes.

Added some extra options to connect and a selector that returns a single string:

B4X:

html = js.selectorFirst(html, "a", "") ' returns the first matching tag as a string

or

html = js.selectorFirst(html, "a", "href") ' returns the attribute contents as a string

paragkini · Jan 17, 2015

Hi, how do I use the cleanhtml method? I tried reading jsoup.org but couldn't understand the usage.

TheJinJ · Jan 17, 2015

I haven't implemented the whitelist functionality yet, so you won't be able to use clean. If you parse the HTML, it cleans up the tags, etc.

Inman · Jan 19, 2015

Every single project I have done with B4A so far involved some level of HTML parsing. I always had to parse it by hand, until now. This library is a godsend for me. Thank you so much.

Erel · Jan 20, 2015

Inman said:
I always had to parse it by hand, until now.

Note that you could have used the jTidy library to convert the HTML to XML and then parse the XML.

Jaames · Jan 20, 2015

Erel said:
Note that you could have used the jTidy library to convert the HTML to XML and then parse the XML.

But with jsoup you can parse unformatted (messed-up) HTML without a problem, and it works great. As far as I know, it's really the best library for HTML parsing.

forestd · Jan 20, 2015

Great work! Thanks.

forestd · Jan 27, 2015

I get the error "A referenced library is missing: jsoup-1.8.1",
even though I have selected jsoup-0.13.

TheJinJ · Jan 27, 2015

forestd said:
I get the error "A referenced library is missing: jsoup-1.8.1",
even though I have selected jsoup-0.13.

You need to put the jsoup jar in your additional libraries folder.

forestd · Jan 31, 2015

TheJinJ said:
You need to put the jsoup jar in your additional libraries folder.

Thanks.

But I have a new question:

url = "https://www.b4x.com/"
Log(js.connect(url))
Log(js.connectXtra(url, "Mozilla", 0))

The app is unable to run; an error occurred.
The log said:
" at dalvik.system.NativeStart.main(Native Method)"
"android.os.NetworkOnMainThreadException"

How can I solve this error?
Thank you.

inakigarm · Jan 31, 2015

I had the same problem. The library makes a network call on the main thread, and on newer Android versions this leads to an exception. You'll have to add the snippet from this thread to your code:

https://www.b4x.com/android/forum/threads/workaround-the-networkonmainthread-exception.44760/

Erel · Feb 1, 2015

The real solution is to avoid using this library feature as it is not implemented correctly. It will cause your app to hang and after 5 seconds Android may kill it.

Download whatever you need to download with HttpUtils2.

Jaames · Feb 1, 2015

Erel said:
The real solution is to avoid using this library feature as it is not implemented correctly. It will cause your app to hang and after 5 seconds Android may kill it.

Download whatever you need to download with HttpUtils2.

What do you mean: not to use the library at all, or only not to use it for downloading HTML?
Is it safe to download the site with HttpUtils2 and then parse it with jsoup (the library from the OP)?

Erel · Feb 1, 2015

I'm not familiar with this library so I do not know whether it allows you to set the html from a string instead of a url.

Sending an HTTP request on the main thread is a bad solution.

Jaames · Feb 1, 2015

Erel said:
I'm not familiar with this library so I do not know whether it allows you to set the html from a string instead of a url

Yes, it does.

Erel said:
Sending an HTTP request on the main thread is a bad solution.

Aha... Thanks for clearing this up. I hope the author of the lib will find a solution... It's a great lib...
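
The idea discussed above (do the download off the main thread, then hand the resulting HTML string to the parser) can be sketched roughly like this in plain Java 8+. This is only an illustration of the principle, not how this library or HttpUtils2 is implemented: `BackgroundFetch` and `fetchThenParse` are made-up names, the download is a stand-in lambda rather than a real HTTP call, and the parse step is a stand-in for something like jsoup's `Jsoup.parse(html)`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper; the names are illustrative and not part of jsoup or the B4A wrapper.
final class BackgroundFetch {
    // A single worker thread that takes the blocking work away from the caller.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Runs the (blocking) download on the worker thread, then applies the parse
    // step to the downloaded HTML string on that same worker thread.
    // The calling thread gets a future back immediately and is not blocked.
    <T> CompletableFuture<T> fetchThenParse(Supplier<String> download, Function<String, T> parse) {
        return CompletableFuture.supplyAsync(() -> parse.apply(download.get()), worker);
    }

    // Stops the worker thread so the JVM can exit.
    void shutdown() {
        worker.shutdown();
    }
}

final class BackgroundFetchDemo {
    public static void main(String[] args) {
        BackgroundFetch fetcher = new BackgroundFetch();
        String title = fetcher.fetchThenParse(
                () -> "<html><title>Demo</title></html>",                  // stand-in for a real download
                html -> html.replaceAll(".*<title>(.*)</title>.*", "$1")) // stand-in for a real parser
            // join() is used only so this console demo waits for the result;
            // a UI app would register a callback instead of blocking.
            .join();
        System.out.println(title);
        fetcher.shutdown();
    }
}
```

In a B4A app the download part would presumably be handled by HttpUtils2, as suggested above, with only the returned string passed on to the parser.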

forestd · Feb 2, 2015

inakigarm said:
I had the same problem. The library makes a network call on the main thread, and on newer Android versions this leads to an exception. You'll have to add the snippet from this thread to your code:

https://www.b4x.com/android/forum/threads/workaround-the-networkonmainthread-exception.44760/

As you said, it now works normally.
Thank you very much.

Jaames · Feb 2, 2015

It would be great if this lib could be used in a chained way, like jsoup itself in Java:

B4X:

  Document doc = Jsoup.connect("http://www.example.com/view.jsp")
              .data("Field1", Integer.toString(Field1Mode.getValue()))
              .data("Field2", Field2Name)
              .header("Accept-Language", "en")
              .post();

The B4A equivalent could look like this:

B4X:

Dim doc as jSoupDocument = jSoup.connect("www.example.com/view.jsp?") _
                 .data1("Field1", Field1Mode) _
                 .data2("Field2",Field2Name) _
                 .header("Accept-Language", "en") _
                 .post

I know it is possible, since I have seen some libs done this way in B4A, but how?
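
For what it's worth, chaining like this usually comes down to each method returning the object it was called on. Below is a minimal sketch in plain Java, under the assumption that a wrapper class could be written the same way. `RequestBuilder`, `describe` and the other names are hypothetical and are not part of jsoup or of this library, and nothing is actually sent over the network.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical request builder; illustrative only, not jsoup's or the wrapper's API.
final class RequestBuilder {
    private final String url;
    private final Map<String, String> data = new LinkedHashMap<>();
    private final Map<String, String> headers = new LinkedHashMap<>();

    private RequestBuilder(String url) {
        this.url = url;
    }

    // Entry point, mirroring the shape of a connect(url) call.
    static RequestBuilder connect(String url) {
        return new RequestBuilder(url);
    }

    // Each setter stores its argument and returns this builder,
    // which is what allows the next call to be chained onto it.
    RequestBuilder data(String key, String value) {
        data.put(key, value);
        return this;
    }

    RequestBuilder header(String name, String value) {
        headers.put(name, value);
        return this;
    }

    // Terminal step. A real wrapper would send the request here;
    // this sketch only reports what would be sent.
    String describe() {
        return "POST " + url + " data=" + data + " headers=" + headers;
    }
}

final class RequestBuilderDemo {
    public static void main(String[] args) {
        String summary = RequestBuilder.connect("http://www.example.com/view.jsp")
                .data("Field1", "1")
                .data("Field2", "name")
                .header("Accept-Language", "en")
                .describe();
        System.out.println(summary);
    }
}
```

Since B4A allows calls to be chained on a returned object, a library class built this way should in principle be usable in the style shown above; whether this particular wrapper adopts it is up to its author.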

TheJinJ · Feb 18, 2015

Fixed an error I had with getting elements by tag and attribute. I will do a bit more work on this and try to get it closer to complete.
v0.15 is attached to the first post.
