Android Question Parsing text that contains invalid characters

Addo

Well-Known Member
Licensed User
I am trying to receive bytes, convert them to a string, and parse it.

The bytes are a string list that is saved to a MemoryStream and sent to a B4A AsyncStreams client socket.

The NewData code looks like this:

B4X:
Public Sub NewData (data() As Byte)
 
Dim msg As String

msg = BytesToString(data, 0, data.Length, "UTF8")

Dim param As String = msg
Dim paramnum() As String = Regex.Split("\~", param)

Log(paramnum(0))

End Sub

The text sent from the server looks like this:

url1~
url2~
Url3~

and so on. After the bytes are received and converted to a string, however, some invalid characters are inserted that break the parsing, like this:

B4X:
url1~
������������
url2~
Url3~

I don't know where these characters come from after the BytesToString conversion.

Now when I try to read a parsed entry, for example paramnum(2), I get an exception.

How can I get rid of these invalid characters after the conversion?

I have tried msg = msg.Trim, with no luck.

I also tried to find out whether a hidden character could be breaking the parsing, so I ran the text through an online hidden-character checker and got this result:

B4X:
URL1~

URL2~
URL3~
URL4~
URL5~
 

MarkusR

Well-Known Member
Licensed User
Longtime User
The example used "UTF-8". I don't know if this makes a difference, but you used "UTF8" above.
 

MarkusR

Well-Known Member
Licensed User
Longtime User
I meant that in B4A I just saw it written with a hyphen ("UTF-8"). Your code above is without it.
 

DonManfred

Expert
Licensed User
Longtime User
No, he means that you use "UTF8", while the charset is usually named "UTF-8".
It would be helpful if you posted the received data as hex. If it contains the bytes 0xEF, 0xBB, 0xBF, then it is probably a BOM header.

Can you upload a text file with the data you are transferring?
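
A minimal sketch of logging the received bytes as hex, assuming the ByteConverter library is referenced in the project and that these lines run inside the NewData sub from the first post, where data is the received byte array:

B4X:
'Assumption: the ByteConverter library is checked in the Libraries tab.
Dim bc As ByteConverter
'Logs every received byte as two hex digits.
Log(bc.HexFromBytes(data))

A UTF-8 BOM would show up in the logged output as the byte sequence EF BB BF.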
 

DonManfred

Expert
Licensed User
Longtime User
I did not ask for the code you are using.
Write the data to a text file instead of the stream you are sending, or write the data to a stream that points to a file.
Upload this file.

All I can say is that the data looks like it has a byte order mark (BOM), which it should NOT have.
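
If the data does turn out to contain a BOM, one possible workaround on the B4A side is to remove it after decoding. This is only a sketch: it assumes the stray bytes really are a UTF-8 BOM, which decodes to the single character U+FEFF, and it reuses the msg and data variables from the first post.

B4X:
msg = BytesToString(data, 0, data.Length, "UTF8")
'A decoded UTF-8 BOM is the character U+FEFF. Trim does not remove it,
'so it is replaced explicitly, wherever it appears in the text.
msg = msg.Replace(Chr(0xFEFF), "")

The cleaner fix is still to stop the server from writing the BOM in the first place.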
 

Addo

Well-Known Member
Licensed User
I have tried this, with no luck:

B4X:
msg = BytesToString(data, 0, data.Length, "UTF-8")
msg = msg.Trim
msg = msg.Replace(CRLF, "")
msg = msg.Replace(Chr(10), "")
msg = msg.Replace(Chr(13), "")
msg = msg.Replace(Chr(127), "")
 

klaus

Expert
Licensed User
Longtime User
The code below works with your file:

B4X:
Private txt, split() As String

txt = File.ReadString(File.DirAssets, "mss.txt")
Log (txt)
split = Regex.Split("~" & Chr(13) & Chr(10), txt)
For i = 0 To split.Length - 1
    Log(split(i))
Next
In the file you sent, there are no other 'invalid' characters like the ones in the text you showed in your first post.
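
A slightly more tolerant variant, sketched under the assumption that every entry ends with "~" and that the line breaks between entries may or may not be present. It splits on "~" only, trims each piece, skips the empty ones, and checks the count before reading an entry, which avoids the exception on paramnum(2) described in the first post. The names parts and urls are made up for this example; txt is the string read above.

B4X:
Dim parts() As String = Regex.Split("~", txt)
Dim urls As List
urls.Initialize
For Each p As String In parts
    'Trim removes the CR/LF left over around each entry.
    Dim s As String = p.Trim
    If s.Length > 0 Then urls.Add(s)
Next
'Check the size before reading an entry by index.
If urls.Size > 2 Then Log(urls.Get(2))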
 

Addo

Well-Known Member
Licensed User
Because that was pure text from the server side; it was not processed with BytesToString.
I am still trying to figure out how BytesBuilder works in this case.
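
For reference, a rough sketch of how B4XBytesBuilder could be used here. It rests on several assumptions: AsyncStreams is not in prefix mode, so one message may arrive split across several NewData calls; every entry ends with "~"; the B4XCollections library is referenced; and bb is declared as "Private bb As B4XBytesBuilder" and initialized with bb.Initialize before the stream starts.

B4X:
Public Sub NewData (data() As Byte)
    'Collect the new chunk; an entry may be split across several chunks.
    bb.Append(data)
    Dim delimiter() As Byte = "~".GetBytes("UTF8")
    Dim i As Int = bb.IndexOf(delimiter)
    Do While i > -1
        'Remove returns the removed bytes: one entry plus its "~" delimiter.
        Dim entry() As Byte = bb.Remove(0, i + delimiter.Length)
        'Decode only the complete entry, without the delimiter.
        Dim s As String = BytesToString(entry, 0, entry.Length - delimiter.Length, "UTF8").Trim
        If s.Length > 0 Then Log(s)
        i = bb.IndexOf(delimiter)
    Loop
End Sub

Decoding only complete entries means a multi-byte character that is cut in half by a chunk boundary is not converted until the rest of it has arrived.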

 