Android Question [solved] JSON encoding

ALBRECHT

Active Member
Licensed User
Hello,

I have JSON returned from an external DB table, like this:
B4X:
[{"Id":"1","Libelle":"CONFORT ET BIEN-ÊTRE","PictureId":"2428"},
{"Id":"14","Libelle":"AIDES TECHNIQUES","PictureId":"3465"},{"Id":"40","Libelle":"MOBILITE","PictureId":"3464"},{"Id":"53","Libelle":"MATERNITE","PictureId":"3466"},
{"Id":"66","Libelle":"HYGIENE","PictureId":"3452"},{"Id":"79","Libelle":"DIAGNOSTIQUE","PictureId":"4533"},{"Id":"92","Libelle":"PROTECTION","PictureId":"4534"},
{"Id":"105","Libelle":"MATERIEL D’INJECTION","PictureId":"3469"},
{"Id":"32","Libelle":"EQUIPEMENT DE CABINET","PictureId":"4535"},
{"Id":"36","Libelle":"SOINS ET PANSEMENTS","PictureId":"3437"}]

But when I read it with:
B4X:
Dim parser As JSONParser
parser.Initialize(Job.GetString)
Dim rows As List
rows = parser.NextArray
the ListView shows a special character such as ♦ in place of each accented character.

And if I use this function (found on the B4X forum):
B4X:
public Sub UnicodeEscape (s As String) As String
    Dim sb As StringBuilder
    sb.Initialize
    For i = 0 To s.Length - 1
        Dim u As String = Bit.ToHexString(Asc(s.CharAt(i)))
        sb.Append("\u")
        For i2 = 1 To 4 - u.Length
            sb.Append("0")
        Next
        sb.Append(u)
    Next
    Return sb.ToString
End Sub
it returns something like \u00xx for every character in the string.

Should I use a special B4X function to read all kinds of international characters,
or should I handle this before the JSON is created?

Thank you
Michel
 

DonManfred

Expert
Licensed User
Longtime User
Sounds like you are not using the correct charset.

ALBRECHT said: "a Json provided from an external DB Table"
Can you post a link to request some data from your external DB?

Can you upload a small project which shows the issue?
 

ALBRECHT

Active Member
Licensed User
The response is exactly what you see (with no header):

B4X:
[{"Id":"1","Libelle":"CONFORT ET BIEN-ÊTRE","PictureId":"2428"},
{"Id":"14","Libelle":"AIDES TECHNIQUES","PictureId":"3465"},
{"Id":"40","Libelle":"MOBILITE","PictureId":"3464"},
{"Id":"53","Libelle":"MATERNITE","PictureId":"3466"},
{"Id":"66","Libelle":"HYGIENE","PictureId":"3452"},
{"Id":"79","Libelle":"DIAGNOSTIQUE","PictureId":"4533"},
{"Id":"92","Libelle":"PROTECTION","PictureId":"4534"},
{"Id":"105","Libelle":"MATERIEL D’INJECTION","PictureId":"3469"},
{"Id":"32","Libelle":"EQUIPEMENT DE CABINET","PictureId":"4535"},
{"Id":"36","Libelle":"SOINS ET PANSEMENTS","PictureId":"3437"}]

It contains two special characters: the Ê in "CONFORT ET BIEN-ÊTRE" (Id 1)
and the typographic apostrophe in "D’INJECTION" (Id 105).
 

emexes

Expert
Licensed User
Did you get this going? I tried it with the following code, using the string from your first post:
B4X:
Sub Process_Globals

    Dim J As String = _
    $"[{"Id":"1","Libelle":"CONFORT ET BIEN-ÊTRE","PictureId":"2428"},"$ & _
    $"{"Id":"14","Libelle":"AIDES TECHNIQUES","PictureId":"3465"},"$ & _
    $"{"Id":"40","Libelle":"MOBILITE","PictureId":"3464"},"$ & _
    $"{"Id":"53","Libelle":"MATERNITE","PictureId":"3466"},"$ & _
    $"{"Id":"66","Libelle":"HYGIENE","PictureId":"3452"},"$ & _
    $"{"Id":"79","Libelle":"DIAGNOSTIQUE","PictureId":"4533"},"$ & _
    $"{"Id":"92","Libelle":"PROTECTION","PictureId":"4534"},"$ & _
    $"{"Id":"105","Libelle":"MATERIEL D’INJECTION","PictureId":"3469"},"$ & _
    $"{"Id":"32","Libelle":"EQUIPEMENT DE CABINET","PictureId":"4535"},"$ & _
    $"{"Id":"36","Libelle":"SOINS ET PANSEMENTS","PictureId":"3437"}]"$
  
End Sub

Sub Activity_Create(FirstTime As Boolean)

    Log(J)
  
    Dim parser As JSONParser
    parser.Initialize(J)
    Dim rows As List
    rows = parser.NextArray
  
    For I = 0 To rows.Size -1
        Dim X As Map = rows.Get(I)
        Log(I & " = " & X)
    Next
  
End Sub
and get these results:
B4X:
** Activity (main) Create, isFirst = true **
[{"Id":"1","Libelle":"CONFORT ET BIEN-ÊTRE","PictureId":"2428"},{"Id":"14","Libelle":"AIDES TECHNIQUES","PictureId":"3465"},{"Id":"40","Libelle":"MOBILITE","PictureId":"3464"},{"Id":"53","Libelle":"MATERNITE","PictureId":"3466"},{"Id":"66","Libelle":"HYGIENE","PictureId":"3452"},{"Id":"79","Libelle":"DIAGNOSTIQUE","PictureId":"4533"},{"Id":"92","Libelle":"PROTECTION","PictureId":"4534"},{"Id":"105","Libelle":"MATERIEL D’INJECTION","PictureId":"3469"},{"Id":"32","Libelle":"EQUIPEMENT DE CABINET","PictureId":"4535"},{"Id":"36","Libelle":"SOINS ET PANSEMENTS","PictureId":"3437"}]
0 = (MyMap) {Id=1, Libelle=CONFORT ET BIEN-ÊTRE, PictureId=2428}
1 = (MyMap) {Id=14, Libelle=AIDES TECHNIQUES, PictureId=3465}
2 = (MyMap) {Id=40, Libelle=MOBILITE, PictureId=3464}
3 = (MyMap) {Id=53, Libelle=MATERNITE, PictureId=3466}
4 = (MyMap) {Id=66, Libelle=HYGIENE, PictureId=3452}
5 = (MyMap) {Id=79, Libelle=DIAGNOSTIQUE, PictureId=4533}
6 = (MyMap) {Id=92, Libelle=PROTECTION, PictureId=4534}
7 = (MyMap) {Id=105, Libelle=MATERIEL D’INJECTION, PictureId=3469}
8 = (MyMap) {Id=32, Libelle=EQUIPEMENT DE CABINET, PictureId=4535}
9 = (MyMap) {Id=36, Libelle=SOINS ET PANSEMENTS, PictureId=3437}
** Activity (main) Resume **
which includes the non-ASCII characters from the original, and seems to have been parsed correctly.
 

DonManfred

Expert
Licensed User
Longtime User
emexes said: "I tried it with the following code"
In fact, you started with UTF-8 data, so the result is that it works.

The online data from the TO is probably sent in some charset other than UTF-8.

So the solution is:
1. Use GetString2 with the correct charset:
B4X:
    Wait For (j) JobDone(j As HttpJob)
    If j.Success Then
        Dim res As String = j.GetString2("charset") ' replace charset with the correct charsetname
        Log(res)
        ' Now parse the data using JsonParser
    Else
        Log(j.ErrorMessage)
    End If
    j.Release
2. Change the online tool to return UTF-8

PS: That's the reason I asked for a sample project or a URL which returns some data.
Using that URL, we could see which charset encoding is used.
 

emexes

Expert
Licensed User
DonManfred said: "In fact you started with UTF-8 Data."
I started with the Unicode string that Albrecht supplied in post #1, which appears to be before the "♦ replacing the accented characters" issue occurred. Whether it was encoded in UTF-8 or UTF-7 or UTF-16 or Windows-1250 is not relevant, as long as it decoded back to the original Unicode here. Because it is JSON, I expect that it is what is being returned from the database to B4A and thus includes the effects of any Unicode transfer encoding/decoding. But I agree that perhaps the supplied string has been obtained by some other means, and is not an exact copy of the string returned by Job.GetString.
DonManfred said: "So the result is that it is working."
It does. I posted my code and string so that Albrecht could run it using the configuration that is causing the replacement of (non-ASCII) characters. If the same processing that works ok with "my" string, fails with the same string retrieved from the database, then we can compare those two input strings to see what the cause might be.
DonManfred said: "The online data from the TO"
What is TO?
DonManfred said: "probably send the data in any other charset than UTF-8"
Perhaps, but then the string in post#1 should already have contained the replacement characters. Like you:
DonManfred said: "Thats the reason i asked for a sample project or a url which returns some data. Using this url we would see the used charset encoding."
I agree that this would be useful to see.
 

ALBRECHT

Active Member
Licensed User
Hello,

Here is my API link: https://www.materiel-medical-a-domicile.com/Asp/ListCatJson.asp

These settings are applied before the JSON is created:
response.ContentType = "application/json"
response.Charset = "UTF-8"


It returns:

[{"Id":"1","Libelle":"CONFORT ET BIEN-�TRE","PictureId":"2428"},{"Id":"14","Libelle":"AIDES TECHNIQUES","PictureId":"3465"},{"Id":"40","Libelle":"MOBILIT�","PictureId":"3464"},{"Id":"53","Libelle":"MATERNIT�","PictureId":"3466"},{"Id":"66","Libelle":"HYGIENE","PictureId":"3452"},{"Id":"79","Libelle":"DIAGNOSTIQUE","PictureId":"4533"},{"Id":"92","Libelle":"PROTECTION","PictureId":"4534"},{"Id":"105","Libelle":"MATERIEL D'INJECTION","PictureId":"3469"},{"Id":"32","Libelle":"EQUIPEMENT DE CABINET","PictureId":"4535"},{"Id":"36","Libelle":"SOINS ET PANSEMENTS","PictureId":"3437"}]

So these unknown characters are displayed as-is in the ListView: �

(the characters should be Ê and É)

What code should I use to fix this when reading the JSON into the ListView?

Thanks for your help,
Michel
 

DonManfred

Expert
Licensed User
Longtime User
ALBRECHT said: "that return"
The data does not seem to be UTF-8.

This code works for me:

B4X:
    Dim j As HttpJob
    j.Initialize("",Me)
    j.Download("https://www.materiel-medical-a-domicile.com/Asp/ListCatJson.asp")
    Wait For (j) JobDone(j As HttpJob)
    If j.Success Then
        ' Decode the body as ISO-8859-1, the charset that matches the bytes actually sent
        Dim res As String = j.GetString2("ISO-8859-1")
        Log(res)
        ' Parse the decoded text with JSONParser
        Dim parser As JSONParser
        parser.Initialize(res)
        Dim root As List = parser.NextArray
        For Each colroot As Map In root
            Dim PictureId As String = colroot.Get("PictureId")
            Dim Id As String = colroot.Get("Id")
            Dim Libelle As String = colroot.Get("Libelle")
            Log($"ID ${Id},PictureID ${PictureId}: Libelle ${Libelle}"$)
        Next
    Else
        Log(j.ErrorMessage)
    End If
    j.Release

Logger connected to: 988ad036525346515630
--------- beginning of crash
--------- beginning of main
--------- beginning of system
*** Service (starter) Create ***
** Service (starter) Start **
** Activity (main) Create, isFirst = true **
** Activity (main) Resume **
*** Service (httputils2service) Create ***
** Service (httputils2service) Start **
[{"Id":"1","Libelle":"CONFORT ET BIEN-ÊTRE","PictureId":"2428"},{"Id":"14","Libelle":"AIDES TECHNIQUES","PictureId":"3465"},{"Id":"40","Libelle":"MOBILITÉ","PictureId":"3464"},{"Id":"53","Libelle":"MATERNITÉ","PictureId":"3466"},{"Id":"66","Libelle":"HYGIENE","PictureId":"3452"},{"Id":"79","Libelle":"DIAGNOSTIQUE","PictureId":"4533"},{"Id":"92","Libelle":"PROTECTION","PictureId":"4534"},{"Id":"105","Libelle":"MATERIEL D'INJECTION","PictureId":"3469"},{"Id":"32","Libelle":"EQUIPEMENT DE CABINET","PictureId":"4535"},{"Id":"36","Libelle":"SOINS ET PANSEMENTS","PictureId":"3437"}]
ID 1,PictureID 2428: Libelle CONFORT ET BIEN-ÊTRE
ID 14,PictureID 3465: Libelle AIDES TECHNIQUES
ID 40,PictureID 3464: Libelle MOBILITÉ
ID 53,PictureID 3466: Libelle MATERNITÉ
ID 66,PictureID 3452: Libelle HYGIENE
ID 79,PictureID 4533: Libelle DIAGNOSTIQUE
ID 92,PictureID 4534: Libelle PROTECTION
ID 105,PictureID 3469: Libelle MATERIEL D'INJECTION
ID 32,PictureID 4535: Libelle EQUIPEMENT DE CABINET
ID 36,PictureID 3437: Libelle SOINS ET PANSEMENTS
** Activity (main) Pause, UserClosed = false **
** Activity (main) Resume **
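
If the server might later switch to real UTF-8 output, a tolerant decoder is one option. The following is only a sketch under stated assumptions: `DecodeBody` is a hypothetical helper name, not part of OkHttpUtils2, and ISO-8859-1 is assumed as the fallback because it worked above.

```b4x
' Hypothetical helper (not a library function): try UTF-8 first and
' fall back to ISO-8859-1 when the UTF-8 decoding produced replacement characters.
Sub DecodeBody (data() As Byte) As String
    Dim s As String = BytesToString(data, 0, data.Length, "UTF8")
    ' U+FFFD is the character a decoder substitutes for bytes that are not valid UTF-8.
    If s.Contains(Chr(0xFFFD)) Then
        ' Assumption: the only other charset this server uses is ISO-8859-1.
        s = BytesToString(data, 0, data.Length, "ISO-8859-1")
    End If
    Return s
End Sub

' Possible use inside the JobDone handling, in place of j.GetString2(...):
'     Dim res As String = DecodeBody(Bit.InputStreamToBytes(j.GetInputStream))
```

The check is a heuristic: a response that is valid UTF-8 and legitimately contains U+FFFD would be decoded a second time with the wrong charset, so fixing the server output remains the cleaner option.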
 

ALBRECHT

Active Member
Licensed User
Indeed, changing the charset works fine and the European characters are displayed correctly.

But the UTF-8 charset covers more characters,

so if I change it to:
B4X:
j.GetString2("utf-8")

the unknown characters � appear again. Why?
 

emexes

Expert
Licensed User
ALBRECHT said: "if i change by : j.GetString2("utf-8") the unknowed Chars � appear again, why ?"
I started explaining Unicode mappings, but... the simplest answer here is Manfred's, ie: what is received from the database is NOT UTF-8 encoded Unicode.

It is some 8-bit subset of Unicode, probably ISO-8859-1 or windows-1252 (a superset of ISO-8859-1).

UTF-8 encodes Unicode characters 0..127 to byte values 0..127, and Unicode characters > 127 to a multibyte sequence. These multibyte sequences are marked by the high bit being set to 1. If you decode a stream of 8-bit characters using UTF-8, then any characters that have the high bit set (eg, your accented characters) will instead be misinterpreted as a multibyte sequence and thus not decode correctly.
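
To make this concrete, here is a minimal sketch that is not from the original posts. `ShowDecodingDifference` is a made-up Sub name, and it assumes the source file itself is saved as UTF-8 so that the Ê in the literal survives. It encodes the same text both ways and then decodes the 8-bit bytes with each charset.

```b4x
' Illustrative only: ShowDecodingDifference is a hypothetical name.
Sub ShowDecodingDifference
    Dim s As String = "BIEN-ÊTRE"

    ' ISO-8859-1 stores Ê as the single byte 0xCA: 9 characters give 9 bytes.
    Dim latin1() As Byte = s.GetBytes("ISO-8859-1")
    Log(latin1.Length)   ' 9

    ' UTF-8 stores Ê as the two bytes 0xC3 0x8A: the same text gives 10 bytes.
    Dim utf8() As Byte = s.GetBytes("UTF8")
    Log(utf8.Length)     ' 10

    ' Decoding with the charset that produced the bytes restores the text.
    Log(BytesToString(latin1, 0, latin1.Length, "ISO-8859-1"))

    ' Decoding the 8-bit bytes as UTF-8: 0xCA announces a two-byte sequence,
    ' but the following byte ("T") is not a continuation byte, so the decoder
    ' typically substitutes the replacement character U+FFFD for 0xCA.
    Log(BytesToString(latin1, 0, latin1.Length, "UTF8"))
End Sub
```

The last Log line is the situation seen with GetString2("utf-8"): the bytes are fine, but they are read with a charset that did not produce them.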
 

yfleury

Active Member
Licensed User
Longtime User
Your .asp file is not saved as UTF-8, so your web server sends the file in that same charset. Save your .asp file as UTF-8.
 

ALBRECHT

Active Member
Licensed User
Hello, I have restored the initial settings on the .asp page:
B4X:
response.ContentType = "application/json"
response.Charset = "UTF-8"

The Chrome console reports: document.characterSet = "UTF-8"

But even so, the ListView still shows the unknown characters when reading the JSON, with or without Job.GetString2("UTF-8"):

B4X:
Sub JobDone (Job As HttpJob)  
    If Job.Success = True Then      
        Dim res As String = Job.GetString2("UTF-8") ' or ISO-8859-1
        Dim parser As JSONParser
        parser.Initialize(res)
        Dim rows As List
        rows = parser.NextArray

        Dim LibTxt As String
        For i = 0 To rows.Size - 1
            Dim m As Map
            m = rows.Get(i)
            LibTxt = m.Get("Libelle")
            ListView1.AddSingleLine(LibTxt & " : " & i)
        Next
    Else
        Log ("Db Error !!")
    End If  
    Job.Release
End Sub

and UTF-8 remains the better charset choice...
 

emexes

Expert
Licensed User
I am getting a header that says it is UTF-8:

[screenshot: upload_2019-7-14_1-40-7.png]

but it still looks like an 8-bit encoding:

[screenshot: upload_2019-7-14_1-42-7.png]


so right now I am inclined towards: if it ain't broke, don't fix it

What does it look like in your browser, using the various character mapping settings?
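
The same comparison can be made from B4A itself. This is a rough sketch, not a tested implementation: `InspectResponse` is a made-up name, and it assumes an OkHttpUtils2 version in which HttpJob exposes `Response` with `GetHeaders`. It logs the headers the server declares and the first raw bytes it actually sends.

```b4x
' Hypothetical diagnostic Sub: compare the declared charset with the real bytes.
Sub InspectResponse
    Dim j As HttpJob
    j.Initialize("", Me)
    j.Download("https://www.materiel-medical-a-domicile.com/Asp/ListCatJson.asp")
    Wait For (j) JobDone(j As HttpJob)
    If j.Success Then
        ' What the server claims (look for the Content-Type entry).
        ' Assumption: this OkHttpUtils2 version exposes j.Response.
        Log(j.Response.GetHeaders)

        ' What the server really sends: raw bytes, before any charset is applied.
        Dim data() As Byte = Bit.InputStreamToBytes(j.GetInputStream)
        Dim sb As StringBuilder
        sb.Initialize
        For i = 0 To Min(data.Length, 64) - 1
            ' Bit.And(..., 0xFF) turns the signed byte into a value from 0 to 255.
            Dim h As String = Bit.ToHexString(Bit.And(data(i), 0xFF))
            If h.Length = 1 Then h = "0" & h
            sb.Append(h).Append(" ")
        Next
        Log(sb.ToString)
    Else
        Log(j.ErrorMessage)
    End If
    j.Release
End Sub
```

In the hex output, the Ê of "BIEN-ÊTRE" falls within the first 64 bytes. A lone `ca` at that position means an 8-bit encoding such as ISO-8859-1, while the pair `c3 8a` would mean the body really is UTF-8.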
 

hatzisn

Well-Known Member
Licensed User
Longtime User
yfleury said: "Your .asp file is not set to utf-8, then your web server send a file to same charset. Save your .asp to utf-8"

@yfleury's answer is the correct solution. Open your ASP code with Notepad > File > Save As > (select encoding UTF-8) and you are all set.
 

ALBRECHT

Active Member
Licensed User
Hello hatzisn,

That .asp file (like the others in my API) is already saved in UTF-8:

Encoding setting: Unicode (UTF-8 without signature) - Codepage 65001
 

hatzisn

Well-Known Member
Licensed User
Longtime User
Have you saved it with notepad?
 

ALBRECHT

Active Member
Licensed User
Of course,

and as I said above, the Chrome console confirms: document.characterSet = "UTF-8"

Thanks
 