Android Question utf-8 from email body

rosippc64a

Active Member
Licensed User
Longtime User
Hi All!
I am reading emails with MailParser. I received an email that is encoded as UTF-8.
upload_2018-3-31_20-0-49.png

This is a multi-part message in MIME format.
--------------2AA4A1963CC8BFB384908ACE
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

LÁT6ÁR
-----------------------------------------
This is in a string variable. I save it into an SQLite database. When I read it back from the database, the string looks the same, not the way I would expect a normal UTF-8 string to look, like this:
------------------------------------------------------------------
LÁT6ÁR
a Láthatár segédeszköz bolt AKCIÓS újsága
------------------------------------------------------------------
How can I convert the string into real UTF-8?
Thanks in advance.
 

Attachments

  • upload_2018-3-31_20-0-5.png

Erel

B4X founder
Staff member
Licensed User
Longtime User
Programmers should never use notepad. Use a decent text editor such as Notepad++ that allows you to see the encoding.

"How can I convert the string into a real utf-8?"
Once a string is loaded, no encoding is involved any more. Encoding is only relevant when converting bytes to a string and vice versa.
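
To illustrate the point, here is a minimal sketch. The sub name is hypothetical and it assumes the ByteConverter library is referenced. The same string yields different byte sequences depending on the encoding chosen at the moment of conversion; the string itself carries no encoding.

```b4x
' Hypothetical helper, for illustration only.
Sub ShowBytesPerEncoding
    Dim bc As ByteConverter
    Dim s As String = "á"   ' one character, U+00E1
    ' The encoding is chosen only when the string is turned into bytes.
    Log(bc.HexFromBytes(s.GetBytes("UTF-8")))       ' two bytes: C3 A1
    Log(bc.HexFromBytes(s.GetBytes("ISO-8859-1")))  ' one byte: E1
    ' Decoding with the same encoding that produced the bytes gives the same string back.
    Dim utf8() As Byte = s.GetBytes("UTF-8")
    Log(BytesToString(utf8, 0, utf8.Length, "UTF-8") = s)
End Sub
```

If the bytes are decoded with a different encoding than the one that produced them, the resulting string contains different characters, and that is the only place where the damage can happen.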
 

rosippc64a

Thank you for the quick reply. But then why, when I set edittext1.text = this_string, do the correct characters not appear?
It should show "LÁT6ÁR", not "HĂ­rlevĂ©l". It is as if Android didn't know that it is a UTF-8 string.
 

rosippc64a

Interesting.
In the screenshot there is a character whose UTF-8 code is C3 A1. This is 'á'.
I also found it in a UTF-8 character table:
á c3 a1 LATIN SMALL LETTER A WITH ACUTE
OK, so my text is UTF-8, but Android thinks it isn't.
I can convert this text to a correct string with this:
B4X:
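        Dim bc As ByteConverter
        ' Start from the parsed body so that non-UTF-8 messages pass through unchanged.
        Dim sBody As String = Body
        If Headers.ToUpperCase.Contains("UTF-8") Then
            ' Map each character back to one byte (ISO-8859-1), then decode those bytes as UTF-8.
            sBody = bc.StringFromBytes(Body.GetBytes("iso-8859-1"), "UTF-8")
        End If
        Msg.Bodytxt = sBody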
So was this text UTF-8 or ISO-8859-1?

When I view this text in Windows, I have to set the character encoding to UTF-8 to see the correct characters.
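
One way to read this result: the bytes of the message were UTF-8, but somewhere before the body reached the code they appear to have been decoded with a single-byte charset, so each UTF-8 byte became its own character. The sketch below reproduces that on a known string and then applies the same repair. It assumes the mis-decoding used ISO-8859-1, which the thread does not confirm, and the sub name is hypothetical.

```b4x
' Hypothetical helper, for illustration only.
Sub ReproduceAndRepair
    Dim bc As ByteConverter
    Dim original As String = "Láthatár"
    Dim raw() As Byte = original.GetBytes("UTF-8")   ' the bytes as sent in the email
    ' Wrong step: decode UTF-8 bytes as ISO-8859-1. Each byte becomes one character.
    Dim garbled As String = bc.StringFromBytes(raw, "ISO-8859-1")
    Log(garbled.Length)   ' longer than original.Length: every "á" became two characters
    ' Repair: map the characters back to the same bytes, then decode them as UTF-8.
    Dim repaired As String = bc.StringFromBytes(garbled.GetBytes("ISO-8859-1"), "UTF-8")
    Log(repaired = original)
End Sub
```

Under that assumption the answer to the question is "both": the bytes were UTF-8, and the string held those bytes misread as ISO-8859-1 characters.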
 

Erel

Dim sBody As String = bc.StringFromBytes(Body.GetBytes("iso-8859-1"),"UTF-8")
This line can cause some of the characters to get corrupted.
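
A possible reason for this warning, sketched under assumptions: the round trip is only lossless when the string really consists of UTF-8 bytes misread as ISO-8859-1. If the body was already decoded correctly, characters outside ISO-8859-1 (such as the Hungarian ő, U+0151) cannot be encoded, and the remaining accented letters turn into bytes that are not valid UTF-8. The sub name is hypothetical.

```b4x
' Hypothetical helper, for illustration only.
Sub RepairOnCorrectString
    Dim bc As ByteConverter
    Dim alreadyCorrect As String = "őá"   ' a correctly decoded string
    ' "ő" has no ISO-8859-1 byte, and "á" becomes the single byte E1,
    ' which is not a valid UTF-8 sequence on its own.
    Dim damaged As String = bc.StringFromBytes(alreadyCorrect.GetBytes("ISO-8859-1"), "UTF-8")
    Log(damaged = alreadyCorrect)   ' the original text is not recovered
End Sub
```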

You should save the data to a text file and upload it to the forum.

"As if the android wouldn't know that it is an utf-8 string?"
The OS doesn't know anything about the encoding. It uses the encoding that you specify or the default UTF-8 encoding.
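
As a sketch of "the encoding that you specify": when the undecoded bytes of a body are available, the charset can be taken from the Content-Type header instead of being guessed. Both subs below are hypothetical helpers, not part of MailParser, and the fallback to UTF-8 is an assumption.

```b4x
' Hypothetical helper: returns the charset named in a Content-Type header,
' or "UTF-8" if none is found.
Sub CharsetFromHeader(ContentType As String) As String
    Dim m As Matcher = Regex.Matcher2("charset\s*=\s*""?([^\s;""]+)", Regex.CASE_INSENSITIVE, ContentType)
    If m.Find Then Return m.Group(1)
    Return "UTF-8"
End Sub

' Hypothetical usage, assuming the raw bytes of the body are available.
Sub DecodeBody(RawBody() As Byte, ContentType As String) As String
    Return BytesToString(RawBody, 0, RawBody.Length, CharsetFromHeader(ContentType))
End Sub
```

For the header in the first post, "text/plain; charset=UTF-8; format=flowed", CharsetFromHeader returns "UTF-8".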
 

rosippc64a

Here is the message saved directly from Gmail (in Windows). The file signature that can be seen at the beginning of the file isn't relevant for MailParser, because it received the text as an ordinary string.
 

Attachments

  • original_msg.txt

rosippc64a

Yes, I can see that too. I tried, but in MailParser there is no such file, only a string containing that content. I write it and read it back, and all characters stay as in the original, which Android can't show as normal UTF-8 text. I will keep the previous solution until there is a better one.
Thank you!
 