Android Question Decoding email headers

William Hunter

Active Member
Licensed User
Longtime User
I am seeing some headers that are encoded as shown below. How can I decode these headers to a format that will display correctly after parsing with MailParser? Any others I have seen are either base64 or quoted-printable. Any help appreciated.

Regards :)
B4X:
=?UTF-8?B?UGFya3MgQ2FuYWRh?= <[email protected]>

EDIT: It seems that =?utf-8?Q? is quoted-printable and =?utf-8?B? is base64, but I still cannot get these headers to display correctly.
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
B4X:
Dim su As StringUtils
Dim b() As Byte = su.DecodeBase64("UGFya3MgQ2FuYWRh")
Log(BytesToString(b, 0, b.Length, "UTF8"))
General solution:
B4X:
 Log(ParseB64Header("=?UTF-8?B?UGFya3MgQ2FuYWRh?= <[email protected]>"))
   
Sub ParseB64Header(h As String) As String
   Dim sb As StringBuilder
   sb.Initialize
   Dim su As StringUtils
   
   Dim m As Matcher = Regex.Matcher("=\?([^?]+)\?B\?([^?]+)\?=", h)
   Dim lastIndex As Int = 0
   Do While m.Find
     If m.GetStart(0) > lastIndex Then
       sb.Append(h.SubString2(lastIndex, m.GetStart(0)))
     End If
     Dim b() As Byte = su.DecodeBase64(m.Group(2))
     sb.Append(BytesToString(b, 0, b.Length, m.Group(1)))
     lastIndex = m.GetEnd(0)   
   Loop
   sb.Append(h.SubString(lastIndex))
   Return sb.ToString
End Sub
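
A note on what the regex above captures: an encoded header word has the form =?charset?encoding?text?=, where the encoding letter is B (base64) or Q (quoted-printable style). A minimal sketch that only logs those three pieces is shown here. LogEncodedWordParts is a hypothetical helper name, and the case-insensitive match is an addition (the charset and encoding letter may appear in either case, as the samples in this thread show).
B4X:
'Sketch only: LogEncodedWordParts is a hypothetical helper, not part of any library.
Sub LogEncodedWordParts(h As String)
    'Group 1 = charset, group 2 = encoding letter (B or Q), group 3 = encoded text.
    Dim m As Matcher = Regex.Matcher2("=\?([^?]+)\?([BQ])\?([^?]*)\?=", Regex.CASE_INSENSITIVE, h)
    Do While m.Find
        Log("charset=" & m.Group(1) & ", encoding=" & m.Group(2).ToUpperCase & ", text=" & m.Group(3))
    Loop
End Sub
Calling LogEncodedWordParts("=?UTF-8?B?UGFya3MgQ2FuYWRh?=") should log charset=UTF-8, encoding=B, text=UGFya3MgQ2FuYWRh.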

William Hunter

Active Member
Licensed User
Longtime User
Thank you Erel. Your General solution for base64 works very well. How can headers in quoted-printable, as in the samples below, be similarly parsed?

Regards :)
B4X:
=?utf-8?Q?RoamMobility?= <[email protected]>

"=?UTF-8?Q?M&MFoodMarket?=" <[email protected]> ' note the extra quotation marks
I can use the code below to decode some headers in quoted-printable, but it doesn't work in all instances, such as with the above samples.
B4X:
Sub DecodeQuotePrintable(MyString As String) As String
    Try
        Dim m As Matcher
        m = Regex.Matcher("=\?([^?]*)\?Q\?(.*)\?=$", MyString)
        If m.Find Then
            Dim charset As String
            Dim data As String
            charset = m.Group(1)
            data = m.Group(2)
            Dim bytes As List
            bytes.Initialize
            Dim i As Int
            Do While i < data.Length
                Dim c As String
                c = data.CharAt(i)
                If c = "_" Then
                    bytes.AddAll(" ".GetBytes(charset))
                Else If c = "=" Then
                    Dim hex As String
                    hex = data.CharAt(i + 1) & data.CharAt(i + 2)
                    i = i + 2
                    bytes.Add(Bit.ParseInt(hex, 16))
                Else
                    bytes.AddAll(c.GetBytes(charset))
                End If
                i = i + 1
            Loop
            Dim b(bytes.Size) As Byte
            For i = 0 To bytes.Size - 1
                b(i) = bytes.Get(i)
            Next
            Return BytesToString(b, 0, b.Length, charset)
        Else
            Return MyString
        End If
    Catch
        Return MyString
    End Try
End Sub
 

William Hunter

Active Member
Licensed User
Longtime User
Erel said:
Where does this code come from? What is the output?

I found this sub posted on the forum here: https://www.b4x.com/android/forum/threads/using-pop3-and-mailparser.66978/#post-424205

The Try/Catch block is my own addition. I have been using the sub to successfully decode quoted-printable message bodies. It decodes some quoted-printable headers, but not all. Is there a better way to decode quoted-printable?

Regards

Edit: I had stripped what I thought to be unnecessary characters from the headers shown in post #3 above. They are shown below in raw form. Sorry for the confusion.
B4X:
=?utf-8?Q?Roam=20Mobility?= <[email protected]>

"=?UTF-8?Q?M=26M=20Food=20Market?=" <[email protected]>
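
For reference, each =XX in the raw headers above is one byte written as two hex digits. A minimal sketch of that arithmetic follows. LogHexEscapes is a hypothetical name, and converting a single byte with Chr is only valid for ASCII values; multi-byte UTF-8 characters need the bytes collected first and converted together with BytesToString.
B4X:
'Sketch only: shows how two of the =XX escapes above map to characters.
Sub LogHexEscapes
    Log(Bit.ParseInt("26", 16))        '38, the ASCII code of "&"
    Log(Chr(Bit.ParseInt("26", 16)))   '&
    Log(Bit.ParseInt("20", 16))        '32, the ASCII code of a space
End Sub
So M=26M=20Food=20Market stands for M&M Food Market.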
 

William Hunter

Active Member
Licensed User
Longtime User
Erel said:
Why have you added $ to the regex?

Add Log(data) to make sure that you were able to get the correct part.

Thank you Erel. I didn't add $ to the Regex. Constructing an effective Regex is not one of my skills. I used the sub as I found it on the forum, with the exception of the added Try/Catch block. This is only to be used while in development. If an error is caught, the unprocessed string is returned so that I can review it in a WebView.

This is the logged output:
B4X:
=?utf-8?Q?Roam=20Mobility?= <[email protected]>

"=?UTF-8?Q?M=26M=20Food=20Market?=" <[email protected]>
The sub I am using can also be found in post #8 of this web link: https://www.b4x.com/android/forum/t...nicate-with-android-devices.11310/#post-87366
This is your post with the added notation "Note that it wasn't tested thoroughly enough...".

Perhaps the sub I am using could use improving. Do you have a better solution for decoding quoted-printable? I have been using it successfully, other than for decoding headers. Perhaps the decoding of headers requires a different solution.

Regards
 

William Hunter

Active Member
Licensed User
Longtime User
Erel said:
The code from this post: https://www.b4x.com/android/forum/threads/using-pop3-and-mailparser.66978/#post-424205 properly parses the encoded string. However you need to first extract the encoded string from the header.

You can see how I did it in the second post. You just need to replace the B with Q.

Thank you Erel. I have tried using your modified (B to Q) code in the second post as below:
B4X:
Sub ParseQuotePrintableHeader(h As String) As String
    Dim sb As StringBuilder
    sb.Initialize
    Dim su As StringUtils
    Dim m As Matcher = Regex.Matcher("=\?([^?]+)\?Q\?([^?]+)\?=", h)
    Dim lastIndex As Int = 0
    Do While m.Find
        If m.GetStart(0) > lastIndex Then
            sb.Append(h.SubString2(lastIndex, m.GetStart(0)))
        End If
        Dim b() As Byte = su.DecodeBase64(m.Group(2)) ' Not for quoted-printable
        sb.Append(BytesToString(b, 0, b.Length, m.Group(1)))
        lastIndex = m.GetEnd(0)
    Loop
    sb.Append(h.SubString(lastIndex))
    Return sb.ToString
End Sub
It returned the following logged results:
B4X:
Unparsed Q-P Subject Header:  =?UTF-8?Q?3=20special=20offers=3A=20=246=2E99=20Mozzarella=20Sticks=20and=20=2410=2E99=20Italian=20Style=20Beef=20Meatballs=2C=20plus=20BONUS=20offer?=
Parsed Q-P Subject Header:  ���

Unparsed Q-P From Header:  "=?UTF-8?Q?M=26M=20Food=20Market?=" <[email protected]>
Parsed Q-P From Header:  "?�?���mj�" <[email protected]>
My email client displays the headers as below:
B4X:
3 special offers: $6.99 Mozzarella Sticks and $10.99 Italian Style Beef Meatballs, plus BONUS offer

[email protected]
While I seem to have hit an impasse here, I've gained a whole new respect for those who develop email clients. Incidentally, for decoding the message body (not the headers), the sub in post #8 of this link: https://www.b4x.com/android/forum/t...nicate-with-android-devices.11310/#post-87366 works very well, while the sub in this post: https://www.b4x.com/android/forum/threads/using-pop3-and-mailparser.66978/#post-424205 does not.

Regards :)

EDIT: I tried earlier to attach the raw unprocessed message, but Chrome's auditing blocked me. I have now uploaded it using Firefox.

This was the error message:
This page isn’t working

Chrome detected unusual code on this page and blocked it to protect your personal information (for example, passwords, phone numbers, and credit cards).

Try visiting the site's homepage.

ERR_BLOCKED_BY_XSS_AUDITOR
 

Attachments

  • RawMailM&M.txt (36.9 KB)

William Hunter

Active Member
Licensed User
Longtime User
Erel said:
This code is wrong. You need to first find the encoded text and then parse it with DecodeQuotePrintable.
Your code tries to decode it with the base64 decoder.

I finally clued in to what you were trying to tell me, 5 minutes after my head hit the pillow last night. Brain-fart!!! :oops: The code below works very well.
B4X:
Sub ParseQuotePrintableHeader(q As String) As String
    Dim m As Matcher
    m = Regex.Matcher("=\?([^?]+)\?Q\?([^?]+)\?=", q)
    If m.Find Then
        Dim charset As String
        Dim data As String
        charset = m.Group(1)
        data = m.Group(2)
        Dim bytes As List
        bytes.Initialize
        Dim i As Int
        Do While i < data.Length
            Dim c As String
            c = data.CharAt(i)
            If c = "_" Then
                bytes.AddAll(" ".GetBytes(charset))
            Else If c = "=" Then
                Dim hex As String
                hex = data.CharAt(i + 1) & data.CharAt(i + 2)
                i = i + 2
                bytes.Add(Bit.ParseInt(hex, 16))
            Else
                bytes.AddAll(c.GetBytes(charset))
            End If
            i = i + 1
        Loop
        Dim b(bytes.Size) As Byte
        For i = 0 To bytes.Size - 1
            b(i) = bytes.Get(i)
        Next
        Return BytesToString(b, 0, b.Length, charset)
    Else
        Return q
    End If
End Sub

Thank you again.
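
For anyone combining the two approaches in this thread: the sub above returns only the first decoded word and drops the rest of the header (such as the address part). A minimal sketch that handles both B and Q words and keeps the surrounding text is given below. DecodeHeader and DecodeQ are hypothetical names, and the sketch rests on assumptions: the Q text contains only ASCII characters plus =XX escapes, the hex digits are valid (Bit.ParseInt raises an error otherwise), and whitespace between adjacent encoded words is kept as-is, whereas a full mail client would drop it.
B4X:
'Sketch only: DecodeHeader and DecodeQ are hypothetical names, not library functions.
Sub DecodeHeader(h As String) As String
    Dim sb As StringBuilder
    sb.Initialize
    Dim su As StringUtils
    'Group 1 = charset, group 2 = B or Q (either case), group 3 = encoded text.
    Dim m As Matcher = Regex.Matcher2("=\?([^?]+)\?([BQ])\?([^?]*)\?=", Regex.CASE_INSENSITIVE, h)
    Dim lastIndex As Int = 0
    Do While m.Find
        'Copy any plain text that precedes this encoded word.
        If m.GetStart(0) > lastIndex Then
            sb.Append(h.SubString2(lastIndex, m.GetStart(0)))
        End If
        If m.Group(2).ToUpperCase = "B" Then
            Dim b() As Byte = su.DecodeBase64(m.Group(3))
            sb.Append(BytesToString(b, 0, b.Length, m.Group(1)))
        Else
            sb.Append(DecodeQ(m.Group(3), m.Group(1)))
        End If
        lastIndex = m.GetEnd(0)
    Loop
    'Copy whatever follows the last encoded word, for example the address.
    sb.Append(h.SubString(lastIndex))
    Return sb.ToString
End Sub

'Decodes Q text: "_" is a space, "=XX" is one byte in hex, anything else is literal.
Sub DecodeQ(data As String, charset As String) As String
    Dim b(data.Length) As Byte 'the decoded bytes never outnumber the input characters
    Dim n As Int = 0
    Dim i As Int = 0
    Do While i < data.Length
        Dim c As Char = data.CharAt(i)
        If c = "_" Then
            b(n) = 32
        Else If c = "=" And i + 2 < data.Length Then
            b(n) = Bit.ParseInt(data.SubString2(i + 1, i + 3), 16)
            i = i + 2
        Else
            b(n) = Asc(c)
        End If
        n = n + 1
        i = i + 1
    Loop
    Return BytesToString(b, 0, n, charset)
End Sub
With the raw headers from earlier in the thread, DecodeHeader should return Roam Mobility and "M&M Food Market" followed by the unchanged address part.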
 

Peter Simpson

Expert
Licensed User
Longtime User
Nice work.
You should put this particular code under code snippets, I would...
 

William Hunter

Active Member
Licensed User
Longtime User
Peter Simpson said:
Nice work.
You should put this particular code under code snippets, I would...

Hello Peter. The nice work is actually Erel's. I originally found it in post #8 of this web link: https://www.b4x.com/android/forum/t...nicate-with-android-devices.11310/#post-87366

This sub, as I found it, successfully decodes quoted-printable email message bodies. The only changes I made were to the sub name and the regex. The change to the regex is also Erel's work. What would we ever do without him?

Regards :)
 