Android Question: Question about the function BytesToString

Filippo

Expert
Licensed User
Hi,

I derived the class below from the AsyncStreamsText class, but I do not understand why "BytesToString" behaves differently on the two platforms.
In B4A I can use "UTF8" as the charset, but in B4i the same charset crashes.
If I use "iso-8859-1" as the charset in B4i, everything works.

The data comes from the same BLE module, so everything should be the same, right?
Is there a logical explanation?

B4X:
Sub Class_Globals
    Private mTarget As Object
    Private mEventName As String
    Private sb As StringBuilder
#if b4a
    Private charset As String = "UTF8"
#Else
    Private charset As String = "iso-8859-1"
#End If
End Sub

'Initializes the object. You can add parameters to this method if needed.
Public Sub Initialize(TargetModule As Object, EventName As String)
    mTarget = TargetModule
    mEventName = EventName
    sb.Initialize
End Sub

Public Sub NewData (Buffer() As Byte)
    Dim newDataStart As Int = sb.Length
    sb.Append(BytesToString(Buffer, 0, Buffer.Length, charset))
    Dim s As String = sb.ToString
    Dim start As Int = 0
    For i = newDataStart To s.Length - 1
        Dim c As Char = s.CharAt(i)
        If i = 0 And c = Chr(10) Then '\n...
            start = 1 'might be a broken end of line character
            Continue
        End If
        If c = Chr(10) Then '\n
            CallSubDelayed2(mTarget, mEventName & "_NewText", s.SubString2(start, i))
            start = i + 1
        Else If c = Chr(13) Then '\r
            CallSubDelayed2(mTarget, mEventName & "_NewText", s.SubString2(start, i))
            If i < s.Length - 1 And s.CharAt(i + 1) = Chr(10) Then '\r\n
                i = i + 1
            End If
            start = i + 1
        End If
    Next
    If start > 0 Then sb.Remove(0, start)
End Sub
B4i-log:
Application_Start
Application_Active
pnlRoot_Resize
IsOrientationChanged=true
pnlRoot_Resize: IsListCronoTimerChanged=true
btnCronoModus_CheckedChange=false
pnlRoot_Resize
IsOrientationChanged=true
pnlRoot_Resize: IsListCronoTimerChanged=true
btnCronoModus_CheckedChange=false
Discovering services
Services discovery completed.
Error occurred on line: 21 (clsAsyncText)
Error decoding data as string.
Stack Trace: (
CoreFoundation <redacted> + 252
libobjc.A.dylib objc_exception_throw + 56
CoreFoundation <redacted> + 0
CMM-Lite -[B4ICommon BytesToString::::] + 344
CoreFoundation <redacted> + 144
CoreFoundation <redacted> + 292
CMM-Lite +[B4I runDynamicMethod:method:throwErrorIfMissing:args:] + 1624
CMM-Lite -[B4IShell runMethod:] + 448
CMM-Lite -[B4IShell raiseEventImpl:method:args::] + 1640
CMM-Lite -[B4IShellBI raiseEvent:event:params:] + 1372
CMM-Lite +[B4IDebug delegate:::] + 80
CMM-Lite -[b4i_clsasynctext _newdata::] + 432
CMM-Lite -[b4i_clsbluetooth _dataavailable_new:::] + 1084
CMM-Lite -[b4i_clsbluetooth _btmanager_dataavailable:::] + 756
CoreFoundation <redacted> + 144
CoreFoundation <redacted> + 292
CMM-Lite +[B4I runDynamicMethod:method:throwErrorIfMissing:args:] + 1624
CMM-Lite -[B4IShell runMethod:] + 448
CMM-Lite -[B4IShell raiseEventImpl:method:args::] + 2164
CMM-Lite -[B4IShellBI raiseEvent:event:params:] + 1372
CMM-Lite +[B4IObjectWrapper raiseEvent:::] + 300
CMM-Lite -[BlePeripheralDel peripheral:didUpdateValueForCharacteristic:error:] + 1276
CoreBluetooth <redacted> + 240
CoreBluetooth <redacted> + 132
CoreBluetooth <redacted> + 356
CoreBluetooth <redacted> + 204
CoreBluetooth <redacted> + 60
libdispatch.dylib <redacted> + 24
libdispatch.dylib <redacted> + 16
libdispatch.dylib <redacted> + 592
libdispatch.dylib <redacted> + 484
libdispatch.dylib <redacted> + 784
CoreFoundation <redacted> + 12
CoreFoundation <redacted> + 1964
CoreFoundation CFRunLoopRunSpecific + 436
GraphicsServices GSEventRunModal + 100
UIKitCore UIApplicationMain + 212
CMM-Lite main + 124
libdyld.dylib <redacted> + 4
)
 

Filippo

Expert
Licensed User
Quoted: "How is the data encoded?"

Quoted (Erel): "iOS text decoder is more strict."

I simply send individual letters from the Arduino without defining a charset.
Is that perhaps the problem?
 

Filippo

Expert
Licensed User
B4X:
Public Sub NewData (Buffer() As Byte)
    Log("bc.HexFromBytes=" & bc.HexFromBytes(Buffer))
    Dim newDataStart As Int = sb.Length
    sb.Append(BytesToString(Buffer, 0, Buffer.Length, charset))
    Dim s As String = sb.ToString
    Dim start As Int = 0
    For i = newDataStart To s.Length - 1
        Dim c As Char = s.CharAt(i)
        If i = 0 And c = Chr(10) Then '\n...
            start = 1 'might be a broken end of line character
            Continue
        End If
        If c = Chr(10) Then '\n
            CallSubDelayed2(mTarget, mEventName & "_NewText", s.SubString2(start, i))
            start = i + 1
        Else If c = Chr(13) Then '\r
            CallSubDelayed2(mTarget, mEventName & "_NewText", s.SubString2(start, i))
            If i < s.Length - 1 And s.CharAt(i + 1) = Chr(10) Then '\r\n
                i = i + 1
            End If
            start = i + 1
        End If
    Next
    If start > 0 Then sb.Remove(0, start)
End Sub
This is the first data that arrives after connecting to the BLE module.
What the Arduino sends arrives identically in B4A and B4i.
I think the problem must lie with iOS and its detection of the BLE module.

B4i-log:
Test with = "iso-8859-1"
Discovering services
Services discovery completed.
bc.HexFromBytes=DC
bc.HexFromBytes=76330D0A

Test with = "UTF8"
Discovering services
Services discovery completed.
bc.HexFromBytes=DC
Error occurred on line: 23 (clsAsyncText)
Error decoding data as string.
Stack Trace: (
CoreFoundation <redacted> + 252
libobjc.A.dylib objc_exception_throw + 56
CoreFoundation <redacted> + 0
CMM-Lite -[B4ICommon BytesToString::::] + 344
CMM-Lite -[b4i_clsasynctext _newdata::] + 1164
CMM-Lite -[b4i_clsbluetooth _dataavailable_new:::] + 1084
CMM-Lite -[b4i_clsbluetooth _btmanager_dataavailable:::] + 756
CoreFoundation <redacted> + 144
CoreFoundation <redacted> + 292
CMM-Lite +[B4I runDynamicMethod:method:throwErrorIfMissing:args:] + 1624
CMM-Lite -[B4IShell runMethod:] + 448
CMM-Lite -[B4IShell raiseEventImpl:method:args::] + 2164
CMM-Lite -[B4IShellBI raiseEvent:event:params:] + 1372
CMM-Lite +[B4IObjectWrapper raiseEvent:::] + 300
CMM-Lite -[BlePeripheralDel peripheral:didUpdateValueForCharacteristic:error:] + 1276
CoreBluetooth <redacted> + 240
CoreBluetooth <redacted> + 132
CoreBluetooth <redacted> + 356
CoreBluetooth <redacted> + 204
CoreBluetooth <redacted> + 60
libdispatch.dylib <redacted> + 24
libdispatch.dylib <redacted> + 16
libdispatch.dylib <redacted> + 592
libdispatch.dylib <redacted> + 484
libdispatch.dylib <redacted> + 784
CoreFoundation <redacted> + 12
CoreFoundation <redacted> + 1964
CoreFoundation CFRunLoopRunSpecific + 436
GraphicsServices GSEventRunModal + 100
UIKitCore UIApplicationMain + 212
CMM-Lite main + 124
libdyld.dylib <redacted> + 4
)
B4a-log:
Test with = "UTF8"
Discovering services.
ble.IsConnected=true
bc.HexFromBytes=DC
Setting descriptor. Success = true
writing descriptor: true
bc.HexFromBytes=76330D0A
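
Reading the two logged packets by hand may help here. The sketch below is plain Java rather than B4X, and the class and helper names (LoggedPackets, fromHex, isAscii) are made up for illustration. It only shows that the second packet, 76330D0A, is pure 7-bit ASCII and spells "v3" followed by CR LF, while the first packet, DC, has its high bit set and is therefore not ASCII.

```java
import java.nio.charset.StandardCharsets;

final class LoggedPackets {
    // Converts a hex string such as "76330D0A" into the bytes it represents.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // True when every byte is below 0x80, i.e. plain 7-bit ASCII.
    static boolean isAscii(byte[] data) {
        for (byte b : data) {
            if ((b & 0xFF) > 0x7F) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] first = fromHex("DC");        // first packet from the logs
        byte[] second = fromHex("76330D0A"); // second packet from the logs
        System.out.println(isAscii(first));  // false: 0xDC has the high bit set
        System.out.println(isAscii(second)); // true
        // ASCII bytes decode to the same text under UTF-8 and ISO-8859-1.
        String text = new String(second, StandardCharsets.UTF_8);
        System.out.println(text.equals("v3\r\n")); // true
    }
}
```

So the letters sent by the Arduino are not the issue in themselves. Only the lone DC byte that arrives first falls outside ASCII.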
 

emexes

Well-Known Member
Licensed User
Are any of the byte values over 0x7F, i.e. high bit set? If so, then I think that could confuse the UTF-8 processing, particularly if they are at the end of the byte array, where they would look like truncated UTF-8 characters. Byte values 0xFE and 0xFF could be particularly troublesome: they never occur in valid UTF-8, and together they look like a UTF-16 byte order mark.

So when Erel says iOS text decoder is more strict, perhaps this is one area where that strictness is apparent.

Does iso-8859-1 work for B4A? (perhaps with capital ISO?) Given that it seems to be a fixed 8-bit encoding, vs the variable-byte encoding of UTF-8, this is probably a more suitable conversion anyway.
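
To illustrate why a fixed 8-bit encoding cannot fail on this kind of data, here is a small Java sketch (not B4X; the class and method names are invented for the example). It decodes every possible byte value as ISO-8859-1 and checks that each byte becomes exactly one char with the same numeric value, and that encoding the string again gives back the original bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Latin1RoundTrip {
    // Builds an array holding every possible byte value, 0x00 to 0xFF.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    // True when decoding as ISO-8859-1 and encoding again reproduces the input.
    static boolean roundTripsLosslessly(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        return s.length() == data.length
                && Arrays.equals(data, s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        String s = new String(all, StandardCharsets.ISO_8859_1);
        System.out.println(s.length());                // 256: one char per byte
        System.out.println((int) s.charAt(0xDC));      // 220: byte 0xDC becomes U+00DC
        System.out.println(roundTripsLosslessly(all)); // true
    }
}
```

Because no byte sequence is invalid in ISO-8859-1, a decoder has nothing to reject, however strict it is.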
 

emexes

Well-Known Member
Licensed User
bc.HexFromBytes=DC
Error occurred on line: 23 (clsAsyncText)
Error decoding data as string.
The fact that it's choking on a single byte with the high bit set (0xDC = 0b11011100) suggests that the B4A UTF-8 decoder is less strict about truncated UTF-8 multi-byte characters.

Perhaps have a closer look at what B4A is actually adding to the StringBuilder variable, eg:

B4X:
'sb.Append(BytesToString(Buffer, 0, Buffer.Length, charset))

Dim StringToAdd As String = BytesToString(Buffer, 0, Buffer.Length, charset)
Log(StringToAdd.Length)
For I = 0 To StringToAdd.Length - 1
    Log(ASC(StringToAdd.CharAt(I)))
Next
sb.Append(StringToAdd)

Dim s As String = sb.ToString
Reasonable things for B4A to do are:

- treat UTF-8 byte 0xDC as Unicode character 0x00DC
- treat UTF-8 byte 0xDC as Unicode character 0x001C, i.e. strip off the three high bits that signify another byte is expected
- treat UTF-8 byte 0xDC as Unicode character 0x0700, i.e. strip off the three high bits and then shift the result 6 bits left to make room for the next (missing) 6 bits
- not return anything

edit: fixed up the reasonable-things-to-do (forgot that UTF-8 uses 3 high bits to indicate number of bytes for this character)
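
One more possibility worth checking: on a standard JVM, constructing a String from malformed UTF-8 substitutes the replacement character U+FFFD instead of failing. The Java sketch below (not B4X; Utf8Strictness and its two methods are names invented here) contrasts that lenient behaviour with a strict decoder that reports malformed input. Whether B4A's BytesToString uses the lenient path, and whether the iOS side behaves like the strict one, are assumptions on my part; the log loop above would confirm the B4A half.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Strictness {
    // Lenient decode: malformed sequences are replaced with U+FFFD.
    static String decodeLenient(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Strict decode: returns null when the input is not valid UTF-8.
    static String decodeStrict(byte[] data) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            return null; // the decoder rejected the bytes
        }
    }

    public static void main(String[] args) {
        byte[] lone = { (byte) 0xDC }; // lead byte of a 2-byte sequence, nothing after it
        String lenient = decodeLenient(lone);
        System.out.println(lenient.length());                         // 1
        System.out.println(Integer.toHexString(lenient.charAt(0)));   // fffd
        System.out.println(decodeStrict(lone) == null);               // true: rejected
        byte[] ok = { 0x76, 0x33, 0x0D, 0x0A };
        System.out.println("v3\r\n".equals(decodeStrict(ok)));        // true
    }
}
```

If the B4A log prints a length of 1 and the value 65533, that would match the lenient behaviour shown here.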
 

Filippo

Expert
Licensed User
@emexes
Since everything works with iso-8859-1 in B4a, I will use it in the future.
Thank you for the info. :)
 

emexes

Well-Known Member
Licensed User
Yeah, bytes vs chars is a fun area - watch out for:
1/ B4A bytes are Java-like SIGNED 8-bits ie -128 to 127 (NOT the unsigned bytes 0 to 255 of most everywhere else) (including ARGB color components ;-)
2/ B4A chars and strings are 16-bit, and can thus hold values outside of the range of a byte
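
Both points can be seen in a few lines of plain Java, which is what B4A compiles to (the class and method names below are just for this example). The byte 0xDC from the log reads as -36 when treated as a signed byte, and masking with 0xFF recovers the unsigned value 220. A char, being 16 bits wide, can hold 0x0700, which no byte can.

```java
final class SignedBytes {
    // Interprets a signed Java byte as an unsigned value in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xDC;
        System.out.println(raw);             // -36: bytes are signed
        System.out.println(toUnsigned(raw)); // 220: the value most other tools show
        char wide = (char) 0x0700;
        System.out.println((int) wide);      // 1792: does not fit in a byte
    }
}
```

In B4X the same masking is written Bit.And(b, 0xFF).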
 

Filippo

Expert
Licensed User
Thanks again for the further explanation, I have to say that I've never really thought about it.
 