Android Code Snippet Get data type from 1D string array

RB Smissaert

Well-Known Member
Licensed User
Longtime User
Will work this out in VBA now as that is a lot quicker, not having to compile all the time.

RBS
I think I have this worked out, at least it works fine in my app.
It will determine the data type and also the dot and/or comma separator types.
I added a little test routine as well.

B4X:
Sub Test()

    Dim c As Int
    Dim r As Int

    Dim arr() As String
    Dim arr1(4) As String
    Dim arr2(4) As String
    Dim arr3(4) As String
    Dim arr4(4) As String
    Dim arr5(4) As String
    Dim arr6(4) As String
    Dim arr7(4) As String
    Dim arr8(4) As String
    Dim arr9(4) As String
    Dim arr10(4) As String
    Dim arr11(4) As String
    
    Dim lstArrays As List
    lstArrays.Initialize

    arr1(0) = "12,12"
    arr1(1) = "12,123"
    arr1(2) = "123,123"
    arr1(3) = "1234,123"
    lstArrays.Add(arr1)

    arr2(0) = "12,123"
    arr2(1) = "12,123"
    arr2(2) = "123,123"
    arr2(3) = "1234,1234"    '<<<<<<< this means comma can't be thousands separator as it ends with a group of 4
    lstArrays.Add(arr2)

    arr3(0) = "1234.12345"    '<<<<<<< this means dot can't be thousands separator as ends with group of 5
    arr3(1) = "12.123"
    arr3(2) = "123.123"
    arr3(3) = "12,.123"  '<<<< because of this the column will be text
    lstArrays.Add(arr3)

    arr4(0) = "1234,123.12"
    arr4(1) = "12,123.12"
    arr4(2) = "123,123.12"
    arr4(3) = "12.12"
    lstArrays.Add(arr4)

    arr5(0) = "12.123"
    arr5(1) = "1.123,123"
    arr5(2) = "123.123.123,123"
    arr5(3) = "123.123"
    lstArrays.Add(arr5)

    arr6(0) = "123,123"
    arr6(1) = "123,123.123"
    arr6(2) = "123,123,123.123"
    arr6(3) = "123,123"
    lstArrays.Add(arr6)

    arr7(0) = "12,1231" 'this is a malformed entry as it should have a comma before the last 1, but the comma is certainly a thousands separator
    arr7(1) = "12,123,123"
    arr7(2) = "123"
    arr7(3) = "1234,123"
    lstArrays.Add(arr7)

    arr8(0) = "123,12"    '<<< as this ends with a group of 2 (and not 3) the comma will be a decimal separator
    arr8(1) = "123,123"
    arr8(2) = "123,123"
    arr8(3) = "123,123"
    lstArrays.Add(arr8)

    'in these next 2 cases we can't determine the separator types, all array vars will be False
    '------------------------------------------------------------------------------------------
    arr9(0) = "123.123"
    arr9(1) = "123.123"
    arr9(2) = "123.123"
    arr9(3) = "123.123"
    lstArrays.Add(arr9)

    arr10(0) = "123,123"
    arr10(1) = "123,123"
    arr10(2) = "123,123"
    arr10(3) = "123,123"
    lstArrays.Add(arr10)

    arr11(0) = "123.123"
    arr11(1) = "123.123"
    arr11(2) = "123.123"
    arr11(3) = "123..123"
    lstArrays.Add(arr11)

    For c = 0 To lstArrays.Size - 1
        
        Dim arrbDotIsThousandsSeparator(1) As Boolean
        Dim arrbCommaIsThousandsSeparator(1) As Boolean
        Dim arrbDotIsDecimalSeparator(1) As Boolean
        Dim arrbCommaIsDecimalSeparator(1) As Boolean
        
        arr = lstArrays.Get(c)
    
        Log("arr" & (c + 1) & ": ------------------------------------")
        For r = 0 To 3
            Log(arr(r))
        Next
        Log("Column data type: " & GetColumnDataTypeX2(arr, _
                                                      arrbDotIsThousandsSeparator, _
                                                      arrbCommaIsThousandsSeparator, _
                                                      arrbDotIsDecimalSeparator, _
                                                      arrbCommaIsDecimalSeparator))
        Log("Dot Is Thousands Separator: " & arrbDotIsThousandsSeparator(0))
        Log("Dot Is Decimal Separator: " & arrbDotIsDecimalSeparator(0))
        Log("Comma Is Thousands Separator: " & arrbCommaIsThousandsSeparator(0))
        Log("Comma Is Decimal Separator: " & arrbCommaIsDecimalSeparator(0))
        
    Next

End Sub

Sub GetColumnDataTypeX2(arrString() As String, _
                        arrbDotIsThousandsSeparator() As Boolean, _
                        arrbCommaIsThousandsSeparator() As Boolean, _
                        arrbDotIsDecimalSeparator() As Boolean, _
                        arrbCommaIsDecimalSeparator() As Boolean) As String 'ignore

    Dim i As Int
    Dim r As Int
    Dim bHasDots As Boolean
    Dim bHasCommas As Boolean
    Dim strOld As String
    Dim arrBytes() As Byte
    Dim bHasNumber As Boolean
    Dim iDotPos As Int
    Dim iCommaPos As Int
    Dim strDataType As String
    Dim iDots As Int
    Dim iCommas As Int
    
    'best to make sure we don't start with any True values!
    '------------------------------------------------------
    arrbDotIsThousandsSeparator(0) = False
    arrbCommaIsThousandsSeparator(0) = False
    arrbDotIsDecimalSeparator(0) = False
    arrbCommaIsDecimalSeparator(0) = False

    For r = 0 To arrString.Length - 1

        If arrString(r).Length > 0 Then
            'this can make this about twice as fast if there is a sorted
            'column and if the rows to be tested are not taken randomly
            '-----------------------------------------------------------
            If arrString(r) <> strOld Then
                strOld = arrString(r)
                    
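                'bC is not declared in this snippet; it is presumably a ByteConverter
                '(ByteConverter library) declared elsewhere, e.g. in Process_Globals
                arrBytes = bC.StringToBytes(strOld, "ASCII")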
 
                iDots = 0
                iCommas = 0
                'can't use -1 here, as that would make a value text if it started with a dot or comma (0 - -1 = 1)
                iDotPos = -2
                iCommaPos = -2


                'no separators done yet
                '----------------------
                For i = 0 To arrBytes.Length - 1
                    If arrBytes(i) > 57 Then
                        'Log("> 57: |" & arrString(r) & "|")
                        arrbDotIsThousandsSeparator(0) = False
                        arrbCommaIsThousandsSeparator(0) = False
                        arrbDotIsDecimalSeparator(0) = False
                        arrbCommaIsDecimalSeparator(0) = False
                        Return "T"
                    Else    'If arrBytes(i) > 57
                        If arrBytes(i) < 48 Then
                            Select Case arrBytes(i)
                                Case 45    '-
                                    'Log(" = minus")
                                    If i > 0 Then
                                        arrbDotIsThousandsSeparator(0) = False
                                        arrbCommaIsThousandsSeparator(0) = False
                                        arrbDotIsDecimalSeparator(0) = False
                                        arrbCommaIsDecimalSeparator(0) = False
                                        Return "T"    'as number can't have minus at position past zero
                                    Else
                                        strDataType = "T"    'provisional value, can change to an int or double
                                    End If
                                Case 46    '.
                                            
                                    'numbers can't have consecutive dots or commas
                                    If i - iDotPos = 1 Then
                                        arrbDotIsThousandsSeparator(0) = False
                                        arrbCommaIsThousandsSeparator(0) = False
                                        arrbDotIsDecimalSeparator(0) = False
                                        arrbCommaIsDecimalSeparator(0) = False
                                        Return "T"    'as can't have consecutive dots
                                    Else
                                        If i - iCommaPos = 1 Then
                                            arrbDotIsThousandsSeparator(0) = False
                                            arrbCommaIsThousandsSeparator(0) = False
                                            arrbDotIsDecimalSeparator(0) = False
                                            arrbCommaIsDecimalSeparator(0) = False
                                            Return "T"    'as can't have consecutive comma and dot
                                        End If
                                    End If

                                    If iDots > 0 Then
                                        'multiple dots
                                        arrbDotIsThousandsSeparator(0) = True
                                        arrbCommaIsThousandsSeparator(0) = False
                                    End If

                                    If iCommas > 0 Then
                                        'dot after comma
                                        arrbDotIsDecimalSeparator(0) = True
                                        arrbCommaIsDecimalSeparator(0) = False
                                        arrbCommaIsThousandsSeparator(0) = True
                                        arrbDotIsThousandsSeparator(0) = False
                                    End If
                                            
                                    If arrbDotIsDecimalSeparator(0) = False And arrbCommaIsDecimalSeparator(0) = False And _
                                               arrbDotIsThousandsSeparator(0) = False Then
                                        If iDots = 0 And iCommas = 0 Then
                                            If arrBytes.Length - i < 4 Then
                                                'as there is no group of 3 following the dot
                                                '-------------------------------------------
                                                arrbDotIsDecimalSeparator(0) = True
                                                'bDecimalSepDone = True >> not done yet, as this can be overruled
                                            Else
                                                If arrBytes.Length > i + 4 Then
                                                    If arrBytes(i + 4) <> 44 And arrBytes(i + 4) <> 46 Then
                                                        arrbDotIsDecimalSeparator(0) = True
                                                        'bDecimalSepDone = True >> not done yet, as this can be overruled
                                                    End If
                                                End If
                                            End If
                                        End If
                                    End If

                                    iDotPos = i
                                    iDots = iDots + 1
                                    bHasDots = True

                                Case 44  ',
                                        
                                    'numbers can't have consecutive dots or commas
                                    If i - iCommaPos = 1 Then
                                        arrbDotIsThousandsSeparator(0) = False
                                        arrbCommaIsThousandsSeparator(0) = False
                                        arrbDotIsDecimalSeparator(0) = False
                                        arrbCommaIsDecimalSeparator(0) = False
                                        Return "T"    'can't have consecutive commas
                                    Else
                                        If i - iDotPos = 1 Then
                                            arrbDotIsThousandsSeparator(0) = False
                                            arrbCommaIsThousandsSeparator(0) = False
                                            arrbDotIsDecimalSeparator(0) = False
                                            arrbCommaIsDecimalSeparator(0) = False
                                            Return "T"    'can't have a dot followed by a comma
                                        End If
                                    End If

                                    If iCommas > 0 Then
                                        'multiple commas
                                        arrbCommaIsThousandsSeparator(0) = True
                                        arrbDotIsThousandsSeparator(0) = False
                                        arrbCommaIsDecimalSeparator(0) = False
                                    End If

                                    If iDots > 0 Then
                                        'comma after dot
                                        arrbCommaIsDecimalSeparator(0) = True
                                        arrbDotIsThousandsSeparator(0) = True
                                        arrbDotIsDecimalSeparator(0) = False
                                        arrbCommaIsThousandsSeparator(0) = False
                                    End If
                                            
                                    If arrbCommaIsDecimalSeparator(0) = False And  arrbDotIsDecimalSeparator(0) = False And _
                                               arrbCommaIsThousandsSeparator(0) = False Then
                                        If iDots = 0 And iCommas = 0 Then
                                            If arrBytes.Length - i < 4 Then
                                                'as there is no group of 3 following the comma
                                                '---------------------------------------------
                                                arrbCommaIsDecimalSeparator(0) = True
                                                'bDecimalSepDone = True >> not done yet, as this can be overruled
                                            Else
                                                If arrBytes.Length > i + 4 Then
                                                    If arrBytes(i + 4) <> 44 And arrBytes(i + 4) <> 46 Then
                                                        arrbCommaIsDecimalSeparator(0) = True
                                                        'bDecimalSepDone = True >> not done yet, as this can be overruled
                                                    End If
                                                End If
                                            End If
                                        End If
                                    End If

                                    iCommaPos = i
                                    iCommas = iCommas + 1
                                    bHasCommas = True
                            End Select

                        Else    'If arrBytes(i) < 48
                            bHasNumber = True
                        End If    'If arrBytes(i) < 48
                    End If    'If arrBytes(i) > 57
                Next
            End If    'If arrString(r) <> strOld
        End If    'If arrString(r).Length > 0

    Next

    If bHasNumber Then
        If bHasDots = False Then
            If bHasCommas = False Then
                'no dots and no commas
                '---------------------
                Return "I"
            Else
                'no dots but has commas
                '----------------------
                If arrbCommaIsThousandsSeparator(0) Then
                    Return "I"
                Else
                    If arrbCommaIsDecimalSeparator(0) Then
                        Return "R"
                    Else
                        Return "I"
                    End If
                End If
            End If
        Else    'If bHasDots = False
            If bHasCommas = False Then
                'has dots but no commas
                '----------------------
                If arrbDotIsThousandsSeparator(0) Then
                    Return "I"
                Else
                    Return "R"
                End If
            Else    'If bHasCommas = False
                'has dots and commas
                Return "R"
            End If    'If bHasCommas = False
        End If    'If bHasDots = False

    Else   'If bHasNumber
        'not a number
        '------------
        If strDataType = "T" Then
            arrbDotIsThousandsSeparator(0) = False
            arrbCommaIsThousandsSeparator(0) = False
            arrbDotIsDecimalSeparator(0) = False
            arrbCommaIsDecimalSeparator(0) = False
            Return "T"
        Else
            Return "N"
        End If
    End If    'If bHasNumber

End Sub
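
Once the separator types are known, they can be used to turn a value into something that parses as a number. The following is only a minimal sketch of that step and is not part of the routine above: NormalizeNumberString and TestNormalize are hypothetical names, and the sketch assumes the three flags are the (0) elements filled in by GetColumnDataTypeX2.

```b4x
'Hypothetical helper, not part of the original code.
'Strips the thousands separator and turns a decimal comma into a dot,
'so the result can be checked with IsNumber.
Sub NormalizeNumberString(strValue As String, _
                          bDotIsThousandsSeparator As Boolean, _
                          bCommaIsThousandsSeparator As Boolean, _
                          bCommaIsDecimalSeparator As Boolean) As String

    Dim strResult As String = strValue

    'remove the thousands separator first, whichever it is
    If bDotIsThousandsSeparator Then
        strResult = strResult.Replace(".", "")
    End If
    If bCommaIsThousandsSeparator Then
        strResult = strResult.Replace(",", "")
    End If

    'only then change a decimal comma to a dot
    If bCommaIsDecimalSeparator Then
        strResult = strResult.Replace(",", ".")
    End If

    Return strResult

End Sub

Sub TestNormalize()

    'flags as they would come back for arr5: dot = thousands, comma = decimal
    Dim strClean As String = NormalizeNumberString("1.123,123", True, False, True)
    Log(strClean)    '1123.123

    If IsNumber(strClean) Then
        Dim d As Double = strClean
        Log(d)
    End If

End Sub
```

If all flags are False (as for arr9 and arr10) the string is returned unchanged, so the caller still has to decide how to read an ambiguous value such as "123,123".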

RBS
 

emexes

Expert
Licensed User
>> I think I have this worked out, at least it works fine in my app.
>> It will determine the data type and also the dot and/or comma separator types.

The additional inference of what's-what from which-came-first (the comma or the dot?) is a nice touch.

When I'd been reading up on dots-vs-commas after your first draft, I remember a few "standards" said/suggested that group markers should only be used before the decimal point, not after, which seems a good idea re: making it easier for the eye to discern the decimal point.

But also I'm sure that I have seen, out in the real world, (thin-)spaces used for grouping after the decimal point, and that my gut-feel reaction would have been positive.
 

RB Smissaert

Well-Known Member
Licensed User
Longtime User
>> group markers should only be used before the decimal point

My code presumes that this is the case, but I am not sure it always is. I think I have seen grouping used after the decimal separator.
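
To see whether that presumption holds for a given value, a check along these lines could be used. This is just a sketch: HasGroupingAfterDecimal is a hypothetical helper, and it assumes the decimal and group separators have already been decided.

```b4x
'Hypothetical helper, not part of the posted code.
'Returns True if strGroupSep occurs anywhere after the first strDecimalSep.
Sub HasGroupingAfterDecimal(strValue As String, _
                            strDecimalSep As String, _
                            strGroupSep As String) As Boolean

    Dim iDecimalPos As Int = strValue.IndexOf(strDecimalSep)

    If iDecimalPos = -1 Then
        Return False    'no decimal part at all
    End If

    'look for the group separator only after the decimal separator
    Return strValue.IndexOf2(strGroupSep, iDecimalPos + 1) > -1

End Sub

Sub TestGroupingAfterDecimal()
    Log(HasGroupingAfterDecimal("1,234.56", ".", ","))        'false
    Log(HasGroupingAfterDecimal("1,234.567,89", ".", ","))    'true
End Sub
```

Note that the posted routine treats a comma that follows a dot as a decimal comma, so a value like the second one would change the separator flags rather than be flagged as grouping after the decimal point.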

>> (thin-)spaces used for grouping after the decimal point

I am not even sure my app will ever come across commas in numbers, so for now I won't bother with spaces as a separator after the decimal separator.
It would make it all a lot more complex. The good news, though, is that by simply looking at bytes it all remains fast, and if I did have to look at something like that it wouldn't affect speed much.

RBS
 

RB Smissaert

Well-Known Member
Licensed User
Longtime User
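This code had a simple but serious bug: a byte below 48 that is not a minus, dot or comma (for example a space or a slash) was ignored by the Select Case, so a value containing one could still be typed as a number.

Need to add this to the Select Case:

B4X:
Case Else
    'any other byte below 48 means the value is text
    arrbDotIsThousandsSeparator(0) = False
    arrbCommaIsThousandsSeparator(0) = False
    arrbDotIsDecimalSeparator(0) = False
    arrbCommaIsDecimalSeparator(0) = False
    Return "T"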
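
A quick way to see the effect of this is a test along the following lines. It is only a sketch: TestCaseElse is a hypothetical routine, and it assumes GetColumnDataTypeX2 as posted earlier in this thread. The slash, percent sign and space all have byte values below 48.

```b4x
'Hypothetical test routine, not part of the original code.
Sub TestCaseElse()

    Dim arr() As String = Array As String("12/03/2020", "50%", "1 234")

    Dim arrbDotIsThousandsSeparator(1) As Boolean
    Dim arrbCommaIsThousandsSeparator(1) As Boolean
    Dim arrbDotIsDecimalSeparator(1) As Boolean
    Dim arrbCommaIsDecimalSeparator(1) As Boolean

    'without the Case Else this logs I, as / % and space are simply skipped
    'with the Case Else it logs T
    Log("Column data type: " & GetColumnDataTypeX2(arr, _
                                                  arrbDotIsThousandsSeparator, _
                                                  arrbCommaIsThousandsSeparator, _
                                                  arrbDotIsDecimalSeparator, _
                                                  arrbCommaIsDecimalSeparator))

End Sub
```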

RBS
 