Android Question Trying to figure out bitwise checksum code from example in C

doncx

Active Member
Licensed User
Longtime User
I can't seem to figure out and duplicate a checksum calculation. Perhaps it's no surprise as I'm not really familiar with bitwise operations.

Could someone please look at the following and advise? It would be most appreciated.
Here's the C code I'm trying to duplicate:
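#include <stdint.h>
#include <stdio.h>

static uint32_t createCheckSum(const char *fName)
{
    FILE *fp = fopen(fName, "rb");      /* open in binary mode */
    uint32_t checkSum = 0;
    int c;                              /* int, so EOF differs from byte 0xFF */
    if (fp == NULL)
    {
        return 0;                       /* assumption: report 0 if the file cannot be opened */
    }
    while ((c = fgetc(fp)) != EOF)      /* fgetc yields each byte as 0..255 */
    {
        checkSum ^= (uint32_t)c;
        checkSum <<= 1;                 /* uint32_t: the old bit 31 is discarded */
        if (checkSum & 0x80000000u)
        {
            checkSum |= 1;
        }
    }
    fclose(fp);
    return checkSum;
}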

And here's my faulty attempt in B4A:
Sub CalculateChecksum As Long
    
    Dim checkSum = 0 As Long
    Dim MyBytes() As Byte = File.ReadBytes(Starter.FilesDir, "MyData.bin" )

    For i = 0 To MyBytes.Length - 1
        checkSum = Bit.XorLong(checkSum,MyBytes(i))
        checkSum = Bit.ShiftLeftLong(checkSum,1)
        If Bit.AndLong(checkSum,0x80000000) = 0x80000000 Then
            checkSum = Bit.OrLong(checkSum,0x01)
        End If
    Next
    
    Return checkSum

End Sub

Where am I going wrong?
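For anyone comparing results, here is a minimal Python sketch of what the C routine appears to compute, assuming each byte is treated as unsigned (0..255) and the running value wraps at 32 bits like uint32_t. The name create_checksum and the in-memory bytes argument are illustrative choices, not part of the original code.

```python
def create_checksum(data: bytes) -> int:
    """Mirror the C routine on an in-memory byte string (unsigned 32-bit result)."""
    checksum = 0
    for byte in data:                            # iterating bytes yields ints 0..255
        checksum ^= byte
        checksum = (checksum << 1) & 0xFFFFFFFF  # emulate uint32_t: drop bit 32
        if checksum & 0x80000000:                # bit 31 set after the shift
            checksum |= 1
    return checksum


print(hex(create_checksum(b"\x01")))             # 0x2
```

Feeding the same file contents to this function and to a B4A port should give the same number if the port handles signedness and the 32-bit wrap the same way.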
 
JohnC

Expert
Licensed User
Longtime User
ChatGPT says...

Your B4A code attempt to convert the C function into a checksum calculation is quite good, and it looks like you've adapted the logic correctly for the most part. However, there are a few subtleties and potential issues to address:
  1. Data Type Mismatch in Bitwise Operations: In B4A, when you're working with bitwise operations on bytes and long integers, you need to be mindful of how the data types interact. The MyBytes(i) will return a Byte, and when used in bitwise operations with Long values, it might not behave as expected without explicit casting or handling because of sign extension when converting from Byte to Long.
  2. Handling Byte to Long Conversion: Since MyBytes(i) is a Byte and the bitwise operations are performed on a Long (a 64-bit signed integer in B4X), ensuring the byte is treated correctly in the context of a larger integer is important.
  3. Potential Issue with Bit.AndLong Check: The condition in the If statement might not work as expected due to the way B4A (based on Java) handles integers. The 0x80000000 is interpreted as a negative number in Java's signed integer arithmetic, and this might not match your expectations coming from a C background.
  4. Initialization of checkSum: The way you've initialized checkSum with Dim checkSum = 0 As Long is syntactically incorrect for B4A. You should declare the variable first and then assign a value to it in separate statements if needed.
Let's address these points with corrected code:
B4X:
Sub CalculateChecksum As Long
    Dim checkSum As Long
    checkSum = 0
    Dim MyBytes() As Byte = File.ReadBytes(Starter.FilesDir, "MyData.bin")

    For i = 0 To MyBytes.Length - 1
        Dim byteVal As Long = MyBytes(i) ' Ensure byte is treated as unsigned
        checkSum = Bit.XorLong(checkSum, byteVal)
        checkSum = Bit.ShiftLeftLong(checkSum, 1)
        If Bit.AndLong(checkSum, 0x80000000) <> 0 Then ' Checking if the MSB is set
            checkSum = Bit.OrLong(checkSum, 1)
        End If
    Next
   
    Return checkSum
End Sub
This corrected version makes the following adjustments:
  • Corrects the initialization of checkSum.
  • Adds a comment and corrects the handling for when Bit.AndLong(checkSum, 0x80000000) <> 0 is used to check if the most significant bit (MSB) is set.
  • Ensures that byte values are treated correctly in the context of Long operations, even though the direct assignment from Byte to Long might automatically handle this in B4A, being explicit can help avoid confusion.
 
doncx

Thanks for that, John! I've been using Gemini to deconstruct the C code but it wasn't willing to attempt conversion to B4X.

I'll give it a shot. Very helpful, thank you.
 
doncx

Something still doesn't add up correctly. I'm getting a much larger number in my result than the sender says I should. I'm checking to make sure I'm running the calculation on the correct data and will get back if I'm still having trouble.
 

OliverA

Expert
Licensed User
Longtime User
doncx said: "Where am I going wrong?"
You may be worrying about signedness too early. The only time you technically need to worry about it is when you create the resulting return value. Try
B4X:
Sub CalculateChecksum As Long
    Dim result As Long
    Dim checkSum As Int = 0
    Dim MyBytes() As Byte = File.ReadBytes(Starter.FilesDir, "MyData.bin" )

    For i = 0 To MyBytes.Length - 1
        checkSum = Bit.Xor(checkSum,Bit.And(MyBytes(i),0xFF))
        checkSum = Bit.ShiftLeft(checkSum,1)
        If Bit.And(checkSum,0x80000000) = 0x80000000 Then
            checkSum = Bit.Or(checkSum,0x01)
        End If
    Next
  
    result = Bit.AndLong(checkSum, 0xFFFFFFFF)
    Return result

End Sub
Note: Corrected. See post below
2nd Correction: change
B4X:
checkSum = Bit.Xor(checkSum,MyBytes(i))
to
B4X:
checkSum = Bit.Xor(checkSum,Bit.And(MyBytes(i),0xFF))
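A possible illustration of why that second correction matters, sketched in Python since it has no fixed-width types. B4X Bytes are signed (-128..127), so a byte of 0x80 or above reads as a negative number and, once widened, flips every upper bit of the checksum. The helpers as_signed_byte and xor_into_32bit are hypothetical and only emulate that behaviour.

```python
def as_signed_byte(value: int) -> int:
    """Interpret 0..255 as a signed 8-bit value (-128..127)."""
    return value - 256 if value > 127 else value


def xor_into_32bit(checksum: int, operand: int) -> int:
    """XOR in 32-bit two's complement, so a negative operand brings its sign bits along."""
    return (checksum ^ operand) & 0xFFFFFFFF


raw = 0xF0                               # byte as stored in the file
signed = as_signed_byte(raw)             # -16: what a signed Byte reports
unsigned = signed & 0xFF                 # 240: the effect of Bit.And(b, 0xFF)

print(hex(xor_into_32bit(0, signed)))    # 0xfffffff0 (upper 24 bits polluted)
print(hex(xor_into_32bit(0, unsigned)))  # 0xf0 (only the low 8 bits change)
```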
 
OliverA

Thanks to @emexes, I fixed a bug in the code above. Original
B4X:
result = Bit.And(checkSum, 0xFFFFFFFF)
corrected
B4X:
result = Bit.AndLong(checkSum, 0xFFFFFFFF)
 
doncx

If anybody's interested, I solved this vexing problem. Two changes: 1) fix the byte when signed, and 2) remove the MSB when set.

I figured this out by stepping through a python version of the code. I'm sure my fixes are inelegant, but they work!

Suggestions for improvement are welcome.

Here's a properly functioning B4a version:
Sub CalculateChecksum as long
    
    Dim checkSum As Long = 0
    Dim ThisByteVal As Int = 0
    Dim MyBytes() As Byte = File.ReadBytes(B4XPages.MainPage.FilesDir, "MyBytes.bin" )

    For i = 0 To MyBytes.Length - 1
        If MyBytes(i) < 0 Then
            ThisByteVal = MyBytes(i) + 256 'make this an unsigned byte value
        Else
            ThisByteVal = MyBytes(i)
        End If
        checkSum = Bit.XorLong(checkSum,ThisByteVal)
        checkSum = Bit.ShiftLeftLong(checkSum,1)
        If checkSum > 4294967295 Then
            checkSum = checkSum - 4294967296 'force the shift to ditch the msb
        End If
        If Bit.AndLong(checkSum,0x80000000) <> 0 Then
            checkSum = Bit.OrLong(checkSum,0x01)
        End If
    Next

    Return checksum   

End Sub
 
Solution

OliverA

doncx said: "Suggestions for improvement are welcome."
1)
This
B4X:
        If MyBytes(i) < 0 Then
            ThisByteVal = MyBytes(i) + 256 'make this an unsigned byte value
        Else
            ThisByteVal = MyBytes(i)
        End If
can also be solved via
B4X:
ThisByteVal = Bit.And(MyBytes(1),0xFF)
which, sadly, is one of the things I forgot to do (besides the incorrect int to long conversion) in the code I provided. :-(

Instead of
B4X:
checkSum = Bit.Xor(checkSum,MyBytes(i))
I should have had
B4X:
checkSum = Bit.Xor(checkSum,Bit.And(MyBytes(i),0xFF))

2)
I'm pretty sure you don't have to worry about overflowing the long beyond an int value and therefore instead of
B4X:
        If checkSum > 4294967295 Then
            checkSum = checkSum - 4294967296 'force the shift to ditch the msb
        End If
within the Next loop, you can just do a
B4X:
checksum = Bit.AndLong(checkSum, 0xFFFFFFFF) ' remove any bits beyond an Int value
outside of the Next loop.
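On point 1, a quick way to convince yourself that the If/Else with + 256 and the single Bit.And with 0xFF agree is to try every possible signed byte value. This is a Python sketch with hypothetical function names, assuming And on a negative number behaves as two's complement (as it does in Python and Java).

```python
def unsign_with_branch(b: int) -> int:
    """The If/Else form: add 256 to negative byte values."""
    return b + 256 if b < 0 else b


def unsign_with_mask(b: int) -> int:
    """The one-line form: keep only the low 8 bits."""
    return b & 0xFF


# Compare both forms over the full signed byte range -128..127.
mismatches = [b for b in range(-128, 128)
              if unsign_with_branch(b) != unsign_with_mask(b)]
print(mismatches)                        # []
```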
 
doncx

Thanks for your suggestions, Oliver.

Your code to un-sign the byte is indeed more elegant. With a small typo correction it works perfectly.

Convert byte to unsigned:
ThisByteVal = Bit.And(MyBytes(i),0xFF)

However, I'm not able to find a way to prevent overflowing other than the decimal subtraction that I posted above. Interestingly, the python code that I ran and stepped through uses a Bit.And at every step, similar to your suggestion. It just doesn't seem to work to prevent overflow in B4a.

Therefore, despite its seeming inelegance I'll stick to the integer subtraction to prevent overflow.

Here's the slightly more elegant calculation:
Sub CalculateChecksum as long
    
    Dim checkSum As Long = 0
    Dim ThisByteVal As Int = 0
    Dim MyBytes() As Byte = File.ReadBytes(B4XPages.MainPage.FilesDir, "MyBytes.bin" )

    For i = 0 To MyBytes.Length - 1
        ThisByteVal = Bit.And(MyBytes(i),0xFF)
        checkSum = Bit.XorLong(checkSum,ThisByteVal)
        checkSum = Bit.ShiftLeftLong(checkSum,1)
        If checkSum > 4294967295 Then
            checkSum = checkSum - 4294967296 'force the shift to ditch the msb
        End If
        If Bit.AndLong(checkSum,0x80000000) <> 0 Then
            checkSum = Bit.OrLong(checkSum,0x01)
        End If
    Next

    Return checksum   

End Sub
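For what it's worth, the subtraction in this version and a 32-bit mask should be interchangeable here, because the value going into the shift is always below 2^32, so the shifted value is below 2^33 and at most one extra bit needs removing. A small Python sketch of that comparison follows (function names are made up for illustration).

```python
def drop_bit32_by_subtraction(value: int) -> int:
    """The approach in the post: subtract 2^32 when the value exceeds 32 bits."""
    if value > 4294967295:
        value -= 4294967296
    return value


def drop_bit32_by_mask(value: int) -> int:
    """The bitwise alternative: keep only bits 0..31."""
    return value & 0xFFFFFFFF


# Values a single left shift of a 32-bit number can produce (all below 2^33).
samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF,
           0x100000000, 0x1FFFFFFFE, 0x1FFFFFFFF]
print(all(drop_bit32_by_subtraction(v) == drop_bit32_by_mask(v) for v in samples))  # True
```

The two only diverge once more than one extra bit has accumulated (for example 0x200000000), which cannot happen when the trim runs after every shift.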
 
OliverA

doncx said: "... prevent overflow."
You don't have to worry about overflow. The original code did not worry about overflow. All you have to worry about is removing the overflow (in the translated version) before returning the value. And that is done via
B4X:
checksum = Bit.AndLong(checkSum, 0xFFFFFFFF) ' remove any bits beyond an Int value

Another way to state it, the overflow is what is being discarded (in the original code, after enough left bit shifts, bits on the left will be discarded). Therefore, while calculating, we don't worry about the overflow in the long value, since those bits are just to be discarded. This discarding can be done via Bit.And before returning the value.
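The argument above can be checked with a short Python sketch, under one assumption that matters: the bit-31 test must use a mask with only bit 31 set. Python integers are unbounded, whereas a 64-bit Long would drop bits above 63, but neither affects the low 32 bits. The function names are illustrative.

```python
import random

BIT31 = 0x80000000      # only bit 31 set
MASK32 = 0xFFFFFFFF


def checksum_mask_each_step(data: bytes) -> int:
    checksum = 0
    for byte in data:
        checksum ^= byte
        checksum = (checksum << 1) & MASK32   # trim after every shift
        if checksum & BIT31:
            checksum |= 1
    return checksum


def checksum_mask_once(data: bytes) -> int:
    checksum = 0
    for byte in data:
        checksum ^= byte
        checksum <<= 1                        # overflow bits pile up above bit 31
        if checksum & BIT31:
            checksum |= 1
    return checksum & MASK32                  # discard them in one go


rng = random.Random(0)
data = bytes(rng.randrange(256) for _ in range(200))
print(checksum_mask_each_step(data) == checksum_mask_once(data))  # True
```

Each operation on the low 32 bits depends only on the low 32 bits, which is why the accumulated high bits can be ignored until the end.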
 
doncx

Oliver, I respect and appreciate your expertise, but it just doesn't work.

I'm satisfied with my current solution, but will be happy to post a file and its expected checksum value for you to inspect and experiment with if you like.
 
OliverA

Thanks to @emexes, it looks like there is a strange issue with AndLong, or I don't know how to use it properly.

Try
B4X:
checksum = Bit.AndLong(checkSum, 0x0FFFFFFFF) ' remove any bits beyond an Int value. Note the extra 0 before the FFs
 
doncx

Thank you, one more time!

That worked - but only when executed after each shift. The leading zero was what I was missing.

Here's the fully elegant version of the checksum code:
Sub CalculateChecksum as long
    
    Dim checkSum As Long = 0
    Dim ThisByteVal As Int = 0
    Dim MyBytes() As Byte = File.ReadBytes(B4XPages.MainPage.FilesDir, "MyBytes.bin" )

    For i = 0 To MyBytes.Length - 1
        ThisByteVal = Bit.And(MyBytes(i),0xFF)
        checkSum = Bit.XorLong(checkSum,ThisByteVal)
        checkSum = Bit.AndLong(Bit.ShiftLeftLong(checkSum,1),0x0FFFFFFFF)
        If Bit.AndLong(checkSum,0x80000000) <> 0 Then
            checkSum = Bit.OrLong(checkSum,0x01)
        End If
    Next

    Return checksum   

End Sub
 
OliverA

doncx said: "... only when executed after each shift"
I think I know why. I actually wrote about the issue here: https://www.b4x.com/android/forum/t...am-i-going-stark-raving-mad.98882/post-622810. 0xFFFFFFFF and even 0x80000000 are numeric literals that fit in the size of an Int, so they are first cast to an Int and then cast to a Long. Since both are "negative" as Ints, they arrive in the Long as negative numbers, with all of the upper 32 bits set. That's why BOTH need to be written with a leading zero, as 0x0FFFFFFFF and 0x080000000. After changing your 0x80000000 to 0x080000000, you can do

B4X:
    'MyBytes() is an Array of Byte()
    Dim checkSum As Long = 0
    Dim ThisByteVal As Int = 0
    For i = 0 To MyBytes.Length - 1
        ThisByteVal = Bit.And(MyBytes(i),0xFF)
        checkSum = Bit.XorLong(checkSum,ThisByteVal)
        checkSum = Bit.ShiftLeftLong(checkSum,1)
        If Bit.AndLong(checkSum,0x080000000) <> 0 Then
            checkSum = Bit.OrLong(checkSum,0x01)
        End If
    Next

    Return Bit.AndLong(checkSum, 0x0FFFFFFFF)
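To see why the leading zero changes the outcome, here is a rough Python emulation. It assumes, as described in the post above, that 0x80000000 reaches AndLong sign-extended to 0xFFFFFFFF80000000 while 0x080000000 stays a plain bit 31. The helper widen_int_to_long and the checksum parameters are hypothetical.

```python
def widen_int_to_long(bits32: int) -> int:
    """Sign-extend a 32-bit pattern to 64 bits (returned as an unsigned 64-bit pattern)."""
    if bits32 & 0x80000000:
        return bits32 | 0xFFFFFFFF00000000
    return bits32


SIGN_EXTENDED_BIT31 = widen_int_to_long(0x80000000)  # 0xFFFFFFFF80000000
CLEAN_BIT31 = 0x80000000                             # only bit 31 set


def checksum(data: bytes, bit31_test: int, mask_each_step: bool) -> int:
    value = 0
    for byte in data:
        value ^= byte
        value = (value << 1) & 0xFFFFFFFFFFFFFFFF    # shift within a 64-bit Long
        if mask_each_step:
            value &= 0xFFFFFFFF
        if value & bit31_test:                       # sign-extended test also sees bits 32..63
            value |= 1
    return value & 0xFFFFFFFF


data = b"\x01" + b"\x00" * 32                        # pushes a bit past bit 31
print(checksum(data, CLEAN_BIT31, False))            # 4
print(checksum(data, SIGN_EXTENDED_BIT31, True))     # 4
print(checksum(data, SIGN_EXTENDED_BIT31, False))    # 7
```

With the sign-extended constant, any leftover overflow bit makes the test fire, which would explain why the earlier version only worked when the mask was applied after each shift.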
 