GZip issues

joneden

Active Member
Licensed User
Longtime User
Hi,
I'm using GZip to compress some XML in a SOAP web service and push it down to an Android app. I can decode the data in a VB app but cannot get it to decode in Android. Code is below. In the VB code it seems that you need to include the length of the original byte array of the string; is that needed in Android? I've tried with and without it and no luck :(

The error that I seem to get is about a magic number not existing...

WebService code:
B4X:
<WebMethod(Description:="Get various data tables - down only")> _
    Public Function GetGZippedDataTEST() As String
        Dim _encodedString As Byte() = Encoding.UTF8.GetBytes("Test text")
        Dim _memoryStream As MemoryStream = New MemoryStream()
        Dim _gZipStream As GZipStream = New GZipStream(_memoryStream, CompressionMode.Compress, True)
        _gZipStream.Write(_encodedString, 0, _encodedString.Length)
        _memoryStream.Position = 0

        Dim _compressedData As Byte() = New Byte(_memoryStream.Length - 1) {}
        ' Fill the compressed data byte array
        _memoryStream.Read(_compressedData, 0, _compressedData.Length)

        ' Add the length of the original byte array to the start of the output byte array
        Dim _amendedByteArray As Byte() = New Byte(_compressedData.Length + 3) {}
        System.Buffer.BlockCopy(_compressedData, 0, _amendedByteArray, 4, _compressedData.Length)
        System.Buffer.BlockCopy(BitConverter.GetBytes(_encodedString.Length), 0, _amendedByteArray, 0, 4)
        Return Convert.ToBase64String(_amendedByteArray)
    End Function

Android code (code triggers as expected then fails on marked line)
B4X:
Sub ReactiveDataParser_EndElement (uri As String, name As String, text As StringBuilder)
   If name = "string" Then
      Dim objStringUtils As StringUtils
      Dim objInputByteArray() As Byte
      objInputByteArray = objStringUtils.DecodeBase64(text)
      Dim objInputStream As InputStream
      'objInputStream.InitializeFromBytesArray(objInputByteArray, 0, objInputByteArray.Length)
      objInputStream.InitializeFromBytesArray(objInputByteArray, 4, objInputByteArray.Length - 4)

      Dim objCompressedStreams As CompressedStreams
      Dim objDecompressedByteArray() As Byte
      ' Code fails on next line
      objDecompressedByteArray = objCompressedStreams.DecompressBytes(objInputByteArray, "gzip")
      Dim objCompressedString As String
      objCompressedString = BytesToString(objDecompressedByteArray, 0, objDecompressedByteArray.Length, "UTF8")
   End If
End Sub


Thanks for any help

Jon
 

joneden

Hi Erel,

Yeah I'd noticed that myself and have been playing with it - the 3 came through with the example and I was wondering if it was something to do with arrays starting at 0.

I've changed the line to read

B4X:
Dim _amendedByteArray As Byte() = New Byte(_compressedData.Length + 4) {}

It still doesn't work though, I get the error "unknown format (magic number b)".

The following is some test code that includes the corrected string from the web service. To avoid going down blind alleys: do I need the _amendedByteArray with the extra 4 bytes that the VB code adds, or can I cut that out?

B4X:
Dim objStringUtils As StringUtils
Dim objInputByteArray() As Byte
objInputByteArray = objStringUtils.DecodeBase64("CwAAAB+LCAAAAAAABADtvQdgHEmWJSYvbcp7f0r1StfgdKEIgGATJNiQQBDswYjN5pLsHWlHIymrKoHKZVZlXWYWQMztnbz33nvvvffee++997o7nU4n99//P1xmZAFs9s5K2smeIYCqyB8/fnwfPyLe5E2bNm1dLC8A")
Dim objInputStream As InputStream
objInputStream.InitializeFromBytesArray(objInputByteArray, 0, objInputByteArray.Length)
'objInputStream.InitializeFromBytesArray(objInputByteArray, 3, objInputByteArray.Length-4)

Dim objCompressedStreams As CompressedStreams
Dim objDecompressedByteArray() As Byte
objDecompressedByteArray = objCompressedStreams.DecompressBytes(objInputByteArray, "gzip")
Dim objCompressedString As String
objCompressedString = BytesToString(objDecompressedByteArray, 0, objDecompressedByteArray.Length, "UTF8")
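
For readers following along: a gzip stream begins with the two magic bytes 0x1F 0x8B (31, 139). The sketch below, written in Python purely for illustration (it is not B4A code, and the helper name is hypothetical), decodes only the first eight characters of the Base64 string posted above. It shows a 4-byte little-endian length sitting in front of the gzip magic bytes. That would be consistent with the "magic number b" message if the decompressor receives the whole array, prefix included, and reports the first bytes in hex.

```python
import base64
import struct

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def split_length_prefix(payload):
    """Split a 4-byte little-endian length prefix from the rest (hypothetical helper)."""
    (original_length,) = struct.unpack("<I", payload[:4])
    return original_length, payload[4:]


# First eight Base64 characters of the string posted above.
head = base64.b64decode("CwAAAB+L")
print(list(head))                     # [11, 0, 0, 0, 31, 139]

length, rest = split_length_prefix(head)
print(length)                         # 11
print(head.startswith(GZIP_MAGIC))    # False: the prefix hides the magic bytes
print(rest.startswith(GZIP_MAGIC))    # True: the gzip data starts at offset 4
```

Note that the posted B4A code passes objInputByteArray itself to DecompressBytes, so whatever offset is given to objInputStream has no effect on what the decompressor sees.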
 

joneden

Hi Erel,

OK, so I've removed that side of it and the web service now sends just the compressed string. I've taken the string and applied it as before; the whole code is as follows, and now I get an EOFException at the same place as before.

Is my code below now completely correct from the B4A side of things? I.e., do you think I've somehow screwed up the string generation on the server?

B4X:
Dim objStringUtils As StringUtils
Dim objInputByteArray() As Byte
objInputByteArray = objStringUtils.DecodeBase64("H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/It7kTZs2bV0sLw==")
Dim objInputStream As InputStream
objInputStream.InitializeFromBytesArray(objInputByteArray, 0, objInputByteArray.Length)
Dim objCompressedStreams As CompressedStreams
Dim objDecompressedByteArray() As Byte
objDecompressedByteArray = objCompressedStreams.DecompressBytes(objInputByteArray, "gzip")
Dim objCompressedString As String
objCompressedString = BytesToString(objDecompressedByteArray, 0, objDecompressedByteArray.Length, "UTF8")
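
One possible cause of an EOFException, not confirmed anywhere in this thread: a gzip stream ends with an 8-byte trailer (CRC-32 plus uncompressed length) that is only written when the compressor is closed, and the first web service code reads the MemoryStream without closing the GZipStream. The Python sketch below is for illustration only. It simulates a missing trailer by cutting the last 8 bytes off a valid stream; Python reports the condition as EOFError.

```python
import gzip

complete = gzip.compress("Test text".encode("utf-8"))
truncated = complete[:-8]  # drop the CRC-32 + length trailer

print(gzip.decompress(complete).decode("utf-8"))  # Test text

try:
    gzip.decompress(truncated)
    outcome = "decompressed"
except EOFError as exc:
    outcome = "EOFError"
    print("truncated stream:", exc)
```

If this is what is happening, the B4A side could be correct and the fix would belong on the server, by closing the compression stream before the bytes are read.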
 

joneden

Hi Erel,

I'm still struggling with this - I left it yesterday and am back on it this morning. I've gone back to basics and have written one app so I can generate the string in VB.NET and inspect the byte array etc., and another app in B4A to do the same.

So I generate a string and it starts with byte values 31, 139, 8. This is converted to a base 64 string (ToBase64String). I take that string and copy it into B4A.

When I then decode it from a base 64 string, the corresponding byte values are different: 31, 117, 8. I suspect that other bytes are also wrong. I've no idea why 139 changes to 117...

Until that byte array can be converted properly I'm not even looking at the gzip side of things...

Hopefully this is something obvious that you can suggest now...

Regards,

Jon
 

agraham

Expert
Licensed User
Longtime User
joneden said: "I've no idea why 139 changes to 117..."
Probably an encoding problem. You are starting with 16-bit characters in a string and using a string encoding function, then decoding with a byte decoding function. If it is byte values you want to transfer, then you need to start with a byte array in your VB code and encode that.
 

joneden

Thanks for looking at this. Are you saying that something before the ToBase64String function is causing the issue? With the code shown below I am generating the string from a byte array.

B4X:
Dim _encodedString As Byte() = System.Text.Encoding.UTF8.GetBytes(text)
Dim _memoryStream As New MemoryStream()
Using _gZipStream As New System.IO.Compression.GZipStream(_memoryStream, System.IO.Compression.CompressionMode.Compress, True)
    _gZipStream.Write(_encodedString, 0, _encodedString.Length)
End Using

_memoryStream.Position = 0

Dim _compressedData As Byte() = New Byte(_memoryStream.Length - 1) {}
_memoryStream.Read(_compressedData, 0, _compressedData.Length)

Return Convert.ToBase64String(_compressedData)
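
As a reference point for what the client has to undo, here is a minimal round trip of the same pipeline: UTF-8 bytes, gzip, Base64, and back. It is a Python sketch for illustration only, not the web service or B4A code, and the function names are made up for this example.

```python
import base64
import gzip


def encode_payload(text):
    """UTF-8 encode, gzip, then Base64 (mirrors the server-side steps)."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_payload(payload):
    """Base64 decode, gunzip, then UTF-8 decode (the client-side steps)."""
    compressed = base64.b64decode(payload)
    return gzip.decompress(compressed).decode("utf-8")


payload = encode_payload("Test text")
print(payload[:4])              # H4sI
print(decode_payload(payload))  # Test text
```

Any gzip payload encoded this way starts with "H4sI", because the first three bytes of the stream are always 31, 139, 8.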
 

joneden

Because Length is not 0-based and the array is? If the length was 100, the array should be dimensioned with 99, shouldn't it?

Anyway, I've tried with and without the -1 and it doesn't affect the result of the 139 byte coming in as 117... :(
 

agraham

joneden said: "Are you saying that something before the ToBase64String function is causing the issue? With the code shown below I am generating the string from a byte array."
You said you were encoding a string but that code shows you encoding a byte array so the effect I had in mind (although real) is not relevant here.

It is probably an encoding problem in "System.Text.Encoding.UTF8.GetBytes(text)" as you may see if you try decoding in VB. Chr(139) is Chr(0x8B) which is encoded as two bytes in UTF8 encoding.
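
The UTF-8 point is easy to check. The snippet below (Python, for illustration only) shows that the character with code point 139 becomes two bytes when encoded as UTF-8, whereas a raw byte 139 passed through Base64 comes back unchanged.

```python
import base64

char_139 = chr(139)                     # the character U+008B
utf8_bytes = char_139.encode("utf-8")
print(list(utf8_bytes))                 # [194, 139]

raw = bytes([31, 139, 8])
round_tripped = base64.b64decode(base64.b64encode(raw))
print(list(round_tripped))              # [31, 139, 8]
```

So UTF-8 only changes the picture if the byte value is treated as a character at some point; Base64 on its own preserves byte values.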
 

joneden

OK, I must have something confused there...

Dim test() As String = New String() {"a", "b", "c"} ' Length is 3
ReDim test(3) ' now with 3 as the length the array has 4 elements so dimensioning needs to use the length desired less 1

Hence declaring it with length - 1 (or am I missing something?)


Anyway, yes I've managed to upload data OK.

Thanks agraham, yes I start with a string then do a byte array for the gzip as you saw.

I'll have a look at a raw byte array and see if that works properly...
 

joneden

OK, so eliminating the middle man, as it were, I just passed a byte array to the base 64 encoder and got the same result - at the other end, byte 139 comes out as 117.

I'm wondering if I should try and get the data encoded without using the string...

Re the array discussion: if you run the following in VB.NET you'll see what I mean. I hope that I'm not wrong, otherwise some really weird stuff is going on with my PC :) Weirder than normal anyway!

B4X:
Dim _compressedData As Byte() = New Byte(3) {}
MsgBox(_compressedData.Length) ' Result = 4 (elements - 0,1,2 and 3 - length = 4)
Dim testArray() As Byte = New Byte() {31, 139, 8}
MsgBox(testArray.Length) ' Result = 3
 

agraham

joneden said: "I'm wondering if I should try and get the data encoded without using the string..."
Yes. Try setting the values you want directly into a byte array. I'm guessing you are not quite au fait with Unicode. ;)

joneden said: "if you run the following in vb.net you'll see what I mean"
My bad :( I hastily and wrongly assumed we were referring to B4A arrays.
 

joneden

I did try a dummy byte array with:

Dim testArray() As Byte = New Byte() {31, 139, 8}

then encoded that directly with
Return Convert.ToBase64String(_compressedData)

Same thing BUT I'm guessing that maybe byte 139 isn't right anyway. As you say I'm not familiar at all with encoding and stuff. I've managed to avoid getting into the character side of things so far.

I was wondering what was going on :) I'm pleased that it came up, as I hadn't realised that B4A does it the other way! I'll have to go and make sure that I'm not losing stuff somewhere!

Regards,

Jon
 

agraham

Maybe I'm missing the point but using raw byte values is no problem.
B4X:
Dim testArray() As Byte = New Byte() {31, 139, 8}
Dim i As String = Convert.ToBase64String(testArray)
Dim t() As Byte = Convert.FromBase64String(i)
i = t(0).ToString + " " + t(1).ToString + " " + t(2).ToString
MessageBox.Show(i)
 

joneden

I wondered if somehow the 139 was a wrong value or something - like I said character manipulation isn't something I do much of. :)

Anyway, yes, that code you did works fine. Now run the code below in B4A, where H4sI is the string that was generated by the Convert.ToBase64String(testArray) line when run in VB... you won't get it to show 139; it will be 117.

Dim objStringUtils As StringUtils
Dim objInputByteArray() As Byte
objInputByteArray = objStringUtils.DecodeBase64("H4sI")

And that, in its simplest form, is where the problem lies :)
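
A possible explanation that is not stated in the thread: B4A's Byte type is signed, as in Java, so it holds values from -128 to 127. The bit pattern 0x8B reads as 139 when treated as unsigned (as VB.NET does) and as -117 when treated as signed. If the 117 observed was in fact -117, the decoded bytes would be identical on both sides. The Python sketch below, for illustration only, decodes "H4sI" and prints both interpretations.

```python
import base64

decoded = base64.b64decode("H4sI")

# Unsigned view, as VB.NET's Byte reports it.
unsigned_view = list(decoded)

# Signed view, as a signed 8-bit type reports the same bit patterns.
signed_view = [b - 256 if b > 127 else b for b in decoded]

print(unsigned_view)  # [31, 139, 8]
print(signed_view)    # [31, -117, 8]
```

Under that assumption, masking with 0xFF (for example Bit.And(0xFF, b) in B4A) would show the unsigned value of a signed byte.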
 