Android Question compress and decompress string with Zlib

KZero

Active Member
Licensed User
Longtime User
This code works:

B4X:
Dim data() As Byte
Dim compress As CompressedStreams
data = "Playing with in-memory compression.".GetBytes("UTF8")
Dim compressed(), decompressed() As Byte
compressed = compress.CompressBytes(data, "zlib")
decompressed = compress.DecompressBytes(compressed, "zlib")
Msgbox(BytesToString(decompressed, 0, decompressed.Length, "UTF8"), "")

Why does this not work?

B4X:
Dim data() As Byte
Dim compress As CompressedStreams
data = "Playing with in-memory compression.".GetBytes("UTF8")
Dim compressed(), decompressed() As Byte
compressed = compress.CompressBytes(data, "zlib")
Dim x As String
x = BytesToString(compressed, 0, compressed.Length, "UTF8")

decompressed = compress.DecompressBytes(x.GetBytes("UTF8"), "zlib")
Msgbox(BytesToString(decompressed, 0, decompressed.Length, "UTF8"), "")
 

KZero

I get an error at runtime on this line:

decompressed = compress.DecompressBytes(x.getbytes("UTF8"), "zlib")

java.io.IOException
 

sirjo66

Well-Known Member
Licensed User
Longtime User
You can't use this line:
B4X:
x=BytesToString(compressed,0,compressed.Length,.....
You are trying to convert a byte array into a UTF-8 string, but the data isn't valid UTF-8 text. The system tries to decode it anyway and cannot, so you lose data.

This is because UTF-8 uses 2 or more bytes for many characters, so not every byte sequence is a valid character.
You can try changing "UTF-8" to "ANSI" or "ISO-8859-1", which use only 1 byte per character, and that may convert the array to a string without losing data.
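A quick way to see the data loss is to compare the byte counts before and after each round trip. This is only a minimal sketch, assuming the same CompressedStreams object as in the question; the variable names are made up for illustration. zlib output typically starts with the bytes 0x78 0x9C, and 0x9C on its own is not a valid UTF-8 sequence, so the UTF8 round trip would be expected to alter the data right at the header.

```b4x
' Sketch: compare the byte count before and after a String round trip.
Dim compress As CompressedStreams
Dim data() As Byte = "Playing with in-memory compression.".GetBytes("UTF8")
Dim compressed() As Byte = compress.CompressBytes(data, "zlib")

' Round trip through UTF8: invalid byte sequences are replaced while decoding.
Dim utf8Text As String = BytesToString(compressed, 0, compressed.Length, "UTF8")
Dim viaUtf8() As Byte = utf8Text.GetBytes("UTF8")

' Round trip through ISO-8859-1: every byte value maps to exactly one character.
Dim latinText As String = BytesToString(compressed, 0, compressed.Length, "ISO-8859-1")
Dim viaLatin1() As Byte = latinText.GetBytes("ISO-8859-1")

Log("compressed:     " & compressed.Length)
Log("via UTF8:       " & viaUtf8.Length)
Log("via ISO-8859-1: " & viaLatin1.Length)
```

A different count for the UTF8 round trip means the bytes were changed, which is why DecompressBytes then fails.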
 

KZero

ANSI or ISO only accepts regular characters, and any other symbol will be "?", so a lot of data will be lost.

My Android app receives a compressed UTF8 string from a server app.
I can't send it as bytes from the server, and encoding the bytes to Base64 will double the packet size.

Any ideas?
 

James Chamblin

Active Member
Licensed User
Longtime User
KZero said:
ANSI or ISO only accepts regular characters, and any other symbol will be "?", so a lot of data will be lost.

Not true. Unrecognized characters will be printed as "?", but they are stored with their original values.

KZero said:
My Android app receives a compressed UTF8 string from a server app.
I can't send it as bytes from the server, and encoding the bytes to Base64 will double the packet size.
Any ideas?

The problem here is that once the UTF8 string is compressed, it is no longer UTF8. It will just appear as random bytes to everything except the decompressor. Using ANSI or ISO-8859-1 in BytesToString() will store the bytes as though they were 8-bit characters, even though some may not be printable. When you use GetBytes, the characters are converted back to a byte array, which can then be decompressed back into a UTF8 string.

If you try this modification in your test program, you will see that all data is preserved.
B4X:
Sub Activity_Create(FirstTime As Boolean)
    'Do not forget to load the layout file created with the visual designer. For example:
    'Activity.LoadLayout("Layout1")
    Dim data() As Byte
    Dim compress As CompressedStreams
    data = "Playing with in-memory compression. £££".GetBytes("UTF8")
    Dim compressed(), decompressed() As Byte
    compressed = compress.CompressBytes(data, "zlib")
    Dim x As String
    x = BytesToString(compressed, 0, compressed.Length, "ISO-8859-1")

    decompressed = compress.DecompressBytes(x.GetBytes("ISO-8859-1"), "zlib")
    Msgbox(BytesToString(decompressed, 0, decompressed.Length, "UTF8"), "")
End Sub
 

KZero

That's right.
I thought the "?" character in ANSI always had the value 63.

The problem is that the server side uses a custom socket library that only sends String packets and automatically converts them to UTF8.
I didn't have the same issue with Delphi: ZDecompressStr(StrToByte(DATA)); works and everything is OK.

I think the only solution in this case is converting the compressed bytes to Base64, as Erel mentioned.

Thanks.
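For reference, the Base64 route could look roughly like the sketch below. It assumes the StringUtils library is checked in the Libraries tab and reuses the CompressedStreams object from the earlier posts; the names packet and received are only illustrative, and both sides are shown in one place although in practice the encoding would happen on the server. Note that Base64 produces 4 characters for every 3 bytes, so the packet grows by about one third rather than doubling.

```b4x
' Sketch: carry compressed bytes over a String-only transport as Base64 text.
Dim su As StringUtils
Dim compress As CompressedStreams
Dim data() As Byte = "Playing with in-memory compression. £££".GetBytes("UTF8")
Dim compressed() As Byte = compress.CompressBytes(data, "zlib")

' Sender side: Base64 text is plain ASCII, so a UTF8 conversion in transit leaves it unchanged.
Dim packet As String = su.EncodeBase64(compressed)
Log("compressed bytes: " & compressed.Length & ", Base64 chars: " & packet.Length)

' Receiver side: decode back to the exact original bytes, then decompress.
Dim received() As Byte = su.DecodeBase64(packet)
Dim decompressed() As Byte = compress.DecompressBytes(received, "zlib")
Log(BytesToString(decompressed, 0, decompressed.Length, "UTF8"))
```

The logged lengths make it easy to check the real overhead for a typical packet before committing to this approach.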
 