Android Question: How to check if a file is a ZIP archive (without unpacking)?

peacemaker

Hi, All

How can I detect that a file is a ZIP archive?
And if it is, I want to try unpacking it to check its integrity.
This is for an FTP server exchange: not all files are ZIPs, but when a file definitely is a ZIP, checking its integrity is very important. Other files, of course, cannot and should not be "unpacked".
 

DonManfred


Sandman

In addition to the link DonManfred gave, there is an excellent tool called "file", originally from Linux and similar systems, but apparently also ported to Windows.

I can't imagine that you will be able to run it on Android, but it's open source, so it's entirely possible to see how it decides what a file is and get inspired for your own implementation. As for zip files, it does look into the file to perform some analysis.

This shows how it works:
B4X:
sandman@cray:/tmp echo "wuba luba dub dub" > peacemaker.txt
sandman@cray:/tmp zip peacemaker.zip peacemaker.txt
  adding: peacemaker.txt (deflated 17%)
sandman@cray:/tmp file peacemaker.zip
peacemaker.zip: Zip archive data, at least v2.0 to extract

And it's smart enough to not be fooled:
B4X:
sandman@cray:/tmp echo "wuba luba dub dub" > peacemaker.zip
sandman@cray:/tmp file peacemaker.zip 
peacemaker.zip: ASCII text

I doubt it will help you with checking the integrity though.
 

peacemaker

"I doubt it will help you with checking the integrity though."
Still, the algorithm of this "file" tool is interesting.

Maybe this will work (final updated version):
B4X:
'check if file format is of ZIP-archive, to be unpacked
Private Sub isZIPfile (dir As String, fn As String) As Boolean
    If File.Exists(dir, fn) = False Then
        Return False
    End If
    
    Dim ist As InputStream = File.OpenInput(dir, fn)
    
    Dim buffer(4) As Byte
    Dim res As Int = ist.ReadBytes(buffer, 0, 4)
    ist.Close
    If res = 4 Then
        If buffer(0) = 0x50 And buffer(1) = 0x4b And buffer(2) = 0x03 And buffer(3) = 0x04 Then
            Return True
        Else
            Return False
        End If
    Else
        Return False
    End If
End Sub

And I guess that .zip, .apk, .xlsx and similar files will all return True.
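For reference, 50 4B 03 04 is the ZIP local file header signature, which is why any ZIP-based container matches. An empty archive starts with 50 4B 05 06 instead, and a spanned archive with 50 4B 07 08, so a slightly wider check may be worth considering. Below is a minimal sketch of that idea in plain Java (the platform B4A compiles to), not B4X; the names ZipSignature, matches and looksLikeZip are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ZipSignature {
    // True when the first four bytes are one of the three ZIP signatures.
    static boolean matches(byte[] head) {
        if (head == null || head.length < 4) return false;
        if (head[0] != 0x50 || head[1] != 0x4B) return false;   // "PK"
        return (head[2] == 0x03 && head[3] == 0x04)              // local file header
            || (head[2] == 0x05 && head[3] == 0x06)              // empty archive
            || (head[2] == 0x07 && head[3] == 0x08);             // spanned archive
    }

    // Reads at most four bytes from the file; nothing is unpacked.
    static boolean looksLikeZip(File file) throws IOException {
        if (!file.isFile()) return false;
        try (InputStream in = new FileInputStream(file)) {
            byte[] head = new byte[4];
            int filled = 0;
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) return false;    // file is shorter than four bytes
                filled += n;
            }
            return matches(head);
        }
    }
}
```

Like the B4X sub, this only says the file claims to be a ZIP; it says nothing about whether the content is intact.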
 

Sandman

I'm not commenting on the logic in your sub, I don't know anything about it. I just saw a couple of ways to reduce the complexity:
B4X:
Private Sub ifZIPfile (dir As String, fn As String) As Boolean
    Dim raf As RandomAccessFile
    raf.Initialize(dir, fn, True)
    Dim buffer() As Byte
    Dim res As Int = raf.ReadBytes(buffer, 0, 4, 0)

    If res <> 4 then Return False

    Return (buffer(0) = 0x50 And buffer(1) = 0x4b And buffer(2) = 0x03 And buffer(3) = 0x04)
End Sub

(I would also name the sub isZipFile instead, but that's probably more of a personal preference.)
 

JohnC

Or maybe you might be able to just use the function:

B4X:
IsValidZipFile (ArchivePath As String) As Boolean

From @Informatix's ArchiverPlusZip library.

 

Star-Dust

You can also use my SD_Zip library

B4X:
Dim UZ As SD_ZipLibrary

UZ.Initialize
Dim L As List = UZ.unZipList(File.Combine(File.DirRootExternal,fileZipName))
Wait For UZ_finish(Success As Boolean)
If Success Then Log("File ZIP Intact")
 

drgottjr

and don't forget .b4xlib files!
 

Star-Dust

"No. Wrong way. ZIP archives may be huge, why wait for unpacking?"
It only reads the list of entries; it does not unpack the content, so it takes less time.

Then choose whichever solution you prefer.
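Listing is cheap because a ZIP keeps its table of contents (the central directory) at the end of the file, so the entry names can be read without touching the compressed data. The following is not SD_Zip's actual implementation, only a minimal sketch of the same listing-only idea using the standard java.util.zip.ZipFile class; ZipListing and listEntries are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class ZipListing {
    // Returns the entry names, or null when the file cannot be opened as a ZIP archive.
    static List<String> listEntries(File archive) {
        try (ZipFile zip = new ZipFile(archive)) {
            List<String> names = new ArrayList<>();
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            return names;
        } catch (IOException e) {   // ZipException is a subclass of IOException
            return null;
        }
    }
}
```

A successful listing shows that the central directory is readable; it does not prove that every entry's data is undamaged.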
 

Star-Dust

Please note that verifying that the header is correct does not guarantee that the content is intact.
In your question you asked about the integrity of the file, and you can check this, at least partly, by reading the entry list or by checking the CRC.
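To illustrate the CRC route: each entry in the archive stores a CRC-32 of its uncompressed data, so one way to test integrity is to stream every entry through a CRC32 and compare, without writing anything to disk. This is a minimal sketch in plain Java under that assumption, not the code of any library mentioned in this thread; ZipIntegrity and isIntact are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class ZipIntegrity {
    // True when the archive opens and every entry's data matches its stored CRC-32.
    static boolean isIntact(File archive) {
        try (ZipFile zip = new ZipFile(archive)) {
            byte[] buf = new byte[8192];
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) continue;
                CRC32 crc = new CRC32();
                try (InputStream in = zip.getInputStream(entry)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        crc.update(buf, 0, n);   // data is decompressed in memory, never written out
                    }
                }
                // getCrc() returns -1 when the stored CRC is unknown
                if (entry.getCrc() != -1 && crc.getValue() != entry.getCrc()) return false;
            }
            return true;
        } catch (IOException e) {   // not a ZIP, truncated, or corrupt compressed data
            return false;
        }
    }
}
```

This still has to decompress the whole archive, so on huge files it costs about as much CPU time as unpacking; it only saves the disk writes.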
 

peacemaker

Strangely, this line of my code
Dim res As Int = raf.ReadBytes(buffer, 0, 4, 0)
throws an error on any file:

aftp_event = FTP file downloaded OK: /.codepage; file = .codepage, size = 6, date = 28-11-2018 00:00:00
DownloadCompleted: /.codepage, Success=true
Error occurred on line: 244 (aftpclient)
java.lang.IndexOutOfBoundsException
at java.nio.ByteBuffer.wrap(ByteBuffer.java:322)
at anywheresoftware.b4a.randomaccessfile.RandomAccessFile.ReadBytes(RandomAccessFile.java:231)
at java.lang.reflect.Method.invoke(Native Method)
at anywheresoftware.b4a.shell.Shell.runMethod(Shell.java:732)
at anywheresoftware.b4a.shell.Shell.raiseEventImpl(Shell.java:348)
at anywheresoftware.b4a.shell.Shell.raiseEvent(Shell.java:255)
at java.lang.reflect.Method.invoke(Native Method)
at anywheresoftware.b4a.ShellBA.raiseEvent2(ShellBA.java:144)
at anywheresoftware.b4a.BA$2.run(BA.java:387)
at android.os.Handler.handleCallback(Handler.java:883)
at android.os.Handler.dispatchMessage(Handler.java:100)
at android.os.Looper.loop(Looper.java:214)
at android.app.ActivityThread.main(ActivityThread.java:7356)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:930)

The RandomAccessFile library is v2.33, on B4A v10.0.
 

drgottjr

You have to declare the buffer with a size, e.g. Dim buffer(8) As Byte. Without a size the array is empty, so ReadBytes has nowhere to put the 4 bytes.
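The stack trace points at ByteBuffer.wrap, called from RandomAccessFile.ReadBytes. As far as I can tell, an unsized B4X array ends up as a zero-length Java array, and wrapping 4 bytes of a zero-length array is out of range. A minimal Java sketch that reproduces just that behaviour (WrapDemo and canWrap are made-up names; ByteBuffer.wrap is the standard call seen in the trace):

```java
import java.nio.ByteBuffer;

final class WrapDemo {
    // True when 'length' bytes starting at offset 0 fit inside 'buffer'.
    static boolean canWrap(byte[] buffer, int length) {
        try {
            ByteBuffer.wrap(buffer, 0, length);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;   // same exception type as in the log above
        }
    }
}
```

Here canWrap(new byte[0], 4) fails while canWrap(new byte[8], 4) succeeds, which matches the fix of giving the buffer a size of at least 4.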
 