B4J Question A dumb question about file compare

JTmartins

Is it normal that comparing two files byte by byte using an InputStream takes a huge amount of time (maybe an hour or more)? I'm talking about a 4.3 MB file.

The code is something like this (I'm not in front of the PC):

B4X:
Private bytesOriginal(LocalFileSize) As Byte
Private bytescopy(copySize) As Byte
bytesOriginal=Bit.InputStreamToBytes(LocalFile)
bytesCopy=Bit.InputStreamToBytes(copyFile)

Then I iterate through the array using a counter inside a timer with a 20 ms interval (it can't be much lower or I get errors).

(I can't use a For...Next loop, as I need to update the UI with a byte-compare counter, and with a For...Next loop I can't update the label text.)

If someone has a better idea, or can offer some help, I would be grateful.

Many thanks.
 

JTmartins

I made a small sample of what I'm trying to do. I'm probably doing something extremely silly.

In this case I set the timer to 20 milliseconds (and even with this value I occasionally get: java.lang.OutOfMemoryError: unable to create new native thread, which is puzzling me. Increasing the value makes the error appear at a later stage, until it actually disappears).

I attach one file with this small project; the other is a sample MP3 for compare-test purposes. The folder "test" with the MP3 should go to c:\ .
File is available here : https://drive.google.com/file/d/0B3GJ8Rccu01eYmJVN2JTVno3RXM/view?usp=sharing


Actually, if my math is not failing, I'm comparing only 1 byte every 20 milliseconds, which is 50 bytes per second. So for a file of 2184381 bytes (the test MP3) it will take about 728 minutes (roughly 12 hours)!!!

I'm pretty sure there is some sorcery to solve this.

help !!!

btw - Using jdk1.8.0_25
 

Attachments

  • Filecompare.zip
    2.1 KB

Erel

B4X founder
Note that your code doesn't read the files at all.

I've changed your code to:
B4X:
Sub FileCompare
   Dim n As Long = DateTime.Now   
   Log(Compare(Bit.InputStreamToBytes(File.OpenInput(OriginalDir,"Canon.mp3")), _
     Bit.InputStreamToBytes(File.OpenInput(CopyDir,"Canon.mp3"))))
   Log(DateTime.Now - n)
End Sub

Sub Compare(array1() As Byte, array2() As Byte) As Boolean
   If array1.Length <> array2.Length Then Return False
   For i = 0 To array1.Length - 1
     If array1(i) <> array2(i) Then Return False
   Next
   Return True
End Sub

It takes 23 ms (milliseconds) to compare the two files. There is no reason to make it complicated and add a progress bar, as it is very quick.
 

JTmartins

LOL, you are right, Erel. When moving the code from my project to this sample, I actually forgot to copy the lines that read the files. It's always extra work, as I try to translate variable names into English so that they have some meaning to others.

B4X:
bytesOriginal=Bit.InputStreamToBytes(originalFile)
bytescopy=Bit.InputStreamToBytes(copiedFile)

But the time it takes is the same, obviously.

I like the progress bar. This is a backup program I'm trying to write. It is working OK (apart from this slowness), and the file may be 2 MB or it may happen to be much bigger.
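For reference, a minimal sketch of how a progress counter could be kept while still using a For...Next loop. It assumes a B4J version that supports resumable subs (Sleep and Wait For). The names CompareWithProgress and lblProgress (a Label declared in Globals) and the 100000-byte update interval are hypothetical and not taken from the attached project.

```b4x
' Hypothetical helper: compares two byte arrays and reports progress.
' lblProgress is an assumed Label declared elsewhere (for example in Globals).
Sub CompareWithProgress(array1() As Byte, array2() As Byte) As ResumableSub
   If array1.Length <> array2.Length Then Return False
   For i = 0 To array1.Length - 1
      If array1(i) <> array2(i) Then Return False
      ' Update the label only every 100000 bytes (assumed interval),
      ' so the UI work stays small compared to the comparison itself.
      If i Mod 100000 = 0 Then
         lblProgress.Text = i & " / " & array1.Length
         Sleep(0) ' pause the sub briefly so the UI can process pending messages
      End If
   Next
   Return True
End Sub

' Possible usage from another sub:
' Wait For (CompareWithProgress(bytesOriginal, bytescopy)) Complete (Same As Boolean)
' Log(Same)
```

The idea is that the loop compares every byte, but only touches the UI occasionally, instead of comparing one byte per timer tick.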

Also, I think that if I try to compare two files of, say, 400 MB each, this method will fail due to memory problems unless I read them in chunks, which is the path I will have to pursue.

Many thanks for your trouble.
 

JTmartins

I managed to find a compromise solution using RandomAccessFile. I compared two files of 350 MB each; it took around 17 seconds, with no memory problems. So I'm happy.
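The actual code was not posted. A minimal sketch of what a chunked comparison with RandomAccessFile might look like, assuming the RandomAccessFile library is checked in the Libraries tab. The sub name CompareFilesChunked and the 64 KB chunk size are hypothetical choices for illustration.

```b4x
' Hypothetical sketch: compares two files chunk by chunk, so only
' two small buffers are held in memory regardless of file size.
Sub CompareFilesChunked(Dir1 As String, Name1 As String, Dir2 As String, Name2 As String) As Boolean
   ' Files of different sizes cannot be identical, so no reading is needed.
   If File.Size(Dir1, Name1) <> File.Size(Dir2, Name2) Then Return False
   Dim raf1 As RandomAccessFile
   Dim raf2 As RandomAccessFile
   raf1.Initialize(Dir1, Name1, True) ' True = open read-only
   raf2.Initialize(Dir2, Name2, True)
   Dim chunkSize As Int = 65536 ' assumed chunk size
   Dim buf1(chunkSize) As Byte
   Dim buf2(chunkSize) As Byte
   Dim pos As Long = 0
   Dim same As Boolean = True
   Do While same And pos < raf1.Size
      Dim wanted As Int = Min(chunkSize, raf1.Size - pos)
      Dim n1 As Int = raf1.ReadBytes(buf1, 0, wanted, pos)
      Dim n2 As Int = raf2.ReadBytes(buf2, 0, wanted, pos)
      ' Compare only the bytes that both reads actually returned.
      Dim count As Int = Min(n1, n2)
      If count <= 0 Then
         same = False ' unexpected end of data: treated as a mismatch in this sketch
      Else
         For i = 0 To count - 1
            If buf1(i) <> buf2(i) Then
               same = False
               Exit
            End If
         Next
         pos = pos + count
      End If
   Loop
   raf1.Close
   raf2.Close
   Return same
End Sub
```

With the directories from the sample above, it could be called as Log(CompareFilesChunked(OriginalDir, "Canon.mp3", CopyDir, "Canon.mp3")). Because each read is positioned explicitly, both files are always compared at the same offset.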
 