B4J Question Perceptual Hashes - create hash from image and compare for similarity

Sandman

Expert
Licensed User
Longtime User
I agree with Erel, that's complicated.

However, if accuracy isn't overly important you could come pretty close with something like this:
  1. Regardless of image size, resize to 100x100 pixels (*)
  2. Posterize the image to 32 colors (*)
  3. Calculate hash on resulting image
This is quite a simple solution: it makes all images comparable by forcing them into a similar mold. You have to live with the fact that accuracy goes out the window, but in many cases it would probably be good enough.

(*) Size and number of colors might need to be adjusted, depending on your situation. The smaller you make the image the more you will also catch identical images that are cropped in different ways, for instance.

PS. You might want to add a step between 1 and 2 where you perhaps convert to grayscale and apply auto levels to the image. Again, depending on your situation. In many cases it would probably not be necessary.
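
The optional grayscale and auto levels step could be sketched roughly as below. This is only an illustration under assumptions: the sub name GrayAutoLevels and its parameter are hypothetical, the gray value uses the common 0.299 / 0.587 / 0.114 luma weights, and "auto levels" is reduced to a plain linear stretch of the darkest and brightest gray to 0..255, which is simpler than what image editors usually do.
B4X:
' Hypothetical helper, not from the posts in this thread.
' Converts the pixels of a BitmapCreator to grayscale and stretches the range to 0..255.
Sub GrayAutoLevels (creator As BitmapCreator)
    Dim argb As ARGBColor
    Dim minGray As Int = 255
    Dim maxGray As Int = 0
    ' Pass 1: replace each pixel with its gray value and record the darkest / brightest gray.
    For x = 0 To creator.mWidth - 1
        For y = 0 To creator.mHeight - 1
            creator.GetARGB(x, y, argb)
            Dim gray As Int = 0.299 * argb.r + 0.587 * argb.g + 0.114 * argb.b
            argb.r = gray
            argb.g = gray
            argb.b = gray
            creator.SetARGB(x, y, argb)
            minGray = Min(minGray, gray)
            maxGray = Max(maxGray, gray)
        Next
    Next
    ' A completely flat image has nothing to stretch.
    If maxGray = minGray Then Return
    ' Pass 2: map minGray..maxGray linearly onto 0..255.
    For x = 0 To creator.mWidth - 1
        For y = 0 To creator.mHeight - 1
            creator.GetARGB(x, y, argb)
            Dim v As Int = (argb.r - minGray) * 255 / (maxGray - minGray)
            argb.r = v
            argb.g = v
            argb.b = v
            creator.SetARGB(x, y, argb)
        Next
    Next
End Sub

It would be called on the 100 x 100 BitmapCreator after the pixels have been copied in and before the hash is calculated.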
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
I made some tests with posterizing / binning the images.

I got the best results with the simplest approach:
B4X:
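' Returns the sum of all channel values of the image as a simple "hash".
Sub CalcHash (bmp As B4XBitmap) As Int
    ' Copy the pixels of the bitmap into the BitmapCreator.
    bc.CopyPixelsFromBitmap(bmp)
    Dim argb As ARGBColor
    Dim hash As Int
    For x = 0 To bc.mWidth - 1
        For y = 0 To bc.mHeight - 1
            ' Read the pixel at (x, y) and add its four channels to the running sum.
            bc.GetARGB(x, y, argb)
            hash = hash + argb.r + argb.g + argb.b + argb.a
        Next
    Next
    Return hash
End Sub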

Here bc is a BitmapCreator whose size is 100 x 100.
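
A minimal sketch of how CalcHash might be wired up and used for a comparison, under a few assumptions: bc and xui are module-level variables, the helper names HashFromFile and LooksSimilar are hypothetical, and the tolerance is a made-up parameter that has to be tuned on real images.
B4X:
' Assumes these declarations exist in Class_Globals or Process_Globals:
'   Private xui As XUI
'   Private bc As BitmapCreator

' Hypothetical helper: load an image, force it to 100 x 100 and hash it with CalcHash.
Sub HashFromFile (Dir As String, FileName As String) As Int
    bc.Initialize(100, 100)
    Dim bmp As B4XBitmap = xui.LoadBitmap(Dir, FileName)
    ' False = ignore the aspect ratio, so every image ends up exactly 100 x 100.
    bmp = bmp.Resize(100, 100, False)
    Return CalcHash(bmp)
End Sub

' Hypothetical comparison: two images count as similar when their sums are close.
Sub LooksSimilar (Hash1 As Int, Hash2 As Int, Tolerance As Int) As Boolean
    Return Abs(Hash1 - Hash2) <= Tolerance
End Sub

Because the hash is just a sum of channel values, it ignores where the pixels are, so two unrelated images can end up with the same or nearly the same value. Testing on a set of known images is the only way to find a tolerance that works.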
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
B4X:
    Dim BinSize As Int = 30
    argb.a = Round(Bit.And(0xff, argb.a) / BinSize) * BinSize
    argb.r = Round(Bit.And(0xff, argb.r) / BinSize) * BinSize
    argb.g = Round(Bit.And(0xff, argb.g) / BinSize) * BinSize
    argb.b = Round(Bit.And(0xff, argb.b) / BinSize) * BinSize

You can see the result by adding bc.SetARGB(x, y, argb).
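
One plausible way to combine the binning lines with the earlier CalcHash loop is to apply them to each pixel right after GetARGB and before the values are summed. The sketch below assumes the same module-level bc (a 100 x 100 BitmapCreator); the sub name CalcBinnedHash and the BinSize parameter are hypothetical.
B4X:
' Hypothetical variant of CalcHash with the binning step applied per pixel.
Sub CalcBinnedHash (bmp As B4XBitmap, BinSize As Int) As Int
    bc.CopyPixelsFromBitmap(bmp)
    Dim argb As ARGBColor
    Dim hash As Int
    For x = 0 To bc.mWidth - 1
        For y = 0 To bc.mHeight - 1
            bc.GetARGB(x, y, argb)
            ' Snap each channel to the nearest multiple of BinSize.
            argb.a = Round(Bit.And(0xff, argb.a) / BinSize) * BinSize
            argb.r = Round(Bit.And(0xff, argb.r) / BinSize) * BinSize
            argb.g = Round(Bit.And(0xff, argb.g) / BinSize) * BinSize
            argb.b = Round(Bit.And(0xff, argb.b) / BinSize) * BinSize
            ' Optional: write the binned pixel back to inspect the result visually.
            ' bc.SetARGB(x, y, argb)
            hash = hash + argb.r + argb.g + argb.b + argb.a
        Next
    Next
    Return hash
End Sub

Calling CalcBinnedHash(bmp, 30) would use the bin size from the snippet above.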
 

Alexander Stolte

Expert
Licensed User
Longtime User
And how do you combine that with the code above?
I have no idea what I'm doing here 😅
 

Sandman

Expert
Licensed User
Longtime User
Alexander Stolte said: "I have no idea what I'm doing here"
I recommend that before you start coding, you do steps 1 and 2 manually for a corpus of images. Then use Erel's code to calculate the hash for each image and see if the result is what you expect. That way you don't waste days on something that might not work for you. Remember: this is not an accurate method. It might work for you, or it might not. :)

Steps 1 and 2 can easily be performed using ImageMagick (which is an amazing tool for processing images). Here are the two commands that you would use:
 