I agree with Erel, that's complicated.
However, if accuracy isn't overly important you could come pretty close with something like this:
1. Regardless of image size, resize to 100x100 pixels (*)
2. Posterize the image to 32 colors (*)
3. Calculate a hash on the resulting image
This is a fairly simple way to make all images comparable by forcing them into a similar mold. (You have to accept that accuracy goes out the window, but in many cases it would probably be good enough.)
(*) Size and number of colors might need to be adjusted, depending on your situation. For instance, the smaller you make the image, the more likely you are to also catch copies of the same image that are cropped in different ways.
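To make the resize, posterize, and hash steps concrete, here is a rough sketch in plain Python (chosen only for illustration, no image library involved). It assumes the image is already decoded into a list of rows of (r, g, b) tuples with values 0-255. The function names, the nearest-neighbour resize, the fixed 4x4x2 palette (2 bits red, 2 bits green, 1 bit blue, so at most 32 colors) and the choice of SHA-256 are all my assumptions, not something prescribed above.

```python
import hashlib

def resize_nearest(pixels, size=100):
    """Nearest-neighbour resize of a grid of (r, g, b) tuples to size x size."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // size][x * src_w // size] for x in range(size)]
        for y in range(size)
    ]

def posterize(pixels, bits=(2, 2, 1)):
    """Keep only the top bits of each channel: 4 * 4 * 2 = at most 32 colors."""
    return [
        [tuple(c >> (8 - b) for c, b in zip(px, bits)) for px in row]
        for row in pixels
    ]

def image_hash(pixels, size=100, bits=(2, 2, 1)):
    """Resize, posterize, then hash the resulting pixel data."""
    small = posterize(resize_nearest(pixels, size), bits)
    data = bytes(c for row in small for px in row for c in px)
    return hashlib.sha256(data).hexdigest()

# Hypothetical test images: the same red/blue split at two sizes, with
# slightly different color values, plus an unrelated all-green image.
img_a = [[(200, 30, 30) if x < 100 else (30, 30, 200) for x in range(200)]
         for _ in range(200)]
img_b = [[(205, 28, 33) if x < 200 else (28, 33, 205) for x in range(400)]
         for _ in range(300)]
img_c = [[(30, 200, 30)] * 50 for _ in range(50)]

print(image_hash(img_a) == image_hash(img_b))  # same hash despite size/color drift
print(image_hash(img_a) == image_hash(img_c))  # different image, different hash
```

The size and bits parameters are the two knobs mentioned in the footnote. Keep in mind that a pixel value sitting right at a posterize boundary can still tip two near-identical images into different hashes, which is part of the accuracy you give up.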
PS. You might want to add a step between 1 and 2 that converts the image to grayscale and applies auto levels. Again, that depends on your situation; in many cases it would probably not be necessary.
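A minimal sketch of that optional step, again in plain Python and again assuming the image is a list of rows of (r, g, b) tuples. The function names are made up for illustration. The grayscale conversion uses the common 0.299/0.587/0.114 luma weights, and "auto levels" is taken here to mean a simple min-max stretch; image editors often clip a small share of the histogram first, which this does not do.

```python
def to_grayscale(pixels):
    """Convert (r, g, b) tuples to integer luma values in 0-255."""
    return [
        [(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
        for row in pixels
    ]

def auto_levels(gray):
    """Stretch values linearly so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # A completely flat image has nothing to stretch.
        return [[0 for _ in row] for row in gray]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in gray]

# Hypothetical example: the same tiny image, once darker and once brighter.
dark = [[(50, 50, 50), (100, 100, 100)], [(150, 150, 150), (100, 100, 100)]]
bright = [[(r + 40, g + 40, b + 40) for r, g, b in row] for row in dark]

print(auto_levels(to_grayscale(dark)))    # [[0, 127], [255, 127]]
print(auto_levels(to_grayscale(bright)))  # same values after normalizing
```

The point of the stretch is that a uniformly brighter or darker copy ends up with the same values before it reaches the posterize step, so it has a better chance of producing the same hash.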