As far as I know there is no direct way to do this in B4A, so RandomCoder's approach sounds good.
You must also take into account that camera images vary with lighting conditions, so auto white balance should be turned off (especially if the images are taken outdoors) and the other camera settings should be fixed as well. You could also normalize the grayscale result.
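To show what I mean by normalizing, here is a minimal sketch in plain Python (PC side, not B4A). It assumes the image is already a flat list of grayscale values; the function names and the sample values are made up for illustration. It rescales each image to zero mean and unit variance, so a uniform brightness or contrast change no longer shows up in the difference.

```python
def normalize_gray(pixels):
    """Rescale a flat list of grayscale values to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = variance ** 0.5
    if std == 0:
        # Completely flat image: nothing to scale, return all zeros.
        return [0.0] * n
    return [(p - mean) / std for p in pixels]


def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical data: the same scene, once dark and once brighter (2 * dark + 40).
dark = [10, 20, 30, 40]
bright = [60, 80, 100, 120]

raw_difference = mean_abs_diff(dark, bright)
normalized_difference = mean_abs_diff(normalize_gray(dark), normalize_gray(bright))
print(raw_difference, normalized_difference)
```

The raw difference is large even though the scene is the same, while the normalized difference is essentially zero. This only compensates for global changes; shadows or local lighting changes will still show up.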
When you say "similar", you must also decide whether you accept "displaced" images: the human eye will recognize two displaced images as similar, but your algorithm may not, depending on how you implement it.
A more elaborate approach would be image correlation. It needs some math background and really fast processing, so perhaps 1 second is not enough. I don't know if there are fast 2D FFT routines available for B4A.
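To make the correlation idea concrete, here is a rough sketch in plain Python. Note that it is NOT the FFT version: it computes a normalized cross-correlation directly in the spatial domain and simply tries every shift within a small range, which is only practical for small (downscaled) images. Images are assumed to be lists of rows of grayscale values, and all names (ncc_at_shift, best_shift, make_image) are mine, just for illustration.

```python
def ncc_at_shift(a, b, dx, dy):
    """Normalized correlation of a[y][x] against b[y + dy][x + dx] over the overlap."""
    h, w = len(a), len(a[0])
    pa, pb = [], []
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                pa.append(a[y][x])
                pb.append(b[yy][xx])
    n = len(pa)
    if n == 0:
        return 0.0
    mean_a = sum(pa) / n
    mean_b = sum(pb) / n
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(pa, pb))
    den_a = sum((p - mean_a) ** 2 for p in pa)
    den_b = sum((q - mean_b) ** 2 for q in pb)
    if den_a == 0 or den_b == 0:
        return 0.0  # a flat patch carries no information
    return num / (den_a * den_b) ** 0.5


def best_shift(a, b, max_shift):
    """Try every shift in [-max_shift, max_shift] and return (score, dx, dy) of the best."""
    best = (-2.0, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ncc_at_shift(a, b, dx, dy)
            if score > best[0]:
                best = (score, dx, dy)
    return best


def make_image(w, h, blob_x, blob_y):
    """Hypothetical test image: black background with one small asymmetric blob."""
    img = [[0] * w for _ in range(h)]
    img[blob_y][blob_x] = 9
    img[blob_y][blob_x + 1] = 7
    img[blob_y + 1][blob_x] = 5
    img[blob_y + 1][blob_x + 1] = 3
    return img


reference = make_image(6, 6, 2, 2)
displaced = make_image(6, 6, 3, 2)  # same blob, moved one pixel to the right

unshifted_score = ncc_at_shift(reference, displaced, 0, 0)
score, dx, dy = best_shift(reference, displaced, 2)
print(unshifted_score, score, dx, dy)  # best match is found at dx == 1, dy == 0
```

Compared pixel by pixel without any shift, the two images score poorly, but the search finds the one-pixel displacement and the score there is practically 1. The cost grows with image size times the number of shifts tried, which is why an FFT-based correlation is the usual choice for full-size images.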
Also, I would highly recommend implementing whatever algorithm you choose on a PC first, with a set of test images to try until you have "tuned" the threshold parameters needed, and only then coding it in B4A.
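For the tuning step on the PC, something as simple as the following sketch could be a starting point. It assumes you have already computed a similarity score (higher means more similar) for pairs you labeled by hand as "similar" or "different", and it picks the threshold that misclassifies the fewest of them. The function name and the scores below are invented for illustration, not measured results.

```python
def tune_threshold(similar_scores, different_scores):
    """Return (threshold, errors) minimizing misclassified pairs.

    A pair is classified as similar when its score >= threshold.
    """
    candidates = sorted(set(similar_scores) | set(different_scores))
    best_threshold, best_errors = None, None
    for t in candidates:
        missed = sum(1 for s in similar_scores if s < t)          # similar pairs rejected
        false_alarms = sum(1 for s in different_scores if s >= t)  # different pairs accepted
        errors = missed + false_alarms
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors


# Hypothetical scores from a hand-labeled test set.
similar = [0.93, 0.88, 0.97, 0.81]
different = [0.42, 0.65, 0.30, 0.84]

threshold, errors = tune_threshold(similar, different)
print(threshold, errors)
```

If the best threshold still leaves many errors on your test set, that is a sign the similarity measure itself needs work before porting anything to B4A.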