Android Question Counting Overlapping Objects in Images: App Development Challenge

jroriz · Feb 13, 2025

Hello everyone!

I'm developing an application that needs to count the number of glass plates in photos. The plates are rectangular, of equal size, and usually arranged similarly in all photos. However, the challenge is that some plates may overlap, as in the example image below, where there are two overlapping plates on the right side.

The idea is for the application to be able to count the plates with the highest possible accuracy, even when there is overlap. The plates will always be more or less in the same position and have the same width, which may be useful for algorithm development.

I would like to know if anyone has any suggestions or tips for dealing with this overlap problem and ensuring accuracy in counting. What computer vision techniques or libraries would be most suitable for this case?

I appreciate any help or guidance you can give me.

Thank you!

jroriz · Feb 14, 2025

Daestrum said:
The model I used was "google/owlv2-base-patch16-ensemble" (about 600MB). You would probably need to run it on a PC and call it from Android.

Did you download it from huggingface?

Daestrum · Feb 14, 2025

Yes, from Hugging Face.
This is the Python code I used:

Python:

from PIL import Image, ImageDraw, ImageFont
import torch
from transformers import Owlv2Processor, Owlv2ForObjectDetection

def get_model():
    processor = Owlv2Processor.from_pretrained("google/owlv2-base-patch16-ensemble")
    model = Owlv2ForObjectDetection.from_pretrained("google/owlv2-base-patch16-ensemble")
    return processor, model

def get_image(image_path):
    return Image.open(image_path)

def get_texts(processor, image, look_for: str):
    texts = [look_for]
    inputs = processor(text=texts, images=image, return_tensors="pt")
    return inputs, texts

def process(model, processor, inputs, image):
    with torch.no_grad():
        outputs = model(**inputs)

        # Target image sizes (height, width) to rescale box predictions [batch_size, 2]
        target_sizes = torch.Tensor([image.size[::-1]])
        # Convert outputs (bounding boxes and class logits) to Pascal VOC Format (xmin, ymin, xmax, ymax)
        results = processor.post_process_object_detection(outputs=outputs, target_sizes=target_sizes, threshold=0.1)
        return results

def draw_boxes(image, boxes, scores, labels, texts):
    draw = ImageDraw.Draw(image)
    font = ImageFont.load_default()

    for box, score, label in zip(boxes, scores, labels):
        box = [round(i, 2) for i in box.tolist()]
        draw.rectangle(box, outline='red', width=2)
        draw.text((box[0], box[1]), f"{texts[label]}: {round(score.item(), 2)}", fill='blue', font=font)

    image.show()

def run_me(target_object: str, picture_url: str):
    processor, model = get_model()
    image = get_image(picture_url)
    inputs, texts = get_texts(processor, image, target_object)
    results = process(model, processor, inputs, image)
   
    # Predictions for the first (and only) image in the batch
    boxes, scores, labels = results[0]["boxes"], results[0]["scores"], results[0]["labels"]

    # Pass the full list of queries: each label is an index into that list
    draw_boxes(image, boxes, scores, labels, texts)
    print(len(boxes))  # number of boxes above the score threshold

    # Collect every detection instead of returning after the first one
    detections = []
    for box, score, label in zip(boxes, scores, labels):
        box = [round(i, 2) for i in box.tolist()]
        detections.append(f"Detected {texts[label]} with confidence {round(score.item(), 3)} at location {box}")
    return detections
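
Because the number drawn next to each box is a confidence rather than a count, the count has to come from how many boxes survive filtering. Below is a minimal sketch of that step, assuming the boxes and scores returned for the image have been converted to plain lists with `.tolist()`. The helper names `iou` and `count_detections` and both threshold values are illustrative assumptions, not part of the model's API. Boxes that overlap almost completely are treated as duplicates of the same pane, so the IoU cut-off may need raising if genuinely overlapping plates end up merged.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_detections(boxes, scores, min_score=0.3, max_iou=0.8):
    """Count boxes scoring at least min_score, skipping near-duplicate boxes."""
    # Visit the most confident boxes first so duplicates lose to them.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if scores[i] < min_score:
            break  # every remaining box is less confident still
        if all(iou(boxes[i], boxes[k]) <= max_iou for k in kept):
            kept.append(i)
    return len(kept)


# Hypothetical values: two near-identical boxes, one separate box, one weak box.
boxes = [[0, 0, 100, 50], [2, 1, 101, 50], [0, 60, 100, 110], [0, 120, 100, 170]]
scores = [0.9, 0.6, 0.8, 0.05]
print(count_detections(boxes, scores))
```

Both thresholds would need tuning against photos where the true count is known.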

emexes · Feb 16, 2025

Daestrum said:
The prompt was simply 'glass panes'. The number was how sure it was, not the count.

Well, if I'd known we don't have to count the glass sheets, that makes it easier. But my code is longer than just two words, and it's not very happy with the reflections on the upper edges. And I'm intrigued to see how far the AI/ML model can go.

Or some digital calipers, if there is an equivalent to counting scales. Although it might not always be easy to get the calipers in to measure the unit thickness.
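
The calipers idea would work the way a counting scale does: divide the measured total by the value for one unit. A rough sketch, assuming the sheets sit flush against each other with no gaps; the function name and the millimetre figures are made up for illustration.

```python
def count_from_thickness(stack_mm, sheet_mm):
    """Estimate the sheet count from total stack thickness and one sheet's thickness."""
    if sheet_mm <= 0:
        raise ValueError("sheet thickness must be positive")
    # Round to the nearest whole sheet to absorb small measurement error.
    return round(stack_mm / sheet_mm)


# Hypothetical measurement: a 47.8 mm stack of nominally 4 mm sheets.
print(count_from_thickness(47.8, 4.0))
```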

Daestrum · Feb 16, 2025

I suppose the next step would be to grab the image in the bounding box, enlarge it, and ask again for the panes.
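
That second pass could look roughly like the following. It is only a sketch: `padded_box` and `crop_and_enlarge` are hypothetical helper names, the margin and scale values are guesses, `image` is assumed to be the PIL image from the earlier code, and `box` is assumed to be one detected `(xmin, ymin, xmax, ymax)` box converted with `.tolist()`.

```python
def padded_box(box, image_size, margin=10):
    """Grow an (xmin, ymin, xmax, ymax) box by margin pixels, clamped to the image."""
    width, height = image_size
    xmin, ymin, xmax, ymax = box
    return (
        max(0, int(xmin) - margin),
        max(0, int(ymin) - margin),
        min(width, int(xmax) + margin),
        min(height, int(ymax) + margin),
    )


def crop_and_enlarge(image, box, scale=3, margin=10):
    """Crop the padded box out of a PIL image and upscale it by an integer scale."""
    region = image.crop(padded_box(box, image.size, margin))
    return region.resize((region.width * scale, region.height * scale))
```

The enlarged crop can then go back through `get_texts` and `process` with the same prompt. Boxes found in the crop are in crop coordinates, so they would need dividing by `scale` and offsetting by the crop's top-left corner before being drawn on the original photo.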

emexes · Feb 16, 2025

Daestrum said:
and ask again for the panes.

or put AI into a loop and see how that panes out 🫣

https://x.com/i/grok/share/Y6PZlQiFonk1Xm8rUYanfVhO0
