Gemma 4 E2B is multimodal and can return bounding boxes directly. Upload an image, describe what to find, and the app will draw the detections back onto the image.
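
To draw detections, the app first has to turn the model's text reply into pixel coordinates. The sketch below shows one way that step could look. It is not the app's actual code, and the reply format is an assumption: a JSON list of objects with a `box_2d` field holding `[y_min, x_min, y_max, x_max]` normalized to 0-1000, plus a `label`. The function name `parse_detections` and the `scale` parameter are hypothetical; adjust them to whatever format the model really emits.

```python
import json
import re


def parse_detections(model_text, image_width, image_height, scale=1000):
    """Convert a model reply into pixel-space boxes.

    ASSUMPTION (not confirmed by the source): the reply contains a JSON list like
    [{"box_2d": [y_min, x_min, y_max, x_max], "label": "cat"}]
    with coordinates normalized to the range 0..scale.
    Returns a list of {"label": str, "box": (x0, y0, x1, y1)} in pixels.
    """
    # Replies are often wrapped in prose or a code fence, so take the span
    # from the first "[" to the last "]" and try to parse it as JSON.
    match = re.search(r"\[.*\]", model_text, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    def to_pixels(value, size):
        # Scale a normalized coordinate to pixels and clamp it to the image.
        return max(0, min(size, round(float(value) / scale * size)))

    detections = []
    for item in items:
        if not isinstance(item, dict):
            continue
        box = item.get("box_2d")
        if not isinstance(box, list) or len(box) != 4:
            continue
        try:
            y0, x0, y1, x1 = (float(v) for v in box)
        except (TypeError, ValueError):
            continue  # skip boxes with non-numeric entries
        # Order the corners so the result is always (left, top, right, bottom).
        left, right = sorted((to_pixels(x0, image_width), to_pixels(x1, image_width)))
        top, bottom = sorted((to_pixels(y0, image_height), to_pixels(y1, image_height)))
        detections.append({"label": str(item.get("label", "")),
                           "box": (left, top, right, bottom)})
    return detections
```

The returned `(left, top, right, bottom)` tuples are in the order that Pillow's `ImageDraw.Draw(image).rectangle(...)` accepts, so the drawing step can loop over the detections and outline each box. Malformed replies yield an empty list rather than an exception, so the original image can still be shown unannotated.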