Google DeepMind Introduces Visual Banana: A Tuned Image Generator That Beats SAM 3 in Segmentation and Depth Anything V3 in Metric Depth Estimation
For years, the computer vision community has worked on two separate tracks: generative models (which produce images) and discriminative models (which understand them). The assumption was straightforward: models that are good at generating images are not necessarily good at understanding them. A new paper from Google, titled “Photographers are students of Generalist Vision” (arXiv:2604.20329), …