Meta has recently introduced a single foundation model that does both text-to-image and image-to-text generation known as CM3leon

One of the best features of CM3leon is that it achieves state-of-the-art performance for text-to-image generation, despite requiring five times less computerization than previous models

CM3leon has the versatility and effectiveness of autoregressive models, while maintaining low training costs and inference efficiency

It is a causal masked mixed-modal (CM3) model because it can generate sequences of text and images conditioned on arbitrary sequences of other image and text content

CM3leon can significantly improve performance on tasks such as image caption generation, visual question answering, text-based editing, and conditional image generation

This new model of Meta has even outperformed Google’s text-to-image model Parti establishing itself as a new state of the art in text-to-image generation model

Besides an impressive ability to generate complex compositional objects, CM3leon performs well across a variety of vision-language tasks, including visual question answering and long-form captioning

With CM3leon’s capabilities, image generation tools can produce more coherent imagery that better follows the input prompts

CM3Leon's architecture uses a decoder-only transformer similar to high quality text based models but it has the ability to input and generate both text and images

CM3leon which is actually pronounced like “chameleon” is believed to be a gamechanger in the AI industry

Northeast Now

www.nenow.in

Visit Our website