jul 14, 2023 - Meta introduces CM3leon generative model for text and images
Description:
CM3leon is the first multimodal model to be trained with a recipe adapted from text-only language models, and it achieves state-of-the-art performance for text-to-image generation. CM3leon is trained with five times less compute than previous transformer-based methods, while still maintaining low training costs and inference efficiency. It is a causal masked mixed-modal (CM3) model as it can generate sequences of text and images conditioned on arbitrary sequences of other image and text content.