Google | Normalizing Flow | Transformer-Based Normalizing Flow

MinWoo(Daniel) Park | Tech Blog

  • Related Project: Private
  • Category: Paper Review
  • Date: 2024-12-20

Jet: A Modern Transformer-Based Normalizing Flow

  • url: https://arxiv.org/abs/2412.15129
  • pdf: https://arxiv.org/pdf/2412.15129
  • abstract: In the past, normalizing generative flows have emerged as a promising class of generative models for natural images. This type of model has many modeling advantages: the ability to efficiently compute log-likelihood of the input data, fast generation and simple overall structure. Normalizing flows remained a topic of active research but later fell out of favor, as visual quality of the samples was not competitive with other model classes, such as GANs, VQ-VAE-based approaches or diffusion models. In this paper we revisit the design of the coupling-based normalizing flow models by carefully ablating prior design choices and using computational blocks based on the Vision Transformer architecture, not convolutional neural networks. As a result, we achieve state-of-the-art quantitative and qualitative performance with a much simpler architecture. While the overall visual quality is still behind the current state-of-the-art models, we argue that strong normalizing flow models can help advancing research frontier by serving as building components of more powerful generative models.
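
The abstract refers to coupling-based normalizing flows and their efficient log-likelihood computation. As a rough illustration of that idea (not the Jet architecture itself), here is a minimal sketch of one affine coupling layer in plain Python. The `conditioner` function is a hypothetical toy stand-in for the network that predicts the scale and shift (the paper uses Vision Transformer blocks there), and the standard normal prior is an assumption made for this example.

```python
import math

def conditioner(x_a):
    """Hypothetical toy stand-in for the network predicting log-scale and shift."""
    s = [math.tanh(0.5 * v) for v in x_a]   # log-scale, bounded by tanh
    t = [0.1 * v + 1.0 for v in x_a]        # shift
    return s, t

def coupling_forward(x):
    """Map x -> z, leaving the first half unchanged; also return log|det J|."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even number of dimensions")
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = conditioner(x_a)
    z_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    return x_a + z_b, sum(s)

def coupling_inverse(z):
    """Exact inverse of coupling_forward; the conditioner is never inverted."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    s, t = conditioner(z_a)
    x_b = [(zb - ti) * math.exp(-si) for zb, si, ti in zip(z_b, s, t)]
    return z_a + x_b

def log_likelihood(x):
    """Exact log p(x) via change of variables, assuming a standard normal prior on z."""
    z, log_det = coupling_forward(x)
    log_prior = sum(-0.5 * v * v - 0.5 * math.log(2 * math.pi) for v in z)
    return log_prior + log_det

x = [0.3, -1.2, 0.7, 2.0]
z, log_det = coupling_forward(x)
x_rec = coupling_inverse(z)
```

Because half of the input passes through unchanged, the same scale and shift can be recomputed during inversion, which is why both sampling and exact likelihood evaluation each take a single pass through the layer. A practical model stacks many such layers and alternates which half is transformed.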