

MinWoo(Daniel) Park | Tech Blog

Previous: Attn | Prune Sub-quadratic Attention Next: Score of Mixture

EQ-VAE

  • Related Project: Private
  • Category: Paper Review
  • Date: 2025-02-15

EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

  • url: https://arxiv.org/abs/2502.09509
  • pdf: https://arxiv.org/pdf/2502.09509
  • html: https://arxiv.org/html/2502.09509v1
  • abstract: Latent generative models have emerged as a leading approach for high-quality image synthesis. These models rely on an autoencoder to compress images into a latent space, followed by a generative model to learn the latent distribution. We identify that existing autoencoders lack equivariance to semantic-preserving transformations like scaling and rotation, resulting in complex latent spaces that hinder generative performance. To address this, we propose EQ-VAE, a simple regularization approach that enforces equivariance in the latent space, reducing its complexity without degrading reconstruction quality. By finetuning pre-trained autoencoders with EQ-VAE, we enhance the performance of several state-of-the-art generative models, including DiT, SiT, REPA, and MaskGIT, achieving a 7× speedup on DiT-XL/2 with only five epochs of SD-VAE fine-tuning. EQ-VAE is compatible with both continuous and discrete autoencoders, thus offering a versatile enhancement for a wide range of latent generative models. Project page and code: this https URL.
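The core idea in the abstract, that an encoder should commute with semantic-preserving transformations like rotation and scaling, can be sketched as a simple penalty: compare encoding a transformed image against transforming the encoding, and penalize the gap. This is a minimal illustrative sketch with NumPy and a toy encoder, not the authors' implementation; the function names (`eq_regularizer`, `rotate90`) and the choice of mean-squared error are assumptions for illustration.

```python
import numpy as np

def rotate90(z):
    # Rotate the spatial axes of a (C, H, W) array by 90 degrees.
    return np.rot90(z, k=1, axes=(1, 2))

def eq_regularizer(encode, x, transform):
    """Hypothetical equivariance penalty (illustrative, not the paper's
    exact objective): measure how far the encoder is from commuting
    with a semantic-preserving transform tau, i.e. E(tau(x)) vs tau(E(x))."""
    z_of_tx = encode(transform(x))   # E(tau(x)): encode the transformed image
    t_of_zx = transform(encode(x))   # tau(E(x)): transform the latent code
    return float(np.mean((z_of_tx - t_of_zx) ** 2))

# Toy "encoder": channel-wise scaling. Scaling commutes with rotation,
# so this encoder is exactly equivariant and the penalty is zero.
encode = lambda a: 0.5 * a
x = np.random.default_rng(0).standard_normal((3, 8, 8))
print(eq_regularizer(encode, x, rotate90))  # → 0.0
```

A non-equivariant encoder (e.g. one that subsamples the spatial grid) yields a strictly positive penalty, which is the signal EQ-VAE-style fine-tuning would minimize alongside the usual reconstruction loss.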
