Is it true that a VAE model can only generate images with the same dimensions as the training data? For example, if the model was trained on 256x256 images, is there any way to use a checkpoint from that model to generate images with arbitrary resolutions, such as 352x275?
@Leiii-Cao
No, that is not the case. We will soon release a text-to-image (T2I) model based on VAR that supports arbitrary-resolution generation.
Also, the VAE is a CNN, so it can reconstruct images at any resolution.
@Leiii-Cao
Because the VAE is a CNN, it can encode and decode images of arbitrary resolution. However, VAR only generates square images. Our recent work Infinity (a text-to-image model based on VAR) can generate images with various aspect ratios. Please check https://github.com/FoundationVision/Infinity
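For anyone wondering why a convolutional VAE is not tied to its training resolution, here is a rough sketch of the shape arithmetic. It assumes a hypothetical encoder with four stride-2 3x3 convolutions (padding 1) and a decoder that upsamples by 2 four times, i.e. an overall factor of 16. These layer settings and the helper names `conv_out`, `encoded_size`, `decoded_size`, and `pad_to_multiple` are illustrative assumptions, not the actual configuration of the VAE in this repo.

```python
import math

def conv_out(n, kernel=3, stride=2, padding=1):
    # Standard output-length formula for a convolution along one axis.
    return (n + 2 * padding - kernel) // stride + 1

def encoded_size(h, w, num_down=4):
    # Latent grid after num_down stride-2 convolutions (assumed encoder).
    for _ in range(num_down):
        h, w = conv_out(h), conv_out(w)
    return h, w

def decoded_size(h, w, num_up=4):
    # Output size after num_up 2x upsampling stages (assumed decoder).
    return h * 2 ** num_up, w * 2 ** num_up

def pad_to_multiple(n, factor=16):
    # Smallest multiple of `factor` that is >= n.
    return math.ceil(n / factor) * factor

print(encoded_size(256, 256))                # (16, 16)
print(encoded_size(352, 275))                # (22, 18)
print(decoded_size(*encoded_size(352, 275))) # (352, 288)
print(pad_to_multiple(275))                  # 288
```

Under these assumptions nothing in the network depends on the input being 256x256, so a 352x275 image encodes fine. The one catch is that 275 is not a multiple of 16, so the reconstruction comes back as 352x288. A common workaround is to pad the input up to the next multiple of the downsampling factor before encoding and crop the output back afterwards.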