EPFL’s Multi-modal Multi-task Masked Autoencoder: A Simple, Flexible and Effective ViT Pretraining Strategy Applicable to Any RGB Dataset
Source: Synced | AI Technology & Industry Review
The Swiss Federal Institute of Technology Lausanne (EPFL) presents Multi-modal Multi-task Masked Autoencoders (MultiMAE), a simple and effective pretraining strategy that extends masked autoencoding to multiple input modalities and output tasks, and is applicable to any RGB dataset.
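To make the core idea concrete, the sketch below illustrates multi-modal masked autoencoding in PyTorch: patch tokens from several modalities are concatenated, a random subset is kept visible, a shared Transformer encodes them, and per-modality heads reconstruct each input. All class names, dimensions, and module choices here are illustrative assumptions for exposition, not EPFL's released implementation.

```python
import torch
import torch.nn as nn

class TinyMultiModalMAE(nn.Module):
    """Minimal, hypothetical sketch of multi-modal masked autoencoding.

    Not the authors' code: shapes, depths, and the pooled decoder
    are simplifications chosen for brevity.
    """

    def __init__(self, dim=64, num_heads=4, patch_dim=48,
                 modalities=("rgb", "depth")):
        super().__init__()
        self.modalities = modalities
        # One linear patch embedding per input modality.
        self.embed = nn.ModuleDict(
            {m: nn.Linear(patch_dim, dim) for m in modalities})
        layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        # A single encoder is shared across all modalities.
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # One lightweight reconstruction head per output task.
        self.decode = nn.ModuleDict(
            {m: nn.Linear(dim, patch_dim) for m in modalities})

    def forward(self, patches, keep_ratio=0.25):
        # patches: dict of modality -> (batch, num_patches, patch_dim)
        tokens = torch.cat(
            [self.embed[m](patches[m]) for m in self.modalities], dim=1)
        b, n, d = tokens.shape
        # Randomly keep a small visible subset of tokens drawn jointly
        # across ALL modalities; the rest are treated as masked.
        idx = torch.rand(b, n, device=tokens.device).argsort(dim=1)
        idx = idx[:, : int(n * keep_ratio)]
        visible = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, d))
        encoded = self.encoder(visible)
        # Pool the visible tokens and reconstruct every modality from the
        # shared representation (a real decoder would instead insert mask
        # tokens with positional information; pooling keeps the sketch short).
        pooled = encoded.mean(dim=1, keepdim=True)
        return {m: self.decode[m](pooled) for m in self.modalities}

if __name__ == "__main__":
    # Toy usage: two modalities, 16 patches each, 48-dim flattened patches.
    batch = {m: torch.randn(2, 16, 48) for m in ("rgb", "depth")}
    out = TinyMultiModalMAE()(batch)
    print({m: tuple(t.shape) for m, t in out.items()})
```

The point of the sketch is the training signal: because the encoder only ever sees a small random subset of tokens spanning several modalities, it is pushed to learn cross-modal structure, and the same recipe degrades gracefully to plain MAE when only RGB patches are provided.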