NVIDIA’s Global Context ViT Achieves SOTA Performance on CV Tasks Without Expensive Computation

Source: Synced | AI Technology & Industry Review

In the new paper Global Context Vision Transformers, an NVIDIA research team proposes the Global Context Vision Transformer (GC ViT), a novel yet simple hierarchical ViT architecture comprising global self-attention and token-generation modules. The design enables efficient modelling of both short- and long-range dependencies without costly compute operations, while achieving state-of-the-art (SOTA) results across various computer vision tasks.
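The core idea, global query tokens attending over local window tokens, can be illustrated with a minimal PyTorch sketch. Everything below is a simplified illustration rather than the authors' implementation: the class and argument names are hypothetical, and the global queries are assumed to be produced elsewhere by the token-generation module.

```python
import torch
import torch.nn as nn


class GlobalSelfAttention(nn.Module):
    """Sketch of global-query attention: queries are shared global
    tokens; keys and values come from each local window, so every
    window can access image-wide context. Hypothetical, simplified
    code, not the paper's reference implementation."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.kv = nn.Linear(dim, dim * 2)  # keys/values from local tokens
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, q_global: torch.Tensor) -> torch.Tensor:
        # x:        (B * num_windows, N, C) local window tokens
        # q_global: (B, N, C) global query tokens from the token generator
        Bw, N, C = x.shape
        H, D = self.num_heads, C // self.num_heads

        kv = self.kv(x).reshape(Bw, N, 2, H, D).permute(2, 0, 3, 1, 4)
        k, v = kv[0], kv[1]  # each: (Bw, H, N, D)

        # Broadcast each image's global queries to all of its windows.
        q = q_global.repeat_interleave(Bw // q_global.shape[0], dim=0)
        q = q.reshape(Bw, N, H, D).permute(0, 2, 1, 3)  # (Bw, H, N, D)

        attn = (q @ k.transpose(-2, -1)) * self.scale   # (Bw, H, N, N)
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(Bw, N, C)
        return self.proj(out)


# Toy usage: 2 images, 4 windows each, 49 tokens per window, dim 64.
attn = GlobalSelfAttention(dim=64)
x = torch.randn(2 * 4, 49, 64)
q_global = torch.randn(2, 49, 64)
print(attn(x, q_global).shape)  # torch.Size([8, 49, 64])
```

Because the global queries are computed once and reused across all windows, each attention call stays local in size while still injecting image-wide context, which is consistent with the paper's claim of capturing long-range dependencies without costly compute operations.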