Google & MIT’s Confident Adaptive Language Modeling Uses Dynamic Compute Allocation to Achieve 3x Speedups | Synced
In the new paper Confident Adaptive Language Modeling, a research team from Google and MIT presents Confident Adaptive Language Modeling (CALM), a framework that dynamically allocates different amo...
Source: Synced | AI Technology & Industry Review
In the new paper Confident Adaptive Language Modeling, a research team from Google and MIT presents Confident Adaptive Language Modeling (CALM), a framework that dynamically allocates different amounts of compute to each input and generation timestep, achieving up to 3x speedups while maintaining high performance.