Hugging Face Uses Block Pruning to Speed Up Transformer Training While Maintaining Accuracy | Synced
Source: Synced | AI Technology & Industry Review
A research team from Hugging Face introduces a block pruning approach that targets models that are both small and fast. The method learns to eliminate entire components of the original model and, in practice, drops a large number of attention heads.
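To illustrate the general idea of block pruning (as opposed to pruning individual weights), here is a minimal NumPy sketch. The per-head importance scores and the median threshold are assumptions for illustration only; in the actual paper, pruning decisions are learned during training, not derived from weight norms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 12 attention heads, each parameterized by a (64, 64) weight block.
n_heads, head_dim = 12, 64
heads = rng.standard_normal((n_heads, head_dim, head_dim))

# Hypothetical per-head importance scores (a stand-in for the learned scores
# in the paper; here we simply use the L2 norm of each head's weights).
scores = np.linalg.norm(heads.reshape(n_heads, -1), axis=1)

# Block pruning: drop entire heads whose score falls below a threshold,
# rather than zeroing out individual weights scattered across the matrix.
threshold = np.median(scores)
kept = scores >= threshold
pruned_heads = heads[kept]

print(f"kept {int(kept.sum())} of {n_heads} heads")
```

Removing whole blocks like this is what makes the resulting model genuinely faster on hardware: entire head computations disappear, rather than leaving sparse matrices that standard kernels cannot exploit.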