Running Fast Transformers on CPUs: Intel Approach Achieves Significant Speed Ups and SOTA Performance | Synced

Source: Synced | AI Technology & Industry Review

In the new paper Fast DistilBERT on CPUs, researchers from Intel Corporation and Intel Labs propose a pipeline and a hardware-aware extreme compression technique for creating and running fast transformer models on CPUs. The approach achieves significant speedups and state-of-the-art (SOTA) performance in production environments.