Applying Linearly Scalable Transformers to Model Longer Protein Sequences
Researchers have proposed a new Transformer architecture called “Performer,” based on a mechanism they call fast attention via orthogonal random features (FAVOR), which scales linearly rather than quadratically with sequence length.
Source: Synced | AI Technology & Industry Review
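The article does not walk through the mechanism itself, so the following is a minimal NumPy sketch of the core idea behind FAVOR-style linear attention: replace the L×L softmax attention matrix with a product of random-feature maps, so attention can be computed in time linear in the sequence length L. This is an illustrative approximation, not the authors' implementation; it uses positive orthogonal random features in the spirit of the Performer line of work, and the function names (`favor_attention`, `feature_map`, etc.) are our own.

```python
import numpy as np

def orthogonal_random_matrix(m, d, rng):
    # Stack orthogonalized Gaussian blocks (QR decomposition), then rescale
    # each row to a chi-distributed norm so marginals match i.i.d. Gaussians.
    blocks = []
    for _ in range(int(np.ceil(m / d))):
        q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        blocks.append(q)
    W = np.concatenate(blocks, axis=0)[:m]
    norms = np.sqrt(rng.chisquare(d, size=m))
    return W * norms[:, None]

def feature_map(X, W):
    # Positive random features for the softmax kernel:
    # phi(x) = exp(Wx - ||x||^2 / 2) / sqrt(m), so that
    # E[phi(q) . phi(k)] = exp(q . k).
    m = W.shape[0]
    proj = X @ W.T
    return np.exp(proj - 0.5 * np.sum(X**2, axis=-1, keepdims=True)) / np.sqrt(m)

def favor_attention(Q, K, V, m=256, seed=0):
    # Linear-time attention: compute phi(Q) (phi(K)^T V) instead of
    # softmax(Q K^T / sqrt(d)) V, so the cost is O(L m d) rather than O(L^2 d).
    L, d = Q.shape
    W = orthogonal_random_matrix(m, d, np.random.default_rng(seed))
    # Splitting the 1/sqrt(d) temperature across Q and K preserves the kernel.
    Qp = feature_map(Q / d**0.25, W)
    Kp = feature_map(K / d**0.25, W)
    KV = Kp.T @ V                        # (m, d_v) summary of keys and values
    normalizer = Qp @ Kp.sum(axis=0)     # (L,) row-wise softmax denominator
    return (Qp @ KV) / normalizer[:, None]

# Example: compare against exact softmax attention on a toy input.
rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((512, 64)) for _ in range(3))
approx = favor_attention(Q, K, V, m=1024)
scores = np.exp(Q @ K.T / np.sqrt(64))
exact = (scores / scores.sum(axis=1, keepdims=True)) @ V
print(np.abs(approx - exact).mean())  # shrinks as m grows
```

Because keys and values are collapsed into a single (m × d) summary before being queried, memory and compute grow linearly in L, which is what makes long inputs such as protein sequences tractable.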