Normalization
Intermediate
Techniques that stabilize and speed training by normalizing activations; LayerNorm is common in Transformers.
Definition
Full Definition
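Normalization techniques stabilize and speed up neural network training by rescaling activations to a consistent mean and variance. Layer normalization (LayerNorm), which normalizes across the features of each individual example, is the common choice in Transformers.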
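As a rough illustration of what normalizing activations means in practice, here is a minimal sketch of layer normalization for a single feature vector, written in plain Python. The function name `layer_norm`, the parameter names `gamma` and `beta`, and the default `eps` value are assumptions chosen for this example and do not come from any particular library.

```python
import math


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector to roughly zero mean and unit variance.

    gamma (scale) and beta (shift) stand in for the learnable parameters;
    they default to 1 and 0 here, which is an assumption for illustration.
    """
    n = len(x)
    mean = sum(x) / n
    # Biased (population) variance over the features of this one example.
    var = sum((v - mean) ** 2 for v in x) / n
    # eps keeps the division stable when the variance is close to zero.
    inv_std = 1.0 / math.sqrt(var + eps)
    gamma = gamma if gamma is not None else [1.0] * n
    beta = beta if beta is not None else [0.0] * n
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]


activations = [2.0, 4.0, 6.0, 8.0]
normalized = layer_norm(activations)
print(normalized)
```

Because the statistics are computed per example rather than across a batch, the result does not depend on batch size, which is one reason this form suits Transformers.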