Review Note

Last Update: 04/08/2024 10:23 AM

Current Deck: Deeplearning(week 3 & 4)::week 4

Published

Currently Published Content


Text
Self-Attention



For each item in the input sequence we compute three vectors. What are they, and how are they obtained? {{c1::A Query, a Key and a Value vector. Each is obtained by multiplying the item's input vector by its own learned weight matrix.}}

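What do we get when we apply the row-wise softmax to the matrix of attention scores?
{{c2::Each row becomes a probability distribution: its entries are non-negative and sum to 1. Row i holds the attention weights, i.e. how much item i attends to every item in the sequence.}}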
Extra
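The two answers above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the names `softmax_rows`, `self_attention`, `w_q`, `w_k` and `w_v`, the random inputs, and the shapes are all assumptions made for the example. Dividing the scores by the square root of the key dimension follows the common scaled dot-product formulation, which the card itself does not mention.

```python
import numpy as np


def softmax_rows(scores):
    """Row-wise softmax: every row becomes a probability distribution."""
    # Subtracting the row maximum keeps exp() numerically stable.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of shape (n_items, d_model)."""
    q = x @ w_q  # one Query vector per item
    k = x @ w_k  # one Key vector per item
    v = x @ w_v  # one Value vector per item
    # Assumption: scaled dot-product scores (divide by sqrt of key dimension).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax_rows(scores)  # attention weights, shape (n_items, n_items)
    return weights @ v, weights


# Hypothetical toy input: 4 items, model dimension 8, projection dimension 3.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 3)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(weights.sum(axis=1))  # each row of attention weights sums to 1
```

Row i of `weights` is the distribution described in the second answer, and `output[i]` is the corresponding weighted average of the Value vectors.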

Current Tags:

DeepLearning::week4::Wednesday
