NYU Deep Learning Week 12 – Practicum: RNN and LSTM architectures, Attention and the Transformer
The video was published under the Creative Commons Attribution license (reuse allowed). It is reposted for educational purposes and to encourage involvement in the field of research. Source: https://youtu.be/f01J0Dri-6k Subscribe to Alfredo Canziani: https://www.youtube.com/channel/UCupQLyNchb9-2Z5lmUOIijw
We introduce attention, focusing on self-attention and the hidden-layer representations of the inputs it produces. We then introduce the key-value store paradigm and discuss how to represent queries, keys, and values as rotations of an input. Finally, we use attention to interpret the transformer architecture, taking a forward pass through a basic transformer and comparing the encoder-decoder paradigm to sequential architectures.
0:01:09 – Attention
0:17:36 – Key-value store
0:35:14 – Transformer and PyTorch implementation
0:54:00 – Q&A
...
https://www.youtube.com/watch?v=8BdMObVdr1Y
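The following is a minimal sketch, in PyTorch (the framework named in the timestamps), of the self-attention and key-value store ideas summarized above: queries, keys, and values are obtained as linear maps ("rotations") of the same input, and attention acts as a soft key-value lookup. All names and dimensions (d_model, seq_len, W_q, W_k, W_v) are illustrative assumptions, not the course's actual notebook code.

import torch
import torch.nn.functional as F

d_model, seq_len = 64, 10
x = torch.randn(seq_len, d_model)       # one input sequence (no batch dimension)

# Queries, keys, and values as linear maps ("rotations") of the same input
W_q, W_k, W_v = (torch.randn(d_model, d_model) for _ in range(3))
q, k, v = x @ W_q, x @ W_k, x @ W_v

# Scaled dot-product self-attention: a soft key-value lookup
scores = q @ k.T / d_model ** 0.5       # (seq_len, seq_len) query-key similarities
a = F.softmax(scores, dim=-1)           # attention weights: a soft "argmax" over keys
h = a @ v                               # hidden representation: mixture of values

# The same mechanism packaged in PyTorch's encoder-decoder Transformer module;
# this forward pass mirrors the "basic transformer" walk-through in the lecture.
enc_dec = torch.nn.Transformer(d_model=d_model, nhead=4)
src = torch.randn(seq_len, 1, d_model)  # (sequence, batch, feature)
tgt = torch.randn(seq_len, 1, d_model)
out = enc_dec(src, tgt)                 # decoder output after attending to the encoded source

The first block is plain scaled dot-product attention; the nn.Transformer call stacks the same attention mechanism into the encoder-decoder arrangement that the lecture contrasts with sequential (RNN/LSTM) architectures.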