fastai v2 | Deep Learning for Coders: Lesson 8 | Jeremy Howard
This video was published under a Creative Commons Attribution license (reuse allowed). It is reposted for educational purposes and to encourage involvement in the field of research. Source: https://youtu.be/WjnwWeGjZcM By Jeremy Howard: https://www.youtube.com/channel/UCX7Y2qWriXpqocG97SFW2OQ
NB: We recommend watching these videos through https://course.fast.ai rather than directly on YouTube, to get access to the searchable transcript, interactive notebooks, setup guides, questionnaires, and so forth.
We finish this course with a full lesson on natural language processing (NLP). Modern NLP depends heavily on self-supervised learning, and in particular the use of language models.
Pretrained language models are fine-tuned in order to benefit from transfer learning. Unlike computer vision, fine-tuning in NLP can take advantage of an extra step: self-supervised fine-tuning of the language model on the target dataset itself, before training the final model.
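To make the two stages concrete, here is a minimal sketch in fastai v2 using the small IMDb sample dataset that ships with the library (the lesson itself works with the full IMDb dataset, so the exact calls and training schedules here are assumptions, not the lesson's code):

```python
from fastai.text.all import *

# Stage 1: self-supervised fine-tuning of a pretrained language model on the target texts.
path = untar_data(URLs.IMDB_SAMPLE)
dls_lm = TextDataLoaders.from_csv(path, 'texts.csv', text_col='text',
                                  is_lm=True, valid_col='is_valid')
learn_lm = language_model_learner(dls_lm, AWD_LSTM, metrics=accuracy)
learn_lm.fine_tune(1)
learn_lm.save_encoder('finetuned')   # keep the fine-tuned encoder for the next stage

# Stage 2: fine-tune a classifier that starts from the fine-tuned encoder.
dls_clas = TextDataLoaders.from_csv(path, 'texts.csv', text_col='text', label_col='label',
                                    valid_col='is_valid', text_vocab=dls_lm.vocab)
learn = text_classifier_learner(dls_clas, AWD_LSTM, metrics=accuracy)
learn.load_encoder('finetuned')
learn.fine_tune(1)
```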
Before we can do any modeling with text data, we first have to tokenize and numericalize it. There are a number of approaches to tokenization, and which you choose will depend on your language and dataset.
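As a rough illustration of those two steps, the following sketch uses fastai's word tokenizer and `Numericalize` transform on a couple of made-up sentences (the sentences and `min_freq` setting are illustrative, not from the lesson):

```python
from fastai.text.all import *

txts = L(["The movie was great!", "I didn't enjoy it."])   # toy corpus

spacy = WordTokenizer()          # word-level tokenizer
tkn = Tokenizer(spacy)           # adds fastai special tokens such as xxbos and xxmaj
toks = txts.map(tkn)             # tokenization: text -> list of tokens

num = Numericalize(min_freq=1)   # low min_freq because the toy corpus is tiny
num.setup(toks)                  # build the vocabulary from the tokenized texts
nums = toks.map(num)             # numericalization: tokens -> integer ids
```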
NLP models use the same basic approach of entity embedding that we've seen before, except that for text data it's called a word embedding. The method, however, is nearly identical.
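In PyTorch terms, a word embedding is just a learned lookup table indexed by token id. The sizes below are illustrative only:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim = 10_000, 300        # illustrative sizes
emb = nn.Embedding(vocab_size, emb_dim)  # one learned vector per token in the vocabulary

ids = torch.tensor([[2, 45, 7, 981]])    # a batch containing one sequence of token ids
vectors = emb(ids)                       # shape (1, 4, 300): one vector per token
```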
NLP models have to handle documents of varying sizes, so they require a somewhat different architecture, such as a recurrent neural network (RNN). It turns out that an RNN is basically just a regular deep net, which has been refactored using a loop.
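The sketch below shows that idea in the spirit of the model built in the lesson: the same layers are reused at every time step inside a loop (names and sizes are illustrative, not the lesson's exact code):

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    """A deep net refactored into a loop: the same weights are applied at each time step."""
    def __init__(self, vocab_size, n_hidden):
        super().__init__()
        self.n_hidden = n_hidden
        self.i_h = nn.Embedding(vocab_size, n_hidden)  # input -> hidden
        self.h_h = nn.Linear(n_hidden, n_hidden)       # hidden -> hidden, shared across steps
        self.h_o = nn.Linear(n_hidden, vocab_size)     # hidden -> output

    def forward(self, x):
        h = torch.zeros(x.shape[0], self.n_hidden, device=x.device)
        for i in range(x.shape[1]):                    # the loop that makes it "recurrent"
            h = h + self.i_h(x[:, i])
            h = torch.relu(self.h_h(h))
        return self.h_o(h)                             # predict the next token from the final state
```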
However, simple RNNs suffer from exploding or vanishing activations and gradients, so we have to use methods such as the LSTM cell to avoid these problems.
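One way to see what that swap looks like in code is to replace the hand-written recurrence with PyTorch's built-in LSTM, whose gates keep activations in a workable range. This is a sketch with assumed sizes, not the AWD-LSTM used in the lesson:

```python
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, vocab_size, n_hidden, n_layers=2):
        super().__init__()
        self.i_h = nn.Embedding(vocab_size, n_hidden)
        self.rnn = nn.LSTM(n_hidden, n_hidden, n_layers, batch_first=True)  # gated recurrence
        self.h_o = nn.Linear(n_hidden, vocab_size)

    def forward(self, x):
        out, _ = self.rnn(self.i_h(x))   # out: (batch, seq_len, n_hidden)
        return self.h_o(out)             # a prediction at every time step
```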
Finally, we look at some tricks to improve the results of our NLP models: additional regularization approaches, including various types of dropout and activation regularization, as well as weight tying.
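To give a feel for two of those tricks, the sketch below adds a dropout layer and ties the output layer's weights to the embedding matrix, in the spirit of AWD-LSTM (sizes and the dropout probability are illustrative assumptions):

```python
import torch
import torch.nn as nn

class TiedLM(nn.Module):
    def __init__(self, vocab_size, n_hidden, p_drop=0.4):
        super().__init__()
        self.i_h = nn.Embedding(vocab_size, n_hidden)
        self.rnn = nn.LSTM(n_hidden, n_hidden, batch_first=True)
        self.drop = nn.Dropout(p_drop)        # one of several dropout variants used for regularization
        self.h_o = nn.Linear(n_hidden, vocab_size)
        self.h_o.weight = self.i_h.weight     # weight tying: decoder reuses the embedding matrix

    def forward(self, x):
        out, _ = self.rnn(self.i_h(x))
        return self.h_o(self.drop(out))
```

Activation regularization adds a penalty on the size of the activations (and, in its temporal form, on how much they change between steps) to the loss, which is not shown here.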