How to Decode Outputs From NLP Models (Python)
James Briggs
In this video, we will cover three ways to decode the output probabilities from NLP models - greedy search, random sampling, and beam search.
Learning how to decode outputs can make a huge difference in diagnosing model issues and improving text output quality - and as an added bonus it's super easy.
One of the often-overlooked parts of sequence generation in natural language processing (NLP) is how we select our output tokens — otherwise known as decoding.
You may be thinking — we select a token/word/character based on the probability of each token assigned by our model.
This is half-true. In language-based tasks, we typically build a model that outputs an array of probabilities, where each value in that array represents the probability of a specific word/token.
At this point, it might seem logical to simply select the token with the highest probability. Well, not quite: this can create some unforeseen consequences, as we will see soon.
When we are selecting a token in machine-generated text, we have a few alternative methods for performing this decode — and options for modifying the exact behavior too.
In this video, we will explore three different methods for selecting our output token. These are:
- Greedy Decoding
- Random Sampling
- Beam Search
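As a rough preview, the three methods above can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration: the `step_probs` function below stands in for a real NLP model's softmax output, and the tiny vocabulary is invented purely for the example.

```python
import numpy as np

# Toy vocabulary; a real model would have tens of thousands of tokens.
VOCAB = ["<eos>", "the", "cat", "sat"]

def step_probs(prefix):
    """Hypothetical stand-in for a model: returns a probability
    distribution over VOCAB given the tokens generated so far."""
    rng = np.random.default_rng(len(prefix))  # deterministic toy distribution
    logits = rng.normal(size=len(VOCAB))
    exp = np.exp(logits - logits.max())       # softmax
    return exp / exp.sum()

def greedy_decode(steps=3):
    """Greedy decoding: always take the single most probable token."""
    seq = []
    for _ in range(steps):
        probs = step_probs(seq)
        seq.append(int(np.argmax(probs)))
    return [VOCAB[i] for i in seq]

def sample_decode(steps=3, seed=0):
    """Random sampling: draw each token from the model's distribution."""
    rng = np.random.default_rng(seed)
    seq = []
    for _ in range(steps):
        probs = step_probs(seq)
        seq.append(int(rng.choice(len(VOCAB), p=probs)))
    return [VOCAB[i] for i in seq]

def beam_decode(steps=3, beam_width=2):
    """Beam search: keep the beam_width highest-probability sequences
    at every step, scored by summed log-probability."""
    beams = [([], 0.0)]  # each beam is (token_ids, log_probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            probs = step_probs(seq)
            for tok, p in enumerate(probs):
                candidates.append((seq + [tok], score + np.log(p)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    best_seq, _ = beams[0]
    return [VOCAB[i] for i in best_seq]
```

Note the behavioral differences: greedy decoding is fully deterministic, sampling changes with the seed, and beam search can recover a higher-probability sequence overall by keeping runner-up prefixes alive.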
🤖 70% Discount on the NLP With Transformers in Python course: https://bit.ly/3DFvvY5
Link to the article version on Medium: https://towardsdatascience.com/the-three-decoding-methods-for-nlp-23ca59cb1e9d
Free link (if you don't have membership): https://towardsdatascience.com/the-three-decoding-methods-for-nlp-23ca59cb1e9d?sk=64fbb0204c174dc520af027a69f88030
Video: https://www.youtube.com/watch?v=QJq9RTp_OVE