Making The Most of Data: Augmented SBERT
James Briggs
Free NLP for Semantic Search Course: https://www.pinecone.io/learn/nlp
ML models are data-hungry. They consume massive amounts of data to identify generalized patterns and apply those learned patterns to new data.
As models get bigger, so do datasets. Although we have seen an explosion of data in the past decade, much of it is either inaccessible or not in an ML-friendly format, especially in niche domains.
For many niche, low-resource domains, finding or annotating a substantial dataset manually is practically impossible.
Fortunately, we don't need to label (or even find) this new data by hand. Instead, we can automatically generate or label data using one or more data augmentation techniques.
In this video, we will introduce data augmentation and its application to the field of NLP. We will focus on the 'in-domain' flavor of a particular data-augmentation strategy named augmented SBERT (AugSBERT).
Pinecone article: https://www.pinecone.io/learn/data-augmentation/
70% Discount on the NLP With Transformers in Python course: https://bit.ly/3DFvvY5
Subscribe for Article and Video Updates! https://jamescalam.medium.com/subscribe https://medium.com/@jamescalam/membership
Discord: https://discord.gg/c5QtDB9RAP ... https://www.youtube.com/watch?v=3IPCEeh4xTg