Using tidyverse's Julia-native Prodigy on 1000 OpenAI Embeddings
Alex Tantos
YT video JuliaLang code: https://gist.github.com/atantos/40c29e8ce72e4bbfab4ba5a337d01725
Official Tidier.jl webpage: https://tidierorg.github.io/Tidier.jl/dev/
Karandeep Singh's GitHub account: https://github.com/kdpsingh
Julia Silge's personal web page: https://juliasilge.com/
Today, we'll handle a real-world data analysis scenario. We are going to clean a horror movies dataset and prepare it for calculating movie similarities using cosine distance based on OpenAI embeddings of the movies' overviews.
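For a rough idea of the last step, here is a minimal sketch of ranking movies by cosine distance. It is not the code from the video (that is in the gist linked above). The titles, the tiny 3-dimensional "embeddings", and the helper name `cosine_distance` are made up for illustration, and only Julia's standard `LinearAlgebra` library is assumed.

```julia
using LinearAlgebra

# Hypothetical toy data: each column is one movie's embedding.
# Real OpenAI embeddings have far more dimensions; these numbers are invented.
titles = ["Movie A", "Movie B", "Movie C"]
embeddings = [1.0 0.9 0.0;
              0.0 0.1 1.0;
              1.0 1.0 0.0]

# Cosine distance is 1 minus the cosine similarity of the two vectors.
cosine_distance(a, b) = 1 - dot(a, b) / (norm(a) * norm(b))

# Distance from the first movie to every movie in the matrix.
query = embeddings[:, 1]
distances = [cosine_distance(query, embeddings[:, j]) for j in axes(embeddings, 2)]

# Indices sorted from the closest movie to the most distant one.
ranking = sortperm(distances)

for j in ranking
    println(rpad(titles[j], 10), round(distances[j]; digits = 3))
end
```

The query movie comes first in the ranking because its distance to itself is zero (up to floating-point error). Movies whose embeddings point in a similar direction follow, and unrelated ones land at the end.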
CHAPTERS:
0:00 Tidyverse and Tidier.jl
1:11 Horror movies dataset
1:30 Julia Silge's R code
1:53 Cleaning the dataset with and without Tidier.jl
5:45 Fetching the OpenAI embeddings
7:10 Preparing the embeddings matrix
10:20 Computing the cosine distance
11:10 Movies closer/distant to the movie "Belbo the Clown" ...
https://www.youtube.com/watch?v=BFKubJwDKtw