LinkedIn's New Search Engine | DeText: A Deep Text Ranking Framework with BERT | Deep Ranking Model
Deep Learning Explainer
This video explains LinkedIn's latest deep-learning-based ranking models and how they were deployed to production. Deploying deep learning models is challenging, especially at LinkedIn's scale, which makes this a very practical and useful paper. If you also want to build a semantic search engine, make sure you check it out!
0:00 - Intro
2:50 - What is a search engine
3:11 - Search vs. Ranking
4:08 - Representation-based ranking
8:12 - Interaction-based ranking
9:28 - 3 search verticals on LinkedIn
11:30 - DeText framework
13:48 - Interaction layer
15:22 - Traditional feature processing
16:55 - Learning-to-rank layer
18:38 - DeText-BERT for ranking
20:56 - LinkedIn data for BERT pre-training
22:31 - Document embedding pre-computing
23:41 - 2-pass ranking (DeText-CNN)
25:00 - Experiment settings
26:17 - Training data
27:36 - How good DeText is
30:18 - General BERT vs. In-domain BERT
32:42 - Traditional feature ablation study
33:52 - Metrics for online experiments
36:02 - BERT vs. CNN
37:42 - 99th percentile latency
39:29 - Summary
Related Video: Neural Information Retrieval | REALM: Retrieval-Augmented Language Model Pre-training https://youtu.be/JQ-bxQT5Qsw
Paper: DeText: A Deep Text Ranking Framework with BERT (Weiwei Guo et al.) https://arxiv.org/abs/2008.02460
Code: https://github.com/linkedin/detext
Follow me on Twitter https://twitter.com/DeepExplainer
Abstract
Ranking is the most important component in a search system. Most search systems deal with large amounts of natural language data, hence an effective ranking system requires a deep understanding of text semantics. Recently, deep learning based natural language processing (deep NLP) models have generated promising results on ranking systems. BERT is one of the most successful models that learn contextual embedding, which has been applied to capture complex query-document relations for search ranking. However, this is generally done by exhaustively interacting each query word with each document word, which is inefficient for online serving in search product systems. In this paper, we investigate how to build an efficient BERT-based ranking model for industry use cases. The solution is further extended to a general ranking framework, DeText, that is open sourced and can be applied to various ranking productions. Offline and online experiments of DeText on three real-world search systems present significant improvement over state-of-the-art approaches.

https://www.youtube.com/watch?v=Dd4Rw3t5QQk
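The efficiency idea in the abstract (avoiding exhaustive query-word/document-word interaction) is the classic representation-based, two-tower setup: document embeddings are computed offline and cached, so at query time only the query needs encoding. Here is a minimal sketch of that pattern; the toy `embed` function is a hypothetical stand-in for a deep encoder (e.g., a CNN or BERT pooled output) and is not DeText's actual model:

```python
import numpy as np

def embed(text, dim=16):
    """Toy deterministic text encoder standing in for a deep model.
    Hypothetical, for illustration only: hashes each token into one
    of `dim` buckets and L2-normalizes the resulting count vector."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[sum(ord(c) for c in tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

# Offline: pre-compute document embeddings. Because the document tower
# does not depend on the query, these vectors can be cached and reused.
docs = ["machine learning engineer", "sales manager", "deep learning researcher"]
doc_embs = np.stack([embed(d) for d in docs])

def rank(query):
    # Online: encode only the query, then score every cached document
    # embedding with a cheap dot product (cosine similarity here, since
    # all vectors are unit-normalized).
    q = embed(query)
    scores = doc_embs @ q
    return [docs[i] for i in np.argsort(-scores, kind="stable")]

print(rank("learning"))
```

Interaction-based models instead cross-attend query and document tokens jointly, which is more expressive but cannot be pre-computed per document, hence the latency concern the paper addresses.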