Improving Punctuation Restoration for Speech Transcripts via External Data
Deep Learning Explainer
The paper proposes a data sampling technique and a two-stage fine-tuning approach, which let you sample external training data similar to our in-domain ASR transcripts and improve model performance on punctuation restoration.
0:00 - How to make a model more accurate
1:02 - I published a paper
3:05 - Punctuation restoration
5:32 - In-domain data
7:29 - Annotated data is expensive
8:47 - Opensubtitles
10:04 - Data sampling via LM
11:34 - Two-stage fine-tuning
14:55 - Layer reduction
16:49 - Takeaway
18:10 - EMNLP 2021
Connect
LinkedIn: https://www.linkedin.com/in/xue-yong-fu-955723a6/
Twitter: https://twitter.com/home
Email: edwindeeplearning@gmail.com
Paper: Improving Punctuation Restoration for Speech Transcripts via External Data
https://arxiv.org/abs/2110.00560?context=cs
Abstract
Automatic Speech Recognition (ASR) systems generally do not produce punctuated transcripts. To make transcripts more readable and follow the expected input format for downstream language models, it is necessary to add punctuation marks. In this paper, we tackle the punctuation restoration problem specifically for the noisy text (e.g., phone conversation scenarios). To leverage the available written text datasets, we introduce a data sampling technique based on an n-gram language model to sample more training data that are similar to our in-domain data. Moreover, we propose a two-stage fine-tuning approach that utilizes the sampled external data as well as our in-domain dataset for models based on BERT. Extensive experiments show that the proposed approach outperforms the baseline with an improvement of 1.12% F1 score. ...

https://www.youtube.com/watch?v=jxOpu4hXPJY
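To make the data sampling idea concrete, here is a minimal sketch of scoring external sentences with an n-gram language model trained on in-domain text and keeping the most similar ones. The bigram order, add-one smoothing, and keep-fraction are illustrative assumptions, not the paper's actual settings.

```python
# Sketch: sample external data that an in-domain n-gram LM finds familiar.
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Count unigrams and bigrams from tokenized in-domain sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams):
    """Per-token perplexity under the bigram LM with add-one smoothing."""
    padded = ["<s>"] + tokens + ["</s>"]
    vocab = len(unigrams)
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))

def sample_similar(external, unigrams, bigrams, keep_fraction=0.2):
    """Keep the external sentences with the lowest in-domain perplexity."""
    scored = sorted(external, key=lambda s: perplexity(s, unigrams, bigrams))
    return scored[: int(len(scored) * keep_fraction)]

# Usage: score OpenSubtitles-style lines against an LM built from transcripts.
in_domain = [["so", "how", "are", "you", "doing"], ["yeah", "that", "works"]]
external = [["hello", "how", "are", "you"], ["the", "treaty", "was", "ratified"]]
uni, bi = train_bigram_lm(in_domain)
print(sample_similar(external, uni, bi, keep_fraction=0.5))
```

In practice a smoothed higher-order LM (e.g., KenLM) over millions of OpenSubtitles lines would replace this toy bigram model; the selection logic stays the same.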
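And a minimal sketch of the two-stage fine-tuning recipe: stage 1 trains a BERT token classifier on the sampled external data, stage 2 continues from those weights on the in-domain transcripts. The label set, hyperparameters, and toy tensors below are assumptions standing in for the paper's real data and settings.

```python
# Sketch: two-stage fine-tuning of a BERT punctuation-restoration model.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForTokenClassification

LABELS = ["O", "COMMA", "PERIOD", "QUESTION"]  # assumed punctuation tag set

model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

def toy_dataset(n_examples, seq_len=32):
    """Stand-in for real tokenized data: ids, attention mask, per-token tags."""
    return TensorDataset(
        torch.randint(1000, 5000, (n_examples, seq_len)),      # input_ids
        torch.ones(n_examples, seq_len, dtype=torch.long),     # attention_mask
        torch.randint(0, len(LABELS), (n_examples, seq_len)))  # labels

def run_stage(dataset, epochs, lr):
    """One fine-tuning stage; the model object carries weights across stages."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    model.train()
    for _ in range(epochs):
        for input_ids, attention_mask, labels in loader:
            loss = model(input_ids=input_ids, attention_mask=attention_mask,
                         labels=labels).loss
            loss.backward()
            opt.step()
            opt.zero_grad()

# Stage 1: the sampled external data (e.g., the OpenSubtitles subset).
run_stage(toy_dataset(64), epochs=1, lr=5e-5)
# Stage 2: the smaller in-domain ASR transcripts, at a lower learning rate.
run_stage(toy_dataset(16), epochs=1, lr=2e-5)
```

The point of the second stage is that the model keeps what it learned from the large external corpus while adapting to the conversational, noisy style of the in-domain transcripts.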