Understanding NLP Model Adaptation: Pre-Training vs. Fine-Tuning

The distinction is largely one of terminology. When a model is trained on a large generic corpus, it is called 'pre-training'; when it is adapted to a particular task or dataset, it is called 'fine-tuning'. Technically speaking, in either case ('pre-training' or 'fine-tuning') the model weights are updated. Usually, you can simply take the pre-trained model and fine-tune it for a specific task (such as classification or question answering). However, if the target dataset comes from a specialized domain and you have some unlabeled data that might help the model adapt to that domain, you can first run an unsupervised masked-language-modeling (MLM) or MLM + next-sentence-prediction (NSP) 'fine-tuning' pass on that data (some researchers call this 'pre-training', especially when a huge corpus is used), and then fine-tune on the labeled target corpus for the target task.
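
To make the two-step recipe concrete, here is a minimal sketch using the Hugging Face transformers and datasets libraries: first an unsupervised MLM pass on an unlabeled domain corpus, then supervised fine-tuning on a labeled task. The checkpoint name, the domain_corpus.txt file, the output directories, and the label count are hypothetical placeholders, not values from this post.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Step 1: unsupervised MLM 'fine-tuning' on the unlabeled domain corpus.
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
domain = load_dataset("text", data_files={"train": "domain_corpus.txt"})  # hypothetical file

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

domain = domain.map(tokenize, batched=True, remove_columns=["text"])
# The collator randomly masks 15% of tokens, the standard BERT MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm-out", num_train_epochs=1),
    train_dataset=domain["train"],
    data_collator=collator,
).train()
mlm_model.save_pretrained("domain-adapted-bert")
tokenizer.save_pretrained("domain-adapted-bert")

# Step 2: supervised fine-tuning on the labeled target task.
# Loading the domain-adapted encoder adds a fresh, randomly initialized
# classification head on top of the adapted weights.
clf_model = AutoModelForSequenceClassification.from_pretrained(
    "domain-adapted-bert", num_labels=2  # hypothetical label count
)
# Build the labeled dataset the same way and train with Trainer as above.
```

Skipping step 1 and fine-tuning the generic checkpoint directly also works; the MLM pass is only worthwhile when the domain vocabulary and style differ enough from the pre-training corpus.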