Understanding NLP Model Adaptation: Pre-Training vs. Fine-Tuning

The difference is largely one of terminology. When a model is trained from scratch on a large, generic corpus, the process is called 'pre-training'. When that model is subsequently adapted to a particular task or dataset, it is called 'fine-tuning'.

Technically speaking, in either case ('pre-training' or 'fine-tuning'), the model weights are updated.
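
To illustrate, here is a minimal sketch (assuming the Hugging Face Transformers library and the public 'bert-base-uncased' checkpoint, neither of which is specified above) showing that the masked-LM objective used in pre-training and a classification head used in fine-tuning both sit on top of the same encoder, whose parameters are trainable in either case:

```python
from transformers import AutoModelForMaskedLM, AutoModelForSequenceClassification

# Masked-LM head, i.e. the objective typically used during 'pre-training'.
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Classification head, i.e. a typical 'fine-tuning' setup.
cls_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# In both cases the underlying encoder parameters require gradients, so
# training with either objective updates the model weights.
for name, model in [("MLM ('pre-training')", mlm_model),
                    ("classification ('fine-tuning')", cls_model)]:
    n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"{name}: {n_trainable:,} trainable parameters")
```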

For example, you can usually just take a pre-trained model and fine-tune it directly for a specific task (such as classification or question answering). However, if the target dataset comes from a specialized domain and you have some unlabeled text from that domain, you can first run an MLM or MLM+NSP 'fine-tuning' step (self-supervised learning) on that unlabeled data; some researchers call this step 'pre-training' as well, especially when a huge corpus is used. You then follow it with supervised fine-tuning on the labeled target-task data, as sketched below.
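
The two-stage workflow could look like the following sketch (again assuming Hugging Face Transformers and Datasets, shown with MLM only; the checkpoint name, output directories, and the tiny in-memory corpora are placeholders for a real unlabeled domain corpus and a real labeled target dataset):

```python
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# --- Stage 1: domain-adaptive MLM training on unlabeled in-domain text ---
domain_corpus = Dataset.from_dict(
    {"text": ["first unlabeled in-domain sentence.",
              "second unlabeled in-domain sentence."]}
)
domain_tokenized = domain_corpus.map(tokenize, batched=True, remove_columns=["text"])

mlm_model = AutoModelForMaskedLM.from_pretrained(checkpoint)
mlm_trainer = Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="domain-adapted", num_train_epochs=1),
    train_dataset=domain_tokenized,
    # Randomly masks 15% of tokens and builds the MLM labels on the fly.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
mlm_trainer.train()
mlm_trainer.save_model("domain-adapted")   # encoder weights adapted to the domain
tokenizer.save_pretrained("domain-adapted")

# --- Stage 2: supervised fine-tuning on the labeled target task ---
labeled_data = Dataset.from_dict(
    {"text": ["labeled example one.", "labeled example two."], "label": [0, 1]}
)
labeled_tokenized = labeled_data.map(tokenize, batched=True, remove_columns=["text"])

# The MLM head is dropped and a fresh classification head is added on top of
# the domain-adapted encoder.
cls_model = AutoModelForSequenceClassification.from_pretrained("domain-adapted", num_labels=2)
cls_trainer = Trainer(
    model=cls_model,
    args=TrainingArguments(output_dir="target-task", num_train_epochs=3),
    train_dataset=labeled_tokenized,
    data_collator=DataCollatorWithPadding(tokenizer),
)
cls_trainer.train()
```

Saving a checkpoint between the two stages means the task-specific head in stage 2 is initialized on top of an encoder that has already seen in-domain text, which is the whole point of the intermediate MLM step.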
