Pre-trained language models (LMs) achieve outstanding performance on various natural language processing tasks; however, because these models contain a very large number of parameters in order to handle large-scale text corpora during pre-training, they carry a risk of overfitting when fine-tuned on small task-oriented datasets. In this paper, we propose a text embedding augmentation method to prevent such overfitting. The proposed method augments a text embedding by generating an adversarial embedding, which is not identical to the original input embedding but preserves its characteristics, using PGD-based adversarial training on the input text data.
A pseudo-label identical to the label of the original input text is then assigned to the adversarial embedding, and the adversarial embedding and pseudo-label are used as an input embedding and label pair to retrain a separate LM. Experimental results on several text classification benchmark datasets demonstrate that the proposed method effectively prevents the overfitting that commonly occurs when adapting a large-scale pre-trained LM to a specific task.
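The core step described above is generating an adversarial embedding with PGD and pairing it with the original label as a pseudo-label. The following is a minimal sketch of that step, assuming a PyTorch / Hugging Face-style classification model that accepts inputs_embeds; the function name, step count, epsilon, and step size are illustrative assumptions, not the paper's reported settings.

```python
# Minimal PGD sketch (assumed PyTorch setup; hyperparameters are illustrative).
import torch
import torch.nn.functional as F


def pgd_adversarial_embedding(model, input_embeds, attention_mask, labels,
                              epsilon=1e-2, alpha=1e-3, steps=3):
    """Generate an adversarial copy of the input embedding via PGD."""
    delta = torch.zeros_like(input_embeds, requires_grad=True)
    for _ in range(steps):
        outputs = model(inputs_embeds=input_embeds + delta,
                        attention_mask=attention_mask)
        loss = F.cross_entropy(outputs.logits, labels)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project the perturbation back into the L-inf ball.
        delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
        delta = delta.detach().requires_grad_(True)
    return (input_embeds + delta).detach()
```

In this sketch, the returned adversarial embedding keeps the original label as its pseudo-label and is then fed, together with the clean example, as additional training data when retraining a separate LM on the small task-specific dataset.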