Fine-Tuning BERT On Coarse-Grained Labels: Exploring Hidden States For Fine-Grained Classification

Anjum, Aftab; Krestel, Ralf

doi:10.1007/978-3-031-70239-6_1

Please use this link to cite this publication or to refer to it as an internet source: https://hdl.handle.net/11108/653

Date:

2024

Citation:

[Editor:] Rapp, Amon et al. [Title:] Natural Language Processing and Information Systems. NLDB 2024 [Series:] Lecture Notes in Computer Science [No.:] 14762 [Publisher:] Springer [Place:] Cham [Pages:] 1-15

Abstract:

In recent years, pre-trained language models such as BERT (Bidirectional Encoder Representations from Transformers) have demonstrated exceptional performance across various natural language processing tasks. However, how effectively BERT encodes and captures fine-grained distinctions within its hidden latent space while being fine-tuned on coarse-grained labels remains relatively unexplored. To investigate this, we performed two distinct tasks on fine-grained labels: clustering and few-shot classification. The representations extracted from BERT’s hidden layers are used as input for these tasks. In the few-shot classification task, we demonstrate that the BERT model encodes valuable information about fine-grained labels during its fine-tuning on coarse-grained labels, allowing the few-shot classifier to classify fine-grained classes accurately even with a limited number of data samples. Additionally, in the clustering analysis, the hidden layers are examined thoroughly to identify clusters that align with fine-grained label distinctions. The identification of such patterns provides further evidence that the BERT model encodes fine-grained label information within its hidden layers even when fine-tuned on coarse-grained labels. The findings contribute to a deeper understanding of the capabilities of the BERT model and provide valuable insights into harnessing its hidden latent space for fine-grained classification tasks.

Persistent identifier of the first publication:

doi:10.1007/978-3-031-70239-6_1

URL of the first publication:

https://www.ipr.informatik.uni-kiel.de/publications/pdfs/nldb24a

Document version:

Published Version

Appears in collection:

Open-Access-Publikationen von ZBW-Angehörigen (open-access publications by ZBW members)

File(s):

No files are associated with this publication.

Publications in ZBWPub are protected by copyright.