Please use this identifier to cite or link to this item: https://hdl.handle.net/11108/361
Title: 
Fusion architectures for automatic subject indexing under concept drift
Authors: 
Toepfer, Martin
Seifert, Christin
Year of Publication: 
2018
Citation: 
[Journal:] International Journal on Digital Libraries [ISSN:] 1432-1300 [Issue:] Online First
Abstract: 
Indexing documents with controlled vocabularies enables a wealth of semantic applications for digital libraries. Due to the rapid growth of scientific publications, machine learning-based methods are required that assign subject descriptors automatically. While stability of the generative processes behind the underlying data is often tacitly assumed, it is violated in practice. Addressing this problem, this article studies explicit and implicit concept drift, that is, settings with new descriptor terms and new types of documents, respectively. First, the existence of concept drift in automatic subject indexing is discussed in detail and demonstrated by example. Subsequently, architectures for automatic indexing are analyzed in this regard, highlighting individual strengths and weaknesses. The results of the theoretical analysis justify research on fusion of different indexing approaches with special consideration of information sharing among descriptors. Experimental results on titles and author keywords in the domain of economics underline the relevance of the fusion methodology, especially under concept drift. Fusion approaches outperformed non-fusion strategies on the tested data sets, which comprised shifts in priors of descriptors as well as covariates. These findings can help researchers and practitioners in digital libraries to choose appropriate methods for automatic subject indexing, as is finally shown by a recent case study.
Subjects: 
Automatic Subject Indexing
Concept drift
Meta-learning
Multi-label classification
Short texts

Items in ZBWPub are protected by copyright, with all rights reserved, unless otherwise indicated.