Please use this identifier to cite or link to this item: https://hdl.handle.net/11108/317
Title: 
Machine Learning Architectures for Scalable and Reliable Subject Indexing. Fusion, Knowledge Transfer, and Confidence
Authors: 
Toepfer, Martin
Year of Publication: 
2017
Citation: 
[Editor:] Kamps, Jaap et al. [Title:] Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Thessaloniki, Greece, September 18-21, 2017, Proceedings [ISBN:] 978-3-319-67008-9 [Series:] Lecture Notes in Computer Science [No.:] 10450 [Publisher:] Springer [Place:] Cham [Year:] 2017 [Pages:] 644-647
Abstract: 
Digital libraries desire automatic subject indexing as a scalable provider of high-quality semantic document representations. The task is, however, complex and challenging, and many issues remain unsolved. For instance, certain concepts are not detected accurately, and confidence estimates are often unreliable. Accurate quality estimates are nevertheless crucial in practice, for example, to filter results and ensure the highest standards before subsequent use. The proposed thesis studies applications of machine learning for automatic subject indexing, which faces considerable challenges such as class imbalance, concept drift, and zero-shot learning. Special attention will be paid to architecture design and automatic quality estimation, with experiments on scholarly publications in economics and business studies. First results indicate the importance of knowledge transfer between concepts and point out the value of so-called fusion approaches that carefully combine lexical and associative subsystems. This extended abstract summarizes the main topic and status of the thesis and provides an outlook on future directions.
Subjects: 
Automatic subject indexing
Machine learning
Quality control
Persistent Identifier of the first edition: 
