Conference papers

Automatic Subject Indexing and Classification Using Text Recognition and Computer-Based Analysis of Tables of Contents

Abstract: This paper describes a method for machine-based creation of high-quality subject indexing and classification for both electronic and print documents using tables of contents (ToCs). The technology is aimed primarily at electronic and print documents whose full text cannot be indexed for technical or licensing reasons. It would also be useful for full-text documents, because analyzing the structure of ToCs could significantly enhance the accuracy and relevance of subject description.
Document type: Conference papers

https://hal.archives-ouvertes.fr/hal-01816705
Contributor: Openedition Press
Submitted on: Friday, June 15, 2018 - 3:51:38 PM
Last modification on: Wednesday, July 4, 2018 - 9:35:52 AM
Long-term archiving on: Monday, September 17, 2018 - 10:31:36 AM

File

PokornyJan_ELPUB2018.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Jan Pokorny. Automatic Subject Indexing and Classification Using Text Recognition and Computer-Based Analysis of Tables of Contents. ELPUB 2018, Jun 2018, Toronto, Canada. ⟨10.4000/proceedings.elpub.2018.19⟩. ⟨hal-01816705⟩
