Conference paper, Year: 2012

Text Mining Methods Applied to Mathematical Texts

Abstract

Up to now, flexiform mathematical text has mainly been processed with the intention of formalizing mathematical knowledge so that proof engines can be applied to it. This approach can be compared with the symbolic approach to natural language processing, where methods of logic and knowledge representation are used to analyze linguistic phenomena. In the last two decades, a new approach to natural language processing has emerged, based on statistical methods and, in particular, data mining. This approach, called text mining, aims to process large text corpora in order to detect tendencies, extract information, classify documents, etc. In this paper we present math mining, namely the potential applications of text mining to mathematical texts. After reviewing some existing works heading in that direction, we formulate and describe several roadmap suggestions for the use and applications of statistical methods in mathematical text processing: (1) using terms instead of words as the basic unit of text processing, (2) using topics instead of subjects ("topics" in the sense of "topic models" in natural language processing, and "subjects" in the sense of various mathematical subject classifications), (3) using and correlating various graphs extracted from mathematical corpora, (4) using paraphrastic redundancy, etc. The purpose of this talk is to give a glimpse of potential applications of the math mining approach to large mathematical corpora, such as arXiv.org.
Main file
paper.pdf (888.31 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01864536, version 1 (30-08-2018)

Identifiers

  • HAL Id: hal-01864536, version 1

Cite

Yannis Haralambous. Text Mining Methods Applied to Mathematical Texts. CICM 2012: Conferences on Intelligent Computer Mathematics, Jul 2012, Bremen, Germany. ⟨hal-01864536⟩
