%0 Conference Proceedings
%T Integration Process for Multidimensional Textual Data Modeling
%+ Equipe de Recherche en Ingénierie des Connaissances (ERIC)
%A Aknouche, Rachid
%A Asfari, Ounas
%A Bentayeb, Fadila
%A Boussaid, Omar
%< avec comité de lecture
%Z ERIC:13-027
%( Proceedings of SEM / ENASE 2013
%B 1st International Workshop in Software Evolution and Modernization SEM / ENASE 2013
%C Angers, France
%Y 8th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2013)
%N 978-989-8565-66-2
%P 119-126
%8 2013-07-04
%D 2013
%K Extract-Transform-Load
%K Textual Data
%K Text Warehousing
%K Text Warehouse Model
%K TWM
%K data integration
%K decisional architecture
%K information retrieval
%K 20 Newsgroups
%Z Computer Science [cs]/Document and Text Processing
%Z Computer Science [cs]/Databases [cs.DB]Conference papers
%X In this paper, we propose an original approach to the text warehousing process. It is based on a decisional architecture that combines classical data warehousing tasks with information retrieval (IR) techniques. We first propose a new ETL process, named ETL-Text, for textual data integration, and then present a new Text Warehouse Model, denoted TWM, which takes into account both the structure and the semantics of the textual data. TWM is associated with new dimension types, including a metadata dimension and a semantic dimension. In addition, we propose a new analysis measure based on the language model widely used in the IR area. Moreover, our approach relies on Wikipedia as an external knowledge source to extract the semantics of the textual documents. To validate our approach, we develop a prototype composed of several processing modules that illustrate the different steps of ETL-Text. We also use the 20 Newsgroups corpus for our experiments.
%G English
%L hal-00911862
%U https://hal.science/hal-00911862
%~ UNIV-LYON2
%~ ERIC
%~ UDL