Conference paper, Year: 2018

A clustering approach for detecting defects in technical documents

Abstract

Requirements are usually “hand-written” and suffer from several problems, such as redundancy and inconsistency. Redundancy and inconsistency between requirements, or between sets of requirements, negatively impact the success of final products. Handling these issues manually is time-consuming and costly. The main contribution of this paper is the use of the k-means algorithm for redundancy and inconsistency detection in a new context: Requirements Engineering. We also introduce a pre-processing step based on Natural Language Processing (NLP) techniques in order to assess its impact on the k-means results. We use Part-Of-Speech (POS) tagging and noun chunking to detect technical business terms associated with the requirements documents that we analyze. We evaluate this approach on real industrial datasets. The results show the effectiveness of the k-means clustering algorithm, especially when combined with the pre-processing step.
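
As a minimal illustration of the kind of pipeline the abstract describes, the sketch below clusters a few requirement sentences with a plain k-means over TF-IDF vectors, so that near-duplicate requirements land in the same cluster. It is not the authors' implementation. The sample sentences, the stop-word filter (a crude stand-in for the POS tagging and noun chunking used in the paper), the TF-IDF weighting, the farthest-first initialisation, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy requirements; the paper's industrial datasets are not reproduced here.
REQUIREMENTS = [
    "The pump shall stop when the tank pressure exceeds the limit",
    "The pump shall stop if the tank pressure exceeds the limit",
    "The operator shall log every maintenance action in the journal",
    "Every maintenance action shall be logged by the operator in the journal",
]

# Assumed stand-in for the NLP pre-processing step: drop common function words.
STOP_WORDS = {"the", "shall", "when", "if", "in", "be", "by", "every", "a", "an"}


def tokenize(text, stop_words=frozenset()):
    """Lower-case, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in stop_words]


def tfidf_vectors(docs_tokens):
    """Build L2-normalised TF-IDF vectors with a smoothed IDF term."""
    vocab = sorted({w for doc in docs_tokens for w in doc})
    n = len(docs_tokens)
    df = Counter(w for doc in docs_tokens for w in set(doc))
    vectors = []
    for doc in docs_tokens:
        tf = Counter(doc)
        vec = [tf[w] * (math.log((1 + n) / (1 + df[w])) + 1) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with a deterministic farthest-first initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(squared_distance(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


tokens = [tokenize(r, STOP_WORDS) for r in REQUIREMENTS]
labels = kmeans(tfidf_vectors(tokens), k=2)
for label, requirement in zip(labels, REQUIREMENTS):
    print(label, requirement)
```

Requirements that share a cluster are only candidates for redundancy or inconsistency; in this sketch they would still need to be reviewed by hand. Calling `tokenize` without `STOP_WORDS` gives a rough way to compare results with and without pre-processing.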
Main file
mezghani_22498.pdf (188.47 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02191796, version 1 (23-07-2019)

Identifiers

  • HAL Id: hal-02191796, version 1
  • OATAO: 22498

Cite

Manel Mezghani, Juyeon Kang Choi, Florence Sèdes. A clustering approach for detecting defects in technical documents. 13th International Workshop on Natural Language Processing and Cognitive Science (NLPCS 2018), Sep 2018, Kraków, Poland. pp. 27-33. ⟨hal-02191796⟩
38 views
59 downloads
