Journal article, International Journal of Medical Informatics, Year: 2009

Learning ontological rules to extract multiple relations of genic interactions from text

Abstract

Introduction: Information extraction (IE) systems have been proposed in recent years to extract genic interactions from bibliographical resources. They are limited to single interaction relations, and have to face a trade-off between recall and precision, by focusing either on specific interactions (for precision), or general and unspecified interactions of biological entities (for recall). Yet, biologists need to process more complex data from literature, in order to study biological pathways. An ontology is an adequate formal representation to model this sophisticated knowledge. However, the tight integration of IE systems and ontologies is still a current research issue, a fortiori with complex ones that go beyond hierarchies.

Method: We propose a rich modeling of genic interactions with an ontology, and show how it can be used within an IE system. The ontology is seen as a language specifying a normalized representation of text. First, IE is performed by extracting instances from natural language processing (NLP) modules. Then, deductive inferences on the ontology language are completed, and new instances are derived from previously extracted ones. Inference rules are learnt with an inductive logic programming (ILP) algorithm, using the ontology as the hypothesis language, and its instantiation on an annotated corpus as the example language. Learning is set in a multi-class setting to deal with the multiple ontological relations.

Results: We validated our approach on an annotated corpus of gene transcription regulations in the Bacillus subtilis bacterium. We reach a global recall of 89.3% and a precision of 89.6%, with high scores for the ten semantic relations defined in the ontology.
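The deductive step described under Method, where new instances are derived from previously extracted ones, can be illustrated with a toy forward-chaining sketch. This is not the authors' system: the predicate names (`binds_to`, `promoter_of`, `regulates`), the entity names, and the single rule are hypothetical, and the rule is written by hand here, whereas the paper learns its rules with an ILP algorithm.

```python
# Toy forward chaining over extracted instances (illustrative assumption only).
# A fact is a tuple (predicate, arg1, arg2, ...); terms starting with "?" are variables.

def is_var(term):
    return term.startswith("?")

def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals `fact`; return None on failure."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    new = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def forward_chain(facts, rules):
    """Apply rules (body, head) repeatedly until no new fact can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            bindings = [{}]
            for pattern in body:
                bindings = [b2 for b in bindings for f in facts
                            if (b2 := match(pattern, f, b)) is not None]
            for b in bindings:
                derived_fact = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if derived_fact not in facts:
                    facts.add(derived_fact)
                    changed = True
    return facts

# Hypothetical instances, standing in for the output of the NLP modules.
extracted = {
    ("binds_to", "protA", "promB"),
    ("promoter_of", "promB", "geneB"),
}

# Hypothetical rule: regulates(P, G) <- binds_to(P, S), promoter_of(S, G).
rules = [
    ([("binds_to", "?p", "?s"), ("promoter_of", "?s", "?g")],
     ("regulates", "?p", "?g")),
]

derived = forward_chain(extracted, rules)
print(sorted(derived - extracted))
```

In this sketch the only new instance is a `regulates` relation between `protA` and `geneB`, obtained by chaining the two extracted instances through their shared argument.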

Dates and versions

hal-02660738, version 1 (30-05-2020)

Cite

Alain-Pierre Manine, Erick Alphonse, Philippe Bessières. Learning ontological rules to extract multiple relations of genic interactions from text. International Journal of Medical Informatics, 2009, 78 (12), pp.E31-E38. ⟨10.1016/j.ijmedinf.2009.03.005⟩. ⟨hal-02660738⟩

Collections

INRA INRAE MATHNUM