Conference paper, Year: 2020

RDF triples extraction from company web pages: comparison of state-of-the-art Deep Models

Abstract

Relation extraction (RE) is a promising way to extend the semantic web from web pages. However, it is unclear how RE can deal with the challenges that web pages pose, such as noise, data sparsity and conflicting information. In this paper, we benchmark state-of-the-art RE approaches on the particular case of company web pages, since company web pages are an important source of information for FinTech and BusinessTech. To this end, we present a method to build a corpus that mimics the characteristics of web pages. We used this corpus to evaluate several deep learning RE models and compared it with another benchmark corpus.
Main file
DeepOntoNLP_Wouter(2).pdf (120.99 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02941935, version 1 (17-09-2020)

Identifiers

  • HAL Id: hal-02941935, version 1

Cite

Wouter Baes, François Portet, Hamid Mirisaee, Cyril Labbé. RDF triples extraction from company web pages: comparison of state-of-the-art Deep Models. 1st International Workshop Deep Learning meets Ontologies and Natural Language Processing, Sep 2020, Bozen-Bolzano, Italy. ⟨hal-02941935⟩
184 views
887 downloads
