Multiple Retrieval Models and Regression Models for Prior Art Search

Patrice Lopez 1 Laurent Romary 1, 2
2 GEMO - Integration of data and knowledge distributed over the web
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : This paper presents PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS), a system developed for the IP track of CLEF 2009. Our approach has three main characteristics: (1) the use of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the track (English, French, German), producing ten different sets of ranked results; (2) the merging of these results with multiple regression models, using an additional validation set created from the patent collection; (3) the exploitation of patent metadata and citation structures to create restricted initial working sets of patents and to produce a final re-ranking regression model. Because patent-specific metadata and citation relations are used only when creating the initial working sets and during the final re-ranking step, the architecture remains generic and easy to extend.
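The merging step described in the abstract can be illustrated with a minimal sketch. It is not the PATATRAS implementation: it assumes a plain linear least-squares regression fitted on a small validation set, and every name in it (`fit_linear_regression`, `merge_runs`, the toy scores and document identifiers) is hypothetical. Each retrieval run is represented as a mapping from document identifier to score, and the regression learns one weight per run, which is then used to combine the runs into a single ranking.

```python
from typing import Dict, List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(
    features: Sequence[Sequence[float]],
    targets: Sequence[float],
    ridge: float = 1e-6,
) -> List[float]:
    """Return [intercept, w_1, ..., w_k] from the normal equations.

    The small ridge term is an assumption added for numerical stability.
    """
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(k):
        xtx[i][i] += ridge
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def merge_runs(
    runs: Sequence[Dict[str, float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Combine per-run scores into one ranking, best document first.

    A document missing from a run gets score 0.0 for that run (an assumption).
    """
    doc_ids = set().union(*(run.keys() for run in runs))
    merged = []
    for doc_id in doc_ids:
        feats = [run.get(doc_id, 0.0) for run in runs]
        score = weights[0] + sum(w * f for w, f in zip(weights[1:], feats))
        merged.append((doc_id, score))
    merged.sort(key=lambda pair: (-pair[1], pair[0]))
    return merged


# Toy validation set: one row per (query, candidate) pair, one column per run;
# the target is 1.0 when the candidate is a known relevant document.
validation_features = [(0.9, 0.1), (0.8, 0.9), (0.2, 0.8), (0.1, 0.2)]
validation_targets = [1.0, 1.0, 0.0, 0.0]
weights = fit_linear_regression(validation_features, validation_targets)

# Two hypothetical runs for a new query, merged with the learned weights.
runs = [{"A": 0.9, "B": 0.2}, {"B": 0.9, "C": 0.5}]
ranking = merge_runs(runs, weights)
```

In this toy data only the first run separates relevant from non-relevant candidates, so the fitted weight for that run dominates the merged score. A real system would normalise scores across runs before fitting.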
Document type :
Conference papers

Cited literature : 15 references

https://hal.archives-ouvertes.fr/hal-00411835
Contributor : Laurent Romary
Submitted on : Sunday, August 30, 2009 - 8:17:25 PM
Last modification on : Friday, March 22, 2019 - 2:22:12 PM
Document(s) archived on : Tuesday, June 15, 2010 - 9:11:41 PM

Files

technote.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00411835, version 1
  • ARXIV : 0908.4413

Citation

Patrice Lopez, Laurent Romary. Multiple Retrieval Models and Regression Models for Prior Art Search. CLEF 2009 Workshop, Sep 2009, Corfu, Greece. 18p. ⟨hal-00411835⟩
