
Seek&Hide. Anonymising a French SMS corpus using natural language processing techniques.

Pierre Accorsi 1 Namrata Patel 2 Cédric Lopez 3 Rachel Panckhurst 4 Mathieu Roche 5, 6
2 GRAPHIK - Graphs for Inferences on Knowledge
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
5 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : This article presents the system Seek&Hide, a text message processing tool developed for the sud4science LR (http://www.sud4science.org/) project. It performs the anonymisation/de-identification of a corpus. At present, it has been used to anonymise the sud4science LR corpus of French text messages collected during the project. This is done in two phases. In the first phase, it automatically processes over 70% of the corpus. The rest of the corpus is processed in the second phase, aided by an expert annotator via a web interface specifically designed to simplify the task.
Document type :
Book sections

https://hal.archives-ouvertes.fr/hal-01485615
Contributor : Rachel Panckhurst
Submitted on : Thursday, March 9, 2017 - 10:34:41 AM
Last modification on : Monday, October 11, 2021 - 1:24:12 PM

Citation

Pierre Accorsi, Namrata Patel, Cédric Lopez, Rachel Panckhurst, Mathieu Roche. Seek&Hide. Anonymising a French SMS corpus using natural language processing techniques. Louise-Amélie Cougnon; Cédrick Fairon. SMS Communication. A linguistic approach, John Benjamins, pp.11-28, 2014, 978 90 272 0280 2/9789027270306. ⟨10.1075/bct.61⟩. ⟨hal-01485615⟩
