Conference papers

Towards an automatic detection of sensitive information in a database

Abstract : To validate user requirements, tests are often conducted on real data. However, development and testing are increasingly outsourced, which leads companies to give external staff access to real confidential data. One solution to this problem is known as Data Scrambling. Many algorithms aim to replace true data with false but realistic values. However, nothing has been developed to automate the crucial task of detecting which data should be scrambled. In this paper we propose an innovative approach - and its implementation as an expert system - for automatically detecting the candidate attributes for scrambling. Our approach is mainly based on semantic rules that determine which concepts have to be scrambled, and on a linguistic component that retrieves the attributes that semantically correspond to these concepts. Since attributes cannot be considered independently of each other, we also address the challenging problem of propagating the scrambling across the whole database. An important contribution of our approach is a semantic model of sensitive data. This knowledge is made available through production rules that operationalize sensitive data detection.
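
The two steps named in the abstract, detecting candidate attributes from concept rules and propagating the scrambling through related attributes, can be illustrated with a minimal sketch. It is not the paper's expert system: the rule base `SENSITIVE_CONCEPTS`, the substring matching that stands in for the linguistic component, and the functions `detect_candidates` and `propagate` are all hypothetical names and simplifications introduced here for illustration.

```python
from collections import deque

# Hypothetical rule base: sensitive concept -> attribute-name synonyms.
# The paper's semantic rules and linguistic component are richer than this.
SENSITIVE_CONCEPTS = {
    "person_name": {"name", "surname"},
    "salary": {"salary", "wage", "income"},
    "identifier": {"ssn", "passport"},
}


def normalize(attribute):
    """Lower-case an attribute name and drop common separators."""
    return attribute.lower().replace("_", "").replace("-", "")


def detect_candidates(schema):
    """Map each (table, attribute) whose name contains a synonym to its concept.

    Naive substring matching is an assumption of this sketch; it will also
    flag names such as "filename".
    """
    found = {}
    for table, attributes in schema.items():
        for attribute in attributes:
            token = normalize(attribute)
            for concept, synonyms in SENSITIVE_CONCEPTS.items():
                if any(synonym in token for synonym in synonyms):
                    found[(table, attribute)] = concept
                    break
    return found


def propagate(candidates, links):
    """Extend the candidate set along links between attributes.

    `links` is a list of attribute pairs that must stay consistent, for
    example a foreign key or a derived value (assumed to be given as input).
    """
    adjacency = {}
    for left, right in links:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    marked = set(candidates)
    queue = deque(candidates)
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in marked:
                marked.add(neighbour)
                queue.append(neighbour)
    return marked


schema = {
    "employee": ["emp_id", "last_name", "salary"],
    "payslip": ["slip_id", "amount", "emp_ref"],
}
links = [(("employee", "salary"), ("payslip", "amount"))]

candidates = detect_candidates(schema)
to_scramble = propagate(candidates, links)
```

In this toy schema `payslip.amount` matches no rule by name, yet it ends up in `to_scramble` because it is linked to `employee.salary`, which is the kind of dependency the propagation step is meant to capture.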
Contributor : Laboratoire Cedric
Submitted on : Friday, March 6, 2015 - 11:25:06 AM
Last modification on : Monday, November 30, 2020 - 3:00:06 PM


Cedric Du Mouza, Elisabeth Metais, Nadira Lammari, Jacky Akoka, Tatiana Aubonnet, et al. Towards an automatic detection of sensitive information in a database. DBKDA 2010 : 2nd International Conference on Advances in Databases, Knowledge, and Data Applications, Apr 2010, Les Menuires, France. pp.247-252, ⟨10.1109/DBKDA.2010.17⟩. ⟨hal-01125724⟩


