
Towards an Automatic Detection of Sensitive Information in a Database

Abstract : The test phase is a crucial step in information system design, since it provides the first real validation of user requirements. To maximize their effectiveness, tests are often conducted on real data. However, development and testing are increasingly outsourced, which leads companies to provide external staff with real confidential data. One solution to this problem is known as data scrambling: many algorithms aim at smartly replacing true data with false but realistic data. However, nothing has been developed to automate the crucial task of detecting the data to be scrambled. In this paper we propose an innovative approach, and its implementation as an expert system, to automatically detect the candidate attributes for scrambling. Our approach is mainly based on semantic rules that determine which concepts have to be scrambled, and on a linguistic component that retrieves the attributes that semantically correspond to these concepts. Since attributes cannot be considered independently of each other, we also address the challenging problem of propagating the scrambling throughout the whole database. An important contribution of our approach is a semantic model of sensitive data. This knowledge is made available through production rules that operationalize sensitive data detection.
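The pipeline outlined in the abstract (concepts marked as sensitive, a linguistic step that maps attribute names to those concepts, then propagation across the database) can be illustrated with a minimal sketch. This is not the authors' expert system: the lexicon `SENSITIVE_CONCEPTS`, the functions `tokens`, `detect` and `propagate`, the example schema, and the choice of foreign-key links as the propagation mechanism are all assumptions made for illustration, and simple token matching stands in for the paper's linguistic component.

```python
import re

# Hypothetical lexicon: sensitive concept -> terms that may name it in a schema.
SENSITIVE_CONCEPTS = {
    "person_name": {"name", "surname", "lastname", "firstname"},
    "salary": {"salary", "wage", "pay"},
    "identifier": {"ssn", "passport"},
}


def tokens(attribute):
    """Split an attribute name such as 'emp_LastName' into lowercase tokens."""
    out = []
    for part in re.split(r"[_\W]+", attribute):
        # Break camelCase pieces and keep acronyms or digit runs together.
        out.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part))
    return [t.lower() for t in out if t]


def detect(schema):
    """Return {(table, attribute): concept} for attributes naming a concept."""
    found = {}
    for table, attributes in schema.items():
        for attribute in attributes:
            toks = set(tokens(attribute))
            for concept, terms in SENSITIVE_CONCEPTS.items():
                if toks & terms:
                    found[(table, attribute)] = concept
                    break
    return found


def propagate(found, foreign_keys):
    """Extend the candidate set along foreign-key links until it is stable."""
    result = dict(found)
    changed = True
    while changed:
        changed = False
        for src, dst in foreign_keys:
            # A link is followed in both directions: scrambling one side
            # without the other would break referential consistency.
            for a, b in ((src, dst), (dst, src)):
                if a in result and b not in result:
                    result[b] = result[a]
                    changed = True
    return result


# Hypothetical schema: table name -> attribute names.
schema = {
    "employee": ["ssn", "LastName", "salary"],
    "payslip": ["emp_ref", "amount"],
}
# Hypothetical foreign key: payslip.emp_ref references employee.ssn.
foreign_keys = [(("payslip", "emp_ref"), ("employee", "ssn"))]

candidates = propagate(detect(schema), foreign_keys)
for (table, attribute), concept in sorted(candidates.items()):
    print(f"{table}.{attribute}: {concept}")
```

In this sketch `payslip.emp_ref` is not recognised by name, but becomes a candidate because it references `employee.ssn`, which is the kind of dependency between attributes the abstract alludes to.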
Contributor : Laboratoire Cedric
Submitted on : Friday, March 6, 2015 - 11:25:06 AM
Last modification on : Monday, February 17, 2020 - 10:40:16 PM


  • HAL Id : hal-01125724, version 1


Cedric Du Mouza, Elisabeth Metais, Nadira Lammari, Jacky Akoka, Tatiana Aubonnet, et al. Towards an Automatic Detection of Sensitive Information in a Database. DBKDA'10, Int. Conf. on Advances in Databases, Knowledge, and Data Applications, Les Menuires, Jan 2010, X, France. pp.34-39. ⟨hal-01125724⟩


