
A new Method to Match Data Sets Applied to Electric Market

Abstract : The French household electricity market, currently dominated by EDF, is going to be opened up to competition. In this context, a better understanding of customer behaviour will be key to the company's success. To this end, EDF holds two information sources: a comprehensive customer invoicing file, which contains few individual data, and the results of surveys carried out in regional centres, which contain more data per customer. We present a new method of data fusion based on the generation of virtual individuals, for each of whom a full set of variables is available. The procedure has two steps. First, a Multiple Correspondence Analysis (MCA) is performed on the fundamental variables of an existing master sample. A sample of virtual individuals (SVI) is drawn at random, based on the distribution of each significant MCA factor. Then, for each virtual individual, a specific value is assigned to each fundamental variable that is most correlated with one of the MCA factors. Second, the distinct secondary samples are grafted onto this SVI. The distributions of their variables are estimated simultaneously by PLS regression on the variables shared by all samples. The method has two advantages: the size of the SVI can be chosen freely, and it avoids the variance underestimation generally encountered with imputation methods. So far, the process has been applied to two databases, namely two surveys, in order to generate the expected artificial sample. Validation and perspectives are also discussed.
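The first step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the master sample is simulated, the number of retained factors is fixed at two rather than chosen by a significance criterion, and all function names (mca_row_scores, draw_virtual_scores, decode_nearest) are hypothetical. The abstract does not specify the rule that assigns values of the fundamental variables to a virtual individual in enough detail to reproduce it, so the sketch substitutes a nearest-neighbour lookup in factor space as a stand-in.

```python
import numpy as np


def mca_row_scores(codes, n_factors):
    """MCA as a correspondence analysis of the indicator (one-hot) matrix.

    codes: (n, q) integer array, one column per categorical variable.
    Returns the (n, n_factors) principal coordinates of the individuals.
    """
    n, q = codes.shape
    blocks = []
    for j in range(q):
        levels = np.unique(codes[:, j])
        # One indicator column per observed level of variable j.
        blocks.append((codes[:, [j]] == levels[None, :]).astype(float))
    Z = np.hstack(blocks)

    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses (all equal to 1/n)
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_factors] * s[:n_factors]) / np.sqrt(r)[:, None]


def draw_virtual_scores(F, size, rng):
    """Draw each factor independently from its empirical distribution."""
    cols = [rng.choice(F[:, k], size=size, replace=True) for k in range(F.shape[1])]
    return np.column_stack(cols)


def decode_nearest(virtual_F, F, codes):
    """Stand-in assignment rule (an assumption, not the paper's rule):
    copy the categories of the closest master individual in factor space."""
    d2 = ((virtual_F[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    return codes[d2.argmin(axis=1)]


rng = np.random.default_rng(0)
# Simulated master sample: 200 individuals, 4 variables with 3 levels each.
master = rng.integers(0, 3, size=(200, 4))

F = mca_row_scores(master, n_factors=2)
virtual_F = draw_virtual_scores(F, size=500, rng=rng)   # SVI size is a free choice
virtual = decode_nearest(virtual_F, F, master)
```

Drawing each factor separately relies on the MCA factors being uncorrelated by construction. Note that the SVI size (500 here) is independent of the master sample size (200), which is the first advantage the abstract mentions.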
Document type :
Conference papers

Contributor : Laboratoire Cedric
Submitted on : Monday, March 23, 2020 - 12:49:49 PM
Last modification on : Monday, March 30, 2020 - 3:08:54 PM




  • HAL Id : hal-01124656, version 1



Nicolas Fischer, Christian Derquenne, Gilbert Saporta. A new Method to Match Data Sets Applied to Electric Market. NTTS-ETK : New Techniques and Technologies for Statistics, Exchange of Technology and Know-how, Eurostat, Jun 2001, Hersonissos, Greece. pp.725-733. ⟨hal-01124656⟩


