N. Aswani, K. Bontcheva, and H. Cunningham, Mining Information for Instance Unification, Lecture Notes in Computer Science, vol.4273, pp.329-363, 2006.
DOI : 10.1007/11926078_24

M. Bilenko, R. Mooney, W. Cohen, P. Ravikumar, and S. Fienberg, Adaptive name matching in information integration, IEEE Intelligent Systems, vol.18, issue.5, pp.16-23, 2003.
DOI : 10.1109/MIS.2003.1234765

P. Bourke and L. Butler, Standards issues in a national bibliometric database: The Australian case, Scientometrics, vol.31, issue.2, pp.199-207, 1996.
DOI : 10.1007/BF02018478

N. Carayol and L. Cassi, Who's Who in Patents. A Bayesian approach, 2009.

T. Churches, P. Christen, K. Lim, and J. X. Zhu, Preparation of name and address data for record linkage using hidden Markov models, BMC Medical Informatics and Decision Making, vol.2, issue.1, p.9, 2002.
DOI : 10.1186/1472-6947-2-9

G. Cleuziou, An extended version of the k-means method for overlapping clustering, 2008 19th International Conference on Pattern Recognition, pp.1-4, 2008.
DOI : 10.1109/ICPR.2008.4761079

URL : https://hal.archives-ouvertes.fr/hal-00466009

R. E. De Bruin and H. F. Moed, The unification of addresses in scientific publications, Informetrics 89/90, pp.65-78, 1989.

R. E. De Bruin and H. F. Moed, Delimitation of scientific subfields using cognitive words from corporate addresses in scientific publications, Scientometrics, vol.42, issue.4, pp.65-80, 1993.
DOI : 10.1007/BF02016793

P. Domingos and M. Pazzani, Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier, International Conference on Machine Learning (ICML), pp.105-112, 1996.

I. Fellegi and A. Sunter, A Theory for Record Linkage, Journal of the American Statistical Association, vol.64, issue.328, pp.1183-1210, 1969.
DOI : 10.1080/01621459.1969.10501049

J. C. French, A. L. Powell, and E. Schulman, Using clustering strategies for creating authority files, Journal of the American Society for Information Science, vol.51, issue.8, pp.774-786, 2000.
DOI : 10.1002/(SICI)1097-4571(2000)51:8<774::AID-ASI90>3.0.CO;2-P

C. Galvez and F. Moya-Anegón, The unification of institutional addresses applying parametrized finite-state graphs (P-FSG), Scientometrics, vol.69, issue.2, pp.323-345, 2006.
DOI : 10.1007/s11192-006-0156-3

D. J. Hand and K. Yu, Idiot's Bayes - Not So Stupid After All?, International Statistical Review, vol.69, issue.3, pp.385-398, 2001.
DOI : 10.1111/j.1751-5823.2001.tb00465.x

W. Hood and C. Wilson, Informetric studies using databases: Opportunities and challenges, Scientometrics, vol.58, issue.3, pp.587-608, 2003.
DOI : 10.1023/B:SCIE.0000006882.47115.c6

J. Huang, S. Ertekin, and C. L. Giles, Efficient Name Disambiguation for Large-Scale Databases, pp.536-544, 2006.
DOI : 10.1007/11871637_53

Y. Jiang, H. Zheng, X. Wang, B. Lu, and K. Wu, Affiliation disambiguation for constructing semantic digital libraries, Journal of the American Society for Information Science and Technology, vol.62, issue.6, pp.1029-1041, 2011.
DOI : 10.1002/asi.21538

J. Lamirel, R. Mall, P. Cuxac, and G. Safi, Variations to incremental growing neural gas algorithm based on label maximization, The 2011 International Joint Conference on Neural Networks, pp.956-965, 2011.
DOI : 10.1109/IJCNN.2011.6033326

URL : https://hal.archives-ouvertes.fr/hal-00645390

A. Lelu, Modèles neuronaux pour l'analyse de données documentaires et textuelles, 1993.

N. C. Liu, Y. Cheng, and L. Liu, Academic ranking of world universities using scientometrics - A comment to the "Fatal Attraction", Scientometrics, vol.64, issue.1, pp.101-112, 2005.
DOI : 10.1007/s11192-005-0241-z

J. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp.281-297, 1967.

H. F. Moed, Citation Analysis in Research Evaluation, 2005.

L. Niu, J. Wu, and Y. Shi, Entity Disambiguation with Textual and Connection Information, Procedia Computer Science, vol.9, pp.1249-1255, 2012.
DOI : 10.1016/j.procs.2012.04.136

F. Osareh and C. S. Wilson, A comparison of Iranian scientific publications in the SCI: 1985-1989 and 1990-1994, Scientometrics, vol.48, issue.3, pp.427-442, 2000.
DOI : 10.1023/A:1005648723433

E. Rahm and H. H. Do, Data Cleaning: Problems and Current Approaches, IEEE Data Engineering Bulletin, vol.23, issue.4, pp.3-13, 2000.

M. Sadinle, R. Hall, and S. Fienberg, Approaches to Multiple Record Linkage, 2010.

M. Sadinle and S. E. Fienberg, A Generalized Fellegi-Sunter Framework for Multiple Record Linkage With Application to Homicide Record Systems, Journal of the American Statistical Association, vol.11, issue.502, 2012.
DOI : 10.1080/01621459.2012.757231

A. F. J. Van Raan, Fatal attraction: Conceptual and methodological problems in the ranking of universities by bibliometric methods, Scientometrics, vol.62, issue.1, pp.133-143, 2005.
DOI : 10.1007/s11192-005-0008-6

S. L. Ventura, R. Nugent, and E. R. Fuchs, Methods Matter: Revamping Inventor Disambiguation Algorithms with Classification Models and Labeled Inventor Records, SSRN Electronic Journal, 2012.
DOI : 10.2139/ssrn.2079330

J. Wang, K. Berzins, D. Hicks, J. Melkers, F. Xiao et al., A boosted-trees method for name disambiguation, Scientometrics, vol.66, issue.1, pp.1-21, 2012.
DOI : 10.1007/s11192-012-0681-1

Y. Zhou, J. R. Talburt, Y. Su, and L. Yin, OYSTER: A Tool for Entity Resolution in Health Information Exchange, Proceedings of the 5th International Conference on Cooperation and Promotion of Information Resources in Science and Technology, pp.358-364, 2010.

M. Zitt and E. Bassecoulard, Challenges for scientometric indicators: data demining, knowledge-flow measurements and diversity issues, Ethics in Science and Environmental Politics, vol.8, pp.49-60, 2008.
DOI : 10.3354/esep00092