M. R. Anderson and M. J. Cafarella, Input selection for fast feature engineering, 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pp.577-588, 2016.
DOI : 10.1109/ICDE.2016.7498272

P. Andritsos, R. J. Miller, and P. Tsaparas, Information-theoretic tools for mining database structure from large data sets, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, SIGMOD '04, pp.731-742, 2004.
DOI : 10.1145/1007568.1007650
URL : http://www.cs.uiuc.edu/class/fa05/cs591han/sigmodpods04/sigmod/pdf/R-674.pdf

T. Antonopoulos, F. Neven, and F. Servais, Definability problems for graph query languages, Proceedings of the 16th International Conference on Database Theory, ICDT '13, pp.141-152, 2013.
DOI : 10.1145/2448496.2448514
URL : http://www.edbt.org/Proceedings/2013-Genova/papers/icdt/a13-antonopoulos.pdf

A. Assadi, T. Milo, and S. Novgorodov, DANCE: Data Cleaning with Constraints and Experts, 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pp.1409-1410, 2017.
DOI : 10.1109/ICDE.2017.199

S. H. Bach, B. Dawei-he, A. Ratner, and C. Ré, Learning the Structure of Generative Models without Labeled Data, pp.273-282, 2017.

M. Bergman, T. Milo, S. Novgorodov, and W. Tan, Query-Oriented Data Cleaning with Oracles, Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pp.1199-1214, 2015.
DOI : 10.1145/2588555.2594515

L. Berti-Equille, T. Dasu, and D. Srivastava, Discovery of complex glitch patterns: A novel approach to Quantitative Data Cleaning, 2011 IEEE 27th International Conference on Data Engineering, pp.733-744, 2011.
DOI : 10.1109/ICDE.2011.5767864

L. Berti-Equille, J. M. Loh, and T. Dasu, A Masking Index for Quantifying Hidden Glitches, 2013 IEEE 13th International Conference on Data Mining, pp.253-277, 2015.
DOI : 10.1109/ICDM.2013.16
URL : http://www.research.att.com/export/sites/att_labs/techdocs/TD_101229.pdf

G. J. Bex, W. Gelade, F. Neven, and S. Vansummeren, Learning Deterministic Regular Expressions for the Inference of Schemas from XML Data, pp.825-834, 2008.
DOI : 10.1145/1367497.1367609
URL : http://alpha.uhasselt.be/~lucg6377/publications/www08.pdf

M. Bilenko, B. Kamath, and R. J. Mooney, Adaptive Blocking: Learning to Scale Up Record Linkage, Sixth International Conference on Data Mining (ICDM'06), pp.87-96, 2006.
DOI : 10.1109/ICDM.2006.13
URL : http://www.cs.utexas.edu/users/mbilenko/papers/06-icdm.pdf

M. Bilenko, Learnable Similarity Functions and their Applications to Clustering and Record Linkage, pp.981-982, 2004.

A. Bonifati, R. Ciucanu, and S. Staworko, Learning Join Queries from User Examples, ACM Transactions on Database Systems, vol.40, issue.4, pp.24:1-24:38, 2016.
DOI : 10.1145/2463676.2465320
URL : https://hal.archives-ouvertes.fr/hal-01187986

A. Bonifati, R. Ciucanu, and A. Lemay, Learning Path Queries on Graph Databases, pp.109-120, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01068055

A. Bonifati, U. Comignani, E. Coquery, and R. Thion, Interactive Schema Mapping Specification with Exemplar Tuples, pp.667-682, 2017.
DOI : 10.1145/3035918.3064028

Y. Chung, S. Krishnan, and T. Kraska, A data quality metric (DQM), Proceedings of the VLDB Endowment, vol.10, issue.10, pp.1094-1105, 2017.
DOI : 10.14778/3115404.3115414

S. De, Y. Hu, V. V. Meduri, Y. Chen, and S. Kambhampati, BayesWipe, Journal of Data and Information Quality, vol.8, issue.1, pp.5:1-5:30, 2016.
DOI : 10.1145/2723372.2749430

A. Doan, P. Domingos, and A. Y. Halevy, Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach, pp.509-520, 2001.
DOI : 10.1145/375663.375731

A. Doan, A. Ardalan, J. R. Ballard, S. Das, Y. Govind et al., Human-in-the-Loop Challenges for Entity Matching, Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, HILDA'17, pp.1-12, 2017.
DOI : 10.14778/2536336.2536337

X. L. Dong, L. Berti-Equille, and D. Srivastava, Data fusion: resolving conflicts from multiple sources, Handbook of Data Quality, pp.293-318.
DOI : 10.1007/978-3-642-38562-9_7
URL : http://arxiv.org/pdf/1503.00310

X. L. Dong, L. Berti-Equille, and D. Srivastava, Integrating conflicting data, Proceedings of the VLDB Endowment, vol.2, issue.1, pp.550-561, 2009.
DOI : 10.14778/1687627.1687690

H. Fernau, Algorithms for learning regular expressions from positive data, Information and Computation, vol.207, issue.4, pp.521-541, 2009.
DOI : 10.1016/j.ic.2008.12.008
URL : https://doi.org/10.1016/j.ic.2008.12.008

P. A. Flach and I. Savnik, Database Dependency Discovery: A Machine Learning Approach, AI Commun., vol.12, issue.3, pp.139-160, 1999.

F. Geerts, G. Mecca, P. Papotti, and D. Santoro, The LLUNATIC data-cleaning framework, Proceedings of the VLDB Endowment, vol.6, issue.9, pp.625-636, 2013.
DOI : 10.14778/2536360.2536363
URL : http://www.vldb.org/pvldb/vol6/p625-mecca.pdf

Y. Hu, Q. Wang, D. Vatsalan, and P. Christen, Improving Temporal Record Linkage Using Regression Classification, pp.561-573, 2017.
DOI : 10.1145/2723372.2737789

Z. Jin, M. R. Anderson, M. J. Cafarella, and H. V. Jagadish, Foofah, Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD '17, pp.683-698, 2017.
DOI : 10.1145/2557500.2557523

A. Kimmig, A. Memory, R. J. Miller, and L. Getoor, A Collective Probabilistic Approach to Schema Mapping Discovery, pp.921-932, 2017.
DOI : 10.1109/icde.2017.140
URL : https://lirias.kuleuven.be/bitstream/123456789/575149/3/kimmig-icde17.pdf

S. Krishnan, J. Wang, M. J. Franklin, K. Goldberg, T. Kraska et al., SampleClean: Fast and Reliable Analytics on Dirty Data, IEEE Data Eng. Bull., vol.38, issue.3, pp.59-75, 2015.

S. Krishnan, J. Wang, E. Wu, M. J. Franklin, and K. Goldberg, ActiveClean, Proceedings of the VLDB Endowment, vol.9, issue.12, pp.948-959, 2016.
DOI : 10.14778/2994509.2994514
URL : http://dl.acm.org/ft_gateway.cfm?id=2899409&type=pdf

Y. Li, R. Krishnamurthy, S. Raghavan, S. Vaithyanathan, H. V. Jagadish et al., Regular expression learning for information extraction, Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pp.21-30, 2008.
DOI : 10.3115/1613715.1613719
URL : http://dl.acm.org/ft_gateway.cfm?id=1613719&type=pdf

Y. Park, A. S. Tajik, M. Cafarella, and B. Mozafari, Database Learning, Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD '17, pp.745-758, 2017.
DOI : 10.1145/2588555.2588579
URL : http://arxiv.org/pdf/1703.05468

R. Pradhan, S. Bykau, and S. Prabhakar, Staging User Feedback toward Rapid Conflict Resolution in Data Fusion, Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD '17, 2017.
DOI : 10.14778/2536360.2536374

A. Ratner, S. H. Bach, H. R. Ehrenberg, J. A. Fries, S. Wu et al., Snorkel: A System for Lightweight Extraction, 2017.

C. Ré, D. Agrawal, M. Balazinska, M. J. Cafarella, M. I. Jordan et al., Machine Learning and Databases: The Sound of Things to Come or a Cacophony of Hype?, SIGMOD, pp.283-284, 2015.

M. Schleich, D. Olteanu, and R. Ciucanu, Learning Linear Regression Models over Factorized Joins, Proceedings of the 2016 International Conference on Management of Data, SIGMOD '16, pp.3-18, 2016.
DOI : 10.14778/2809974.2809991
URL : https://hal.archives-ouvertes.fr/hal-01330113

A. Rostin, O. Albrecht, J. Bauckmann, F. Naumann, and U. Leser, A Machine Learning Approach to Foreign Key Discovery, WebDB@SIGMOD, 2009.

S. Song, C. Li, and X. Zhang, Turn Waste into Wealth, Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '15, pp.1115-1124, 2015.
DOI : 10.1145/1093382.1093385

S. Thirumuruganathan, L. Berti-Equille, M. Ouzzani, J.-A. Quiané-Ruiz, and N. Tang, UGuide, Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD '17, pp.1385-1397, 2017.
DOI : 10.1145/1645953.1646135

D. Van Aken, A. Pavlo, G. J. Gordon, and B. Zhang, Automatic Database Management System Tuning Through Large-scale Machine Learning, Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD '17, pp.1009-1024, 2017.
DOI : 10.1145/2588555.2593678

J. Wang, S. Krishnan, M. J. Franklin, K. Goldberg, T. Kraska et al., A sample-and-clean framework for fast and accurate query processing on dirty data, Proceedings of the 2014 ACM SIGMOD international conference on Management of data, SIGMOD '14, pp.469-480, 2014.
DOI : 10.1145/2588555.2610505
URL : http://goldberg.berkeley.edu/pubs/sampleclean-sigmod14.pdf

S. E. Whang, D. Marmaros, and H. Garcia-Molina, Pay-As-You-Go Entity Resolution, IEEE Transactions on Knowledge and Data Engineering, vol.25, issue.5, pp.1111-1124, 2013.
DOI : 10.1109/TKDE.2012.43

M. Yakout, L. Berti-Equille, and A. K. Elmagarmid, Don't be SCAREd, Proceedings of the 2013 international conference on Management of data, SIGMOD '13, pp.553-564, 2013.
DOI : 10.1145/2463676.2463706

A. Zhang, S. Song, J. Wang, and P. S. Yu, Time series data cleaning, Proceedings of the VLDB Endowment, vol.10, issue.10, pp.1046-1057, 2017.
DOI : 10.14778/3115404.3115410

C. J. Zhang, Z. Zhao, L. Chen, H. V. Jagadish, and C. C. Cao, CrowdMatcher, Proceedings of the 2014 ACM SIGMOD international conference on Management of data, SIGMOD '14, pp.721-724, 2014.
DOI : 10.1145/2588555.2594515