A. Agresti, Categorical Data Analysis, 2002.

C. F. Aliferis, A. R. Statnikov, I. Tsamardinos, S. Mani, and X. D. Koutsoukos, Local causal and Markov blanket induction for causal discovery and feature selection for classification part I: Algorithms and empirical evaluation, Journal of Machine Learning Research, vol.11, pp.171-234, 2010.

E. Alvares-Cherman, J. Metz, and M. Monard, Incorporating label dependency into the binary relevance framework for multilabel classification, Expert Systems With Applications, pp.1647-1655, 2011.

A. P. Armen and I. Tsamardinos, A unified approach to estimation and control of the false discovery rate in Bayesian network skeleton identification, European Symposium on Artificial Neural Networks, ESANN, 2011.

A. Aussem, S. Rodrigues de Morais, and M. Corbex, Analysis of nasopharyngeal carcinoma risk factors with Bayesian networks, Artificial Intelligence in Medicine, vol.54, issue.1, 2012.
DOI : 10.1016/j.artmed.2011.09.002
URL : https://hal.archives-ouvertes.fr/hal-00411518

A. Aussem, A. Tchernof, S. Rodrigues de Morais, and S. Rome, Analysis of lifestyle and metabolic predictors of visceral obesity with Bayesian Networks, BMC Bioinformatics, vol.11, issue.1, p.487, 2010.
DOI : 10.1186/1471-2105-11-487
URL : https://hal.archives-ouvertes.fr/inserm-00663887

A. Badea, Determining the direction of causal influence in large probabilistic networks: A constraint-based approach, Proceedings of the Sixteenth European Conference on Artificial Intelligence, pp.263-267, 2004.

A. Bernard and A. Hartemink, Informative structure priors: Joint learning of dynamic regulatory networks from multiple types of data, Biocomputing 2005, pp.459-470, 2005.
DOI : 10.1142/9789812702456_0044

H. Blockeel, L. De Raedt, and J. Ramon, Top-down induction of clustering trees, International Conference on Machine Learning, ICML, pp.55-63, 1998.

H. Borchani, C. Bielza, C. Toro, and P. Larrañaga, Predicting human immunodeficiency virus inhibitors using multi-dimensional Bayesian network classifiers, Artificial Intelligence in Medicine, vol.57, issue.3, pp.219-229, 2013.
DOI : 10.1016/j.artmed.2012.12.005

L. Breiman, Random forests, Machine Learning, pp.5-32, 2001.

L. E. Brown and I. Tsamardinos, A strategy for making predictions under manipulation, Journal of Machine Learning Research JMLR, vol.3, pp.35-52, 2008.

W. Buntine, Theory Refinement on Bayesian Networks, Uncertainty in Artificial Intelligence, UAI, pp.52-60, 1991.
DOI : 10.1016/B978-1-55860-203-8.50010-3

G. Cawley, Causal and non-causal feature selection for ridge regression, Conference Proceedings, pp.107-128, 2008.

J. Cheng, R. Greiner, J. Kelly, D. A. Bell, and W. Liu, Learning Bayesian networks from data: An information-theory based approach, Artificial Intelligence, vol.137, issue.1-2, pp.43-90, 2002.
DOI : 10.1016/S0004-3702(02)00191-1

D. Chickering, D. Heckerman, and C. Meek, Large-sample learning of Bayesian networks is NP-hard, Journal of Machine Learning Research, JMLR, vol.5, pp.1287-1330, 2004.

D. M. Chickering, Optimal structure identification with greedy search, Journal of Machine Learning Research JMLR, vol.3, pp.507-554, 2002.

J. Cussens and M. Bartlett, Advances in Bayesian Network Learning using Integer Programming, Uncertainty in Artificial Intelligence, pp.182-191, 2013.

K. Dembczyński, W. Waegeman, W. Cheng, and E. Hüllermeier, On label dependence and loss minimization in multi-label classification, Machine Learning, pp.5-45, 2012.
DOI : 10.1007/s10994-012-5285-8

B. Ellis and W. H. Wong, Learning Causal Bayesian Network Structures From Experimental Data, Journal of the American Statistical Association, vol.103, issue.482, pp.778-789, 2008.
DOI : 10.1198/016214508000000193

N. Friedman, I. Nachman, and D. Pe'er, Learning Bayesian network structure from massive datasets: The sparse candidate algorithm, Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pp.206-215, 1999.

N. Friedman, I. Nachman, and D. Pe'er, Learning Bayesian network structure from massive datasets: the "sparse candidate" algorithm, Uncertainty in Artificial Intelligence, UAI, pp.21-30, 1999.

M. Gasse, A. Aussem, and H. Elghazel, An Experimental Comparison of Hybrid Algorithms for Bayesian Network Structure Learning, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD, pp.58-73, 2012.
DOI : 10.1007/978-3-642-33460-3_9
URL : https://hal.archives-ouvertes.fr/hal-01122771

Q. Gu, Z. Li, and J. Han, Correlated multi-label feature selection, Proceedings of the 20th ACM international conference on Information and knowledge management, CIKM '11, pp.1087-1096, 2011.
DOI : 10.1145/2063576.2063734

Y. Guo and S. Gu, Multi-label classification using conditional dependency networks, International Joint Conference on Artificial Intelligence, IJCAI, pp.1300-1305, 2011.

D. Heckerman, D. Geiger, and D. Chickering, Learning Bayesian networks: The combination of knowledge and statistical data, Machine Learning, pp.197-243, 1995.

D. Kocev, C. Vens, J. Struyf, and S. Džeroski, Ensembles of Multi-Objective Decision Trees, European Conference on Machine Learning, ECML, pp.624-631, 2007.
DOI : 10.1007/978-3-540-74958-5_61

M. Koivisto and K. Sood, Exact Bayesian structure discovery in Bayesian networks, Journal of Machine Learning Research, JMLR, vol.5, pp.549-573, 2004.

K. Kojima, E. Perrier, S. Imoto, and S. Miyano, Optimal search on clustered structural constraint for learning Bayesian network structure, Journal of Machine Learning Research, vol.11, pp.285-310, 2010.

D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, 2009.

D. Koller and M. Sahami, Toward optimal feature selection, International Conference on Machine Learning, ICML, pp.284-292, 1996.

A. Liaw and M. Wiener, Classification and regression by randomForest, R News, vol.2, pp.18-22, 2002.

O. Luaces, J. Díez, J. Barranquero, J. J. del Coz, and A. Bahamonde, Binary relevance efficacy for multilabel classification, Progress in Artificial Intelligence, vol.40, issue.7, pp.303-313, 2012.
DOI : 10.1007/s13748-012-0030-x

G. Madjarov, D. Kocev, D. Gjorgjevikj, and S. Džeroski, An extensive experimental comparison of methods for multi-label learning, Pattern Recognition, vol.45, issue.9, pp.3084-3104, 2012.
DOI : 10.1016/j.patcog.2012.03.004

O. Maron and A. L. Ratan, Multiple-Instance Learning for Natural Scene Classification, International Conference on Machine Learning, ICML, pp.341-349, 1998.

A. McCallum, Multi-label text classification with a mixture model trained by EM, AAAI Workshop on Text Learning, 1999.

A. Moore and W. Wong, Optimal reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning, International Conference on Machine Learning, ICML, 2003.

R. Nagarajan, M. Scutari, and S. Lèbre, Bayesian Networks in R: with Applications in Systems Biology, 2013.
DOI : 10.1007/978-1-4614-6446-4

R. E. Neapolitan, Learning Bayesian Networks, 2004.
DOI : 10.1016/B978-012370477-1.50021-9

S. Ott, S. Imoto, and S. Miyano, Finding optimal models for small gene networks, Biocomputing 2004, pp.557-567, 2004.
DOI : 10.1142/9789812704856_0052

J. Peña, R. Nilsson, J. Björkegren, and J. Tegnér, Towards scalable and data efficient learning of Markov boundaries, International Journal of Approximate Reasoning, vol.45, issue.2, pp.211-232, 2007.
DOI : 10.1016/j.ijar.2006.06.008

J. M. Peña, J. Björkegren, and J. Tegnér, Growing Bayesian network models of gene networks from seed genes, Bioinformatics, vol.21, issue.Suppl 2, pp.224-229, 2005.
DOI : 10.1093/bioinformatics/bti1137

J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, 1988.

J. M. Peña, Learning Gaussian graphical models of gene networks with false discovery rate control, European Conference on Evolutionary Computation, pp.165-176, 2008.

J. M. Peña, Finding consensus Bayesian network structures, Journal of Artificial Intelligence Research, vol.42, pp.661-687, 2012.

E. Perrier, S. Imoto, and S. Miyano, Finding optimal Bayesian network given a super-structure, Journal of Machine Learning Research, JMLR, vol.9, pp.2251-2286, 2008.

D. Pe'er, A. Regev, G. Elidan, and N. Friedman, Inferring subnetworks from perturbed expression profiles, Bioinformatics, vol.17, issue.Suppl 1, pp.215-224, 2001.
DOI : 10.1093/bioinformatics/17.suppl_1.S215

E. Prestat, S. Rodrigues de Morais, J. Vendrell, A. Thollet, C. Gautier et al., Learning the local Bayesian network structure around the ZNF217 oncogene in breast tumours, Computers in Biology and Medicine, pp.334-341, 2013.
DOI : 10.1016/j.compbiomed.2012.12.002
URL : https://hal.archives-ouvertes.fr/hal-00851231

R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, 2013.

J. Read, B. Pfahringer, G. Holmes, and E. Frank, Classifier Chains for Multi-label Classification, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD, pp.254-269, 2009.
DOI : 10.1007/978-3-642-04174-7_17

S. Rodrigues de Morais and A. Aussem, An efficient learning algorithm for local Bayesian network structure discovery, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD, pp.164-169, 2010.

S. Rodrigues de Morais and A. Aussem, A novel Markov boundary based feature subset selection algorithm, Neurocomputing, vol.73, issue.4-6, pp.578-584, 2010.
DOI : 10.1016/j.neucom.2009.05.018
URL : https://hal.archives-ouvertes.fr/hal-00383776

V. Roth and B. Fischer, Improved functional prediction of proteins by learning kernel combinations in multilabel settings, BMC Bioinformatics, vol.8, issue.Suppl 2, p.12, 2007.
DOI : 10.1186/1471-2105-8-S2-S12

G. E. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

M. Scutari, Learning Bayesian networks with the bnlearn R package, Journal of Statistical Software, vol.35, pp.1-22, 2010.

M. Scutari, Measures of Variability for Graphical Models, Ph.D. thesis, School in Statistical Sciences, 2011.

M. Scutari and A. Brogini, Bayesian Network Structure Learning with Permutation Tests, Communications in Statistics - Theory and Methods, vol.41, issue.16-17, pp.3233-3243, 2012.

M. Scutari and R. Nagarajan, Identifying significant edges in graphical models of molecular networks, Artificial Intelligence in Medicine, vol.57, issue.3, pp.207-217, 2013.
DOI : 10.1016/j.artmed.2012.12.006

T. Silander and P. Myllymäki, A Simple Approach for Finding the Globally Optimal Bayesian Network Structure, Uncertainty in Artificial Intelligence, UAI, pp.445-452, 2006.

C. Snoek, M. Worring, J. V. Gemert, J. Geusebroek, and A. Smeulders, The challenge problem for automated detection of 101 semantic concepts in multimedia, Proceedings of the 14th annual ACM international conference on Multimedia, MULTIMEDIA '06, pp.421-430, 2006.
DOI : 10.1145/1180639.1180727

P. Spirtes, C. Glymour, and R. Scheines, Causation, Prediction, and Search, 2000.
DOI : 10.1007/978-1-4612-2748-9

N. Spolaôr, E. A. Cherman, M. C. Monard, and H. D. Lee, A Comparison of Multi-label Feature Selection Methods using the Problem Transformation Approach, Electronic Notes in Theoretical Computer Science, vol.292, pp.135-151, 2013.
DOI : 10.1016/j.entcs.2013.02.010

M. Studený and D. Haws, Learning Bayesian network structure: Towards the essential graph by integer linear programming tools, International Journal of Approximate Reasoning, vol.55, issue.4, pp.1043-1071, 2014.
DOI : 10.1016/j.ijar.2013.09.016

I. Tsamardinos, C. Aliferis, and A. Statnikov, Algorithms for large scale Markov blanket discovery, Florida Artificial Intelligence Research Society Conference FLAIRS'03, pp.376-381, 2003.

I. Tsamardinos and G. Borboudakis, Permutation Testing Improves Bayesian Network Learning, European Conference on Machine Learning and Knowledge Discovery in Databases, ECML-PKDD, pp.322-337, 2010.
DOI : 10.1007/978-3-642-15939-8_21

I. Tsamardinos, L. Brown, and C. Aliferis, The max-min hill-climbing Bayesian network structure learning algorithm, Machine Learning, pp.31-78, 2006.
DOI : 10.1007/s10994-006-6889-7

I. Tsamardinos and L. E. Brown, Bounding the false discovery rate in local Bayesian network learning, AAAI Conference on Artificial Intelligence, pp.1100-1105, 2008.

G. Tsoumakas, I. Katakis, and I. Vlahavas, Mining Multilabel Data. Transformation, pp.1-20, 2010.

G. Tsoumakas, I. Katakis, and I. Vlahavas, Random k-labelsets for Multi-Label Classification, IEEE Transactions on Knowledge and Data Engineering, TKDE, vol.23, pp.1-12, 2010.

G. Tsoumakas and I. Vlahavas, Random k-Labelsets: An Ensemble Method for Multilabel Classification, Proceedings of the 18th European Conference on Machine Learning, pp.406-417, 2007.
DOI : 10.1007/978-3-540-74958-5_38

E. Villanueva and C. Maciel, Optimized algorithm for learning Bayesian network superstructures, International Conference on Pattern Recognition Applications and Methods, ICPRAM, 2012.

E. Villanueva and C. D. Maciel, Efficient methods for learning Bayesian network super-structures, Neurocomputing, vol.123, pp.3-12, 2014.
DOI : 10.1016/j.neucom.2012.10.035

M. L. Zhang and K. Zhang, Multi-label learning by exploiting label dependency, Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '10, p.999, 2010.
DOI : 10.1145/1835804.1835930

M. Zhang and Z. Zhou, Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization, IEEE Transactions on Knowledge and Data Engineering, vol.18, issue.10, pp.1338-1351, 2006.
DOI : 10.1109/TKDE.2006.162