Conference paper. Year: 2015

Big data in practice: the example of nilearn for mining brain imaging data

Abstract

This talk will present useful patterns and lessons learned for efficient application of machine learning to data too big to fit in memory, in the context of brain imaging applications. I will discuss the design of a versatile library for mining big datasets of brain images, nilearn, and the data patterns it relies upon, implemented in scikit-learn, for machine learning, and joblib, for data flow. In particular, I will show how to go from general principles of efficient data processing to a simple vocabulary that can be understood by end users and applied to raw data. I will also discuss human factors: a big data library should be usable by end users to solve their application needs.

In particular this talk will cover:

- out-of-core computing relying on the user's data files or disk-based cache
- online and on-the-fly data reduction, to get the gist out of big data
- simple chaining of operations, with caching, for a versatile processing pipeline
- API-design to steer non-expert users to data-efficient code
- reliable documentation through examples of a library for heavy-duty data processing

The high-level concepts introduced will be illustrated with detailed technical discussions of Python implementations, based both on examples using scikit-learn, joblib and nilearn and on an analysis of how these libraries work. The focus here is sharing insights gained while using and developing them, as many of the lessons are not specific to brain imaging.

Links:

- nilearn: https://nilearn.github.io
- scikit-learn: http://scikit-learn.org/stable/
- joblib: https://pythonhosted.org/joblib/
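
As a rough illustration of three of the patterns named in the abstract (out-of-core access, online data reduction, and disk-based caching), here is a minimal sketch. It is not taken from the talk or from nilearn's code. It assumes that NumPy, joblib and scikit-learn are installed, and it uses a small random array as a hypothetical stand-in for brain imaging data. The function name `reduce_data` and its parameters are made up for this example.

```python
import os
import tempfile

import numpy as np
from joblib import Memory
from sklearn.decomposition import IncrementalPCA

# Hypothetical stand-in for a large dataset: a (samples x features) array
# saved to disk so it can be memory-mapped instead of loaded in one go.
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "data.npy")
rng = np.random.RandomState(0)
np.save(data_path, rng.normal(size=(1000, 50)))

# Disk-based cache: results are stored under `workdir` and reused when the
# function is called again with the same arguments.
memory = Memory(location=os.path.join(workdir, "cache"), verbose=0)


@memory.cache
def reduce_data(path, n_components=5, chunk_size=200):
    """Fit an incremental PCA chunk by chunk and return the reduced data."""
    data = np.load(path, mmap_mode="r")  # memory-mapped: chunks read on demand
    starts = range(0, data.shape[0], chunk_size)
    ipca = IncrementalPCA(n_components=n_components)
    # Online reduction: the model is updated with one chunk at a time.
    for start in starts:
        ipca.partial_fit(data[start:start + chunk_size])
    # Second pass: project each chunk onto the learned components.
    return np.vstack(
        [ipca.transform(data[start:start + chunk_size]) for start in starts]
    )


reduced = reduce_data(data_path)        # computed, then written to the cache
reduced_again = reduce_data(data_path)  # same arguments: read from the cache
print(reduced.shape)
```

Only one chunk of the input is processed at a time, while the much smaller reduced array is held in memory and stored by the cache. The chunk size and number of components here are arbitrary choices for the sketch, not recommendations.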
No file deposited

Dates and versions

hal-01207106, version 1 (30-09-2015)

Identifiers

  • HAL Id: hal-01207106, version 1

Cite

Loïc Estève. Big data in practice: the example of nilearn for mining brain imaging data. Scipy 2015, Jul 2015, Austin, Texas, United States. ⟨hal-01207106⟩
419 views
0 downloads
