The Fourth Paradigm: Data-Intensive Scientific Discovery, 2009.
DOI : 10.1007/978-3-642-33299-9_1
MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492
Resilient Distributed Datasets, NSDI'12: The 9th USENIX Symposium on Networked Systems Design and Implementation, pp.15-28, 2012.
DOI : 10.1145/2886107.2886110
Towards Memory-Optimized Data Shuffling Patterns for Big Data Analytics, 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.409-412, 2016.
DOI : 10.1109/CCGrid.2016.85
URL : https://hal.archives-ouvertes.fr/hal-01355227
Encapsulation of parallelism in the Volcano query processing system, SIGMOD '90: The ACM SIGMOD International Conference on Management of Data, pp.102-111, 1990.
An overview of DB2 parallel edition, ACM SIGMOD Record, vol.24, issue.2, pp.460-462, 1995.
DOI : 10.1145/568271.223876
Understanding Vertical Scalability of I/O Virtualization for MapReduce Workloads: Challenges and Opportunities, BigDataCloud '13: 2nd Workshop on Big Data Management in Clouds (held in conjunction with EuroPar'13), 2013.
DOI : 10.1007/978-3-642-54420-0_1
URL : https://hal.archives-ouvertes.fr/hal-00856877
Workload characterization on a production Hadoop cluster: A case study on Taobao, 2012 IEEE International Symposium on Workload Characterization (IISWC), pp.3-13, 2012.
DOI : 10.1109/IISWC.2012.6402895
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.360.5580
Investigation of data locality and fairness in MapReduce, MapReduce '12: The Third International Workshop on MapReduce and Its Applications, pp.25-32, 2012.
DynMR, Proceedings of the Ninth European Conference on Computer Systems, EuroSys '14, pp.1-2, 2014.
DOI : 10.1145/2592798.2592805
Optimizing Data Shuffling in Data-Parallel Computation by Understanding User-defined Functions, NSDI'12: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, pp.1-2214, 2012.
Making sense of performance in data analytics frameworks, NSDI'15: The 12th USENIX Conference on Networked Systems Design and Implementation, pp.293-307, 2015.
MRONLINE, Proceedings of the 23rd international symposium on High-performance parallel and distributed computing, HPDC '14, pp.165-176, 2014.
DOI : 10.1145/2600212.2600229
Improving MapReduce performance in heterogeneous environments with adaptive task tuning, Proceedings of the 15th International Middleware Conference, Middleware '14, pp.97-108, 2014.
DOI : 10.1145/2663165.2666089
A storage-centric analysis of MapReduce workloads: File popularity, temporal locality and arrival patterns, IISWC '12: Proceedings of the 2012 IEEE International Symposium on Workload Characterization, pp.100-109, 2012.
Metadata Traces and Workload Models for Evaluating Big Storage Systems, 2012 IEEE Fifth International Conference on Utility and Cloud Computing, pp.125-132, 2012.
DOI : 10.1109/UCC.2012.27
BlobSeer: Next-generation data management for large scale infrastructures, Journal of Parallel and Distributed Computing, vol.71, issue.2, pp.169-184, 2011.
DOI : 10.1016/j.jpdc.2010.08.004
URL : https://hal.archives-ouvertes.fr/inria-00511414
Tachyon, Proceedings of the ACM Symposium on Cloud Computing, SOCC '14, pp.1-6, 2014.
DOI : 10.1145/2670979.2670985
SOR-HDFS: A SEDA-based approach to maximize overlapping in RDMA-enhanced HDFS, HPDC '14: The 23rd International Symposium on High-performance Parallel and Distributed Computing, pp.261-264, 2014.
Enabling Big Data Analytics in the Hybrid Cloud Using Iterative MapReduce, UCC'15: 8th IEEE/ACM International Conference on Utility and Cloud Computing, pp.290-299, 2015.
Towards Transparent Throughput Elasticity for IaaS Cloud Storage, International Journal of Distributed Systems and Technologies, vol.6, issue.4, pp.21-44, 2015.
DOI : 10.4018/IJDST.2015100102
URL : https://hal.archives-ouvertes.fr/hal-01199464
The Efficiency of MapReduce in Parallel External Memory, LATIN'12: Proceedings of the 10th Latin American International Conference on Theoretical Informatics, pp.433-445, 2012.
DOI : 10.1007/978-3-642-29344-3_37
HPMR: Prefetching and pre-shuffling in shared MapReduce computation environment, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-8, 2009.
DOI : 10.1109/CLUSTR.2009.5289171
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.544.6768
NUMA-aware algorithms: the case of data shuffling, CIDR '13: The 6th Biennial Conference on Innovative Data Systems Research, 2013.
HOMR, Proceedings of the 28th ACM international conference on Supercomputing, ICS '14, pp.33-42, 2014.
DOI : 10.1145/2597652.2597684
Accelerating Spark with RDMA for Big Data Processing: Early Experiences, 2014 IEEE 22nd Annual Symposium on High-Performance Interconnects, pp.9-16, 2014.
DOI : 10.1109/HOTI.2014.15
Managing data transfers in computer clusters with Orchestra, ACM SIGCOMM Computer Communication Review, vol.41, issue.4, pp.98-109, 2011.
DOI : 10.1145/2043164.2018448
Optimizing shuffle performance in Spark, 2013.
On the benefits of transparent compression for cost-effective cloud data storage, Transactions on Large-Scale Data- and Knowledge-Centered Systems, pp.167-184, 2011.