Data mining with big data, IEEE Transactions on Knowledge and Data Engineering, vol.26, issue.1, pp.97-107, 2014.
Tachyon: Reliable, memory speed storage for cluster computing frameworks, Proceedings of the ACM Symposium on Cloud Computing, pp.1-15, 2014.
NoSQL database: New era of databases for big data analytics-classification, characteristics and comparison, 2013.
Locality and availability in distributed storage, IEEE Transactions on Information Theory, vol.62, issue.8, pp.4481-4493, 2016.
Scatter/Gather: A cluster-based approach to browsing large document collections, ACM SIGIR Forum, vol.51, pp.148-159, 2017.
Characterizing and profiling scientific workflows, Future Generation Computer Systems, vol.29, issue.3, pp.682-692, 2013.
NERSC storage trends and summaries, 2017.
A benchmark simulation for moist nonhydrostatic numerical models, Monthly Weather Review, vol.130, issue.12, pp.2917-2928, 2002.
The universe at extreme scale: multi-petaflop sky simulation on the BG/Q, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, p.4, 2012.
Montage: An on-demand image mosaic service for the NVO, Astronomical Data Analysis Software and Systems XII, vol.295, p.343, 2003.
CyberShake: A physics-based seismic hazard model for Southern California, Pure and Applied Geophysics, vol.168, issue.3-4, pp.367-381, 2011.
LIGO: The laser interferometer gravitational-wave observatory, pp.325-333, 1992.
Damaris: How to efficiently leverage multicore parallelism to achieve scalable, jitter-free I/O, Cluster Computing (CLUSTER), 2012 IEEE International Conference on, pp.155-163, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00715252
Towards multi-site metadata management for geographically distributed cloud workflows, Cluster Computing (CLUSTER), 2015 IEEE International Conference on, pp.294-303, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01239150
An optimized approach for storing and accessing small files on cloud storage, Journal of Network and Computer Applications, vol.35, issue.6, pp.1847-1862, 2012.
Small-file access in parallel file systems, Parallel & Distributed Processing, pp.1-11, 2009.
The small files problem, Cloudera Blog, 2009.
Improving the efficiency of storing for small files in HDFS, Computer Science & Service System (CSSS), 2012 International Conference on, pp.2239-2242, 2012.
Improving metadata management for small files in HDFS, Cluster Computing and Workshops, pp.1-4, 2009.
Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, pp.654-663, 1997.
Simulation of dynamic data replication strategies in data grids, Parallel and Distributed Processing Symposium, p.10, 2003.
Chord: A scalable peer-to-peer lookup service for internet applications, ACM SIGCOMM Computer Communication Review, vol.31, issue.4, pp.149-160, 2001.
Dynamo: Amazon's highly available key-value store, ACM SIGOPS Operating Systems Review, vol.41, issue.6, pp.205-220, 2007.
Towards efficient location and placement of dynamic replicas for geo-distributed data stores, Proceedings of the ACM 7th Workshop on Scientific Cloud Computing, pp.3-9, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01304328
Keeping up with storage: decentralized, write-enabled dynamic geo-replication, Future Generation Computer Systems, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01617658
Dynamic data replication across geo-distributed cloud data centres, International Conference on Distributed Computing and Internet Technology, pp.182-187, 2016.
Týr: blob storage meets built-in transactions, High Performance Computing, Networking, Storage and Analysis, SC16: International Conference for, pp.573-584, 2016.
Lustre: Building a file system for 1000-node clusters, Proceedings of the 2003 Linux symposium, pp.380-386, 2003.
OrangeFS: Advancing PVFS, FAST poster session, 2011.
Grid'5000-Rennes Hardware (Paravance), accessed 2017.
PVFS: A parallel file system for Linux clusters, Proceedings of the 4th annual Linux showcase and conference, pp.391-430, 2000.
The Hadoop distributed file system, Mass storage systems and technologies (MSST), 2010 IEEE 26th symposium on, pp.1-10, 2010.
Hadoop Documentation-Archives, 2017.
Reduction of data at namenode in HDFS using harballing technique, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), vol.1, issue.4, p.635, 2012.
Improving performance of small-file accessing in Hadoop, Computer Science and Software Engineering (JCSSE), pp.200-205, 2014.
HDF5: A file format and I/O library for high performance computing applications, Proceedings of Supercomputing, vol.99, pp.5-33, 1999.
Metadata management in distributed file systems, 2017.
Riak core: Building distributed applications without shared state, ACM SIGPLAN Commercial Users of Functional Programming, p.14, 2010.
Ceph: A scalable, high-performance distributed file system, Proceedings of the 7th symposium on Operating systems design and implementation. USENIX Association, pp.307-320, 2006.
Direct lookup and hash-based metadata placement for local file systems, Proceedings of the 6th International Systems and Storage Conference, p.5, 2013.
Efficient computation of frequent and top-k elements in data streams, International Conference on Database Theory, pp.398-412, 2005.
Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing, Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association, pp.2-2, 2012.
SparkBench: a comprehensive benchmarking suite for in memory data analytic platform Spark, Proceedings of the 12th ACM International Conference on Computing Frontiers, p.53, 2015.
Capture, conversion, and analysis of an intense NFS workload, FAST, vol.9, pp.139-152, 2009.