Exascale computing study: Technology challenges in achieving exascale systems, DARPA Information Processing Techniques Office, 2008. ,
A survey of architectural techniques for improving cache power efficiency, Sustainable Computing: Informatics and Systems. ,
Scaling the bandwidth wall: challenges in and avenues for CMP scaling. ,
Phase change memory: From devices to systems. ,
A software approach for combating asymmetries of non-volatile memories, ISLPED, 2012. ,
High-endurance hybrid cache design in CMP architecture with cache partitioning and access-aware policy, GLSVLSI, 2013. ,
S. Mittal, Energy Saving Techniques for Phase Change Memory (PCM), pp.3-60, 2009. ,
Hybrid cache architecture with disparate memory technologies, ISCA, pp.34-45, 2009. ,
DOI : 10.1145/1555815.1555761
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.463.8280
Technology Comparison for Large Last-Level Caches (L3Cs): Low-Leakage SRAM, Low Write-Energy STT-RAM, and Refresh-Optimized eDRAM, pp.143-154, 2013. ,
Dynamically reconfigurable hybrid cache: An energy-efficient last-level cache design, DATE, pp.45-50, 2012. ,
Relaxing non-volatility for fast and energy-efficient STT-RAM caches, 2011 IEEE 17th International Symposium on High Performance Computer Architecture, 2011. ,
DOI : 10.1109/HPCA.2011.5749716
Bandwidth-aware reconfigurable cache design with hybrid memory technologies, 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2011. ,
DOI : 10.1109/ICCAD.2011.6105304
Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement, Proceedings of the 45th annual conference on Design automation, DAC '08, pp.554-559, 2008. ,
DOI : 10.1145/1391469.1391610
After Hard Drives—What Comes Next?, IEEE Transactions on Magnetics, vol.45, issue.10, pp.3406-3413, 2009. ,
DOI : 10.1109/TMAG.2009.2024163
TapeCache, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pp.185-190, 2012. ,
DOI : 10.1145/2333660.2333707
An energy efficient cache design using spin torque transfer (STT) RAM, Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design, ISLPED '10, 2010. ,
DOI : 10.1145/1840845.1840931
Power7: IBM's next-generation server processor, pp.7-15, 2010. ,
DOI : 10.1109/mm.2010.38
Embedded DRAM: Technology platform for the Blue Gene/L chip, IBM Journal of Research and Development, vol.49, issue.2.3, pp.333-350, 2005. ,
DOI : 10.1147/rd.492.0333
5.9 Haswell: A family of IA 22nm processors, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pp.112-113, 2014. ,
DOI : 10.1109/ISSCC.2014.6757361
A 500 MHz Random Cycle, 1.5 ns Latency, SOI Embedded DRAM Macro Featuring a Three-Transistor Micro Sense Amplifier, IEEE Journal of Solid-State Circuits, vol.43, issue.1, pp.86-95, 2008. ,
DOI : 10.1109/JSSC.2007.908006
Reducing cache power with low-cost, multi-bit error-correcting codes, pp.83-93, 2010. ,
Versatile refresh: low complexity refresh scheduling for high-throughput multi-banked eDRAM, ACM SIGMETRICS PER, pp.247-258, 2012. ,
Spin-transfer torque MRAM (STT-MRAM): Challenges and prospects, AAPPS Bulletin, vol.18, issue.6, pp.33-40, 2008. ,
Cache revive, Proceedings of the 49th Annual Design Automation Conference on, DAC '12, pp.243-252, 2012. ,
DOI : 10.1145/2228360.2228406
An overview of non-volatile memory technology and the implication for tools and architectures, DATE, pp.731-736, 2009. ,
Bi-layered RRAM with unlimited endurance and extremely uniform switching, VLSIT, pp.52-53, 2011. ,
NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Non-volatile Memory, IEEE TCAD, vol.31, issue.7, pp.994-1007, 2012. ,
DOI : 10.1007/978-1-4419-9551-3_2
WAP: Improving non-volatile cache lifetime by reducing inter- and intra-set write variations, HPCA, pp.234-245, 2013. ,
Energy- and endurance-aware design of phase change memory caches, DATE, pp.136-141, 2010. ,
Avoiding unnecessary write operations in STT-MRAM for low power implementation, Fifteenth International Symposium on Quality Electronic Design, 2014. ,
DOI : 10.1109/ISQED.2014.6783375
Asynchronous asymmetrical write termination (AAWT) for a low power STT-MRAM, DATE, 2014. ,
Implementing a hybrid SRAM / eDRAM NUCA architecture, 2011 18th International Conference on High Performance Computing, pp.1-10, 2011. ,
DOI : 10.1109/HiPC.2011.6152738
An hybrid eDRAM/SRAM macrocell to implement first-level data caches, Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42, 2009. ,
DOI : 10.1145/1669112.1669140
Exploiting reuse information to reduce refresh energy in on-chip eDRAM caches, Proceedings of the 27th international ACM conference on International conference on supercomputing, ICS '13, pp.491-492, 2013. ,
DOI : 10.1145/2464996.2467278
Analyzing the optimal ratio of SRAM banks in hybrid caches, 2012 IEEE 30th International Conference on Computer Design (ICCD), pp.297-302, 2012. ,
DOI : 10.1109/ICCD.2012.6378655
Memories: Exploiting Them and Developing Them, 2006 IEEE International SOC Conference, pp.303-310, 2006. ,
DOI : 10.1109/SOCC.2006.283903
Combining RAM technologies for hard-error recovery in L1 data caches working at very-low power modes, DATE, pp.83-88, 2013. ,
Constructing large and fast multi-level cell STT-MRAM based cache for embedded processors, Proceedings of the 49th Annual Design Automation Conference on, DAC '12, pp.907-912, 2012. ,
DOI : 10.1145/2228360.2228521
Lower-bits cache for low power STT-RAM caches, 2012 IEEE International Symposium on Circuits and Systems, pp.480-483, 2012. ,
DOI : 10.1109/ISCAS.2012.6272069
OAP: An Obstruction-Aware Cache Management Policy for STT-RAM Last-Level Caches, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, pp.847-852, 2013. ,
DOI : 10.7873/DATE.2013.179
Probabilistic design methodology to improve run-time stability and performance of STT-RAM caches, Proceedings of the International Conference on Computer-Aided Design, ICCAD '12, pp.88-94, 2012. ,
DOI : 10.1145/2429384.2429401
Energy-efficient Spin-Transfer Torque RAM cache exploiting additional all-zero-data flags, ISQED, 2013, pp.216-222 ,
Compiler-Assisted Refresh Minimization for Volatile STT-RAM Cache, ASP-DAC, pp.273-278, 2013. ,
DOI : 10.1109/TC.2014.2360527
Prediction Table based Management Policy for STT-RAM and SRAM Hybrid Cache, pp.1092-1097, 2012. ,
Coding Last Level STT-RAM Cache for High Endurance and Low Power, IEEE Computer Architecture Letters, vol.13, issue.2, 2013. ,
DOI : 10.1109/L-CA.2013.8
STT-RAM based energy-efficiency hybrid cache for CMPs, 2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip, pp.31-36, 2011. ,
DOI : 10.1109/VLSISoC.2011.6081626
MAC, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pp.351-356, 2012. ,
DOI : 10.1145/2333660.2333738
URL : https://hal.archives-ouvertes.fr/hal-00529679
Compiler-assisted preferred caching for embedded systems with STT-RAM based hybrid cache, Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems, LCTES '12, pp.109-118, 2012. ,
DOI : 10.1145/2248418.2248434
Multi retention level STT-RAM cache designs with a dynamic refresh scheme, Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-44 '11, pp.329-338, 2011. ,
DOI : 10.1145/2155620.2155659
A novel architecture of the 3D stacked MRAM L2 cache for CMPs, 2009 IEEE 15th International Symposium on High Performance Computer Architecture, pp.239-249, 2009. ,
DOI : 10.1109/HPCA.2009.4798259
Design techniques to improve the device write margin for MRAM-based cache memory, Proceedings of the 21st edition of the great lakes symposium on Great lakes symposium on VLSI, GLSVLSI '11, pp.97-102, 2011. ,
DOI : 10.1145/1973009.1973030
Analysis and Optimization of Thermal Effect on STT-RAM Based 3-D Stacked Cache Design, 2012 IEEE Computer Society Annual Symposium on VLSI, pp.374-379, 2012. ,
DOI : 10.1109/ISVLSI.2012.56
A Hybrid PRAM and STT-RAM Cache Architecture for Extending the Lifetime of PRAM Caches, IEEE Computer Architecture Letters, vol.12, issue.2, 2012. ,
DOI : 10.1109/L-CA.2012.24
Performance and energy-efficiency analysis of hybrid cache memory based on SRAM-MRAM, ISOCC, 2012, pp.247-250 ,
Coordinating prefetching and STT-RAM based last-level cache management for multicore systems, Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI, GLSVLSI '13, pp.55-60 ,
DOI : 10.1145/2483028.2483060
Cross-Layer Techniques for Optimizing Systems Utilizing Memories with Asymmetric Access Characteristics, 2012 IEEE Computer Society Annual Symposium on VLSI, pp.404-409, 2012. ,
DOI : 10.1109/ISVLSI.2012.65
Using Magnetic RAM to Build Low-Power and Soft Error-Resilient L1 Cache, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.20, issue.1, pp.19-28 ,
DOI : 10.1109/TVLSI.2010.2090914
Static and dynamic co-optimizations for blocks mapping in hybrid caches, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, 2012. ,
DOI : 10.1145/2333660.2333717
Exploring the vulnerability of CMPs to soft errors with 3D stacked non-volatile memory, ICCD, pp.366-372, 2011. ,
STTRAM scaling and retention failure, Intel Technology Journal, vol.17, issue.1, p.54, 2013. ,
Cache Coherence Enabled Adaptive Refresh for Volatile STT-RAM, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, pp.1247-1250, 2013. ,
DOI : 10.7873/DATE.2013.258
Selectively protecting error-correcting code for area-efficient and reliable STT-RAM caches, ASP-DAC, pp.285-290, 2013. ,
Process variation aware data management for STT-RAM cache design, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pp.179-184, 2012. ,
DOI : 10.1145/2333660.2333706
Asymmetric-access aware optimization for STT-RAM caches with process variations, GLSVLSI, pp.143-148, 2013. ,
Hybrid cache architecture replacing SRAM cache with future memory technology, 2012 IEEE International Symposium on Circuits and Systems, pp.2481-2484 ,
DOI : 10.1109/ISCAS.2012.6271803
Exploiting Heterogeneity for Energy Efficiency in Chip Multiprocessors, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol.1, issue.2, pp.109-119, 2011. ,
DOI : 10.1109/JETCAS.2011.2158343
Performance, Power, and Reliability Tradeoffs of STT-RAM Cell Subject to Architecture-Level Requirement, IEEE Transactions on Magnetics, vol.47, issue.10, pp.2356-2359, 2011. ,
DOI : 10.1109/TMAG.2011.2159262
On-chip caches built on multilevel spin-transfer torque RAM cells and its optimizations, J. Emerg. Technol. Comput. Syst, vol.9, issue.16, pp.16:1-16:22, 2013. ,
Energy reduction for STT-RAM using early write termination, Proceedings of the 2009 International Conference on Computer-Aided Design, ICCAD '09, pp.264-268, 2009. ,
DOI : 10.1145/1687399.1687448
Future cache design using STT MRAMs for improved energy efficiency, Proceedings of the 49th Annual Design Automation Conference on, DAC '12, pp.492-497, 2012. ,
DOI : 10.1145/2228360.2228447
High-endurance and performance-efficient design of hybrid cache architectures through adaptive line replacement, IEEE/ACM International Symposium on Low Power Electronics and Design, pp.79-84, 2011. ,
DOI : 10.1109/ISLPED.2011.5993611
Embedded memory hierarchy exploration based on magnetic RAM, 2013 IEEE Faible Tension Faible Consommation, pp.1-4 ,
DOI : 10.1109/FTFC.2013.6577780
URL : https://hal.archives-ouvertes.fr/lirmm-01419132
Automatic Feedback Control of Shared Hybrid Caches in 3D Chip Multiprocessors, 2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing, pp.393-400, 2011. ,
DOI : 10.1109/PDP.2011.83
Design space exploration of workload-specific last-level caches, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pp.243-248, 2012. ,
DOI : 10.1145/2333660.2333718
LASIC: Loop-Aware Sleepy Instruction Caches Based on STT-RAM Technology, IEEE TVLSI, 2013. ,
AWARE (Asymmetric Write Architecture With REdundant Blocks): A High Write Speed STT-MRAM Cache Architecture, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.22, issue.4, 2013. ,
DOI : 10.1109/TVLSI.2013.2256945
Write intensity prediction for energy-efficient non-volatile caches, International Symposium on Low Power Electronics and Design (ISLPED), pp.223-228, 2013. ,
DOI : 10.1109/ISLPED.2013.6629298
A dual-mode architecture for fast-switching STT-RAM, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pp.45-50, 2012. ,
DOI : 10.1145/2333660.2333673
DWM-TAPESTRI - an energy efficient all-spin cache using domain wall shift based writes, DATE, pp.1825-1830, 2013. ,
Power-performance co-optimization of throughput core architecture using resistive memory, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), pp.342-353, 2013. ,
DOI : 10.1109/HPCA.2013.6522331
C1C, ACM Transactions on Architecture and Code Optimization, vol.10, issue.4, pp.52:1-52:22, 2013. ,
DOI : 10.1145/2541228.2555308
URL : https://hal.archives-ouvertes.fr/in2p3-00118705
Optimizing bandwidth and power of graphics memory with hybrid memory technologies and adaptive data migration, Proceedings of the International Conference on Computer-Aided Design, ICCAD '12, pp.81-87, 2012. ,
DOI : 10.1145/2429384.2429400
Low-current probabilistic writes for power-efficient STT-RAM caches, ICCD, 2013, pp.511-514 ,
A low-power phase change memory based hybrid cache architecture, Proceedings of the 18th ACM Great Lakes symposium on VLSI, GLSVLSI '08, pp.395-398, 2008. ,
DOI : 10.1145/1366110.1366204
Wear-Resistant Hybrid Cache Architecture with Phase Change Memory, 2012 IEEE Seventh International Conference on Networking, Architecture, and Storage, pp.268-272, 2012. ,
DOI : 10.1109/NAS.2012.37
Multi-level magnetic RAM using domain wall shift for energy-efficient, high-density caches, International Symposium on Low Power Electronics and Design (ISLPED), pp.64-69, 2013. ,
DOI : 10.1109/ISLPED.2013.6629268
Cross-layer racetrack memory design for ultra high density and low power consumption, Proceedings of the 50th Annual Design Automation Conference on, DAC '13, 2013. ,
DOI : 10.1145/2463209.2488799
Exploiting set-level write non-uniformity for energy-efficient NVM-based hybrid cache, 2011 9th IEEE Symposium on Embedded Systems for Real-Time Multimedia, pp.19-28, 2011. ,
DOI : 10.1109/ESTIMedia.2011.6088521
D-MRAM Cache: Enhancing Energy Efficiency with 3T-1MTJ DRAM / MRAM Hybrid Memory, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, pp.1813-1818, 2013. ,
DOI : 10.7873/DATE.2013.363
3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis, 2009 IEEE International Conference on Computer Design, pp.254-259, 2009. ,
DOI : 10.1109/ICCD.2009.5413147
STT-RAM for Shared Memory in GPUs, 2011. ,
Resistive computation: avoiding the power wall with low-leakage, STT-MRAM based computing, ISCA, pp.371-382, 2010. ,
Unleashing the potential of MLC STT-RAM caches, 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp.429-436, 2013. ,
DOI : 10.1109/ICCAD.2013.6691153
Future Memory and Interconnect Technologies, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, pp.964-969 ,
DOI : 10.7873/DATE.2013.202
Cache decay: exploiting generational behavior to reduce cache leakage power, ISCA, pp.240-251, 2001. ,
Moguls: a model to explore the memory hierarchy for bandwidth improvements, ISCA, pp.377-388, 2011. ,
degree in electronics and communications engineering from IIT, Roorkee, India and the Ph.D. degree in computer engineering from Iowa State University, USA. He is currently working as a Post-Doctoral Research Associate at ORNL. His research interests include non-volatile memory, memory system power efficiency ,