Title: Way adaptable D-NUCA caches
Authors: Alessandro Bardine, Manuel Comparetti, Pierfrancesco Foglia, Giacomo Gabrielli, Cosimo Antonio Prete
Addresses: Dipartimento di Ingegneria dell'Informazione, Università di Pisa, Largo Lucio Lazzarino, 56122 Pisa, Italy (all authors)
Abstract: Non-uniform cache architecture (NUCA) aims to limit the wire-delay problem typical of large on-chip last-level caches. By partitioning a large cache into several banks, each with a latency that depends on its physical location, and by employing a scalable on-chip network to interconnect the banks with the cache controller, NUCA reduces the average access latency with respect to a traditional cache. Adding a migration mechanism that moves the most frequently accessed data towards the cache controller (D-NUCA) further improves the average access latency. In this work we propose a last-level cache design, based on the D-NUCA scheme, that significantly limits static power consumption by dynamically adapting to the needs of the running application: the way adaptable D-NUCA cache. This design leads to a fast and power-efficient memory hierarchy, with an average reduction of 31.2% in energy-delay product (EDP) with respect to a traditional D-NUCA. We propose and discuss a methodology for tuning the intrinsic parameters of our design, and we investigate the adoption of the way adaptable D-NUCA scheme as a shared L2 cache in a chip multiprocessor (CMP) system, where it yields a 24% reduction in EDP.
Keywords: cache memories; power-saving techniques; non-uniform cache architecture; way adaptable D-NUCA; NUCA; reconfigurable architectures; high performance systems architecture; access latency; cache design; energy-delay product; chip multiprocessors; CMP.
DOI: 10.1504/IJHPSA.2010.034542
International Journal of High Performance Systems Architecture, 2010 Vol.2 No.3/4, pp.215 - 228
Published online: 07 Aug 2010