Adaptable Register File Organization for Vector Processors

Cristobal Ramirez Lazo, Enrico Reggiani, Carlos Rojas Morales, Roger Figueras Bague, Luis A. Villa Vargas, Marco A. Ramirez Salinas, Mateo Valero Cortes, Osman Sabri Unsal, Adrian Cristal

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

1 Citation (Scopus)

Abstract

Contemporary Vector Processors (VPs) are designed either for short vector lengths, e.g., Fujitsu A64FX with 512-bit ARM SVE vector support, or for long vectors, e.g., NEC Aurora Tsubasa with a 16 Kbit Maximum Vector Length (MVL). Unfortunately, both approaches have drawbacks. On the one hand, short vector length VP designs struggle to provide high efficiency for applications featuring long vectors with high Data Level Parallelism (DLP). On the other hand, long vector VP designs waste resources and underutilize the Vector Register File (VRF) when executing low DLP applications with short vector lengths. Therefore, those long vector VP implementations are limited to a specialized subset of applications, where relatively high DLP must be present to achieve excellent performance with high efficiency. Modern scientific applications are getting more diverse, and the vector lengths in those applications vary widely. To overcome these limitations, we propose an Adaptable Vector Architecture (AVA) that offers the best of both worlds. AVA is designed for short vectors (MVL = 16 elements) and is thus area- and energy-efficient. However, AVA can reconfigure the MVL, allowing it to exploit the benefits of a longer-vector microarchitecture of up to 128 elements when abundant DLP is present. We model AVA on the gem5 simulator and evaluate AVA performance with six applications taken from the RiVEC Benchmark Suite. To obtain area and power consumption metrics, we model AVA on McPAT for 22 nm technology. Our results show that by reconfiguring our small VRF (8 KB) plus our novel issue queue scheme, AVA yields a 2X speedup over the default configuration for short vectors. Additionally, AVA shows competitive performance when compared to a long vector VP, while saving 50% of area.
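
As a rough illustration of the trade-off the abstract describes, the sketch below computes how many full vector registers would fit in a fixed 8 KB VRF at different MVL settings. The 64-bit element width and the helper name `registers_that_fit` are assumptions made for this example; the abstract states neither the element width nor how many registers AVA actually exposes at each setting.

```python
# Illustrative arithmetic only. The 8 KB VRF size and the 16..128 element
# MVL range come from the abstract; the 64-bit element width is an
# assumption made for this sketch.
VRF_BYTES = 8 * 1024   # 8 KB VRF, as stated in the abstract
ELEMENT_BYTES = 8      # assumed 64-bit elements (hypothetical)


def registers_that_fit(mvl_elements: int,
                       vrf_bytes: int = VRF_BYTES,
                       element_bytes: int = ELEMENT_BYTES) -> int:
    """Return how many full vector registers of the given MVL fit in the VRF."""
    bytes_per_register = mvl_elements * element_bytes
    return vrf_bytes // bytes_per_register


if __name__ == "__main__":
    for mvl in (16, 32, 64, 128):
        print(f"MVL={mvl:3d} elements -> {registers_that_fit(mvl)} registers")
```

Under these assumptions, lengthening the vectors by a factor of eight leaves one eighth as many registers resident in the same storage, which is the kind of pressure a reconfigurable register file organization has to manage.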

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022
Publisher: IEEE Computer Society
Pages: 786-799
Number of pages: 14
ISBN (electronic): 9781665420273
DOI
Publication status: Published - 2022
Event: 28th Annual IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022 - Virtual, Online, Republic of Korea
Duration: 2 Apr 2022 to 6 Apr 2022

Publication series

Name: Proceedings - International Symposium on High-Performance Computer Architecture
Volume: 2022-April
ISSN (print): 1530-0897

Conference

Conference: 28th Annual IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022
Country/Territory: Republic of Korea
City: Virtual, Online
Period: 2/04/22 to 6/04/22
