Effective usage of vector registers in decoupled vector architectures

Luis Villa; Roger Espasa; Mateo Valero

doi:10.1109/EMPDP.1998.647238

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a study of the impact of reducing the vector register size in a decoupled vector architecture. In traditional in-order vector architectures, long vector registers have typically been the norm. We start by presenting data showing that, even for highly vectorizable codes, only a small fraction of the elements of a long vector register are actually used. We also show that reducing the register size in a traditional vector architecture, in an attempt to reduce hardware cost and maximize register utilization, results in severe performance degradation. However, we combine the decoupling technique with the vector register reduction and show that the resulting architecture tolerates the register size cuts very well. We simulate a selection of Perfect Club and SPECfp92 programs using a trace-driven approach and compare the execution time of a conventional vector architecture with that of a decoupled vector architecture using different register sizes. Halving the register size and using decoupling provides speedups between 1.04 and 1.49 over a traditional in-order vector machine. Even when the register length is reduced to 1/4 of the original size (and, in some cases, to 1/8), the performance of the decoupled machine is better than that of a conventional vector model. Moreover, we observe that the resulting decoupled machine with short registers tolerates long memory latencies very well.

Original language	English
Title of host publication	Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	495-501
Number of pages	7
ISBN (electronic)	0818683325, 9780818683329
DOI	https://doi.org/10.1109/EMPDP.1998.647238
Publication status	Published - 1998
Externally published	Yes
Event	6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998 - Madrid, Spain. Duration: 21 Jan 1998 → 23 Jan 1998

Publication series

Name	Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998

Conference

Conference	6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998
Country/Territory	Spain
City	Madrid
Period	21/01/98 → 23/01/98

Access to document

10.1109/EMPDP.1998.647238

Other files and links

Link to publication in Scopus

Cite this

Villa, L., Espasa, R., & Valero, M. (1998). Effective usage of vector registers in decoupled vector architectures. In Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998 (pp. 495-501). (Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/EMPDP.1998.647238

Villa, Luis ; Espasa, Roger ; Valero, Mateo. / Effective usage of vector registers in decoupled vector architectures. Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998. Institute of Electrical and Electronics Engineers Inc., 1998. pp. 495-501 (Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998).

@inproceedings{7d7d4d98522143369e5c0c3ea751428a,
title = "Effective usage of vector registers in decoupled vector architectures",
abstract = "This paper presents a study of the impact of reducing the vector register size in a decoupled vector architecture. In traditional in-order vector architectures, long vector registers have typically been the norm. We start by presenting data showing that, even for highly vectorizable codes, only a small fraction of the elements of a long vector register are actually used. We also show that reducing the register size in a traditional vector architecture, in an attempt to reduce hardware cost and maximize register utilization, results in severe performance degradation. However, we combine the decoupling technique with the vector register reduction and show that the resulting architecture tolerates the register size cuts very well. We simulate a selection of Perfect Club and SPECfp92 programs using a trace-driven approach and compare the execution time of a conventional vector architecture with that of a decoupled vector architecture using different register sizes. Halving the register size and using decoupling provides speedups between 1.04 and 1.49 over a traditional in-order vector machine. Even when the register length is reduced to 1/4 of the original size (and, in some cases, to 1/8), the performance of the decoupled machine is better than that of a conventional vector model. Moreover, we observe that the resulting decoupled machine with short registers tolerates long memory latencies very well.",
author = "Luis Villa and Roger Espasa and Mateo Valero",
note = "Publisher Copyright: {\textcopyright} 1998 IEEE; 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998 ; Conference date: 21-01-1998 Through 23-01-1998",
year = "1998",
doi = "10.1109/EMPDP.1998.647238",
language = "English",
series = "Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "495--501",
booktitle = "Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998",
address = "United States",
}

Villa, L, Espasa, R & Valero, M 1998, Effective usage of vector registers in decoupled vector architectures. In Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998. Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998, Institute of Electrical and Electronics Engineers Inc., pp. 495-501, 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998, Madrid, Spain, 21/01/98. https://doi.org/10.1109/EMPDP.1998.647238

Effective usage of vector registers in decoupled vector architectures. / Villa, Luis; Espasa, Roger; Valero, Mateo.
Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998. Institute of Electrical and Electronics Engineers Inc., 1998. p. 495-501 (Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998).

TY - GEN

T1 - Effective usage of vector registers in decoupled vector architectures

AU - Villa, Luis

AU - Espasa, Roger

AU - Valero, Mateo

PY - 1998

Y1 - 1998

N2 - This paper presents a study of the impact of reducing the vector register size in a decoupled vector architecture. In traditional in-order vector architectures, long vector registers have typically been the norm. We start by presenting data showing that, even for highly vectorizable codes, only a small fraction of the elements of a long vector register are actually used. We also show that reducing the register size in a traditional vector architecture, in an attempt to reduce hardware cost and maximize register utilization, results in severe performance degradation. However, we combine the decoupling technique with the vector register reduction and show that the resulting architecture tolerates the register size cuts very well. We simulate a selection of Perfect Club and SPECfp92 programs using a trace-driven approach and compare the execution time of a conventional vector architecture with that of a decoupled vector architecture using different register sizes. Halving the register size and using decoupling provides speedups between 1.04 and 1.49 over a traditional in-order vector machine. Even when the register length is reduced to 1/4 of the original size (and, in some cases, to 1/8), the performance of the decoupled machine is better than that of a conventional vector model. Moreover, we observe that the resulting decoupled machine with short registers tolerates long memory latencies very well.

AB - This paper presents a study of the impact of reducing the vector register size in a decoupled vector architecture. In traditional in-order vector architectures, long vector registers have typically been the norm. We start by presenting data showing that, even for highly vectorizable codes, only a small fraction of the elements of a long vector register are actually used. We also show that reducing the register size in a traditional vector architecture, in an attempt to reduce hardware cost and maximize register utilization, results in severe performance degradation. However, we combine the decoupling technique with the vector register reduction and show that the resulting architecture tolerates the register size cuts very well. We simulate a selection of Perfect Club and SPECfp92 programs using a trace-driven approach and compare the execution time of a conventional vector architecture with that of a decoupled vector architecture using different register sizes. Halving the register size and using decoupling provides speedups between 1.04 and 1.49 over a traditional in-order vector machine. Even when the register length is reduced to 1/4 of the original size (and, in some cases, to 1/8), the performance of the decoupled machine is better than that of a conventional vector model. Moreover, we observe that the resulting decoupled machine with short registers tolerates long memory latencies very well.

UR - http://www.scopus.com/inward/record.url?scp=85117484992&partnerID=8YFLogxK

U2 - 10.1109/EMPDP.1998.647238

DO - 10.1109/EMPDP.1998.647238

M3 - Conference contribution

AN - SCOPUS:85117484992

T3 - Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998

SP - 495

EP - 501

BT - Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998

Y2 - 21 January 1998 through 23 January 1998

ER -

Villa L, Espasa R, Valero M. Effective usage of vector registers in decoupled vector architectures. In Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998. Institute of Electrical and Electronics Engineers Inc. 1998. p. 495-501. (Proceedings of the 6th Euromicro Workshop on Parallel and Distributed Processing, PDP 1998). doi: 10.1109/EMPDP.1998.647238
