Talks and Poster Presentations (with Proceedings-Entry):
"Software Vector Chaining";
- 2018-09-13; in: "Proceedings of the 15th International Conference on Managed Languages & Runtimes",
Paper ID 18,
Providing vectors of run-time determined length as
opaque value types is a good interface between the
machine-level SIMD instructions and portable
application-oriented programming languages.
Implementing vector operations requires a loop that
breaks the vector into SIMD-register-sized chunks.
A compiler can fuse the loops of several vector
operations together. However, during normal
compilation this is only easy if no other control
structures are involved. This paper explores an
alternative: collect a trace of vector operations at
run-time (following the program control flow during
this collecting step), and then perform the combined
vector loop. This arrangement has a certain
run-time overhead, but its implementation is simpler
and can happen independently, in a library.
Preliminary performance results indicate that the
overhead makes this approach beneficial only for
long vectors (>1 KB). For shorter vectors, unfused
loops should be used in a library setting.
Fortunately, this choice can be made at run time,
individually for each vector operation.
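
The idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The names `Vec`, `CHUNK`, and `FUSE_THRESHOLD` are hypothetical, Python lists stand in for SIMD registers, and the threshold of 256 elements is an assumption that corresponds to roughly 1 KB of 4-byte elements. Short vectors are processed immediately with one chunked loop per operation. Long vectors only record the operation in a trace, and a single fused loop runs when the result is needed.

```python
import operator

CHUNK = 4             # assumption: stand-in for the SIMD register width, in elements
FUSE_THRESHOLD = 256  # assumption: about 1 KB of 4-byte elements


class Vec:
    """Opaque vector value: either materialized data or a pending traced operation."""

    def __init__(self, data=None, op=None, args=()):
        self.data = data
        self.op = op
        self.args = args
        self.length = len(data) if data is not None else args[0].length

    def __add__(self, other):
        return _apply(operator.add, self, other)

    def __mul__(self, other):
        return _apply(operator.mul, self, other)

    def to_list(self):
        # Materialize a pending trace with one fused loop, then drop the trace.
        if self.data is None:
            self.data = _run_fused(self)
            self.op, self.args = None, ()
        return list(self.data)


def _apply(op, a, b):
    if a.length != b.length:
        raise ValueError("vector lengths differ")
    if a.length < FUSE_THRESHOLD:
        # Short vectors: run an unfused chunked loop right away.
        return Vec(data=_run_unfused(op, a.to_list(), b.to_list()))
    # Long vectors: only record the operation; this decision is made per operation.
    return Vec(op=op, args=(a, b))


def _run_unfused(op, x, y):
    out = []
    for start in range(0, len(x), CHUNK):
        xs, ys = x[start:start + CHUNK], y[start:start + CHUNK]
        out.extend(op(p, q) for p, q in zip(xs, ys))
    return out


def _eval_chunk(v, start, stop):
    # Evaluate the recorded trace for one register-sized chunk.
    if v.data is not None:
        return v.data[start:stop]
    left = _eval_chunk(v.args[0], start, stop)
    right = _eval_chunk(v.args[1], start, stop)
    return [v.op(p, q) for p, q in zip(left, right)]


def _run_fused(v):
    # One loop over chunks; every traced operation is applied inside it.
    out = []
    for start in range(0, v.length, CHUNK):
        out.extend(_eval_chunk(v, start, min(start + CHUNK, v.length)))
    return out


def scaled_sum(a, x, y, scale_twice):
    r = a * x
    if scale_twice:  # ordinary control flow; the trace simply follows it
        r = r * a
    return r + y
```

Calling `scaled_sum` with 1000-element vectors returns a pending `Vec` whose three operations run in a single loop on `to_list()`, while the same call with 3-element vectors executes each operation immediately. The branch inside `scaled_sum` needs no special handling, which is the simplification over compile-time loop fusion that the abstract points to.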
Created from the Publication Database of the Vienna University of Technology.