Memory transfers are the performance bottleneck of many applications due to poor data locality and limited memory bandwidth. Refactoring code for better data locality can improve cache behavior, leading to significant performance boosts. Reuse distance, a measure of data locality, is useful for identifying and optimizing hot code regions that exhibit poor data locality.
Successfully completing this bachelor's thesis prepares you for responsible positions, for example in artificial intelligence, gaming, and high-frequency trading. The ability to systematically identify and resolve performance bottlenecks is a highly sought-after skill.
Performance matters! During this thesis you can:
Reuse distance is defined as the number of unique memory locations referenced between a pair of references to the same memory location. At the granularity of cache lines, reuse distance can model spatial and temporal locality to assess the cache behavior of applications. For a fully associative cache with a least recently used (LRU) replacement policy, the prediction is exact: an access hits if and only if its reuse distance is smaller than the number of cache lines.
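To make the definition concrete, here is a minimal Python sketch that computes reuse distances at cache-line granularity and derives the hit count of a fully associative LRU cache from them. The function names, the 64-byte line size, and the example trace are illustrative assumptions, not part of the thesis setup. The linear scan is chosen for readability; practical tools use tree-based or approximate methods.

```python
from collections import OrderedDict


def reuse_distances(trace, line_size=64):
    """Return the reuse distance of every access in `trace` (byte addresses).

    The distance is the number of distinct cache lines touched since the
    previous access to the same line. First-time accesses get None, which
    stands for an infinite distance. `line_size=64` is an assumed default.
    """
    stack = OrderedDict()  # cache lines ordered from least to most recently used
    distances = []
    for address in trace:
        line = address // line_size
        if line in stack:
            # Every line stored after `line` was used more recently than it.
            keys = list(stack)
            distances.append(len(keys) - 1 - keys.index(line))
            stack.move_to_end(line)
        else:
            distances.append(None)
            stack[line] = True
    return distances


def lru_hits(distances, num_lines):
    """Count accesses that hit in a fully associative LRU cache of `num_lines` lines."""
    return sum(1 for d in distances if d is not None and d < num_lines)


# Hypothetical trace touching cache lines 0, 1, 2, 0, 1, 0.
example_trace = [0, 64, 128, 0, 64, 8]
example_distances = reuse_distances(example_trace)
```

For this trace, the distances are `[None, None, None, 2, 2, 1]`, so a cache with two lines sees one hit while a cache with three lines sees three.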
However, several cache-specific details are ignored by reuse distance:
On top of that, cache sharing in today's multi-core systems adds another level of complexity.
In this thesis, you will explore how accurately reuse distance predicts cache behavior in irregular applications. As an example application, you will use sequential sparse matrix-vector multiplication (SpMV), a kernel that is ubiquitous in, for instance, simulations and graph algorithms.
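As a rough illustration of why SpMV counts as irregular, the following Python sketch multiplies a sparse matrix by a vector and records the order in which entries of the input vector are read. It assumes the compressed sparse row (CSR) storage format, which the text above does not prescribe. The function name and the small example matrix are hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a CSR matrix A and log the indices of x that are read.

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and numerical values.
    """
    y = [0.0] * (len(row_ptr) - 1)
    x_accesses = []  # indices of x in the order they are read
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            col = col_idx[k]
            x_accesses.append(col)  # access pattern depends on the sparsity pattern
            y[row] += values[k] * x[col]
    return y, x_accesses


# Hypothetical 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
y, x_accesses = spmv_csr(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
```

The matrix arrays are streamed sequentially, whereas the reads of `x` follow `col_idx` (here `[0, 2, 1, 0, 2]`). The data locality of those reads is therefore determined by the sparsity pattern rather than by the loop structure.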
In the context of this thesis, you will: