Fast Model of Fully Associative Caches / Automatic Generation of Adaptive Codes


Details
Tech-Talk 1: A Fast Analytical Model of Fully Associative Caches
Abstract:
While the cost of computation is an easy-to-understand local property, the cost of data movement on cached architectures depends on global state, does not compose, and is hard to predict. As a result, programmers often fail to consider the cost of data movement. Existing cache models and simulators provide the missing information but are computationally expensive. We present a lightweight cache model for fully associative caches with a least recently used (LRU) replacement policy that gives fast and accurate results. We count the cache misses without explicit enumeration of all memory accesses by using symbolic counting techniques twice: 1) to derive the stack distance for each memory access and 2) to count the memory accesses with stack distance larger than the cache size. While this technique seems infeasible in theory, due to non-linearities after the first round of counting, we show that the counting problems are sufficiently linear in practice. Our cache model often computes its results within seconds, and unlike simulation its execution time is largely independent of the problem size. Our evaluation measures modeling errors below 0.6% on real hardware. By providing accurate data-placement information, we enable memory-hierarchy-aware software development.
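To make the stack-distance idea concrete, here is a minimal sketch (not the talk's method) of the brute-force enumeration that the analytical model avoids: an access to a fully associative LRU cache of capacity C hits exactly when fewer than C distinct addresses were touched since the previous access to the same address. The function name and the toy trace are illustrative assumptions.

```python
def lru_misses(trace, cache_size):
    """Count misses of a fully associative LRU cache by explicit
    enumeration. An access misses iff its stack distance (the number
    of distinct addresses touched since the last access to the same
    address) is at least the cache size."""
    last_use = {}  # address -> index of its most recent access
    misses = 0
    for i, addr in enumerate(trace):
        if addr in last_use:
            # stack distance = distinct addresses since the last use of addr
            distance = len(set(trace[last_use[addr] + 1 : i]))
            if distance >= cache_size:
                misses += 1
        else:
            misses += 1  # cold miss: first access to this address
        last_use[addr] = i
    return misses

# Toy example: for the trace a, b, c, a with a 2-entry cache,
# the second access to "a" has stack distance 2 ({b, c}) and misses.
print(lru_misses(["a", "b", "c", "a"], 2))  # 4 misses
print(lru_misses(["a", "b", "c", "a"], 3))  # 3 misses (final "a" hits)
```

This enumeration costs time proportional to the trace length (and worse for the per-access set construction); the talk's contribution is to obtain the same counts symbolically, without walking the trace at all.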
With Haystack we present the first cache model that is fast enough to provide programmers with cache-miss information while they program, e.g. via in-editor developer hints similar to the code-completion support available in Visual Studio Code today. Similarly, the technique could be used by LLVM (or other compiler toolchains) to select cache-optimal code transformations. At the LLVM social we hope to inspire a discussion about how we can make our cache model available to a broad audience.
Publication: http://spcl.inf.ethz.ch/Publications/index.php?pub=341
Bio:
Tobias Gysi is a final-year doctoral student at ETH Zurich, working on performance models and optimization techniques for data-locality transformations, as well as domain-specific abstractions and programming models for high-performance computing. Before joining the Scalable Parallel Computing Lab, Tobias Gysi received his MSc in computer science from ETH Zurich in 2005. He then worked at a leading Swiss R&D service provider on feasibility studies and application development covering topics such as cryptography, machine learning, and computational physics. In this context, Tobias Gysi introduced GPU-accelerated computing into COSMO, the atmospheric model used by MeteoSwiss for day-to-day weather forecasts and long-term climate simulations.
Tech-Talk 2: Automatic Generation of Adaptive Codes
Abstract:
Approximate computing is necessary to meet deadlines in some
compute-intensive applications such as simulation. Building such
applications requires a high level of expertise from the application
designers as well as a significant development effort. We present a new
application programming interface that facilitates application design
by relying on user-specified or extracted domain-specific knowledge to
automatically generate an adaptive version of the code. We also present
early results on an automatic data compression method that provides
good mathematical properties for adaptive applications and improved
approximation guarantees.
Bio:
Maxime Schmitt is a third-year PhD student at the University of
Strasbourg. He did his master's degree in networking and embedded
systems in Strasbourg, where he met the incredible parallel and
scientific computing team (ICPS), and decided to continue in research
after a fun internship with that team. Research interests: automatic
compiler optimizations, domain-specific languages, approximate
computing, and numerical simulations.
