Date of Award
Master of Science (MS)
Jon C. Calhoun
Modern HPC applications compute and analyze massive amounts of data. The data volume is growing faster than memory capabilities and storage improvements leading to performance bottlenecks. An example of this is pySDC, a framework for solving collocation problems iteratively using parallel-in-time methods. These methods require storing and exchanging 3D volume data for each parallel point in time. If a simulation consists of M parallel-in-time stages, where the full spatial problem has to be stored for the next iteration, the memory demand for a single state variable is M ×Nx ×Ny ×Nz per time-step. For an application simulation with many state variables or stages, the memory requirement is considerable. Data compression helps alleviate the overhead in memory by reducing the size of data and keeping it in compressed format. Inline compression compresses and decompresses the application’s working set as it moves in and out of main memory. Thus, it provides the system with the appearance of more main memory. Naive compressed arrays require a compression or decompression operation for each store or load and therefore hurt the performance of the application. By incorporating a software cache and storing decompressed values of the array, we limit the number of compression and decompression operations for the stores and loads, thereby improving performance overall. In this thesis, we build a compression manager and software cache manager for the pySDC framework to reduce the memory requirements and computational overhead. The compression manager wraps around LibPressio, a C++ compression library that abstracts all compressors. We utilize blosc, a lossless compressor for our compression manager, and build a software cache manager with various cache configurations and cache policies to work in cohesion with the compression manager. We build a performance model which evaluates the compression manager and cache manager’s performance on different metrics such as compression ratio and compression/decompression time. We test our framework on two different pySDC applications — e.g., Allen-Cahn and Heat-diffusion. ii Results show that incorporating compression and increasing the cache size for our applications inflates the total compressed size in bytes for the arrays and therefore reduces the compression ratio, in contrast to our expectations. However, incorporating the cache and a greater cache size reduces the number of compression/decompression calls to LibPressio as well as cache evictions, significantly reducing the computational overhead for pySDC. Thus, overall, our compression and cache manager help reduce the memory footprint in pySDC. Future work involves looking at improving the compression ratio and using lossy compression to achieve significant reduction in memory footprint.
Ranjan, Sansriti, "Performance Modeling of Inline Compression With Software Caching for Reducing the Memory Footprint in PYSDC" (2023). All Theses. 4158.