Cache miss rates are an important subset of system model inputs. Cache miss rate models are used for broad design space exploration in which many cache configurations cannot be simulated directly because of limitations of trace collection setups or available resources. It is often impractical to simulate large caches, and large processor counts, with their potentially high degree of cache sharing, frequently cannot be reproduced on small existing systems. In this article, we present an approach to building multivariate regression models for predicting cache miss rates beyond the range of collectible data. The extrapolation model attempts to accurately estimate the high-level trend of the existing data, which can then be extended in a natural way. Our approach extends previous work in its applicability to multiple miss rate components and in its ability to model a wide range of cache
parameters, including size, line size, associativity and sharing. Stability of extrapolation is recognized as a crucial requirement, and the proposed extrapolation model is shown to be stable under small data perturbations that may be introduced during data collection. We show the effectiveness of the technique by applying it to two commercial workloads. The wide design space contains configurations that are much larger than those for which miss rate data were available. The fitted data match the simulation data very well. The various curves show how
a miss rate model is useful not only for estimating the performance of specific configurations, but also for providing insight into miss rate trends.