Asynchronous Memory Access Chaining
Asynchronous Memory Access Chaining
29 August 2016
In-memory databases rely on pointer-intensive data struc- tures to quickly locate data in memory. A single lookup op- eration in such data structures often exhibits long-latency memory stalls due to dependent pointer dereferences. Hid- ing the memory latency by launching additional memory ac- cesses for other lookups is an e ective way of improving per- formance of pointer-chasing codes (e.g., hash table probes, tree traversals). The ability to exploit such inter-lookup par- allelism is beyond the reach of modern out-of-order cores due to the limited size of their instruction window. Instead, re- cent work has proposed software prefetching techniques that exploit inter-lookup parallelism by arranging a set of inde- pendent lookups into a group or a pipeline, and navigate their respective pointer chains in a synchronized fashion. While these techniques work well for highly regular access patterns, they break down in the face of irregularity across lookups. Such irregularity includes variable-length pointer chains, early exit, and read/write dependencies. This work introduces Asynchronous Memory Access Chaining (AMAC), a new approach for exploiting inter- lookup parallelism to hide the memory access latency. AMAC achieves high dynamism in dealing with irregular- ity across lookups by maintaining the state of each lookup separately from that of other lookups. This feature en- ables AMAC to initiate a new lookup as soon as any of the in- ight lookups complete. In contrast, the static ar- rangement of lookups into a group or pipeline in existing techniques precludes such adaptivity. Our results show that AMAC matches or outperforms state-of-the-art prefetch- ing techniques on regular access patterns, while delivering up to 2.3x higher performance under irregular data struc- ture lookups. AMAC fully utilizes the available micro- architectural resources, generating the maximum number of memory accesses allowed by hardware in both single- and multi-threaded execution modes
Venue : VLDB'16
Click on the button below to download this publication.