|
We introduce Transient Blocking Synchronization (TBS), a new approach to hardware synchronization
for mega-scale distributed-memory multiprocessor machines. Such machines,
with thousands of processors and controller-based memory modules, are essentially distributed
networks, and one must search for new paradigms that provide hardware synchronization
support with high levels of robustness and minimal protocol and communication overhead. We
claim that the semantics of non-blocking synchronization primitives such as Compare&Swap
and LoadLinked/StoreConditional on the one hand, and blocking ones such as
Full/Empty-bits on the other, will introduce high communication and space costs when implemented
on large-scale machines.
TBS is a novel hardware synchronization paradigm resting between the classic blocking and
non-blocking approaches to synchronization. It shows how very weak "transient" hardware
blocking, that is, blocking that may be revoked at any time, can provide
non-blocking universal synchronization with low communication and space complexity.
This paper presents a set of simple TBS single-location synchronization operations and shows
that they provide low-cost non-blocking emulations of all known read-modify-write operations.
Moreover, it shows that the combination of TBS with hardware-supported transactional bits, a
variation on traditional hardware full/empty bits, can provide implementations of multi-word
transactional operations with low message cost.
|