## Using Butterfly-Patterned Partial Sums to Draw from Discrete Distributions

# Using Butterfly-Patterned Partial Sums to Draw from Discrete Distributions

*08 February 2017*

We describe a SIMD technique for drawing values from multiple discrete distributions, such as sampling from the random variables of a mixture model, that avoids computing a complete table of partial sums of the relative probabilities. A table of alternate (``butterfly-patterned'') form is faster to compute, making better use of coalesced memory accesses; from this table, complete partial sums are computed on the fly during a binary search. Measurements using CUDA 7.5 on an NVIDIA Titan Black GPU show that for double-precision data, this technique makes an entire LDA machine-learning application about 25% faster than doing a straightforward matrix transposition after using coalesced accesses.

** Venue : 2017 ACM Symposium on Principles and Practice of Parallel Programming **

**External Link: **http://dl.acm.org/authorize?N21941