gpustats: GPU Library for Statistical Computing

Andrew Cron & Wes McKinney

In this paper we discuss gpustats, a new Python library for assisting in "big data" statistical computing applications, particularly Monte Carlo-based inference algorithms. The library provides a general code generation / metaprogramming framework for easily implementing discrete and continuous probability density functions and random variable samplers. These functions can achieve speedups of more than 100x over their CPU equivalents. We demonstrate their use in a Bayesian MCMC application and discuss avenues for future work.

Keywords: GPU, CUDA, OpenCL, Python, statistical inference, statistics, metaprogramming, sampling, Markov Chain Monte Carlo (MCMC), PyMC, big data

Research reported here was partly supported by the National Institutes of Health under grant RC1-A1086032. Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the NIH.