Using graphics cards to perform statistical computation
For my first post I’m going to blog about a trendy topic in computational statistics: the increasingly common use of graphics cards to perform all kinds of scientific computation (see here), and statistical computation in particular.
The fun thing about graphics cards (fun in a weird and definitely nerdy way) is that they were built to display pictures (obviously), mainly for movies and video games. From the 1990s onward these cards became more and more powerful, helping games achieve ever higher levels of photorealism. But I’m not sure their early developers realized how useful they could be for scientific computation.
In recent years it has become clear that, for a limited amount of money, say $300, you get much more computational power by buying a GPU (a graphics processing unit) than a CPU (a standard processor). The gain can be of order 100 or even more!… meaning a program that takes an hour on a standard CPU can run in under a minute on a GPU. Of course this doesn’t work for every computation (otherwise GPUs would simply replace CPUs and there would be nothing to blog about).
Compared to a CPU, a GPU is extremely good at parallel computation: it can do hundreds of little tasks at the same time, where a standard CPU with two cores can do… two tasks, basically. Algorithms that consist of many independent little tasks can therefore be parallelized; other algorithms cannot.
- For example, if you have a vector $x = (x_1, \dots, x_n)$ and want to take the log of each component of this vector, you can do it in parallel, because you don’t need to know the result of $\log(x_j)$ in order to compute $\log(x_i)$, for any $i \neq j$. Similarly, evaluating the likelihood of a vector of independent variables can also be parallelized.
- On the contrary, suppose you want to compute the Fibonacci numbers, which are defined by $F_0 = 0$, $F_1 = 1$ and the recurrence relation $F_n = F_{n-1} + F_{n-2}$. Then you obviously need to compute the $n$-th term before computing the $(n+1)$-th term. The program doing this computation is therefore sequential. Any algorithm built on such a recurrence is thus hard to parallelize, and that includes for instance the Metropolis-Hastings algorithm. Though I’m working on using parallel computation to improve Metropolis-Hastings based estimation… maybe more about this in a future post.
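To make the contrast concrete, here is a small Python sketch of both situations (NumPy on the CPU standing in for the GPU, since it is the same data-parallel structure that a graphics card exploits; the Gaussian log-likelihood helper is just an illustrative example, not anything from my own work):

```python
import numpy as np

# Parallelizable: the log of each component is independent of the others.
# NumPy applies the log elementwise; on a GPU, each thread could handle
# one component (this CPU version only illustrates the data parallelism).
x = np.array([1.0, 2.0, 3.0])
logs = np.log(x)  # each entry can be computed independently

# The log-likelihood of independent observations is a sum of independent
# terms, so the per-observation terms can also be computed in parallel.
def gaussian_loglik(data, mu, sigma):
    terms = -0.5 * np.log(2 * np.pi * sigma**2) - (data - mu) ** 2 / (2 * sigma**2)
    return terms.sum()

# Sequential: each Fibonacci number needs the two previous ones, so the
# loop iterations cannot run at the same time.
def fibonacci(n):
    f_prev, f_curr = 0, 1  # F_0 = 0, F_1 = 1
    for _ in range(n):
        f_prev, f_curr = f_curr, f_prev + f_curr
    return f_prev  # F_n

print(fibonacci(10))  # → 55
```

The elementwise log and the per-observation likelihood terms could each be handed to a separate GPU thread; the Fibonacci loop cannot, because every iteration waits on the previous one.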
My PhD focuses on Sequential Monte Carlo methods (SMC), also known as Particle Filters, which happen to be very easy to parallelize. Hence my interest in the topic! Instead of waiting hours for my computations to finish, I could have the results in (near) real-time. But then I would have no excuse for playing table football…
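To give a flavour of why SMC parallelizes so well, here is a minimal sketch of one step of a bootstrap particle filter for a toy Gaussian random-walk model (the model, the function name `smc_step` and all the parameters are made up for illustration; this is not the method from my thesis): propagating and weighting each particle is independent of all the others, and only the resampling step needs to look at every particle at once.

```python
import numpy as np

rng = np.random.default_rng(0)

def smc_step(particles, weights, y, sigma_x=1.0, sigma_y=1.0):
    """One bootstrap-filter step for a toy model where the hidden state
    follows a Gaussian random walk and y is observed with Gaussian noise."""
    n = len(particles)
    # Resample according to the weights (a collective operation:
    # this is the hard part to parallelize).
    idx = rng.choice(n, size=n, p=weights)
    particles = particles[idx]
    # Propagate: one independent random move per particle (parallel).
    particles = particles + sigma_x * rng.standard_normal(n)
    # Weight: one independent likelihood evaluation per particle (parallel).
    logw = -0.5 * ((y - particles) / sigma_y) ** 2
    w = np.exp(logw - logw.max())
    return particles, w / w.sum()

# Track a short noisy observation sequence with 1000 particles.
particles = rng.standard_normal(1000)
weights = np.full(1000, 1.0 / 1000)
for y in [0.5, 1.0, 1.5]:
    particles, weights = smc_step(particles, weights, y)
```

On a GPU, the propagation and weighting lines would each become one thread per particle, which is exactly where the factor-100 speed-ups come from.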
If you are interested, there are plenty of links to check. Owning an NVIDIA device, I use CUDA and its Python counterpart, PyCUDA. Apparently you can also use CUDA in R with this. There is also a MATLAB plugin. An alternative is the open standard called OpenCL, which allows you to program both NVIDIA and ATI devices (those two being the main graphics card designers), so that’s probably the best solution, or at least it will be in the future. In the statistical literature, there have been recent articles, like this one in the SMC context. This website from the University of Oxford gives lots of links as well, as does this one from Duke University. Programming on a GPU already interests a lot of people, particularly those who have intensive computations to perform (financial engineers, physicists, biologists…); I hope you’ll find it useful too.
Stay tuned for more exciting posts!…