Controlling Loops in Parallel Mercury Code

Authors: Paul Bone, Zoltan Somogyi and Peter Schachte
Conference: Declarative Aspects and Applications of Multicore Programming
Where: Philadelphia, PA, USA
Date: January 2012
Download: PDF
Slides: PDF
Publisher: ACM

Abstract

Recently we built a system that uses profiling data to automatically parallelize Mercury programs by finding conjunctions with expensive conjuncts that can run in parallel with minimal synchronization delays. This worked very well in many cases, but for tail-recursive code we got much lower speedups than we expected, due to excessive memory usage. In this paper, we present a novel program transformation that eliminates this problem and also allows recursive calls inside parallel conjunctions to take advantage of tail recursion optimization. Our benchmark results show that the new transformation greatly increases the speedups we can get from parallel Mercury programs; in one case, it turns no speedup at all into almost perfect speedup on four cores.