Improving Global Performance on GPU for Algorithms with Main Loop Containing a Reduction Operation: Case of Dijkstra’s Algorithm ()
ABSTRACT
In this paper, we study the impact of
copying data in GPU computing. GPU computing allows implementing parallel
computations at low cost: a GPU can be purchased at under USD 500. Many studies
have shown that GPU can be used to speed up the calculations. But for
algorithms requiring doing a part of the calculations on GPU and another part
on CPU, alternately, latency due to the copy of the data is a performance
degradation factor. To illustrate this, we consider the Dijkstra’s algorithm on
the shortest path used in solving optimization problems. This algorithm is very
heavy to run on sequential machine. So, we are considering a parallel approach
on GPU. Note that Dijkstra’s algorithm has been subject of many implementations
on GPU. In the present work, we use two platforms with external GPU. Graphs are
represented in adjacency matrix. During the computation of this algorithm,
intermediates results are copied from GPU to CPU or from CPU to GPU. The
purpose of this work is to measure the impact of these copies in the overall
performance of the algorithm. For that we calculate time due to the copying
data’s implementation; then we compare results with implementation computing
only on CPU memory (zero-copy). The real impact shown by experiments
demonstrates the interest of this study. GP-GPU programmers have to think that
they will use either memory zero-copy or GPU memory. The challenge for GPU’s
manufacturers is how to reduce this impact.
Share and Cite:
Chaibou, A. and Sie, O. (2015) Improving Global Performance on GPU for Algorithms with Main Loop Containing a Reduction Operation: Case of Dijkstra’s Algorithm.
Journal of Computer and Communications,
3, 41-54. doi:
10.4236/jcc.2015.38005.