Compile-Time Automatic Synchronization Insertion and Redundant Synchronization Elimination for GPU Kernels

Keywords

Compiler; Data Dependence; GPU; SSA; Synchronization

Abstract

In most GPU kernels, programmers insert synchronization statements manually, which is labor-intensive and error-prone. In this paper, we propose a synchronization optimization framework that automatically inserts synchronization statements into GPU kernels at compile time while eliminating redundant ones. We show that our framework not only inserts synchronizations correctly but also eliminates redundant synchronizations, outperforming existing compiler frameworks, which introduce redundant synchronizations by following the most conservative strategy. Taking GPU kernels as input, our framework leverages data dependence analysis to insert synchronizations. We extend CETUS, a source-to-source compiler framework, to implement our synchronization optimization framework. Experimental results, checked through extensive evaluation and manual comparison, show that the proposed framework achieves 100% correctness. In addition, compared to the original GPU kernels, our framework reduces the number of synchronization statements in GPU kernels by 32.5% and the number of synchronization statements executed by 28.2% on average.
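As an illustration of the idea summarized in the abstract, the following is a minimal sketch of dependence-driven barrier placement. It is not the authors' algorithm or the CETUS API. The names `Stmt`, `conflicts`, and `optimize` are hypothetical. The sketch assumes a straight-line kernel body with no loops or branches, and it conservatively treats any two statements that touch the same shared array (with at least one write) as a cross-thread dependence.

```python
from dataclasses import dataclass

BARRIER = "__syncthreads();"


@dataclass(frozen=True)
class Stmt:
    """A kernel statement with the shared arrays it reads and writes (hypothetical model)."""
    text: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a: Stmt, b: Stmt) -> bool:
    """True if a and b may form a RAW, WAR, or WAW dependence on a shared array."""
    return bool(a.writes & b.reads or a.reads & b.writes or a.writes & b.writes)


def optimize(kernel):
    """Drop all existing barriers, then re-insert one only where a dependence requires it."""
    stmts = [s for s in kernel if s != BARRIER]
    out = []
    window = []  # statements emitted since the last barrier
    for s in stmts:
        if any(conflicts(p, s) for p in window):
            out.append(BARRIER)  # a dependence crosses this point
            window = []
        out.append(s.text)
        window.append(s)
    return out


# Toy kernel body: the second barrier guards no shared-memory dependence.
kernel = [
    Stmt("tile[tid] = in[gid];", writes=frozenset({"tile"})),
    BARRIER,
    Stmt("float v = tile[n - 1 - tid];", reads=frozenset({"tile"})),
    BARRIER,
    Stmt("out[gid] = v;"),
]

optimized = optimize(kernel)
for line in optimized:
    print(line)
```

In this toy input, the barrier between the write and the read of `tile` is kept, and the second barrier is dropped because no shared-array dependence crosses it. A real analysis would also have to reason about array indices, loops, and divergent control flow, all of which this sketch ignores.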

Publication Date

7-2-2016

Publication Title

Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS

Number of Pages

826-834

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/ICPADS.2016.0112

Scopus ID

85018463223 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85018463223

