CUDA-Fortran

Category: GPU/graphics cards
Development tool: Fortran
File size: 0KB
Downloads: 1
Upload date: 2018-05-17 09:09:06
Uploader: sh-1993
Description: This repository is intended to provide some basic CUDA programming practices in Fortran 90.

File list:
01-get_threadid/ (0, 2018-05-17)
01-get_threadid/Makefile (411, 2018-05-17)
01-get_threadid/dev_lib.cuf (258, 2018-05-17)
01-get_threadid/exe (26378, 2018-05-17)
01-get_threadid/run.cuf (420, 2018-05-17)
accelerated-stellar-pulsation-code/ (0, 2018-05-17)
async-data-transfer/ (0, 2018-05-17)
copy_ILP/ (0, 2018-05-17)
copy_ILP/compile.sh (272, 2018-05-17)
copy_ILP/kerns.cuf (1253, 2018-05-17)
copy_ILP/main.f90 (3946, 2018-05-17)
data-movement/ (0, 2018-05-17)
data-movement/Makefile (314, 2018-05-17)
data-movement/data_move.cuf (8042, 2018-05-17)
data-movement/exper_1.cuf (8038, 2018-05-17)
data-movement/exper_1.cuf~ (8037, 2018-05-17)
example-5.1/ (0, 2018-05-17)
example-5.1/Makefile (507, 2018-05-17)
example-5.1/compute.cuf (1033, 2018-05-17)
example-5.1/dev_lib.cuf (1264, 2018-05-17)
example-5.1/host_lib.cuf (4176, 2018-05-17)
example-5.1/modules (95, 2018-05-17)
example-5.2/ (0, 2018-05-17)
example-5.2/Makefile (400, 2018-05-17)
example-5.2/dev.cuf (405, 2018-05-17)
example-5.2/dev_lib.cuf (479, 2018-05-17)
example-5.2/modules (95, 2018-05-17)
example-5.2/page77.cuf (817, 2018-05-17)
example-5.2/prog.cuf (2157, 2018-05-17)
example-5.2/prog.cuf~ (2246, 2018-05-17)
memory-access/ (0, 2018-05-17)
memory-access/Makefile (687, 2018-05-17)
... ...

# CUDA Fortran 90 Feature Tests

## Purpose

The purpose of this repository is to experiment with the basics of CUDA programming in Fortran 90. It consists of several small projects that test speedups, memory allocation, streaming, etc. It can be used as a basic reference for CUDA programming in modern Fortran. Some of the examples are taken directly from the [PGI CUDA Fortran Programming Guide](https://www.pgroup.com/doc/pgi17cudaforug.pdf); these folders carry `example` in their directory name.

## Contents

+ `01-get_threadid`
+ `example-5.1`: An extensive example of large matrix-by-matrix multiplication using the CPU (double loop or OpenBLAS) and the GPU (cuBLAS and slicing).
+ `example-5.2`: A replica of the mapped-memory allocation example.
+ `memory-bandwidth`: Measures the effective Host2Device and Device2Host transfer bandwidth for pinned versus pageable memory.
+ `async-data-transfer`: Measures the latency of four different data transfer strategies.

## Requirements

## References

+ [PGI CUDA Fortran Programming Guide](https://www.pgroup.com/doc/pgi17cudaforug.pdf)
+ PGI User Forum: Accelerator Programming
