CUDA-Fortran
Category: GPU / graphics cards
Development tool: Fortran
File size: 0KB
Downloads: 1
Upload date: 2018-05-17 09:09:06
Uploader: sh-1993
Description: This repository is intended to provide some basic CUDA programming practices in Fortran 90.
File list:
01-get_threadid/ (0, 2018-05-17)
01-get_threadid/Makefile (411, 2018-05-17)
01-get_threadid/dev_lib.cuf (258, 2018-05-17)
01-get_threadid/exe (26378, 2018-05-17)
01-get_threadid/run.cuf (420, 2018-05-17)
accelerated-stellar-pulsation-code/ (0, 2018-05-17)
async-data-transfer/ (0, 2018-05-17)
copy_ILP/ (0, 2018-05-17)
copy_ILP/compile.sh (272, 2018-05-17)
copy_ILP/kerns.cuf (1253, 2018-05-17)
copy_ILP/main.f90 (3946, 2018-05-17)
data-movement/ (0, 2018-05-17)
data-movement/Makefile (314, 2018-05-17)
data-movement/data_move.cuf (8042, 2018-05-17)
data-movement/exper_1.cuf (8038, 2018-05-17)
data-movement/exper_1.cuf~ (8037, 2018-05-17)
example-5.1/ (0, 2018-05-17)
example-5.1/Makefile (507, 2018-05-17)
example-5.1/compute.cuf (1033, 2018-05-17)
example-5.1/dev_lib.cuf (1264, 2018-05-17)
example-5.1/host_lib.cuf (4176, 2018-05-17)
example-5.1/modules (95, 2018-05-17)
example-5.2/ (0, 2018-05-17)
example-5.2/Makefile (400, 2018-05-17)
example-5.2/dev.cuf (405, 2018-05-17)
example-5.2/dev_lib.cuf (479, 2018-05-17)
example-5.2/modules (95, 2018-05-17)
example-5.2/page77.cuf (817, 2018-05-17)
example-5.2/prog.cuf (2157, 2018-05-17)
example-5.2/prog.cuf~ (2246, 2018-05-17)
memory-access/ (0, 2018-05-17)
memory-access/Makefile (687, 2018-05-17)
... ...
# CUDA Fortran 90 Feature Tests
## Purpose
The purpose of this repository is to experiment with the basics of CUDA programming in Fortran 90. It consists of several small projects that test speedups, memory allocation, streaming, etc., and it can serve as a basic reference for CUDA programming in modern Fortran. Some of the examples are taken directly from the [PGI CUDA Fortran Programming Guide](https://www.pgroup.com/doc/pgi17cudaforug.pdf); these folders have `example` in their directory name.
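As a rough illustration of what "the basics" look like in CUDA Fortran, here is a minimal sketch of a kernel in which every thread records its own global index. It is not taken from the repository: the module, kernel, and variable names, the array size, and the block size are all assumptions chosen for the example.

```fortran
! Hypothetical sketch: names and sizes are illustrative, not the repository's code.
module thread_id_mod
  use cudafor
  implicit none
contains
  ! Each thread writes its 1-based global index into ids.
  attributes(global) subroutine get_thread_id(ids, n)
    integer, value :: n
    integer :: ids(n)
    integer :: i
    ! CUDA Fortran thread and block indices start at 1.
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) ids(i) = i   ! guard threads past the end of the array
  end subroutine get_thread_id
end module thread_id_mod

program main
  use cudafor
  use thread_id_mod
  implicit none
  integer, parameter :: n = 1000    ! assumed problem size
  integer, parameter :: tpb = 256   ! assumed threads per block
  integer :: ids(n)                 ! host copy
  integer, device :: ids_d(n)       ! device copy
  integer :: istat

  ! Round the block count up so that all n elements are covered.
  call get_thread_id<<<(n + tpb - 1) / tpb, tpb>>>(ids_d, n)
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, 'kernel failed: ', cudaGetErrorString(istat)

  ids = ids_d                       ! implicit device-to-host copy
  print *, 'first and last index: ', ids(1), ids(n)
end program main
```

With the PGI compiler referenced above, a `.cuf` file like this would typically be built with something like `pgfortran -Mcuda file.cuf`; the Makefiles in the individual folders may use different flags.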
## Contents
+ `01-get_threadid`
+ `example-5.1`: An extensive example of large matrix-matrix multiplication on the CPU (double loop or OpenBLAS) and on the GPU (cuBLAS and slicing).
+ `example-5.2`: A replica of the guide's mapped-memory allocation example.
+ `memory-bandwidth`: Measures the effective host-to-device and device-to-host transfer bandwidth for pinned versus pageable memory.
+ `async-data-transfer`: Measures the latency of four different data transfer strategies.
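
The pinned-versus-pageable comparison described for `memory-bandwidth` can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the program name, the variable names, and the array size are hypothetical, and it times only the host-to-device direction using CUDA events.

```fortran
! Hypothetical sketch: times one host-to-device copy from pageable and
! from pinned host memory. Names and sizes are illustrative.
program bandwidth_sketch
  use cudafor
  implicit none
  integer, parameter :: n = 4 * 1024 * 1024   ! assumed element count (16 MB of reals)
  real, allocatable         :: a_pageable(:)
  real, allocatable, pinned :: a_pinned(:)
  real, device, allocatable :: a_d(:)
  type(cudaEvent) :: start_ev, stop_ev
  real :: ms
  integer :: istat
  logical :: is_pinned

  allocate(a_pageable(n), a_d(n))
  ! The PINNED= specifier reports whether page-locked memory was actually obtained.
  allocate(a_pinned(n), stat=istat, pinned=is_pinned)
  if (.not. is_pinned) print *, 'pinned allocation fell back to pageable memory'
  a_pageable = 1.0
  a_pinned = 1.0

  istat = cudaEventCreate(start_ev)
  istat = cudaEventCreate(stop_ev)

  ! Pageable host memory -> device
  istat = cudaEventRecord(start_ev, 0)
  a_d = a_pageable
  istat = cudaEventRecord(stop_ev, 0)
  istat = cudaEventSynchronize(stop_ev)
  istat = cudaEventElapsedTime(ms, start_ev, stop_ev)
  ! bytes / (ms * 1e-3 s) / 1e9 = bytes / ms / 1e6
  print *, 'pageable H2D (GB/s): ', n * 4.0 / ms / 1.0e6

  ! Pinned host memory -> device
  istat = cudaEventRecord(start_ev, 0)
  a_d = a_pinned
  istat = cudaEventRecord(stop_ev, 0)
  istat = cudaEventSynchronize(stop_ev)
  istat = cudaEventElapsedTime(ms, start_ev, stop_ev)
  print *, 'pinned   H2D (GB/s): ', n * 4.0 / ms / 1.0e6

  istat = cudaEventDestroy(start_ev)
  istat = cudaEventDestroy(stop_ev)
  deallocate(a_pageable, a_pinned, a_d)
end program bandwidth_sketch
```

A single timed copy is noisy, so a more careful measurement would repeat each transfer several times and would also time the device-to-host direction.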
## Requirements
## References
+ [PGI CUDA Fortran Programming Guide](https://www.pgroup.com/doc/pgi17cudaforug.pdf)
+ PGI User Forum: Accelerator Programming