CudaSteps

Category: GPU / graphics cards
Development tool: Cuda
File size: 59KB
Downloads: 0
Upload date: 2022-03-11 06:56:06
Uploader: sh-1993
Description: A CUDA learning path based on the book "Fundamentals and Practice of Cuda Programming" (《cuda编程-基础与实践》) by Fan Zheyong (樊哲勇).

File list:
capter1 (0, 2022-03-11)
capter10 (0, 2022-03-11)
capter10\reduce.cu (9297, 2022-03-11)
capter10\warp.cu (4034, 2022-03-11)
capter11 (0, 2022-03-11)
capter11\stream.cu (8398, 2022-03-11)
capter13 (0, 2022-03-11)
capter14 (0, 2022-03-11)
capter2 (0, 2022-03-11)
capter2\hello.cpp (113, 2022-03-11)
capter2\hello.cu (721, 2022-03-11)
capter3 (0, 2022-03-11)
capter3\add.cpp (1093, 2022-03-11)
capter3\add.cu (3117, 2022-03-11)
capter4 (0, 2022-03-11)
capter4\check.cu (2839, 2022-03-11)
capter4\error.cuh (1096, 2022-03-11)
capter5 (0, 2022-03-11)
capter5\add.cu (1037, 2022-03-11)
capter5\add.cuh (564, 2022-03-11)
capter5\clock.cu (3148, 2022-03-11)
capter5\clock.cuh (20, 2022-03-11)
capter5\main.cpp (249, 2022-03-11)
capter6 (0, 2022-03-11)
capter6\query.cu (1439, 2022-03-11)
capter6\static.cu (2310, 2022-03-11)
capter7 (0, 2022-03-11)
... ...

# CUDA Study Steps

Notes on learning CUDA GPU programming, based on the book "Fundamentals and Practice of Cuda Programming" (《CUDA 编程-基础与实践》) by Fan Zheyong (樊哲勇).

Chapters:

1. [GPU hardware and CUDA development tools](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter1/ReadMe.md)
2. [Thread organization in CUDA](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter2/ReadMe.md)
3. [Basic framework of a simple CUDA program](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter3/ReadMe.md)
4. [Error checking in CUDA programs](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter4/ReadMe.md)
5. [The keys to GPU acceleration](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter5/ReadMe.md)
6. [CUDA memory organization](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter6/ReadMe.md)
7. [Proper use of global memory](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter7/ReadMe.md)
8. [Proper use of shared memory](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter8/ReadMe.md)
9. [Proper use of atomic functions](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter9/ReadMe.md)
10. [Warp-level primitives and cooperative groups](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter10/ReadMe.md)
11. [CUDA streams](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter11/ReadMe.md)
12. [Programming with unified memory](https://github.com/QINZHAOYU/CudaSteps/blob/master/)
13. [Molecular dynamics model](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter13/ReadMe.md)
14. [CUDA standard libraries](https://github.com/QINZHAOYU/CudaSteps/blob/master/./capter14/ReadMe.md)

## CUDA official documentation

+ [CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html)
+ [CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html)
+ [CUDA Runtime API reference](https://docs.nvidia.com/cuda/cuda-runtime-api/index.html)
+ [CUDA Math API reference](https://docs.nvidia.com/cuda/cuda-math-api/index.html)

## CUDA programming examples

[CUDA Samples](https://github.com/NVIDIA/cuda-samples)

+ Simple Reference: basic CUDA samples for beginners that illustrate core concepts of CUDA and the CUDA runtime APIs.
+ Utilities Reference: sample programs that show how to query device capabilities and measure GPU/CPU bandwidth.
+ Graphics Reference: graphics samples that show interoperability between CUDA, OpenGL, and DirectX.
+ Imaging Reference: image processing, compression, and data analysis.
+ Finance Reference: parallel processing for financial computation.
+ Simulations Reference: simulation algorithms implemented with CUDA.
+ Advanced Reference: some advanced algorithms implemented with CUDA.
+ Cudalibraries Reference: samples that show how to use the CUDA libraries (NPP, CUBLAS, CUFFT, CUSPARSE, and CURAND).

## CUDA performance benchmarks

[CUDA Benchmarks](https://github.com/ekondis/mixbench)

+ Four types of experiments are executed, each combined with global memory accesses:
    + Single precision Flops (multiply-additions)
    + Double precision Flops (multiply-additions)
    + Half precision Flops (multiply-additions)
    + Integer multiply-addition operations
+ Building is now based on CMake files. Each implementation resides in a separate folder:
    + CUDA implementation: `mixbench-cuda`
    + OpenCL implementation: `mixbench-opencl`
    + HIP implementation: `mixbench-hip`
    + SYCL implementation: `mixbench-sycl`

The generated output looks similar to the following:

```
mixbench/read-only (v0.03-2-gbccfd71)
------------------------ Device specifications ------------------------
Device: GeForce RTX 2070
CUDA driver version: 10.20
GPU clock rate: 1620 MHz
Memory clock rate: 3500 MHz
Memory bus width: 256 bits
WarpSize: 32
L2 cache size: 4096 KB
Total global mem: 7979 MB
ECC enabled: No
Compute Capability: 7.5
Total SPs: 2304 (36 MPs x *** SPs/MP)
Compute throughput: 74***.96 GFlops (theoretical single precision FMAs)
Memory bandwidth: 448.06 GB/sec
-----------------------------------------------------------------------
Total GPU memory 8366784512, free 7941521408
Buffer size: 256MB
Trade-off type: compute with global memory (block strided)
Elements per thread: 8
Thread fusion degree: 4
----------------------------------------------------------------------------- CSV data -----------------------------------------------------------------------------
Experiment ID, Single Precision ops,,,, Double precision ops,,,, Half precision ops,,,, Integer operations,,,
Compute iters, Flops/byte, ex.time, GFLOPS, GB/sec, Flops/byte, ex.time, GFLOPS, GB/sec, Flops/byte, ex.time, GFLOPS, GB/sec, Iops/byte, ex.time, GIOPS, GB/sec
0, 0.250, 0.32, 104.42, 417.68, 0.125, 0.63, 53.04, 424.35, 0.500, 0.32, 211.41, 422.81, 0.250, 0.32, 105.58, 422.30
1, 0.750, 0.32, 316.34, 421.79, 0.375, 0.63, 158.69, 423.18, 1.500, 0.32, 634.22, 422.81, 0.750, 0.32, 317.30, 423.07
2, 1.250, 0.32, 528.46, 422.77, 0.625, 0.78, 215.91, 345.45, 2.500, 0.32, 1055.97, 422.39, 1.250, 0.32, 528.57, 422.86
3, 1.750, 0.32, 738.81, 422.17, 0.875, 1.08, 218.17, 249.34, 3.500, 0.32, 1478.95, 422.56, 1.750, 0.32, 740.59, 423.20
4, 2.250, 0.32, 951.33, 422.81, 1.125, 1.38, 219.57, 195.17, 4.500, 0.32, 1902.66, 422.81, 2.250, 0.32, 950.66, 422.51
5, 2.750, 0.32, 1162.74, 422.81, 1.375, 1.67, 220.38, 160.28, 5.500, 0.32, 2328.52, 423.37, 2.750, 0.32, 1162.74, 422.81
6, 3.250, 0.32, 1374.56, 422.94, 1.625, 1.97, 220.99, 135.99, 6.500, 0.32, 2756.62, 424.10, 3.250, 0.32, 1375.81, 423.32
7, 3.750, 0.32, 1592.45, 424.65, 1.875, 2.27, 221.38, 118.07, 7.500, 0.32, 3169.50, 422.60, 3.750, 0.32, 1585.55, 422.81
8, 4.250, 0.32, 1796.95, 422.81, 2.125, 2.57, 221.71, 104.33, 8.500, 0.32, 3587.76, 422.09, 4.250, 0.37, 1545.63, 363.68
9, 4.750, 0.32, 2006.34, 422.39, 2.375, 2.87, 221.85, 93.41, 9.500, 0.32, 3995.38, 420.57, 4.750, 0.32, 19***.29, 420.69
10, 5.250, 0.32, 2209.52, 420.86, 2.625, 3.17, 222.02, 84.58, 10.500, 0.32, 4439.54, 422.81, 5.250, 0.32, 2220.44, 422.94
11, 5.750, 0.32, 2434.12, 423.32, 2.875, 3.47, 222.17, 77.28, 11.500, 0.32, 4855.01, 422.17, 5.750, 0.32, 2426.77, 422.05
12, 6.250, 0.32, 2638.06, 422.09, 3.125, 3.78, 222.18, 71.10, 12.500, 0.32, 5227.20, 418.18, 6.250, 0.38, 2202.15, 352.34
13, 6.750, 0.32, 2841.95, 421.03, 3.375, 4.08, 222.30, 65.87, 13.500, 0.32, 5712.58, 423.15, 6.750, 0.32, 2850.54, 422.30
14, 7.250, 0.32, 3065.39, 422.81, 3.625, 4.37, 222.45, 61.36, 14.500, 0.32, 6135.74, 423.15, 7.250, 0.32, 3065.08, 422.77
15, 7.750, 0.33, 3143.40, 405.60, 3.875, 4.67, 222.57, 57.44, 15.500, 0.32, 6546.34, 422.34, 7.750, 0.32, 3268.89, 421.79
16, 8.250, 0.32, 3482.59, 422.13, 4.125, 4.***, 222.57, 53.96, 16.500, 0.32, 6957.48, 421.67, 8.250, 0.39, 2803.68, 339.84
17, 8.750, 0.32, 3693.66, 422.13, 4.375, 5.28, 222.53, 50.86, 17.500, 0.32, 7396.24, 422.***, 8.750, 0.32, 3694.77, 422.26
18, 9.250, 0.32, 3901.58, 421.79, 4.625, 5.58, 222.58, 48.12, 18.500, 0.32, 7786.72, 420.90, 9.250, 0.32, 3897.66, 421.37
20, 10.250, 0.32, 4312.53, 420.73, 5.125, 6.18, 222.66, 43.45, 20.500, 0.32, 8***0.66, 421.50, 10.250, 0.41, 3374.54, 329.22
22, 11.250, 0.32, 4729.94, 420.44, 5.625, 6.78, 222.74, 39.60, 22.500, 0.32, 9452.31, 420.10, 11.250, 0.32, 4734.21, 420.82
24, 12.250, 0.32, 5148.83, 420.31, 6.125, 7.36, 223.51, 3***9, 24.500, 0.32,1034***0, 422.30, 12.250, 0.42, 3900.12, 318.38
28, 14.250, 0.32, 6009.94, 421.75, 7.125, 8.53, 224.23, 31.47, 28.500, 0.32,11975.32, 420.19, 14.250, 0.44, 4368.11, 306.53
32, 16.250, 0.32, 6795.36, 418.18, 8.125, 9.72, 224.31, 27.61, 32.500, 0.32,13605.***, 418.***, 16.250, 0.45, 4797.12, 295.21
40, 20.250, 0.34, 7899.43, 390.10, 10.125, 12.11, 224.50, 22.17, 40.500, 0.33,16371.37, 404.23, 20.250, 0.50, 54***.85, 269.87
48, 24.250, 0.41, 8029.04, 331.09, 12.125, 14.49, 224.58, 18.52, 48.500, 0.40,1***68.89, 339.56, 24.250, 0.54, 5***6.22, 246.85
56, 28.250, 0.47, 8114.58, 287.24, 14.125, 16.88, 224.65, 15.90, 56.500, 0.46,1***43.12, 291.03, 28.250, 0.60, 6342.42, 224.51
***, 32.250, 0.53, 8154.47, 252.85, 16.125, 19.26, 224.72, 13.94, ***.500, 0.52,16536.22, 256.38, 32.250, 0.66, 6591.93, 204.40
80, 40.250, 0.66, 8242.80, 204.79, 20.125, 24.03, 224.79, 11.17, 80.500, 0.65,16***4.88, 206.77, 40.250, 0.78, 6909.54, 171.67
96, 48.250, 0.78, 8321.35, 172.46, 24.125, 28.80, 224.85, 9.32, 96.500, 0.78,16685.23, 172.90, 48.250, 0.91, 7108.62, 147.33
128, ***.250, 1.03, 8337.22, 129.76, 32.125, 38.34, 224.91, 7.00, 128.500, 1.03,16775.65, 130.55, ***.250, 1.18, 7295.18, 113.54
192, 96.250, 1.54, 8414.49, 87.42, 48.125, 57.42, 224.97, 4.67, 192.500, 1.53,16847.93, 87.52, 96.250, 1.74, 7431.***, 77.21
256, 128.250, 2.06, 8362.01, 65.20, ***.125, 76.50, 225.02, 3.51, 256.500, 2.06,16693.65, 65.08, 128.250, 2.30, 7477.75, 58.31
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
```
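
The columns in the sample output are related by simple arithmetic, and a small host-side C++ sketch may help when reading the table. In the listed rows, GFLOPS equals Flops/byte multiplied by GB/sec (for example 0.250 × 417.68 ≈ 104.42), and the Flops/byte values are consistent with (2 × compute iters + 1) / bytes per element. That second relation is inferred from the numbers shown here, not taken from the mixbench source. The function names below are hypothetical helpers and not part of mixbench, and the 0.9 threshold is an arbitrary illustrative choice.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of mixbench): operations per byte of global
// memory traffic for a given "Compute iters" value. Assumes one multiply-add
// (2 ops) per iteration plus 1 extra op per element, which matches the
// Flops/byte column of the sample output.
inline double ops_per_byte(int compute_iters, std::size_t bytes_per_element) {
    assert(compute_iters >= 0 && bytes_per_element > 0);
    return (2.0 * compute_iters + 1.0) / static_cast<double>(bytes_per_element);
}

// Compute rate in GFLOPS (or GIOPS) implied by an arithmetic intensity
// (ops/byte) and a measured memory throughput (GB/sec).
inline double achieved_gops(double intensity, double gb_per_sec) {
    return intensity * gb_per_sec;
}

// Rough classification: a run whose measured throughput stays close to the
// peak memory bandwidth is still limited by memory, not by compute.
// The 0.9 factor is an assumption chosen only for illustration.
inline bool is_memory_bound(double gb_per_sec, double peak_gb_per_sec) {
    return gb_per_sec >= 0.9 * peak_gb_per_sec;
}
```

For instance, `ops_per_byte(20, sizeof(float))` gives 10.25, the single-precision Flops/byte value in the row with 20 compute iterations. In the double-precision columns, GFLOPS levels off at roughly 220 after only a few iterations while GB/sec keeps falling, which is the compute-bound regime that `is_memory_bound` would report as `false`.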
