turingas

Category: GPU/Graphics cards
Development tool: Python
File size: 26KB
Downloads: 0
Upload date: 2022-01-13 04:49:58
Uploader: sh-1993
Description: Assembler for NVIDIA Volta and Turing GPUs

File list:
LICENSE.txt (1071, 2022-01-13)
examples (0, 2022-01-13)
examples\bench (0, 2022-01-13)
examples\bench\smem (0, 2022-01-13)
examples\bench\smem\Makefile (129, 2022-01-13)
examples\bench\smem\lds32.sass (778, 2022-01-13)
examples\bench\smem\main.cu (1021, 2022-01-13)
examples\copy_element (0, 2022-01-13)
examples\copy_element\Makefile (165, 2022-01-13)
examples\copy_element\copy_element.sass (618, 2022-01-13)
examples\copy_element\main.cu (1278, 2022-01-13)
setup.py (191, 2022-01-13)
tools (0, 2022-01-13)
tools\disasm.py (3857, 2022-01-13)
turingas (0, 2022-01-13)
turingas\ELF.py (4053, 2022-01-13)
turingas\__init__.py (0, 2022-01-13)
turingas\cubin.py (15406, 2022-01-13)
turingas\grammar.py (24706, 2022-01-13)
turingas\include (0, 2022-01-13)
turingas\include\helper.py (2546, 2022-01-13)
turingas\main.py (1351, 2022-01-13)
turingas\turas.py (13082, 2022-01-13)

# TuringAs

An open source SASS assembler for NVIDIA Volta, Turing, and Ampere GPUs.

## Requirements:

* Python >= 3.6

## Supported hardware:

All NVIDIA Volta (SM70), Turing (SM75), and Ampere (SM80) GPUs.

## Other features:

* Include files.
* Inline Python code.

## Install the library

```
python setup.py install
```

## Use the library

```bash
python -m turingas.main -i -o -arch
# E.g.
python -m turingas.main -i input.sass -o output.cubin -arch 75  # 75 for Turing
# or 70 (Volta), 80 (Ampere)
```

## Related projects

* [AsFermi](https://github.com/hyqneuron/asfermi), a SASS assembler for NVIDIA Fermi GPUs. By Hou Yunqing.
* [MaxAs](https://github.com/NervanaSystems/maxas), a SASS assembler for NVIDIA Maxwell and Pascal. By Scott Gray.
* [KeplerAs](https://github.com/PAA-NCIC/PPoPP2017_artifact), a SASS assembler for NVIDIA Kepler. By Xiuxia Zhang.

## TODO list

To support the following instructions:

- [ ] Type conversion instructions (I2I, I2F, F2I, F2F)
- [X] MUFU # Multifunction. E.g., sin, cos.
- [ ] LDSM # Load matrix from shared memory.
- [ ] Texture instructions
- [ ] Surface instructions
- [ ] Unified data path instructions
- [ ] Other (ISCADD, CALL, JMP ...)
- [ ] New Ampere instructions.

## Citation

If you find this tool helpful, please cite:

```
@inproceedings{yan2020winograd-conv,
  author = {Da Yan and Wei Wang and Xiaowen Chu},
  title = {Optimizing Batched Winograd Convolution on GPUs},
  booktitle = {25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '20)},
  year = {2020},
  address = {San Diego, CA, USA},
  publisher = {ACM},
}
```

This project is released under the MIT License.

-- Da Yan
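
For build scripts, the command line shown under "Use the library" can be wrapped in a few lines of Python. The sketch below is only an illustration: `build_turingas_command` and `assemble` are hypothetical helper names that are not part of TuringAs, and the only behavior assumed is the documented `python -m turingas.main -i ... -o ... -arch ...` invocation with the three architecture values listed above.

```python
import subprocess
import sys

# Architecture values named in the README: 70 (Volta), 75 (Turing), 80 (Ampere).
SUPPORTED_ARCHS = {70: "Volta", 75: "Turing", 80: "Ampere"}


def build_turingas_command(input_path, output_path, arch):
    """Hypothetical helper: build the argv list for the documented CLI call."""
    if arch not in SUPPORTED_ARCHS:
        raise ValueError(
            f"unsupported arch {arch}; expected one of {sorted(SUPPORTED_ARCHS)}"
        )
    return [
        sys.executable, "-m", "turingas.main",
        "-i", str(input_path),
        "-o", str(output_path),
        "-arch", str(arch),
    ]


def assemble(input_path, output_path, arch=75):
    """Hypothetical helper: run the assembler as a subprocess.

    check=True makes subprocess.run raise CalledProcessError if the
    process exits with a non-zero status.
    """
    subprocess.run(build_turingas_command(input_path, output_path, arch), check=True)


if __name__ == "__main__":
    # Show the arguments that would follow the Python interpreter path.
    print(" ".join(build_turingas_command("input.sass", "output.cubin", 75)[1:]))
```

Calling `assemble("input.sass", "output.cubin", 75)` requires TuringAs to be installed in the same Python environment; building the argument list separately makes it easy to check the command before running it.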
