turingas
Category: GPU / graphics cards
Development tool: Python
File size: 26 KB
Downloads: 0
Upload date: 2022-01-13 04:49:58
Uploader: sh-1993
Description: Assembler for NVIDIA Volta and Turing GPUs
File list:
LICENSE.txt (1071, 2022-01-13)
examples (0, 2022-01-13)
examples\bench (0, 2022-01-13)
examples\bench\smem (0, 2022-01-13)
examples\bench\smem\Makefile (129, 2022-01-13)
examples\bench\smem\lds32.sass (778, 2022-01-13)
examples\bench\smem\main.cu (1021, 2022-01-13)
examples\copy_element (0, 2022-01-13)
examples\copy_element\Makefile (165, 2022-01-13)
examples\copy_element\copy_element.sass (618, 2022-01-13)
examples\copy_element\main.cu (1278, 2022-01-13)
setup.py (191, 2022-01-13)
tools (0, 2022-01-13)
tools\disasm.py (3857, 2022-01-13)
turingas (0, 2022-01-13)
turingas\ELF.py (4053, 2022-01-13)
turingas\__init__.py (0, 2022-01-13)
turingas\cubin.py (15406, 2022-01-13)
turingas\grammar.py (24706, 2022-01-13)
turingas\include (0, 2022-01-13)
turingas\include\helper.py (2546, 2022-01-13)
turingas\main.py (1351, 2022-01-13)
turingas\turas.py (13082, 2022-01-13)
# TuringAs
An open source SASS assembler for NVIDIA Volta, Turing, and Ampere GPUs.
## Requirements:
* Python >= 3.6
## Supported hardware:
All NVIDIA Volta (SM70), Turing (SM75), and Ampere (SM80) GPUs.
## Other features:
* Include files.
* Inline python code.
## Install the library
```bash
python setup.py install
```
## Use the library
```bash
python -m turingas.main -i <input.sass> -o <output.cubin> -arch <arch>
# E.g.
python -m turingas.main -i input.sass -o output.cubin -arch 75 # 75 for Turing
# or 70 (Volta), 80 (Ampere)
```
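When the assembler is called from a build script, it can help to assemble the command line in one place. The sketch below builds the argument list for the invocation shown above; it is only an illustration. The helper name `build_turingas_command` and the `SUPPORTED_ARCHS` table are hypothetical and not part of TuringAs. Only the `-i`, `-o`, and `-arch` flags and the architecture numbers 70, 75, and 80 come from this README.

```python
import sys

# Architecture numbers as listed in this README:
# 70 (Volta), 75 (Turing), 80 (Ampere).
SUPPORTED_ARCHS = {"volta": 70, "turing": 75, "ampere": 80}


def build_turingas_command(input_sass, output_cubin, arch):
    """Hypothetical helper: return the argv list for `python -m turingas.main`.

    `arch` may be a name ("turing") or a number (75 or "75").
    """
    if isinstance(arch, str) and not arch.isdigit():
        key = arch.lower()
        if key not in SUPPORTED_ARCHS:
            raise ValueError(f"unknown architecture name: {arch!r}")
        arch_number = SUPPORTED_ARCHS[key]
    else:
        arch_number = int(arch)
        if arch_number not in SUPPORTED_ARCHS.values():
            raise ValueError(f"unsupported architecture number: {arch_number}")

    # Same flags as in the usage example above: -i, -o, -arch.
    return [
        sys.executable, "-m", "turingas.main",
        "-i", str(input_sass),
        "-o", str(output_cubin),
        "-arch", str(arch_number),
    ]


if __name__ == "__main__":
    cmd = build_turingas_command("input.sass", "output.cubin", "turing")
    print(" ".join(cmd[1:]))
    # To actually run the assembler (requires turingas to be installed):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Rejecting unknown architecture values before the assembler is launched keeps a mistyped `-arch` from surfacing later as a confusing downstream error.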
## Related projects
* [AsFermi](https://github.com/hyqneuron/asfermi), a SASS assembler for NVIDIA Fermi GPUs. By Hou Yunqing.
* [MaxAs](https://github.com/NervanaSystems/maxas), a SASS assembler for NVIDIA Maxwell and Pascal GPUs. By Scott Gray.
* [KeplerAs](https://github.com/PAA-NCIC/PPoPP2017_artifact), a SASS assembler for NVIDIA Kepler GPUs. By Xiuxia Zhang.
## TODO list
Support for the following instructions is still to be added (checked items are already supported):
- [ ] Type conversion instructions (I2I, I2F, F2I, F2F)
- [X] MUFU # Multifunction. E.g., sin, cos.
- [ ] LDSM # Load matrix from shared memory.
- [ ] Texture instructions
- [ ] Surface instructions
- [ ] Unified data path instructions
- [ ] Other (ISCADD, CALL, JMP ...)
- [ ] New Ampere instructions.
## Citation
If you find this tool helpful, please cite:
```
@inproceedings{yan2020winograd-conv,
author = {Da Yan and
Wei Wang and
Xiaowen Chu},
title = {Optimizing Batched Winograd Convolution on GPUs},
booktitle = {25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '20)},
year = {2020},
address = {San Diego, CA, USA},
publisher = {ACM},
}
```
This project is released under the MIT License.
-- Da Yan