SILENTARMY is a GPU Zcash Equihash solver that runs well on AMD GPUs and implements the CLI API.

# SILENTARMY

Official site: https://github.com/mbevand/silentarmy

SILENTARMY is a free open source [Zcash](https://z.cash) miner for Linux with multi-GPU and [Stratum](https://github.com/str4d/zips/blob/77-zip-stratum/drafts/str4d-stratum/draft1.rst) support. It is written in OpenCL and has been tested on AMD/Nvidia/Intel GPUs, Xeon Phi, and more.

After compiling SILENTARMY, list the available OpenCL devices:

```
$ silentarmy --list
```

Start mining with two GPUs (ID 2 and ID 5) on a pool:

```
$ silentarmy --use 2,5 -c stratum+tcp://us1-zcash.flypool.org:3333 -u t1cVviFvgJinQ4w3C2m2CfRxgP5DnHYaoFC
```

When run without options, SILENTARMY mines with the first OpenCL device, using my donation address, on flypool:

```
$ silentarmy
Connecting to us1-zcash.flypool.org:3333
Stratum server sent us the first job
Mining on 1 device
Total 0.0 sol/s [dev0 0.0] 0 shares
Total 43.9 sol/s [dev0 43.9] 0 shares
Total 46.9 sol/s [dev0 46.9] 0 shares
Total 44.9 sol/s [dev0 44.9] 1 share
[...]
```

Usage:

```
$ silentarmy --help
Usage: silentarmy [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         verbose mode (may be repeated for more verbosity)
  --debug               enable debug mode (for developers only)
  --list                list available OpenCL devices by ID (GPUs...)
  --use=LIST            use specified GPU device IDs to mine, for example to
                        use the first three: 0,1,2 (default: 0)
  --instances=N         run N instances of Equihash per GPU (default: 2)
  -c POOL, --connect=POOL
                        connect to POOL, for example:
                        stratum+tcp://example.com:1234
  -u USER, --user=USER  username for connecting to the pool
  -p PWD, --pwd=PWD     password for connecting to the pool
```

# Equihash solver

SILENTARMY also provides a command line Equihash solver (`sa-solver`) implementing the CLI API described in the [Zcash open source miner challenge](https://zcashminers.org/rules).
To solve a specific block header and print the encoded solution on stdout, run the following command (this header is from [mainnet block #3400](https://explorer.zcha.in/blocks/00000001687e89e7e1ce48b349e601c89c70dd4c268fdf24b269a3ca4140426f) and should result in 1 Equihash solution):

```
$ sa-solver -i 04000000e54c27544050668f272ec3b460e1cde745c6b21239a81dae637fde4704000000844bc0c55696ef9920eeda11c1eb41b0c2e7324b46cc2e7aa0c2aa7736448d7a000000000000000000000000000000000000000000000000000000000000000068241a587e7e061d250e000000000000010000000000000000000000000000000000000000000000
```

If the option `-i` is not specified, `sa-solver` solves a 140-byte header of all zero bytes. The option `--nonces <nr>` instructs the program to try multiple nonces, each time incrementing the nonce by 1. So a convenient way to run a quick test/benchmark is simply:

`$ sa-solver --nonces 100`

Note: due to BLAKE2b optimizations in my implementation, if the header is specified it must be 140 bytes and its last 12 bytes **must** be zero.

Use the verbose (`-v`) and very verbose (`-v -v`) options to show the solutions and statistics in progressively more detail.

# Performance

* 102.0 sol/s with one R9 Nano
* 72.0 sol/s with one RX 480 8GB
* (TODO: benchmark Nvidia GPUs)

Note: the `silentarmy` **miner** automatically achieves this performance level. The `sa-solver` **command-line solver**, by design, runs only 1 instance of the Equihash proof-of-work algorithm, which causes it to underperform by 5-10%. One must manually run 2 instances of `sa-solver` (eg. in 2 terminal consoles) to achieve the same performance level as the `silentarmy` **miner**.

# Dependencies

SILENTARMY has only one build dependency: an OpenCL implementation. And it has only one runtime dependency: Python 3.3 or later (needed to support the use of the `yield from` syntax).

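
For readers unfamiliar with the syntax behind the Python 3.3 requirement, here is a minimal sketch of what `yield from` does: it lets one generator delegate to another. The function names `read_lines` and `read_all` are hypothetical and are not taken from the `silentarmy` script.

```python
def read_lines(lines):
    """Hypothetical inner generator: yields stripped, non-empty lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def read_all(sources):
    """Hypothetical outer generator that delegates to read_lines()."""
    for source in sources:
        # Behaves like looping over read_lines(source) and re-yielding
        # each item; the interpreter handles the delegation.
        yield from read_lines(source)


if __name__ == "__main__":
    for item in read_all([[" job 1 ", ""], ["job 2\n"]]):
        print(item)
```

On an interpreter older than 3.3 this file would fail to parse at all, which is why the version requirement is a hard one.
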
When running on AMD GPUs, install the **AMD APP SDK** (OpenCL implementation) and either:

* the **AMDGPU-PRO** driver (amdgpu.ko, for newer GPUs), or
* the **Radeon Software Crimson Edition** driver (fglrx.ko, for older GPUs)

When running on Nvidia GPUs, install the Nvidia OpenCL development files, and their binary driver.

Instructions are provided below for a few Linux versions.

## Ubuntu 16.04 / amdgpu

1. Download the [AMDGPU-PRO Driver](http://support.amd.com/en-us/kb-articles/Pages/AMDGPU-PRO-Install.aspx) (as of 30 Oct 2016, the latest version is 16.40)
2. Extract it: `$ tar xf amdgpu-pro-16.40-348864.tar.xz`
3. Install (non-root, will use sudo access automatically): `$ ./amdgpu-pro-install`
4. Add yourself to the video group if not already a member: `$ sudo gpasswd -a $(whoami) video`
5. Reboot
6. Download the [AMD APP SDK](http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/) (as of 27 Oct 2016, the latest version is 3.0)
7. Extract it: `$ tar xf AMD-APP-SDKInstaller-v3.0.130.136-GA-linux64.tar.bz2`
8. Install system-wide by running as root (accept all the default options): `$ sudo ./AMD-APP-SDK-v3.0.130.136-GA-linux64.sh`
9. Install compiler dependencies in order to compile SILENTARMY: `$ sudo apt-get install build-essential`

## Ubuntu 14.04 / fglrx

1. Install the official Ubuntu package: `$ sudo apt-get install fglrx` (as of 30 Oct 2016, the latest version is 2:15.201-0ubuntu0.14.04.1)
2. Follow steps 5-9 above.

## Ubuntu 16.04 / Nvidia

1. Install the OpenCL development files and the latest driver: `$ sudo apt-get install nvidia-opencl-dev nvidia-361`
2. Either reboot, or load the kernel driver: `$ modprobe nvidia_361`
3. Install compiler dependencies in order to compile SILENTARMY: `$ sudo apt-get install build-essential`

## Arch Linux

1. Install the [silentarmy AUR package](https://aur.archlinux.org/packages/silentarmy/).

# Compilation and installation

Compiling SILENTARMY is easy:

`$ make`

You may need to specify the paths to the locations of your OpenCL C headers and libOpenCL.so if the compiler does not find them, eg.:

`$ make OPENCL_HEADERS=/usr/local/cuda-8.0/targets/x86_64-linux/include LIBOPENCL=/usr/local/cuda-8.0/targets/x86_64-linux/lib`

Self-testing the command-line solver (solves 100 all-zero 140-byte blocks with their nonces varying from 0 to 99):

`$ make test`

For more testing run `sa-solver --nonces 10000`. It should find 18681 solutions, which is less than 1% off the theoretical expected average of 1.88 solutions per Equihash run at (n,k)=(200,9).

To install, just copy `silentarmy` and `sa-solver` to the same directory.

# Implementation details

The `silentarmy` Python script is mostly a lightweight Stratum implementation and job dispatcher that sends Equihash work items to 1 or more instances of `sa-solver --mining`, which initializes the solver in a special "mining mode" so it can be controlled via stdin/stdout. By default 2 instances of `sa-solver` are launched for each GPU (this can be changed with the `silentarmy --instances N` option). Running 2 instances per GPU usually results in the best performance.

The `sa-solver` binary invokes the OpenCL kernel which contains the core of the Equihash algorithm. My implementation uses two hash tables to avoid having to sort the (Xi,i) pairs:

* Round 0 (BLAKE2b) fills up table #0
* Round 1 reads table #0, identifies collisions, XORs the Xi's, stores the results in table #1
* Round 2 reads table #1 and fills up table #0 (reusing it)
* Round 3 reads table #0 and fills up table #1 (also reusing it)
* ...
* Round 8 (last round) reads table #1 and fills up table #0.

Only the non-zero parts of Xi are stored in the hash table, so fewer and fewer bytes are needed to store Xi as we progress toward round 8. For a description of the layout of the hash table, see the comment at the top of `input.cl`.

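
The alternation between the two tables can be illustrated with a toy Python sketch. This is not the OpenCL kernel: the helper names (`tables_for_round`, `collide`, `run_rounds`) are hypothetical, the sketch ignores the index half of the (Xi,i) pairs, and it collides on the low bits of small integers purely for simplicity. It only shows which table each round reads and writes, and how the bits that cancel out are dropped before storing.

```python
from collections import defaultdict
from itertools import combinations


def tables_for_round(rnd):
    """Return (read_table, write_table); round 0 only writes table #0."""
    if rnd == 0:
        return None, 0
    return (rnd - 1) % 2, rnd % 2


def collide(entries, bits):
    """Toy collision step (assumption: collide on the low `bits` bits).

    Values sharing those bits are XORed pairwise; the XOR zeroes the
    shared bits, so they are shifted out and only the rest is stored.
    """
    mask = (1 << bits) - 1
    buckets = defaultdict(list)
    for value in entries:
        buckets[value & mask].append(value)
    results = []
    for bucket in buckets.values():
        for a, b in combinations(bucket, 2):
            results.append((a ^ b) >> bits)
    return results


def run_rounds(initial, bits, rounds):
    """Run `rounds` rounds (numbered from 0), ping-ponging two tables."""
    tables = [list(initial), []]  # round 0 fills table #0
    for rnd in range(1, rounds):
        src, dst = tables_for_round(rnd)
        tables[dst] = collide(tables[src], bits)  # overwrite = reuse
    return tables[tables_for_round(rounds - 1)[1]]


if __name__ == "__main__":
    for rnd in range(9):
        print(rnd, tables_for_round(rnd))
    print(run_rounds([0b1101, 0b0101, 0b0010], bits=2, rounds=2))
```

With 9 rounds numbered 0 to 8, `tables_for_round(8)` gives `(1, 0)`, matching the last bullet above.
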
Also the code implements the notion of "encoded reference to inputs" which I--apparently like most authors of Equihash solvers--independently discovered as a neat t
