CUCCL
cuda GPU ccl 

Category: GPU/Graphics card
Development tool: CUDA
File size: 903655KB
Downloads: 0
Upload date: 2018-06-12 03:47:34
Uploader: sh-1993
Description: ECE 285 GPU programming, group project

File list:
CUCCL-xbxy (0, 2018-06-12)
CUCCL-xbxy\CMakeLists.txt (1020, 2018-06-12)
CUCCL-xbxy\algo (0, 2018-06-12)
CUCCL-xbxy\algo\CMakeLists.txt (880, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_CPU (0, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_CPU\CUCCL_LE.hpp (5258, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_DPL (0, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_DPL\CUCCL_DPL.cu (9033, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_DPL\CUCCL_DPL.cuh (997, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_LE (0, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_LE\CUCCL_LE.cu (8411, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_LE\CUCCL_LE.cuh (1311, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_LE\CUCCL_LE.hpp (5261, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_NP (0, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_NP\CUCCL_NP.cu (6740, 2018-06-12)
CUCCL-xbxy\algo\CUCCL_NP\CUCCL_NP.cuh (936, 2018-06-12)
CUCCL-xbxy\algo\test (0, 2018-06-12)
CUCCL-xbxy\algo\test\test.cu (3715, 2018-06-12)
CUCCL-xbxy\algo\test\test.png (1076099, 2018-06-12)
CUCCL-xbxy\algo\test\test_cv.cu (1352, 2018-06-12)
CUCCL-xbxy\build (0, 2018-06-12)
CUCCL-xbxy\build\Display CCL Result_screenshot_06.06.2018.png (8928, 2018-06-12)
CUCCL-xbxy\build\Display CCL Result_screenshot_08.06.2018.png (37817, 2018-06-12)
CUCCL-xbxy\build\Display CCL Result_screenshot_08.07.2018.png (64896, 2018-06-12)
CUCCL-xbxy\build\algo (0, 2018-06-12)
CUCCL-xbxy\build\algo\ccl_opencv (983169, 2018-06-12)
CUCCL-xbxy\build\algo\ccl_test (810021, 2018-06-12)
CUCCL-xbxy\build\ccl (0, 2018-06-12)
CUCCL-xbxy\build\ccl\ccl (260835, 2018-06-12)
CUCCL-xbxy\build\ccl\ccl_opencv (985093, 2018-06-12)
CUCCL-xbxy\build\ccl\ccl_test (811989, 2018-06-12)
CUCCL-xbxy\build\ccl\test (811973, 2018-06-12)
CUCCL-xbxy\build\ccl_example (827031, 2018-06-12)
CUCCL-xbxy\build\output.txt (327156, 2018-06-12)
CUCCL-xbxy\build\result2.png (37817, 2018-06-12)
CUCCL-xbxy\build\result3.png (64896, 2018-06-12)
CUCCL-xbxy\build\test.png (4660, 2018-06-12)
CUCCL-xbxy\dataset (0, 2018-06-12)
CUCCL-xbxy\dataset\im22.png (5028, 2018-06-12)
... ...

# CuCCL
## A Benchmark for CUDA Implementation of Connected Component Labeling
### ECE 285 GPU Programming. 2018 Spring.

* Implements three GPU-accelerated CCL algorithms in CUDA.
* Compares their performance with a CPU implementation.
* Validates and tests the results through visualization.

## Requirements
* CMake 3.0.0 or higher (https://cmake.org/download/)
* OpenCV 3.0 or higher (http://opencv.org/downloads.html)
* CUDA 8.0 or higher (https://developer.nvidia.com/cuda-downloads)

## Test Dataset
* We use a subset of the binary test images from the _YACCLAB_ dataset to validate our implementation. For more details, see http://imagelab.ing.unimore.it/yacclab .

![alt text](https://github.com/dataset/im2200.png)![alt text](https://github.com/dataset/im2201.png)

## Examples
CUCCL evaluates the performance of each CUDA algorithm by comparing elapsed time, and checks correctness by comparing the number of labels. In addition, we use OpenCV to colorize the labels so that adjacent components can be told apart. You can enable any of these features by uncommenting the corresponding macros defined in _/example/kernel_evaluation.cpp_ .

    #define SAVEFILE 1          // Save the output files
    #define RUNTEST 1           // Perform the whole test
    #define RUNTIMETEST 1       // Perform timing check
    #define CORRECTNESSTEST 1   // Perform correctness check
    #define VISUALIZATION 1     // Visualization

To build and run the visualization example:

    mkdir build
    cd build
    cmake ..
    make
    ./ccl_example LE ${PATH_OF_CUCCL}/dataset/ ${PATH_OF_CUCCL}/images/ ${IMAGE_NAME_TO_VISUALIZE}

Alternatively, to run more tests:

    ./ccl_example LE ${PATH_OF_CUCCL}/dataset/ ${PATH_OF_CUCCL}/images/ $(ls ../dataset)

## Algorithms
* Kernel A – Neighbour Propagation
  A very simple multi-pass labelling method. It parallelises labelling by creating one thread per cell; each thread loads the field and label data of its own cell and of the neighbouring cells.
* Kernel B – Directional Propagation Labelling
  Kernel B is designed to overcome the limitation that a label can only propagate by one cell per iteration (Kernel A) or one block per iteration.
* Kernel C – Label Equivalence
  A multi-pass algorithm that records and resolves label equivalences.

For more details: [Parallel graph component labelling with GPUs and CUDA](https://www.sciencedirect.com/science/article/pii/S0167819110001055)

## Results
One can check the results by running the examples above.

### Visualization
Some visualization examples are shown below. The original images are on the left and the colorized results are on the right.

![alt text](https://github.com/images/result1.png)
![alt text](https://github.com/images/result2.png)
![alt text](https://github.com/images/result3.png)

### Performance
The performance of the implementation can be read from the output logs, e.g.:

    ...
    Testing CCL on image :/home/xib008/xib008/CUCCL/dataset/im229.png
    @ Time elapsed for connectivity 4, GPU : 3.18518 ms
    @ Time elapsed for connectivity 8, GPU : 3.039 ms
    @ Time elapsed for connectivity 4, CPU : 27.4683 ms
    @ Time elapsed for connectivity 8, CPU : 46.0675 ms
    ======================= TEST PASS =====================
    Testing CCL on image :/home/xib008/xib008/CUCCL/dataset/im22.png
    @ Time elapsed for connectivity 4, GPU : 3.28617 ms
    @ Time elapsed for connectivity 8, GPU : 3.52***3 ms
    @ Time elapsed for connectivity 4, CPU : 26.01*** ms
    @ Time elapsed for connectivity 8, CPU : 51.9001 ms
    ======================= TEST PASS =====================
    Validation Summary :
    @ Algorithm : LE
    @ Total Test Images : 1111
    @ Total Pass Images : 1111
    @ Average Time per Image Connection-4, GPU (ms) : 3.89012
    @ Average Time per Image Connection-8, GPU (ms) : 3.71094
    @ Average Time per Image Connection-4, CPU (ms) : 37.135
    @ Average Time per Image Connection-8, CPU (ms) : 60.3834

### Outputs
The output of the program is a text file. Each row lists the pixel indexes of one blob. Each index is the flattened, row-major index of a pixel, so the 2-D pixel indexes increase from the top-left corner towards the bottom-right corner. Examples are shown in _CUCCL/images/*.txt_

## References
1. K. Hawick, A. Leist and D. Playne, Parallel graph component labelling with GPUs and CUDA, Parallel Computing 36 (12) 655-678 (2010)
2. O. Kalentev, A. Rai, S. Kemnitz and R. Schneider, Connected component labeling on a 2D grid using CUDA, J. Parallel Distrib. Comput. 71 (4) 615-620 (2011)
3. V. M. A. Oliveira and R. A. Lotufo, A study on connected components labeling algorithms using GPUs, SIBGRAPI (2010)
4. https://github.com/foota/ccl
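
The output format described under Outputs (one row per blob, each entry a flattened row-major pixel index) can be illustrated with a small C++ sketch. The function names `flattenIndex` and `blobsToText` are hypothetical and do not come from the CUCCL sources. Two further assumptions are made only for illustration: label 0 marks background pixels, and indexes within a row are separated by a single space. The actual files in _CUCCL/images/_ may follow different conventions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Row-major flattening: the index grows left to right, then top to bottom.
std::size_t flattenIndex(std::size_t row, std::size_t col, std::size_t width) {
    assert(col < width);
    return row * width + col;
}

// Hypothetical helper: turns a label image into one text row per blob.
// Assumes label 0 is background and uses a space separator; the real
// CUCCL output files may differ.
std::string blobsToText(const std::vector<int>& labels,
                        std::size_t width, std::size_t height) {
    assert(labels.size() == width * height);
    std::map<int, std::vector<std::size_t>> blobs;  // label -> pixel indexes
    for (std::size_t r = 0; r < height; ++r) {
        for (std::size_t c = 0; c < width; ++c) {
            std::size_t idx = flattenIndex(r, c, width);
            if (labels[idx] != 0) blobs[labels[idx]].push_back(idx);
        }
    }
    std::ostringstream out;
    for (const auto& blob : blobs) {
        for (std::size_t i = 0; i < blob.second.size(); ++i) {
            if (i > 0) out << ' ';
            out << blob.second[i];
        }
        out << '\n';  // one row per blob
    }
    return out.str();
}
```

For a 3x2 label image `{1, 0, 2, 1, 0, 2}`, this sketch produces two rows, `0 3` and `2 5`: the left column forms one blob and the right column another.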

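The neighbour propagation idea behind Kernel A (one thread per cell, each reading the field and label data of its own cell and its neighbours, repeated over several passes) can be sketched sequentially in C++. This is a CPU emulation written for illustration, not the CUDA kernel in `CUCCL_NP.cu`: each loop iteration stands in for one GPU thread, and the names `propagateOnce` and `labelByNeighbourPropagation` are hypothetical. It assumes 4-connectivity, initial labels equal to each cell's flattened index, and a minimum-label update rule, following the general scheme of Hawick et al.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pass: every cell takes the smallest label among itself and its
// 4-connected neighbours that hold the same field value. Reads come from
// `in` and writes go to `out`, mimicking independent per-cell threads.
// Returns true if any label changed during the pass.
bool propagateOnce(const std::vector<int>& field, const std::vector<int>& in,
                   std::vector<int>& out, int width, int height) {
    static const int dx[4] = {1, -1, 0, 0};
    static const int dy[4] = {0, 0, 1, -1};
    bool changed = false;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {  // loop body plays one GPU thread
            int idx = y * width + x;
            int best = in[idx];
            for (int k = 0; k < 4; ++k) {
                int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                int nidx = ny * width + nx;
                if (field[nidx] == field[idx] && in[nidx] < best) best = in[nidx];
            }
            out[idx] = best;
            if (best != in[idx]) changed = true;
        }
    }
    return changed;
}

// Labels start as each cell's flattened index; passes repeat until stable.
std::vector<int> labelByNeighbourPropagation(const std::vector<int>& field,
                                             int width, int height) {
    assert(field.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> labels(field.size()), next(field.size());
    for (std::size_t i = 0; i < labels.size(); ++i) labels[i] = static_cast<int>(i);
    while (propagateOnce(field, labels, next, width, height)) labels.swap(next);
    return labels;
}
```

Because a label moves at most one cell per pass, the number of passes grows with the length of the longest path inside a component, which is the limitation that Kernel B is described as addressing. Note that this sketch labels every cell, including those with field value 0.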