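When programming with OpenCL, you need to understand the underlying GPU device in order to optimize your code well.
Examples include the number of compute units (CUs), the number of processing elements (PEs) in each CU, and the size of the shared memory (called local memory in OpenCL) in each CU. Only with these figures in hand can you make good use of shared memory and choose sensible launch parameters for a kernel.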
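As a rough illustration of how such figures feed into launch parameters, the sketch below shows two small calculations. The function names and the sample numbers are hypothetical and do not come from the original post. It assumes OpenCL 1.x rules, under which the global work size must be a multiple of the local work size whenever a local size is given.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round the number of work-items up to the next
   multiple of the work-group (local) size, as OpenCL 1.x requires. */
size_t round_up_global_size(size_t work_items, size_t local_size)
{
    assert(local_size > 0);
    return (work_items + local_size - 1) / local_size * local_size;
}

/* Hypothetical helper: bytes of local memory available to each work-item
   when one work-group of local_size items shares local_mem_bytes. */
size_t local_bytes_per_item(size_t local_mem_bytes, size_t local_size)
{
    assert(local_size > 0);
    return local_mem_bytes / local_size;
}
```

For instance, with a work-group size of 256, 1000 work-items would be padded to a global size of 1024, and the kernel would then need a bounds check so that the extra 24 work-items do nothing.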
1.clGetDeviceInfo
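OpenCL uses the clGetDeviceInfo function to obtain detailed device information. Its prototype is:
cl_int clGetDeviceInfo (
cl_device_id device, // device ID
cl_device_info param_name, // enum value naming the device property to query
size_t param_value_size, // size in bytes of the buffer pointed to by param_value
void *param_value, // buffer that receives the queried value
size_t *param_value_size_ret // actual size in bytes of the returned data
);
For variable-length results such as the device name, this function is called twice: the first call, with param_value set to NULL, returns the required size through param_value_size_ret, and the second call retrieves the value itself. Fixed-size values such as a cl_uint can be fetched with a single call.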
2.Code example
2.1 tool.h and tool.cpp
See: http://www.cnblogs.com/xudong-bupt/p/3582780.html
2.2 QueryDeviceInfo.cpp
#include <stdio.h>
#include <stdlib.h>
#include "tool.h"
#include <CL/cl.h>

int main()
{
    /// Get the first available platform
    cl_platform_id platform;
    getPlatform(platform);

    /// Get the first available GPU
    cl_device_id *devices = getCl_device_id(platform);

    char *value;
    size_t valueSize = 0;
    size_t maxWorkItemPerGroup = 0;
    cl_uint maxComputeUnits = 0;
    cl_ulong maxGlobalMemSize = 0;
    cl_ulong maxConstantBufferSize = 0;
    cl_ulong maxLocalMemSize = 0;

    /// Print the device name: the first call gets the size, the second gets the string
    clGetDeviceInfo(devices[0], CL_DEVICE_NAME, 0, NULL, &valueSize);
    value = (char*) malloc(valueSize);
    clGetDeviceInfo(devices[0], CL_DEVICE_NAME, valueSize, value, NULL);
    printf("Device Name: %s\n", value);
    free(value);

    /// Print the number of parallel compute units (CU)
    clGetDeviceInfo(devices[0], CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(maxComputeUnits), &maxComputeUnits, NULL);
    printf("Parallel compute units: %u\n", maxComputeUnits);

    /// Print the maximum number of work-items per work-group (size_t needs %zu)
    clGetDeviceInfo(devices[0], CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(maxWorkItemPerGroup), &maxWorkItemPerGroup, NULL);
    printf("maxWorkItemPerGroup: %zu\n", maxWorkItemPerGroup);

    /// Print the global memory size (cl_ulong is 64-bit, so cast and use %llu)
    clGetDeviceInfo(devices[0], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(maxGlobalMemSize), &maxGlobalMemSize, NULL);
    printf("maxGlobalMemSize: %llu(MB)\n", (unsigned long long)(maxGlobalMemSize/1024/1024));

    /// Print the maximum constant buffer size
    clGetDeviceInfo(devices[0], CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, sizeof(maxConstantBufferSize), &maxConstantBufferSize, NULL);
    printf("maxConstantBufferSize: %llu(KB)\n", (unsigned long long)(maxConstantBufferSize/1024));

    /// Print the local memory size
    clGetDeviceInfo(devices[0], CL_DEVICE_LOCAL_MEM_SIZE, sizeof(maxLocalMemSize), &maxLocalMemSize, NULL);
    printf("maxLocalMemSize: %llu(KB)\n", (unsigned long long)(maxLocalMemSize/1024));

    free(devices);
    return 0;
}
Execution result:
3.Other
On a platform with OpenCL installed, you can also run the clinfo command to list the platform and device information.
The OpenCL Specification : https://www.khronos.org/registry/cl/specs/opencl-1.2.pdf
Author: 旭东的博客 (Xudong's blog), cnblogs