• Writing your first CUDA program with VS2015 + CUDA 9.2


    Reference blog: https://www.pianshen.com/article/943122806/

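    1. Create a new project

      Open VS2015 → New Project → Other → "Empty Project"

    2. Set the platform type in Configuration Manager

      Right-click the project → Properties → Configuration Manager → set every platform to "x64"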

     

    3. Configure the build customization

      Right-click the project → Build Dependencies → Build Customizations → check "CUDA 9.2xxx"

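    4. Configure the basic include and library directories (assuming CUDA 9.2 is installed in the default path)

      Right-click the project → Properties → Configuration Properties → VC++ Directories → Include Directories, and add the following directory:

    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include

      …… → Library Directories, and add the following directory:

    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\lib\x64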

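    5. Configure the CUDA static library path

      Right-click the project → Properties → Configuration Properties → Linker → General → Additional Library Directories, and add the following directory:

    • $(CUDA_PATH_V9_2)\lib\$(Platform)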
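
      The `$(CUDA_PATH_V9_2)` macro is resolved from an environment variable of the same name, which the CUDA 9.2 installer normally sets. As a quick sanity check, a sketch like the following builds the directory that the macro expression should expand to. The helper names `join_lib_dir` and `cuda_lib_dir` are illustrative only and are not part of the CUDA toolkit.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: joins a toolkit root and a platform name the same way
// the expression $(CUDA_PATH_V9_2)\lib\$(Platform) does.
std::string join_lib_dir(const std::string& root, const std::string& platform) {
    assert(!platform.empty());
    return root + "\\lib\\" + platform;
}

// Hypothetical helper: reads CUDA_PATH_V9_2 from the environment and returns
// the library directory, or an empty string if the variable is not set.
// Note: MSVC may warn that getenv is deprecated (C4996); the call itself is
// standard C++.
std::string cuda_lib_dir(const std::string& platform) {
    const char* root = std::getenv("CUDA_PATH_V9_2");
    if (root == nullptr || root[0] == '\0') {
        return std::string();
    }
    return join_lib_dir(root, platform);
}
```

      Printing `cuda_lib_dir("x64")` from a small test program should show the library directory added in step 4 if the toolkit is in its default location. An empty result suggests the environment variable is missing and the macro will not resolve.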

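    6. Select the CUDA static libraries (copy the list as-is)

      Right-click the project → Properties → Configuration Properties → Linker → Input → Additional Dependencies, and add the following libraries:

    • cublas.lib;cublas_device.lib;cuda.lib;cudadevrt.lib;cudart.lib;cudart_static.lib;cufft.lib;cufftw.lib;curand.lib;cusolver.lib;cusparse.lib;nppc.lib;nppial.lib;nppicc.lib;nppicom.lib;nppidei.lib;nppif.lib;nppig.lib;nppim.lib;nppist.lib;nppisu.lib;nppitc.lib;npps.lib;nvblas.lib;nvcuvid.lib;nvgraph.lib;nvml.lib;nvrtc.lib;OpenCL.lib;

      These are the libraries found in the library directory added in step 4, "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\lib\x64". The test program in step 8 only calls CUDA runtime functions, so cudart.lib is the one it actually depends on.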

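    7. Configure the source file type

      Right-click Source Files → Add → New Item → select "CUDA C/C++ File"

      Right-click the "xxx.cu" source file → Properties → Configuration Properties → General → Item Type → set to "CUDA C/C++"

    8. Copy the test program into the .cu file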

    #include "cuda_runtime.h"
    #include "device_launch_parameters.h"

    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        // Ask the runtime how many CUDA devices are visible.
        int deviceCount = 0;
        cudaError_t err = cudaGetDeviceCount(&deviceCount);
        if (err != cudaSuccess) {
            printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
            system("pause");
            return 1;
        }

        for (int dev = 0; dev < deviceCount; dev++)
        {
            int driver_version(0), runtime_version(0);
            cudaDeviceProp deviceProp;
            cudaGetDeviceProperties(&deviceProp, dev);

            // Older CUDA samples treat compute capability 9999.9999 on device 0
            // as an emulation device, i.e. no real CUDA hardware.
            if (dev == 0)
                if (deviceProp.minor == 9999 && deviceProp.major == 9999)
                    printf("\n");

            printf("\nDevice %d: \"%s\"\n", dev, deviceProp.name);

            // Versions are packed as 1000 * major + 10 * minor.
            cudaDriverGetVersion(&driver_version);
            printf("CUDA driver version:                            %d.%d\n", driver_version / 1000, (driver_version % 1000) / 10);
            cudaRuntimeGetVersion(&runtime_version);
            printf("CUDA runtime version:                           %d.%d\n", runtime_version / 1000, (runtime_version % 1000) / 10);
            printf("Compute capability:                             %d.%d\n", deviceProp.major, deviceProp.minor);

            // Memory sizes are size_t, so they are printed with %zu.
            printf("Total amount of Global Memory:                  %zu bytes\n", deviceProp.totalGlobalMem);
            printf("Number of SMs:                                  %d\n", deviceProp.multiProcessorCount);
            printf("Total amount of Constant Memory:                %zu bytes\n", deviceProp.totalConstMem);
            printf("Total amount of Shared Memory per block:        %zu bytes\n", deviceProp.sharedMemPerBlock);
            printf("Total number of registers available per block:  %d\n", deviceProp.regsPerBlock);
            printf("Warp size:                                      %d\n", deviceProp.warpSize);
            printf("Maximum number of threads per SM:               %d\n", deviceProp.maxThreadsPerMultiProcessor);
            printf("Maximum number of threads per block:            %d\n", deviceProp.maxThreadsPerBlock);
            printf("Maximum size of each dimension of a block:      %d x %d x %d\n", deviceProp.maxThreadsDim[0],
                deviceProp.maxThreadsDim[1],
                deviceProp.maxThreadsDim[2]);
            printf("Maximum size of each dimension of a grid:       %d x %d x %d\n", deviceProp.maxGridSize[0], deviceProp.maxGridSize[1], deviceProp.maxGridSize[2]);
            printf("Maximum memory pitch:                           %zu bytes\n", deviceProp.memPitch);
            printf("Texture pitch alignment:                        %zu bytes\n", deviceProp.texturePitchAlignment);

            // clockRate and memoryClockRate are reported in kHz.
            printf("Clock rate:                                     %.2f GHz\n", deviceProp.clockRate * 1e-6f);
            printf("Memory Clock rate:                              %.0f MHz\n", deviceProp.memoryClockRate * 1e-3f);
            printf("Memory Bus Width:                               %d-bit\n", deviceProp.memoryBusWidth);
        }

        // Keep the console window open (Windows only).
        system("pause");
        return 0;
    }
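
      The test program decodes the integers returned by `cudaDriverGetVersion` and `cudaRuntimeGetVersion` with `/ 1000` and `(% 1000) / 10`, because CUDA packs a version as 1000 × major + 10 × minor. It also scales `clockRate` and `memoryClockRate`, which are reported in kHz, to GHz and MHz. The sketch below isolates that arithmetic in plain C++ so it can be checked without a GPU. The helper names are illustrative and are not part of the CUDA API.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helpers (not CUDA API): split a packed CUDA version number,
// encoded as 1000 * major + 10 * minor, into its two parts.
int cuda_version_major(int packed) { return packed / 1000; }
int cuda_version_minor(int packed) { return (packed % 1000) / 10; }

// cudaDeviceProp::clockRate and memoryClockRate are reported in kHz.
float khz_to_ghz(int khz) { return khz * 1e-6f; }
float khz_to_mhz(int khz) { return khz * 1e-3f; }

// Prints a packed version such as 9020 in "major.minor" form after a label.
void print_cuda_version(const char* label, int packed) {
    assert(packed >= 0);
    std::printf("%s %d.%d\n", label, cuda_version_major(packed),
                cuda_version_minor(packed));
}
```

      For example, `print_cuda_version("CUDA runtime version:", 9020)` prints `CUDA runtime version: 9.2`, which is what a CUDA 9.2 runtime should report.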

      If the configuration is correct, the program prints the properties above for each detected CUDA device.

  • Original article: https://www.cnblogs.com/ponxiaoming/p/12518851.html