• [CUDA] Configuring a new CUDA project on Win10 + VS2017


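    1. Create a new project

      Open VS2017 → New Project → Win32 Console Application → check "Empty project"

    2. Set the platform in Configuration Manager

      Right-click the project → Properties → Configuration Manager → change every platform to "x64"

    3. Configure the build customization

      Right-click the project → Build Dependencies → Build Customizations → check "CUDA 9.0XXX"

    4. Configure the basic include and library directories

      Note: the directories used in this and the following steps depend on your CUDA version and installation path.

      Right-click the project → Properties → Configuration Properties → VC++ Directories → Include Directories, and add the following directories:

    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\include
    • C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\common\inc

      ... → VC++ Directories → Library Directories, and add the following directories:

    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64
    • C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\common\lib\x64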

      

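    5. Configure the CUDA static library path

      Right-click the project → Properties → Configuration Properties → Linker → General → Additional Library Directories, and add the following directory:

    • $(CUDA_PATH_V9_0)\lib\$(Platform)

    6. Select the CUDA static libraries

      Right-click the project → Properties → Configuration Properties → Linker → Input → Additional Dependencies, and add the following libraries:

    • cublas.lib;cublas_device.lib;cuda.lib;cudadevrt.lib;cudart.lib;cudart_static.lib;cufft.lib;cufftw.lib;curand.lib;cusolver.lib;cusparse.lib;nppc.lib;nppial.lib;nppicc.lib;nppicom.lib;nppidei.lib;nppif.lib;nppig.lib;nppim.lib;nppist.lib;nppisu.lib;nppitc.lib;npps.lib;nvblas.lib;nvcuvid.lib;nvgraph.lib;nvml.lib;nvrtc.lib;OpenCL.lib;
      These are the libraries found in "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64", the library directory added earlier in the basic library directories step.
    • Note: kernel32.lib;user32.lib;gdi32.lib;winspool.lib;comdlg32.lib;advapi32.lib;shell32.lib;ole32.lib;oleaut32.lib;uuid.lib;odbc32.lib;odbccp32.lib;%(AdditionalDependencies)
      These entries were already present by default.

    7. Configure the source file type

      Right-click Source Files → Add → New Item → select "CUDA C/C++ File"

      Right-click the "xxx.cu" source file → Properties → Configuration Properties → General → Item Type → set to "CUDA C/C++"

    8. Test program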

    #include "cuda_runtime.h"
    #include "device_launch_parameters.h"
    #include <stdio.h>

    int main() {
        // Ask the runtime how many CUDA-capable devices are visible.
        int deviceCount = 0;
        cudaError_t err = cudaGetDeviceCount(&deviceCount);
        if (err != cudaSuccess) {
            printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
            return 1;
        }

        for (int dev = 0; dev < deviceCount; dev++)
        {
            int driver_version = 0, runtime_version = 0;
            cudaDeviceProp deviceProp;
            cudaGetDeviceProperties(&deviceProp, dev);
            // Older runtimes report compute capability 9999.9999 when no CUDA-capable device is present.
            if (dev == 0)
                if (deviceProp.minor == 9999 && deviceProp.major == 9999)
                    printf("\n");
            printf("\nDevice %d: \"%s\"\n", dev, deviceProp.name);
            // Versions are encoded as 1000 * major + 10 * minor (e.g. 9000 for 9.0).
            cudaDriverGetVersion(&driver_version);
            printf("CUDA driver version:                            %d.%d\n", driver_version / 1000, (driver_version % 1000) / 10);
            cudaRuntimeGetVersion(&runtime_version);
            printf("CUDA runtime version:                           %d.%d\n", runtime_version / 1000, (runtime_version % 1000) / 10);
            printf("Device compute capability:                      %d.%d\n", deviceProp.major, deviceProp.minor);
            // The memory-size fields are size_t, so they are printed with %zu.
            printf("Total amount of Global Memory:                  %zu bytes\n", deviceProp.totalGlobalMem);
            printf("Number of SMs:                                  %d\n", deviceProp.multiProcessorCount);
            printf("Total amount of Constant Memory:                %zu bytes\n", deviceProp.totalConstMem);
            printf("Total amount of Shared Memory per block:        %zu bytes\n", deviceProp.sharedMemPerBlock);
            printf("Total number of registers available per block:  %d\n", deviceProp.regsPerBlock);
            printf("Warp size:                                      %d\n", deviceProp.warpSize);
            printf("Maximum number of threads per SM:               %d\n", deviceProp.maxThreadsPerMultiProcessor);
            printf("Maximum number of threads per block:            %d\n", deviceProp.maxThreadsPerBlock);
            printf("Maximum size of each dimension of a block:      %d x %d x %d\n", deviceProp.maxThreadsDim[0],
                deviceProp.maxThreadsDim[1],
                deviceProp.maxThreadsDim[2]);
            printf("Maximum size of each dimension of a grid:       %d x %d x %d\n", deviceProp.maxGridSize[0], deviceProp.maxGridSize[1], deviceProp.maxGridSize[2]);
            printf("Maximum memory pitch:                           %zu bytes\n", deviceProp.memPitch);
            printf("Texture pitch alignment:                        %zu bytes\n", deviceProp.texturePitchAlignment);
            // clockRate and memoryClockRate are reported in kHz.
            printf("Clock rate:                                     %.2f GHz\n", deviceProp.clockRate * 1e-6f);
            printf("Memory Clock rate:                              %.0f MHz\n", deviceProp.memoryClockRate * 1e-3f);
            printf("Memory Bus Width:                               %d-bit\n", deviceProp.memoryBusWidth);
        }

        return 0;
    }

      Output: [screenshot of the program output]

      

  • Original article: https://www.cnblogs.com/wayne793377164/p/8185404.html