• CUDA Compilation


    0x00

    Prior to the 5.0 release, CUDA did not support separate compilation, so CUDA code could not call device functions or access variables across files. Such compilation is referred to as whole program compilation. We have always supported the separate compilation of host code, it was just the device CUDA code that needed to all be within one file. Starting with CUDA 5.0, separate compilation of device code is supported, but the old whole program mode is still the default, so there are new options to invoke separate compilation.
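    In short: before CUDA 5.0 there was no separate compilation, so CUDA code could not call device functions or access device variables defined in another file. This mode is called whole program compilation. Note that host code could always be compiled separately (for example with g++), whatever the CUDA version; only the device code had to live in a single file. Since 5.0, nvcc also supports separate compilation of device code. Whole program compilation is still the default, so separate compilation has to be enabled explicitly with nvcc options such as -dc or -rdc=true (--relocatable-device-code=true).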

    For details on the CUDA compilation process, see the nvcc documentation: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#using-separate-compilation-in-cuda
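
    A minimal sketch of what separate compilation of device code makes possible, assuming two hypothetical files, scale.cu and main.cu (neither comes from the original post). The kernel in main.cu calls a __device__ function that is defined in scale.cu, so both files have to be compiled as relocatable device code with -dc. Error checking is left out to keep the sketch short.

```cuda
// ---- scale.cu (hypothetical file) ----
// Defines a device function that other translation units may call.
__device__ float scale(float a, float x) {
    return a * x;
}

// ---- main.cu (hypothetical file) ----
#include <cstdio>
#include <cuda_runtime.h>

// Declaration only; the definition lives in scale.cu.
extern __device__ float scale(float a, float x);

__global__ void scaleKernel(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = scale(a, x[i]);  // cross-file device call
}

int main() {
    const int n = 4;
    float hx[n] = {1.f, 2.f, 3.f, 4.f};
    float hy[n];
    float* dx = nullptr;
    float* dy = nullptr;

    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);

    scaleKernel<<<1, n>>>(n, 2.0f, dx, dy);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(hy, dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) printf("%g\n", hy[i]);

    cudaFree(dx);
    cudaFree(dy);
    return 0;
}

// Build with separate compilation; the final nvcc call also performs
// the device link step:
//   nvcc -dc scale.cu -o scale.o
//   nvcc -dc main.cu  -o main.o
//   nvcc scale.o main.o -o scale_demo
```

    With plain -c instead of -dc, main.cu is expected to fail to build, because in whole program mode the definition of scale has to be visible in the same file as the kernel that calls it.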

    0x01

    As an example, the CMU 15-418 assignment ships with a ready-made Makefile:

        EXECUTABLE := cudaSaxpy

        CU_FILES   := saxpy.cu

        CU_DEPS    :=

        CC_FILES   := main.cpp

        ###########################################################

        ARCH=$(shell uname | sed -e 's/-.*//g')

        OBJDIR=objs
        CXX=g++ -m64
        CXXFLAGS=-O3 -Wall
        ifeq ($(ARCH), Darwin)
        # Building on mac
        LDFLAGS=-L/usr/local/depot/cuda-8.0/lib/ -lcudart
        else
        # Building on Linux
        # LDFLAGS=-L/usr/local/cuda-11.2/lib64/ -lcudart
        LDFLAGS=-L/data/cuda/cuda-11.1/cuda/lib64 -lcudart
        endif
        NVCC=nvcc
        NVCCFLAGS=-O3 -m64 #--gpu-architecture compute_35


        OBJS=$(OBJDIR)/main.o  $(OBJDIR)/saxpy.o


        .PHONY: dirs clean

        default: $(EXECUTABLE)

        dirs:
        	mkdir -p $(OBJDIR)/

        clean:
        	rm -rf $(OBJDIR) *.ppm *~ $(EXECUTABLE)

        $(EXECUTABLE): dirs $(OBJS)
        	$(CXX) $(CXXFLAGS) -o $@ $(OBJS) $(LDFLAGS)

        $(OBJDIR)/%.o: %.cpp
        	$(CXX) $< $(CXXFLAGS) -c -o $@

        $(OBJDIR)/%.o: %.cu
        	$(NVCC) $< $(NVCCFLAGS) -c -o $@
    

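    The dependency chain resolves as follows (Linux branch):
    default -> $(EXECUTABLE) = cudaSaxpy -> dirs, $(OBJS) = objs/main.o, objs/saxpy.o
    dirs -> mkdir -p objs/
    objs/main.o -> matches both pattern rules (main.cpp or main.cu); only main.cpp exists, so the %.cpp rule is used:
        g++ -m64 main.cpp -O3 -Wall -c -o objs/main.o
    objs/saxpy.o -> matches both pattern rules (saxpy.cpp or saxpy.cu); only saxpy.cu exists, so the %.cu rule is used:
        nvcc saxpy.cu -O3 -m64 -c -o objs/saxpy.o
    cudaSaxpy -> link step, once dirs, objs/main.o and objs/saxpy.o are done:
        g++ -m64 -O3 -Wall -o cudaSaxpy objs/main.o objs/saxpy.o -L/data/cuda/cuda-11.1/cuda/lib64 -lcudart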
    For the meaning of the special symbols used in the Makefile (automatic variables such as $@ and $<), see:
    https://stackoverflow.com/questions/3220277/what-do-the-makefile-symbols-and-mean
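
    The Makefile above compiles saxpy.cu with plain -c (no -dc) and links with g++, which fits the whole program model described in section 0x00: all device code sits in one .cu file, and main.cpp only needs an ordinary host function. The sketch below illustrates that split. The function saxpyCuda, its signature and both file bodies are assumptions made for illustration; the actual assignment sources are not shown in this post. Error checking is omitted.

```cuda
// ---- saxpy.cu (hypothetical sketch, compiled by nvcc) ----
#include <cuda_runtime.h>

// Device code: lives entirely in this one file.
__global__ void saxpyKernel(int n, float alpha, const float* x,
                            const float* y, float* result) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) result[i] = alpha * x[i] + y[i];
}

// Plain host function: the only symbol main.cpp has to know about.
void saxpyCuda(int n, float alpha, const float* x, const float* y,
               float* result) {
    const size_t bytes = n * sizeof(float);
    float* dx = nullptr;
    float* dy = nullptr;
    float* dres = nullptr;

    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMalloc(&dres, bytes);
    cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up
    saxpyKernel<<<blocks, threads>>>(n, alpha, dx, dy, dres);

    cudaMemcpy(result, dres, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    cudaFree(dres);
}

// ---- main.cpp (hypothetical sketch, compiled by g++) ----
#include <cstdio>
#include <vector>

// Ordinary C++ declaration; no CUDA syntax or headers are needed here.
void saxpyCuda(int n, float alpha, const float* x, const float* y,
               float* result);

int main() {
    const int n = 1024;
    std::vector<float> x(n, 1.0f), y(n, 2.0f), result(n);
    saxpyCuda(n, 2.0f, x.data(), y.data(), result.data());
    printf("result[0] = %g\n", result[0]);
    return 0;
}
```

    Because main.cpp contains no CUDA syntax, g++ can compile it, and the final g++ link only needs -lcudart for the CUDA runtime calls made inside saxpy.o.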

  • Original article: https://www.cnblogs.com/ijpq/p/15805560.html