• Pitfalls of using OpenCV GpuMat data in CUDA kernel code


    Please note that cv::cuda::GpuMat and cv::Mat use different memory allocation methods: a cv::cuda::GpuMat stores its data in NVIDIA GPU RAM, while a cv::Mat stores its data in normal host RAM.
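
    For example, data has to be copied between the two explicitly. A minimal sketch using GpuMat::upload/download (the function name, matrix size and type are only illustrative):

    #include <opencv2/core.hpp>
    #include <opencv2/core/cuda.hpp>

    void copy_example() {
        cv::Mat hostMat(480, 640, CV_32FC1);   // data lives in host (CPU) RAM
        cv::cuda::GpuMat devMat;
        devMat.upload(hostMat);                // explicit copy into GPU RAM
        // ... launch CUDA kernels on devMat here ...
        devMat.download(hostMat);              // explicit copy back to host RAM
    }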

    The memory allocated by cv::Mat is normally continuous, but a cv::cuda::GpuMat may have a gap between one row and the next, because it is allocated with the CUDA function cudaMallocPitch, which can make the step (row pitch in bytes) larger than the number of columns times the element size.
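
    This can be checked on the host: GpuMat::step is the row pitch in bytes chosen by cudaMallocPitch, and it may exceed cols * elemSize(). A small sketch (the function name and matrix size are illustrative):

    #include <opencv2/core/cuda.hpp>

    bool has_row_padding() {
        cv::cuda::GpuMat gm(480, 1000, CV_32FC1);          // size is only illustrative
        size_t dataBytesPerRow = gm.cols * gm.elemSize();  // 1000 * 4 = 4000 bytes of pixels
        // gm.step is the row pitch in bytes chosen by cudaMallocPitch; it may be
        // larger (e.g. 4096), leaving gm.step - dataBytesPerRow padding bytes per row.
        return gm.step != dataBytesPerRow;                 // see also gm.isContinuous()
    }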

    So when passing the data of a cv::cuda::GpuMat into a CUDA kernel, the step size should be passed in as well, so the kernel can locate each row correctly. Using COLS instead of the step easily produces wrong results, and the problem is a headache to debug.

    For example:

    // Each thread strides over rows; the row offset is computed from the step
    // passed in, NOT from COLS, so the pitched layout is respected.
    __global__
    void kernel_select_cmp_point(
        float* dMap,
        float* dPhase,
        uint8_t* matResult,
        uint32_t step,      // row step in ELEMENTS (GpuMat::step is in bytes)
        const int ROWS,
        const int COLS,
        const int span) {
        int start = blockIdx.x * blockDim.x + threadIdx.x;
        int stride = blockDim.x * gridDim.x;

        for (int row = start; row < ROWS; row += stride) {
            // First element of this row: row * step, not row * COLS.
            int offsetOfInput = row * step;
            int offsetOfResult = row * step;   // assumes matResult shares the same element step
            // ... process dMap[offsetOfInput + col], dPhase[offsetOfInput + col] and
            //     write matResult[offsetOfResult + col] for col in [0, COLS) ...
        }
    }
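
    On the host side, the kernel above could be launched roughly as follows. This is only a sketch: the wrapper name and launch configuration are illustrative, and note that GpuMat::step is in bytes, so it is converted to a step in float elements before being passed to the kernel:

    #include <cstdint>
    #include <opencv2/core/cuda.hpp>

    void run_select_cmp_point(cv::cuda::GpuMat& matMap,    // CV_32FC1
                              cv::cuda::GpuMat& matPhase,  // CV_32FC1
                              cv::cuda::GpuMat& matResult, // CV_8UC1
                              int span) {
        // GpuMat::step is in BYTES; the kernel expects a step in float elements.
        uint32_t stepInFloats = static_cast<uint32_t>(matMap.step / sizeof(float));

        dim3 block(256);
        dim3 grid((matMap.rows + block.x - 1) / block.x);

        // Assumes the three matrices were allocated with the same row step in
        // elements, matching the single step parameter of the kernel above.
        kernel_select_cmp_point<<<grid, block>>>(
            matMap.ptr<float>(),
            matPhase.ptr<float>(),
            matResult.ptr<uint8_t>(),
            stepInFloats,
            matMap.rows,
            matMap.cols,
            span);
    }

    Alternatively, OpenCV's cv::cuda::PtrStepSz<T> can be passed to a kernel directly; it bundles the data pointer, the byte step, and the size, so the pitch is applied for you when indexing.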

  • Original post: https://www.cnblogs.com/shengguang/p/10794827.html