A C++ implementation of the bwlabel function

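My experiment needed a connected-component labeling algorithm, something like MATLAB's bwlabel function. I searched online for C++ source code without success. The bwlabel-python project describes MATLAB's approach in Python, but I could not follow the part at the end that processes the labels, so I wrote my own version in C++. Let's start with the bwlabel function itself: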

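#include <cstring>             // memset
#include <opencv2/opencv.hpp>  // cv::Mat

// Helper functions, defined further down.
int number_of_runs(const cv::Mat in);
void fill_run_vectors(const cv::Mat in, int sc[], int ec[], int r[]);
void first_pass(const int sc[], const int ec[], const int r[], int labels[], const int num_runs, const int mode);

// Labels the connected components of a single-channel 8-bit binary image.
// mode is the connectivity (4 or 8). If num is not NULL it receives the number
// of components. The result is CV_8UC1, so labels above 255 wrap around.
cv::Mat bwlabel(const cv::Mat in, int * num, const int mode)
{
    const int num_runs = number_of_runs(in);
    int * sc = new int[num_runs];      // start column of each run
    int * ec = new int[num_runs];      // end column of each run
    int * r = new int[num_runs];       // row of each run
    int * labels = new int[num_runs];  // label of each run, 0 = not labeled yet
    memset(labels, 0, sizeof(int)*num_runs);
    fill_run_vectors(in, sc, ec, r);
    first_pass(sc, ec, r, labels, num_runs, mode);
    cv::Mat result = cv::Mat::zeros(in.size(), CV_8UC1);

    // Paint every run with its final label and track the largest label.
    int number = 0;
    for(int i = 0; i < num_runs; i++)
    {
        uchar * p_row = result.ptr<uchar>(r[i]);
        for(int j = sc[i]; j <= ec[i]; j++)
            p_row[j] = labels[i];
        if(number < labels[i])
            number = labels[i];
    }
    if(num != NULL)
        *num = number;
    delete [] sc;
    delete [] ec;
    delete [] r;
    delete [] labels;
    return result;
}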

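bwlabel relies on three helper functions: number_of_runs, fill_run_vectors and first_pass. The function number_of_runs counts the groups of consecutive non-zero pixels in each row and sums the counts over all rows.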

1 1 0 0 0 1 1 1 0 0

For example, the row above contains 2 groups of non-zero pixels. We call such a group a Run. The function number_of_runs is implemented as follows:

int number_of_runs(const cv::Mat in)
{
    const int rows = in.rows;
    const int cols = in.cols;
    int result = 0;
    for(int row = 0; row < rows; row++)
    {
        const uchar * p_row = in.ptr<uchar>(row);
        if(p_row[0] != 0)
            result++;
        for(int col = 1; col < cols; col++)
        {
            if(p_row[col] != 0 && p_row[col-1] == 0)
                result++;
        }
    }
    return result;
}

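The idea is to scan every row and, within a row, increment the Run count whenever the current element is non-zero and the previous element is zero. A non-zero first element also starts a Run.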
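The counting rule can be tried without OpenCV. The following is only a sketch under stated assumptions: the image is a flat row-major std::vector instead of a cv::Mat, and count_runs is a hypothetical name that does not appear in the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for number_of_runs: counts runs of non-zero pixels
// in a row-major image stored as a flat vector (an assumption of this sketch).
int count_runs(const std::vector<unsigned char>& pixels, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    assert(pixels.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    int runs = 0;
    for (int row = 0; row < rows; ++row) {
        bool prev_on = false;  // every row starts as if preceded by a zero
        for (int col = 0; col < cols; ++col) {
            const bool on = pixels[static_cast<std::size_t>(row) * cols + col] != 0;
            if (on && !prev_on)
                ++runs;        // 0 -> non-zero transition: a new run begins
            prev_on = on;
        }
    }
    return runs;
}
```

Resetting prev_on at the start of each row plays the same role as the separate p_row[0] check in number_of_runs, and it also copes with an image that has zero columns.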

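The function fill_run_vectors fills three arrays: sc[], ec[] and r[], which hold the start column, end column and row of each Run. Their length is the number of Runs returned by number_of_runs. The function fill_run_vectors is implemented as follows: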

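void fill_run_vectors(const cv::Mat in, int sc[], int ec[], int r[])
{
    const int rows = in.rows;
    const int cols = in.cols;
    int idx = 0;  // index of the run currently being filled
    for(int row = 0; row < rows; row++)
    {
        const uchar * p_row = in.ptr<uchar>(row);
        int prev = 0;  // previous pixel in this row: 0 = background, 1 = foreground
        for(int col = 0; col < cols; col++)
        {
            // Normalize to 0/1 so that masks storing 255 for foreground also work.
            const int cur = (p_row[col] != 0) ? 1 : 0;
            if(cur != prev)
            {
                if(prev == 0)
                {
                    // 0 -> 1 transition: a new run starts here.
                    sc[idx] = col;
                    r[idx] = row;
                    prev = 1;
                }
                else
                {
                    // 1 -> 0 transition: the run ended on the previous column.
                    ec[idx++] = col - 1;
                    prev = 0;
                }
            }
            // A run that reaches the last column ends there.
            if(col == cols-1 && prev == 1)
            {
                ec[idx++] = col;
            }
        }
    }
}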

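The idea is again to walk through every row, with the variable prev recording whether the previous pixel in the row was 0 or 1. On a 0-to-1 transition the start column and row of a new Run are recorded. On a 1-to-0 transition (or when the row ends while prev = 1) the end column of that Run is recorded.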
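The same transition logic can be sketched with a growable container, which removes the need to count the Runs beforehand. This is an illustration under assumptions, not the code used in this post: the Run struct, the extract_runs name and the flat row-major std::vector image are all made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one run: its row and its inclusive column range.
struct Run {
    int row;
    int start_col;
    int end_col;
};

// Collects all runs of non-zero pixels from a row-major image stored as a
// flat vector (an assumption of this sketch).
std::vector<Run> extract_runs(const std::vector<unsigned char>& pixels, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    assert(pixels.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    std::vector<Run> runs;
    for (int row = 0; row < rows; ++row) {
        bool inside = false;  // true while the scan is inside a run
        for (int col = 0; col < cols; ++col) {
            const bool on = pixels[static_cast<std::size_t>(row) * cols + col] != 0;
            if (on && !inside) {
                runs.push_back({row, col, col});  // 0 -> 1: open a new run
                inside = true;
            } else if (on) {
                runs.back().end_col = col;        // still inside: extend the run
            } else {
                inside = false;                   // 1 -> 0: the run is closed
            }
        }
    }
    return runs;
}
```

Because the end column is extended on every foreground pixel, a run that touches the last column needs no special case here.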

As its name suggests, first_pass performs the first scan: it visits every Run and assigns it a label. Consider the following situation:

1 1 0 0 1 1 1 0 
0 1 1 1 1 0 0 0

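The function gives the first Run in the first row label 1 and the second Run label 2. When it reaches the second row, it finds that the Run in this row is adjacent to the first Run of the first row, so it assigns label 1. As it continues, it finds that this Run is also adjacent to the second Run of the first row. The function does not change the label of that second Run. Instead, it records that the two labels are in fact the same. After the second row has been processed the result is: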

1 1 0 0 2 2 2 0 
0 1 1 1 1 0 0 0

Once every Run has been visited, the labels that were left unresolved are processed. The function first_pass is implemented as follows:

void first_pass(const int sc[], const int ec[], const int r[],int labels[], const int num_runs, const int mode)
{
    int cur_row = 0;
    int next_label = 1;
    int first_run_on_prev_row = -1;
    int last_run_on_prev_row = -1;
    int first_run_on_this_row = 0;
    int offset = 0;
    int * equal_i = new int[num_runs];
    int * equal_j = new int[num_runs];
    int equal_idx = 0;
    if(mode == 8)
        offset = 1;
    for(int k = 0; k < num_runs; k++)
    {
        if(r[k] == cur_row + 1)
        {
            cur_row += 1;
            first_run_on_prev_row = first_run_on_this_row;
            first_run_on_this_row = k;
            last_run_on_prev_row = k - 1;
        }
        else if(r[k] > cur_row + 1)
        {
            first_run_on_prev_row = -1;
            last_run_on_prev_row = -1;
            first_run_on_this_row = k;
            cur_row = r[k];
        }
        if(first_run_on_prev_row >= 0)
        {
            int p = first_run_on_prev_row;
            while(p <= last_run_on_prev_row && sc[p] <= (ec[k] + offset))
            {
                if(sc[k] <= ec[p] + offset)
                {
                    if(labels[k] == 0)
                        labels[k] = labels[p];
                    else if(labels[k] != labels[p])
                    {
                        //labels[p] = labels[k];
                        equal_i[equal_idx] = labels[k];
                        equal_j[equal_idx] = labels[p];
                        equal_idx += 1;
                    }
                }
                p += 1;
            }
        }
        if(labels[k] == 0)
        {
            labels[k] = next_label++;
        }
    }
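    /////////////////////// process labels
    for(int i = 0; i < equal_idx; i++)
    {
        int max_label = equal_i[i] > equal_j[i] ? equal_i[i] : equal_j[i];
        int min_label = equal_i[i] < equal_j[i] ? equal_i[i] : equal_j[i];
        if(max_label == min_label)
            continue;  // already merged by an earlier pair
        for(int j = 0; j < num_runs; j++)
        {
            if(labels[j] == max_label)
                labels[j] = min_label;
        }
        // max_label no longer exists, so rewrite it in the pairs still to be
        // processed; otherwise a later pair such as (2,3) after (1,2) would
        // resurrect label 2 and split one component in two.
        for(int j = i + 1; j < equal_idx; j++)
        {
            if(equal_i[j] == max_label)
                equal_i[j] = min_label;
            if(equal_j[j] == max_label)
                equal_j[j] = min_label;
        }
    }
    delete [] equal_i;
    delete [] equal_j;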
    /////////////////////process ignore labels
    int * hist = new int[next_label];
    int * non_labels = new int[next_label];
    memset(hist, 0, sizeof(int)*next_label);
    int non_num = 0;
    for(int i = 0; i < num_runs; i++)
    {
        hist[labels[i]]++;
    }
    for(int i = 1; i < next_label; i++)
    {
        if(hist[i] == 0)
            non_labels[non_num++] = i;
    }
    for(int j = 0; j < num_runs; j++)
    {
        int k = labels[j];
        for(int i = non_num-1; i >= 0; i--)
        {
            if(k > non_labels[i])
            {
                labels[j] -= (i+1);
                break;
            }
        }
    }
    delete [] hist;
    delete [] non_labels;
}

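The loop over the Runs distinguishes two cases: the previous row has Runs, or it has none. If it has none, the Run receives a new label. If it has some, the function checks whether the current Run is adjacent to any of them. If so, the Run takes the label of the adjacent Run on the previous row, and when the situation described above occurs (the Run touches Runs with different labels) the two labels are stored in the arrays equal_i and equal_j. A Run that touches nothing on the previous row also receives a new label.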
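The adjacency test inside the while loop of first_pass reduces to an interval-overlap check in which 8-connectivity widens each interval by one column. Here is a small sketch of that check alone. The function name runs_touch and its parameter order are hypothetical, while the condition is the one first_pass evaluates with sc, ec and offset.

```cpp
#include <cassert>

// Returns true when run k (columns sc_k..ec_k) and run p on the row directly
// above (columns sc_p..ec_p) are adjacent under the given connectivity.
bool runs_touch(int sc_k, int ec_k, int sc_p, int ec_p, int connectivity)
{
    assert(connectivity == 4 || connectivity == 8);
    // With 8-connectivity, diagonal neighbours count, so the column ranges
    // may miss each other by one column and still touch.
    const int offset = (connectivity == 8) ? 1 : 0;
    return sc_p <= ec_k + offset && sc_k <= ec_p + offset;
}
```

For the two-row example above, the second-row Run covers columns 1 to 4 and touches both first-row Runs (columns 0 to 1 and 4 to 6) even with 4-connectivity, which is why an equivalence between labels 1 and 2 is recorded.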

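The next step is to process the arrays equal_i and equal_j, merging the different labels that belong to the same family into one (see the code below the comment "process labels").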
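A common alternative for this merging step is a union-find (disjoint-set) structure, which resolves chains of equivalences such as (1,2) followed by (2,3) regardless of their order. To be clear, this is a different technique from the relabeling loop in first_pass, shown here only as a sketch. The names find_root and merge_equivalent_labels, and the use of std::vector and std::pair, are assumptions of the example.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Follows parent links to the representative of x, shortening the path
// along the way (path halving).
int find_root(std::vector<int>& parent, int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Replaces every label by the smallest label of its equivalence class.
// Labels are expected to lie in 1..max_label.
void merge_equivalent_labels(std::vector<int>& labels,
                             const std::vector<std::pair<int, int> >& pairs,
                             int max_label)
{
    std::vector<int> parent(max_label + 1);
    std::iota(parent.begin(), parent.end(), 0);  // every label is its own root
    for (const auto& pr : pairs) {
        assert(pr.first >= 1 && pr.first <= max_label);
        assert(pr.second >= 1 && pr.second <= max_label);
        const int a = find_root(parent, pr.first);
        const int b = find_root(parent, pr.second);
        if (a < b)
            parent[b] = a;  // keep the smaller label as the representative
        else
            parent[a] = b;
    }
    for (int& label : labels)
        label = find_root(parent, label);
}
```

Keeping the smaller label as the root mirrors the min_label convention of the original loop, so the resulting labels would still need the gap-closing step that follows.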

That is still not the end. The labels may now have gaps (for example 1, 2, 4, 6), so they must be mapped onto a consecutive range (1, 2, 3, 4). See the code below the comment "process ignore labels".
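The gap-closing step can also be written with a lookup table instead of the hist and non_labels arrays. The following is a minimal sketch of that idea, not the code from first_pass. The function name compact_labels is hypothetical, and it assumes every label is at least 1.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Renumbers the labels in place so that the labels in use become 1..n while
// keeping their relative order, and returns n.
int compact_labels(std::vector<int>& labels)
{
    if (labels.empty())
        return 0;
    assert(*std::min_element(labels.begin(), labels.end()) >= 1);
    const int max_label = *std::max_element(labels.begin(), labels.end());
    std::vector<int> remap(max_label + 1, 0);
    for (int label : labels)
        remap[label] = 1;                 // mark the labels that are in use
    int next = 0;
    for (int label = 1; label <= max_label; ++label) {
        if (remap[label] != 0)
            remap[label] = ++next;        // used labels get consecutive numbers
    }
    for (int& label : labels)
        label = remap[label];
    return next;
}
```

With the labels 1, 2, 4, 6 from the example, the table maps 4 to 3 and 6 to 4, and the return value 4 is the number of components.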

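After that the work is nearly done. The last step, back in bwlabel, writes the final label of each Run into the corresponding elements of the returned Mat.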

@waring

Original article: https://www.cnblogs.com/waring/p/4233705.html