Basic Algorithms, Searching: Linear Index Search (II)

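Index search is a search carried out over an index table together with a main table, that is, over the indexed storage structure of a linear list.

An index search proceeds in two steps:

1) Given an index value K1, search the index table for the index item whose index value equals K1. This determines where the corresponding sub-table starts in the main table and how long it is.

2) Given a key K2, search that sub-table for the element (node) whose key equals K2.

When searching the index table or a sub-table, either sequential search or binary search can be used if the table is an ordered table stored sequentially; otherwise only sequential search is possible.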
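The two steps can be sketched as follows. This is a minimal illustration, not the implementation given later in this post: the names `IndexEntry` and `indexLookup` are hypothetical, the main table is assumed to hold plain `int` keys, and both steps use sequential search so that the sketch works whether or not the tables are ordered.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical index entry, one per sub-table.
struct IndexEntry {
    int key;     // index value (K1) that identifies the sub-table
    int start;   // position of the sub-table's first element in the main table
    int length;  // number of elements in the sub-table
};

// Returns the position of k2 in mainTable, or -1 if either step fails.
int indexLookup(const std::vector<IndexEntry>& index,
                const std::vector<int>& mainTable, int k1, int k2) {
    // Step 1: find the index entry whose key equals k1.
    auto entry = std::find_if(index.begin(), index.end(),
                              [k1](const IndexEntry& e) { return e.key == k1; });
    if (entry == index.end()) return -1;

    // The entry must describe a range that lies inside the main table.
    assert(entry->start >= 0 &&
           entry->start + entry->length <= static_cast<int>(mainTable.size()));

    // Step 2: search for k2 inside that sub-table only.
    auto first = mainTable.begin() + entry->start;
    auto last = first + entry->length;
    auto hit = std::find(first, last, k2);
    if (hit == last) return -1;
    return static_cast<int>(hit - mainTable.begin());
}
```

If the index table is kept sorted by `key`, step 1 could be replaced by a binary search without changing step 2.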

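Mention "index" and most people think first of a database index. That is the right association: building an index on a primary key exists precisely to make lookups in a huge amount of data fast.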

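Three terms come up when implementing index search:

1) Main table: the data being searched. Logically the main table is divided into a number of sub-tables.

2) Index item: a function is usually used to divide the main table into several sub-tables, and one index entry is built for each sub-table. That entry is called an index item.

3) Index table: the collection of all index items.

An index item generally holds three fields: index, start, length

First: index, the key that identifies the sub-table this index item points to.

Second: start, the position in the main table at which that sub-table begins.

Third: length, the number of elements in the sub-table.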

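Below is an implementation of a blocked index:

The main table holds 10000 records and is divided into 100 sub-tables of 100 records each, so the index table has 100 index items, one per sub-table.

For each sub-table, the largest of its records is used as the index of its index item.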
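To make the layout concrete, here is a small sketch that builds such an index table from an already filled main table. The helper name `buildIndexTable` and the use of `std::vector<int>` for the main table are assumptions made for illustration; the implementation in this post instead fills the blocks incrementally.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Same three fields as the index item described above.
struct IndexItem {
    int index;   // largest key stored in the sub-table
    int start;   // position of the sub-table's first record in the main table
    int length;  // number of records in the sub-table
};

// Hypothetical helper: split mainTable into consecutive blocks of blockSize
// records (the last block may be shorter) and build one index item per block.
std::vector<IndexItem> buildIndexTable(const std::vector<int>& mainTable,
                                       int blockSize) {
    assert(blockSize > 0);
    std::vector<IndexItem> table;
    const int n = static_cast<int>(mainTable.size());
    for (int start = 0; start < n; start += blockSize) {
        const int length = std::min(blockSize, n - start);
        // The index of a block is the maximum of its records.
        const int maxKey = *std::max_element(mainTable.begin() + start,
                                             mainTable.begin() + start + length);
        table.push_back({maxKey, start, length});
    }
    return table;
}
```

With 10000 records and a block size of 100, this would produce the 100 index items described above.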

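Each sub-table covers its own range of values. To insert a new value, first work out which sub-table it belongs to, then check whether that sub-table still has free space. If it does, insert the value and update the corresponding index item; if it does not, the insertion fails.

Implementation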

const int MaxSize = 10000;
const int IndexItemNum = 100;

//Main table definition
struct student_info
{
	int score;
	//some other attributes
};
typedef struct student_info mainTable[MaxSize];

//Index table definition
struct indexItem
{
	int index;
	int start;
	int length;
};
typedef struct indexItem indexTable[IndexItemNum];


class IndexSearch
{
public:
	IndexSearch();

	//Add an element to the main table; returns its position, or -1 on failure
	int addElements(int key);

	//Returns the position of key in the main table, or -1 if it is not found
	int searchElements(int key);

private:
	indexTable index_table;
	mainTable main_table;

	//Number of index items in the index table
	const int indexItem_num;

	//Number of records stored in each sub-table
	const int numPerBlock;
};

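The code above defines the main table and the index table, and fixes how many sub-tables the main table is divided into and how many records each sub-table stores.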

#include "IndexSearch.h"

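int IndexSearch::searchElements(int element)
{
	int i = -1;	//block that may contain element, -1 while unknown
	int j;

	//Binary search on the index table: find the first block whose
	//index (its largest key) is not smaller than element
	int low = 0;
	int high = indexItem_num - 1;
	while(low <= high)
	{
		int mid = (low + high) / 2;

		if(element > index_table[mid].index)
		{
			low = mid + 1;
		}
		else if(mid > 0 && element <= index_table[mid-1].index)
		{
			//an earlier block also qualifies; mid > 0 keeps mid-1 in range
			high = mid - 1;
		}
		else
		{
			i = mid;//remember the block
			break;
		}

	}//while

	if(i < 0)
	{
		return -1;	//search failed
	}

	//Sequential search in the main table, within block i only
	low  = index_table[i].start;
	high = index_table[i].start + index_table[i].length;

	for(j = low; j < high; j++)
	{
		if(main_table[j].score == element)
		{
			break;
		}
	}//for
	if(j < high)
	{
		return j;
	}
	else
	{
		return -1;
	}

}
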
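int IndexSearch::addElements(int element)
{
	//Only non-negative keys map to a valid block
	if(element < 0)
	{
		return -1;
	}

	//Block tag covers the keys [tag * numPerBlock, (tag + 1) * numPerBlock);
	//anything larger goes into the last block
	int tag = element / numPerBlock;
	if(tag > indexItem_num - 1)
	{
		tag = indexItem_num - 1;
	}

	if(index_table[tag].length < numPerBlock)
	{
		//The block has free space: append the element at its end
		int start  = index_table[tag].start;
		int length = index_table[tag].length;
		main_table[start + length].score = element;

		//Update the index item: one more record, possibly a new maximum
		index_table[tag].length++;
		if(element > index_table[tag].index)
		{
			index_table[tag].index = element;
		}

		return start + length;
	}
	else
	{
		return -1;	//the block is full
	}
}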

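IndexSearch::IndexSearch(): indexItem_num(IndexItemNum), numPerBlock(MaxSize / IndexItemNum)
{
	int index = 0;
	int start = 0;

	//Every block starts empty; its index starts at the lower bound of
	//its key range, so the index table is in ascending order from the start
	for(int i = 0; i < indexItem_num; i++)
	{
		index_table[i].length = 0;
		index_table[i].index  = index;
		index_table[i].start  = start;

		index += numPerBlock;
		start += numPerBlock;
	}//for

}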

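The constructor initializes the number of sub-tables and the number of records per sub-table; the other two functions implement insertion and search.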
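The block-location step of `searchElements` amounts to finding the first index item whose index is not smaller than the key. As an alternative sketch, the same step can be written with `std::lower_bound` from the standard library. The function name `findBlock` and the use of `std::vector` are assumptions for illustration, and the sketch relies on the index values being in ascending order, as they are in the table set up by the constructor.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Same fields as indexItem in IndexSearch.h.
struct indexItem {
    int index;   // largest key of the block
    int start;   // first position of the block in the main table
    int length;  // number of records currently in the block
};

// Returns the position of the first index item whose index is >= key,
// or -1 if key is larger than the index of every block.
int findBlock(const std::vector<indexItem>& table, int key) {
    auto it = std::lower_bound(
        table.begin(), table.end(), key,
        [](const indexItem& item, int value) { return item.index < value; });
    if (it == table.end()) return -1;
    return static_cast<int>(it - table.begin());
}
```

Once the block is known, the sequential scan over `start` to `start + length` stays exactly as in `searchElements`.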

Test code:

#include <iostream>
#include <cstdlib>
#include <ctime>
#include "IndexSearch.h"
using namespace std;

int main()
{
	srand(static_cast<unsigned>(time(0)));
	IndexSearch searchTool;
	int index;
	int key;

	//Insert 1000 random keys in the range [0, 10000)
	for(int i = 0; i < 1000; i++)
	{
		key = rand()%10000;

		index = searchTool.addElements(key);
		cout << "the number " << key << " is inserted at position " << index << endl;
	}

	index = searchTool.addElements(2015);
	cout << "the number 2015 is inserted at position " << index << endl;
	//Look up the value that was just inserted
	cout << "the number 2015 is stored at position " << searchTool.searchElements(2015) << endl;

	index = searchTool.addElements(467);
	cout << "the number 467 is inserted at position " << index << endl;
	//Look up the value that was just inserted
	cout << "the number 467 is stored at position " << searchTool.searchElements(467) << endl;

	return 0;
}

Original article: https://www.cnblogs.com/stemon/p/4478557.html