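Search categories:
- Static search: if the search fails, only a failure flag is returned (no insertion or deletion is involved)
- Dynamic search: if the search fails, the record being searched for is inserted into the search set
Search structures:
- Linear list: suited to static search; main techniques: sequential search, binary search
- Tree table: suited to dynamic search; main technique: binary search tree
- Hash table: suited to both static and dynamic search; main technique: hashing
Search algorithm performance:
Average search length: ASL = Σ Pi × Ci, summed over i = 1..n (Pi: probability of searching for the i-th record; Ci: number of key comparisons needed to find it)
Two cases are analyzed: successful search and unsuccessful search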
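As a worked instance of the average search length, assume (this is an assumption, not something the notes state) that all n records are equally likely, so P_i = 1/n, and that a sequential search scans from the first record, so C_i = i. A successful search then costs on average:

```latex
\mathrm{ASL} = \sum_{i=1}^{n} P_i C_i = \frac{1}{n} \sum_{i=1}^{n} i = \frac{n+1}{2}
```

Under the same scan order, an unsuccessful search compares against all n records.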
Linear lists
Sequential search
Sequential list (array):
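A minimal sketch of sequential search over an array, assuming integer keys. The class name `SeqSearchArray` and the method name `seqSearch` are illustrative, not taken from the notes.

```java
class SeqSearchArray {
    // Scans r from the front and returns the index of key, or -1 if it is absent.
    static int seqSearch(int[] r, int key) {
        for (int i = 0; i < r.length; i++) {
            if (r[i] == key) {
                return i;
            }
        }
        return -1; // failure flag: static search does not insert
    }

    public static void main(String[] args) {
        int[] r = {7, 3, 9, 1};
        System.out.println(seqSearch(r, 9)); // prints 2
        System.out.println(seqSearch(r, 4)); // prints -1
    }
}
```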
Singly linked list:
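The same idea on a singly linked list might look like the sketch below. The `Node` type is a hypothetical minimal node with no head sentinel; the notes do not specify the list layout.

```java
class SeqSearchList {
    // Hypothetical minimal singly linked node.
    static class Node {
        int data;
        Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    // Walks the list from first and returns the first node holding key, or null if none does.
    static Node seqSearch(Node first, int key) {
        for (Node p = first; p != null; p = p.next) {
            if (p.data == key) {
                return p;
            }
        }
        return null;
    }
}
```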
Binary search (on an ordered list)
Recursive algorithm:
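A sketch of the recursive form, assuming an `int` array sorted in ascending order. The explicit `low`/`high` parameters and the class name are choices made for this illustration.

```java
class RecursiveBinarySearch {
    // Searches r[low..high] (inclusive) for a; returns its index or -1 if absent.
    static int binarySearch(int[] r, int low, int high, int a) {
        if (low > high) {
            return -1; // empty range: not found
        }
        int mid = low + (high - low) / 2; // avoids overflow of low + high
        if (a < r[mid]) {
            return binarySearch(r, low, mid - 1, a);
        } else if (a > r[mid]) {
            return binarySearch(r, mid + 1, high, a);
        }
        return mid;
    }
}
```

A call over the whole array would be `binarySearch(r, 0, r.length - 1, a)`.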
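Non-recursive algorithm:
public int binarySearch(int[] r, int a) {
    int low = 0, high = r.length - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2;    // avoids int overflow of low + high
        if (a < r[mid]) high = mid - 1;      // continue in the left half
        else if (a > r[mid]) low = mid + 1;  // continue in the right half
        else return mid;                     // found
    }
    return -1; // not found
}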
Binary search decision tree:
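The decision tree places the middle element of each range at the root of the corresponding subtree, so the depth of a node equals the number of three-way comparisons needed to find that element. The sketch below sums those depths to get the average search length of a successful search, assuming every element is equally likely; the class and method names are illustrative.

```java
class DecisionTreeAsl {
    // Sums the depths of all decision-tree nodes for indices low..high,
    // where the subtree root (the middle index) sits at the given depth.
    static int totalComparisons(int low, int high, int depth) {
        if (low > high) {
            return 0;
        }
        int mid = low + (high - low) / 2;
        return depth
                + totalComparisons(low, mid - 1, depth + 1)
                + totalComparisons(mid + 1, high, depth + 1);
    }

    // Average number of comparisons for a successful search among n equally likely keys.
    static double successfulAsl(int n) {
        return (double) totalComparisons(0, n - 1, 1) / n;
    }

    public static void main(String[] args) {
        // For n = 11 the tree has 1, 2, 4 and 4 nodes on levels 1 to 4.
        System.out.println(successfulAsl(11)); // prints 3.0
    }
}
```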
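Tree tables
Insertion and deletion in a linear list take O(n), so it only suits static search >> use a tree table to perform insertion and search quickly (dynamic search)
Binary search tree
- Insertion and construction: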
private BiNode root = null;
// Builds the tree by inserting the elements of r one by one
BST(int[] r)
{
    for (int i = 0; i < r.length; i++)
    {
        BiNode s = new BiNode(r[i], null, null);
        root = insertBST(root, s);
        // insertBST1(s); // non-recursive alternative
    }
}
// Insertion (recursive): returns the root of the subtree after inserting s
public BiNode insertBST(BiNode head, BiNode s)
{
    if (head == null) {
        head = s;
        return head;
    }
    if (s.data < head.data)
        head.left = insertBST(head.left, s);
    else
        head.right = insertBST(head.right, s);
    return head;
}
// Insertion (non-recursive: a loop, with a temporary variable tracking the current position)
public void insertBST1(BiNode s)
{
    if (root == null) {
        root = s;
        return;
    }
    BiNode temp = root; // temporary node that walks down the tree
    while (true)
    {
        if (s.data < temp.data)
        {
            if (temp.left == null)
            {
                temp.left = s;
                return;
            }
            temp = temp.left;
        }
        else
        {
            if (temp.right == null)
            {
                temp.right = s;
                return;
            }
            temp = temp.right;
        }
    }
}
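The insertion code relies on a `BiNode` type that the notes do not show. A minimal sketch of what it might look like, inferred from the `new BiNode(r[i], null, null)` call and the `data`, `left`, `right` fields used above, together with an in-order traversal that shows the keys coming out sorted. The wrapper class `BstDemo` and the helper `sortedKeys` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BstDemo {
    // Assumed node layout: a key plus left and right child references.
    static class BiNode {
        int data;
        BiNode left, right;

        BiNode(int data, BiNode left, BiNode right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    // Same recursive insertion as in the notes: smaller keys go left, others go right.
    static BiNode insertBST(BiNode head, BiNode s) {
        if (head == null) {
            return s;
        }
        if (s.data < head.data) {
            head.left = insertBST(head.left, s);
        } else {
            head.right = insertBST(head.right, s);
        }
        return head;
    }

    // In-order traversal appends the keys in ascending order.
    static void inOrder(BiNode head, List<Integer> out) {
        if (head == null) {
            return;
        }
        inOrder(head.left, out);
        out.add(head.data);
        inOrder(head.right, out);
    }

    // Builds a tree from r and returns its keys in sorted order.
    static List<Integer> sortedKeys(int[] r) {
        BiNode root = null;
        for (int key : r) {
            root = insertBST(root, new BiNode(key, null, null));
        }
        List<Integer> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }
}
```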
- Search:
// Search: returns the node holding a, or null if a is not in the tree
private BiNode searchBST(BiNode head, int a)
{
    if (head == null)
        return null;
    if (a < head.data)
        return searchBST(head.left, a);
    else if (a > head.data)
        return searchBST(head.right, a);
    else
        return head;
}
- Deletion:
// Deletes node p, where p is the left child of f
public void deleteLeftBST(BiNode f, BiNode p)
{
    if (p.left == null && p.right == null) // p is a leaf
    {
        f.left = null;
    }
    else if (p.right == null) // p has only a left subtree
    {
        f.left = p.left;
    }
    else if (p.left == null) // p has only a right subtree
    {
        f.left = p.right;
    }
    else // both subtrees of p are non-empty
    {
        BiNode par = p, s = par.right; // par and s locate the leftmost node of p's right subtree
        while (s.left != null)
        {
            par = s;
            s = par.left;
        }
        p.data = s.data; // copy the data of the leftmost node s into p
        // prune (remove node s)
        if (par == p) // special case: s is the right child of p
            par.right = s.right;
        else // general case
            par.left = s.right;
    }
}
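`deleteLeftBST` only covers the case where p is the left child of f. One way to handle the root and right children uniformly is to delete by key and have each call return the new root of its subtree. This is a sketch of that alternative, not the method from the notes; all names are illustrative.

```java
class BstDelete {
    static class BiNode {
        int data;
        BiNode left, right;

        BiNode(int data) {
            this.data = data;
        }
    }

    static BiNode insert(BiNode head, int key) {
        if (head == null) {
            return new BiNode(key);
        }
        if (key < head.data) {
            head.left = insert(head.left, key);
        } else {
            head.right = insert(head.right, key);
        }
        return head;
    }

    // Removes one node holding key (if any) and returns the new subtree root.
    static BiNode delete(BiNode head, int key) {
        if (head == null) {
            return null; // key not present
        }
        if (key < head.data) {
            head.left = delete(head.left, key);
        } else if (key > head.data) {
            head.right = delete(head.right, key);
        } else {
            if (head.left == null) {
                return head.right; // leaf or right subtree only
            }
            if (head.right == null) {
                return head.left; // left subtree only
            }
            BiNode s = head.right; // leftmost node of the right subtree
            while (s.left != null) {
                s = s.left;
            }
            head.data = s.data; // copy the successor's key into this node
            head.right = delete(head.right, s.data); // then remove the successor
        }
        return head;
    }

    // Space-separated keys in ascending order, for inspection.
    static String inOrder(BiNode head) {
        if (head == null) {
            return "";
        }
        return inOrder(head.left) + head.data + " " + inOrder(head.right);
    }
}
```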
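Performance analysis:
- When the binary search tree is balanced (evenly shaped), a tree with n nodes has height ⌊log2(n)⌋ + 1, and search takes O(log2(n)), comparable to binary search;
- When the binary search tree is unbalanced (in the worst case a skewed tree), it degenerates into sequential search, and search takes O(n)
- The search performance of a binary search tree therefore generally lies between O(log2(n)) and O(n)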
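The two extremes can be observed by building a tree from the same keys in different orders and measuring its height. A small sketch, with illustrative names, assuming the plain insertion rule (smaller keys left, others right):

```java
class BstHeight {
    static class BiNode {
        int data;
        BiNode left, right;

        BiNode(int data) {
            this.data = data;
        }
    }

    static BiNode insert(BiNode head, int key) {
        if (head == null) {
            return new BiNode(key);
        }
        if (key < head.data) {
            head.left = insert(head.left, key);
        } else {
            head.right = insert(head.right, key);
        }
        return head;
    }

    // Height counted in nodes: an empty tree has height 0.
    static int height(BiNode head) {
        if (head == null) {
            return 0;
        }
        return 1 + Math.max(height(head.left), height(head.right));
    }

    // Inserts keys in the given order and returns the resulting height.
    static int heightAfterInserting(int[] keys) {
        BiNode root = null;
        for (int key : keys) {
            root = insert(root, key);
        }
        return height(root);
    }

    public static void main(String[] args) {
        // Sorted input yields a skewed tree; a level-by-level order yields a balanced one.
        System.out.println(heightAfterInserting(new int[]{1, 2, 3, 4, 5, 6, 7})); // prints 7
        System.out.println(heightAfterInserting(new int[]{4, 2, 6, 1, 3, 5, 7})); // prints 3
    }
}
```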
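Balanced binary tree
When the array used to build a binary search tree is already sorted, its performance degrades to O(n) >> a balanced binary tree should be built instead
Balanced binary tree:
1. The depths of the root's left and right subtrees differ by at most 1
2. The root's left and right subtrees are themselves balanced binary trees
Performance analysis: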
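The two-part definition of a balanced binary tree translates directly into a recursive check. The sketch below computes subtree heights bottom-up and uses -1 as an "unbalanced" marker; that marker and all names are choices made for this illustration.

```java
class BalanceCheck {
    static class BiNode {
        int data;
        BiNode left, right;

        BiNode(int data, BiNode left, BiNode right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the height of the subtree if it is balanced, or -1 otherwise.
    static int checkedHeight(BiNode head) {
        if (head == null) {
            return 0;
        }
        int lh = checkedHeight(head.left);
        if (lh < 0) {
            return -1; // left subtree is itself unbalanced
        }
        int rh = checkedHeight(head.right);
        if (rh < 0) {
            return -1; // right subtree is itself unbalanced
        }
        if (Math.abs(lh - rh) > 1) {
            return -1; // depths differ by more than 1 at this node
        }
        return 1 + Math.max(lh, rh);
    }

    static boolean isBalanced(BiNode head) {
        return checkedHeight(head) >= 0;
    }
}
```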
- All operations on AVL trees are O(log2 n).
- The AVL tree is height balanced: at every node the heights of the left and right subtrees differ by at most one, which keeps the overall height within about 1.44 log2 n.
- Longest path to a leaf can’t differ from other paths to leaves by more than O(log n).
Red-black tree
1. A node is either red or black.
2. The root is black.
3. All leaves are black (the leaves are the NIL nodes).
4. Every red node must have two black child nodes. (So no path from a leaf to the root contains two consecutive red nodes.)
5. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes.
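The red-black properties can be verified mechanically. The sketch below checks properties 2, 4 and 5 on a hand-built tree, treating `null` as a black NIL leaf (property 3) and encoding the color as a boolean (property 1). It does not check the binary-search-tree ordering, and the `Node` layout is an assumption made for this illustration.

```java
class RedBlackCheck {
    static class Node {
        int data;
        boolean red;
        Node left, right;

        Node(int data, boolean red, Node left, Node right) {
            this.data = data;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    // null stands for a NIL leaf, which is black.
    static boolean isRed(Node n) {
        return n != null && n.red;
    }

    // Returns the black height of the subtree (NIL counts as 1),
    // or -1 if property 4 or property 5 is violated somewhere below.
    static int blackHeight(Node n) {
        if (n == null) {
            return 1;
        }
        if (n.red && (isRed(n.left) || isRed(n.right))) {
            return -1; // a red node has a red child
        }
        int lh = blackHeight(n.left);
        int rh = blackHeight(n.right);
        if (lh < 0 || rh < 0 || lh != rh) {
            return -1; // black counts differ between paths
        }
        return lh + (n.red ? 0 : 1);
    }

    static boolean isRedBlack(Node root) {
        return !isRed(root) && blackHeight(root) > 0;
    }
}
```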
Performance analysis:
Properties (4) and (5) guarantee that the longest path from the root to a leaf is at most twice as long as the shortest one.
Many insert and remove operations can preserve this without having to do a rotation.
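For reference, a rotation is a local pointer rearrangement that changes subtree heights while keeping the in-order sequence of keys intact. A minimal sketch of a left rotation, assuming nodes without parent pointers and a non-null right child; the names are illustrative.

```java
class Rotation {
    static class Node {
        int data;
        Node left, right;

        Node(int data) {
            this.data = data;
        }
    }

    // Rotates the subtree rooted at x to the left and returns the new subtree root.
    // Assumes x.right is not null.
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left; // y's left subtree holds keys between x and y
        y.left = x;       // x becomes the left child of y
        return y;
    }
}
```

The caller is expected to store the returned node where `x` used to hang, for example `parent.left = rotateLeft(parent.left)`. A right rotation is the mirror image.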
AVL trees versus red-black trees:
- AvlTrees are slightly better balanced than RedBlackTrees.
- Both trees take O(log n) time overall for lookups, insertions, and deletions. An AVL insertion needs at most one single or double rotation, but an AVL deletion may need O(log n) rotations, while a red-black tree needs only O(1) rotations for either operation. Since rotations mean writing to memory, and writing to memory is expensive, red-black trees are in practice often faster to update than AVL trees.
- Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time. Suited to: time-sensitive applications such as real-time applications; data structures which provide worst-case guarantees
- AVL trees are more rigidly balanced than red–black trees, leading to slower insertion and removal but faster retrieval. Suited to: data structures that may be built once and loaded without reconstruction, such as language dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter).
Reference:
Wikipedia
B-tree, B+ tree, B* tree
Used for index lookups over files held in external storage
- B-tree:
- B+ tree:
- B* tree:
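Hash table
Hash functions
1. Direct addressing method
2. Division (remainder) method
3. Digit analysis method
4. Mid-square method
5. Folding method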
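Three of these hash functions are easy to sketch for non-negative integer keys. The parameter choices below (the linear coefficients `a` and `b`, the modulus `p`, and taking decimal digits from the middle of the square) are assumptions made for illustration; the notes do not fix them.

```java
class HashFunctions {
    // Direct addressing: a linear function of the key, H(key) = a * key + b.
    static int directAddress(int key, int a, int b) {
        return a * key + b;
    }

    // Division method: H(key) = key mod p, where p is usually a prime
    // no larger than the table length.
    static int division(int key, int p) {
        return Math.floorMod(key, p);
    }

    // Mid-square method: square the key and take `digits` decimal digits
    // from the middle of the result. Assumes key >= 0 and that the square
    // has at least `digits` digits.
    static int midSquare(int key, int digits) {
        String square = Long.toString((long) key * key);
        int start = (square.length() - digits) / 2;
        return Integer.parseInt(square.substring(start, start + digits));
    }

    public static void main(String[] args) {
        System.out.println(division(47, 11));   // prints 3
        System.out.println(midSquare(1234, 3)); // 1234^2 = 1522756, prints 227
    }
}
```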
Collision handling methods
1. Open addressing (closed hash table)
(1) Linear probing
(2) Quadratic probing
(3) Random probing
Linear probing code:
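A minimal sketch of a closed hash table with linear probing, assuming integer keys, the division hash H(key) = key mod m with m the table length, and no deletions (deleting from an open-addressing table needs extra "deleted" markers, which this sketch leaves out). Class and method names are illustrative.

```java
class LinearProbingTable {
    private final Integer[] slots; // null marks an empty slot

    LinearProbingTable(int capacity) {
        slots = new Integer[capacity];
    }

    private int hash(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Static search: returns the slot index of key, or -1 if it is absent.
    int search(int key) {
        int start = hash(key);
        for (int i = 0; i < slots.length; i++) {
            int j = (start + i) % slots.length; // probe the next slot, wrapping around
            if (slots[j] == null) {
                return -1; // reached an empty slot: key cannot be further along
            }
            if (slots[j] == key) {
                return j;
            }
        }
        return -1; // table is full and key is not in it
    }

    // Dynamic search: returns the slot of key, inserting it at the first
    // empty slot on its probe sequence if absent. Returns -1 if the table is full.
    int searchOrInsert(int key) {
        int start = hash(key);
        for (int i = 0; i < slots.length; i++) {
            int j = (start + i) % slots.length;
            if (slots[j] == null) {
                slots[j] = key;
                return j;
            }
            if (slots[j] == key) {
                return j;
            }
        }
        return -1;
    }
}
```

With a table of length 11, the keys 7 and 29 both hash to slot 7, so the second one is placed in slot 8.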
2. Separate chaining (open hash table)
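In separate chaining, each table slot holds a list of all keys that hash to it, so colliding keys simply share a chain and deletion is straightforward. A minimal sketch under the same assumptions as before (integer keys, division hash); the names are illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedTable {
    private final List<LinkedList<Integer>> buckets = new ArrayList<>();

    ChainedTable(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    private LinkedList<Integer> bucketFor(int key) {
        return buckets.get(Math.floorMod(key, buckets.size()));
    }

    boolean contains(int key) {
        return bucketFor(key).contains(key);
    }

    // Inserts key at the head of its chain; returns false if it was already present.
    boolean insert(int key) {
        LinkedList<Integer> chain = bucketFor(key);
        if (chain.contains(key)) {
            return false;
        }
        chain.addFirst(key);
        return true;
    }

    // Removes key from its chain; returns false if it was not present.
    boolean remove(int key) {
        // Integer.valueOf selects remove(Object) rather than remove(int index).
        return bucketFor(key).remove(Integer.valueOf(key));
    }
}
```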
Comparison of open and closed hash tables:
Overall comparison
Reference: 《数据结构(C++版)》 (Data Structures, C++ Edition), 王红梅 (Wang Hongmei)