trie树常用于搜索提示。如当输入一个网址,可以自动搜索出可能的选择。当没有完全匹配的搜索结果,可以返回前缀最相似的可能。
一、Tire树的基本性质
- 根节点不包含字符,除根节点外每一个节点都只包含一个字符。
- 从根节点到某一节点,路径上经过的字符连接起来,为该节点对应的字符串。
- 每个节点的所有子节点包含的字符都不相同。
Trie 树的本质,就是利用字符串之间的公共前缀,将重复的前缀合并在一起,比如我们有[b,abc,abd,bcd,abcd,efg,hii ]这个字符串集合,可以将其构建成下面这棵 Trie 树:
每个节点表示一个字符串中的字符,从根节点到红色节点的一条路径表示一个字符串(红色节点表示是某个单词的结束字符,但不一定都是叶子节点)。这样,我们就可以通过遍历这棵树来检索是否存在待匹配的字符串了
二、如何实现Tire树
Tire主要包含两个操作,一个是将字符串集合构造成 Trie 树。这个过程分解开来的话,就是一个将字符串插入到 Trie 树的过程。另一个是在 Trie 树中查询一个字符串。
Trie 树是个多叉树,在这里用数组来存储一个节点的所有子结点。
Trie树节点类,PHP代码实现:
1 <?php 2 /** 3 * TrieNode.php 4 * Created on 2019/4/29 14:53 5 * Created by Wilin 6 */ 7 8 class TrieNode 9 { 10 public $data; 11 public $children = []; 12 public $isEndingChar = false; 13 14 public function __construct($data) 15 { 16 $this->data = $data; 17 } 18 }
Trie树,PHP代码实现:
1 <?php 2 /** 3 * Tire.php 4 * Created on 2019/4/29 14:57 5 * Created by Wilin 6 */ 7 8 include "TrieNode.php"; 9 10 class Tire { 11 private $root; 12 13 public function __construct() { 14 $this->root = new TrieNode('/'); //根节点 15 } 16 17 public function getRoot() { 18 return $this->root; 19 } 20 21 public function insert($text) { 22 $p = $this->root; 23 for ($i = 0; $i < mb_strlen($text); $i++) { 24 $index = $data = $text[$i]; 25 26 if (empty($p->children[$index])) { 27 $newNode = new TrieNode($data); 28 $p->children[$index] = $newNode; 29 } 30 $p = $p->children[$index]; 31 } 32 $p->isEndingChar = true; 33 } 34 35 public function find($pattern) { 36 $p = $this->root; 37 for ($i = 0; $i < mb_strlen($pattern); $i++) { 38 $index = $data = $pattern[$i]; 39 40 if (empty($p->children[$index])) { 41 return false; 42 } 43 $p = $p->children[$index]; 44 } 45 if ($p->isEndingChar == false) { 46 return false; 47 } 48 return true; 49 } 50 } 51 52 $trie = new Tire(); 53 $strings = ['b','abc','abd','bcd','abcd','efg','hii']; 54 foreach ($strings as $str) { 55 $trie->insert($str); 56 } 57 if ($trie->find('bcd')) { 58 print "包含这个字符串 "; 59 } else { 60 print "不包含这个字符串 "; 61 } 62 print_r($trie->getRoot());
打印结果如下:
E:www ree3>php Tire.php 包含这个字符串 TrieNode Object ( [data] => / [children] => Array ( [b] => TrieNode Object ( [data] => b [children] => Array ( [c] => TrieNode Object ( [data] => c [children] => Array ( [d] => TrieNode Object ( [data] => d [children] => Array ( ) [isEndingChar] => 1 ) ) [isEndingChar] => ) ) [isEndingChar] => 1 ) [a] => TrieNode Object ( [data] => a [children] => Array ( [b] => TrieNode Object ( [data] => b [children] => Array ( [c] => TrieNode Object ( [data] => c [children] => Array ( [d] => TrieNode Object ( [data] => d [children] => Array ( ) [isEndingChar] => 1 ) ) [isEndingChar] => 1 ) [d] => TrieNode Object ( [data] => d [children] => Array ( ) [isEndingChar] => 1 ) ) [isEndingChar] => ) ) [isEndingChar] => ) [e] => TrieNode Object ( [data] => e [children] => Array ( [f] => TrieNode Object ( [data] => f [children] => Array ( [g] => TrieNode Object ( [data] => g [children] => Array ( ) [isEndingChar] => 1 ) ) [isEndingChar] => ) ) [isEndingChar] => ) [h] => TrieNode Object ( [data] => h [children] => Array ( [i] => TrieNode Object ( [data] => i [children] => Array ( [i] => TrieNode Object ( [data] => i [children] => Array ( ) [isEndingChar] => 1 ) ) [isEndingChar] => ) ) [isEndingChar] => ) ) [isEndingChar] => )
参考资料:https://www.cnblogs.com/luosongchao/p/3239521.html,https://articles.zsxq.com/id_qa0npqvszcmx.html