• Highlighting search results with regular expressions


    Example:

    <style>
    	.word-0{ background-color: yellow; }
    	.word-1{ border:1px solid red; }
    </style>
    
    <?php
    
    header('Content-type:text/html;charset=utf-8');
    
    /* The web page to mark up */
    $body = '
    <p>I like pickles and hrrring.</p>
    <a href="pickle.php"><img width="200" src="pickle.png">A pickle pic</a>
    I have herringbone-patterned toaster cozy.
    <herring>Herring is not a real HTML element!</herring>
    ';
    
    $words = array('pickle', 'herring');
    $replacements = array();
    foreach($words as $i => $word) {
    	$replacements[] = "<span class='word-$i'>$word</span>";
    }
    
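    // Split the page into chunks,
    // using anything that looks like an HTML element as the delimiter.
    // The double quotes inside the pattern are escaped because the pattern is a double-quoted PHP string.
    $parts = preg_split("{(<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>)}", $body, -1, PREG_SPLIT_DELIM_CAPTURE);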
    //var_dump($parts);
    /*
    array (size=15)
      0 => string '
    ' (length=2)
      1 => string '<p>' (length=3)
      2 => string 'I like pickles and hrrring.' (length=27)
      3 => string '</p>' (length=4)
      4 => string '
    ' (length=2)
      5 => string '<a href="pickle.php">' (length=21)
      6 => string '' (length=0)
      7 => string '<img width="200" src="pickle.png">' (length=34)
      8 => string 'A pickle pic' (length=12)
      9 => string '</a>' (length=4)
      10 => string '
    I have herringbone-patterned toaster cozy.
    ' (length=46)
      11 => string '<herring>' (length=9)
      12 => string 'Herring is not a real HTML element!' (length=35)
      13 => string '</herring>' (length=10)
      14 => string '
    ' (length=2)
    */
    
    foreach($parts as $i => $part) {
    	// Skip this part if it is an HTML element
    	if(isset($part[0]) && ($part[0] == '<')) { continue; }
    	// Wrap the search words in <span> elements
    	$parts[$i] = str_replace($words, $replacements, $part);
    }
    
    $body = implode('', $parts);
    
    echo $body;
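
    To see why the page is split into tags and text first, the following minimal sketch runs a plain str_replace() over a whole fragment. The $html and $naive variables are illustrative and not part of the listing above.

```php
<?php
// Illustrative fragment: the word "pickle" appears both inside a tag
// attribute and in the visible text.
$html = '<a href="pickle.php">A pickle pic</a>';

// Naive approach: replace everywhere, including inside the tag itself.
$naive = str_replace('pickle', "<span class='word-0'>pickle</span>", $html);

// The href attribute now contains markup, so the link is broken.
echo $naive;
```

    Skipping every chunk that starts with < avoids this, because replacements are then applied only to the text between tags.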
    

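    Explanation:

    The regular expression passed to preg_split() matches HTML tags. Because the whole pattern sits inside a capturing group and PREG_SPLIT_DELIM_CAPTURE is set, the matched tags are kept in the resulting array instead of being discarded.

    <(?:"[^"]*"|'[^']*'|[^'">])*>

    It can be read as follows:

    <                                // opening angle bracket
        (?:                          // any number of:
            "[^"]*"                  //   a double-quoted string
            |                        //   or
            '[^']*'                  //   a single-quoted string
            |                        //   or
            [^'">]                   //   any character other than ', " or >
        )*
    >                                // closing angle bracket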
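
    The quoted-string alternatives matter when an attribute value itself contains a > character. A small sketch, assuming a made-up fragment in $html (the $tagPattern name is also only illustrative):

```php
<?php
// The same tag-matching pattern, with the double quotes escaped because
// it is written here as a double-quoted PHP string.
$tagPattern = "{(<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>)}";

// Hypothetical input: the title attribute contains a '>' character.
$html = '<a title="a > b">pickle</a>';

// The quoted attribute value is consumed as a unit, so the '>' inside it
// does not end the tag early.
$parts = preg_split($tagPattern, $html, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($parts);
```

    Here the opening tag comes back as the single element <a title="a > b">, and pickle remains a separate text chunk.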

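    However, the str_replace() version above cannot highlight the last Herring, because its first letter is capitalized and str_replace() is case-sensitive. To make the matching fully case-insensitive, replace str_replace() with preg_replace():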

    <style>
    	.word-0{ background-color: yellow; }
    	.word-1{ border:1px solid red; }
    </style>
    
    <?php
    
    header('Content-type:text/html;charset=utf-8');
    
    /* The web page to mark up */
    $body = '
    <p>I like pickles and hrrring.</p>
    <a href="pickle.php"><img width="200" src="pickle.png">A pickle pic</a>
    I have herringbone-patterned toaster cozy.
    <herring>Herring is not a real HTML element!</herring>
    ';
    
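    $words = array('pickle', 'herring');
    $patterns = array();
    $replacements = array();
    foreach($words as $i => $word) {
    	// preg_quote() puts a backslash in front of every character in its argument that is special in regex syntax:
    	// . \ + * ? [ ^ ] $ ( ) { } = ! < > | : - (PHP 7.3 and later also escape #).
    	// Passing '/' as the second argument escapes the pattern delimiter as well.
    	$patterns[] = '/'.preg_quote($word, '/').'/i';
    	// "\\0" yields the backreference \0 (the whole match), so the original capitalization is kept.
    	// A bare "\0" in a double-quoted string would be a NUL byte instead.
    	$replacements[] = "<span class='word-$i'>\\0</span>";
    }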
    
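    // Split the page into chunks,
    // using anything that looks like an HTML element as the delimiter.
    // The double quotes inside the pattern are escaped because the pattern is a double-quoted PHP string.
    $parts = preg_split("{(<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>)}", $body, -1, PREG_SPLIT_DELIM_CAPTURE);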
    //var_dump($parts);
    /*
    array (size=15)
      0 => string '
    ' (length=2)
      1 => string '<p>' (length=3)
      2 => string 'I like pickles and hrrring.' (length=27)
      3 => string '</p>' (length=4)
      4 => string '
    ' (length=2)
      5 => string '<a href="pickle.php">' (length=21)
      6 => string '' (length=0)
      7 => string '<img width="200" src="pickle.png">' (length=34)
      8 => string 'A pickle pic' (length=12)
      9 => string '</a>' (length=4)
      10 => string '
    I have herringbone-patterned toaster cozy.
    ' (length=46)
      11 => string '<herring>' (length=9)
      12 => string 'Herring is not a real HTML element!' (length=35)
      13 => string '</herring>' (length=10)
      14 => string '
    ' (length=2)
    */
    
    foreach($parts as $i => $part) {
    	// Skip this part if it is an HTML element
    	if(isset($part[0]) && ($part[0] == '<')) { continue; }
    	// Wrap the search words in <span> elements
    	$parts[$i] = preg_replace($patterns, $replacements, $part);
    }
    
    $body = implode('', $parts);
    
    echo $body;
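
    The patterns above also match inside longer words, so the herring in herringbone is highlighted too. If only whole words should be marked, one possible variant adds \b word-boundary assertions. This is a sketch, not part of the original listing; the highlight_words() helper is a hypothetical name introduced here.

```php
<?php
/**
 * Hypothetical helper: highlights whole words only, case-insensitively,
 * and leaves anything that looks like an HTML tag untouched.
 */
function highlight_words($body, array $words)
{
    $patterns = array();
    $replacements = array();
    foreach ($words as $i => $word) {
        // \b anchors the match at word boundaries, so "herring" no longer
        // matches inside "herringbone".
        $patterns[] = '/\b' . preg_quote($word, '/') . '\b/i';
        // \\0 becomes the backreference \0, i.e. the text that matched.
        $replacements[] = "<span class='word-$i'>\\0</span>";
    }

    // Split into tags and text, keeping the tags in the result.
    $parts = preg_split("{(<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>)}", $body, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // Skip chunks that are HTML elements.
        if (isset($part[0]) && $part[0] == '<') { continue; }
        $parts[$i] = preg_replace($patterns, $replacements, $part);
    }
    return implode('', $parts);
}

echo highlight_words('<p>Herring on a herringbone cozy.</p>', array('pickle', 'herring'));
```

    The trade-off is that plural forms such as pickles no longer match either, so whether to use \b depends on how strict the search should be.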
    

      

    References:

    PHP Cookbook, 3rd edition

    Mastering Regular Expressions, 3rd edition

  • Original article: https://www.cnblogs.com/dee0912/p/5410703.html