• Highlighting search results with regular expressions


    Example:

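    <?php
    // Send the header before any output: the <style> block below counts as
    // output, and header() fails after output unless output buffering is on.
    header('Content-Type: text/html; charset=utf-8');
    ?>
    <style>
    	.word-0{ background-color: yellow; }
    	.word-1{ border:1px solid red; }
    </style>
    
    <?php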
    
    /* The web page to mark up */
    $body = '
    <p>I like pickles and herring.</p>
    <a href="pickle.php"><img width="200" src="pickle.png">A pickle pic</a>
    I have herringbone-patterned toaster cozy.
    <herring>Herring is not a real HTML element!</herring>
    ';
    
    $words = array('pickle', 'herring');
    $replacements = array();
    foreach($words as $i => $word) {
    	$replacements[] = "<span class='word-$i'>$word</span>";
    }
    
    // Split the page into chunks,
    // delimited by anything that looks like an HTML element.
    // The double quotes inside the pattern must be escaped in a double-quoted PHP string.
    $parts = preg_split("{(<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>)}", $body, -1, PREG_SPLIT_DELIM_CAPTURE);
    //var_dump($parts);
    /*
    array (size=15)
      0 => string '
    ' (length=2)
      1 => string '<p>' (length=3)
      2 => string 'I like pickles and herring.' (length=27)
      3 => string '</p>' (length=4)
      4 => string '
    ' (length=2)
      5 => string '<a href="pickle.php">' (length=21)
      6 => string '' (length=0)
      7 => string '<img width="200" src="pickle.png">' (length=34)
      8 => string 'A pickle pic' (length=12)
      9 => string '</a>' (length=4)
      10 => string '
    I have herringbone-patterned toaster cozy.
    ' (length=46)
      11 => string '<herring>' (length=9)
      12 => string 'Herring is not a real HTML element!' (length=35)
      13 => string '</herring>' (length=10)
      14 => string '
    ' (length=2)
    */
    
    foreach($parts as $i => $part) {
    	// Skip this part if it is an HTML element
    	if(isset($part[0]) && ($part[0] == '<')) { continue; }
    	// Wrap the matched words in <span> elements
    	$parts[$i] = str_replace($words, $replacements, $part);
    }
    
    $body = implode('', $parts);
    
    echo $body;
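
    The reason for splitting on tags before replacing can be seen in a small sketch. It uses a hypothetical one-line input (not the page from the listing) and applies str_replace() to the whole string, so the search word inside the href attribute is replaced as well:

```php
<?php
// Hypothetical minimal input: a link whose href contains the search word.
$html = '<a href="pickle.php">A pickle pic</a>';

// Naive approach: replace everywhere, including inside the tag itself.
$naive = str_replace('pickle', "<span class='word-0'>pickle</span>", $html);

echo $naive, "\n";
// The href attribute now contains a <span>, so the link is broken:
// <a href="<span class='word-0'>pickle</span>.php">A <span class='word-0'>pickle</span> pic</a>
```

    Skipping every chunk that starts with < avoids this, because attribute values are never passed to the replacement step.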
    

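    Explanation:

    The regular expression passed to preg_split() matches HTML tags:

    <(?:"[^"]*"|'[^']*'|[^'">])*>

    It can be read as follows:

    <                                // opening angle bracket
        (?:                         // any number of:
            "[^"]*"                //   a double-quoted string
            |                        //   or
            '[^']*'                  //   a single-quoted string
            |                        //   or
            [^'">]                  //   any character other than ', " or >
        )*                          
    >                                // closing angle bracket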
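
    As a quick check of how the quoted-string alternatives behave, the following sketch runs the same pattern over a hypothetical tag whose attribute value contains a > character. The variable names and the input string are illustrative only:

```php
<?php
// Same tag pattern as above, written in a single-quoted PHP string
// so that only the single quotes need escaping.
$tagPattern = '{<(?:"[^"]*"|\'[^\']*\'|[^\'">])*>}';

// Hypothetical input: the title attribute contains a ">" inside quotes.
$html = '<a title="a > b" href=\'x.php\'>link</a>';

preg_match_all($tagPattern, $html, $matches);
print_r($matches[0]);
// $matches[0] holds two tags:
//   <a title="a > b" href='x.php'>
//   </a>
```

    The > inside the quoted title does not end the tag, because the whole quoted string is consumed by the first alternative before the closing bracket is considered.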

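    However, this approach cannot highlight the last "Herring", because str_replace() is case-sensitive and that word starts with a capital letter. To make the replacement fully case-insensitive, switch from str_replace() to preg_replace():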

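    <?php
    // Send the header before any output: the <style> block below counts as
    // output, and header() fails after output unless output buffering is on.
    header('Content-Type: text/html; charset=utf-8');
    ?>
    <style>
    	.word-0{ background-color: yellow; }
    	.word-1{ border:1px solid red; }
    </style>
    
    <?php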
    
    /* The web page to mark up */
    $body = '
    <p>I like pickles and herring.</p>
    <a href="pickle.php"><img width="200" src="pickle.png">A pickle pic</a>
    I have herringbone-patterned toaster cozy.
    <herring>Herring is not a real HTML element!</herring>
    ';
    
    $words = array('pickle', 'herring');
    $patterns = array();
    $replacements = array();
    foreach($words as $i => $word) {
    	// preg_quote() puts a backslash in front of every regex metacharacter in $word:
    	// . \ + * ? [ ^ ] $ ( ) { } = ! < > | : - (and # as of PHP 7.3).
    	// Passing '/' as the second argument also escapes the pattern delimiter.
    	$patterns[] = '/'.preg_quote($word, '/').'/i';
    	// \\0 is the backreference to the whole match; a bare "\0" in a
    	// double-quoted string would be a NUL byte instead.
    	$replacements[] = "<span class='word-$i'>\\0</span>";
    }
    
    // Split the page into chunks,
    // delimited by anything that looks like an HTML element.
    // The double quotes inside the pattern must be escaped in a double-quoted PHP string.
    $parts = preg_split("{(<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>)}", $body, -1, PREG_SPLIT_DELIM_CAPTURE);
    //var_dump($parts);
    /*
    array (size=15)
      0 => string '
    ' (length=2)
      1 => string '<p>' (length=3)
      2 => string 'I like pickles and herring.' (length=27)
      3 => string '</p>' (length=4)
      4 => string '
    ' (length=2)
      5 => string '<a href="pickle.php">' (length=21)
      6 => string '' (length=0)
      7 => string '<img width="200" src="pickle.png">' (length=34)
      8 => string 'A pickle pic' (length=12)
      9 => string '</a>' (length=4)
      10 => string '
    I have herringbone-patterned toaster cozy.
    ' (length=46)
      11 => string '<herring>' (length=9)
      12 => string 'Herring is not a real HTML element!' (length=35)
      13 => string '</herring>' (length=10)
      14 => string '
    ' (length=2)
    */
    
    foreach($parts as $i => $part) {
    	// Skip this part if it is an HTML element
    	if(isset($part[0]) && ($part[0] == '<')) { continue; }
    	// Wrap the matched words in <span> elements
    	$parts[$i] = preg_replace($patterns, $replacements, $part);
    }
    
    $body = implode('', $parts);
    
    echo $body;
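
    The case-insensitive variant depends on two details: the /i modifier and a backreference to the whole match, which re-inserts the text exactly as it appeared in the page. The sketch below isolates both for a single word; the variable names are illustrative, and the replacement is written in a single-quoted string so the backslash needs no doubling:

```php
<?php
// Case-insensitive pattern for the search word "herring".
$pattern = '/' . preg_quote('herring', '/') . '/i';

// In a single-quoted string, \0 reaches preg_replace() unchanged and stands
// for the whole match. In a double-quoted string it must be written "\\0",
// because "\0" on its own is PHP's escape for a NUL byte.
$replacement = '<span class="word-1">\0</span>';

$result = preg_replace($pattern, $replacement, 'Herring is not a real HTML element!');
echo $result, "\n";
// <span class="word-1">Herring</span> is not a real HTML element!
```

    Because the matched text is reused, the capital H survives, whereas a fixed replacement string would force every hit to lowercase.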
    
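
    The search words in the listings contain only letters, so preg_quote() has nothing to escape there. A hypothetical search term with regex metacharacters, such as "C++", shows what the call contributes; this is a sketch under that assumption, not part of the original example:

```php
<?php
// Hypothetical search term containing regex metacharacters.
$word = 'C++';

// The second argument additionally escapes the delimiter used in the pattern.
$quoted = preg_quote($word, '/');
echo $quoted, "\n"; // C\+\+

$highlighted = preg_replace('/' . $quoted . '/i', '<b>\0</b>', 'I write c++ and C++.');
echo $highlighted, "\n"; // I write <b>c++</b> and <b>C++</b>.
```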

      

    References:

    PHP Cookbook, 3rd edition

    Mastering Regular Expressions, 3rd edition

  • Original article: https://www.cnblogs.com/dee0912/p/5410703.html