• A roundup of PHP regular expressions for filtering HTML tags, attributes, and more


    $str = preg_replace("/\s+/", " ", $str); // collapse each run of whitespace (line breaks included) into one space
    $str = preg_replace("/<[ ]+/si", "<", $str); // remove spaces right after "<" (e.g. "<  p>")
      
    $str = preg_replace("/<!--.*?-->/si", "", $str); // remove comments
    $str = preg_replace("/<(!.*?)>/si", "", $str); // remove DOCTYPE and other <!...> declarations
    $str = preg_replace("/<(\/?html.*?)>/si", "", $str); // remove html tags
    $str = preg_replace("/<(\/?head.*?)>/si", "", $str); // remove head tags
    $str = preg_replace("/<(\/?meta.*?)>/si", "", $str); // remove meta tags
    $str = preg_replace("/<(\/?body.*?)>/si", "", $str); // remove body tags
    $str = preg_replace("/<(\/?link.*?)>/si", "", $str); // remove link tags
    $str = preg_replace("/<(\/?form.*?)>/si", "", $str); // remove form tags
    $str = preg_replace("/cookie/si", "COOKIE", $str); // upper-case every "cookie" (JavaScript is case-sensitive, so document.cookie no longer matches)
      
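    $str = preg_replace("/<(applet.*?)>(.*?)<(\/applet.*?)>/si", "", $str); // remove applet elements together with their content
    $str = preg_replace("/<(\/?applet.*?)>/si", "", $str); // remove any leftover applet tag

    $str = preg_replace("/<(style.*?)>(.*?)<(\/style.*?)>/si", "", $str); // remove style elements together with their content
    $str = preg_replace("/<(\/?style.*?)>/si", "", $str); // remove any leftover style tag

    $str = preg_replace("/<(title.*?)>(.*?)<(\/title.*?)>/si", "", $str); // remove title elements together with their content
    $str = preg_replace("/<(\/?title.*?)>/si", "", $str); // remove any leftover title tag

    $str = preg_replace("/<(object.*?)>(.*?)<(\/object.*?)>/si", "", $str); // remove object elements together with their content
    $str = preg_replace("/<(\/?object.*?)>/si", "", $str); // remove any leftover object tag

    $str = preg_replace("/<(noframes.*?)>(.*?)<(\/noframes.*?)>/si", "", $str); // remove noframes elements together with their content
    $str = preg_replace("/<(\/?noframes.*?)>/si", "", $str); // remove any leftover noframes tag

    $str = preg_replace("/<(i?frame.*?)>(.*?)<(\/i?frame.*?)>/si", "", $str); // remove frame and iframe elements together with their content
    $str = preg_replace("/<(\/?i?frame.*?)>/si", "", $str); // remove any leftover frame or iframe tag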
      
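    $str = preg_replace("/<(script.*?)>(.*?)<(\/script.*?)>/si", "", $str); // remove script elements together with their content
    $str = preg_replace("/<(\/?script.*?)>/si", "", $str); // remove any leftover script tag
    $str = preg_replace("/javascript/si", "Javascript", $str); // rewrite "javascript" as "Javascript" (capitalization only)
    $str = preg_replace("/vbscript/si", "Vbscript", $str); // rewrite "vbscript" as "Vbscript" (capitalization only)
    $str = preg_replace("/on([a-z]+)\s*=/si", "On\\1=", $str); // rewrite event attributes such as "onclick =" as "Onclick="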
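    $str = preg_replace("/&#/si", "&amp;#", $str); // break numeric character references, e.g. an entity-encoded javascript:alert( (the replacement "&amp;#" is an assumption; as extracted, the line replaces "&#" with itself)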
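
The element rules above repeat the same pair of patterns for each tag name. As a minimal sketch, not the original author's code, the pair can be driven from an array. The helper name `strip_container_tags` is hypothetical, and the `\b` word boundary is an assumption of this sketch (it keeps `style` from matching `<styled>`, but it also means `frame` no longer matches `<frameset>`).

```php
<?php
// Hypothetical helper: for every tag name, first remove the paired element
// with its content, then remove any opening or closing tag left on its own.
function strip_container_tags(string $str, array $tags): string
{
    foreach ($tags as $tag) {
        $t = preg_quote($tag, '/');
        // Paired form: <tag ...> ... </tag ...>
        $str = preg_replace('/<' . $t . '\b[^>]*>.*?<\/' . $t . '\b[^>]*>/si', '', $str) ?? $str;
        // Unpaired leftover: <tag ...> or </tag>
        $str = preg_replace('/<\/?' . $t . '\b[^>]*>/si', '', $str) ?? $str;
    }
    return $str;
}

$html = '<p>Hi</p><script type="text/javascript">alert(1)</script><style>p{color:red}</style>';
$tags = ['applet', 'style', 'title', 'object', 'noframes', 'iframe', 'frame', 'script'];
echo strip_container_tags($html, $tags); // <p>Hi</p>
```

The `?? $str` fallback keeps the previous value if `preg_replace` returns `null` on a PCRE error. A blacklist of this kind only covers the tags it names, so it is better suited to tidying up trusted markup than to sanitizing untrusted input.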
    

    Removing whitespace and line breaks

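    function DeleteHtml($str)
    {
        $str = trim($str);
        $str = strip_tags($str);
        // ereg_replace() was removed in PHP 7.0; the searches are literal strings, so str_replace() is enough
        $str = str_replace("\t", "", $str);   // tabs
        $str = str_replace("\r\n", "", $str); // Windows line breaks
        $str = str_replace("\r", "", $str);   // carriage returns
        $str = str_replace("\n", "", $str);   // line feeds
        $str = str_replace(" ", " ", $str);   // as extracted, both arguments are a plain space, so this line changes nothing
        return trim($str);
    }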
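
Because `str_replace` accepts an array of search strings, the same cleanup can be written more compactly. The following is a sketch under that assumption. The function name `delete_html_compact` is hypothetical and not part of the original post.

```php
<?php
// Hypothetical compact variant: strip tags, then drop tabs and line breaks.
function delete_html_compact(string $str): string
{
    $str = strip_tags(trim($str));
    // Search strings are applied in order, so "\r\n" is handled before lone "\r" or "\n".
    $str = str_replace(["\t", "\r\n", "\r", "\n"], '', $str);
    return trim($str);
}

echo delete_html_compact("  <p>Hello,\r\n\t<b>world</b></p>\n"); // Hello,world
```

Removing line breaks outright joins the words on either side. If that is not wanted, replacing them with a single space is a small change to the second argument.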
    

    Filtering HTML attributes

    1. Regular expression that matches every HTML tag:

    </?[^>]+>
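
The pattern above is shown without delimiters. A minimal sketch of using it with `preg_replace` follows. The sample markup and variable names are illustrative assumptions, and `#` is chosen as the delimiter so the `/` inside the pattern needs no escaping.

```php
<?php
// Illustrative input, not taken from the original post.
$html = '<div class="box"><a href="/x">link</a> text<br/></div>';

// Remove every opening, closing, and self-closing tag, leaving only the text.
$text = preg_replace('#</?[^>]+>#', '', $html);

echo $text; // link text
```

For plain tag removal the built-in `strip_tags` does the same job. The regular expression is mainly useful when it has to be combined with other patterns.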
     
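    2. Regular expression that strips the attributes from every HTML tag:

    $html = preg_replace("/<([a-zA-Z][a-zA-Z0-9]*)[^>]*>/", "<\\1>", $html); // keep the tag name only; digits are allowed so that h1 to h6 are not truncated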
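
A short usage sketch for the attribute-stripping replacement, with illustrative input. Two details are assumptions of this sketch rather than part of the original post: the replacement is written as `'<$1>'` in a single-quoted string so that PHP does not treat the backreference as an escape sequence, and the tag-name class allows digits so that `h1` is kept whole.

```php
<?php
// Illustrative input, not taken from the original post.
$html = '<h1 id="top">Title</h1><p class="lead" onclick="go()">Hello <a href="/x" target="_blank">link</a></p>';

// Keep only the tag name of each opening tag; closing tags are left untouched.
$clean = preg_replace('/<([a-zA-Z][a-zA-Z0-9]*)[^>]*>/', '<$1>', $html);

echo $clean; // <h1>Title</h1><p>Hello <a>link</a></p>
```

The pattern stops at the first `>`, so an attribute value that itself contains `>` would be cut short. For such input a real HTML parser is the safer choice.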
    

      

      

      

  • Original article: https://www.cnblogs.com/vania/p/4431558.html