• simple_html_dom usage summary


    2. Simple examples
    <?php

    include "simple_html_dom.php";

    // Create DOM from URL or file
    $html = file_get_html('http://www.google.com/');

    // Find all images
    foreach ($html->find('img') as $element)
        echo $element->src . '<br>';

    // Find all links
    foreach ($html->find('a') as $element)
        echo $element->href . '<br>';

    // Create DOM from URL
    $html = file_get_html('http://slashdot.org/');

    // Find all article blocks
    foreach ($html->find('div.article') as $article) {
        $item['title']   = $article->find('div.title', 0)->plaintext;
        $item['intro']   = $article->find('div.intro', 0)->plaintext;
        $item['details'] = $article->find('div.details', 0)->plaintext;
        $articles[]= $item;
    }

    print_r($articles);

    // Create DOM from string
    $html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');

    $html->find('div', 1)->class = 'bar';

    $html->find('div[id=hello]', 0)->innertext = 'foo';

    echo $html; // Output: <div id="hello">foo</div><div id="world" class="bar">World</div>



    3. DOM methods

    $html = file_get_html('http://www.google.com/');  // $html has the methods listed in the table below
    $html->clear();                                   // calling a method

    DOM methods & properties

    Type    Name                                        Description
    void    __construct ( [string $filename] )          Constructor. If the filename parameter is set, the contents are loaded automatically, from either text or a file/URL.
    string  plaintext                                   Returns the contents extracted from HTML.
    void    clear ()                                    Cleans up memory.
    void    load ( string $content )                    Loads contents from a string.
    string  save ( [string $filename] )                 Dumps the internal DOM tree back into a string. If $filename is set, the resulting string is also saved to that file.
    void    load_file ( string $filename )              Loads contents from a file or a URL.
    void    set_callback ( string $function_name )      Sets a callback function.
    mixed   find ( string $selector [, int $index] )    Finds elements by CSS selector. Returns the Nth element object if index is set, otherwise an array of objects.
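
    The methods in the table above can be tried without network access by loading markup from a string. The following is a minimal sketch, assuming simple_html_dom.php sits in the same directory as the script; the sample markup and the variable names ($bold, $boldText, $dump) are made up for illustration.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Create an empty DOM object, then load markup from a string.
$html = new simple_html_dom();
$html->load('<p>Hello <b>World</b></p>');

// find() with an index returns a single element object.
$bold = $html->find('b', 0);
$boldText = $bold->innertext;
echo $boldText, "\n";

// save() with no argument dumps the DOM tree back into a string.
$dump = $html->save();
echo $dump, "\n";

// Release the memory held by the DOM tree once it is no longer needed.
$html->clear();
```

    Calling clear() at the end matters mostly in long-running scripts that parse many pages in a loop.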


    4. The find method in detail

    find ( string $selector [, int $index] ) 

    // Find all anchors, returns an array of element objects
    $ret = $html->find('a');

    // Find (N)th anchor, returns element object or null if not found (zero based)
    $ret = $html->find('a', 0);

    // Find the last anchor, returns element object or null if not found
    $ret = $html->find('a', -1); 

    // Find all <div> with the id attribute
    $ret = $html->find('div[id]');

    // Find all <div> whose id attribute is foo
    $ret = $html->find('div[id=foo]');

    // Find all elements with id=foo
    $ret = $html->find('#foo');

    // Find all elements with class=foo
    $ret = $html->find('.foo');

    // Find all elements that have an id attribute
    $ret = $html->find('*[id]');

    // Find all anchors and images 
    $ret = $html->find('a, img'); 

    // Find all anchors and images with the "title" attribute
    $ret = $html->find('a[title], img[title]');

    // Find all <li> in <ul> 
    $es = $html->find('ul li');

    // Find Nested <div> tags
    $es = $html->find('div div div'); 

    // Find all <td> in <table> which class="hello" 
    $es = $html->find('table.hello td');

    // Find all <td> tags with attribute align=center inside <table> tags
    $es = $html->find('table td[align=center]');
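
    The selectors above can be checked against a small inline document. This is a minimal sketch, assuming simple_html_dom.php is available next to the script; the markup and variable names are invented for illustration.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$html = str_get_html(
    '<ul><li class="foo">one</li><li id="bar">two</li></ul>'
    . '<a href="/x" title="t">link</a><img src="i.png">'
);

$items  = $html->find('ul li');        // descendant selector: array of <li> elements
$last   = $html->find('li', -1);       // negative index counts from the end
$mixed  = $html->find('a, img');       // comma joins two selectors
$titled = $html->find('a[title]', 0);  // first <a> that has a title attribute
$byCls  = $html->find('.foo', 0);      // first element with class foo
$none   = $html->find('table', 0);     // no match with an index gives null

echo count($items), "\n";
echo $last->id, "\n";
echo count($mixed), "\n";
echo $titled->href, "\n";
echo $byCls->innertext, "\n";
var_dump($none);
```

    Because an indexed find() returns null when nothing matches, it is worth checking the result before reading a property from it.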


    5. Element methods

    $e = $html->find("div", 0);  // $e has the attributes listed in the table below

    Attribute Name    Usage
    $e->tag           Read or write the tag name of element.
    $e->outertext     Read or write the outer HTML text of element.
    $e->innertext     Read or write the inner HTML text of element.
    $e->plaintext     Read or write the plain text of element.

    // Example
    $html = str_get_html("<div>foo <b>bar</b></div>");
    $e = $html->find("div", 0);

    echo $e->tag;       // Returns: "div"
    echo $e->outertext; // Returns: "<div>foo <b>bar</b></div>"
    echo $e->innertext; // Returns: "foo <b>bar</b>"
    echo $e->plaintext; // Returns: "foo bar"
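
    The attributes are writable as well as readable. A minimal sketch of the write side, assuming simple_html_dom.php is next to the script; the variable names $afterInner and $afterOuter are illustrative only.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$html = str_get_html('<div>foo <b>bar</b></div>');

// Writing innertext replaces the inner HTML of the element.
$html->find('b', 0)->innertext = 'baz';
$afterInner = $html->save();
echo $afterInner, "\n";

// Writing an empty outertext drops the whole element from the output.
$html->find('b', 0)->outertext = '';
$afterOuter = $html->save();
echo $afterOuter, "\n";
```

    The changes only show up when the DOM is dumped again, for example with save() or echo $html.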


    6. DOM traversing methods

    Type     Method                           Description
    mixed    $e->children ( [int $index] )    Returns the Nth child object if index is set, otherwise an array of children.
    element  $e->parent ()                    Returns the parent of the element.
    element  $e->first_child ()               Returns the first child of the element, or null if not found.
    element  $e->last_child ()                Returns the last child of the element, or null if not found.
    element  $e->next_sibling ()              Returns the next sibling of the element, or null if not found.
    element  $e->prev_sibling ()              Returns the previous sibling of the element, or null if not found.

    // Example
    echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
    // or 
    echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
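
    The traversal methods can be exercised on a small tree. A minimal sketch, assuming simple_html_dom.php is next to the script; the ids a, b and c and the variable names are invented for this example.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$html = str_get_html(
    '<div id="div1"><p id="a">A</p><p id="b">B</p><p id="c">C</p></div>'
);

$div    = $html->find('#div1', 0);
$middle = $div->children(1);           // second child, zero based

echo count($div->children()), "\n";    // number of child elements
echo $div->first_child()->id, "\n";
echo $div->last_child()->id, "\n";
echo $middle->prev_sibling()->id, "\n";
echo $middle->next_sibling()->id, "\n";
echo $middle->parent()->id, "\n";

// The first child has no previous sibling, so null comes back.
$before = $div->first_child()->prev_sibling();
var_dump($before);
```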


        Appendix: how to use the DOM method set_callback('my_callback')

    // Write a function with parameter "$element"
    function my_callback($element) {
        // Hide all <b> tags
        if ($element->tag == 'b')
            $element->outertext = '';
    }

    // Register the callback function by its function name
    $html->set_callback('my_callback');

    // Callback function will be invoked while dumping
    echo $html;
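
    Put together, the callback flow can be run end to end on an inline string. A minimal sketch, assuming simple_html_dom.php is next to the script; the function name hide_bold and the sample markup are hypothetical.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical callback: blank out every <b> element while dumping.
function hide_bold($element) {
    if ($element->tag == 'b') {
        $element->outertext = '';
    }
}

$html = str_get_html('<div>foo <b>bar</b></div>');

// Register the callback by its function name.
$html->set_callback('hide_bold');

// The callback is invoked for each node while the DOM is dumped.
$out = (string) $html;
echo $out, "\n";
```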
  • Original article: https://www.cnblogs.com/rockchip/p/3202552.html