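DOM and SimpleXML make working with XML flexible and convenient, because their in-memory structure closely mirrors the layout of the XML file itself. They share one drawback, though: both load the whole document into memory, so they struggle with large files.
Fortunately there is XMLReader, a stream-based parser. "Stream-based" means it reads and parses the document piece by piece, moving a cursor forward through the file, instead of loading everything into memory at once. That makes it suitable for large XML files. Its drawback is that it is less flexible and convenient to use, which is exactly where DOM and SimpleXML are strong.
So why not combine them to get a good way of parsing large files? I wrote a simple class that implements a small, admittedly modest, piece of this idea.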
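As a rough illustration of the stream-based style described above, here is a minimal sketch that uses plain XMLReader to collect every title from a document. The inline sample XML and the variable names are assumptions made for this example; a real large file would be opened with open() and a path instead of XML().

```php
<?php
// Hypothetical inline sample; for a large file you would call $reader->open($path).
$xml = '<movies><movie><title>A</title></movie><movie><title>B</title></movie></movies>';

$reader = new XMLReader();
$reader->XML($xml);

$titles = [];
// read() moves the cursor one node forward; only the current node is held.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'title') {
        // readString() returns the text content of the current element.
        $titles[] = $reader->readString();
    }
}
$reader->close();

echo implode(', ', $titles), PHP_EOL;
```

Even for a tiny task like this, the code has to check node types and names by hand at every step, which is the inconvenience mentioned above.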
The XML file
<?xml version='1.0' standalone='yes'?>
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character>
    <name>Ms. Coder</name>
    <actor>Onlivia Actora</actor>
   </character>
   <character>
    <name>Mr. Coder</name>
    <actor>El ActÓr</actor>
   </character>
  </characters>
  <plot>
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  </plot>
  <great-lines>
   <line>PHP solves all my web problems</line>
  </great-lines>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character>
    <name>Ms. Coder</name>
    <actor>Onlivia Actora</actor>
   </character>
   <character>
    <name>Mr. Coder</name>
    <actor>El ActÓr</actor>
   </character>
  </characters>
  <plot>
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  </plot>
  <great-lines>
   <line>PHP solves all my web problems</line>
  </great-lines>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
The implementation class
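class SimpleXmlReader extends XMLReader
{
    // $source is a file path/URI when $isfile is true, otherwise an XML string.
    public function __construct($source, $isfile = false)
    {
        if ($isfile) {
            $this->open($source);
        } else {
            $this->XML($source);
        }
    }

    // True when the cursor is on an element named $nodename.
    // A $depth of 0 matches at any depth.
    private function isMatch($nodename, $depth)
    {
        return $this->nodeType == self::ELEMENT
            && $this->localName == $nodename
            && (!$depth || $depth == $this->depth);
    }

    // Advance to the next matching element; returns false at end of input.
    public function getElement($nodename, $depth = 0)
    {
        if ($this->isMatch($nodename, $depth)) {
            // Skip the subtree of the element returned by the previous call.
            // next() may land directly on the following sibling (when there is
            // no whitespace between elements), so test it before calling read().
            if (!$this->next()) {
                return false;
            }
            if ($this->isMatch($nodename, $depth)) {
                return true;
            }
        }
        while ($this->read()) {
            if ($this->isMatch($nodename, $depth)) {
                return true;
            }
        }
        return false;
    }

    // Expand the current element into a SimpleXMLElement; false on failure.
    public function expandNodeToSimpleXml()
    {
        if ($this->nodeType != self::ELEMENT) {
            return false;
        }
        $node = $this->expand();
        if ($node === false) {
            return false;
        }
        // Copy the node into its own document so SimpleXML can wrap it.
        $dom = new DOMDocument();
        $n = $dom->appendChild($dom->importNode($node, true));
        return simplexml_import_dom($n);
    }
}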
Example code:
$xmlhl = new SimpleXmlReader('test.xml', true);
// Walk the file one <movie> at a time; only the current one is expanded.
while ($xmlhl->getElement('movie')) {
    $sxe = $xmlhl->expandNodeToSimpleXml();
    foreach ($sxe->characters->character as $character) {
        echo "name -> " . $character->name . PHP_EOL;
        echo "actor -> " . $character->actor . PHP_EOL;
    }
}
$xmlhl->close();
Output:
name -> Ms. Coder
actor -> Onlivia Actora
name -> Mr. Coder
actor -> El ActÓr
name -> Ms. Coder
actor -> Onlivia Actora
name -> Mr. Coder
actor -> El ActÓr
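
The same combination can also be written without subclassing. The following is only a sketch of one alternative, not the class above: streamElements() is a hypothetical helper name, and the inline XML is made-up sample data. It yields each matching element as a SimpleXMLElement through a generator, so the caller can use a plain foreach.

```php
<?php
/**
 * Hypothetical helper (not part of the class above): yields every element
 * named $name as a SimpleXMLElement, expanding one element at a time.
 */
function streamElements(XMLReader $reader, string $name): Generator
{
    $moved = $reader->read();
    while ($moved) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === $name) {
            $node = $reader->expand();
            if ($node === false) {
                return;
            }
            // Copy the expanded node into its own small DOM document.
            $doc = new DOMDocument();
            $doc->appendChild($doc->importNode($node, true));
            yield simplexml_import_dom($doc->documentElement);
            // next() skips the subtree that was just handled.
            $moved = $reader->next();
        } else {
            $moved = $reader->read();
        }
    }
}

// Made-up sample data for the sketch.
$xml = '<movies>'
     . '<movie><title>A</title><rating type="stars">5</rating></movie>'
     . '<movie><title>B</title><rating type="stars">3</rating></movie>'
     . '</movies>';

$reader = new XMLReader();
$reader->XML($xml);

$summary = [];
foreach (streamElements($reader, 'movie') as $movie) {
    // Inside the loop, $movie behaves like any other SimpleXMLElement.
    $summary[] = $movie->title . ' (' . $movie->rating['type'] . ': ' . $movie->rating . ')';
}
$reader->close();

echo implode(PHP_EOL, $summary), PHP_EOL;
```

Because the loop calls next() after handling a match and read() otherwise, it does not depend on whitespace between sibling elements.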