What to use for XML parsing/reading in PHP4

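Problem description:

Unfortunately, I have to work on an older web application on a PHP4 server. It now needs to parse a lot of XML for calling webservices (custom protocol, no SOAP/REST).

Under PHP5 I would use SimpleXML, but that isn't available here. PHP4 has DOM XML, but it is no longer included by default in PHP5.

What other options are there? I am looking for a solution that still works on PHP5 once the application is migrated.

A nice extra would be if the XML could be validated against a schema.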

There is a SimpleXML backport available: http://www.ister.org/code/simplexml44/index.html

If you can install it, that will be the best solution.

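It might be a bit grass-roots, but if it suits the data you're working with, you could use XSLT to transform your XML into something usable. Once you upgrade to PHP5 the XSLT stylesheets will still work, and you can migrate to DOM parsing as and when it suits you.

Andrew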

I would second Rich Bradshaw's suggestion about the SimpleXML backport, but if that's not an option, then xml_parse will do the job in PHP4 and will still work after migration to PHP5.

$xml = ...; // Get your XML data 
$xml_parser = xml_parser_create(); 

// _start_element and _end_element are two functions that determine what 
// to do when opening and closing tags are found 
xml_set_element_handler($xml_parser, "_start_element", "_end_element"); 

// How to handle each chunk of character data (stripping whitespace if need be, etc.) 
xml_set_character_data_handler($xml_parser, "_character_data"); 

xml_parse($xml_parser, $xml); 
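
The snippet above leaves the three handlers undefined. Here is a minimal sketch of what they might look like, assuming you only want to log the events. The event log in $GLOBALS['xml_events'] and the wrapper parse_to_events() are made up for this illustration; the handler names match the snippet. It sticks to syntax that PHP4 understands, and it also reports the parser's own error message and line number when parsing fails.

```php
<?php
// Hypothetical handlers for the snippet above; they only record what they see.

function _start_element($parser, $name, $attrs)
{
    // $attrs is an associative array of attribute name => value.
    $GLOBALS['xml_events'][] = 'open ' . $name;
}

function _end_element($parser, $name)
{
    $GLOBALS['xml_events'][] = 'close ' . $name;
}

function _character_data($parser, $data)
{
    // Text can arrive in several chunks, and whitespace between tags is
    // reported too, so chunks that are whitespace only are skipped here.
    if (trim($data) !== '') {
        $GLOBALS['xml_events'][] = 'text ' . trim($data);
    }
}

// Illustrative wrapper: returns the list of events, or false on a parse error.
function parse_to_events($xml)
{
    $GLOBALS['xml_events'] = array();

    $xml_parser = xml_parser_create();
    // Keep tag names as written; by default they are folded to upper case.
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler($xml_parser, '_start_element', '_end_element');
    xml_set_character_data_handler($xml_parser, '_character_data');

    // The third argument marks this as the final (here: the only) chunk.
    if (!xml_parse($xml_parser, $xml, true)) {
        trigger_error(
            sprintf(
                'XML error: %s at line %d',
                xml_error_string(xml_get_error_code($xml_parser)),
                xml_get_current_line_number($xml_parser)
            ),
            E_USER_WARNING
        );
        return false;
    }

    return $GLOBALS['xml_events'];
}

print_r(parse_to_events('<order id="7"><item>Tea</item></order>'));
```

The handlers are passed by name as strings, which is the form that works on both PHP4 and PHP5.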

There's a pretty good tutorial here about parsing XML in PHP4 that may be of some use to you.

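If you can use xml_parse, then go for that. It's robust, fast and compatible with PHP5. It is, however, not a DOM parser but a simpler event-based one (also called a SAX parser), so if you need to access a tree, you will have to marshal the stream into a tree yourself. This is fairly simple to do: use a stack, push an item onto it on start-element and pop it on end-element.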
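
The push/pop idea could be sketched roughly like this. The node layout ('name', 'attributes', 'text', 'children'), the function names and the $GLOBALS['xml_stack'] variable are all assumptions made for the example, not anything the XML extension prescribes.

```php
<?php
// Sketch: turn xml_parse events into a nested array by means of a stack.

function stack_start($parser, $name, $attrs)
{
    // Push a fresh node; it stays on the stack until its end tag arrives.
    $GLOBALS['xml_stack'][] = array(
        'name'       => $name,
        'attributes' => $attrs,
        'text'       => '',
        'children'   => array(),
    );
}

function stack_text($parser, $data)
{
    // Append the chunk to the node currently on top of the stack.
    $top = count($GLOBALS['xml_stack']) - 1;
    $GLOBALS['xml_stack'][$top]['text'] .= $data;
}

function stack_end($parser, $name)
{
    // Pop the finished node and attach it to its parent, which is now
    // the top of the stack.
    $node = array_pop($GLOBALS['xml_stack']);
    $node['text'] = trim($node['text']);
    $top = count($GLOBALS['xml_stack']) - 1;
    $GLOBALS['xml_stack'][$top]['children'][] = $node;
}

// Returns the document element as a nested array, or false on a parse error.
function xml_to_tree($xml)
{
    // A dummy bottom entry receives the document element as its only child.
    $GLOBALS['xml_stack'] = array(array('text' => '', 'children' => array()));

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler($parser, 'stack_start', 'stack_end');
    xml_set_character_data_handler($parser, 'stack_text');

    if (!xml_parse($parser, $xml, true)) {
        return false;
    }

    return $GLOBALS['xml_stack'][0]['children'][0];
}

print_r(xml_to_tree('<order id="7"><item>Tea</item><item>Milk</item></order>'));
```

Note that this lumps all text of an element into one 'text' string, so the position of text relative to child elements (mixed content) is lost. If that matters, store text chunks in 'children' as well.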

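I would definitely recommend the SimpleXML backport, as long as its performance is good enough for your needs. The demonstrations of xml_parse look simple enough, but in my experience it can get very hairy very quickly. The content handler functions don't get any contextual information about where the parser is in the tree, unless you track it yourself in the start and end tag handlers and provide it. So you are either calling functions for every start/end tag, or throwing around global variables to track where you are in the tree.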
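
To make that bookkeeping concrete, here is a rough sketch of the "global variables" variant: the start and end handlers maintain the current path, and the text handler uses it to decide where a value belongs. All names here ($GLOBALS['xml_path'], $GLOBALS['xml_values'], the path_* functions) are invented for the example.

```php
<?php
// Sketch: track the parser position by hand so the text handler has context.

function path_start($parser, $name, $attrs)
{
    $GLOBALS['xml_path'][] = $name;
}

function path_end($parser, $name)
{
    array_pop($GLOBALS['xml_path']);
}

function path_text($parser, $data)
{
    // Build a key such as "/order/item" from the elements currently open.
    $key = '/' . implode('/', $GLOBALS['xml_path']);
    if (!isset($GLOBALS['xml_values'][$key])) {
        $GLOBALS['xml_values'][$key] = '';
    }
    // Text may arrive in several chunks (for example around entities).
    $GLOBALS['xml_values'][$key] .= $data;
}

// Returns an array of path => trimmed text, or false on a parse error.
function xml_to_path_values($xml)
{
    $GLOBALS['xml_path']   = array();
    $GLOBALS['xml_values'] = array();

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler($parser, 'path_start', 'path_end');
    xml_set_character_data_handler($parser, 'path_text');

    if (!xml_parse($parser, $xml, true)) {
        return false;
    }

    // Drop entries that only collected whitespace between tags.
    $result = array();
    foreach ($GLOBALS['xml_values'] as $key => $text) {
        if (trim($text) !== '') {
            $result[$key] = trim($text);
        }
    }
    return $result;
}

print_r(xml_to_path_values('<order><customer>Ann</customer><note>Tea &amp; milk</note></order>'));
```

Repeated elements share one path here, so their text ends up concatenated. Anything beyond flat documents needs more state, which is exactly the point where this approach starts to get hairy.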

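Obviously, the SimpleXML backport will be a bit slower, as it's written in PHP and has to parse the whole document before any of it is available, but the ease of coding more than makes up for it.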

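Maybe also consider looking at the XML packages available in PEAR, particularly XML_Util, XML_Parser, and XML_Serializer...

XML Parser with xml_parse_into_struct, turned into a tree array structure: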

<?php 
/** 
* What to use for XML parsing/reading in PHP4 
* @link http://*.com/q/132233/367456 
*/ 

$encoding = 'US-ASCII'; 
     // https://gist.github.com/hakre/46386de578619fbd898c 
$path  = dirname(__FILE__) . '/time-series-example.xml'; 

$parser_creator = 'xml_parser_create'; // alternative creator is 'xml_parser_create_ns' 

if (!function_exists($parser_creator)) { 
    trigger_error(
     "XML Parsers' $parser_creator() not found. XML Parser " 
     . '<http://php.net/xml> is required, activate it in your PHP configuration.' 
     , E_USER_ERROR 
    ); 
    return; 
} 

$parser = $parser_creator($encoding); 
if (!$parser) { 
    trigger_error(sprintf('Unable to create a parser (Encoding: "%s")', $encoding), E_USER_ERROR); 
    return; 
} 

xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); 
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1); 

$data = file_get_contents($path); 
if ($data === FALSE) { 
    trigger_error(sprintf('Unable to open file "%s" for reading', $path)); 
    return; 
} 
$result = xml_parse_into_struct($parser, $data, $xml_struct_values); 
unset($data); 
xml_parser_free($parser); 
unset($parser); 

if ($result === 0) { 
    trigger_error(sprintf('Unable to parse data of file "%s" as XML', $path)); 
    return; 
} 

define('TREE_NODE_TAG', 'tagName'); 
define('TREE_NODE_ATTRIBUTES', 'attributes'); 
define('TREE_NODE_CHILDREN', 'children'); 

define('TREE_NODE_TYPE_TAG', 'array'); 
define('TREE_NODE_TYPE_TEXT', 'string'); 
define('TREE_NODE_TYPE_NONE', 'NULL'); 

/** 
* XML Parser indices for parse into struct values 
*/ 
define('XML_STRUCT_VALUE_TYPE', 'type'); 
define('XML_STRUCT_VALUE_LEVEL', 'level'); 
define('XML_STRUCT_VALUE_TAG', 'tag'); 
define('XML_STRUCT_VALUE_ATTRIBUTES', 'attributes'); 
define('XML_STRUCT_VALUE_VALUE', 'value'); 

/** 
* XML Parser supported node types 
*/ 
define('XML_STRUCT_TYPE_OPEN', 'open'); 
define('XML_STRUCT_TYPE_COMPLETE', 'complete'); 
define('XML_STRUCT_TYPE_CDATA', 'cdata'); 
define('XML_STRUCT_TYPE_CLOSE', 'close'); 

/** 
* Tree Creator 
* @return array 
*/ 
function tree_create() 
{ 
    return array(
     array(
      TREE_NODE_TAG  => NULL, 
      TREE_NODE_ATTRIBUTES => NULL, 
      TREE_NODE_CHILDREN => array(), 
     ) 
    ); 
} 

/** 
* Add a Tree Node into the Tree at a Level 
* 
* @param $tree 
* @param $level 
* @param $node 
* @return array|bool Tree with the Node added or FALSE on error 
*/ 
function tree_add_node($tree, $level, $node) 
{ 
    $type = gettype($node); 
    switch ($type) { 
     case TREE_NODE_TYPE_TEXT: 
      $level++; 
      break; 
     case TREE_NODE_TYPE_TAG: 
      break; 
     case TREE_NODE_TYPE_NONE: 
      trigger_error('Can not add Tree Node of type None, keeping tree unchanged', E_USER_NOTICE); 
      return $tree; 
     default: 
      trigger_error(sprintf('Can not add Tree Node of type "%s"', $type), E_USER_ERROR); 
      return FALSE; 
    } 

    if (!isset($tree[$level - 1])) { 
     trigger_error("There is no parent for level $level"); 
     return FALSE; 
    } 

    $parent = & $tree[$level - 1]; 

    if (isset($parent[TREE_NODE_CHILDREN]) && !is_array($parent[TREE_NODE_CHILDREN])) { 
     trigger_error("There are no children in parent for level $level"); 
     return FALSE; 
    } 

    $parent[TREE_NODE_CHILDREN][] = & $node; 
    $tree[$level]     = & $node; 

    return $tree; 
} 

/** 
* Creator of a Tree Node 
* 
* @param $value XML Node 
* @return array Tree Node 
*/ 
function tree_node_create_from_xml_struct_value($value) 
{ 
    static $xml_node_default_types = array(
     XML_STRUCT_VALUE_ATTRIBUTES => NULL, 
     XML_STRUCT_VALUE_VALUE  => NULL, 
    ); 

    $orig = $value; 

    $value += $xml_node_default_types; 

    switch ($value[XML_STRUCT_VALUE_TYPE]) { 
     case XML_STRUCT_TYPE_OPEN: 
     case XML_STRUCT_TYPE_COMPLETE: 
      $node = array(
       TREE_NODE_TAG => $value[XML_STRUCT_VALUE_TAG], 
       // '__debug1' => $orig, 
      ); 
      if (isset($value[XML_STRUCT_VALUE_ATTRIBUTES])) { 
       $node[TREE_NODE_ATTRIBUTES] = $value[XML_STRUCT_VALUE_ATTRIBUTES]; 
      } 
      if (isset($value[XML_STRUCT_VALUE_VALUE])) { 
       $node[TREE_NODE_CHILDREN] = (array)$value[XML_STRUCT_VALUE_VALUE]; 
      } 
      return $node; 

     case XML_STRUCT_TYPE_CDATA: 
      // TREE_NODE_TYPE_TEXT 
      return $value[XML_STRUCT_VALUE_VALUE]; 

     case XML_STRUCT_TYPE_CLOSE: 
      return NULL; 

     default: 
      trigger_error(
       sprintf(
        'Unknown Xml Node Type "%s": %s', $value[XML_STRUCT_VALUE_TYPE], var_export($value, TRUE) 
       ) 
      ); 
      return FALSE; 
    } 
} 

$tree = tree_create(); 

while ($tree && $value = array_shift($xml_struct_values)) { 
    $node = tree_node_create_from_xml_struct_value($value); 
    if (NULL === $node) { 
     continue; 
    } 
    $tree = tree_add_node($tree, $value[XML_STRUCT_VALUE_LEVEL], $node); 
    unset($node); 
} 

if (!$tree) { 
    trigger_error('Parse error'); 
    return; 
} 

if ($xml_struct_values) { 
    trigger_error(sprintf('Unable to process whole parsed XML array (%d elements left)', count($xml_struct_values))); 
    return; 
} 

// tree root is the first child of level 0 
print_r($tree[0][TREE_NODE_CHILDREN][0]); 

Output:

Array 
(
    [tagName] => dwml 
    [attributes] => Array 
     (
      [version] => 1.0 
      [xmlns:xsd] => http://www.w3.org/2001/XMLSchema 
      [xmlns:xsi] => http://www.w3.org/2001/XMLSchema-instance 
      [xsi:noNamespaceSchemaLocation] => http://www.nws.noaa.gov/forecasts/xml/DWMLgen/schema/DWML.xsd 
     ) 

    [children] => Array 
     (
      [0] => Array 
       (
        [tagName] => head 
        [children] => Array 
         (
          [0] => Array 
           (
            [tagName] => product 
            [attributes] => Array 
             (
              [srsName] => WGS 1984 
              [concise-name] => time-series 
              [operational-mode] => official 
             ) 

            [children] => Array 
             (
              [0] => Array 
               (
                [tagName] => title 
                [children] => Array 
                 (
                  [0] => NOAA's National Weather Service Forecast Data 
                 ) 

               ) 

              [1] => Array 
               (
                [tagName] => field 
                [children] => Array 
                 (
                  [0] => meteorological 
                 ) 

               ) 

              [2] => Array 
               (
                [tagName] => category 
                [children] => Array 
                 (
                  [0] => forecast 
                 ) 

               ) 

              [3] => Array 
               (
                [tagName] => creation-date 
                [attributes] => Array 
                 (
                  [refresh-frequency] => PT1H 
                 ) 

                [children] => Array 
                 (
                  [0] => 2013-11-02T06:51:17Z 
                 ) 

               ) 

             ) 

           ) 

          [1] => Array 
           (
            [tagName] => source 
            [children] => Array 
             (
              [0] => Array 
               (
                [tagName] => more-information 
                [children] => Array 
                 (
                  [0] => http://www.nws.noaa.gov/forecasts/xml/ 
                 ) 

               ) 

              [1] => Array 
               (
                [tagName] => production-center 
                [children] => Array 
                 (
                  [0] => Meteorological Development Laboratory 
                  [1] => Array 
                   (
                    [tagName] => sub-center 
                    [children] => Array 
                     (
                      [0] => Product Generation Branch 
                     ) 

                   ) 

                 ) 

               ) 

              [2] => Array 
               (
                [tagName] => disclaimer 
                [children] => Array 
                 (
                  [0] => http://www.nws.noaa.gov/disclaimer.html 
                 ) 

               ) 

              [3] => Array 
               (
                [tagName] => credit 
                [children] => Array 
                 (
                  [0] => http://www.weather.gov/ 
                 ) 

               ) 

              [4] => Array 
               (
                [tagName] => credit-logo 
                [children] => Array 
                 (
                  [0] => http://www.weather.gov/images/xml_logo.gif 
                 ) 

               ) 

              [5] => Array 
               (
                [tagName] => feedback 
                [children] => Array 
                 (
                  [0] => http://www.weather.gov/feedback.php 
                 ) 

               ) 

             ) 

           ) 

         ) 

       ) 

      [1] => Array 
       (
        [tagName] => data 
        [children] => Array 
         (
          [0] => Array 
           (
            [tagName] => location 
            [children] => Array 
             (
              [0] => Array 
               (
                [tagName] => location-key 
                [children] => Array 
                 (
                  [0] => point1 
                 ) 

               ) 

              [1] => Array 
               (
                [tagName] => point 
                [attributes] => Array 
                 (
                  [latitude] => 40.00 
                  [longitude] => -120.00 
                 ) 

               ) 

             ) 

           ) 

          [1] => Array 
           (
            [tagName] => moreWeatherInformation 
            [attributes] => Array 
             (
              [applicable-location] => point1 
             ) 

            [children] => Array 
             (
              [0] => http://forecast.weather.gov/MapClick.php?textField1=40.00&textField2=-120.00 
             ) 

           ) 

          [2] => Array 
           (
            [tagName] => time-layout 
            [attributes] => Array 
             (
              [time-coordinate] => local 
              [summarization] => none 
             ) 

            [children] => Array 
             (
              [0] => Array 
               (
                [tagName] => layout-key 
                [children] => Array 
                 (
                  [0] => k-p24h-n4-1 
                 ) 

               ) 

              [1] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-02T08:00:00-07:00 
                 ) 

               ) 

              [2] => Array 
               (
                [tagName] => end-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-02T20:00:00-07:00 
                 ) 

               ) 

              [3] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-03T07:00:00-08:00 
                 ) 

               ) 

              [4] => Array 
               (
                [tagName] => end-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-03T19:00:00-08:00 
                 ) 

               ) 

              [5] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-04T07:00:00-08:00 
                 ) 

               ) 

              [6] => Array 
               (
                [tagName] => end-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-04T19:00:00-08:00 
                 ) 

               ) 

              [7] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-05T07:00:00-08:00 
                 ) 

               ) 

              [8] => Array 
               (
                [tagName] => end-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-05T19:00:00-08:00 
                 ) 

               ) 

             ) 

           ) 

          [3] => Array 
           (
            [tagName] => time-layout 
            [attributes] => Array 
             (
              [time-coordinate] => local 
              [summarization] => none 
             ) 

            [children] => Array 
             (
              [0] => Array 
               (
                [tagName] => layout-key 
                [children] => Array 
                 (
                  [0] => k-p24h-n5-2 
                 ) 

               ) 

              [1] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-01T20:00:00-07:00 
                 ) 

               ) 

              [2] => Array 
               (
                [tagName] => end-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-02T09:00:00-07:00 
                 ) 

               ) 

              [3] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-02T19:00:00-07:00 
                 ) 

               ) 

              ... 

              [10] => Array 
               (
                [tagName] => end-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-06T08:00:00-08:00 
                 ) 

               ) 

             ) 

           ) 

          [4] => Array 
           (
            [tagName] => time-layout 
            [attributes] => Array 
             (
              [time-coordinate] => local 
              [summarization] => none 
             ) 

            [children] => Array 
             (
              [0] => Array 
               (
                [tagName] => layout-key 
                [children] => Array 
                 (
                  [0] => k-p12h-n9-3 
                 ) 

               ) 

              [1] => Array 
               (
                [tagName] => start-valid-time 
                [children] => Array 
                 (
                  [0] => 2013-11-01T17:00:00-07:00 
                 ) 

               ) 
            ...
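
A tree in the shape printed above can be queried with a couple of small helpers. This is only a sketch: the function names are hypothetical, the 'tagName' and 'children' keys are the ones visible in the output, and the sample array is a hand-built excerpt of it rather than real parser output.

```php
<?php
// Return all element children of $node that carry the given tag name.
function tree_node_children_by_tag($node, $tagName)
{
    $found = array();
    if (!isset($node['children'])) {
        return $found;
    }
    foreach ($node['children'] as $child) {
        // Text children are plain strings; element children are arrays.
        if (is_array($child) && $child['tagName'] === $tagName) {
            $found[] = $child;
        }
    }
    return $found;
}

// Concatenate the direct text children of $node.
function tree_node_text($node)
{
    $text = '';
    if (isset($node['children'])) {
        foreach ($node['children'] as $child) {
            if (is_string($child)) {
                $text .= $child;
            }
        }
    }
    return $text;
}

// Hand-built excerpt of the structure shown in the output.
$product = array(
    'tagName'  => 'product',
    'children' => array(
        array(
            'tagName'  => 'title',
            'children' => array("NOAA's National Weather Service Forecast Data"),
        ),
        array(
            'tagName'  => 'field',
            'children' => array('meteorological'),
        ),
    ),
);

$titles = tree_node_children_by_tag($product, 'title');
echo tree_node_text($titles[0]), "\n";
```

Both helpers only look at direct children. Walking deeper means calling tree_node_children_by_tag() again on each node it returns.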