Here is the example XML I am using:
<contact id="43956">
    <personal>
        <name>
            <first>J</first>
            <middle>J</middle>
            <last>J</last>
            Some text...
        </name>
        <title>Manager</title>
        <employer>National</employer>
        <dob>1971-12-22</dob>
    </personal>
</contact>
I managed to get the "Some text..." value, but now I need my code to read the entire XML document. It also isn't reading the values inside the elements. As you can tell, I have never used XMLReader before.
This is what I get:
Array ( [contact] => Array ( [id] => 43956 [value] => some sample value ) [first] => [middle] => [last] => [#text] => Some text... [name] => [title] => [employer] => [dob] => [personal] => )
Here is the code I have now:
function xml2array($file, array $result = array()) {
    $lastElementNodeType = '';
    $xml = new XMLReader();
    if(!$xml->open($file)) {
        die("Failed to open input file");
    }
    while($xml->read()) {
        switch ($xml->nodeType) {
            case $xml::END_ELEMENT:
                $lastElementNodeType = $xml->nodeType;
            case $xml::TEXT:
                $tag = $xml->name;
                if($lastElementNodeType == 15) {
                    $result[$tag] = $xml->readString();
                }
            case $xml::ELEMENT:
                $lastElementNodeType = $xml->nodeType;
                $tag = $xml->name;
                if($xml->hasAttributes) {
                    while($xml->moveToNextAttribute()) {
                        $result[$tag][$xml->name] = $xml->value;
                    }
                }
        }
    }
    print_r($result);
}
I thought about making this function recursive, but when I tried that it made the array really messy.
I also had this version, but it still didn't output the J inside first, and so on:
function xml2assoc($xml) {
    $tree = null;
    while($xml->read())
        switch ($xml->nodeType) {
            case XMLReader::END_ELEMENT: return $tree;
            case XMLReader::ELEMENT:
                $node = array('tag' => $xml->name, 'value' => $xml->isEmptyElement ? '' : xml2assoc($xml));
                if($xml->hasAttributes)
                    while($xml->moveToNextAttribute())
                        $node['attributes'][$xml->name] = $xml->value;
                $tree[] = $node;
                break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                $tree .= $xml->value;
        }
    return $tree;
}
1 Solution
#1
Take 1
I think what you're going to need to do is save the type of the recent nodes, or at least the last one, so it's available to test against. In short, at least with the sample XML as you present it, you're going to encounter an END_ELEMENT node type, a TEXT node type with the text you're looking for, and then another END_ELEMENT node type.
So you're going to need a case $xml::TEXT, and you're also going to need to save the previous node type so that your parser knows that, where it would normally expect either a new ELEMENT event or an END_ELEMENT event, it has instead received TEXT. That's the signal to capture the text into a temporary variable with readString() and either save it right away, or wait to see whether the next node is also an END_ELEMENT, at which point you can save it and clear the temporary variable.
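To make that concrete, here is a minimal sketch of the idea, not a drop-in replacement for your function. The function name collectText(), the stack of open element names, and the "name#text" key convention are all my own assumptions for illustration. It reads the text through the value property; readString() on a TEXT node should work as well.

```php
<?php
// Sketch only: collectText() and the "<element>#text" key are hypothetical choices.
function collectText(string $xmlString): array
{
    $reader = new XMLReader();
    if (!$reader->XML($xmlString)) {
        throw new RuntimeException('Failed to load XML');
    }

    $result = [];
    $stack = [];                    // names of the elements that are currently open
    $lastType = XMLReader::NONE;    // type of the previous ELEMENT/TEXT/END_ELEMENT node

    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::ELEMENT:
                if (!$reader->isEmptyElement) {
                    $stack[] = $reader->name;
                }
                $lastType = XMLReader::ELEMENT;
                break;

            case XMLReader::TEXT:
            case XMLReader::CDATA:
                $text = trim($reader->value);
                if ($text === '') {
                    break;          // ignore whitespace-only text
                }
                $owner = end($stack);
                // Text that directly follows a closing tag belongs to the parent
                // element (mixed content), so store it under a separate key.
                $key = ($lastType === XMLReader::END_ELEMENT) ? $owner . '#text' : $owner;
                $result[$key] = $text;
                $lastType = XMLReader::TEXT;
                break;

            case XMLReader::END_ELEMENT:
                array_pop($stack);
                $lastType = XMLReader::END_ELEMENT;
                break;
        }
    }

    $reader->close();
    return $result;
}
```

Note that this flat version keeps only the last value seen for a given tag name and ignores attributes, so it suits pulling out specific fields rather than capturing the whole tree.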
Take 2
Now that we know a bit more about what you're hoping to end up with (that is, you want to capture the whole tree and not just specific information from it), I would suggest you stick with the recursive version of the function. I modified the one you have slightly; the TEXT and CDATA cases contain the main substantive change.
function xml2assoc($xml)
{
    $tree = null;
    while($xml->read())
    {
        switch ($xml->nodeType)
        {
            // Closing tag: this element's subtree is complete, so hand it back to the caller.
            case XMLReader::END_ELEMENT:
                return $tree;
            // Opening tag: recurse to collect the children, then record the attributes.
            case XMLReader::ELEMENT:
                $node = array('tag' => $xml->name, 'value' => $xml->isEmptyElement ? '' : xml2assoc($xml));
                if($xml->hasAttributes)
                    while($xml->moveToNextAttribute())
                        $node['attributes'][$xml->name] = $xml->value;
                $tree[] = $node;
                break;
            // Store text under a named key instead of appending a string to $tree.
            case XMLReader::TEXT:
                $tree["text"] = $xml->value;
                break;
            case XMLReader::CDATA:
                $tree["cdata"] = $xml->value;
                break;
        }
    }
    return $tree;
}
The output in this case looked like:
Array
(
    [0] => Array
        (
            [tag] => contact
            [value] => Array
                (
                    [0] => Array
                        (
                            [tag] => personal
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [tag] => name
                                            [value] => Array
                                                (
                                                    [0] => Array
                                                        (
                                                            [tag] => first
                                                            [value] => Array
                                                                (
                                                                    [text] => J
                                                                )
                                                        )
                                                    [1] => Array
                                                        (
                                                            [tag] => middle
                                                            [value] => Array
                                                                (
                                                                    [text] => J
                                                                )
                                                        )
                                                    [2] => Array
                                                        (
                                                            [tag] => last
                                                            [value] => Array
                                                                (
                                                                    [text] => J
                                                                )
                                                        )
                                                    [text] => Some text...
                                                )
                                        )
                                    [1] => Array
                                        (
                                            [tag] => title
                                            [value] => Array
                                                (
                                                    [text] => Manager
                                                )
                                        )
                                    [2] => Array
                                        (
                                            [tag] => employer
                                            [value] => Array
                                                (
                                                    [text] => National
                                                )
                                        )
                                    [3] => Array
                                        (
                                            [tag] => dob
                                            [value] => Array
                                                (
                                                    [text] => 1971-12-22
                                                )
                                        )
                                )
                        )
                )
            [attributes] => Array
                (
                    [id] => 43956
                )
        )
)
I assume this is what you're going for, give or take minor edits, though we're really just re-inventing the wheel here. I hope the XML you need to parse isn't particularly large, since this function builds the entire tree in memory.
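For comparison, if the document is small enough to load in one go, one commonly used alternative is to let SimpleXML parse it and round-trip the result through JSON to get nested arrays. This is only a sketch under that assumption: the function name xmlToArrayViaSimpleXml() is made up for illustration, and the sample omits the stray "Some text..." node because you should check for yourself how mixed content survives the conversion.

```php
<?php
// Sketch only: xmlToArrayViaSimpleXml() is a hypothetical helper name.
function xmlToArrayViaSimpleXml(string $xmlString): array
{
    // SimpleXML loads the whole document into memory at once.
    $element = simplexml_load_string($xmlString);
    if ($element === false) {
        throw new RuntimeException('Could not parse XML');
    }
    // Encoding to JSON and decoding with "true" yields nested associative arrays.
    return json_decode(json_encode($element), true);
}

$xmlString = <<<XML
<contact id="43956">
    <personal>
        <name>
            <first>J</first>
            <middle>J</middle>
            <last>J</last>
        </name>
        <title>Manager</title>
        <employer>National</employer>
        <dob>1971-12-22</dob>
    </personal>
</contact>
XML;

print_r(xmlToArrayViaSimpleXml($xmlString));
```

The result is keyed by tag name, with attributes collected under an "@attributes" key, so the shape differs from the tag/value/attributes structure above. It also gives up the streaming benefit that is the usual reason for reaching for XMLReader.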