I am halfway through writing a script using XML::Simple. I have read that it is not so "simple", and even its own documentation discourages its use in new code, but I have no other choice, as this script will be an extension to existing code.
What I am doing is this:
- Get XML by reading from a URL
- Parse it using XML::Simple
- Read the required elements from the data
- Run different checks on these required elements
I could parse and run some checks on a few of the elements, but while reading elements that are in an array, I am getting undef.
This is my code:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use LWP::Simple;
use XML::Simple;
use DBI;
use Data::Dumper;
my $str = "<Actual_URL>";
my $ua = LWP::UserAgent->new;
$ua->timeout( 180 );
$ua->agent( "$0/0.1 " . $ua->agent );
my $req = HTTP::Request->new( GET => $str );
my $buffer;
$req->content_type( 'text/xml' );
$req->content( $buffer );
my $response = $ua->request( $req );
my $xml = $response->content();
print "Value of \$xml is:\n";
print $xml;
my $filename = 'record.txt';
open( my $fh, '>', $filename ) or die "Could not open file '$filename' $!";
print $fh $xml;
close $fh;
my $number_of_lines = `wc -l record.txt | cut -d' ' -f1`;
print "Number of lines in $filename are: $number_of_lines\n";
if ( $number_of_lines >= 50 ) {
    print "TEST_1 SUCCESS\n";
}
my $mysql_dbh;
my $test_id;
my $xst;
my %cmts_Pre_EQ_tags;
if ( ( not defined $xml ) or ( $xml =~ m/read\stimeout/i ) ) {
    &printXMLErr( 'DRUM request timed out' );
}
else {
    my $xs = XML::Simple->new();
    $xst = eval { $xs->XMLin( $xml, KeyAttr => 1 ) };
    &printXMLErr( $@ ) if ( $@ );
    print "Value of \$xst inside is:\n";
    print Dumper( $xst );
}
$cmts_Pre_EQ_tags{'$cmts_Pre_EQ_groupDelayMag'} =
    $xst->{cmts}->{Pre_EQ}->{groupDelayMag}->{content};
#More elements like this are checked here
$cmts_Pre_EQ_tags{'$cmts_Pre_EQ_ICFR'} =
    $xst->{cmts}->{Pre_EQ}->{ICFR}->{content};
my $decision1 = 1;
print "\%cmts_Pre_EQ_tags:\n";
foreach ( sort keys %cmts_Pre_EQ_tags ) {
    print "$_ : $cmts_Pre_EQ_tags{$_}\n";
    if ( $cmts_Pre_EQ_tags{$_} eq '' ) {
        print "$_ is empty!\n";
        $decision1 = 0;
    }
}
print "\n";
if ( $decision1 == 0 ) {
    print "TEST_2_1 FAIL\n";
}
else {
    print "TEST_2_1 SUCCESS\n";
}
my $cpeIP4 = $xst->{cmts}->{cpeIP4}->{content};
print "The cpe IP is: $cpeIP4\n";
if ( $cpeIP4 ne '' ) {
    print "TEST_2_2 SUCCESS\n";
}
else {
    print "TEST_2_2 FAIL\n";
}
# Working fine until here, but following 2 print are showing undef
print Dumper ( $xst->{cmts}{STBDSG}{dsg}[0]{dsgIfStdTunnelFilterTunnelId} );
print Dumper ( $xst->{cmts}{STBDSG}{dsg}[0]{dsgIfStdTunnelFilterClientIdType} );
print "After\n";
Output of the last three print statements is:
$VAR1 = undef;
$VAR1 = undef;
After
I can't provide the entire XML or the output of print Dumper($xst), as it's too big and gets generated dynamically, but I'll provide a sample of it.
The part of the XML that is causing trouble is:
<cmts>
    <STBDSG>
        <dsg>
            <dsgIfStdTunnelFilterTunnelId>1</dsgIfStdTunnelFilterTunnelId>
            <dsgIfStdTunnelFilterClientIdType>caSystemId</dsgIfStdTunnelFilterClientIdType>
        </dsg>
        <dsg>
            <dsgIfStdTunnelFilterTunnelId>2</dsgIfStdTunnelFilterTunnelId>
            <dsgIfStdTunnelFilterClientIdType>gaSystemId</dsgIfStdTunnelFilterClientIdType>
        </dsg>
    </STBDSG>
</cmts>
When this part is parsed, its corresponding output in $xst is:
$VAR1 = {
    'cmts' => {
        'STBDSG' => {
            'dsg' => [
                {
                    'dsgIfStdTunnelFilterTunnelId' => '1',
                    'dsgIfStdTunnelFilterClientIdType' => 'caSystemId',
                },
                {
                    'dsgIfStdTunnelFilterTunnelId' => '2',
                    'dsgIfStdTunnelFilterClientIdType' => 'gaSystemId',
                }
            ]
        },
    },
};
The part of the XML whose values are fetched correctly after parsing looks like this:
<cmts>
    <name field_name="Name">cts01nsocmo</name>
    <object field_name="Nemos Object">888</object>
    <vendor field_name="Vendor">xyz</vendor>
</cmts>
which was converted to:
$VAR1 = {
    'cmts' => {
        'name' => {
            'content' => 'cts01nsocmo',
            'field_name' => 'Name'
        },
        'object' => {
            'content' => '888',
            'field_name' => 'Nemos Object'
        },
        'vendor' => {
            'content' => 'xyz',
            'field_name' => 'Vendor'
        }
    },
};
So basically, when there is no array in the parsed content, the values are fetched correctly into variables.
It seems that the reason why this
print Dumper ( $xst->{cmts}{STBDSG}{dsg}[0]{dsgIfStdTunnelFilterTunnelId} );
print Dumper ( $xst->{cmts}{STBDSG}{dsg}[0]{dsgIfStdTunnelFilterClientIdType} );
is getting undef is related to setting the correct values for either KeyAttr or ForceArray. I am trying to find the answer by reading the XML::Simple documentation, but I wanted to see if there's something distinct that I am missing here.
2 Answers
#1
4
It's worth considering the use of XML::Twig, regardless of what the rest of your project does.
In particular, XML::Twig::Elt objects -- the module's implementation of XML elements -- have a simplify method, whose documentation says this:
Return a data structure suspiciously similar to XML::Simple's. Options are identical to XMLin options
So you can use XML::Twig for its precision and convenience, and apply the simplify method if you need to pass on any data that looks like an XML::Simple data structure.
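As a rough illustration of that approach, the sketch below parses a small document with XML::Twig and calls simplify on the root element. The sample XML, the wrapping root element and the variable names are assumptions made for the example, and the lowercase forcearray option name follows the XML::Twig documentation; treat it as a starting point rather than a drop-in replacement.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document; the real XML in the question is much larger.
my $xml = '<root><cmts><STBDSG>'
    . '<dsg>'
    . '<dsgIfStdTunnelFilterTunnelId>1</dsgIfStdTunnelFilterTunnelId>'
    . '<dsgIfStdTunnelFilterClientIdType>caSystemId</dsgIfStdTunnelFilterClientIdType>'
    . '</dsg>'
    . '</STBDSG></cmts></root>';

# Parse with XML::Twig, then ask the root element for an
# XML::Simple-like structure.
my $twig = XML::Twig->new->parse($xml);

# forcearray => ['dsg'] asks for dsg as an array reference even though
# only one <dsg> element is present.
my $data = $twig->root->simplify( forcearray => ['dsg'] );

for my $dsg ( @{ $data->{cmts}{STBDSG}{dsg} } ) {
    print $dsg->{dsgIfStdTunnelFilterTunnelId},     "\n";
    print $dsg->{dsgIfStdTunnelFilterClientIdType}, "\n";
}
```

Existing code that expects an XML::Simple structure could then be handed $data, while new code works with the twig directly.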
#2
1
As you have found, XML::Simple isn't. Even its documentation suggests:
The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces.
Part of the problem is that XML doesn't have any such thing as arrays. It might have duplicated tags, but there is no direct mapping between 'array' and 'XML', so it always makes the programming uncomfortable.
What it's doing to you is assuming that the dsg elements are an array, and casting them automatically.
Anyway, I would suggest using XML::Twig instead, and then your 'print' statements just look like this:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
# Reads the XML from this script's __DATA__ section
my $twig = XML::Twig->new->parse( \*DATA );
# The offset 0 selects only the first matching dsg element, like [0] in the question
foreach my $element ( $twig->get_xpath( "cmts/STBDSG/dsg", 0 ) ) {
    print $element->first_child_text("dsgIfStdTunnelFilterTunnelId"), "\n";
    print $element->first_child_text("dsgIfStdTunnelFilterClientIdType"), "\n";
}
Anyway, suppose you're forced into using XML::Simple, and throwing it away and starting over isn't an option (because seriously, I'd consider it!).
What XML::Simple does with 'matching' elements is try to pretend they're arrays.
If there aren't matching elements, it treats them as a hash. That's probably what's catching you out. The problem is that, in Perl, hashes can't have duplicate keys, so in your example, rather than duplicating dsg, it array-ifies it.
Switching on ForceArray puts everything into arrays, but some of the arrays might hold only a single element. That's useful if you want consistency, though.
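To make the difference concrete, here is a minimal sketch comparing default folding with ForceArray limited to the dsg element. The two sample documents and the wrapping root element are made up for illustration; only XMLin, KeyAttr and ForceArray come from XML::Simple itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical samples: the first document has one <dsg>, the second has two.
my $one = '<root><cmts><STBDSG>'
    . '<dsg><dsgIfStdTunnelFilterTunnelId>1</dsgIfStdTunnelFilterTunnelId></dsg>'
    . '</STBDSG></cmts></root>';
my $two = '<root><cmts><STBDSG>'
    . '<dsg><dsgIfStdTunnelFilterTunnelId>1</dsgIfStdTunnelFilterTunnelId></dsg>'
    . '<dsg><dsgIfStdTunnelFilterTunnelId>2</dsgIfStdTunnelFilterTunnelId></dsg>'
    . '</STBDSG></cmts></root>';

my $xs = XML::Simple->new;

# Default folding: the shape of {dsg} depends on how many <dsg> elements exist.
my $plain_one = $xs->XMLin( $one, KeyAttr => [] );
my $plain_two = $xs->XMLin( $two, KeyAttr => [] );
print ref( $plain_one->{cmts}{STBDSG}{dsg} ), "\n";    # HASH
print ref( $plain_two->{cmts}{STBDSG}{dsg} ), "\n";    # ARRAY

# ForceArray => ['dsg'] keeps {dsg} an array reference in both cases.
my $forced_one = $xs->XMLin( $one, KeyAttr => [], ForceArray => ['dsg'] );
print ref( $forced_one->{cmts}{STBDSG}{dsg} ), "\n";   # ARRAY
print $forced_one->{cmts}{STBDSG}{dsg}[0]{dsgIfStdTunnelFilterTunnelId}, "\n";    # 1
```

Passing a list of element names rather than ForceArray => 1 leaves elements such as name or vendor as plain hashes, so existing lookups like {name}{content} should keep working.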
KeyAttr probably doesn't help you. It's primarily geared to the case where you have different subelements and want to 'map' them: it allows you to turn one of the XML attributes into the 'key' field in a hash.
E.g.
<element name="firstelement">content</element>
<element name="secondelement">morecontent</element>
If you specify KeyAttr as name, it will make a hash with keys of firstelement and secondelement.
As your dsg doesn't have this, that's not what you want.
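A small sketch of that folding, using the two element lines above wrapped in a hypothetical opt root element (the root name and variable names are assumptions for the example):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# The two elements from the example, wrapped in an assumed <opt> root.
my $xml = '<opt>'
    . '<element name="firstelement">content</element>'
    . '<element name="secondelement">morecontent</element>'
    . '</opt>';

# KeyAttr => ['name'] folds the repeated elements into a hash keyed on "name".
my $folded = XMLin( $xml, KeyAttr => ['name'] );
print join( ',', sort keys %{ $folded->{element} } ), "\n";    # firstelement,secondelement
print $folded->{element}{firstelement}{content}, "\n";         # content

# KeyAttr => [] disables folding, so the elements stay in an array.
my $unfolded = XMLin( $xml, KeyAttr => [] );
print ref( $unfolded->{element} ), "\n";                       # ARRAY
print $unfolded->{element}[1]{name}, "\n";                     # secondelement
```

Note that name is already one of XML::Simple's default KeyAttr values, so this folding can also happen when KeyAttr is not passed at all.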
To iterate over dsg:
foreach my $element ( @{ $xst->{cmts}{STBDSG}{dsg} } ) {
    print $element->{dsgIfStdTunnelFilterTunnelId}, "\n";
    print $element->{dsgIfStdTunnelFilterClientIdType}, "\n";
}
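If ForceArray cannot be changed in the existing code, one defensive option is to normalise the node before looping over it. The helper name as_list and the hand-written data structures below are hypothetical; they only stand in for what XML::Simple might return with one or several dsg elements.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a node's items as a list, whether the parser
# produced an array reference (repeated elements) or a single value.
sub as_list {
    my ($node) = @_;
    return () unless defined $node;
    return ref($node) eq 'ARRAY' ? @{$node} : ($node);
}

# Hand-written stand-ins for parsed data with two <dsg> elements and with one.
my $many = {
    cmts => {
        STBDSG => {
            dsg => [
                { dsgIfStdTunnelFilterTunnelId => '1', dsgIfStdTunnelFilterClientIdType => 'caSystemId' },
                { dsgIfStdTunnelFilterTunnelId => '2', dsgIfStdTunnelFilterClientIdType => 'gaSystemId' },
            ],
        },
    },
};
my $single = {
    cmts => {
        STBDSG => {
            dsg => { dsgIfStdTunnelFilterTunnelId => '1', dsgIfStdTunnelFilterClientIdType => 'caSystemId' },
        },
    },
};

# The same loop handles both shapes.
for my $xst ( $many, $single ) {
    for my $dsg ( as_list( $xst->{cmts}{STBDSG}{dsg} ) ) {
        print "$dsg->{dsgIfStdTunnelFilterTunnelId} $dsg->{dsgIfStdTunnelFilterClientIdType}\n";
    }
}
```

This keeps the loop body identical regardless of how many dsg elements the feed happens to contain.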