$characterString = $verb[2];
$inputFile = $targetdirectory."/ppt/slides/slide".$slidenumber.".xml";
open FILE, "<$inputFile>";
for (@lines) {
if ($_ =~ /$characterString/) {
print "Matched $characterString \n ";
} else {
print "Did not match $characterString\n";
}
}
close FILE;
Here is a sample from the XML file:
<a:t>Bailey</a:t></a:r></a:p><a:p><a:pPr lvl="1"><a:lnSpc><a:spcPct val="90000"/>
Here is the output:
PUB ENGINE: Version 5-26-2015
Did not match billybob
Did not match Bailey
Bailey is in the XML file, but billybob is not.
2 Solutions
#1
3
The first two major issues:
- You are trying to open a file whose name ends with .xml>
open FILE, "<$inputFile>";
should be
open FILE, "<$inputFile";
Well, not really. It should be
open(my $FILE, '<', $inputFile) or die("Can't open \"$inputFile\": $!\n");
This avoids global variables, prevents the file name from being treated as anything but a file name, and checks whether the open succeeded (a common point of failure).
- You never read from the file handle.
for (@lines) {
should be
while (<FILE>) {
Or if you adopted my suggested change,
while (<$FILE>) {
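Putting both fixes together, a minimal sketch might look like the following. The helper name file_contains and the temporary file are assumptions made for illustration and are not part of the original script; \Q...\E is added so the search string is matched literally rather than as a regex.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns 1 if any line of $path contains $needle, else 0.
sub file_contains {
    my ($path, $needle) = @_;

    # Three-argument open with a lexical handle, checked for failure.
    open(my $fh, '<', $path) or die("Can't open \"$path\": $!\n");

    my $found = 0;
    while (my $line = <$fh>) {
        # \Q...\E quotes regex metacharacters in $needle.
        if ($line =~ /\Q$needle\E/) {
            $found = 1;
            last;
        }
    }
    close($fh);
    return $found;
}

# Stand-in for the slide XML: a temporary file holding the sample line.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} q{<a:t>Bailey</a:t></a:r></a:p><a:p><a:pPr lvl="1">}, "\n";
close($tmp_fh);

for my $name ('Bailey', 'billybob') {
    my $msg = file_contains($tmp_path, $name)
        ? "Matched $name\n"
        : "Did not match $name\n";
    print $msg;
}
```

With the sample line above, this should report a match for Bailey and no match for billybob.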
#2
2
I would suggest that you're taking the wrong approach. XML doesn't lend itself to line-based and regex-based parsing: there are many ways to write semantically identical XML that will not match the same regular expression.
I've had to adjust your XML a little too, because the fragment as posted isn't well-formed. Since you call it a 'sample', I'm assuming your real XML is valid. For reference, it's useful to provide sample XML that is valid, which means every tag that opens also closes.
So I'm using this:
<root>
  <a:r>
    <a:p>
      <a:t>Bailey</a:t>
    </a:p>
  </a:r>
  <a:p>
    <a:pPr lvl="1">
      <a:lnSpc>
        <a:spcPct val="90000" />
      </a:lnSpc>
    </a:pPr>
  </a:p>
</root>
Note this can be written in a variety of ways:
<root
><a:r
><a:p
><a:t
>Bailey</a:t></a:p></a:r><a:p
><a:pPr
lvl="1"
><a:lnSpc
><a:spcPct
val="90000"
/></a:lnSpc></a:pPr></a:p></root>
Or:
<root><a:r><a:p><a:t>Bailey</a:t></a:p></a:r><a:p><a:pPr lvl="1"><a:lnSpc><a:spcPct val="90000"/></a:lnSpc></a:pPr></a:p></root>
All of these mean the same thing, which hopefully illustrates why line-based parsing is a bad idea. This may not entirely apply to your use case, but I'm a firm believer that using an XML parser whenever XML is involved is no bad thing.
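To make the point concrete, here is a small sketch that runs one line-based pattern over two of the layouts shown above. The helper count_matching_lines and the pattern are assumptions chosen for illustration; they are not from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: number of lines in $text that match $pattern.
sub count_matching_lines {
    my ($text, $pattern) = @_;
    my $count = grep { /$pattern/ } split /\n/, $text;
    return $count;
}

# The single-line layout.
my $compact = '<root><a:r><a:p><a:t>Bailey</a:t></a:p></a:r><a:p>'
            . '<a:pPr lvl="1"><a:lnSpc><a:spcPct val="90000"/>'
            . '</a:lnSpc></a:pPr></a:p></root>';

# The same document with line breaks inside the tags.
my $split = <<'XML';
<root
><a:r
><a:p
><a:t
>Bailey</a:t></a:p></a:r><a:p
><a:pPr
lvl="1"
><a:lnSpc
><a:spcPct
val="90000"
/></a:lnSpc></a:pPr></a:p></root>
XML

# A pattern that relies on the tag and its text sitting on one line.
my $pattern = qr{<a:t>Bailey</a:t>};

print "compact: ", count_matching_lines($compact, $pattern), "\n";
print "split: ",   count_matching_lines($split, $pattern),   "\n";
```

The pattern finds the element in the single-line layout but not in the split one, because the opening tag ends on a different line from the text. A bare search for Bailey would still match in both, so how much this matters depends on how much of the tag structure the pattern relies on.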
Anyway - finding elements.
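#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Assumption: the file path is taken from the command line here;
# in the question it is built from $targetdirectory and $slidenumber.
my $inputFile = shift @ARGV or die "Usage: $0 file.xml\n";
my $search = 'Bailey';
my $found;

XML::Twig->new(
    twig_handlers => {
        # Runs for every element; \Q...\E matches $search literally.
        '_all_' => sub { $found++ if $_->text =~ m/\Q$search\E/ }
    }
)->parsefile($inputFile);

if ($found) {
    print "Found $search\n";
}
else {
    print "Didn't find $search\n";
}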
Note: this only 'finds' the keyword in the text of the XML, not in any of the attributes. This is usually more desirable than blindly matching against XML structure, attributes, and content alike.