Hello. Currently I am able to parse the XML file if I first save it from the web page into my folder.
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
my $parser = XML::Simple->new;
my $data = $parser->XMLin("config.xml");
print Dumper($data);
But it does not work if I try to parse it directly from the website.
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
my $parser = XML::Simple->new;
my $data = $parser->XMLin("http://website/computers/computers_main/config.xml");
print Dumper($data);
It gives me the following error: "File does not exist: http://website/computers/computers_main/config.xml at test.pl line 12"
How do I parse multiple XML files from a web page? I have to grab several XML files from websites and parse them. Can someone please help me with this?
3 Solutions
#1
2
Super edit: this method requires WWW::Mechanize, but it lets you log in to your website and then fetch the XML page. You will have to change a few things, which are described in the comments. Hope this helps.
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
use WWW::Mechanize;
use HTTP::Cookies;
# Create a new instance of Mechanize
my $bot = WWW::Mechanize->new();
# Create a cookie jar for the login credentials
$bot->cookie_jar(
    HTTP::Cookies->new(
        file           => "cookies.txt",
        autosave       => 1,
        ignore_discard => 1,
    )
);
# Connect to the login page
my $response = $bot->get( 'http://www.thePageYouLoginTo.com' );
# Select the login form (the first form on the page)
$bot->form_number(1);
# Enter the login credentials.
# You're going to have to change the field names login and
# pass (on the left) to match the field names of the form you're logging
# into (found in the source of the website). Then you can put your
# respective credentials on the right.
$bot->field( login => 'thisIsWhereYourLoginInfoGoes' );
$bot->field( pass => 'thisIsWhereYourPasswordInfoGoes' );
$response = $bot->click();
# Get the XML page
$response = $bot->get( 'http://website/computers/computers_main/config.xml' );
my $content = $response->decoded_content();
my $parser = XML::Simple->new;
my $data = $parser->XMLin($content);
print Dumper($data);
Give this a go. It uses LWP::Simple, as suggested in the other answer. It simply connects to the page, grabs its content (the XML file), and runs it through XMLin. Edit: added simple error checking at the get $url line. Edit 2: keeping this code here because it should work if a login is not required.
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
use LWP::Simple;
my $parser = XML::Simple->new;
my $url = 'http://website/computers/computers_main/config.xml';
# get returns undef on failure
my $content = get($url);
die "Unable to get $url\n" unless defined $content;
my $data = $parser->XMLin($content);
print Dumper($data);
#2
3
Read the documentation for XML::Simple. Notice that the XMLin method can take a file name, a string of XML, or an IO::Handle object. What it can't take is a URL via HTTP.
Use the Perl module LWP::Simple to fetch the XML file you need and pass its content to XMLin.
You'll have to download and install LWP::Simple using cpan, as you did before for XML::Simple.
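For the "multiple XML files" part of the question, the fetch-then-parse idea above can be wrapped in a loop. This is only a minimal sketch under some assumptions: parse_all and its optional $fetch argument are names made up for illustration (they are not part of either module), the URLs are taken from the command line, and no login is required.

```perl
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
use LWP::Simple ();

# Fetch each URL and parse the body with XML::Simple.
# $fetch is a hypothetical hook that defaults to LWP::Simple::get,
# so the download step can be swapped out if needed.
sub parse_all {
    my ($urls, $fetch) = @_;
    $fetch ||= \&LWP::Simple::get;
    my $parser = XML::Simple->new;
    my %parsed;
    for my $url (@$urls) {
        my $content = $fetch->($url);
        unless (defined $content) {
            warn "Unable to get $url\n";
            next;
        }
        # XMLin treats a string that contains markup as XML, not as a file name
        my $tree = eval { $parser->XMLin($content) };
        if (defined $tree) {
            $parsed{$url} = $tree;
        }
        else {
            warn "Unable to parse $url: $@";
        }
    }
    return \%parsed;
}

# Usage: perl fetch_xml.pl URL [URL ...]
my $data = parse_all(\@ARGV);
print Dumper($data) if %$data;
```

The result is a hash keyed by URL, so a failed download or a malformed file is skipped with a warning instead of stopping the whole run.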
#3
1
If you don't have a specific reason to stick with XML::Simple, use another parser such as XML::Twig or XML::LibXML, which provide a built-in way to parse XML that is available over the web.
Here is simple code for the same task using XML::Twig:
use strict;
use warnings;
use XML::Twig;
use LWP::Simple;
my $url = 'http://website/computers/computers_main/config.xml';
my $twig= XML::Twig->new();
$twig->parse( LWP::Simple::get( $url ));
As said, XML::Simple does not have such a built-in feature.
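The built-in feature mentioned for XML::Twig is its parseurl method (which relies on LWP being installed) and the eval-wrapped variant safe_parseurl. A minimal sketch, assuming the URLs are passed on the command line; twig_from_url is a helper name invented here for illustration:

```perl
use strict;
use warnings;
use XML::Twig;

# Returns the parsed twig, or nothing (after a warning) if the
# download or the parse failed.
sub twig_from_url {
    my ($url) = @_;
    my $twig = XML::Twig->new;
    # safe_parseurl wraps parseurl in an eval: it returns the twig on
    # success and a false value on failure, leaving the error in $@
    unless ( $twig->safe_parseurl($url) ) {
        warn "Could not load $url: $@\n";
        return;
    }
    return $twig;
}

# Usage: perl twig_url.pl URL [URL ...]
for my $url (@ARGV) {
    my $twig = twig_from_url($url) or next;
    print "$url: root element is <", $twig->root->tag, ">\n";
}
```

Compared with calling LWP::Simple::get yourself, this keeps the download and the parse in one call, at the cost of less control over the HTTP request.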