I have some input with a link and I want to open that link. For instance, I have an HTML file and want to find all links in the file and open their contents in an Excel spreadsheet.
4 Answers
#1
It sounds like you want the linktractor script from my HTML::SimpleLinkExtor module.
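For reference, a minimal sketch of the typical HTML::SimpleLinkExtor usage (assuming the module is installed from CPAN; `links.html` is a placeholder filename):

```perl
use strict;
use warnings;
use HTML::SimpleLinkExtor;

# Parse a local HTML file and collect every link-like attribute value
my $extor = HTML::SimpleLinkExtor->new;
$extor->parse_file( 'links.html' );   # placeholder filename

# links() returns all extracted URLs; a() would return only <a href> values
print "$_\n" for $extor->links;
```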
You might also be interested in my webreaper script. I wrote that a long, long time ago to do something close to this same task. I don't really recommend it because other tools are much better now, but you can at least look at the code.
CPAN and Google are your friends. :)
Mojo::UserAgent is quite nice for this, too:
use Mojo::UserAgent;

print Mojo::UserAgent
    ->new
    ->get( $ARGV[0] )
    ->res
    ->dom->find( "a" )
    ->map( attr => "href" )
    ->join( "\n" );
#2
That sounds like a job for WWW::Mechanize. It provides a fairly high level interface to fetching and studying web pages.
Once you've read the docs, I think you'll have a good idea how to go about it.
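To give a flavor of it, here is a minimal sketch with WWW::Mechanize (assuming the module is installed and a URL is passed on the command line):

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new;
$mech->get( $ARGV[0] );               # fetch the page given on the command line

# links() returns WWW::Mechanize::Link objects; url() gives the raw href
print $_->url, "\n" for $mech->links;
```

From there the same object can follow links with `follow_link`, which is what makes Mechanize convenient for crawling beyond a single page.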
#3
There is also Web::Query:
#!/usr/bin/env perl
use 5.10.0;
use strict;
use warnings;
use Web::Query;
say for wq( shift )->find('a')->attr('href');
Or, from the CLI:
$ perl -MWeb::Query -E'say for wq(shift)->find("a")->attr("href")' \
http://techblog.babyl.ca
#4
I've used URI::Find for this in the past (for when the file is not HTML).
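A minimal sketch of that approach (assuming URI::Find is installed; the file to scan is named on the command line):

```perl
use strict;
use warnings;
use URI::Find;

# Slurp the (non-HTML) file named on the command line
my $text = do {
    local $/;
    open my $fh, '<', $ARGV[0] or die "Can't open $ARGV[0]: $!";
    <$fh>;
};

# The callback fires once for each URI found in the plain text
my $finder = URI::Find->new( sub {
    my ( $uri ) = @_;
    print "$uri\n";
    return $uri;   # return the original text so the buffer is left unchanged
} );
$finder->find( \$text );
```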