I want to retrieve the HTML code of a link (web page) in PHP. For example, if the link is
https://*.com/questions/ask
then I want the HTML code of the page which is served. I want to retrieve this HTML code and store it in a PHP variable.
How can I do this?
8 answers
#1
97
If your PHP server allows URL fopen wrappers (the allow_url_fopen setting), then the simplest way is:
$html = file_get_contents('http://*.com/questions/ask');
If you need more control then you should look at the cURL functions:
$c = curl_init('http://*.com/questions/ask');
// Return the response body from curl_exec() instead of printing it
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
//curl_setopt(... other options you want...)
$html = curl_exec($c);
// curl_exec() returns false on failure
if ($html === false) {
    die(curl_error($c));
}
// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);
curl_close($c);
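file_get_contents() returns false on failure rather than throwing, so the result is worth checking before use. A minimal sketch of a wrapper, assuming a helper name (fetch_html), a user-agent string and a timeout value that are all illustrative and not part of PHP or of the answer above:

```php
<?php
/**
 * Fetch a URL (or a local path) into a string, failing loudly instead of returning false.
 * fetch_html() and its parameters are hypothetical names chosen for this sketch.
 */
function fetch_html(string $url, float $timeoutSeconds = 10.0): string
{
    // Options under the 'http' key apply to http:// and https:// requests only.
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'timeout'    => $timeoutSeconds,
            'user_agent' => 'ExampleFetcher/1.0', // illustrative value
        ],
    ]);

    // Suppress the warning that file_get_contents() emits on failure and throw instead.
    $html = @file_get_contents($url, false, $context);
    if ($html === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    return $html;
}
```

A call such as `$html = fetch_html('http://example.com/');` would then either yield the page source or throw a RuntimeException.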
#2
17
Also, if you want to manipulate the retrieved page somehow, you might want to try a PHP DOM parser. I find PHP Simple HTML DOM Parser very easy to use.
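As an alternative that needs no download, PHP's bundled DOM extension can also parse a fetched page. The sketch below uses DOMDocument, not PHP Simple HTML DOM Parser, and assumes the DOM extension is enabled; extract_links() is a hypothetical helper name and the sample markup is made up for illustration:

```php
<?php
/**
 * Return the href of every <a> element in an HTML string.
 * extract_links() is an illustrative helper, not a library function.
 */
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Sample markup standing in for a fetched page.
$sample = '<html><body><a href="/questions">Questions</a> <a href="/tags">Tags</a></body></html>';
print_r(extract_links($sample));
```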
#3
10
You may want to check out the YQL libraries from Yahoo: http://developer.yahoo.com/yql
The task at hand is as simple as
select * from html where url = 'http://*.com/questions/ask'
You can try this out in the console at: http://developer.yahoo.com/yql/console (requires login)
Also see Chris Heilmann's screencast for some nice ideas on what more you can do: http://developer.yahoo.net/blogs/theater/archives/2009/04/screencast_collating_distributed_information.html
#4
8
Simple way: Use file_get_contents():
$page = file_get_contents('http://*.com/questions/ask');
Please note that allow_url_fopen must be true in your php.ini to be able to use URL-aware fopen wrappers.
More advanced way: If allow_url_fopen is disabled on your host (PHP enables it by default, but some hosts turn it off) and you cannot change your PHP configuration, use the cURL library to connect to the desired page, provided ext/curl is installed.
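The choice between the two approaches can be made at runtime by inspecting the configuration. A minimal sketch, assuming a hypothetical helper name (choose_fetch_method) and return values invented for this example:

```php
<?php
/**
 * Decide how to fetch a remote page given what this PHP installation supports.
 * choose_fetch_method() is an illustrative name; it returns
 * 'file_get_contents', 'curl' or 'none'.
 */
function choose_fetch_method(bool $urlFopenEnabled, bool $curlLoaded): string
{
    if ($urlFopenEnabled) {
        return 'file_get_contents';
    }
    return $curlLoaded ? 'curl' : 'none';
}

// Read the real settings of the running interpreter.
// ini_get() returns a string, so normalise it to a boolean first.
$urlFopenEnabled = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
$curlLoaded      = extension_loaded('curl');

echo choose_fetch_method($urlFopenEnabled, $curlLoaded), PHP_EOL;
```

Keeping the decision in a function that takes plain booleans makes it easy to check every combination without touching php.ini.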
#5
2
Look at this function:
http://ru.php.net/manual/en/function.file-get-contents.php
#6
2
You could use file_get_contents if you want to store the source in a variable; however, cURL is better practice.
$url = file_get_contents('http://example.com');
echo $url;
This solution will display the web page on your site. However, cURL is a better option.
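Echoing the fetched string makes the browser render the page. If the goal is to show the HTML code itself, the string has to be escaped first. A small sketch, with a hard-coded sample string standing in for the fetched page (the sample markup is an assumption made for illustration):

```php
<?php
// A stand-in for a fetched page; in practice this would come from file_get_contents() or cURL.
$html = '<html><body><h1>Hello & welcome</h1></body></html>';

// Escape the markup so the browser prints the tags instead of interpreting them.
$escaped = htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo '<pre>', $escaped, '</pre>', PHP_EOL;
```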
#7
1
include_once('simple_html_dom.php');
$url = "http://*.com/questions/ask";
$html = file_get_html($url);
Using this code you get the whole HTML in parsed form (a simple_html_dom object rather than a raw string). Download the 'simple_html_dom.php' file here: http://sourceforge.net/projects/simplehtmldom/files/simple_html_dom.php/download
#8
0
Here are two different, simple ways to get content from a URL:
1) The first method
Enable allow_url_fopen on your hosting (php.ini or equivalent). allow_url_include only affects include and require, so it is not the setting needed here.
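<?php
// file_get_contents() returns the page; readfile() would print it and return only the byte count
$variableee = file_get_contents("http://example.com/");
echo $variableee;
?>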
or
2) The second method
Enable php_curl. php_imap is not needed for this, and php_openssl matters only for the https:// stream wrapper used by the first method, not for cURL.
<?php
// you can add other cURL options too
// see here - http://php.net/manual/en/function.curl-setopt.php
function get_dataa($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST are left at their defaults
    // (verification on); turning them off exposes HTTPS requests to interception
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch); // false on failure
    curl_close($ch);
    return $data;
}
$variableee = get_dataa('http://example.com');
echo $variableee;
?>
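On the first method: readfile() writes the content straight to the output and returns the number of bytes read, whereas file_get_contents() returns the content itself. The sketch below shows the difference and how output buffering can capture what readfile() prints. A local temporary file stands in for the remote page, which is an assumption made so the example runs without network access:

```php
<?php
// A local temp file stands in for the remote page.
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, '<p>hello</p>');

// readfile() prints the content and returns the number of bytes read,
// so the output is captured with an output buffer here.
ob_start();
$bytes    = readfile($path);
$captured = ob_get_clean();

// file_get_contents() returns the content directly.
$content = file_get_contents($path);

var_dump($bytes, $captured === $content);
unlink($path);
```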