How can I get the content of a web page using ASP.NET? I need to write a program that fetches the HTML of a web page and stores it in a string variable.
4 solutions
#1
100
You can use the WebClient class:
WebClient client = new WebClient();
string downloadString = client.DownloadString("http://www.gooogle.com");
#2
66
I've run into issues with WebClient.DownloadString before. If you do too, you can try this:
// Requires: using System.IO; using System.Net;
WebRequest request = WebRequest.Create("http://www.google.com");
string html;
// Dispose the response, stream, and reader once the body has been read
using (WebResponse response = request.GetResponse())
using (Stream data = response.GetResponseStream())
using (StreamReader sr = new StreamReader(data))
{
    html = sr.ReadToEnd();
}
#3
20
I recommend not using WebClient.DownloadString. This is because (at least in .NET 3.5) DownloadString is not smart enough to use/remove the BOM, should it be present. This can result in the BOM (ï»¿) incorrectly appearing as part of the string when UTF-8 data is returned (at least without a charset) - ick!
Instead, this slight variation will work correctly with BOMs:
// Requires: using System.IO; using System.Net; using System.Text;
string ReadTextFromUrl(string url) {
    // WebClient is still convenient
    // Assume UTF8, but detect BOM - could also honor response charset I suppose
    using (var client = new WebClient())
    using (var stream = client.OpenRead(url))
    using (var textReader = new StreamReader(stream, Encoding.UTF8, true)) {
        return textReader.ReadToEnd();
    }
}
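To see the BOM difference without hitting the network, here is a small offline sketch. The class name BomDemo, its method names, and the sample payload are made up for illustration. DecodeNaive only stands in for a decode that ignores the BOM; it is not a claim about how DownloadString is implemented internally. DecodeWithReader uses the same StreamReader constructor arguments as ReadTextFromUrl.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // Naive decode: the UTF-8 BOM survives as U+FEFF at the start of the string.
    public static string DecodeNaive(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // StreamReader decode: the BOM is consumed and not returned as text.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical sample payload: UTF-8 BOM (EF BB BF) followed by "<html>".
    public static byte[] SampleBytes()
    {
        byte[] bom = { 0xEF, 0xBB, 0xBF };
        byte[] body = Encoding.UTF8.GetBytes("<html>");
        byte[] all = new byte[bom.Length + body.Length];
        bom.CopyTo(all, 0);
        body.CopyTo(all, bom.Length);
        return all;
    }
}
```

Calling BomDemo.DecodeNaive(BomDemo.SampleBytes()) gives a 7-character string whose first character is U+FEFF, while BomDemo.DecodeWithReader returns just the 6 characters of the markup.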
#4
7
WebClient client = new WebClient();
string content = client.DownloadString(url);
Pass in the URL of the page you want to fetch. You can then parse the result using HtmlAgilityPack.