I want to use HTTP GET and POST commands to retrieve URLs from a website and parse the HTML. How do I do this?
5 Answers
#1
19
You can use HttpURLConnection in combination with URL.
URL url = new URL("http://example.com");
HttpURLConnection connection = (HttpURLConnection)url.openConnection();
connection.setRequestMethod("GET");
connection.connect();
InputStream stream = connection.getInputStream();
// read the contents using an InputStreamReader
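To fill in the last comment, here is a minimal sketch of reading the response into a String. The class name SimpleGet and the helpers readAll and get are made up for illustration, and the code assumes the page is UTF-8 encoded; a more careful client would take the charset from the Content-Type header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SimpleGet {

    // Reads an entire stream into a String using the given charset.
    public static String readAll(InputStream stream, Charset charset) throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream, charset))) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                body.append(buffer, 0, count);
            }
        }
        return body.toString();
    }

    // Performs a GET request and returns the response body as text.
    // Assumption: the response is UTF-8 encoded.
    public static String get(String address) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestMethod("GET");
        try {
            return readAll(connection.getInputStream(), StandardCharsets.UTF_8);
        } finally {
            connection.disconnect();
        }
    }
}
```

Calling SimpleGet.get("http://example.com") should then give you the page's HTML as a single String. Note that getInputStream() throws an IOException on 4xx/5xx responses, so you may want to check getResponseCode() first.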
#2
3
The easiest way to do a GET is to use the built-in java.net.URL. However, as mentioned, httpclient is the proper way to go, as it will let you handle redirects, among other things.
For parsing the HTML, you can use an HTML parser.
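As a rough illustration of the parsing step, here is a sketch that pulls the href out of every anchor tag. It does not use whichever parser library the answer had in mind; it swaps in the old HTML parser that ships with the JDK (javax.swing.text.html), only so the example has no external dependencies. That parser targets HTML 3.2, so a dedicated library is likely a better fit for modern pages. The class name LinkExtractor and the sample HTML are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class LinkExtractor {

    // Returns the href value of every <a> start tag, in document order.
    public static List<String> extractLinks(String html) throws IOException {
        List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attributes, int position) {
                if (tag == HTML.Tag.A) {
                    Object href = attributes.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };
        // The input is already decoded text, so ignore any charset declared in the page.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample document, used only to exercise the method.
        String html = "<html><body><a href=\"http://example.com/a\">A</a>"
                + "<p>text</p><a href=\"/b\">B</a></body></html>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

Relative links such as /b come back exactly as written in the page; resolving them against the page's address (for example with new URL(baseUrl, link)) is left out here.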
#3
3
The accepted answer is from robhruska - thank you. It shows the most basic way to do this, and it makes clear what a simple URL connection requires. In the longer term, however, the better strategy would be to use HTTP Client, which offers more advanced, feature-rich ways to complete this task.
Thank you everyone, here's the quick answer again:
URL url = new URL("http://example.com");
HttpURLConnection connection = (HttpURLConnection)url.openConnection();
connection.setRequestMethod("GET");
connection.connect();
InputStream stream = connection.getInputStream();
// read the contents using an InputStreamReader
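The question also asked about POST, which none of the snippets here show. The same HttpURLConnection class can probably cover that too; below is a sketch assuming the server expects an ordinary URL-encoded form. The class name SimplePost, the helper names, and any field names or addresses you pass in are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

public class SimplePost {

    // Encodes the fields as application/x-www-form-urlencoded (key=value pairs joined by '&').
    public static String encodeForm(Map<String, String> fields) throws UnsupportedEncodingException {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            joiner.add(URLEncoder.encode(field.getKey(), "UTF-8")
                    + "=" + URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return joiner.toString();
    }

    // Sends the form with a POST request and returns the response stream,
    // which can be read the same way as the stream from a GET.
    public static InputStream post(String address, Map<String, String> fields) throws IOException {
        byte[] body = encodeForm(fields).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true); // must be set before writing a request body
        connection.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded; charset=UTF-8");
        connection.setFixedLengthStreamingMode(body.length);
        try (OutputStream out = connection.getOutputStream()) {
            out.write(body);
        }
        return connection.getInputStream();
    }
}
```

The only real differences from the GET version are setDoOutput(true) and writing the encoded form to getOutputStream() before reading the response.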