In Java, what is the best way to read a URL and split it into its parts?

Time: 2022-07-04 03:05:17

Firstly, I am aware that there are other similar posts, but since mine uses a URL and I am not always sure what my delimiter will be, I feel that I am alright posting my question. My assignment is to make a crude web browser. I have a textField that a user enters the desired URL into. I then obviously have to navigate to that webpage. Here is an example from my teacher of roughly what my code should look like. This is the text I'm supposed to be sending to my socket. Sample URL: http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol

GET /wiki/Hypertext_Transfer_Protocol HTTP/1.1\n
Host: en.wikipedia.org\n
\n

So my question is this: I am going to read in the URL as one complete string, so how do I extract just the "en.wikipedia.org" part and just the extension (the part after the host)? I tried this as a test:

    // Test attempt: split on ".org" and glue the pieces back together.
    String url = "http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol";
    String done = " ";
    String[] hope = url.split(".org");

    for (int i = 0; i < hope.length; i++)
    {
        done = done + hope[i];
    }
    System.out.println(done);

This just prints out the URL without the ".org" in it. I think I'm on the right track, but I am not sure. Also, I know that websites can have different endings (.org, .com, .edu, etc.), so I am assuming I'll need a few if statements that compensate for the possible different endings. Basically, how do I get the URL into the two parts that I need?

3 solutions

#1


27  

The java.net.URL class pretty much does this; look at the tutorial. For example, given this URL:

http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING

This is the kind of information you can expect to obtain:

protocol = http
authority = example.com:80
host = example.com
port = 80
path = /docs/books/tutorial/index.html
query = name=networking
filename = /docs/books/tutorial/index.html?name=networking
ref = DOWNLOADING
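
The listing above can be reproduced with a short program. This is only a minimal sketch: the class name UrlPartsDemo and the helper parse are made up for illustration, and the URL is obtained through java.net.URI because the URL(String) constructor is deprecated in recent JDKs.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class; only the java.net calls come from the JDK.
class UrlPartsDemo {

    // Parses the text into a java.net.URL, wrapping the checked exception
    // so callers do not have to declare it.
    static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + text, e);
        }
    }

    public static void main(String[] args) {
        URL url = parse(
            "http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING");

        System.out.println("protocol = " + url.getProtocol());
        System.out.println("authority = " + url.getAuthority());
        System.out.println("host = " + url.getHost());
        System.out.println("port = " + url.getPort());   // -1 when the URL has no explicit port
        System.out.println("path = " + url.getPath());
        System.out.println("query = " + url.getQuery());
        System.out.println("filename = " + url.getFile());
        System.out.println("ref = " + url.getRef());
    }
}
```

For the question's purposes, getHost() and getPath() are the two values needed for the Host: line and the GET line.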

#2


1  

This is how you should split your URL parts: http://docs.oracle.com/javase/tutorial/networking/urls/urlInfo.html
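
Applied to the question, the host and path can be dropped straight into the request text the teacher's sample shows. The following is a rough sketch, not the assignment's required solution: the class RequestBuilderDemo and the method buildGetRequest are hypothetical names, and the line endings copy the "\n" used in the question.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of any library.
class RequestBuilderDemo {

    // Builds the request text in the shape shown in the question:
    // a GET line, a Host line, and an empty line.
    static String buildGetRequest(String urlText) {
        URI uri = URI.create(urlText);
        String host = uri.getHost();

        // An address such as http://example.com has an empty path; request "/" instead.
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        // Keep the query string, if any, as part of the request target.
        if (uri.getRawQuery() != null) {
            path = path + "?" + uri.getRawQuery();
        }

        return "GET " + path + " HTTP/1.1\n"
             + "Host: " + host + "\n"
             + "\n";
    }

    public static void main(String[] args) {
        System.out.print(
            buildGetRequest("http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol"));
    }
}
```

Note that HTTP/1.1 specifies "\r\n" as the line terminator, so a strict server may expect that instead of the bare "\n" from the sample.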

#3


0  

Instead of url.split(".org"); try url.split("/"); and iterate through your array of strings.
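
A possible way to do the split on "/" is sketched below. It assumes the input always starts with a scheme such as http://, and the names SplitDemo and hostAndPath are invented for this example. One thing to keep in mind is that String.split takes a regular expression, so in ".org" the dot matches any character, not just a literal period.

```java
// Hypothetical demo class showing the split("/") idea.
class SplitDemo {

    // Returns {host, path} for a URL of the form scheme://host/path.
    static String[] hostAndPath(String url) {
        // Limit of 4 keeps everything after the third "/" in one piece:
        // "http:", "", "en.wikipedia.org", "wiki/Hypertext_Transfer_Protocol"
        String[] parts = url.split("/", 4);
        if (parts.length < 3 || parts[2].isEmpty()) {
            throw new IllegalArgumentException("Expected scheme://host/..., got: " + url);
        }
        String host = parts[2];
        // The split removed the leading "/" of the path, so put it back.
        String path = parts.length > 3 ? "/" + parts[3] : "/";
        return new String[] { host, path };
    }

    public static void main(String[] args) {
        String[] result =
            hostAndPath("http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol");
        System.out.println(result[0]);
        System.out.println(result[1]);
    }
}
```

This works for any ending (.org, .com, .edu), so no if statements per ending are needed. It does not separate a port or query string from the rest.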

Or you can look into regular expressions. This is a good example to start with.
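
For the regular-expression route, one rough sketch is shown here. The pattern is a simplification written for this example (it is not a full URL grammar), and the names RegexDemo and hostAndPath are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern is a simplified assumption.
class RegexDemo {

    // Group 1: scheme, group 2: host, group 3: optional port, group 4: optional path.
    private static final Pattern URL_PATTERN = Pattern.compile(
        "^([a-zA-Z][a-zA-Z0-9+.-]*)://([^/:?#]+)(?::(\\d+))?(/[^?#]*)?");

    // Returns {host, path}; the path defaults to "/" when the URL has none.
    static String[] hostAndPath(String url) {
        Matcher m = URL_PATTERN.matcher(url);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized URL: " + url);
        }
        String host = m.group(2);
        String path = m.group(4) != null ? m.group(4) : "/";
        return new String[] { host, path };
    }

    public static void main(String[] args) {
        String[] result =
            hostAndPath("http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol");
        System.out.println(result[0]);
        System.out.println(result[1]);
    }
}
```

Compared with a hand-written pattern like this, a parser that already ships with the JDK handles more edge cases, so the regex is mainly useful as an exercise.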

Good luck on your homework.
