Tips for developing a web server

Date: 2023-01-16 16:42:56

After doing some searching here, I found next to no questions on developing a web server.

I'm mainly doing this for two reasons: as a side project, and to learn more about developing a server program. This is not going to turn into a usable application; it's more of a learning tool.

So the questions are simple.

  • Have you developed a web server? (No matter what language.)
  • What gotchas and other good tips can you supply?

Links to helpful sites are welcome, but don't link to a working project that is open source, since this is about the process of learning.

8 answers

#1


A web server starts out as being an extremely simple piece of code:

  • open a TCP/IP socket on port 80
  • while not terminated
    • wait for connections on that socket
    • when someone sends you HTTP headers
      • find the path to the file
      • copy the file to the socket

So the outline of the code is easy.
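
As a rough illustration of the outline above, here is a minimal sketch in Python. It is not anyone's production design: the names handle_one and serve_forever, the doc_root parameter, and port 8080 (chosen because binding port 80 usually needs elevated privileges) are all assumptions made for the example. It serves one connection at a time and does no path scrubbing.

```python
import os
import socket

def handle_one(conn, doc_root):
    """Read one request head from conn and copy the requested file back."""
    data = b""
    while b"\r\n\r\n" not in data:           # the headers end with a blank line
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    # The request line looks like: GET /index.html HTTP/1.0
    parts = data.split(b"\r\n", 1)[0].decode("latin-1").split()
    url_path = parts[1] if len(parts) >= 2 else "/"
    # NOTE: no path scrubbing here; a real server must reject ".." segments.
    file_path = os.path.join(doc_root, url_path.lstrip("/"))
    if os.path.isfile(file_path):
        with open(file_path, "rb") as f:
            body = f.read()
        status = "200 OK"
    else:
        body = b"not found\n"
        status = "404 Not Found"
    head = "HTTP/1.0 %s\r\nContent-Length: %d\r\n\r\n" % (status, len(body))
    conn.sendall(head.encode("latin-1") + body)

def serve_forever(doc_root, host="127.0.0.1", port=8080):
    """Accept connections one at a time and answer each with handle_one."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen(5)
        while True:                           # "while not terminated"
            conn, _addr = listener.accept()   # wait for a connection
            with conn:
                handle_one(conn, doc_root)
```

Calling serve_forever(".") and pointing a browser at http://127.0.0.1:8080/ plus a file name should return that file; while one request is being served, every other client waits.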

Now, you have some complexities to handle:

  • In the simplest version of the code, while you're talking to one browser, all the others can't connect. You need to come up with some way of handling multiple connections.

  • It's often convenient to be able to send out something more than just a static file (although the first HTTP servers did exactly that), so you need to be able to run other programs.

Handling the possibility of multiple connections is also relatively easy, with a number of possible choices.

  • The simplest version (again, this is the way it was done originally) is to have the code that listens on port 80 set up a specific socket for that connection, then fork a copy of itself to handle that one connection. That process runs until the socket is closed, and then terminates. However, that's relatively expensive: a fork can take on the order of tens of milliseconds, so that limits how fast you can run.

  • The second choice is to create a lightweight process (a.k.a. a thread) to process the request.
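
The second choice above (one thread per connection) might look roughly like this in Python. This is only a sketch: handle, accept_loop, and stop_event are names made up for the example, and the handler just sends a fixed reply.

```python
import socket
import threading

def handle(conn):
    """Toy handler: read the request head, send a fixed reply, then close."""
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        body = b"hello\n"
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body) + body)

def accept_loop(listener, stop_event):
    """Hand every accepted connection to its own thread."""
    listener.settimeout(0.2)                  # wake up regularly to check stop_event
    while not stop_event.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue
        conn.setblocking(True)                # the handler uses plain blocking I/O
        # The accepting thread returns to accept() immediately; the new
        # thread lives until its connection is closed.
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```

Here listener is assumed to be a socket that is already bound and listening, and stop_event a threading.Event used to end the loop.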

Running a program is actually fairly easy too. In general, you define a special path to a CGI directory; when a URL's path goes through that directory, the server interprets the path name as the path to a program. The server then creates a subprocess using fork/exec, with STDOUT connected to the socket. The program runs, sending its output to STDOUT, and that output goes on to the client browser.
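
A hedged sketch of that idea in Python, using subprocess instead of a hand-written fork/exec. The function name run_cgi is made up for the example, and it assumes a POSIX system, where a socket's file descriptor can be handed to a child process as its standard output.

```python
import subprocess

def run_cgi(conn, argv):
    """Run argv as a child process with its stdout wired to the socket."""
    # The server sends the status line; the program is expected to print
    # its own headers, a blank line, and then the body.
    conn.sendall(b"HTTP/1.0 200 OK\r\n")
    # POSIX assumption: the socket's file descriptor is usable as stdout.
    completed = subprocess.run(argv, stdout=conn.fileno(), stdin=subprocess.DEVNULL)
    return completed.returncode
```

The child writes straight to the client, so the server never buffers the program's output. The cost is that the server can no longer change the status line once the program has started writing.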

This is the basic pattern; everything else a web server does is just adding frills and additional functionality to this basic pattern.

Here are some other sources for example code:

It does pretty much nothing of what you really wanted, but for simplicity it's hard to beat this one from http://www.commandlinefu.com:

$ python -m SimpleHTTPServer

(SimpleHTTPServer is the Python 2 module name; on Python 3 the equivalent is python3 -m http.server.)

#2


Firstly, please don't let this become a usable project - getting security right for web servers is really hard.

Ok, here are things to keep in mind:

  1. The thread that accepts connections needs to hand off to background threads as soon as possible.

  2. You can't have a thread for every single connection: with large volumes you'll run into your thread limit.

  3. Use some kind of worker thread pool to handle your requests.

  4. Ensure that you scrub the URL when you get an HTTP GET request, so that nobody can request something like http://localhost/../../Users/blah/ to get higher-level access.

  5. Ensure you always set the relevant content and MIME types.
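
For the URL scrubbing and MIME type points, a minimal Python sketch could look like the following. The helper names resolve_path and content_type_for are invented for the example, the check assumes POSIX-style paths, and it is a starting point rather than a complete defence.

```python
import mimetypes
import os
from urllib.parse import unquote, urlsplit

def resolve_path(doc_root, url):
    """Map a request URL to a path under doc_root, or None if it escapes."""
    path = unquote(urlsplit(url).path)        # drop ?query, decode %2e%2e etc.
    if "\x00" in path:
        return None                           # embedded NUL bytes are never valid
    root = os.path.realpath(doc_root)
    candidate = os.path.realpath(os.path.join(root, path.lstrip("/")))
    # After resolving ".." and symlinks the result must still be inside root.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate

def content_type_for(path):
    """Guess a Content-Type from the file extension, with a generic default."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"
```

Resolving the path first and comparing it with the document root afterwards tends to be safer than trying to strip ".." out of the string, since encoded or nested variants are easy to miss.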

Good luck - this is a hell of a job.
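
For the first three points in the list (hand off quickly, don't create a thread per connection, use a worker pool), here is a rough Python sketch. The names serve_with_pool, handle, and stop_event, and the default of 8 workers, are assumptions for illustration only.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn):
    """Toy handler: read the request head, send a fixed reply, then close."""
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        body = b"pooled\n"
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body) + body)

def serve_with_pool(listener, stop_event, max_workers=8):
    """Accept on the calling thread; run handlers on a fixed-size pool."""
    listener.settimeout(0.2)                  # wake up regularly to check stop_event
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while not stop_event.is_set():
            try:
                conn, _addr = listener.accept()
            except socket.timeout:
                continue
            conn.setblocking(True)
            pool.submit(handle, conn)         # queue the work, go back to accept()
```

The pool caps the number of threads, but the executor's internal queue is unbounded, so a server under real load would also want a limit on pending connections.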

#3


The networking et al. are pretty standard fare, so don't worry so much about that. (There are several "instant" sample network servers in almost any language.)

Instead, focus on actually implementing the HTTP specification. You'll be amazed at a) what you don't know and b) how many things that are supposed to be HTTP compliant really aren't, but fake it well.

Then you'll marvel that the web works at all.

When you're done with HTTP, enjoy trying to implement IMAP.

#4


I wrote a light webserver in Python a few years back, also as a learning project.

The simplest piece of advice I can give, especially for a learning project, is to build a core that works, then iterate on the design on top of that. Don't aim for the moon right off the hop: start very small, then add features, refine, and continue. I would recommend using a tool that encourages experimentation, like Python, where you can literally type and test code at the same time.

#5


The course I TAed had a proxy assignment so I can kind of shed some light here, I think.

So, you're going to end up doing a lot of header changing just to make your life easier. Namely, HTTP/1.0 is way easier to deal with than HTTP/1.1. You don't want to have to deal with managing timeouts and keep-alives and stuff like that. One connection per transaction is easiest.

You're going to be doing lots and lots of parsing. Parsing is hard in C. I'd advise you to write a function that is something like:

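int readline(int s, char *buff, int maxLen) {
    int i = 0;
    char c;
    /* readNextCharFromSocket() is a placeholder helper, assumed to return 0 at end of stream */
    while (i < maxLen - 1 && (c = readNextCharFromSocket(s)) && c != '\n')
        buff[i++] = c;
    if (i > 0 && buff[i - 1] == '\r') i--;  /* drop the CR of a CRLF line ending */
    buff[i] = '\0';                          /* NUL-terminate for the C string functions */
    return i;
}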

and handle it one line at a time, solely because it's easiest to use the existing C string functions on one line at a time. Also, remember lines are \r\n separated and headers are terminated with a \r\n\r\n.
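
To make the \r\n and \r\n\r\n rules concrete, here is a small sketch of splitting a request head into its parts. It is in Python rather than C purely for brevity, and the function name parse_request_head is made up for the example.

```python
def parse_request_head(raw):
    """Split raw request bytes into (method, target, version, headers, rest)."""
    # The head ends at the first blank line, i.e. the first \r\n\r\n.
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete request head: no blank line yet")
    lines = head.decode("latin-1").split("\r\n")   # one line at a time
    # First line is the request line, e.g. "GET /index.html HTTP/1.0".
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return method, target, version, headers, rest
```

For the input b"GET /a HTTP/1.0\r\nHost: example.com\r\n\r\n" this returns the method "GET", the target "/a", the version "HTTP/1.0", a headers dict containing "host", and an empty remainder.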

The main hard part will be parsing; as long as you can read files, everything else will work as expected.

For debugging, you'll probably want to print out the headers that are passed around so you can sanity-check them when stuff breaks.

#6


local-web-server is an example of a simple development web server written in node.js. It's more reliable and has more features than python -m SimpleHTTPServer.

#7


I was thinking of starting the same project as a way to learn Python better. There's a BaseHTTPServer class that's a pretty good starting point.
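
For reference, BaseHTTPServer is the Python 2 module name; in Python 3 the same classes live in http.server. A minimal sketch of that starting point follows, where the handler class HelloHandler and the helper make_server are names invented for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        """Answer every GET with a short plain-text body echoing the path."""
        body = ("You asked for %s\n" % self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging in this sketch

def make_server(port=8000):
    """Build (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

Calling make_server().serve_forever() starts it. The base class already parses the request line and headers, so this route teaches less about the protocol than writing the socket loop yourself.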

Here are some tutorial-style articles: 1 & 2

#8


I've already developed a web server in C that serves HTML and runs PHP. It's not that complicated: you should know how to use TCP/IP sockets, threads (to handle multiple requests), and process forking (you need to create a child process to execute the PHP command line; I used execvp).

I think the hardest part is handling strings in C and passing POST/GET requests to the command line.

Good luck
