I asked a question about this a few days ago. My solution is along the lines of what was suggested in the accepted answer. However, a friend of mine came up with the following solution:
Please note that the code has been updated a few times (check the edit revisions) to reflect the suggestions in the answers below. If you intend to give a new answer, please do so with this new code in mind and not the old one which had lots of problems.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char *argv[]){
int fd[2], i, aux, std0, std1;
do {
std0 = dup(0); // backup stdin
std1 = dup(1); // backup stdout
// let's pretend I'm reading commands here in a shell prompt
READ_COMMAND_FROM_PROMPT();
for(i=1; i<argc; i++) {
// do we have a previous command?
if(i > 1) {
dup2(aux, 0);
close(aux);
}
// do we have a next command?
if(i < argc-1) {
pipe(fd);
aux = fd[0];
dup2(fd[1], 1);
close(fd[1]);
}
// last command? restore stdout...
if(i == argc-1) {
dup2(std1, 1);
close(std1);
}
if(!fork()) {
// if not last command, close all pipe ends
// (the child doesn't use them)
if(i < argc-1) {
close(std0);
close(std1);
close(fd[0]);
}
execlp(argv[i], argv[i], NULL);
exit(0);
}
}
// restore stdin to be able to keep using the shell
dup2(std0, 0);
close(std0);
}
return 0;
}
This simulates a series of commands through pipes like in bash, for instance: cmd1 | cmd2 | ... | cmd_n. I say "simulate", because, as you can see, the commands are actually read from the arguments. Just to spare time coding a simple shell prompt...
Of course there are some issues to fix and things to add, like error handling, but that's not the point here. I think I kind of get the code, but I still find it quite confusing how this whole thing works.
Am I missing something, or does this really work, and is it a nice and clean solution to the problem? If not, can anyone point out the crucial problems this code has?
5 answers
#1
Looks reasonable, though it really needs to fix leaking std and aux to the children and after the loop, and the parent's original stdin is lost forever.
This would probably be better with color...
./a.out foo bar baz <stdin >stdout
std = dup(stdout)
pipe(fd)              pipe1[0] -- pipe1[1]
aux = fd[0]
dup2(fd[1], 1)
close(fd[1])
fork+exec(foo)
dup2(aux, 0)
close(aux)
pipe(fd)              pipe2[0] -- pipe2[1]
aux = fd[0]
dup2(fd[1], 1)
close(fd[1])
fork+exec(bar)
dup2(aux, 0)
close(aux)
pipe(fd)              pipe3[0] -- pipe3[1]
aux = fd[0]
dup2(fd[1], 1)
close(fd[1])
dup2(std, 1)
fork+exec(baz)
- foo gets stdin=stdin, stdout=pipe1[1]
- bar gets stdin=pipe1[0], stdout=pipe2[1]
- baz gets stdin=pipe2[0], stdout=stdout
My suggestion is different in that it avoids mangling the parent's stdin and stdout, only manipulating them within the child, and never leaks any FDs. It's a bit harder to diagram, though.
for cmd in cmds
    if there is a next cmd
        pipe(new_fds)
    fork
    if child
        if there is a previous cmd
            dup2(old_fds[0], 0)
            close(old_fds[0])
            close(old_fds[1])
        if there is a next cmd
            close(new_fds[0])
            dup2(new_fds[1], 1)
            close(new_fds[1])
        exec cmd || die
    else
        if there is a previous cmd
            close(old_fds[0])
            close(old_fds[1])
        if there is a next cmd
            old_fds = new_fds
parent
    cmds = [foo, bar, baz]
    fds = {0: stdin, 1: stdout}

cmd = cmds[0] {
    there is a next cmd {
        pipe(new_fds)
            new_fds = {3, 4}
            fds = {0: stdin, 1: stdout, 3: pipe1[0], 4: pipe1[1]}
    }

    fork => child
        there is a next cmd {
            close(new_fds[0])
                fds = {0: stdin, 1: stdout, 4: pipe1[1]}
            dup2(new_fds[1], 1)
                fds = {0: stdin, 1: pipe1[1], 4: pipe1[1]}
            close(new_fds[1])
                fds = {0: stdin, 1: pipe1[1]}
        }
        exec(cmd)

    there is a next cmd {
        old_fds = new_fds
            old_fds = {3, 4}
    }
}

cmd = cmds[1] {
    there is a next cmd {
        pipe(new_fds)
            new_fds = {5, 6}
            fds = {0: stdin, 1: stdout, 3: pipe1[0], 4: pipe1[1], 5: pipe2[0], 6: pipe2[1]}
    }

    fork => child
        there is a previous cmd {
            dup2(old_fds[0], 0)
                fds = {0: pipe1[0], 1: stdout, 3: pipe1[0], 4: pipe1[1], 5: pipe2[0], 6: pipe2[1]}
            close(old_fds[0])
                fds = {0: pipe1[0], 1: stdout, 4: pipe1[1], 5: pipe2[0], 6: pipe2[1]}
            close(old_fds[1])
                fds = {0: pipe1[0], 1: stdout, 5: pipe2[0], 6: pipe2[1]}
        }
        there is a next cmd {
            close(new_fds[0])
                fds = {0: pipe1[0], 1: stdout, 6: pipe2[1]}
            dup2(new_fds[1], 1)
                fds = {0: pipe1[0], 1: pipe2[1], 6: pipe2[1]}
            close(new_fds[1])
                fds = {0: pipe1[0], 1: pipe2[1]}
        }
        exec(cmd)

    there is a previous cmd {
        close(old_fds[0])
            fds = {0: stdin, 1: stdout, 4: pipe1[1], 5: pipe2[0], 6: pipe2[1]}
        close(old_fds[1])
            fds = {0: stdin, 1: stdout, 5: pipe2[0], 6: pipe2[1]}
    }

    there is a next cmd {
        old_fds = new_fds
            old_fds = {5, 6}
    }
}

cmd = cmds[2] {
    fork => child
        there is a previous cmd {
            dup2(old_fds[0], 0)
                fds = {0: pipe2[0], 1: stdout, 5: pipe2[0], 6: pipe2[1]}
            close(old_fds[0])
                fds = {0: pipe2[0], 1: stdout, 6: pipe2[1]}
            close(old_fds[1])
                fds = {0: pipe2[0], 1: stdout}
        }
        exec(cmd)

    there is a previous cmd {
        close(old_fds[0])
            fds = {0: stdin, 1: stdout, 6: pipe2[1]}
        close(old_fds[1])
            fds = {0: stdin, 1: stdout}
    }
}
Edit
Your updated code does fix the previous FD leaks… but adds one: you're now leaking std0 to the children. As Jon says, this is probably not dangerous to most programs... but you still should write a better-behaved shell than this.
Even if it's temporary, I would strongly recommend against mangling your own shell's standard in/out/err (0/1/2), only doing so within the child right before exec. Why? Suppose you add some printf debugging in the middle, or you need to bail out due to an error condition. You'll be in trouble if you don't clean up your messed-up standard file descriptors first. Please, for the sake of having things operate as expected even in unexpected scenarios, don't muck with them until you need to.
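To make that advice concrete, here is a minimal sketch rather than a definitive implementation. The helper name `capture_child_stdout` is hypothetical, and the child writes a fixed string where a real shell would call exec. The point is only that the `dup2()` happens after `fork()`, inside the child, so the parent's descriptors 0/1/2 are never redirected.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child whose stdout is a pipe and returns how
 * many bytes the parent read from that pipe (or -1 on error). */
ssize_t capture_child_stdout(char *buf, size_t cap) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: rewire stdout here, right before the real work */
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);
        /* a real shell would exec the command at this point */
        if (write(STDOUT_FILENO, "hi\n", 3) != 3)
            _exit(1);
        _exit(0);
    }

    /* parent: its own stdin/stdout were never touched */
    close(fd[1]);
    ssize_t n = read(fd[0], buf, cap);
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Because the parent never redirected anything, a debugging printf or an early return in the parent needs no cleanup of the standard descriptors.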
Edit
As I mentioned in other comments, splitting it up into smaller parts makes it much easier to understand. This small helper should be easily understandable and bug-free:
/* cmd, argv: passed to exec
* fd_in, fd_out: when not -1, replaces stdin and stdout
* return: pid of fork+exec child
*/
int fork_and_exec_with_fds(char *cmd, char **argv, int fd_in, int fd_out) {
pid_t child = fork();
if (child) /* parent (or fork failure): nothing more to do here */
return child;
if (fd_in != -1 && fd_in != 0) {
dup2(fd_in, 0);
close(fd_in);
}
if (fd_out != -1 && fd_out != 1) {
dup2(fd_out, 1);
close(fd_out);
}
execvp(cmd, argv);
exit(-1);
}
As should this:
void run_pipeline(int num, char *cmds[], char **argvs[], int pids[]) {
/* initially, don't change stdin */
int fd_in = -1, fd_out;
int i;
for (i = 0; i < num; i++) {
int fd_pipe[2];
/* if there is a next command, set up a pipe for stdout */
if (i + 1 < num) {
pipe(fd_pipe);
fd_out = fd_pipe[1];
}
/* otherwise, don't change stdout */
else
fd_out = -1;
/* run child with given stdin/stdout */
pids[i] = fork_and_exec_with_fds(cmds[i], argvs[i], fd_in, fd_out);
/* nobody else needs to use these fds anymore
 * safe because close(-1) just fails with EBADF and changes nothing */
close(fd_in);
close(fd_out);
/* set up stdin for next command */
fd_in = fd_pipe[0];
}
}
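One possible way to drive a pipeline built like this and check that data actually flows through it is sketched below. The names `spawn_stage` and `run_pipeline_and_wait` are hypothetical, and `sh -c ...` is only a stand-in for real commands. This variant also closes the read end of the newly created pipe inside the child, since as far as I can tell the helper above would leave that descriptor open there.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, wire fd_in/fd_out (when not -1) to
 * stdin/stdout in the child only, close fd_unused, then exec argv. */
static pid_t spawn_stage(char **argv, int fd_in, int fd_out, int fd_unused) {
    pid_t pid = fork();
    if (pid != 0)
        return pid;                 /* parent, or -1 if fork failed */
    if (fd_unused != -1)
        close(fd_unused);           /* read end of the pipe we write to */
    if (fd_in != -1 && fd_in != STDIN_FILENO) {
        dup2(fd_in, STDIN_FILENO);
        close(fd_in);
    }
    if (fd_out != -1 && fd_out != STDOUT_FILENO) {
        dup2(fd_out, STDOUT_FILENO);
        close(fd_out);
    }
    execvp(argv[0], argv);
    _exit(127);                     /* exec failed */
}

/* Runs num stages connected by pipes, reaps every child, and returns the
 * exit status of the last stage (or -1 if it could not be determined). */
int run_pipeline_and_wait(int num, char **argvs[]) {
    int fd_in = -1, status, last_status = -1;
    pid_t last = -1, pid;

    for (int i = 0; i < num; i++) {
        int fd_pipe[2] = { -1, -1 };
        if (i + 1 < num && pipe(fd_pipe) == -1)
            break;
        last = spawn_stage(argvs[i], fd_in, fd_pipe[1], fd_pipe[0]);
        if (fd_in != -1)
            close(fd_in);
        if (fd_pipe[1] != -1)
            close(fd_pipe[1]);
        fd_in = fd_pipe[0];         /* stdin for the next stage */
    }
    if (fd_in != -1)
        close(fd_in);               /* only reached if the loop broke early */

    while ((pid = wait(&status)) > 0)
        if (pid == last && WIFEXITED(status))
            last_status = WEXITSTATUS(status);
    return last_status;
}
```

With a producer stage `sh -c "echo hello"` and a consumer stage `sh -c 'read x; test "$x" = hello'`, the consumer exits with status 0 only if the line really arrived through the pipe, which makes the wiring easy to check without capturing any output.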
You can see Bash's execute_cmd.c#execute_disk_command being called from execute_cmd.c#execute_pipeline, xsh's process.c#process_run being called from jobs.c#job_run, and even every single one of BusyBox's various small and minimal shells splits them up.
#2
The key problem is that you create a bunch of pipes and don't make sure that all the ends are closed properly. If you create a pipe, you get two file descriptors; if you fork, then you have four file descriptors. If you dup() or dup2() one end of the pipe to a standard descriptor, you need to close both ends of the pipe - at least one of the closes must be after the dup() or dup2() operation.
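The reason every copy matters can be checked inside a single process. The sketch below uses a hypothetical function name and a non-blocking read end so that it cannot hang. It duplicates the write end of a pipe and shows that the reader only sees end-of-file once the last copy of the write end has been closed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the read end reports "no data yet" (EAGAIN) while a duplicate
 * of the write end is still open, and EOF (read() == 0) once that duplicate
 * is closed too. Returns -1 on setup errors. */
int eof_needs_all_write_ends_closed(void) {
    int fd[2];
    char c;

    if (pipe(fd) == -1)
        return -1;
    int extra = dup(fd[1]);          /* second descriptor for the write end */
    if (extra == -1)
        return -1;
    /* non-blocking, so the first read returns instead of waiting forever */
    if (fcntl(fd[0], F_SETFL, O_NONBLOCK) == -1)
        return -1;

    close(fd[1]);                    /* one write end closed, one left */
    ssize_t n1 = read(fd[0], &c, 1);
    int still_waiting =
        (n1 == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(extra);                    /* last write end closed */
    ssize_t n2 = read(fd[0], &c, 1);
    int got_eof = (n2 == 0);

    close(fd[0]);
    return still_waiting && got_eof;
}
```

With a blocking descriptor, as in a real pipeline, the first read would simply wait forever, which is what a command sees when some process still holds a stray copy of the write end.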
Consider the file descriptors available to the first command (assuming there are at least two commands - something that should be handled in general, since no pipe() or I/O redirection is needed with just one command, but I recognize that the error handling is eliminated to keep the code suitable for SO):
std=dup(1); // Likely: std = 3
pipe(fd); // Likely: fd[0] = 4, fd[1] = 5
aux = fd[0];
dup2(fd[1], 1);
close(fd[1]); // Closes 5
if (fork() == 0) {
// Need to close: fd[0] aka aux = 4
// Need to close: std = 3
close(fd[0]);
close(std);
execlp(argv[i], argv[i], NULL);
exit(1);
}
Note that if fd[0] is not closed in the child, the child keeps a read end of the pipe it writes to, so it will never get SIGPIPE (or EPIPE) if the next command exits early; this is usually problematic. A stray write end is worse still: the reader on that pipe never gets EOF. The non-closure of std is less critical.
Revisiting amended code (as of 2009-06-03T20:52-07:00)...
Assume that the process starts with only file descriptors 0, 1, 2 (standard input, output, error) open. Also assume we have exactly 3 commands to process. As before, this code writes out the loop with annotations.
std0 = dup(0); // backup stdin - 3
std1 = dup(1); // backup stdout - 4
// Iteration 1 (i == 1)
// We have another command
pipe(fd); // fd[0] = 5; fd[1] = 6
aux = fd[0]; // aux = 5
dup2(fd[1], 1);
close(fd[1]); // 6 closed
// Not last command
if (fork() == 0) {
// Not last command
close(std1); // 4 closed
close(fd[0]); // 5 closed
// Minor problemette: 3 still open
execlp(argv[i], argv[i], NULL);
}
// Parent has open 3, 4, 5 - no problem
// Iteration 2 (i == 2)
// There was a previous command
dup2(aux, 0); // stdin now on read end of pipe
close(aux); // 5 closed
// We have another command
pipe(fd); // fd[0] = 5; fd[1] = 6
aux = fd[0];
dup2(fd[1], 1);
close(fd[1]); // 6 closed
// Not last command
if (fork() == 0) {
// Not last command
close(std1); // 4 closed
close(fd[0]); // 5 closed
// As before, 3 is still open - not a major problem
execlp(argv[i], argv[i], NULL);
}
// Parent has open 3, 4, 5 - no problem
// Iteration 3 (i == 3)
// We have a previous command
dup2(aux, 0); // stdin is now read end of pipe
close(aux); // 5 closed
// No more commands
// Last command - restore stdout...
dup2(std1, 1); // stdout is back where it started
close(std1); // 4 closed
if (fork() == 0) {
// Last command
// 3 still open
execlp(argv[i], argv[i], NULL);
}
// Parent has closed 4 when it should not have done so!!!
// End of loop
// restore stdin to be able to keep using the shell
dup2(std0, 0);
// 3 still open - as desired
So, all the children have the original standard input connected as file descriptor 3. This is not ideal, though it is not dreadfully traumatic; I'm hard pressed to find a circumstance where this would matter.
Closing file descriptor 4 in the parent is a mistake - the next iteration of 'read a command and process it' won't work because std1 is not initialized inside the loop.
Generally, this is close to correct - but not quite correct.
#3
It will give results, some of them unexpected. It is far from a nice solution: it messes with the parent process's standard descriptors, does not recover the standard input, leaks descriptors to children, etc.
If you think recursively, it may be easier to understand. Below is a correct solution, without error checking. Consider a linked-list type command, with its next pointer and an argv array.
void run_pipeline(command *cmd, int input) {
int pfds[2] = { -1, -1 };
if (cmd->next != NULL) {
pipe(pfds);
}
if (fork() == 0) { /* child */
if (input != -1) {
dup2(input, STDIN_FILENO);
close(input);
}
if (pfds[1] != -1) {
dup2(pfds[1], STDOUT_FILENO);
close(pfds[1]);
}
if (pfds[0] != -1) {
close(pfds[0]);
}
execvp(cmd->argv[0], cmd->argv);
exit(1);
}
else { /* parent */
if (input != -1) {
close(input);
}
if (pfds[1] != -1) {
close(pfds[1]);
}
if (cmd->next != NULL) {
run_pipeline(cmd->next, pfds[0]);
}
}
}
Call it with the first command in the linked list, and input = -1. It does the rest.
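A sketch of how such a call might look end to end, under a few assumptions: the exact layout of the `command` struct is a guess (only a next pointer and an argv array are mentioned above), the names `run_pipeline_rec` and `run_command_list` are hypothetical, and the recursive function is changed to return the pid of the last child so that its exit status can be collected.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed layout of the linked-list node described above. */
typedef struct command {
    char **argv;
    struct command *next;
} command;

/* Same recursion as above, but returns the pid of the last stage's child. */
static pid_t run_pipeline_rec(command *cmd, int input) {
    int pfds[2] = { -1, -1 };
    if (cmd->next != NULL && pipe(pfds) == -1) {
        if (input != -1)
            close(input);
        return -1;
    }
    pid_t pid = fork();
    if (pid == 0) { /* child: all redirection happens here */
        if (input != -1) {
            dup2(input, STDIN_FILENO);
            close(input);
        }
        if (pfds[1] != -1) {
            dup2(pfds[1], STDOUT_FILENO);
            close(pfds[1]);
        }
        if (pfds[0] != -1)
            close(pfds[0]);
        execvp(cmd->argv[0], cmd->argv);
        _exit(127);
    }
    /* parent: drop the ends it no longer needs, then recurse */
    if (input != -1)
        close(input);
    if (pfds[1] != -1)
        close(pfds[1]);
    if (cmd->next != NULL)
        return run_pipeline_rec(cmd->next, pfds[0]);
    return pid;
}

/* Starts the whole list, reaps every child, and returns the exit status of
 * the last command (or -1 if it could not be determined). */
int run_command_list(command *first) {
    pid_t last = run_pipeline_rec(first, -1), pid;
    int status, last_status = -1;
    while ((pid = wait(&status)) > 0)
        if (pid == last && WIFEXITED(status))
            last_status = WEXITSTATUS(status);
    return last_status;
}
```

Waiting only after the whole chain has been started lets all stages run concurrently, as they would in a real shell pipeline.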
#4
Both in this question and in another (linked in the first post), ephemient suggested a solution to the problem that doesn't mess with the parent's file descriptors, unlike the possible solution shown in this question.
I didn't get his solution. I tried and tried to understand it, but I can't seem to get it. I also tried to code it without understanding it, but it didn't work, probably because I failed to understand it correctly and wasn't able to code it the way it should have been coded.
Anyway, I tried to come up with my own solution using some of the things I understood from the pseudo code and came up with this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <string.h>
#include <strings.h> /* strcasecmp */
#include <readline/readline.h>
#include <readline/history.h>
#define NUMPIPES 5
#define NUMARGS 10
int main(int argc, char *argv[]) {
char *bBuffer, *sPtr, *aPtr = NULL, *pipeComms[NUMPIPES], *cmdArgs[NUMARGS];
int aPipe[2], bPipe[2], pCount, aCount, i, status;
pid_t pid;
using_history();
while(1) {
bBuffer = readline("\e[1;31mShell \e[1;32m# \e[0m");
if(!strcasecmp(bBuffer, "exit")) {
return 0;
}
if(strlen(bBuffer) > 0) {
add_history(bBuffer);
}
sPtr = bBuffer;
pCount =0;
do {
aPtr = strsep(&sPtr, "|");
if(aPtr != NULL) {
if(strlen(aPtr) > 0) {
pipeComms[pCount++] = aPtr;
}
}
} while(aPtr);
cmdArgs[pCount] = NULL;
for(i = 0; i < pCount; i++) {
aCount = 0;
do {
aPtr = strsep(&pipeComms[i], " ");
if(aPtr != NULL) {
if(strlen(aPtr) > 0) {
cmdArgs[aCount++] = aPtr;
}
}
} while(aPtr);
cmdArgs[aCount] = NULL;
// Do we have a next command?
if(i < pCount-1) {
// Is this the first, third, fifth, etc... command?
if(i%2 == 0) {
pipe(aPipe);
}
// Is this the second, fourth, sixth, etc... command?
if(i%2 == 1) {
pipe(bPipe);
}
}
pid = fork();
if(pid == 0) {
// Is this the first, third, fifth, etc... command?
if(i%2 == 0) {
// Do we have a previous command?
if(i > 0) {
close(bPipe[1]);
dup2(bPipe[0], STDIN_FILENO);
close(bPipe[0]);
}
// Do we have a next command?
if(i < pCount-1) {
close(aPipe[0]);
dup2(aPipe[1], STDOUT_FILENO);
close(aPipe[1]);
}
}
// Is this the second, fourth, sixth, etc... command?
if(i%2 == 1) {
// Do we have a previous command?
if(i > 0) {
close(aPipe[1]);
dup2(aPipe[0], STDIN_FILENO);
close(aPipe[0]);
}
// Do we have a next command?
if(i < pCount-1) {
close(bPipe[0]);
dup2(bPipe[1], STDOUT_FILENO);
close(bPipe[1]);
}
}
execvp(cmdArgs[0], cmdArgs);
exit(1);
} else {
// Do we have a previous command?
if(i > 0) {
// Is this the first, third, fifth, etc... command?
if(i%2 == 0) {
close(bPipe[0]);
close(bPipe[1]);
}
// Is this the second, fourth, sixth, etc... command?
if(i%2 == 1) {
close(aPipe[0]);
close(aPipe[1]);
}
}
// wait for the last command? all others will run in the background
if(i == pCount-1) {
waitpid(pid, &status, 0);
}
// I know they will be left as zombies in the table
// Not relevant for this...
}
}
}
return 0;
}
This may not be the best and cleanest solution, but it was something I could come up with and, most importantly, something I can understand. What good is it to have something working that I don't understand, if I'm then evaluated by my teacher and can't explain to him what the code is doing?
Anyway, what do you think about this one?
#5
This is my "final" code with ephemient's suggestions:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <string.h>
#include <strings.h> /* strcasecmp */
#include <readline/readline.h>
#include <readline/history.h>
#define NUMPIPES 5
#define NUMARGS 10
int main(int argc, char *argv[]) {
char *bBuffer, *sPtr, *aPtr = NULL, *pipeComms[NUMPIPES], *cmdArgs[NUMARGS];
int newPipe[2], oldPipe[2], pCount, aCount, i, status;
pid_t pid;
using_history();
while(1) {
bBuffer = readline("\e[1;31mShell \e[1;32m# \e[0m");
if(!strcasecmp(bBuffer, "exit")) {
return 0;
}
if(strlen(bBuffer) > 0) {
add_history(bBuffer);
}
sPtr = bBuffer;
pCount = -1;
do {
aPtr = strsep(&sPtr, "|");
if(aPtr != NULL) {
if(strlen(aPtr) > 0) {
pipeComms[++pCount] = aPtr;
}
}
} while(aPtr);
cmdArgs[++pCount] = NULL;
for(i = 0; i < pCount; i++) {
aCount = -1;
do {
aPtr = strsep(&pipeComms[i], " ");
if(aPtr != NULL) {
if(strlen(aPtr) > 0) {
cmdArgs[++aCount] = aPtr;
}
}
} while(aPtr);
cmdArgs[++aCount] = NULL;
// do we have a next command?
if(i < pCount-1) {
pipe(newPipe);
}
pid = fork();
if(pid == 0) {
// do we have a previous command?
if(i > 0) {
close(oldPipe[1]);
dup2(oldPipe[0], 0);
close(oldPipe[0]);
}
// do we have a next command?
if(i < pCount-1) {
close(newPipe[0]);
dup2(newPipe[1], 1);
close(newPipe[1]);
}
// execute command...
execvp(cmdArgs[0], cmdArgs);
exit(1);
} else {
// do we have a previous command?
if(i > 0) {
close(oldPipe[0]);
close(oldPipe[1]);
}
// do we have a next command?
if(i < pCount-1) {
oldPipe[0] = newPipe[0];
oldPipe[1] = newPipe[1];
}
// wait for last command process?
if(i == pCount-1) {
waitpid(pid, &status, 0);
}
}
}
}
return 0;
}
Is it ok now?