Consider the function call (calling int sum(int, int)):
printf("%d", sum(a,b));
How does the compiler decide that the , used in the function call sum(a,b) is not a comma operator?
NOTE: I didn't want to actually use the comma operator in the function call. I just wanted to know how the compiler knows that it is not a comma operator.
6 answers
#1
48
Look at the grammar for the C language. It's listed, in full, in Annex A of the standard. The way it works is that you can step through each token in a C program and match it up with the next item in the grammar. At each step you have only a limited number of options, so the interpretation of any given token depends on the context in which it appears. Inside each rule in the grammar, each line gives a valid alternative for the program to match.
Specifically, if you look for argument-expression-list (the rule used for function calls; parameter-list, used in declarations, has an explicit comma in the same way), you will see that it contains an explicit comma. Therefore, whenever the compiler's C parser is in "argument list" mode, commas that it finds will be understood as argument separators, not as comma operators. The same is true for brackets (which can also occur in expressions).
This works because the argument-expression-list rule (the list of arguments in a call) is careful to use assignment-expression rules, rather than just the plain expression rule. An expression can contain commas at its top level, whereas an assignment-expression cannot. If this were not the case the grammar would be ambiguous, and the compiler would not know what to do when it encountered a comma inside an argument list.
However, an opening bracket that is not part of a function definition or call, or of an if, while, or for statement, will be interpreted as part of an expression (because there's no other option, and only if the start of an expression is a valid choice at that point). Inside the brackets the expression syntax rules then apply, and those allow comma operators.
#2
26
From C99 6.5.17:
As indicated by the syntax, the comma operator (as described in this subclause) cannot appear in contexts where a comma is used to separate items in a list (such as arguments to functions or lists of initializers). On the other hand, it can be used within a parenthesized expression or within the second expression of a conditional operator in such contexts. In the function call
f(a, (t=3, t+2), c)
the function has three arguments, the second of which has the value 5.
Another similar example is the initializer list of arrays or structs:
int array[5] = {1, 2};
struct Foo bar = {1, 2};
If you did want a comma operator inside a function argument, you would parenthesize it like this:
sum((a,b))
This won't compile, of course, because sum then receives one argument instead of two.
#3
19
The reason is the C grammar. While everyone else seems to like to cite the example, the real deal is the phrase structure grammar for function calls in the Standard (C99). Yes, a function call consists of the () operator applied to a postfix expression (for example, an identifier):
6.5.2 postfix-expression:
...
postfix-expression ( argument-expression-list_opt )
together with
argument-expression-list:
assignment-expression
argument-expression-list , assignment-expression <-- arglist comma
expression:
assignment-expression
expression , assignment-expression <-- comma operator
The comma operator can only occur in an expression, i.e. further down in the grammar. So the compiler treats a comma in a function argument list as one separating assignment-expressions, not as one separating expressions.
#4
11
Existing answers say "because the C language spec says it's a list separator, and not an operator".
However, your question is asking "how does the compiler know...", and that's altogether different. It's really no different from how the compiler knows that the comma in printf("Hello, world\n"); isn't a comma operator: the compiler 'knows' because of the context where the comma appears - basically, what's gone before.
The C 'language' can be described in Backus-Naur Form (BNF) - essentially, a set of rules that the compiler's parser uses to scan your input file. The BNF for C distinguishes between these different possible occurrences of commas in the language.
There are lots of good resources on how compilers work, and how to write one.
#5
6
The C99 draft standard says:
As indicated by the syntax, the comma operator (as described in this subclause) cannot appear in contexts where a comma is used to separate items in a list (such as arguments to functions or lists of initializers). On the other hand, it can be used within a parenthesized expression or within the second expression of a conditional operator in such contexts. In the function call
f(a, (t=3, t+2), c)
the function has three arguments, the second of which has the value 5.
In other words, "because".
#6
1
There are multiple facets to this question. One part is that the definition says so. Well, how does the compiler know what context this comma is in? That's the parser's job. For C in particular, the language can be parsed by an LR(1) parser (http://en.wikipedia.org/wiki/Canonical_LR_parser).
The way this works is that a parser generator produces a set of tables describing the possible states of the parser. Only a certain set of symbols is valid in a given state, and a symbol may have a different meaning in different states. The parser knows that it is parsing a function call because of the preceding symbols. Thus, it knows that the possible states do not include the comma operator.
I am being very general here, but you can read all about the details in the Wiki.