In flex, I want to return multiple tokens for one match of a regular expression. Is there a way to do this?
3 Answers
#1
2
The way I've been doing this is to keep a queue of to-be-returned tokens and, at the beginning of yylex(), check the queue and return any pending token before scanning further.
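A minimal sketch of that queue idea in plain C, not the answerer's actual code. Everything named here (the `TOK_*` codes, `queue_push`, `queue_pop`, `next_token`, `scanner_set_input`) is hypothetical, and `next_token()` is a hand-written stand-in for yylex() so the example runs without flex. It treats "()" as a single match that has to produce two tokens: the first is returned immediately and the second is queued for the next call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; 0 means end of input, as with yylex(). */
enum { TOK_EOF = 0, TOK_LEFT_PAREN = 4711, TOK_RIGHT_PAREN = 4712, TOK_NUMBER = 4713 };

#define QUEUE_CAP 16

/* Fixed-size ring buffer of tokens waiting to be returned. */
static int queue[QUEUE_CAP];
static size_t q_head = 0;   /* index of the next token to hand out */
static size_t q_count = 0;  /* number of tokens currently queued */

int queue_empty(void) { return q_count == 0; }

void queue_push(int tok) {
    assert(q_count < QUEUE_CAP);              /* sketch: no growth, just a guard */
    queue[(q_head + q_count) % QUEUE_CAP] = tok;
    q_count++;
}

int queue_pop(void) {
    assert(q_count > 0);
    int tok = queue[q_head];
    q_head = (q_head + 1) % QUEUE_CAP;
    q_count--;
    return tok;
}

/* Stand-in input cursor; a real flex scanner manages its own buffer. */
static const char *cursor = "";

void scanner_set_input(const char *s) {
    cursor = s;
    q_head = 0;
    q_count = 0;
}

/* Stand-in for yylex(): drain the queue first, then match new input. */
int next_token(void) {
    /* Step 1: return any token left over from an earlier match. */
    if (!queue_empty())
        return queue_pop();

    /* Step 2: match input (hand-written here in place of flex patterns). */
    while (*cursor == ' ')
        cursor++;
    if (cursor[0] == '(' && cursor[1] == ')') {
        cursor += 2;
        queue_push(TOK_RIGHT_PAREN);          /* handed out on the next call */
        return TOK_LEFT_PAREN;
    }
    if (*cursor >= '0' && *cursor <= '9') {
        while (*cursor >= '0' && *cursor <= '9')
            cursor++;
        return TOK_NUMBER;
    }
    return TOK_EOF;                           /* end of input or unrecognized text */
}
```

In a real flex specification, the "drain the queue" check would go in the code placed before the first rule in the rules section, since flex runs that code every time yylex() is entered. The end-of-input path should also drain the queue before returning 0, otherwise tokens queued by the last match would be lost.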
#2
0
Do you mean all matches? Are you using regex functions or string functions? Use the global flag.
As for flex, I don't think you can do that. You test for a match with one pattern at a time so that's probably out of scope. Why'd you want that? As an optimization? Scoping issues?
#3
-1
Usually this is handled by a parser on top of the scanner, which gives you much cleaner code. You can emulate that to some degree with start conditions (states):
%option noyywrap
%top{
#define TOKEN_LEFT_PAREN 4711
#define TOKEN_RIGHT_PAREN 4712
#define TOKEN_NUMBER 4713
}
%x PAREN_STATE
%%
"("             { BEGIN(PAREN_STATE); return TOKEN_LEFT_PAREN; }
<PAREN_STATE>{
    [0-9]+      return TOKEN_NUMBER;
    ")"         { BEGIN(INITIAL); return TOKEN_RIGHT_PAREN; }
    .|\n        /* maybe signal syntax error here */
}
%%
int main (int argc, char *argv [])
{
    int i;
    /* yylex() returns 0 at end of input */
    while ((i = yylex ()))
        printf ("%d\n", i);
    return 0;
}
but this will get very messy as soon as your grammar gets more complex.