How do I write a non-greedy match in LEX / FLEX?

Time: 2021-12-30 09:43:50

I'm trying to parse a legacy language (which is similar to 'C') using FLEX and BISON. Everything is working nicely except for matching strings.

This rather odd legacy language doesn't support quoting characters in string literals, so the following are all valid string literals:

"hello"
""
"\"

I'm using the following rule to match string literals:

\".*\"            { yylval.strval = _strdup( yytext ); return LIT_STRING; }

Unfortunately this is a greedy match, so it matches code like the following:

"hello", "world"

As a single string (hello", "world).

The usual non-greedy quantifier .*? doesn't seem to work in FLEX. Any ideas?

2 Answers

#1


11  

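Just disallow quote characters between the opening and closing quotes. flex has no non-greedy quantifier, but a negated character class stops the match at the first closing quote:
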
\"[^"]*\"
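To make it easier to see why the negated character class behaves like a non-greedy match, here is a minimal C sketch that imitates the pattern by hand. It is not how flex works internally (flex compiles patterns to a DFA), and the function name `match_string_literal` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written imitation of the flex pattern  \"[^"]*\"
 * Returns the length of the string literal that starts at s, counting both
 * quotes, or 0 if s does not start with a complete literal.
 * (Hypothetical helper, for illustration only.) */
size_t match_string_literal(const char *s)
{
    assert(s != NULL);

    if (s[0] != '"')
        return 0;                 /* must begin with an opening quote */

    size_t i = 1;
    /* [^"]* : consume anything that is not a quote (newlines included) */
    while (s[i] != '\0' && s[i] != '"')
        i++;

    if (s[i] != '"')
        return 0;                 /* input ended before a closing quote */

    return i + 1;                 /* include the closing quote */
}
```

Applied to the input `"hello", "world"` from the question, the scan stops at the first closing quote, so only `"hello"` is matched. The literal `"\"` is also accepted, because the backslash is treated as an ordinary character.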

#2


4  

Backslash-escaped quotes

The following rule also allows backslash-escaped quotes inside the string:

\"(\\.|[^\n"\\])*\" {
        fprintf( yyout, "STRING: %s\n", yytext );
    }

It also disallows newlines inside string constants.

E.g.:

>>> "a\"b""c\d"""
STRING: "a\"b"
STRING: "c\d"
STRING: ""

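It fails to match the following input, which the question lists as a valid literal, because the backslash consumes the closing quote:
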
>>> "\"
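
The behaviour shown above can be reproduced with a small hand-written C imitation of the pattern. This is only a sketch to make the two alternatives in the rule explicit; the function name `match_escaped_literal` is hypothetical, and flex itself does not scan this way internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written imitation of the flex pattern  \"(\\.|[^\n"\\])*\"
 * Returns the length of the literal starting at s, counting both quotes,
 * or 0 if there is no complete literal on the current line.
 * (Hypothetical helper, for illustration only.) */
size_t match_escaped_literal(const char *s)
{
    assert(s != NULL);

    if (s[0] != '"')
        return 0;                       /* must begin with an opening quote */

    size_t i = 1;
    for (;;) {
        char c = s[i];
        if (c == '"')
            return i + 1;               /* unescaped quote closes the literal */
        if (c == '\0' || c == '\n')
            return 0;                   /* unterminated on this line */
        if (c == '\\') {
            /* \\. : a backslash plus any single character except newline */
            if (s[i + 1] == '\0' || s[i + 1] == '\n')
                return 0;
            i += 2;
        } else {
            i += 1;                     /* [^\n"\\] : any ordinary character */
        }
    }
}
```

Called repeatedly on `"a\"b""c\d"""`, it yields lengths 6, 5 and 2, which correspond to the three STRING lines above. For `"\"` it returns 0, since the backslash pairs with the second quote and no closing quote remains.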

When implementing such C-like features, make sure to look for existing Lex implementations, e.g.: http://www.lysator.liu.se/c/ANSI-C-grammar-l.html
