Recently I have been trying to write a regex interpreter in Haskell. I created a new data type with constructors for every construct (sequence, *, ^, intervals, etc.) and then defined a matcher function. The matcher works well, but my problem is converting the input (a String, for example "a(b*)(c|d)ef") into my data type ("Seq (Sym a) (Seq (Rep Sym b) (Seq (Or Sym c Sym d) Sym ef))"). This is the part I am having trouble with (I tried creating another data type, a parse tree, but failed completely). Any ideas on how I could solve it?
2 answers
#1
8
The canonical approach is to use a parser combinator library, such as Parsec. Parser combinator libraries (like parser generators) let you write a description of your grammar and get back a parser that turns strings into structured values, such as your regex data type.
You simply have to encode your grammar as a Parsec function.
As an example, see this previous SO question: Using Parsec to parse regular expressions
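To make the idea concrete, here is a minimal sketch of what encoding such a grammar with Parsec might look like. It is not the asker's actual code: the Regex type below is an assumption modelled on the constructors named in the question (Sym, Seq, Or, Rep), Sym is assumed to hold a single Char, and the names regex, term, factor, atom and parseRegex are made up for illustration. Only sequence, alternation, grouping and * are handled; ^ and intervals are left out.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST, loosely following the constructors in the question.
data Regex
  = Sym Char          -- a single literal character
  | Seq Regex Regex   -- r1 followed by r2
  | Or Regex Regex    -- r1 | r2
  | Rep Regex         -- r*
  deriving (Show, Eq)

-- regex ::= term ('|' term)*
-- Alternation binds loosest, so it sits at the top of the grammar.
regex :: Parser Regex
regex = chainl1 term (Or <$ char '|')

-- term ::= factor+
-- A run of factors is folded into right-nested Seq nodes.
term :: Parser Regex
term = foldr1 Seq <$> many1 factor

-- factor ::= atom '*'*
-- Each trailing star wraps the atom in one more Rep.
factor :: Parser Regex
factor = do
  a     <- atom
  stars <- many (char '*')
  return (foldl (\r _ -> Rep r) a stars)

-- atom ::= '(' regex ')' | any character that is not an operator
atom :: Parser Regex
atom = between (char '(') (char ')') regex
   <|> Sym <$> noneOf "()|*"

-- Parse a whole string, failing if any input is left over.
parseRegex :: String -> Either ParseError Regex
parseRegex = parse (regex <* eof) "regex"
```

With these definitions, parseRegex "a(b*)(c|d)ef" should return Right (Seq (Sym 'a') (Seq (Rep (Sym 'b')) (Seq (Or (Sym 'c') (Sym 'd')) (Seq (Sym 'e') (Sym 'f'))))), which is close to the shape given in the question except that each Sym carries one character. The layering of regex, term and factor is one common way to express operator precedence in a combinator grammar; other arrangements are possible.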
#2
4
That's an interesting article (a play) on the implementation of regular expressions:
A Play on Regular Expressions