In calculate if two arbitrary regular expressions have any overlapping solutions (assuming it's possible).
在计算两个任意正则表达式是否有任何重叠的解决方案(假设它是可能的)。
For example these two regular expressions can be shown to have no intersections by brute force because the two solution sets are calculable because it's finite.
例如,这两个正则表达式可以通过强力显示没有交叉点,因为这两个解决方案集是可计算的,因为它是有限的。
^1(11){0,1000}$ ∩ ^(11){0,1000}$ = {}
{1,111, ..., ..111} ∩ {11,1111, ..., ...11} = {}
{} = {}
But replacing the {0,1000}
by *
remove the possibility for a brute force solution, so a smarter algorithm must be created.
但是用* 0替换{0,1000}消除了强力解决方案的可能性,因此必须创建更智能的算法。
^1(11)*$ ∩ ^(11)*$ = {}
{1,^1(11)*$} ∩ {^(11)*$} = {}
{1,^1(11)*$} ∩ {11,^11(11)*$} = {}
{1,111,^111(11)*$} ∩ {11,^(11)*$} = {}
.....
In another similar question one answer was to calculate the intersection regex. Is that possible to do? If so how would one write an algorithm to do such a thing?
在另一个类似的问题中,一个答案是计算交集正则表达式。这可能吗?如果是这样,怎么会写一个算法来做这样的事情呢?
I think this problem might be domain of the
halting problem.
我认为这个问题可能是暂停问题的结果。
EDIT:
编辑:
I've used the accepted solution to create the DFAs for the example problem. It's fairly easy to see how you can use a BFS or DFS on the graph of states for M_3
to determine if a final state from M_3
is reachable.
我已经使用已接受的解决方案为示例问题创建了DFA。很容易看出如何在M_3的状态图上使用BFS或DFS来确定M_3的最终状态是否可达。
2 个解决方案
#1
17
It is not in the domain of the halting problem; deciding whether the intersection of regular languages is empty or not can be solved as follows:
它不在停止问题的范围内;判断常规语言的交集是否为空可以解决如下:
- Construct a DFA M1 for the first language.
- 为第一语言构建DFA M1。
- Construct a DFA M2 for the second language. Hint: Kleene's Theorem and Power Set machine construction
- 为第二语言构建DFA M2。提示:Kleene的定理和动力装置机器结构
- Construct a DFA M3 for M1 intersect M2. Hint: Cartesian Product Machine construction
- 为M1相交M2构造DFA M3。提示:笛卡尔产品机器结构
- Determine whether L(M3) is empty. Hint: If M3 has n states, and M3 doesn't accept any strings of length no greater than n, then L(M3) is empty... why?
- 确定L(M3)是否为空。提示:如果M3有n个状态,并且M3不接受任何长度不大于n的字符串,则L(M3)为空......为什么?
Each of those things can be algorithmically done and/or checked. Also, naturally, once you have a DFA recognizing the intersection of your languages, you can construct a regex to match the language. And if you start out with a regex, you can make a DFA. This is definitely computable.
这些事物中的每一个都可以通过算法完成和/或检查。此外,当然,一旦您有DFA识别您的语言的交集,您就可以构建一个与该语言匹配的正则表达式。如果你从正则表达式开始,你可以制作DFA。这绝对是可计算的。
EDIT:
编辑:
So to build a Cartesian Product Machine, you need two DFAs. Let M1 = (E, q0, Q1, A1, f1) and M2 = (E, q0', Q2, A2, f2). In both cases, E is the input alphabet, q0 is the start state, Q is the set of all states, A is the set of accepting states, and f is the transition function. Construct M3 where...
因此,要构建笛卡尔积机,需要两个DFA。设M1 =(E,q0,Q1,A1,f1),M2 =(E,q0',Q2,A2,f2)。在这两种情况下,E是输入字母,q0是开始状态,Q是所有状态的集合,A是接受状态的集合,f是转换函数。构建M3 ......
- E3 = E
- E3 = E.
- Q3 = Q1 x Q2 (ordered pairs)
- Q3 = Q1 x Q2(有序对)
- q0'' = (q0, q0')
- q0''=(q0,q0')
- A3 = {(x, y) | x in A1 and y in A2}
- A3 = {(x,y)| A1中的x和A2中的y}
- f3(s, (x, y)) = (f1(s, x), f2(s, y))
- f3(s,(x,y))=(f1(s,x),f2(s,y))
Provided I didn't make any mistakes, L(M3) = L(M1) intersect L(M2). Neat, huh?
如果我没有犯任何错误,则L(M3)= L(M1)与L(M2)相交。整洁,对吧?
#2
2
I've created a PHP implementation of Patrick87 answer. In addition to implementing the Intersection via Cartesian Product Machine, I've also implemented an alterative algorithm for finding Intersections of DFAs using De Morgan.
我已经创建了Patrick87答案的PHP实现。除了通过笛卡尔积分机实现交叉口之外,我还使用De Morgan实现了一种用于查找DFA交叉点的替代算法。
Intersection( DFA_1, DFA_2 ) === ! UNION( ! DFA_1, ! DFA_2 )
* ! is defined as negation
This works very well for DFAs as the negation of a fully defined DFA (those with every possible transition state defined) is just to add all non-final states to the final state set and remove all current final states from the final state set (non-final -> final, final -> non->final). Union of DFA can be done easily by turning them into a NFA and then creating a new starting node that connects the unioned DFA's old start nodes by lambda transforms.
这对于DFA非常有效,因为完全定义的DFA(定义了每个可能的转换状态的那些)的否定只是将所有非最终状态添加到最终状态集并从最终状态集中移除所有当前最终状态(非-final - > final,final - > non> final)。通过将它们转换为NFA,然后创建一个新的起始节点,通过lambda变换连接联合DFA的旧起始节点,可以轻松完成DFA联合。
In addition to solving the intersection problem, the library I created is also able to determinize a NFA to a DFA and convert Regex to NFA.
除了解决交集问题之外,我创建的库还能够将NFA确定为DFA并将Regex转换为NFA。
EDIT:
编辑:
I have created a webapp that allows this sort of transformations on regex languagues using what I learned form this question (and others).
我已经创建了一个webapp,允许使用我从这个问题(和其他人)学到的东西对正则表达式语言进行这种转换。
#1
17
It is not in the domain of the halting problem; deciding whether the intersection of regular languages is empty or not can be solved as follows:
它不在停止问题的范围内;判断常规语言的交集是否为空可以解决如下:
- Construct a DFA M1 for the first language.
- 为第一语言构建DFA M1。
- Construct a DFA M2 for the second language. Hint: Kleene's Theorem and Power Set machine construction
- 为第二语言构建DFA M2。提示:Kleene的定理和动力装置机器结构
- Construct a DFA M3 for M1 intersect M2. Hint: Cartesian Product Machine construction
- 为M1相交M2构造DFA M3。提示:笛卡尔产品机器结构
- Determine whether L(M3) is empty. Hint: If M3 has n states, and M3 doesn't accept any strings of length no greater than n, then L(M3) is empty... why?
- 确定L(M3)是否为空。提示:如果M3有n个状态,并且M3不接受任何长度不大于n的字符串,则L(M3)为空......为什么?
Each of those things can be algorithmically done and/or checked. Also, naturally, once you have a DFA recognizing the intersection of your languages, you can construct a regex to match the language. And if you start out with a regex, you can make a DFA. This is definitely computable.
这些事物中的每一个都可以通过算法完成和/或检查。此外,当然,一旦您有DFA识别您的语言的交集,您就可以构建一个与该语言匹配的正则表达式。如果你从正则表达式开始,你可以制作DFA。这绝对是可计算的。
EDIT:
编辑:
So to build a Cartesian Product Machine, you need two DFAs. Let M1 = (E, q0, Q1, A1, f1) and M2 = (E, q0', Q2, A2, f2). In both cases, E is the input alphabet, q0 is the start state, Q is the set of all states, A is the set of accepting states, and f is the transition function. Construct M3 where...
因此,要构建笛卡尔积机,需要两个DFA。设M1 =(E,q0,Q1,A1,f1),M2 =(E,q0',Q2,A2,f2)。在这两种情况下,E是输入字母,q0是开始状态,Q是所有状态的集合,A是接受状态的集合,f是转换函数。构建M3 ......
- E3 = E
- E3 = E.
- Q3 = Q1 x Q2 (ordered pairs)
- Q3 = Q1 x Q2(有序对)
- q0'' = (q0, q0')
- q0''=(q0,q0')
- A3 = {(x, y) | x in A1 and y in A2}
- A3 = {(x,y)| A1中的x和A2中的y}
- f3(s, (x, y)) = (f1(s, x), f2(s, y))
- f3(s,(x,y))=(f1(s,x),f2(s,y))
Provided I didn't make any mistakes, L(M3) = L(M1) intersect L(M2). Neat, huh?
如果我没有犯任何错误,则L(M3)= L(M1)与L(M2)相交。整洁,对吧?
#2
2
I've created a PHP implementation of Patrick87 answer. In addition to implementing the Intersection via Cartesian Product Machine, I've also implemented an alterative algorithm for finding Intersections of DFAs using De Morgan.
我已经创建了Patrick87答案的PHP实现。除了通过笛卡尔积分机实现交叉口之外,我还使用De Morgan实现了一种用于查找DFA交叉点的替代算法。
Intersection( DFA_1, DFA_2 ) === ! UNION( ! DFA_1, ! DFA_2 )
* ! is defined as negation
This works very well for DFAs as the negation of a fully defined DFA (those with every possible transition state defined) is just to add all non-final states to the final state set and remove all current final states from the final state set (non-final -> final, final -> non->final). Union of DFA can be done easily by turning them into a NFA and then creating a new starting node that connects the unioned DFA's old start nodes by lambda transforms.
这对于DFA非常有效,因为完全定义的DFA(定义了每个可能的转换状态的那些)的否定只是将所有非最终状态添加到最终状态集并从最终状态集中移除所有当前最终状态(非-final - > final,final - > non> final)。通过将它们转换为NFA,然后创建一个新的起始节点,通过lambda变换连接联合DFA的旧起始节点,可以轻松完成DFA联合。
In addition to solving the intersection problem, the library I created is also able to determinize a NFA to a DFA and convert Regex to NFA.
除了解决交集问题之外,我创建的库还能够将NFA确定为DFA并将Regex转换为NFA。
EDIT:
编辑:
I have created a webapp that allows this sort of transformations on regex languagues using what I learned form this question (and others).
我已经创建了一个webapp,允许使用我从这个问题(和其他人)学到的东西对正则表达式语言进行这种转换。