A Levenshtein implementation in C# and F#. The C# version is roughly 10 times faster for two strings of about 1500 chars: C# takes 69 ms, F# takes 867 ms. Why? As far as I can tell, they do exactly the same thing. It doesn't matter whether it is a Release or a Debug build.
EDIT: If anyone comes here looking specifically for the Edit Distance implementation, it is broken. Working code is here.
C#:
private static int min3(int a, int b, int c)
{
    return Math.Min(Math.Min(a, b), c);
}

public static int EditDistance(string m, string n)
{
    var d1 = new int[n.Length];
    for (int x = 0; x < d1.Length; x++) d1[x] = x;
    var d0 = new int[n.Length];
    for (int i = 1; i < m.Length; i++)
    {
        d0[0] = i;
        var ui = m[i];
        for (int j = 1; j < n.Length; j++)
        {
            d0[j] = 1 + min3(d1[j], d0[j - 1], d1[j - 1] + (ui == n[j] ? -1 : 0));
        }
        Array.Copy(d0, d1, d1.Length);
    }
    return d0[n.Length - 1];
}
F#:
let min3(a, b, c) = min a (min b c)

let levenshtein (m:string) (n:string) =
    let d1 = Array.init n.Length id
    let d0 = Array.create n.Length 0
    for i=1 to m.Length-1 do
        d0.[0] <- i
        let ui = m.[i]
        for j=1 to n.Length-1 do
            d0.[j] <- 1 + min3(d1.[j], d0.[j-1], d1.[j-1] + if ui = n.[j] then -1 else 0)
        Array.blit d0 0 d1 0 n.Length
    d0.[n.Length-1]
1 answer
The problem is that the min3 function is compiled as a generic function that uses generic comparison (I thought this uses just IComparable, but it is actually more complicated: it would use structural comparison for F# types, and the logic is fairly complex).
> let min3(a, b, c) = min a (min b c);;
val min3 : 'a * 'a * 'a -> 'a when 'a : comparison
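To make the "generic" part concrete, here is a small sketch (the value names are just for illustration). Because min3 is inferred with a comparison constraint rather than as an int function, the very same compiled function accepts ints, strings, and tuples, so the comparison inside it has to go through the general-purpose comparison path instead of a plain integer compare:

```fsharp
// Without annotations, min3 is generalized to any type supporting comparison.
let min3 (a, b, c) = min a (min b c)

// One definition, three different argument types.
let smallestInt = min3 (3, 1, 2)
let smallestString = min3 ("pear", "apple", "fig")
let smallestTuple = min3 ((2, "a"), (1, "z"), (1, "b"))

printfn "%d %s %A" smallestInt smallestString smallestTuple
```

That flexibility is convenient, but in a tight inner loop like the one in levenshtein it is presumably where the extra time goes.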
In the C# version, the function is not generic (it just takes int). You can improve the F# version by adding type annotations (to get the same thing as in C#):
let min3(a:int, b, c) = min a (min b c)
...or by marking min3 as inline (in which case it will be specialized to int when used):
let inline min3(a, b, c) = min a (min b c);;
For a random string str of length 300, I get the following numbers:
> levenshtein str ("foo" + str);;
Real: 00:00:03.938, CPU: 00:00:03.900, GC gen0: 275, gen1: 1, gen2: 0
val it : int = 3
> levenshtein_inlined str ("foo" + str);;
Real: 00:00:00.068, CPU: 00:00:00.078, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 3
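The definition of levenshtein_inlined is not shown above. A minimal sketch, assuming it is simply the question's function calling the inline min3, might look like the following. The randomString helper, the seed, and the Stopwatch timing are my own additions for a script-style measurement (in F# Interactive, #time gives the Real/CPU/GC output instead). The algorithm is copied unchanged from the question, so it carries the same bug mentioned in the EDIT and should only be used for comparing speed:

```fsharp
open System
open System.Diagnostics

// inline lets the compiler specialize min3 to int at each call site.
let inline min3 (a, b, c) = min a (min b c)

// Assumed definition: the question's levenshtein, unchanged apart from
// calling the inline min3 (not a correct edit distance, see the EDIT).
let levenshtein_inlined (m: string) (n: string) =
    let d1 = Array.init n.Length id
    let d0 = Array.create n.Length 0
    for i = 1 to m.Length - 1 do
        d0.[0] <- i
        let ui = m.[i]
        for j = 1 to n.Length - 1 do
            d0.[j] <- 1 + min3 (d1.[j], d0.[j - 1], d1.[j - 1] + (if ui = n.[j] then -1 else 0))
        Array.blit d0 0 d1 0 n.Length
    d0.[n.Length - 1]

// Hypothetical helper: a random lowercase string of the given length.
let randomString (rng: Random) (length: int) =
    System.String(Array.init length (fun _ -> char (int 'a' + rng.Next(26))))

let str = randomString (Random(42)) 300

let sw = Stopwatch.StartNew()
let result = levenshtein_inlined str ("foo" + str)
sw.Stop()
printfn "result = %d, elapsed = %d ms" result sw.ElapsedMilliseconds
```

Swapping the inline min3 for the non-inline generic one in the same script should reproduce the gap, though the exact numbers will depend on the machine and runtime.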