I have a fairly complicated mathematical function that I've been advised should be implemented as a User Defined Function in SQL Server so that it can be used efficiently from within a SQL query.
The problem is that it must be very efficient as it may be executed thousands of times per second, and I subsequently heard that UDFs are very inefficient.
Someone suggested that I could implement the function in C# instead, and that this would be much more efficient.
What should I do?
4 Solutions
#1
2
Complex mathematical functions will execute much more quickly in C# than T-SQL. It's well worth trying.
Edit: In answer to your comment, I found this blog post, which states:
User-defined scalar functions are likely the most obvious candidates for SQLCLR usage. There are two primary reasons for this. The first is that CLR functions actually have lower invocation overhead than T-SQL functions. T-SQL functions require the runtime to create a new T-SQL frame, which is expensive. CLR functions are embedded in the plan as a function pointer for direct execution.
And, as the conclusion of a single test:
In the case of this prime number validating function the performance advantage of SQLCLR over T-SQL is an order of magnitude in speedup!
So, C# sounds like the way to go.
#2
1
I would do both, measure the times and stick with the one with better performance.
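A rough way to run that comparison from T-SQL is sketched below. It is only an illustration: `dbo.MyMathTsql`, `dbo.MyMathClr`, and the `dbo.TestValues` table with its `Value` column are hypothetical names standing in for your two implementations and a representative data set.

```sql
-- Hypothetical objects: dbo.MyMathTsql (T-SQL scalar UDF), dbo.MyMathClr
-- (SQLCLR scalar function), and dbo.TestValues(Value float) as sample data.
DECLARE @start datetime2 = SYSDATETIME();
DECLARE @sink float;

-- MAX() forces the function to be evaluated for every row
-- without sending a large result set back to the client.
SELECT @sink = MAX(dbo.MyMathTsql(t.Value))
FROM dbo.TestValues AS t;
SELECT DATEDIFF(millisecond, @start, SYSDATETIME()) AS TsqlUdfMs;

SET @start = SYSDATETIME();
SELECT @sink = MAX(dbo.MyMathClr(t.Value))
FROM dbo.TestValues AS t;
SELECT DATEDIFF(millisecond, @start, SYSDATETIME()) AS ClrUdfMs;
```

Run each variant several times and on a realistic row count, since the first execution includes compilation and cache warm-up.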
#3
1
Standard (scalar) T-SQL UDFs in SQL Server can be inefficient because each call is executed as a separate invocation, typically once per row, rather than being included in the query plan for the entire SQL statement executed by the query processor.
This is because a standard UDF defines a processing algorithm that cannot be "folded" into the overall query plan that the outer SQL statement is executing.
An inline table-valued user-defined function, on the other hand, is simply defined as a single SQL statement, so it can be folded into the overall query plan and is therefore very fast.
For a complex math function this is probably not possible, or at minimum will be very difficult, but if your function can be written as an inline UDF, that would definitely be the way to go. Otherwise, you're probably better off doing it in code.
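To make the inline option concrete, here is a minimal sketch of what such a function could look like. The names `dbo.HypotenuseInline` and `dbo.Points` are hypothetical, and the square-root expression is just a stand-in for your real formula, which would have to fit in a single `SELECT`.

```sql
-- Hypothetical inline table-valued function: the body is one SELECT,
-- so the optimizer can expand it into the calling query's plan.
CREATE FUNCTION dbo.HypotenuseInline (@x float, @y float)
RETURNS TABLE
AS
RETURN
(
    SELECT SQRT(@x * @x + @y * @y) AS Result
);
GO

-- Usage against a hypothetical table dbo.Points(X float, Y float):
-- CROSS APPLY evaluates the function for each row of the outer query.
SELECT p.X, p.Y, h.Result
FROM dbo.Points AS p
CROSS APPLY dbo.HypotenuseInline(p.X, p.Y) AS h;
```

If the formula needs loops, variables, or branching that cannot be expressed as one statement, it will not fit this shape.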
#4
0
As @otavio says, you need to obtain measurements before deciding. However, you should also consider the function itself as part of a larger set of optimisations. For example, are you running this function many times on the same data, or could you run it only on an as-needed basis? Can you store the results? And are you able to write a bigger function that operates over a set of data, rather than calling it many times?
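As one possible illustration of storing the results, the sketch below computes each distinct input once and then joins to the stored values. All object names (`dbo.MathResultCache`, `dbo.MyMath`, `dbo.TestValues`) are hypothetical, and it assumes the function is deterministic, so a stored result never goes stale.

```sql
-- Hypothetical cache table: one row per distinct input value.
CREATE TABLE dbo.MathResultCache
(
    InputValue float NOT NULL PRIMARY KEY,
    Result     float NOT NULL
);

-- Call the (hypothetical) function once per distinct input that has
-- not been cached yet, instead of once per row on every query.
INSERT INTO dbo.MathResultCache (InputValue, Result)
SELECT d.Value, dbo.MyMath(d.Value)
FROM (SELECT DISTINCT Value FROM dbo.TestValues) AS d
WHERE d.Value IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM dbo.MathResultCache AS c
                  WHERE c.InputValue = d.Value);

-- Later queries read the stored result with a plain join.
SELECT t.Value, c.Result
FROM dbo.TestValues AS t
JOIN dbo.MathResultCache AS c
  ON c.InputValue = t.Value;
```

Whether this pays off depends on how often inputs repeat, which is again something to measure.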