I have seen several patterns used to 'overcome' the lack of constants in SQL Server, but none of them seem to satisfy both performance and readability / maintainability concerns.
In the example below, assuming that we have an integer 'status' classification on our table, the options seem to be:
- Just hard-code it, and possibly 'comment' the status:
-- StatusId 87 = Loaded
SELECT ... FROM [Table] WHERE StatusId = 87;
- Using a lookup table for states, and then joining to this table so that the WHERE clause references the friendly name.
Subquery:
SELECT ...
FROM [Table]
WHERE
StatusId = (SELECT StatusId FROM TableStatus WHERE StatusName = 'Loaded');
or joined:
SELECT ...
FROM [Table] t INNER JOIN TableStatus ts On t.StatusId = ts.StatusId
WHERE ts.StatusName = 'Loaded';
- A bunch of scalar UDFs which return constants, viz.
CREATE FUNCTION dbo.LoadedStatus()
RETURNS INT
AS
BEGIN
    RETURN 87;
END;
and then (SQL Server requires the schema prefix when calling a scalar UDF)
SELECT ... FROM [Table] WHERE StatusId = dbo.LoadedStatus();
(IMO this causes a lot of pollution in the database, although it might be OK in an Oracle package wrapper.)
- And similar patterns with table-valued functions holding the constants, with values as rows or columns, which are CROSS APPLYed back to [Table].
How have other SO users solved this common issue?
Edit: Bounty - Does anyone have a best-practice method for maintaining $(variables) in DBProj DDL / schema scripts, as per Remus's answer and comment?
4 Solutions
#1
12
Hard coded. With SQL, performance trumps maintainability.
The difference in the execution plan between using a constant that the optimizer can inspect at plan generation time and using any form of indirection (UDF, JOIN, sub-query) is often dramatic. SQL 'compilation' is an extraordinary process (in the sense that it is not 'ordinary' like, say, IL code generation) in that the result is determined not only by the language construct being compiled (i.e. the actual text of the query) but also by the data schema (existing indexes) and the actual data in those indexes (statistics). When a hard-coded value is used, the optimizer can give a better plan because it can actually check the value against the index statistics and get an estimate of the result.
Another consideration is that a SQL application is not code only, but by a large margin is code and data. 'Refactoring' a SQL program is ... different. Where in a C# program one can change a constant or enum, recompile and happily run the application, in SQL one cannot do so because the value is likely present in millions of records in the database and changing the constant value implies also changing GBs of data, often online while new operations occur.
Just because the value is hard-coded in the queries and procedures seen by the server does not necessarily mean the value has to be hard coded in the original project source code. There are various code generation tools that can take care of this. Consider something as trivial as leveraging the sqlcmd scripting variables:
defines.sql:
:setvar STATUS_LOADED 87
somesource.sql:
:r defines.sql
SELECT ... FROM [Table] WHERE StatusId = $(STATUS_LOADED);
someothersource.sql:
:r defines.sql
UPDATE [Table] SET StatusId = $(STATUS_LOADED) WHERE ...;
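To make the substitution step concrete, here is a minimal sketch of what such a code generation pass does, written in Python as a stand-in. It is not sqlcmd and not how sqlcmd is implemented; the function names `parse_defines` and `expand` are hypothetical, and it only handles the simple `:setvar NAME value` and `$(NAME)` forms shown above. The point is that the server only ever sees the literal 87, so the optimizer can still check it against statistics.

```python
import re

# Hypothetical stand-in for a build step; this is NOT sqlcmd itself.
SETVAR = re.compile(r"^\s*:setvar\s+(\w+)\s+(.+?)\s*$", re.IGNORECASE)
VARIABLE = re.compile(r"\$\((\w+)\)")


def parse_defines(text):
    """Collect NAME -> value pairs from ':setvar NAME value' lines."""
    defines = {}
    for line in text.splitlines():
        match = SETVAR.match(line)
        if match:
            defines[match.group(1)] = match.group(2).strip('"')
    return defines


def expand(sql, defines):
    """Replace each $(NAME) with its value; fail loudly on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in defines:
            raise KeyError(f"undefined scripting variable: {name}")
        return defines[name]
    return VARIABLE.sub(substitute, sql)


# Contents of defines.sql from the example above.
defines = parse_defines(":setvar STATUS_LOADED 87\n")

# The query text as written in the project source.
source = "SELECT ... FROM [Table] WHERE StatusId = $(STATUS_LOADED);"

# The query text that would be sent to the server.
query = expand(source, defines)
print(query)  # SELECT ... FROM [Table] WHERE StatusId = 87;
```

Raising on an undefined name is a deliberate choice in this sketch: a silently unexpanded `$(NAME)` would otherwise only surface later as a syntax error on the server.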
#2
6
While I agree with Remus Rusanu, IMO maintainability of the code (and thus readability, least astonishment, etc.) trumps other concerns unless the performance difference is significant enough to warrant doing otherwise. Thus, the following query loses on readability:
Select ...
From Table
Where StatusId = 87
In general, when I have system dependent values which will be referenced in code (perhaps mimicked in an enumeration by name), I use string primary keys for the tables in which they are kept. Contrast this to user-changeable data in which I generally use surrogate keys. The use of a primary key that requires entry helps (albeit not perfectly) to indicate to other developers that this value is not meant to be arbitrary.
Thus, my "Status" table would look like:
Create Table Status
(
Code varchar(6) Not Null Primary Key
, ...
)
Select ...
From Table
Where StatusCode = 'Loaded'
This makes the query more readable, it does not require a join to the Status table, and it does not require the use of a magic number (or guid). Using user-defined functions is, IMO, a bad practice. Beyond the performance implications, no developer would ever expect UDFs to be used in this manner, so doing it violates the principle of least astonishment. You would almost be compelled to have a UDF for each constant value; otherwise, what are you passing into the function: a name? A magic value? If a name, you might as well keep the name in a table and use it directly in the query. If a magic value, you are back to the original problem.
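A rough, runnable sketch of the string-key idea follows. It uses Python with an in-memory SQLite database purely as a stand-in for SQL Server, so the dialect details differ. The `Item` table, its `ItemId` column, and the 'Failed' status are assumptions made up for the illustration; only `Status.Code` and the `StatusCode = 'Loaded'` filter come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when asked to (a SQLite quirk, not SQL Server).
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Status (Code VARCHAR(6) NOT NULL PRIMARY KEY);
    -- Item is a hypothetical table standing in for [Table].
    CREATE TABLE Item (
        ItemId INTEGER PRIMARY KEY,
        StatusCode VARCHAR(6) NOT NULL REFERENCES Status (Code)
    );
    INSERT INTO Status (Code) VALUES ('Loaded'), ('Failed');
    INSERT INTO Item (ItemId, StatusCode)
    VALUES (1, 'Loaded'), (2, 'Failed'), (3, 'Loaded');
""")

# Readable filter on the key itself: no join, no magic number.
loaded = [row[0] for row in conn.execute(
    "SELECT ItemId FROM Item WHERE StatusCode = 'Loaded' ORDER BY ItemId")]
print(loaded)  # [1, 3]

# The foreign key rejects a misspelled status when data is written.
try:
    conn.execute("INSERT INTO Item (ItemId, StatusCode) VALUES (4, 'Lodaed')")
    typo_rejected = False
except sqlite3.IntegrityError:
    typo_rejected = True
print(typo_rejected)  # True
```

Note the limit of this protection: the foreign key guards the stored data, but a misspelled literal in a WHERE clause would still run and simply return no rows.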
#3
3
I have been using the scalar function option in our DB and it works fine; in my view it is the best solution.
If several values relate to one item, make a lookup table instead. For example, if you load a combobox or any other control with static values, a lookup table is the best way to do this.
#4
2
You can also add more fields to your status table that act as unique markers or groupers for status values. For example, if you add an isLoaded field to your status table, record 87 could be the only one with the field's value set, and you can test for the value of the isLoaded field instead of the hard-coded 87 or status description.
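As a rough illustration of the marker-column idea, here is a sketch in Python using an in-memory SQLite database as a stand-in for SQL Server. The `Item` table and the 'Pending' status with id 86 are made up for the example; `TableStatus`, `StatusId`, `StatusName`, 87 and the `IsLoaded` flag follow the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableStatus (
        StatusId INTEGER PRIMARY KEY,
        StatusName TEXT NOT NULL,
        IsLoaded INTEGER NOT NULL DEFAULT 0  -- marker column from the answer
    );
    -- Item is a hypothetical table standing in for [Table].
    CREATE TABLE Item (ItemId INTEGER PRIMARY KEY, StatusId INTEGER NOT NULL);
    INSERT INTO TableStatus VALUES (86, 'Pending', 0), (87, 'Loaded', 1);
    INSERT INTO Item VALUES (1, 87), (2, 86), (3, 87);
""")

# Filter on the marker instead of the hard-coded 87 or the status description.
rows = conn.execute("""
    SELECT i.ItemId
    FROM Item i
    INNER JOIN TableStatus ts ON ts.StatusId = i.StatusId
    WHERE ts.IsLoaded = 1
    ORDER BY i.ItemId
""").fetchall()
loaded_ids = [row[0] for row in rows]
print(loaded_ids)  # [1, 3]
```

This still reaches the value through a join, so the concern about indirection raised in the first answer applies here as well.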