I'm trying my best to persuade my boss to let us use foreign keys in our databases, so far without luck.
He claims they cost a significant amount of performance, and says we'll just run jobs to clean up the invalid references now and then.
Obviously this doesn't work in practice, and the database is flooded with invalid references.
Does anyone know of a comparison, benchmark, or similar that shows there is no significant performance hit from using foreign keys? (I hope that will convince him.)
6 Answers
#1
30
There is a tiny performance hit on inserts, updates and deletes because the FK has to be checked. For an individual record this would normally be so slight as to be unnoticeable unless you start having a ridiculous number of FKs associated to the table (Clearly it takes longer to check 100 other tables than 2). This is a good thing not a bad thing as databases without integrity are untrustworthy and thus useless. You should not trade integrity for speed. That performance hit is usually offset by the better ability to optimize execution plans.
We have a medium sized database with around 9 million records and FKs everywhere they should be and rarely notice a performance hit (except on one badly designed table that has well over 100 foreign keys, it is a bit slow to delete records from this as all must be checked). Almost every dba I know of who deals with large, terabyte sized databases and a true need for high performance on large data sets insists on foreign key constraints because integrity is key to any database. If the people with terabyte-sized databases can afford the very small performance hit, then so can you.
In many database systems (SQL Server, for example), FKs are not automatically indexed, and if they are not indexed this can cause performance problems.
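To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for whatever database you actually run. The `customers`/`orders` schema and the index name are made up for illustration. It shows that declaring the FK does not create an index on the referencing column, and that the lookup plan changes once you add one yourself.

```python
import sqlite3

# Hypothetical schema: customers (parent) and orders (child).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
""")


def plan_for_child_lookup(connection):
    """Return the query-plan text for finding the orders of one customer."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM orders WHERE customer_id = ?", (1,)
    ).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail


plan_without_index = plan_for_child_lookup(conn)

# Declaring the FK did not create this index; it has to be added explicitly.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")
plan_with_index = plan_for_child_lookup(conn)

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The same kind of lookup happens whenever a parent row is deleted or its key is updated, because the child table has to be checked for references. The exact plan text differs between SQLite versions, and other engines have their own rules, so check your own system's documentation.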
Honestly, I'd take a copy of your database, add properly indexed FKs and show the time difference to insert, delete, update and select from those tables in comparison with the same from your database without the FKs. Show that you won't be causing a performance hit. Then show the results of queries that show orphaned records that no longer have meaning because the PK they are related to no longer exists. It is especially effective to show this for tables which contain financial information ("We have 2700 orders that we can't associate with a customer" will make management sit up and take notice).
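A query for orphaned records could look roughly like the sketch below. It uses Python's built-in sqlite3 module and an invented `customers`/`orders` schema with made-up rows, so treat the table and column names as placeholders for your own.

```python
import sqlite3

# Hypothetical tables WITHOUT a foreign key, so bad references can creep in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers (id, name) VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders (id, customer_id, total) VALUES
        (10, 1, 25.0),
        (11, 2, 40.0),
        (12, 3, 15.5),
        (13, 7, 99.0);
""")

# Orders whose customer_id points at a customer that does not exist.
orphans = conn.execute("""
    SELECT o.id, o.customer_id, o.total
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    WHERE o.customer_id IS NOT NULL AND c.id IS NULL
    ORDER BY o.id
""").fetchall()

for order_id, customer_id, total in orphans:
    print(f"order {order_id} references missing customer {customer_id} (total {total})")
```

Running the same LEFT JOIN against a copy of the real tables gives you the count, and for financial tables the sum, to put in front of management.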
#2
16
From Microsoft Patterns and Practices: Chapter 14 Improving SQL Server Performance:
When primary and foreign keys are defined as constraints in the database schema, the server can use that information to create optimal execution plans.
#3
6
This is more of a political issue than a technical one. If your project management doesn't see any value in maintaining the integrity of your data, you need to be on a different project.
If your boss doesn't already know or care that you have thousands of invalid references, he isn't going to start caring just because you tell him about it. I sympathize with the other posters here who are trying to urge you to do the "right thing" by fighting the good fight, but I've tried it many times before and in actual practice it doesn't work. The story of David and Goliath makes good reading, but in real life it's a losing proposition.
#4
4
Does anyone know of a comparison, benchmark or similar which proves there's no significant performance hit to using foreign keys? (Which I hope will convince him)
I think you're going about this the wrong way. Benchmarks never convince anyone.
What you should do first is uncover the problems that result from not using foreign key constraints. Try to quantify how much work it costs to "clean out invalid references". In addition, try to gauge how many errors in the business process result from these invalid references. If you can attach a dollar amount to that, even better.
Now for a benchmark: you should try to get insight into your workload and identify which types of operations are done most often. Then set up a testing environment, and replay those operations with foreign keys in place. Then compare.
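As a starting point only, a tiny harness along these lines can show the shape of such a comparison. It is a sketch using Python's built-in sqlite3 with an in-memory database and an invented `customers`/`orders` schema. The function name and row counts are arbitrary, and numbers from it say nothing about your real engine, data or hardware.

```python
import sqlite3
import time


def time_inserts(with_fk, n_rows=20000, n_customers=100):
    """Insert n_rows child rows and return (elapsed_seconds, rows_inserted)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # only matters if an FK is declared
    fk_clause = "REFERENCES customers(id)" if with_fk else ""
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        f"customer_id INTEGER NOT NULL {fk_clause})"
    )
    conn.executemany(
        "INSERT INTO customers (id) VALUES (?)",
        ((i,) for i in range(n_customers)),
    )
    conn.commit()

    # Every child row points at an existing parent, so the FK check always passes.
    rows = [(i % n_customers,) for i in range(n_rows)]
    start = time.perf_counter()
    conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    for with_fk in (False, True):
        elapsed, count = time_inserts(with_fk)
        label = "with FK" if with_fk else "without FK"
        print(f"{label}: {count} rows in {elapsed:.4f}s")
```

For a result worth showing anyone, replace the synthetic inserts with a replay of your real operation mix, run it several times, and include updates, deletes and concurrent sessions.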
Personally, without knowing the applications that run on the database, I would not claim outright that foreign keys don't cost performance. If you have cascading deletes and/or updates in combination with composite natural primary keys, I would have some fear of performance issues, especially timed-out or deadlocked transactions caused by side effects of the cascading operations.
But no one can tell you. You have to test it yourself, with your data, your workload, your number of concurrent users, your hardware, your applications.
#5
3
It is OK to be concerned about performance, but making paranoid decisions is not.
You can easily write benchmark code to show results yourself, but first you'll need to find out what performance your boss is concerned about and detail exactly those metrics.
As far as the invalid references are concerned, if you don't allow nulls on your foreign keys, you won't get invalid references. The database will raise an exception if you try to assign a foreign key value that does not exist. If you need "nulls", assign a key to be "UNDEFINED" or something like that, and make that the default key.
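Here is a small sketch of both ideas, the rejected insert and the "UNDEFINED" default key, using Python's built-in sqlite3. The schema, the choice of id 0 for the placeholder row, and the helper name `try_insert_order` are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Row 0 is the placeholder that stands in for "no customer yet".
    INSERT INTO customers (id, name) VALUES (0, 'UNDEFINED'), (1, 'Alice');
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL DEFAULT 0 REFERENCES customers(id)
    );
""")


def try_insert_order(connection, customer_id):
    """Insert one order and report whether the database accepted it."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,)
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


result_valid = try_insert_order(conn, 1)     # customer 1 exists
result_invalid = try_insert_order(conn, 42)  # customer 42 does not exist

# An order with no customer given falls back to the UNDEFINED row.
with conn:
    conn.execute("INSERT INTO orders DEFAULT VALUES")
default_customer = conn.execute(
    "SELECT customer_id FROM orders ORDER BY id DESC LIMIT 1"
).fetchone()[0]

print(result_valid)
print(result_invalid)
print("default customer_id:", default_customer)
```

Whether a placeholder row is better than a nullable FK column is a design choice. The placeholder keeps every reference valid, but queries then have to remember to exclude it.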
Finally, explain database normalisation issues to your boss, because I think you will quickly find that this issue will be more of a problem than foreign key performance ever will.
#6
1
A significant factor in the cost is the size of the index the foreign key references. If it's small and frequently used, the performance impact will be negligible; large and less frequently used indexes will have more impact. If your foreign key is against a clustered index, it still shouldn't be a huge hit. But @Ronald Bouman is right: you need to test to be sure.