I have to create a database to store information sent to and received from a third-party web service portal. There are about 150 fields to be sent, though I can remove about 50 of them by normalising (for example, there are three sets of addresses that can be saved in an address table). However, this still leaves a table that could potentially have 100 columns.
I've come up with two ways of handling this, though I'm not sure which to use:
1. Have a table with 100 columns and three references to an address table.
2. Break it down into maybe 15-20 separate dedicated tables.
Option 1 seems the quickest as it involves the fewest joins but the idea of a table with 100 columns doesn't feel right.
Option 2 feels better and would break things down into more manageable chunks, but it won't save any database space and will increase the number of joins. Pretty much all the columns in the database will have a value and I cannot normalise these columns any further.
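The address part of the normalisation described above can be sketched roughly as follows. This is only an illustration using Python's built-in `sqlite3` module as a stand-in for the real database; the table names (`submission`, `address`), the column names, and the three address roles (home, billing, delivery) are all assumptions, since the question only says there are three sets of addresses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one address table, referenced three times by the main table.
conn.executescript("""
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    line1      TEXT NOT NULL,
    city       TEXT NOT NULL,
    postcode   TEXT NOT NULL
);
CREATE TABLE submission (
    submission_id       INTEGER PRIMARY KEY,
    reference           TEXT NOT NULL,
    -- the remaining ~100 portal fields would sit here in option 1
    home_address_id     INTEGER NOT NULL REFERENCES address(address_id),
    billing_address_id  INTEGER NOT NULL REFERENCES address(address_id),
    delivery_address_id INTEGER NOT NULL REFERENCES address(address_id)
);
""")

conn.executemany("INSERT INTO address VALUES (?, ?, ?, ?)", [
    (1, "1 High Street", "Leeds", "LS1 1AA"),
    (2, "2 Mill Lane", "York", "YO1 2BB"),
    (3, "3 Dock Road", "Hull", "HU1 3CC"),
])
conn.execute("INSERT INTO submission VALUES (1, 'REF-001', 1, 2, 3)")

# Reading a full record joins the address table once per reference.
row = conn.execute("""
    SELECT s.reference, h.city, b.city, d.city
    FROM submission AS s
    JOIN address AS h ON h.address_id = s.home_address_id
    JOIN address AS b ON b.address_id = s.billing_address_id
    JOIN address AS d ON d.address_id = s.delivery_address_id
    WHERE s.submission_id = 1
""").fetchone()
print(row)  # ('REF-001', 'Leeds', 'York', 'Hull')
```

Even in option 1 the three address joins remain; the choice between the two options only affects how the other 100 or so columns are laid out.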
My question is: in this situation, is it acceptable to have a table with c. 100 columns in it, or should I try to break it down over several tables for presentation?
Please note: the table structure will not change over the course of its usage; a new database would be created for a new version of the web service portal. I have no control over the web service data structure.
Edit: @Oded's answer below has made me think a bit more about how the data will be accessed; it will really only be accessed in whole and not in part. I wouldn't, for example, need to return columns 5-20 on a regular basis.
Answer: I accepted Oded's answer because the comments that followed it helped me make up my mind, and I decided to go with option 1. As the data is accessed in full, having one table seems the better solution. If, for example, I regularly wanted to access columns 5-20 rather than the full table row, then I'd look at breaking it up into separate tables for performance reasons.
3 Answers
#1
10
Speaking from a relational purist point of view - first, there is nothing against having 100 columns in a table, if they are related. The point here is that if after normalizing you still have 100 columns, that's OK.
But you should normalize, and in the process you may very well end up with 15-20 separate dedicated tables, which most relational database professionals would agree is a better design (it avoids data duplication and the update/delete issues that come with it, gives a smaller data footprint, etc.).
Pragmatically, however, if there is a measurable performance problem, it may be sensible to denormalize your design for performance benefit. The key word here is measurable. Don't optimize before you have an actual problem.
In that respect, I'd say you should go with the set of 15-20 tables as an initial design.
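One way to make "measurable" concrete is to build both designs on test data and time the full-record read. The sketch below does that with Python's built-in `sqlite3` as a stand-in engine, so the timings say nothing about SQL Server itself; the table names (`wide`, `part0` to `part19`), the generated column names, and the row and repeat counts are all made up for illustration.

```python
import sqlite3
import time

N_COLS = 100   # columns left after normalising, per the question
N_PARTS = 20   # hypothetical number of tables in the split design
N_ROWS = 500   # arbitrary test volume

conn = sqlite3.connect(":memory:")
cols = [f"c{i}" for i in range(N_COLS)]
chunk = N_COLS // N_PARTS
parts = [cols[i:i + chunk] for i in range(0, N_COLS, chunk)]

# Design 1: one wide table. Design 2: the same columns over N_PARTS tables sharing a key.
conn.execute(f"CREATE TABLE wide (id INTEGER PRIMARY KEY, {', '.join(c + ' TEXT' for c in cols)})")
for p, part_cols in enumerate(parts):
    conn.execute(f"CREATE TABLE part{p} (id INTEGER PRIMARY KEY, {', '.join(c + ' TEXT' for c in part_cols)})")

# Load identical data into both designs.
for row_id in range(N_ROWS):
    values = [f"v{row_id}_{i}" for i in range(N_COLS)]
    conn.execute(f"INSERT INTO wide VALUES ({', '.join('?' * (N_COLS + 1))})", [row_id] + values)
    for p in range(N_PARTS):
        part_values = values[p * chunk:(p + 1) * chunk]
        conn.execute(f"INSERT INTO part{p} VALUES ({', '.join('?' * (chunk + 1))})", [row_id] + part_values)

wide_sql = "SELECT * FROM wide WHERE id = ?"
select_cols = ", ".join(f"part{p}.{c}" for p, pc in enumerate(parts) for c in pc)
joins = " ".join(f"JOIN part{p} ON part{p}.id = part0.id" for p in range(1, N_PARTS))
split_sql = f"SELECT part0.id, {select_cols} FROM part0 {joins} WHERE part0.id = ?"

def time_query(sql, repeats=200):
    """Return the seconds taken to fetch one full record `repeats` times."""
    start = time.perf_counter()
    for i in range(repeats):
        conn.execute(sql, (i % N_ROWS,)).fetchone()
    return time.perf_counter() - start

# Both designs must return the same record before the timings mean anything.
wide_row = conn.execute(wide_sql, (42,)).fetchone()
split_row = conn.execute(split_sql, (42,)).fetchone()
same_record = wide_row == split_row

print(f"same record: {same_record}")
print(f"wide table : {time_query(wide_sql):.4f} s")
print(f"split + join: {time_query(split_sql):.4f} s")
```

The same pattern (load representative data, confirm both designs return identical records, then time the real access pattern) would need to be repeated on the actual target database before drawing any conclusion.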
#2
3
From MSDN: Maximum Capacity Specifications for SQL Server:
Columns per nonwide table: 1,024
Columns per wide table: 30,000
So I think 100 columns is OK in your case. You may also need to note (from the same link):
Columns per primary key: 16
Of course, this applies only if you need the data purely as a log for the service.
If you need to maintain the data after reading it from the service, then normalising seems better.
#3
1
If you find it easier to "manage" tables with fewer columns, however you happen to define manageability (e.g. less horizontal scrolling when looking at the table data in SSMS), you can break the table up into several tables with 1-to-1 relationships without violating the rules of normalization.
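A 1-to-1 split of this kind might look like the sketch below, again using Python's built-in `sqlite3` purely for illustration. The table, view, and column names are hypothetical. The idea is that each secondary table uses the parent's key as its own primary key, so a parent row can have at most one matching row, and a view puts the full record back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical split: the second table's primary key is also a foreign key
# to the first, which is what makes the relationship 1-to-1.
conn.executescript("""
CREATE TABLE submission_core (
    submission_id INTEGER PRIMARY KEY,
    reference     TEXT NOT NULL,
    status        TEXT NOT NULL
);
CREATE TABLE submission_contact (
    submission_id INTEGER PRIMARY KEY REFERENCES submission_core(submission_id),
    contact_name  TEXT NOT NULL,
    contact_email TEXT NOT NULL
);
CREATE VIEW submission_full AS
SELECT c.submission_id, c.reference, c.status, k.contact_name, k.contact_email
FROM submission_core AS c
JOIN submission_contact AS k ON k.submission_id = c.submission_id;
""")

conn.execute("INSERT INTO submission_core VALUES (1, 'REF-001', 'sent')")
conn.execute("INSERT INTO submission_contact VALUES (1, 'A. Example', 'a@example.com')")

# The view returns the whole record as if it were one wide table.
full = conn.execute("SELECT * FROM submission_full WHERE submission_id = 1").fetchone()
print(full)  # (1, 'REF-001', 'sent', 'A. Example', 'a@example.com')

# A second contact row for the same submission violates the primary key.
try:
    conn.execute("INSERT INTO submission_contact VALUES (1, 'Dup', 'dup@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

Code that only ever reads whole records can query the view, while anyone browsing the data sees several narrow tables instead of one very wide one.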