Does performance degrade if a table has too many columns?

Time: 2022-07-22 04:23:55

Is there any performance loss if one of the tables in my database has a large number of columns? Let's say I have a table with 30 columns.

Should I consider splitting the table into a few smaller ones, or is it OK as it is?

What is the recommended maximum number of columns per database table?

Thank you.

9 answers

#1


14  

If you really need all those columns (that is, it's not just a sign that you have a poorly designed table), then by all means keep them.

It's not a performance problem, as long as you:

  • use appropriate indexes on the columns you use to select rows
  • don't retrieve columns you don't need in SELECT operations

If you have 30, or even 200, columns, it's no problem for the database. You're just making it work a little harder if you want to retrieve all those columns at once.
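To illustrate the point about indexes, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL or SQL Server. The `customer` table, its columns and the index name are made up for the example, and the exact query-plan wording differs between engines and versions, so treat it as an illustration only.

```python
import sqlite3

# In-memory database; the table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, email TEXT, name TEXT, city TEXT, notes TEXT)"
)
# Index only the column that is used to select rows.
conn.execute("CREATE INDEX idx_customer_email ON customer (email)")
conn.executemany(
    "INSERT INTO customer (email, name, city, notes) VALUES (?, ?, ?, ?)",
    [(f"user{i}@example.com", f"User {i}", "Paris", "x" * 200) for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail text is the fourth field of each plan row.
    return " ".join(row[3] for row in rows)


# Filtering on the indexed column lets the engine search the index.
indexed_plan = query_plan(
    "SELECT name FROM customer WHERE email = ?", ("user42@example.com",)
)
# Filtering on a column without an index forces a scan of the table.
unindexed_plan = query_plan("SELECT name FROM customer WHERE city = ?", ("Paris",))

print(indexed_plan)
print(unindexed_plan)
```

The number of columns in the table does not change which plan is chosen; what matters is whether the column in the WHERE clause is indexed.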

But having a lot of columns is a bad code smell; I can't think of any legitimate reason a well-designed table would have this many columns, and you may instead need a one-to-many relationship with some other, much simpler, table.

#2


13  

I don't agree with all these posts saying 30 columns smells like bad code. If you've never worked on a system that had an entity with 30+ legitimate attributes, then you probably don't have much experience.

The answer provided by HLGEM is actually the best one of the bunch. I particularly like his question, "is there a natural split....frequently used vs. not frequently used". Those are very good questions to ask yourself, and you may be able to break up the table in a natural way (if things get out of hand).

My comment would be: if your performance is currently acceptable, don't look to reinvent a solution unless you need it.

#3


10  

I'm going to weigh in on this even though you've already selected an answer. Yes, tables that are too wide can cause performance problems (and data problems as well) and should be separated out into tables with one-to-one relationships. This is due to how the database stores the data (at least in SQL Server; I'm not sure about MySQL, but it is worth doing some reading in the documentation about how the database stores and accesses the data).

Thirty columns might be too wide and might not; it depends on how wide the columns are. If you add up the total number of bytes that your 30 columns will take up, is it wider than the maximum number of bytes that can be stored in a record?
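The byte-count check described above can be sketched roughly as follows. This is only back-of-the-envelope arithmetic: the per-type sizes assume a single-byte character set and a 2-byte length prefix for VARCHAR, and the 8060-byte default is the in-row limit usually quoted for SQL Server. Both are assumptions to verify against your own engine's documentation, and the function name and sample columns are invented for the example.

```python
# Storage sizes in bytes for a few fixed-width integer types.
FIXED_SIZES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def estimated_row_bytes(columns):
    """Estimate the worst-case row size for a list of (type, length) pairs.

    Assumes one byte per character and a 2-byte length prefix for VARCHAR;
    real engines add per-row overhead that is ignored here.
    """
    total = 0
    for type_name, length in columns:
        if type_name in FIXED_SIZES:
            total += FIXED_SIZES[type_name]
        elif type_name == "CHAR":
            total += length
        elif type_name == "VARCHAR":
            total += length + 2
        else:
            raise ValueError(f"unknown column type: {type_name}")
    return total


def fits_in_row(columns, max_row_bytes=8060):
    """Check the estimate against a row-size limit (an assumed default)."""
    return estimated_row_bytes(columns) <= max_row_bytes


# 30 modest columns: one INT key, twenty VARCHAR(100), nine CHAR(10).
modest = [("INT", None)] + [("VARCHAR", 100)] * 20 + [("CHAR", 10)] * 9
# 30 wide columns: one INT key and twenty-nine VARCHAR(400).
wide = [("INT", None)] + [("VARCHAR", 400)] * 29

print(estimated_row_bytes(modest), fits_in_row(modest))
print(estimated_row_bytes(wide), fits_in_row(wide))
```

Both tables have 30 columns, but only the second one exceeds the assumed limit, which is the point being made: it is the total width, not the column count, that matters.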

Are some of the columns ones you will need less often than others? In other words, is there a natural split between required, frequently used info and other stuff that may appear in only one place and nowhere else? If so, consider splitting up the table.

If some of your columns are things like phone1, phone2, phone3, then it doesn't matter how many columns you have: you need a related table with a one-to-many relationship instead.
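As a rough sketch of the phone1/phone2/phone3 case, the repeated columns can move into a child table keyed by the parent row. The example uses Python's built-in sqlite3 module; the `person` and `person_phone` table names and the sample numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: one row per person, with no phone1/phone2/phone3 columns.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Child table: one row per phone number, so a person can have any number.
conn.execute(
    "CREATE TABLE person_phone ("
    "id INTEGER PRIMARY KEY, "
    "person_id INTEGER NOT NULL REFERENCES person(id), "
    "phone TEXT NOT NULL)"
)

conn.execute("INSERT INTO person (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO person_phone (person_id, phone) VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101"), (1, "555-0102"), (1, "555-0103")],
)

# A fourth number needs no schema change, unlike adding a phone4 column.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM person_phone WHERE person_id = ? ORDER BY id", (1,)
    )
]
print(phones)
```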

In general, though, 30 columns is not unusually big and will probably be OK.

#4


7  

Technically speaking, 30 columns is absolutely fine. However, tables with many columns are often a sign that your database isn't properly normalized, that is, it may contain redundant and/or inconsistent data.

#5


3  

Should be fine, unless you have select * from yourHugeTable all over the place. Always select only the columns you need.
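A small sketch of the difference, using Python's built-in sqlite3 module. The `wide` table and its `attr1` to `attr30` columns are invented for the example, and counting characters is only a stand-in for the real cost of reading and transferring the extra data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical wide table: an id plus 30 text columns of 50 characters each.
cols = [f"attr{i}" for i in range(1, 31)]
conn.execute(
    "CREATE TABLE wide (id INTEGER PRIMARY KEY, "
    + ", ".join(f"{c} TEXT" for c in cols)
    + ")"
)
placeholders = ", ".join("?" for _ in cols)
conn.executemany(
    f"INSERT INTO wide ({', '.join(cols)}) VALUES ({placeholders})",
    [tuple("x" * 50 for _ in cols) for _ in range(100)],
)


def fetched_chars(sql):
    """Total number of characters returned by a query, over all rows."""
    return sum(len(str(value)) for row in conn.execute(sql) for value in row)


all_cols = fetched_chars("SELECT * FROM wide")
two_cols = fetched_chars("SELECT id, attr1 FROM wide")

print(all_cols, two_cols)
```

Naming the columns also keeps the query stable if someone later adds more columns to the table.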

#6


2  

30 columns would not normally be considered an excessive number.

Three thousand columns, on the other hand... How would you implement a very wide "table"?

#7


2  

Beyond performance, database normalization is necessary for databases with many tables and relations. Normalization gives you easy access to your models and flexible relations for executing different SQL queries.

As shown here, there are eight forms of normalization. But for many systems, applying the first, second and third normal forms is enough.

So, instead of selecting related columns and writing long SQL queries, well-normalized database tables would be better.

#8


2  

30 doesn't seem like too many to me. In addition to the necessary indexes and proper SELECT queries, 2 basic tips apply well to wide tables:

  1. Define your columns as small as possible.
  2. Avoid using dynamic columns such as VARCHAR or TEXT as much as possible when you have a large number of columns per table. Try using fixed-length columns such as CHAR. This trades disk storage for performance.

For instance, for the columns 'name', 'gender', 'age' and 'bio' in a 'person' table with as many as 100 or even more columns, to maximize performance, they are best defined as:

  1. name - CHAR(70)
  2. gender - TINYINT(1)
  3. age - TINYINT(2)
  4. bio - TEXT

The idea is to define columns as small as possible and with fixed length where reasonably possible. Dynamic columns should go at the end of the table structure so that the fixed-length columns are ALL before them.

It goes without saying that this would waste a tremendous amount of disk storage with a large number of rows, but as you want performance, I guess that would be the cost.
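To put a rough number on that storage cost, here is a small arithmetic sketch in Python. It assumes a single-byte character set, a 1-byte length prefix for a short VARCHAR, an average name length of 12 characters and 10 million rows; all of these figures, and the function name, are assumptions chosen for illustration.

```python
def wasted_bytes(char_length, avg_value_length, row_count, length_prefix=1):
    """Extra bytes used by CHAR(char_length) compared with a VARCHAR
    holding values of avg_value_length characters on average.

    Assumes one byte per character; length_prefix is the assumed
    per-value overhead of the variable-length column.
    """
    fixed_total = char_length * row_count
    variable_total = (avg_value_length + length_prefix) * row_count
    return fixed_total - variable_total


# name CHAR(70), names averaging 12 characters, 10 million rows.
waste = wasted_bytes(char_length=70, avg_value_length=12, row_count=10_000_000)
print(f"{waste / 1_000_000:.0f} MB of padding for one column")
```

Under these assumptions a single padded column costs several hundred megabytes, so whether the trade is worth it depends on how much the fixed-length layout actually helps on your engine.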

Another tip: as you go along, you will find columns that are much more frequently used (selected or updated) than the others. You should separate them into another table that forms a one-to-one relationship with the table containing the infrequently used columns, and perform the queries with fewer columns involved.
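A minimal sketch of such a one-to-one split, again with Python's built-in sqlite3 module. The `person_core` and `person_detail` tables are hypothetical names; the detail table reuses the parent's key as its own primary key, which is one common way to enforce at most one detail row per person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Frequently used columns stay in a narrow table.
conn.execute(
    "CREATE TABLE person_core ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)
# Rarely used columns move to a second table that shares the same key.
conn.execute(
    "CREATE TABLE person_detail ("
    "person_id INTEGER PRIMARY KEY REFERENCES person_core(id), bio TEXT)"
)

conn.execute("INSERT INTO person_core (id, name, age) VALUES (1, 'Alice', 30)")
conn.execute(
    "INSERT INTO person_detail (person_id, bio) VALUES (1, 'Long biography text')"
)

# Everyday queries touch only the narrow table.
hot = conn.execute("SELECT name, age FROM person_core WHERE id = 1").fetchone()

# The occasional query that needs everything joins the two tables.
full = conn.execute(
    "SELECT c.name, c.age, d.bio "
    "FROM person_core AS c "
    "JOIN person_detail AS d ON d.person_id = c.id "
    "WHERE c.id = 1"
).fetchone()

print(hot)
print(full)
```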

#9


0  

Usage-wise, it's appropriate in some situations, for example where tables serve more than one application that share some columns but not others, and where reporting requires a single real-time data pool for all, with no data transitions. If a 200-column table enables that analytic power and flexibility, then I'd say "go long." Of course, in most situations normalization offers efficiency and is best practice, but do what works for your needs.
