I am working on a web application that uses a MySQL database for the back end, and I need to know which approach is better for my situation before I go any further.
Simply put, users of this application will be able to construct their own forms with any number of fields (they decide), and right now I have it all stored in a couple of tables linked by foreign keys. A friend of mine suggests that, to keep things "easy/fast", I should convert each user's form to a flat table so that querying data from it stays fast (in case of large growth).
Should I keep the database normalized, with everything pooled into relational tables with foreign keys (indexes, etc.), or should I construct a flat table for every new form that a user creates?
Obviously, some positives of creating flat tables are data separation (security) and shorter query times. But seriously, how much would I gain from this? I really don't want 10,000 tables that I'm dropping, altering, and adding all of the time, but if it will be better then I will do it... I just need some input.
Thank you
7 Answers
#1
21
Rule of thumb: it's easier to go from normalized to denormalized than the other way around.
Start with a reasonable level of database normalization (by reasonable I mean readable, maintainable, and efficient but not prematurely optimized), then if you hit performance issues as you grow, you have the option of looking into ways in which denormalization may increase performance.
#2
5
Keep your data normalized. If you index properly, you will not encounter performance issues for a very long time.
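To make "index properly" a bit more concrete, here is a minimal sketch of a normalized form/field layout with an index on the foreign key. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the table, column, and index names (`form`, `form_field`, `idx_form_field_form_id`) are my own assumptions, not taken from the question.

```python
import sqlite3

# SQLite stands in for MySQL here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE form (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        title   TEXT NOT NULL
    );
    CREATE TABLE form_field (
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL REFERENCES form(id),
        label   TEXT NOT NULL
    );
    -- Index the foreign key so "all fields of form X" avoids a full table scan.
    CREATE INDEX idx_form_field_form_id ON form_field(form_id);
""")

conn.execute("INSERT INTO form (id, user_id, title) VALUES (1, 42, 'Contact')")
conn.executemany(
    "INSERT INTO form_field (form_id, label) VALUES (?, ?)",
    [(1, "Name"), (1, "Email")],
)


def plan_for_form_fields(connection, form_id):
    """Return the query-plan detail strings for fetching one form's fields."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT label FROM form_field WHERE form_id = ?",
        (form_id,),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


plan = plan_for_form_fields(conn, 1)
labels = [
    row[0]
    for row in conn.execute(
        "SELECT label FROM form_field WHERE form_id = ? ORDER BY id", (1,)
    )
]
print(plan)
print(labels)
```

The plan output should mention the index rather than a scan of `form_field`. In MySQL you would check the same thing with `EXPLAIN` on the equivalent query.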
Regarding security: the flat approach will require you to write lots of CREATE/DROP TABLE, ALTER TABLE, etc. statements, i.e. a lot more code and a lot more points of failure.
The only reason to have flat tables would be if your users can connect to the DB directly (and even then you could still go for row-level security). But in that case, you are really reimplementing a variant of phpMyAdmin.
#3
3
...in this application users will be able to construct their own forms with any number fields...
Yikes! Then how could you possibly do any sort of normalization when the users are, in essence, making the database decisions for you?
I think you either need to manage it step by step, or let your freak flag fly and just keep buying hardware to keep up with the thrashing you're going to get when the users really start to get into it... Case in point: look what happens when users start to understand how to make new forms and views in SharePoint... CRIKEY!! Talk about scope creep!!
#4
2
Altering the schema during runtime is rarely a good idea. What you want to consider is the EAV (Entity-Attribute-Value) model.
Wikipedia has some very good info on the pros and cons, as well as implementation details. EAV is to be avoided when possible, but for situations like yours, with an unknown number of columns for each form, EAV is worth considering.
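For illustration only, an EAV layout for user-defined forms might look roughly like the sketch below: a submission is the entity, a user-defined field is the attribute, and each answer is a value row. SQLite (via Python's sqlite3) stands in for MySQL, and every table, column, and function name here is an assumption of mine.

```python
import sqlite3
from collections import defaultdict

# SQLite stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE submission (      -- entity: one filled-in form
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL
    );
    CREATE TABLE field (           -- attribute: one user-defined field
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL,
        name    TEXT NOT NULL
    );
    CREATE TABLE field_value (     -- value: one answer per (submission, field)
        submission_id INTEGER NOT NULL REFERENCES submission(id),
        field_id      INTEGER NOT NULL REFERENCES field(id),
        value         TEXT,
        PRIMARY KEY (submission_id, field_id)
    );
""")
conn.executemany(
    "INSERT INTO field (id, form_id, name) VALUES (?, ?, ?)",
    [(1, 10, "name"), (2, 10, "email")],
)
conn.executemany(
    "INSERT INTO submission (id, form_id) VALUES (?, ?)",
    [(100, 10), (101, 10)],
)
conn.executemany(
    "INSERT INTO field_value (submission_id, field_id, value) VALUES (?, ?, ?)",
    [(100, 1, "Ada"), (100, 2, "ada@example.com"), (101, 1, "Bob")],
)


def load_submissions(connection, form_id):
    """Pivot EAV rows into one dict per submission, keyed by submission id."""
    result = defaultdict(dict)
    query = """
        SELECT s.id, f.name, v.value
        FROM submission AS s
        JOIN field_value AS v ON v.submission_id = s.id
        JOIN field AS f ON f.id = v.field_id
        WHERE s.form_id = ?
        ORDER BY s.id, f.id
    """
    for submission_id, name, value in connection.execute(query, (form_id,)):
        result[submission_id][name] = value
    return dict(result)


submissions = load_submissions(conn, 10)
print(submissions)
```

Adding a field to a form is then just an INSERT into `field`, with no schema change at runtime. The cost is that every read has to pivot rows back into a record, which is one of the usual EAV trade-offs.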
#5
1
Keep your data normalized. The system should stay fast provided you have proper indexing.
If you really want to go fast, then switch the schema to one of the key-value databases like bigDB/CouchDB etc. That is totally denormalized and very, very fast.
#6
1
The way I would handle this is to use a normalized, extensible "Property" table, such as the one below:
Table: FormProperty
id: pk
form_id: fk(Form)
key: varchar(128)
value: varchar(2048)
The above is just an example, but I've used this pattern in many cases, and it tends to work out pretty well. The only real "gotcha" is that you need to serialize the value as a string/varchar and then deserialize it to whatever it needs to be, so there is a little added responsibility on the client.
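As a rough sketch of that serialize/deserialize step, the snippet below stores each value as a JSON string in the `value` column and decodes it on read. Using JSON is my assumption (any string encoding would do), as are the helper names `set_property` and `get_properties` and the unique constraint on `(form_id, key)`. SQLite stands in for MySQL again. Note that `key` is a reserved word in MySQL, so it would need backticks there.

```python
import json
import sqlite3

# SQLite stands in for MySQL; the helper names and the UNIQUE constraint are
# illustrative assumptions layered on the FormProperty table described above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE FormProperty (
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL,
        "key"   VARCHAR(128) NOT NULL,
        value   VARCHAR(2048),
        UNIQUE (form_id, "key")
    )
""")


def set_property(connection, form_id, key, value):
    """Serialize the value to a JSON string and insert or overwrite the row."""
    connection.execute(
        'INSERT OR REPLACE INTO FormProperty (form_id, "key", value) '
        "VALUES (?, ?, ?)",
        (form_id, key, json.dumps(value)),
    )


def get_properties(connection, form_id):
    """Read all properties of a form and decode each value from JSON."""
    rows = connection.execute(
        'SELECT "key", value FROM FormProperty WHERE form_id = ?', (form_id,)
    )
    return {key: json.loads(value) for key, value in rows}


set_property(conn, 7, "max_length", 255)
set_property(conn, 7, "required", True)
set_property(conn, 7, "choices", ["red", "green"])
set_property(conn, 7, "max_length", 100)  # overwrites the earlier value

props = get_properties(conn, 7)
print(props)
```

The client gets real types back (int, bool, list), but the database only ever sees strings, so filtering or sorting on `value` in SQL compares text, not numbers.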
#7
0
Normalized == fast searches, easier to maintain indexes, slower insert transactions (on multiple rows)
Denormalized == fast inserts; usually this is used when there are a lot of inserts (data warehouses that collect and record chronological data)