I have a scenario and two options to achieve it. Which one will be more efficient?
I am using MySQL to store attendance of students (around 100k). Later, this attendance data will be used to plot charts and results based on the user's selection.
Approach 1) Store each student's attendance for each day in a new row (which will increase the number of rows exponentially and reduce processing time).
Approach 2) Store one year's attendance data for each student as a serialized or JSON-formatted value in a single row (which will increase processing time when updating attendance each day but reduce database size). A rough sketch of both layouts follows.
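For concreteness, here is one way the two layouts could look in MySQL. All table and column names are assumptions for illustration, not taken from the question, and the JSON column requires MySQL 5.7 or later.

-- Approach 1: one row per student per day
CREATE TABLE attendance (
    student_id      INT UNSIGNED NOT NULL,
    attendance_date DATE         NOT NULL,
    status          TINYINT      NOT NULL,   -- e.g. 1 = present, 0 = absent
    PRIMARY KEY (student_id, attendance_date)
);

-- Approach 2: one row per student per year, the year's days packed into a JSON array
CREATE TABLE attendance_yearly (
    student_id INT UNSIGNED NOT NULL,
    year       SMALLINT     NOT NULL,
    days       JSON         NOT NULL,        -- e.g. [{"d": "2023-01-09", "s": 1}, ...]
    PRIMARY KEY (student_id, year)
);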
1 Answer
#1
9
First, I think you are confused: the number of rows will increase linearly, not exponentially, and that is a big difference.
Second, 100k students is nothing for a database. Even if you store 365 days per student, that is only about 36 million rows (100,000 × 365 ≈ 36.5 million); I handle that much in a week.
Third, storing the data as JSON may complicate future queries.
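To illustrate that point with the hypothetical tables sketched under the question: counting a student's absences for a year is a plain indexed query against the per-day table, while the JSON layout first has to be unpacked, here via MySQL 8.0's JSON_TABLE.

-- Approach 1: simple and index-friendly
SELECT COUNT(*)
FROM attendance
WHERE student_id = 42
  AND status = 0
  AND attendance_date BETWEEN '2023-01-01' AND '2023-12-31';

-- Approach 2: the JSON array has to be expanded row by row first
SELECT COUNT(*)
FROM attendance_yearly ay,
     JSON_TABLE(ay.days, '$[*]' COLUMNS (s TINYINT PATH '$.s')) AS d
WHERE ay.student_id = 42
  AND ay.year = 2023
  AND d.s = 0;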
So I suggest going with approach 1.
With proper indexes, a good design, and a fast HDD, a database can handle billions of records.
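As a concrete example of what "proper indexes" can mean for approach 1 (names are the assumed ones from the sketch above): the composite primary key already covers a single student's history, and one secondary index covers date-range scans for charting across all students.

-- Covering index for date-range chart queries (assumed names)
ALTER TABLE attendance ADD INDEX idx_date_status (attendance_date, status);

-- A chart query this index serves: daily present/absent counts for a month
SELECT attendance_date,
       SUM(status = 1) AS present,
       SUM(status = 0) AS absent
FROM attendance
WHERE attendance_date BETWEEN '2023-09-01' AND '2023-09-30'
GROUP BY attendance_date;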
Also, you may consider saving historic data in a different schema so that queries on current data are a little faster, but that is just a minor tune-up.
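A minimal sketch of that "historic data in a different schema" idea, assuming an archive schema exists and reusing the hypothetical attendance table from above; the cutoff date is purely illustrative.

-- Assumed: a separate `archive` schema alongside the main one
CREATE TABLE archive.attendance LIKE attendance;

-- Move everything before the current school year into the archive schema
INSERT INTO archive.attendance
SELECT * FROM attendance
WHERE attendance_date < '2023-08-01';

DELETE FROM attendance
WHERE attendance_date < '2023-08-01';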