I have a website with about 50,000 people in its database. To produce my statistics I need to pull data from 3 tables and join them together. More data comes in daily, so the page is getting slower and slower. I was wondering whether it would be possible to create a table that joins all the data I need and then run my PHP off that table. It would be great if the table creation could run hourly, or at some other set interval, so that new data is included. Is this possible and advisable? Can you point me to some resources?
I am using MySQL for the database.
Thanks!
I have 3 tables here: the village level, the resident level, and, for residents who were absent, an absent table with their results.
SELECT EU, SUM(TF) AS TFsum, SUM(TT) AS TTsum, SUM(KID) AS Nkid,
       SUM(ADULT) AS Nadult
FROM
  (SELECT EU, b.name AS Person,
     CASE
       WHEN b.RIGHT_EYE_TF = 1 THEN 1
       WHEN b.LEFT_EYE_TF = 1 THEN 1
       WHEN c.RIGHT_EYE_TF = 1 THEN 1
       WHEN c.LEFT_EYE_TF = 1 THEN 1
       ELSE 0
     END AS TF,
     CASE
       WHEN b.RIGHT_EYE_TT = 1 THEN 1
       WHEN b.LEFT_EYE_TT = 1 THEN 1
       WHEN c.RIGHT_EYE_TT = 1 THEN 1
       WHEN c.LEFT_EYE_TT = 1 THEN 1
       ELSE 0
     END AS TT,
     CASE
       WHEN AGE <= 9 THEN 1
       ELSE 0
     END AS KID,
     CASE
       WHEN AGE >= 15 THEN 1
       ELSE 0
     END AS ADULT
   FROM villagedb a
   LEFT JOIN residentdb b
     ON a.CLUSTER = b.RES_CLUSTER
   LEFT JOIN absentdb c
     ON b.RES_HOUSEHOLD_ID = c.RES_HOUSEHOLD_ID
    AND b.NAME = c.NAME
   GROUP BY EU, b.name
  ) S
GROUP BY EU
3 solutions
#1
0
If you need to compute statistics over a large amount of data, and do so often, the best approach is to denormalize the data into dedicated tables.
In plain English: create new tables and populate them with the data you would otherwise get from the joins; whenever you insert data into your old tables, populate these tables as well. This will speed up your reports significantly. Joins are not fast, especially with a lot of data, so reading from duplicated data is much faster, but you have to work harder to keep the data in sync at all times.
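As a rough sketch of that idea applied to the query in the question, the aggregate can be stored in a summary table and rebuilt hourly by the MySQL event scheduler. The table name `eu_stats`, the event name `refresh_eu_stats`, the `refreshed_at` column, and all column types are assumptions made for illustration. The inner query wraps each flag in `MAX()` so that it also runs with `ONLY_FULL_GROUP_BY` enabled, which is a small change from the original. It also assumes `EU` is never NULL.

```sql
-- Hypothetical summary table: name and column types are assumptions.
CREATE TABLE IF NOT EXISTS eu_stats (
  EU           VARCHAR(64) NOT NULL PRIMARY KEY,
  TFsum        INT NOT NULL DEFAULT 0,
  TTsum        INT NOT NULL DEFAULT 0,
  Nkid         INT NOT NULL DEFAULT 0,
  Nadult       INT NOT NULL DEFAULT 0,
  refreshed_at DATETIME NOT NULL
);

-- Events only fire while the scheduler is running.
-- Setting this needs admin privileges and may already be enabled on your server.
SET GLOBAL event_scheduler = ON;

-- Recompute the statistics once an hour.
-- REPLACE overwrites any existing row that has the same primary key (EU).
CREATE EVENT IF NOT EXISTS refresh_eu_stats
ON SCHEDULE EVERY 1 HOUR
DO
  REPLACE INTO eu_stats (EU, TFsum, TTsum, Nkid, Nadult, refreshed_at)
  SELECT EU, SUM(TF), SUM(TT), SUM(KID), SUM(ADULT), NOW()
  FROM (
    SELECT EU, b.NAME AS Person,
      -- 1 if any eye in either the resident or the absent record is positive
      MAX(CASE WHEN b.RIGHT_EYE_TF = 1 OR b.LEFT_EYE_TF = 1
                 OR c.RIGHT_EYE_TF = 1 OR c.LEFT_EYE_TF = 1
               THEN 1 ELSE 0 END) AS TF,
      MAX(CASE WHEN b.RIGHT_EYE_TT = 1 OR b.LEFT_EYE_TT = 1
                 OR c.RIGHT_EYE_TT = 1 OR c.LEFT_EYE_TT = 1
               THEN 1 ELSE 0 END) AS TT,
      MAX(CASE WHEN AGE <= 9  THEN 1 ELSE 0 END) AS KID,
      MAX(CASE WHEN AGE >= 15 THEN 1 ELSE 0 END) AS ADULT
    FROM villagedb a
    LEFT JOIN residentdb b
      ON a.CLUSTER = b.RES_CLUSTER
    LEFT JOIN absentdb c
      ON b.RES_HOUSEHOLD_ID = c.RES_HOUSEHOLD_ID
     AND b.NAME = c.NAME
    GROUP BY EU, b.NAME
  ) S
  GROUP BY EU;
```

The PHP page would then read from the small table with something like `SELECT EU, TFsum, TTsum, Nkid, Nadult FROM eu_stats`, at the cost of figures that can be up to an hour old. Note that `REPLACE` does not delete rows for an `EU` that disappears from the source tables; if that can happen, the refresh would need an extra cleanup step. A cron job that runs the same statement is an equally reasonable alternative to the event.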
#2
0
Try the following:
- Add indexes
- Normalize tables (I would suggest reading up on normalization. To improve performance, your tables should be at least in the first 3 normal forms, and if you want to normalize further you can use BCNF.)
You can create a table out of two tables that have common fields. Instead of joining the tables, you would have something like a cache table that you select the necessary data from.
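For the "add indexes" suggestion, a minimal sketch might index the columns used in the join conditions of the question's query. The index names are hypothetical, and this assumes `NAME` is a `VARCHAR` column (a `TEXT` column would need a prefix length) and that no equivalent indexes exist yet.

```sql
-- Index names are made up; the column names come from the query in the question.
CREATE INDEX idx_resident_cluster
  ON residentdb (RES_CLUSTER);

CREATE INDEX idx_absent_household_name
  ON absentdb (RES_HOUSEHOLD_ID, NAME);

-- Check whether the optimizer uses the new indexes for the joins.
EXPLAIN
SELECT a.CLUSTER, b.NAME
FROM villagedb a
LEFT JOIN residentdb b
  ON a.CLUSTER = b.RES_CLUSTER
LEFT JOIN absentdb c
  ON b.RES_HOUSEHOLD_ID = c.RES_HOUSEHOLD_ID
 AND b.NAME = c.NAME;
```

In the `EXPLAIN` output, the `key` column shows which index, if any, is chosen for each table. If it stays NULL for `residentdb` or `absentdb`, the join is still scanning the whole table.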
#3
0
I see 2 issues here:
- the first one, already pointed out: optimize the query and the db (indexes, denormalization, views, stored procedures, etc.)
- the second: display limited results, like the top 100 only (pagination, let's say)
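The second point can be sketched with `LIMIT` and `OFFSET`. This assumes a page that lists individual residents, which the question does not describe, so treat the column selection as illustrative only. It does not help the per-`EU` totals, which still have to read every row.

```sql
-- Page 1 of a hypothetical resident listing, 100 rows per page.
SELECT b.RES_CLUSTER, b.NAME, b.RIGHT_EYE_TF, b.LEFT_EYE_TF
FROM residentdb b
ORDER BY b.RES_CLUSTER, b.NAME
LIMIT 100 OFFSET 0;

-- Page 2 uses OFFSET 100, page 3 uses OFFSET 200, and so on.
SELECT b.RES_CLUSTER, b.NAME, b.RIGHT_EYE_TF, b.LEFT_EYE_TF
FROM residentdb b
ORDER BY b.RES_CLUSTER, b.NAME
LIMIT 100 OFFSET 100;
```

A stable `ORDER BY` matters here, because without it rows can move between pages. As for views, MySQL does not materialize them, so a view over the original join tidies the query text but does not by itself make it faster.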