I have a video website, and a few of its tables are:
tags
id ~ int(11), auto-increment [PRIMARY KEY]
tag_name ~ varchar(255)
videotags
tag_id ~ int(11) [PRIMARY KEY]
video_id ~ int(11) [PRIMARY KEY]
videos
id ~ int(11), auto-increment [PRIMARY KEY]
video_name ~ varchar(255)
At this point the tags table has more than 1,000 rows and the videotags table has more than 32,000 rows. When I run a query to display all tags from most common to least common, it takes more than 15 seconds to execute.
I am using PHP, and my code (simplified) is as follows:
foreach ($database->query("SELECT tag_name,COUNT(tag_id) AS 'tag_count' FROM tags LEFT OUTER JOIN videotags ON tags.id=videotags.tag_id GROUP BY tags.id ORDER BY tag_count DESC") as $tags)
{
echo $tags["tag_name"] . ', ';
}
Keep in mind that speed matters more to me than 100% accuracy. Even if the query ran only once a day and its results were reused for the rest of the day, I wouldn't mind.
I know absolutely nothing about MySQL/PHP caching so please help!
4 Answers
#1
3
MarkR mentioned the index. Make sure you create it:
create index videotags_tag_id on videotags(tag_id);
#2
2
32,000 rows is still a small table - there's no way your performance should be that bad.
Can you run EXPLAIN on your query? I'd guess your indexes are wrong somewhere.
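For reference, a minimal sketch of what running EXPLAIN on the query from the question could look like. The table and column names come from the question; the quoted alias has been unquoted here for readability, which is my own change.

```sql
-- Prefix the query with EXPLAIN to see the execution plan instead of the rows.
EXPLAIN
SELECT tag_name, COUNT(tag_id) AS tag_count
FROM tags
LEFT OUTER JOIN videotags ON tags.id = videotags.tag_id
GROUP BY tags.id
ORDER BY tag_count DESC;
```

In the output, the row for videotags is the one to look at. If its `type` column says `ALL` and its `key` column is `NULL`, MySQL is scanning the whole table for every tag, which at roughly 1,000 tags times 32,000 rows would go a long way towards explaining the slowness.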
You say in the question:
tag_id ~ int(11) [PRIMARY KEY]
video_id ~ int(11) [PRIMARY KEY]
Are they definitely in that order? If video_id comes first in the primary key, the join on tag_id won't be able to use that index efficiently.
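One way to check the column order, sketched under the assumption that the table is named videotags as in the question:

```sql
-- Lists one row per indexed column; Seq_in_index is the column's
-- position inside its index (1 = leftmost).
SHOW INDEX FROM videotags;

-- If video_id has Seq_in_index = 1 in PRIMARY, an extra index that
-- leads with tag_id (the same one suggested in answer #1) lets the
-- join look rows up by tag_id.
CREATE INDEX videotags_tag_id ON videotags (tag_id);
```

If tag_id already shows up with Seq_in_index = 1 in the primary key, the extra index should not be necessary.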
#3
0
I think your best bet is to create some kind of summary table which you maintain when things change.
The query above has to scan every row in the table to compute the aggregates for the GROUP BY, because there is no WHERE clause. A query with no WHERE clause has little hope of optimisation, as it necessarily has to check every row.
The fix is to create a summary table holding the same data as the result of that query (or similar), which you will have to refresh whenever the underlying data changes, or changes significantly.
Only you can decide, based on the nature of your application and your data, whether it's appropriate to update the summary table on a scheduled basis, on each update, or some combination.
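A rough sketch of what such a summary table might look like. The name tag_counts and its columns are hypothetical and chosen only for illustration; the refresh statement is the question's query with the results stored rather than displayed.

```sql
-- Hypothetical summary table: one row per tag with a precomputed count.
CREATE TABLE tag_counts (
    tag_id    INT NOT NULL PRIMARY KEY,
    tag_name  VARCHAR(255) NOT NULL,
    tag_count INT NOT NULL,
    KEY tag_counts_by_count (tag_count)
);

-- Refresh step: run on a schedule (for example once a day) or after
-- changes. REPLACE overwrites the existing row for each tag_id.
REPLACE INTO tag_counts (tag_id, tag_name, tag_count)
SELECT tags.id, tags.tag_name, COUNT(videotags.tag_id)
FROM tags
LEFT OUTER JOIN videotags ON tags.id = videotags.tag_id
GROUP BY tags.id, tags.tag_name;

-- The page then reads the precomputed rows, with no join or GROUP BY.
SELECT tag_name
FROM tag_counts
ORDER BY tag_count DESC;
```

Note that this refresh does not remove rows for tags that have since been deleted from tags; if that matters, the table could be emptied before refilling it.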
As you're doing a join, the right indexes are still beneficial, but you knew that already and have done it, right?
#4
0
Are you using InnoDB or MyISAM? In MyISAM an unfiltered COUNT(*) over a whole table is basically free because the row count is stored, but in InnoDB it has to physically count the rows.
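If you aren't sure which engine the tables use, something like the following should show it (table names taken from the question):

```sql
-- The Engine column of the output reports InnoDB or MyISAM per table.
SHOW TABLE STATUS WHERE Name IN ('tags', 'videotags');
```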