Optimizing a MySQL query with INNER JOIN and WHERE

Date: 2022-06-11 04:14:59

I have this query:

SELECT DISTINCT h.id,
                h.host
FROM pozycje p
INNER JOIN hosty h ON p.host_id = h.id
INNER JOIN keywordy k ON k.id=p.key_id
AND k.bing=0
WHERE h.archive_data_checked IS NULL LIMIT 20

It's fast when matching rows exist, but when there are no results it takes 2-3 seconds to execute. I would like it to run in under 1 second. The EXPLAIN output looks like this:

http://tinyurl.com/gogx42n

Table pozycje has 30,000,000 rows, hosty has 4,000,000 rows, and keywordy has 40,000 rows. The engine is InnoDB, on a server with 32 GB of RAM.

What indexes or other improvements can I add to speed up the query when no results exist?

Edit:

SHOW CREATE TABLE keywordy;

 CREATE TABLE `keywordy` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `main_kw` varchar(255) CHARACTER SET utf8 NOT NULL,
 `keyword` varchar(255) CHARACTER SET utf8 NOT NULL,
 `lang` varchar(10) CHARACTER SET utf8 NOT NULL,
 `searches` int(11) NOT NULL,
 `cpc` float NOT NULL,
 `competition` float NOT NULL,
 `currency` varchar(10) CHARACTER SET utf8 NOT NULL,
 `data` date DEFAULT NULL,
 `adwords` int(11) NOT NULL,
 `monitoring` tinyint(1) NOT NULL DEFAULT '0',
 `bing` tinyint(1) NOT NULL DEFAULT '0',
 PRIMARY KEY (`id`),
 UNIQUE KEY `keyword` (`keyword`,`lang`),
 KEY `id_bing` (`id`,`bing`)
) ENGINE=InnoDB AUTO_INCREMENT=38362 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

2 answers

#1

Can you please test this:

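-- The keywordy check is nested inside the pozycje subquery so that
-- p.key_id is in scope; pozycje links each host to its keywords.
SELECT DISTINCT h.id,
                h.host
FROM hosty h
WHERE h.archive_data_checked IS NULL
  AND EXISTS ( SELECT 1
               FROM pozycje p
               WHERE p.host_id = h.id
                 AND EXISTS ( SELECT 1 FROM keywordy k WHERE k.id = p.key_id AND k.bing = 0 ) )
LIMIT 20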

#2

I would first ask the following question: which of these two queries would return the smaller set?

select count(*) from keywordy where bing = 0
vs
select count(*) from hosty where archive_data_checked IS NULL

I would then optimize the query around whichever set is smaller, and use that as the primary criterion for indexing. If keywordy is more likely to be the smaller set, I would suggest the following indexes on your tables:

table       index
keywordy    (bing, id)                          specifically NOT (id, bing): with bing first, the index can serve the bing = 0 filter in the WHERE or JOIN clause
pozycje     (key_id, host_id)
hosty       (archive_data_checked, id, host)
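
As a rough sketch, those three indexes could be created as shown here. The index names are made up for illustration, and this assumes hosty.host is short enough to be indexed in full; the question does not show the hosty or pozycje definitions, so check the column types first.

```sql
-- Index names are hypothetical; adjust to your own naming scheme.

-- bing first, so the index can be used to find all rows with bing = 0
ALTER TABLE keywordy ADD INDEX idx_bing_id (bing, id);

-- key_id first, so rows can be looked up per keyword; host_id makes it covering
ALTER TABLE pozycje ADD INDEX idx_key_host (key_id, host_id);

-- Assumes the host column fits within the index key length limit
ALTER TABLE hosty ADD INDEX idx_archive_id_host (archive_data_checked, id, host);
```

Building an index on a table the size of pozycje (30,000,000 rows) can take a while, so it is worth trying this on a copy or during a quiet period.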

SELECT DISTINCT 
      h.id,
      h.host
   FROM 
      keywordy k
         JOIN pozycje p
            ON k.id = p.key_id
            JOIN hosty h
               ON h.archive_data_checked IS NULL
              AND p.host_id = h.id
   WHERE
      k.bing = 0
   LIMIT 
      20

If instead the set of hosty rows with archive_data_checked IS NULL would be the smaller one, I offer the following:

table       index
pozycje     (host_id, key_id)    column order reversed from the other option
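
A minimal sketch of creating that index and then checking whether the optimizer picks it up. The index name idx_host_key is hypothetical, and the EXPLAIN simply wraps the hosty-first form of the query from this answer.

```sql
-- Index name is hypothetical.
ALTER TABLE pozycje ADD INDEX idx_host_key (host_id, key_id);

-- Inspect the plan: look at the "key" column for pozycje to see which index is chosen.
EXPLAIN
SELECT DISTINCT h.id, h.host
FROM hosty h
JOIN pozycje p ON h.id = p.host_id
JOIN keywordy k ON k.bing = 0 AND p.key_id = k.id
WHERE h.archive_data_checked IS NULL
LIMIT 20;
```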

SELECT DISTINCT 
      h.id,
      h.host
   FROM 
      hosty h 
         JOIN pozycje p
            ON h.id = p.host_id
            JOIN keywordy k
               ON k.bing = 0
              AND p.key_id = k.id
   WHERE 
      h.archive_data_checked IS NULL 
   LIMIT 
      20

One final option might be to add the STRAIGHT_JOIN keyword, such as:

select STRAIGHT_JOIN DISTINCT ... rest of query
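
Spelled out in full, that might look like the sketch below, here applied to the keywordy-first form of the query. STRAIGHT_JOIN makes MySQL join the tables in the order they are listed in the FROM clause instead of letting the optimizer reorder them, so which table to list first is still the judgment call described above.

```sql
-- Forces the join order keywordy -> pozycje -> hosty.
-- Swap the table order if hosty turns out to be the smaller set.
SELECT DISTINCT STRAIGHT_JOIN
      h.id,
      h.host
   FROM
      keywordy k
         JOIN pozycje p
            ON k.id = p.key_id
            JOIN hosty h
               ON h.archive_data_checked IS NULL
              AND p.host_id = h.id
   WHERE
      k.bing = 0
   LIMIT
      20;
```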

If it works for you, what timing improvement does it offer?
