Is there a way to improve this query that uses IF?

Time: 2021-04-01 03:53:44

I use this query to select a language string from a database containing strings in many languages. The table looks like this:

`string_id`   BIGINT
`language_id` BIGINT
`datetime`    DATETIME
`text`        TEXT

For example, the data can look like this:

`string_id` | `language_id` | `datetime`          | `text`
1           | 1             | 2014.04.22 14:43:00 | hello world
1           | 2             | 2014.04.22 14:43:02 | hallo welt

So this is the same string in German and English. The German one was changed two seconds after the English one.

I use this (sub)query to get the matching string. It automatically falls back to any language if the requested language does not exist. For example, this query would fall back to English or German if I request the string in Spanish (= id 3):

SELECT
    z.`text`
FROM
    `language_strings` AS z
WHERE
    a.`joined_string_id` = z.`string_id` 
ORDER BY
    IF(z.`language_id` = 3, 1, 0) DESC,
    z.`datetime` DESC
LIMIT
    1
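To make the fallback behaviour concrete, here is a minimal sketch that runs the same ordering trick against the two sample rows above. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so that is an assumption, as is the helper name `lookup`. SQLite evaluates `(language_id = ?)` to 1 or 0, which plays the role of `IF(..., 1, 0)`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language_strings (
    string_id   INTEGER,
    language_id INTEGER,
    datetime    TEXT,
    text        TEXT
);
INSERT INTO language_strings VALUES
    (1, 1, '2014-04-22 14:43:00', 'hello world'),
    (1, 2, '2014-04-22 14:43:02', 'hallo welt');
""")


def lookup(string_id, language_id):
    """Return the text in the requested language, else the newest text in any language."""
    row = conn.execute(
        """
        SELECT z.text
        FROM language_strings AS z
        WHERE z.string_id = ?
        ORDER BY (z.language_id = ?) DESC,  -- rows in the requested language first
                 z.datetime DESC            -- then newest first
        LIMIT 1
        """,
        (string_id, language_id),
    ).fetchone()
    return row[0] if row else None


print(lookup(1, 1))  # English exists
print(lookup(1, 3))  # Spanish is missing, so the newest row of any language is used
```

With only these two rows, a request for language 3 returns the German text because it has the later timestamp.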

The performance issue here is that ordering by IF(..., 1, 0) removes a lot of optimization opportunities, because the result has to be calculated every time the query is executed.

I have tried a lot to improve this query, and all useful indexes have already been created. MySQL can serve this query from its internal cache, but without the cache it takes some time to calculate. This becomes a performance issue when fetching a lot of rows (e.g. 1000), because MySQL has to run 1000 subqueries.

Do you have an idea how to improve this query? Adding new columns to store precalculated data would be an option for me.

4 solutions

#1


1  

(SELECT
    1 AS ord, z.`text`
FROM
    `language_strings` AS z
WHERE
    a.`joined_string_id` = z.`string_id` AND z.`language_id` = 3
ORDER BY
    z.`datetime` DESC
LIMIT 1)
UNION ALL
(SELECT
    2 AS ord, z.`text`
FROM
    `language_strings` AS z
WHERE
    a.`joined_string_id` = z.`string_id`
ORDER BY
    z.`datetime` DESC
LIMIT 1)
ORDER BY ord
LIMIT 1

Updated. Twinkles, thank you for the note.
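As a rough check of the UNION ALL idea, the sketch below runs it on sample data with Python's sqlite3 module standing in for MySQL. The helper name `lookup_union` and the extra rows for string 2 are assumptions made for illustration. SQLite does not accept parenthesised SELECTs with their own ORDER BY and LIMIT directly inside a UNION, so each branch is wrapped in `SELECT * FROM (...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language_strings (
    string_id INTEGER, language_id INTEGER, datetime TEXT, text TEXT
);
INSERT INTO language_strings VALUES
    (1, 1, '2014-04-22 14:43:00', 'hello world'),
    (1, 2, '2014-04-22 14:43:02', 'hallo welt'),
    (2, 3, '2014-04-22 14:00:00', 'hola mundo'),   -- hypothetical extra rows
    (2, 1, '2014-04-22 15:00:00', 'hello again');
""")


def lookup_union(string_id, language_id):
    """Preferred-language branch (ord = 1) wins over the any-language branch (ord = 2)."""
    row = conn.execute(
        """
        SELECT text FROM (
            SELECT * FROM (
                SELECT 1 AS ord, z.text
                FROM language_strings AS z
                WHERE z.string_id = :sid AND z.language_id = :lang
                ORDER BY z.datetime DESC
                LIMIT 1
            )
            UNION ALL
            SELECT * FROM (
                SELECT 2 AS ord, z.text
                FROM language_strings AS z
                WHERE z.string_id = :sid
                ORDER BY z.datetime DESC
                LIMIT 1
            )
        )
        ORDER BY ord
        LIMIT 1
        """,
        {"sid": string_id, "lang": language_id},
    ).fetchone()
    return row[0] if row else None


print(lookup_union(2, 3))  # Spanish row exists and wins even though English is newer
print(lookup_union(1, 3))  # no Spanish row, newest row of any language is used
```

Neither branch sorts on a computed expression, which is the point of splitting the query in two.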

#2


1  

SELECT COALESCE(pref.`text`, fallback.`text`)
FROM `language_strings` AS fallback
-- LEFT JOIN keeps every row of the string; pref is NULL when language 3 is missing
LEFT JOIN `language_strings` AS pref
  ON pref.`string_id` = fallback.`string_id`
 AND pref.`language_id` = 3
WHERE fallback.`string_id` = a.`joined_string_id`
ORDER BY pref.`datetime` DESC, fallback.`datetime` DESC
LIMIT 1

#3


1  

This appears to be a correlated subquery, which would be quite inefficient assuming there are a fair number of rows in table a. It might be better to recode this as joined subqueries.

Maybe as follows:

SELECT a.*, IFNULL(ls1.`text`, ls2.`text`)
FROM some_table a
LEFT OUTER JOIN 
(
    SELECT string_id, MAX(datetime) AS MaxDateTime
    FROM language_strings
    WHERE language_id = 3
    GROUP BY string_id
) AS MainLanguage1
ON a.joined_string_id = MainLanguage1.string_id
LEFT OUTER JOIN language_strings ls1
ON MainLanguage1.string_id = ls1.string_id AND MainLanguage1.MaxDateTime = ls1.datetime AND ls1.language_id = 3
LEFT OUTER JOIN 
(
    SELECT string_id, MAX(datetime) AS MaxDateTime
    FROM language_strings
    WHERE language_id != 3
    GROUP BY string_id
) AS MainLanguage2
ON a.joined_string_id = MainLanguage2.string_id
LEFT OUTER JOIN language_strings ls2
ON MainLanguage2.string_id = ls2.string_id AND MainLanguage2.MaxDateTime = ls2.datetime AND ls2.language_id != 3

This gets the latest date for a string_id where the language is 3, then joins to get the matching text to go with it. It also gets the latest date for a string_id where the language is not 3, then joins to get the matching text to go with that.

The returned text is then selected with IFNULL: the text for language 3 if it exists, otherwise the text for a language other than 3.
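The joined form described above can be tried on a small data set. The following is a sketch only, using Python's sqlite3 module in place of MySQL. The contents of `some_table`, the extra rows for string 2, and the short aliases `m1` and `m2` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language_strings (
    string_id INTEGER, language_id INTEGER, datetime TEXT, text TEXT
);
CREATE TABLE some_table (id INTEGER, joined_string_id INTEGER);
INSERT INTO language_strings VALUES
    (1, 1, '2014-04-22 14:43:00', 'hello world'),
    (1, 2, '2014-04-22 14:43:02', 'hallo welt'),
    (2, 1, '2014-04-22 15:00:00', 'good morning'),  -- hypothetical extra rows
    (2, 3, '2014-04-22 14:00:00', 'buenos dias');
INSERT INTO some_table VALUES (10, 1), (11, 2), (12, 99);
""")

# One statement resolves the text for every row of some_table,
# instead of running one correlated subquery per row.
rows = conn.execute(
    """
    SELECT a.id, IFNULL(ls1.text, ls2.text) AS chosen_text
    FROM some_table AS a
    LEFT JOIN (
        SELECT string_id, MAX(datetime) AS max_dt
        FROM language_strings
        WHERE language_id = :lang
        GROUP BY string_id
    ) AS m1 ON a.joined_string_id = m1.string_id
    LEFT JOIN language_strings AS ls1
        ON ls1.string_id = m1.string_id
       AND ls1.datetime = m1.max_dt
       AND ls1.language_id = :lang
    LEFT JOIN (
        SELECT string_id, MAX(datetime) AS max_dt
        FROM language_strings
        WHERE language_id != :lang
        GROUP BY string_id
    ) AS m2 ON a.joined_string_id = m2.string_id
    LEFT JOIN language_strings AS ls2
        ON ls2.string_id = m2.string_id
       AND ls2.datetime = m2.max_dt
       AND ls2.language_id != :lang
    ORDER BY a.id
    """,
    {"lang": 3},
).fetchall()

for row in rows:
    print(row)
```

Row 10 has no Spanish text and gets the newest other language, row 11 gets its Spanish text, and row 12 points at an unknown string and gets NULL. If two rows of the same string share the exact same latest timestamp, this form can return duplicate rows.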

#4


0  

After testing all the posted solutions and getting a bit of a headache from their complexity, I thought there must be a better way to do this. Inspired by the COALESCE from @Twinkles, which I didn't know before, I decided to try the same code using another, "temporary" table that definitely contains every possible solution.

This little query generates that table and guarantees that there is definitely an entry for every language:

INSERT INTO
    `language_strings_compiled`
(
    `string_id`,
    `language_id`,
    `text`
)
SELECT
    a.`string_id`,
    b.`language_id`,
    (
        SELECT
            z.`text`
        FROM
            `language_strings` AS z
        WHERE
            a.`string_id` = z.`string_id`
        ORDER BY
            IF(z.`language_id` = b.`language_id`, 1, 0) DESC,
            z.`datetime` DESC
        LIMIT 1
    ) AS `text`
FROM
    `language_strings` AS a
JOIN
    `languages` AS b
GROUP BY
    a.`string_id`,
    b.`language_id`

And then my subquery can look like this:

COALESCE
(
    (
        SELECT
            z.`text`
        FROM
            `language_strings_compiled` AS z
        WHERE
            a.`joined_string_id` = z.`string_id`
        AND
            z.`language_id` = 3
        LIMIT
            1
    ),
    (
        SELECT
            z.`text`
        FROM
            `language_strings` AS z
        WHERE
            a.`joined_string_id` = z.`string_id`
        ORDER BY
            IF(z.`language_id` = 3, 1, 0) DESC,
            z.`datetime` DESC
        LIMIT
            1
    )
)

This solution is 10 times faster than the solution without the "compiled" table. It can also fall back to the "old" solution if there are new language strings that the compiled table does not know about at all.

Thanks for all the solutions. I tried them all, but so far I ran into the "sub-sub-query" problem every time.
