Optimizing an SQL query that uses REGEXP in a join

Date: 2022-12-16 15:44:49

I have the following situation:

Table Words:

| ID |   WORD |
|----|--------|
|  1 |     us |
|  2 |     to |
|  3 | belong |
|  4 |    are |
|  5 |   base |
|  6 |   your |
|  7 |    all |
|  8 |     is |
|  9 |  yours |

Table Sentence:

| ID |                                  SENTENCE |
|----|-------------------------------------------|
|  1 | <<7>> <<6>> <<5>> <<4>> <<3>> <<2>> <<1>> |
|  2 |                         <<7>> <<8>> <<9>> |

I want to replace each <<(\d)>> placeholder with the corresponding word from the Words table.

So the result should be

| ID |                       SENTENCE |
|----|--------------------------------|
|  1 | all your base are belong to us |
|  2 |                   all is yours |

What I came up with is the following SQL:

SELECT id, GROUP_CONCAT(word ORDER BY pos SEPARATOR ' ') AS sentence FROM (
    SELECT sentence.id, words.word, LOCATE(words.id, sentence.sentence) AS pos
    FROM sentence
    LEFT JOIN words
    ON (sentence.sentence REGEXP CONCAT('<<',words.id,'>>'))
    ) AS TEMP
GROUP BY id

I made a sqlfiddle for this:

http://sqlfiddle.com/#!2/634b8/4

The code basically works, but I'd like to ask you pros whether there is a way to do this without a derived table or without a filesort in the execution plan.

1 Answer

#1


3  

You should create a table with one row per word occurrence, so that your sentense (sic) can be built by joining on that table. It would look something like this:

SentenceId, wordId, location
2,          7,       1
2,          8,       2
2,          9,       3

The way you have it set up, you are not taking advantage of your database; you are basically storing several data points in a single table field.
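
If the existing rows need to be converted into that layout, something along these lines might work as a one-off migration step. It is only a sketch in Python: the helper name `split_sentence` is made up for illustration, the sample data is copied from the question, and the pattern uses `\d+` rather than `\d` on the assumption that word IDs may eventually have more than one digit.

```python
import re

# Sample rows copied from the Sentence table in the question.
sentences = {
    1: "<<7>> <<6>> <<5>> <<4>> <<3>> <<2>> <<1>>",
    2: "<<7>> <<8>> <<9>>",
}

# Assumption: every placeholder has the form <<N>> with N a decimal word ID.
PLACEHOLDER = re.compile(r"<<(\d+)>>")


def split_sentence(sentence_id, text):
    """Return one (sentence_id, word_id, location) tuple per placeholder.

    Hypothetical helper: location is the 1-based position of the
    placeholder within the sentence.
    """
    return [
        (sentence_id, int(word_id), location)
        for location, word_id in enumerate(PLACEHOLDER.findall(text), start=1)
    ]


# Flatten all sentences into rows for the new table.
rows = [
    row
    for sentence_id, text in sorted(sentences.items())
    for row in split_sentence(sentence_id, text)
]

for row in rows:
    print(row)
```

The tuples produced for sentence 2 correspond to the three sample rows listed above; they could then be inserted into the new table with whatever client you normally use.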

The location field (it is tempting to call it "order", but since ORDER is an SQL keyword, don't do it, you'll hate yourself) can be used to sort the words of the sentence.
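
To illustrate, here is a rough sketch of how the join could look with such a table. It uses Python's built-in `sqlite3` module purely so that it can be run without a server; the table name `sentence_words` and its column names are assumptions, not something from the question, and the final string is assembled in Python. In MySQL the same join could presumably be aggregated with `GROUP_CONCAT(word ORDER BY location SEPARATOR ' ')` as in the original query.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per word occurrence in a sentence.
conn.executescript("""
    CREATE TABLE words (
        id   INTEGER PRIMARY KEY,
        word TEXT NOT NULL
    );
    CREATE TABLE sentence_words (
        sentence_id INTEGER NOT NULL,
        word_id     INTEGER NOT NULL REFERENCES words(id),
        location    INTEGER NOT NULL,
        PRIMARY KEY (sentence_id, location)
    );
""")

# Word list copied from the question.
conn.executemany(
    "INSERT INTO words (id, word) VALUES (?, ?)",
    [(1, "us"), (2, "to"), (3, "belong"), (4, "are"), (5, "base"),
     (6, "your"), (7, "all"), (8, "is"), (9, "yours")],
)

# The two sentences from the question, one row per placeholder.
conn.executemany(
    "INSERT INTO sentence_words (sentence_id, word_id, location) VALUES (?, ?, ?)",
    [(1, 7, 1), (1, 6, 2), (1, 5, 3), (1, 4, 4), (1, 3, 5), (1, 2, 6), (1, 1, 7),
     (2, 7, 1), (2, 8, 2), (2, 9, 3)],
)

# A plain equality join replaces the REGEXP join; location gives the word order.
rows = conn.execute("""
    SELECT sw.sentence_id, w.word
    FROM sentence_words AS sw
    JOIN words AS w ON w.id = sw.word_id
    ORDER BY sw.sentence_id, sw.location
""").fetchall()

# Rows arrive sorted by sentence_id, so groupby sees each sentence once.
result = {
    sentence_id: " ".join(word for _, word in group)
    for sentence_id, group in groupby(rows, key=lambda row: row[0])
}

print(result)
```

Because the join condition is a simple equality on IDs, the database can use ordinary indexes for it instead of evaluating a regular expression for every pair of rows. Whether the sort disappears from the execution plan will depend on your indexes and MySQL version, so check it with EXPLAIN.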

(and you might want to rename sentense to sentence?)
