Speeding up my MySQL query with REGEXP

Date: 2021-12-24 12:07:45

I have written an SQL query to compare data:

SELECT 
    title
FROM
    links_forum
WHERE
    title REGEXP '[[:<:]]The grey[[:>:]].*[[:<:]](cam|ts|divx|mkv|xvid|dvd|dvdr|dvdrip|brrip|br2dvd|r5|r6|x264|ts2dvd|dvd5|dvd9|720p|1080p)[[:>:]]';

It currently takes about 150 seconds to execute.

Is there a faster way?

2 answers

#1


1  

Searching with the REGEXP operator prevents MySQL from using an index on the title column to seek directly to the matching rows, so every row has to be examined. You may still get some benefit from an index on title: because you are retrieving only that one column, the server can scan the smaller index instead of the full table rows. Try that if you haven't done so already.
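A minimal sketch of that index suggestion follows. The index name idx_links_forum_title is made up for illustration, and the sketch assumes title is a VARCHAR short enough to be indexed in full. If title is a TEXT column, MySQL requires a prefix length on the index, and a prefix index cannot answer the query on its own.

```sql
-- Hypothetical index name; assumes title is a VARCHAR that can be indexed in full.
CREATE INDEX idx_links_forum_title ON links_forum (title);

-- Inspect the plan. "Using index" in the Extra column would suggest the query
-- is answered from the index alone instead of reading the table rows.
EXPLAIN
SELECT title
  FROM links_forum
 WHERE title REGEXP '[[:<:]]The grey[[:>:]]';
```

Every index entry is still tested against the pattern, so this reduces I/O rather than the number of regular expression evaluations.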

REGEXP is also not the fastest of text-matching algorithms. You might benefit from trying:

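   SELECT title
     FROM links_forum
    WHERE title LIKE '%The grey%'
      AND title REGEXP '[[:<:]]The grey[[:>:]].*[[:<:]](cam|ts|divx|mkv|xvid|dvd|dvdr|dvdrip|brrip|br2dvd|r5|r6|x264|ts2dvd|dvd5|dvd9|720p|1080p)[[:>:]]';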

This still cannot use an index to seek to the matching rows, but the LIKE test checks the title column more quickly for what I assume is the most important part of your search term, so the slower REGEXP typically only has to run on the rows that pass it. LIKE is a little faster than REGEXP.

Another choice: instead of searching a big blob of text for something that appears to be a video media title, you might consider creating a separate table that holds the video media titles.

Specifically, you might create a links_forum_media_title table with links_forum_id and title columns. Whenever you insert an entry into links_forum, you also insert a matching entry into this table. You can then create a (title, links_forum_id) index on that table and use it to look up your titles. That is a programming change, but it will solve your performance problem definitively, which matters if you are planning to scale this application up.
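The side table described above could be sketched roughly as follows. The id column on links_forum, the column types and lengths, the index name, and the sample values are all assumptions made for illustration; adjust them to the real schema.

```sql
-- Side table holding one clean media title per links_forum row.
-- Types and lengths are guesses; links_forum.id is assumed to be the primary key.
CREATE TABLE links_forum_media_title (
    links_forum_id BIGINT UNSIGNED NOT NULL,
    title          VARCHAR(150)    NOT NULL,
    KEY idx_title_forum (title, links_forum_id)
);

-- Written alongside each insert into links_forum (placeholder values).
INSERT INTO links_forum_media_title (links_forum_id, title)
VALUES (12345, 'The Grey');

-- An equality predicate (or a prefix pattern such as LIKE 'The Grey%')
-- lets MySQL seek on the (title, links_forum_id) index.
SELECT lf.title
  FROM links_forum_media_title AS m
  JOIN links_forum AS lf ON lf.id = m.links_forum_id
 WHERE m.title = 'The Grey';
```

The point of the design is that the expensive parsing happens once, at insert time, so the lookup becomes an ordinary indexed comparison.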

You could try FULLTEXT searching, but it is not a great fit for this kind of application, where you are looking for short media format codes (cam, ts, divx).
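If you want to experiment with it anyway, one possible sketch is to use FULLTEXT only as a prefilter on the word "grey" and leave the format codes to REGEXP. Two-character codes such as ts or r5 fall below the default minimum indexed word length (ft_min_word_len defaults to 4 for MyISAM, innodb_ft_min_token_size to 3 for InnoDB), and "the" is normally a stopword, which is why neither is used in the MATCH clause. The index name below is hypothetical.

```sql
-- Hypothetical index name. InnoDB supports FULLTEXT from MySQL 5.6 onward;
-- older servers need a MyISAM table.
ALTER TABLE links_forum ADD FULLTEXT INDEX ft_links_forum_title (title);

-- FULLTEXT narrows the candidates to titles containing the word "grey";
-- the original regular expression then checks the phrase and the format code.
SELECT title
  FROM links_forum
 WHERE MATCH(title) AGAINST('+grey' IN BOOLEAN MODE)
   AND title REGEXP '[[:<:]]The grey[[:>:]].*[[:<:]](cam|ts|divx|mkv|xvid|dvd|dvdr|dvdrip|brrip|br2dvd|r5|r6|x264|ts2dvd|dvd5|dvd9|720p|1080p)[[:>:]]';
```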

#2


0  

I believe this query can only be optimized by a small margin. What you may need instead is to restructure your database so that there is some other way to know whether a given entry is one of the specified types. For instance, add a field that records whether a given entry is a video, or something similar.
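One way to read that suggestion is a flag column that is filled in once and then filtered on. This is only a sketch: the column name is_video and the index name are invented here, and the composite index assumes title is a VARCHAR that can be indexed in full.

```sql
-- Hypothetical column and index names.
ALTER TABLE links_forum
  ADD COLUMN is_video TINYINT(1) NOT NULL DEFAULT 0,
  ADD INDEX idx_is_video_title (is_video, title);

-- One-time backfill: pay the REGEXP cost once instead of on every search.
UPDATE links_forum
   SET is_video = 1
 WHERE title REGEXP '[[:<:]](cam|ts|divx|mkv|xvid|dvd|dvdr|dvdrip|brrip|br2dvd|r5|r6|x264|ts2dvd|dvd5|dvd9|720p|1080p)[[:>:]]';

-- Later searches only look at rows already flagged as video.
SELECT title
  FROM links_forum
 WHERE is_video = 1
   AND title LIKE '%The grey%';
```

Note that the flag only records that a format code appears somewhere in the title, not that it follows "The grey" as the original pattern requires, and new rows would need the flag set when they are inserted.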
