How can I optimize this MySQL table with indexes for search speed?

Date: 2021-10-12 03:56:46

As a MySQL amateur, I would like to ask for some advice regarding table optimization and the use of indexes.

Consider a table containing advertisements posted by users. The table has the following structure (this is a Laravel implementation but I think the code is quite self-explanatory):

Schema::create('advertisements', function (Blueprint $table) {
    $table->increments('id');     // PRIMARY KEY, AUTO_INCREMENT
    $table->text('images');       // TEXT
    $table->string('name', 150);  // VARCHAR(150)
    $table->string('slug');       // VARCHAR(255)
    $table->text('description');
    $table->string('offer_type', 7)->nullable()->index();
    $table->float('price')->nullable();
    $table->string('deal_type')->nullable()->index();
    $table->char('price_period', 1)->nullable()->index();
    $table->float('price_per_day')->nullable();
    $table->float('deposit')->nullable();

    $table->integer('category_id')->unsigned()->index();
    $table->foreign('category_id')->references('id')->on('categories');

    $table->integer('author_id')->unsigned()->nullable();
    $table->foreign('author_id')->references('id')->on('users');

    $table->timestamps();
});

Users on a website can search for advertisements in the table above using several criteria, such as: price range, offer_type, price_period or deal_type.

As you can see I have indexed the offer_type, price_period and deal_type columns. From what I understand this causes the DB to create a BTREE index of the values within these columns.

However, these values will always come from a pre-defined set. For example, price_period is always one of: NULL, h, d, w, m, y (hour, day, week, month, year). The deal_type column is always either offer or demand.

Question: If I have a set of columns that will only contain values from a small, pre-defined set, is it better (performance-wise) to create a separate table for them and use foreign keys rather than indexing the columns? EDIT: After further research I now realize that foreign keys are just a referential tool, not a performance one, and that they can (and should) be indexed as well. But does an indexed foreign key, which is a number, perform better than an indexed short string?

1 Answer

#1


Indexing flags and other low-cardinality columns is usually useless. For example, if half the table has a certain value for a flag, it is faster to ignore the index on that flag and simply scan the entire table.
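One way to see how selective a column really is would be to count the rows per distinct value. The following is only a rough sketch using Laravel's query builder; it assumes the advertisements table from the question and picks deal_type as an arbitrary example column.

```php
<?php

use Illuminate\Support\Facades\DB;

// Count how many rows share each distinct deal_type value.
// If one value covers a large share of the table, a single-column
// index on deal_type is unlikely to help much.
$distribution = DB::table('advertisements')
    ->select('deal_type', DB::raw('COUNT(*) AS total'))
    ->groupBy('deal_type')
    ->orderBy('total', 'desc')
    ->get();

foreach ($distribution as $row) {
    // NULL values form their own group, so label them explicitly.
    echo ($row->deal_type ?? 'NULL') . ': ' . $row->total . PHP_EOL;
}
```

The same check can be repeated for offer_type and price_period.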

We really need to see the queries in order to judge what indexes are needed. Based on your hints, I will make a stab anyway...

"such as: price range, offer_type, price_period or deal_type" -- I assume the user will give a min and max price? Then let's build a "composite" index ending with price_per_day. Will they always be specifying all of the other three columns? And a single value for each column? If yes to all of the above, then this composite index is optimal:

INDEX(offer_type, price_period, deal_type, price_per_day)

(The first 3 columns can be in any order, but the column that the range is applied to needs to be last.)
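In Laravel terms, such a composite index could be added with a migration along these lines. This is a minimal sketch, not a tested migration: the class name and the index name are hypothetical, and it assumes the table and column names from the question.

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Hypothetical migration class name, chosen for illustration only.
class AddSearchIndexToAdvertisements extends Migration
{
    public function up()
    {
        Schema::table('advertisements', function (Blueprint $table) {
            // Equality-tested columns first, the range-tested column last.
            $table->index(
                ['offer_type', 'price_period', 'deal_type', 'price_per_day'],
                'advertisements_search_index' // hypothetical index name
            );
        });
    }

    public function down()
    {
        Schema::table('advertisements', function (Blueprint $table) {
            // Drop the index by the same name used in up().
            $table->dropIndex('advertisements_search_index');
        });
    }
}
```

If this index turns out to serve the real queries, the three single-column indexes from the original schema are probably worth re-evaluating, since each index adds some write overhead.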

If the user may include only some of those flags, and/or may include multiple values for them, then it becomes messier. Watch what users ask for and tailor extra indexes based on common queries. Use this index cookbook to help build them.
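To illustrate why optional filters make things messier, here is a sketch of a search function in which each criterion is applied only when the user supplies it. The function name and the keys of the $filters array (including min_price and max_price) are assumptions made up for this example.

```php
<?php

use Illuminate\Support\Facades\DB;

// Hypothetical helper: builds the search query from whichever
// filters the user actually filled in.
function searchAdvertisements(array $filters)
{
    $query = DB::table('advertisements');

    // Equality filters, each one optional.
    foreach (['offer_type', 'price_period', 'deal_type'] as $column) {
        if (isset($filters[$column])) {
            $query->where($column, $filters[$column]);
        }
    }

    // Range filter, applied only when both bounds are given.
    if (isset($filters['min_price'], $filters['max_price'])) {
        $query->whereBetween(
            'price_per_day',
            [$filters['min_price'], $filters['max_price']]
        );
    }

    return $query->get();
}
```

Each combination of supplied filters yields a different WHERE clause. MySQL can use a composite index only through a leftmost prefix of its columns, so a search that omits the first indexed column cannot use that index at all, which is why it pays to log the combinations users actually request.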
