Converting a SELECT DISTINCT ON query from PostgreSQL to MySQL

Date: 2022-01-15 22:45:48

I've been using PostgreSQL and am now migrating to MySQL.

In my queries I use PostgreSQL's SELECT DISTINCT ON (col1, col2, col3). Is there a counterpart to this statement in MySQL?

4 answers

#1


29  

There is no exact MySQL equivalent for a PostgreSQL query that uses SELECT DISTINCT ON.

PostgreSQL SELECT DISTINCT ON

In PostgreSQL, the following query collapses all rows for which the expressions (col1, col2, col3) match, keeping only the col4, col5 values of the "first" row in each set of matching rows:

SELECT DISTINCT ON (col1, col2, col3) col4, col5
FROM tablename

So if your table looks like this:

col1 | col2 | col3 | col4 | col5
--------------------------------
1    | 2    | 3    | 777  | 888
1    | 2    | 3    | 888  | 999
3    | 3    | 3    | 555  | 555

the query keeps just one row for (1,2,3) and one row for (3,3,3). The resulting rows might then be:

col4 | col5
-----------
777  | 888
555  | 555

Note that the "first row" of each set is unpredictable: the first row for (1,2,3) might just as well be (888, 999), unless we specify an ORDER BY:

SELECT DISTINCT ON (col1, col2, col3) col4, col5
FROM tablename
ORDER BY col1, col2, col3, col4

(The DISTINCT ON expressions must match the leftmost ORDER BY expressions, but the ORDER BY can contain additional expressions.)

MySQL extension to GROUP BY

MySQL extends GROUP BY so that we can select nonaggregated columns not named in the GROUP BY clause. Whenever we select such columns, the server is free to choose any value from each group for that column, so the resulting values are indeterminate.

So this PostgreSQL query:

SELECT DISTINCT ON (col1, col2, col3) col4, col5
FROM tablename

can be considered equivalent to this MySQL query:

SELECT col4, col5
FROM tablename
GROUP BY col1, col2, col3

Both PostgreSQL and MySQL return one "first row" for each (col1, col2, col3), and in both cases the row returned is unpredictable because we didn't specify an ORDER BY clause.
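One caveat when trying this on a recent server: as of MySQL 5.7.5 the ONLY_FULL_GROUP_BY mode is part of the default sql_mode, and with it enabled the GROUP BY query above is rejected. A minimal sketch of the same "any row will do" query for such a server, assuming the ANY_VALUE() function (available from 5.7.5) is acceptable:

```sql
-- Assumes MySQL 5.7.5 or later with ONLY_FULL_GROUP_BY enabled.
-- ANY_VALUE() states explicitly that an arbitrary value per group is fine.
SELECT ANY_VALUE(col4) AS col4,
       ANY_VALUE(col5) AS col5
FROM tablename
GROUP BY col1, col2, col3;
```

The result is just as indeterminate as before, and as far as I know nothing guarantees that col4 and col5 are taken from the same row, so this only fits cases where that does not matter.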

Many people would be tempted to convert this PostgreSQL query with an ORDER BY:

SELECT DISTINCT ON (col1, col2, col3) col4, col5
FROM tablename
ORDER BY col1, col2, col3, col4

with this one:

SELECT col4, col5
FROM (
  SELECT col1, col2, col3, col4, col5
  FROM tablename
  ORDER BY col1, col2, col3, col4
) s
GROUP BY col1, col2, col3

The idea here is to apply an ORDER BY in a subquery so that when MySQL groups by col1, col2, col3 it keeps the first value encountered for col4 and col5. The idea is appealing, but it's wrong! MySQL is free to choose any value for col4 and col5, and we don't know which values are encountered first; that depends on the optimizer. So I would correct it to this:

SELECT t1.col4, t1.col5
FROM tablename t1 INNER JOIN (SELECT col1, col2, col3, MIN(col4) as m_col4
                              FROM tablename
                              GROUP BY col1, col2, col3) s
     ON t1.col1=s.col1
        AND t1.col2=s.col2
        AND t1.col3=s.col3
        AND t1.col4=s.m_col4
GROUP BY
  t1.col1, t1.col2, t1.col3, t1.col4

But this is starting to get more complicated.
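On MySQL 8.0 or later, window functions offer another way to express "first row per group". The following is only a sketch under that version assumption, not part of the original answer; the alias rn is made up for illustration:

```sql
-- Assumes MySQL 8.0+ (window functions). rn is a hypothetical alias.
-- Number the rows inside each (col1, col2, col3) group by ascending col4,
-- then keep only the row numbered 1.
SELECT col4, col5
FROM (
  SELECT col1, col2, col3, col4, col5,
         ROW_NUMBER() OVER (
           PARTITION BY col1, col2, col3
           ORDER BY col4
         ) AS rn
  FROM tablename
) s
WHERE rn = 1
ORDER BY col1, col2, col3;
```

If several rows in a group share the smallest col4, the one that gets rn = 1 is still arbitrary; adding more columns to the window's ORDER BY (for example col5) makes the choice deterministic.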

Conclusion

As a general rule, there is no exact way to convert a PostgreSQL query to a MySQL query, but there are many workarounds. Depending on the query itself, the result might be as simple as the original or considerably more complicated.

#2


0  

Use a subquery to determine the order, and an outer query to group the rows.

As @a_horse_with_no_name points out, this works because MySQL, unlike other DBMSs, allows a partial GROUP BY.

For example:

CREATE TABLE customer_order
    (`customer` varchar(5), `item` varchar(6), `date` datetime)
;

INSERT INTO customer_order
    (`customer`, `item`, `date`)
VALUES
    ('alice', 'widget', '2000-01-05 00:00:00'),
    ('bob', 'widget', '2000-01-02 00:00:00'),
    ('alice', 'widget', '2000-01-01 00:00:00'),
    ('alice', 'wodget', '2000-01-06 00:00:00')
;

Query for each customer's first order:

select *
from
  (select customer, item, date
  from customer_order
  order by date) c
group by customer

Result:

| CUSTOMER |   ITEM |                           DATE |
|----------|--------|--------------------------------|
|    alice | widget | January, 01 2000 00:00:00+0000 |
|      bob | widget | January, 02 2000 00:00:00+0000 |

http://sqlfiddle.com/#!2/6cbbe/1
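For comparison, here is a sketch of the same "first order per customer" lookup that does not depend on which row MySQL picks within a group. It joins back on the earliest date per customer; the aliases o, f and first_date are made up for illustration:

```sql
-- Find each customer's earliest date, then join back to fetch that row.
-- o, f and first_date are hypothetical names used only in this sketch.
SELECT o.customer, o.item, o.`date`
FROM customer_order o
INNER JOIN (
  SELECT customer, MIN(`date`) AS first_date
  FROM customer_order
  GROUP BY customer
) f
  ON o.customer = f.customer
 AND o.`date` = f.first_date
ORDER BY o.customer;
```

With the sample rows above this also returns alice's widget order of 2000-01-01 and bob's widget order of 2000-01-02. If a customer had two orders with the same earliest date, both would be returned.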

#3


-3  

You can't select distinct values from multiple columns. When selecting, use a query like this:

select distinct col1, col2 from table

#4


-4  

You should migrate to PDO or MySQLi instead of MYSQL, as it's already deprecated.

As for your question, you can do

   SELECT DISTINCT col1, col2, col3

or

    SELECT col1, col2, col3
    ........

    GROUP BY col1 --//--- or whatever column you want to be distinct
