SQL query to select the bottom 2 from each category

Time: 2022-05-22 12:27:38

In MySQL, I want to select the bottom 2 items from each category.

Category Value
1        1.3
1        4.8
1        3.7
1        1.6
2        9.5
2        9.9
2        9.2
2        10.3
3        4
3        8
3        16

Giving me:

Category Value
1        1.3
1        1.6
2        9.5
2        9.2
3        4
3        8

Before I migrated from sqlite3, I had to first select the lowest value from each category, then, excluding anything that joined to that, select the lowest from each category again. Anything less than or equal to that new lowest value in a category won. This would also pick more than 2 rows in case of a tie, which was annoying. It also had a really long runtime.

My ultimate goal is to count the number of times an individual is in one of the lowest 2 of a category (there is also a name field) and this is the one part I don't know how to do. Thanks

5 Solutions

#1


4  

You could try this:

SELECT * FROM (
  SELECT c.*,
        (SELECT COUNT(*)
         FROM user_category c2
         WHERE c2.category = c.category
         AND c2.value < c.value) cnt
  FROM user_category c ) uc
WHERE cnt < 2

It should give you the desired results, but check if performance is ok.
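One way to sanity-check the query against the sample data from the question is a small Python harness using the built-in sqlite3 module. This is only a sketch: SQLite stands in for MySQL here (an assumption, the query itself is plain SQL), and the column types are guessed from the sample.

```python
import sqlite3

# Sample data from the question; table name user_category follows the query above.
rows = [
    (1, 1.3), (1, 4.8), (1, 3.7), (1, 1.6),
    (2, 9.5), (2, 9.9), (2, 9.2), (2, 10.3),
    (3, 4), (3, 8), (3, 16),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_category (category INTEGER, value REAL)")
conn.executemany("INSERT INTO user_category VALUES (?, ?)", rows)

# For each row, count the rows in the same category with a strictly smaller
# value, and keep the row when fewer than 2 such rows exist.
query = """
SELECT category, value FROM (
  SELECT c.*,
         (SELECT COUNT(*)
          FROM user_category c2
          WHERE c2.category = c.category
            AND c2.value < c.value) AS cnt
  FROM user_category c
) uc
WHERE cnt < 2
ORDER BY category, value
"""
bottom_two = conn.execute(query).fetchall()
for category, value in bottom_two:
    print(category, value)
```

On performance: the correlated subquery runs once per row, so an index on (category, value) will usually help, but measure on your own data.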

#2


8  

SELECT c1.category, c1.value
FROM catvals c1
LEFT OUTER JOIN catvals c2
  ON (c1.category = c2.category AND c1.value > c2.value)
GROUP BY c1.category, c1.value
HAVING COUNT(*) < 2;

Tested on MySQL 5.1.41 with your test data. Output:

+----------+-------+
| category | value |
+----------+-------+
|        1 |  1.30 |
|        1 |  1.60 |
|        2 |  9.20 |
|        2 |  9.50 |
|        3 |  4.00 |
|        3 |  8.00 |
+----------+-------+

(The extra decimal places are because I declared the value column as NUMERIC(9,2).)

Like other solutions, this produces more than 2 rows per category if there are ties. There are ways to construct the join condition to resolve that, but we'd need to use a primary key or unique key in your table, and we'd also have to know how you intend ties to be resolved.
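As a rough illustration of the tie-breaking idea, here is a sketch that assumes the table has a unique `id` column (hypothetical, it is not in the question) and that ties should go to the row with the lower `id`. It is run through Python's sqlite3 only so the result can be checked; the join condition is the part that matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: `id` is an assumed unique key, not part of the question.
conn.execute(
    "CREATE TABLE catvals (id INTEGER PRIMARY KEY, category INTEGER, value REAL)"
)
# Category 1 has a three-way tie for the lowest value.
conn.executemany(
    "INSERT INTO catvals (category, value) VALUES (?, ?)",
    [(1, 1.3), (1, 1.3), (1, 1.3), (1, 4.8), (2, 9.5), (2, 9.2), (2, 9.9)],
)

# A row c2 "comes before" c1 if its value is smaller, or equal with a lower id.
# COUNT(c2.id) ignores the NULL row produced when nothing comes before c1.
query = """
SELECT c1.id, c1.category, c1.value
FROM catvals c1
LEFT OUTER JOIN catvals c2
  ON c1.category = c2.category
 AND (c2.value < c1.value
      OR (c2.value = c1.value AND c2.id < c1.id))
GROUP BY c1.id, c1.category, c1.value
HAVING COUNT(c2.id) < 2
ORDER BY c1.category, c1.value, c1.id
"""
result = conn.execute(query).fetchall()
print(result)
```

With the tie broken on `id`, each category returns at most 2 rows, even though three rows in category 1 share the lowest value.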

#3


1  

Here's a solution that handles duplicates properly. The table name is 'zzz' and the columns are int and float.

select
    smallest.category category, min(smallest.value) value
from 
    zzz smallest
group by smallest.category

union

select
    second_smallest.category category, min(second_smallest.value) value
from
    zzz second_smallest
where
    concat(second_smallest.category,'x',second_smallest.value)
    not in ( -- recreate the results from the first half of the union
        select concat(c.category,'x',min(c.value))
        from zzz c
        group by c.category
    )
group by second_smallest.category

order by category

Caveats:

  • If there is only one value for a given category, then only that single entry is returned.

  • If there was a unique recordID for each row you wouldn't need all the concats to simulate a unique key.

Your mileage may vary,

--Mark

#4


1  

A union should work. I'm not sure of the performance compared to Peter's solution.

SELECT smallest.category, MIN(smallest.value)
    FROM categories smallest
GROUP BY smallest.category
UNION
SELECT second_smallest.category, MIN(second_smallest.value)
    FROM categories second_smallest
    WHERE second_smallest.value > (SELECT MIN(smallest.value) FROM categories smallest WHERE smallest.category = second_smallest.category)
GROUP BY second_smallest.category
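
To see how the union behaves with ties and with categories that have a single row, here is a small check using Python's sqlite3 as a stand-in for MySQL (an assumption). The inner subquery is correlated on `smallest.category`, and the test rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (category INTEGER, value REAL)")
# Made-up rows: category 1 has a tie for the lowest value,
# category 2 has only one row.
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [(1, 1.3), (1, 1.3), (1, 1.6), (1, 4.8), (2, 9.5), (3, 4), (3, 8), (3, 16)],
)

# First branch: lowest value per category.
# Second branch: lowest value strictly greater than that minimum.
query = """
SELECT smallest.category, MIN(smallest.value)
FROM categories smallest
GROUP BY smallest.category
UNION
SELECT second_smallest.category, MIN(second_smallest.value)
FROM categories second_smallest
WHERE second_smallest.value > (SELECT MIN(smallest.value)
                               FROM categories smallest
                               WHERE smallest.category = second_smallest.category)
GROUP BY second_smallest.category
ORDER BY 1, 2
"""
result = conn.execute(query).fetchall()
print(result)
```

So this form returns the two lowest distinct values per category: the tied 1.3 appears once, and a category with a single value contributes one row.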

#5


1  

Here is a very generalized solution, that would work for selecting first n rows for each Category. This will work even if there are duplicates in value.

/* creating temporary variables */
mysql> set @cnt = 0;
mysql> set @trk = 0;

/* query */
mysql> select Category, Value 
       from (select *, 
                @cnt:=if(@trk = Category, @cnt+1, 0) cnt, 
                @trk:=Category 
                from user_categories 
                order by Category, Value ) c1 
       where c1.cnt < 2;

Here is the result.

+----------+-------+
| Category | Value |
+----------+-------+
|        1 |   1.3 |
|        1 |   1.6 |
|        2 |   9.2 |
|        2 |   9.5 |
|        3 |     4 |
|        3 |     8 |
+----------+-------+

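This was tested on MySQL 5.0.88. Note that the initial value of the @trk variable must not equal the lowest value of the Category field; otherwise the first row of that category starts at cnt = 1 and only one row is returned for it.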
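
For the counting part of the question (how often each name lands in the lowest 2 of a category), the same per-category row numbering can be written with ROW_NUMBER(), which MySQL supports from 8.0 and SQLite from 3.25. The sketch below is an assumption-heavy illustration: the `name` column and its values are made up, ties are broken by name, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the question mentions a name field but shows no data for it.
conn.execute("CREATE TABLE user_categories (name TEXT, category INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO user_categories VALUES (?, ?, ?)",
    [
        ("ann", 1, 1.3), ("bob", 1, 4.8), ("cy", 1, 3.7), ("dee", 1, 1.6),
        ("ann", 2, 9.5), ("bob", 2, 9.9), ("cy", 2, 9.2), ("dee", 2, 10.3),
        ("ann", 3, 4), ("bob", 3, 8), ("cy", 3, 16),
    ],
)

# Number the rows within each category from lowest value up, keep the first
# two, then count how many times each name appears among them.
query = """
SELECT name, COUNT(*) AS times_in_bottom_two
FROM (
  SELECT name,
         ROW_NUMBER() OVER (PARTITION BY category ORDER BY value, name) AS rn
  FROM user_categories
) ranked
WHERE rn <= 2
GROUP BY name
ORDER BY times_in_bottom_two DESC, name
"""
counts = conn.execute(query).fetchall()
for name, times in counts:
    print(name, times)
```

If ties should all count rather than be cut off at 2 rows, swapping ROW_NUMBER() for RANK() is one option.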
