Select DISTINCT from one column based on the MAX of another column

I have a query that returns the relative activity of users in each region. I want the same list, but with each user appearing in only one region: the region where that user has their MAX applications.

The current query:

SELECT 
    r.region_id,
    ha.user_id,
    count(ha.user_id) AS applications 
FROM 
    sit_applications ha
LEFT JOIN 
    listings_regions r 
        ON 
            r.listingID = ha.listingID
            AND deleted = 0
WHERE 
    ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH) 
GROUP BY 
    ha.user_id, r.region_id
HAVING 
    applications > 0
ORDER BY 
    r.region_id DESC

I need to filter this query so that I get each user_id only once, together with the region where it has its largest applications count. That gives me a list of the top performers for each region, with no duplicate users.

2 Solutions

#1

In MySQL, you have three basic ways to do this:

Using variables
Using a complex join
Using a hack with substring_index() and group_concat().
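
For reference, the third option might look roughly like the sketch below. It is only an illustration under stated assumptions, not a tested query: it swaps the LEFT JOIN for an INNER JOIN, because group_concat() skips NULL values and a NULL region_id could otherwise leave the region out of step with MAX(applications).

```sql
-- Sketch of the substring_index() / group_concat() hack.
-- Assumption: INNER JOIN instead of the original LEFT JOIN, so region_id is never NULL.
SELECT ur.user_id,
       -- Build the user's region list ordered by application count (highest first),
       -- then keep only the first element of the comma-separated list.
       SUBSTRING_INDEX(
           GROUP_CONCAT(ur.region_id ORDER BY ur.applications DESC),
           ',', 1) AS region_id,
       MAX(ur.applications) AS applications
FROM (SELECT r.region_id, ha.user_id, COUNT(*) AS applications
      FROM sit_applications ha INNER JOIN
           listings_regions r
           ON r.listingID = ha.listingID AND deleted = 0
      WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
      GROUP BY ha.user_id, r.region_id
     ) ur
GROUP BY ur.user_id;
```

Two limitations are visible here: region_id comes back as a string rather than a number, and the concatenated list is subject to group_concat_max_len (1024 by default), although only the first element is needed in this case.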

The complex join is really a mess when you have aggregation queries. The hack is fun, but does have its limitations. So, let's consider the variables method:

SELECT ur.*
FROM (SELECT ur.*,
             (@rn := if(@u = user_id, @rn + 1,
                        if(@u := user_id, 1, 1)
                       )
             ) as rn
      FROM (SELECT r.region_id, ha.user_id, count(ha.user_id) AS applications 
            FROM sit_applications ha LEFT JOIN 
                 listings_regions r 
                 ON r.listingID = ha.listingID AND deleted = 0
            WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH) 
            GROUP BY ha.user_id, r.region_id
            HAVING applications > 0
           ) ur CROSS JOIN
           (SELECT @u := -1, @rn := 0) params
      ORDER BY user_id, applications DESC
     ) ur
WHERE rn = 1;
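
The variables assign a row number per user, restarting at 1 whenever user_id changes. That relies on an evaluation order MySQL does not guarantee, and assigning user variables inside expressions is deprecated as of MySQL 8.0.13. If MySQL 8.0 or later is available (an assumption, since the question does not state a version), the same numbering could be sketched with ROW_NUMBER() instead. The alias ranked is introduced here only for illustration.

```sql
-- Sketch assuming MySQL 8.0+ (window functions); not part of the original answer.
SELECT ranked.region_id, ranked.user_id, ranked.applications
FROM (SELECT ur.*,
             -- Number each user's regions, highest application count first.
             ROW_NUMBER() OVER (PARTITION BY ur.user_id
                                ORDER BY ur.applications DESC) AS rn
      FROM (SELECT r.region_id, ha.user_id, COUNT(ha.user_id) AS applications
            FROM sit_applications ha LEFT JOIN
                 listings_regions r
                 ON r.listingID = ha.listingID AND deleted = 0
            WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
            GROUP BY ha.user_id, r.region_id
           ) ur
     ) ranked
WHERE ranked.rn = 1;  -- keep only each user's top region
```

When a user has the same count in two regions, ROW_NUMBER() picks one of them arbitrarily; adding region_id to the window's ORDER BY would make the choice deterministic.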

Note: Some aspects of your query do not really make sense, even though I left them in. You are using LEFT JOIN, so r.region_id could be NULL -- and that is usually not desirable. The HAVING clause is unnecessary, because the COUNT() is always at least 1 for every group -- assuming that ha.user_id is never NULL. I suspect that the logic could be replaced with an INNER JOIN, no HAVING clause, and COUNT(*).
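
A sketch of what that simplified base query might look like, assuming every application should count only when its listing has a matching (non-deleted) region row:

```sql
-- Simplified base query: INNER JOIN, COUNT(*), and no HAVING clause.
-- Assumption: applications without a matching region row should be ignored.
SELECT r.region_id, ha.user_id, COUNT(*) AS applications
FROM sit_applications ha INNER JOIN
     listings_regions r
     ON r.listingID = ha.listingID AND deleted = 0
WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
GROUP BY ha.user_id, r.region_id
ORDER BY r.region_id DESC;
```

Every group produced by GROUP BY contains at least one row, so filtering on applications > 0 would never remove anything.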

#2

You could try wrapping the query and extracting out what you want:

SELECT t2.user_id, t2.region_id, t2.applications
FROM
(
    SELECT t.user_id, MAX(t.applications) AS applications
    FROM
    (
        SELECT r.region_id, ha.user_id, COUNT(ha.user_id) AS applications 
        FROM sit_applications ha LEFT JOIN listings_regions r 
            ON r.listingID = ha.listingID AND deleted = 0
        WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH) 
        GROUP BY ha.user_id, r.region_id
        HAVING applications > 0
    ) t
    GROUP BY t.user_id
) t1
INNER JOIN
(
    SELECT r.region_id, ha.user_id, COUNT(ha.user_id) AS applications 
    FROM sit_applications ha LEFT JOIN listings_regions r 
        ON r.listingID = ha.listingID AND deleted = 0
    WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH) 
    GROUP BY ha.user_id, r.region_id
    HAVING applications > 0
) t2
    ON t1.user_id = t2.user_id AND t1.applications = t2.applications
