MySQL: how to select all emails from a table but limit the number of emails with the same domain

As the topic suggests, I want to select all emails in the list, but limit the number of emails with the same domain.

Let's say I have 500 Gmail addresses.

And 2 example.com addresses.

...and so on.

I want to grab just 2 addresses for each domain.

With this query I can count how many addresses occur for each domain, so maybe I can build on it:

SELECT substring_index(email, '@', -1), COUNT(*)
FROM emaillist
GROUP BY substring_index(email, '@', -1);

Please help!

2 Solutions

#1

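-- One row per domain: the alphabetically first and last address.
-- address2 is NULL when the domain has only one address.
SELECT
  MIN(email) AS address1,
  IF(MAX(email) = MIN(email), NULL, MAX(email)) AS address2
FROM emaillist
GROUP BY substring_index(email, '@', -1);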

And if you want them in one column:

SELECT MIN(email) AS address1
FROM emaillist
GROUP BY substring_index(email, '@', -1)
UNION
SELECT MAX(email) AS address1
FROM emaillist
GROUP BY substring_index(email, '@', -1)

#2

SELECT ID, Email, SUBSTRING_INDEX(EMAIL, '@', -1) Domain
FROM   emaillist a
WHERE  
(
    SELECT  COUNT(*)
    FROM    emaillist e
    WHERE   SUBSTRING_INDEX(e.EMAIL, '@', -1) = SUBSTRING_INDEX(a.EMAIL, '@', -1) AND
            a.ID <= e.ID
) <= 2;

SQLFiddle Demo
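
For comparison, the same "at most 2 rows per domain" filter can be sketched with a window function. This is an alternative that is not part of the original answer, and it assumes MySQL 8.0 or later, where ROW_NUMBER() is available. The derived-table alias `ranked` and the column alias `rn` are made up for illustration. Ordering by ID descending keeps the two highest IDs per domain, which is what the correlated subquery above selects.

```sql
-- Assumes MySQL 8.0+ (window functions) and the emaillist table used above.
SELECT ID, Email, Domain
FROM (
    SELECT ID,
           Email,
           SUBSTRING_INDEX(Email, '@', -1) AS Domain,
           -- Number the rows inside each domain, highest ID first.
           ROW_NUMBER() OVER (
               PARTITION BY SUBSTRING_INDEX(Email, '@', -1)
               ORDER BY ID DESC
           ) AS rn
    FROM emaillist
) AS ranked
WHERE rn <= 2;  -- keep at most two rows per domain
```

Changing the 2 in the final WHERE clause changes the per-domain limit. This version still computes the domain per row, so it does not avoid the table scan either.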

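The correlated-subquery query above cannot use an index, because it compares the result of SUBSTRING_INDEX(EMAIL, '@', -1), which has to be computed for every row. As a result it performs a full table scan, which will make the query slow on a very large table.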

My advice is to add an extra column that stores the domain and to define an INDEX on it, e.g.:

CREATE TABLE emaillist
(
    ID INT AUTO_INCREMENT PRIMARY KEY,
    EMAIL VARCHAR(50) NOT NULL,
    DOMAIN VARCHAR(15) NOT NULL,
    KEY (DOMAIN)
);
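
As a rough sketch of how that extra column might be filled and then used, assuming the table definition above: the sample address is hypothetical, and the VARCHAR(15) width taken from the definition may need widening for longer domains.

```sql
-- Store the domain alongside the address at insert time
-- (the address below is a made-up example).
INSERT INTO emaillist (EMAIL, DOMAIN)
VALUES ('alice@example.com', SUBSTRING_INDEX('alice@example.com', '@', -1));

-- If the DOMAIN column was instead added to a table that already holds rows,
-- it could be backfilled once from the existing addresses.
UPDATE emaillist
SET DOMAIN = SUBSTRING_INDEX(EMAIL, '@', -1);

-- Same "at most 2 per domain" filter as before, but comparing the stored,
-- indexed DOMAIN column instead of recomputing SUBSTRING_INDEX per row.
SELECT a.ID, a.EMAIL, a.DOMAIN
FROM   emaillist a
WHERE  (
    SELECT COUNT(*)
    FROM   emaillist e
    WHERE  e.DOMAIN = a.DOMAIN
      AND  a.ID <= e.ID
) <= 2;
```

Because the subquery now compares plain column values, the optimizer is able to consider the KEY on DOMAIN for the lookup. Whether it actually does so depends on the data, and can be checked by running EXPLAIN on the SELECT.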
