SQL query: find a pattern from another table

Date: 2022-04-03 15:45:52

I have a table with colors:

COLORS

idColor   Name
-------   ------
   4      Yellow
   5      Green
   6      Red

And I have another table with data:

PRODUCTS

idProduct   idCategory   idColor
---------   ----------   -------
    1           1           4     
    2           1           5     
    3           1           6     
    4           2           10    
    5           2           11    
    6           2           12    
    7           3           4     
    8           3           5     
    9           3           8     
    10          4           4     
    11          4           5     
    12          4           6     
    13          5           4     
    14          6           4     
    15          6           5     

I only want to return rows from Products when the idColor values from the Colors table (4, 5, 6) are all present for an idCategory and that idCategory has exactly 3 rows, with exactly those idColor values 4, 5, and 6.

For this example, the query should return:

IdCategory
----------
    1      
    4      

4 answers

#1


5  

Try this:

SELECT idCategory
FROM PRODUCTS
GROUP BY idCategory
HAVING COUNT(*) = 3
AND COUNT(DISTINCT CASE WHEN idColor IN (4,5,6) THEN idColor END) = 3

Here is a demo for you to try.
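
The query above can also be checked locally. The following is a minimal sketch that assumes SQLite (through Python's standard `sqlite3` module) rather than the SQL Server the answers appear to target; the function name `matching_categories` is hypothetical, and an `ORDER BY` was added only to make the output stable.

```python
import sqlite3

# Sample rows copied from the question: (idProduct, idCategory, idColor).
PRODUCTS = [
    (1, 1, 4), (2, 1, 5), (3, 1, 6),
    (4, 2, 10), (5, 2, 11), (6, 2, 12),
    (7, 3, 4), (8, 3, 5), (9, 3, 8),
    (10, 4, 4), (11, 4, 5), (12, 4, 6),
    (13, 5, 4),
    (14, 6, 4), (15, 6, 5),
]

# Same query as in the answer, with ORDER BY added for a stable result.
QUERY = """
SELECT idCategory
FROM PRODUCTS
GROUP BY idCategory
HAVING COUNT(*) = 3
   AND COUNT(DISTINCT CASE WHEN idColor IN (4, 5, 6) THEN idColor END) = 3
ORDER BY idCategory
"""


def matching_categories():
    """Load the sample data into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PRODUCTS ("
        "idProduct INTEGER PRIMARY KEY, idCategory INTEGER, idColor INTEGER)"
    )
    conn.executemany("INSERT INTO PRODUCTS VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return [row[0] for row in rows]


result = matching_categories()
print(result)  # [1, 4]
```

`COUNT(*) = 3` rejects categories with extra or missing rows, and the `COUNT(DISTINCT ...)` condition rejects categories whose three rows are not exactly the colors 4, 5, and 6.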

UPDATED

If you want to dynamically filter the results depending on the values in the table `COLORS`:

SELECT idCategory
FROM PRODUCTS P
LEFT JOIN (SELECT idColor, COUNT(*) OVER() TotalColors
           FROM COLORS) C
     ON P.idColor = C.idColor
GROUP BY idCategory
HAVING COUNT(*) = MIN(C.TotalColors)
AND COUNT(DISTINCT C.idColor) = MIN(C.TotalColors)

Here is a fiddle with this example.
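
To see the data-driven behaviour without a fiddle, here is a small sketch under a few assumptions: it runs on SQLite through Python's `sqlite3` module, it replaces the `COUNT(*) OVER()` window function with a scalar subquery `(SELECT COUNT(*) FROM COLORS)` (a swapped-in but equivalent way to get the total number of colors), and the helper name `categories_matching` is hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (idProduct, idCategory, idColor).
PRODUCTS = [
    (1, 1, 4), (2, 1, 5), (3, 1, 6),
    (4, 2, 10), (5, 2, 11), (6, 2, 12),
    (7, 3, 4), (8, 3, 5), (9, 3, 8),
    (10, 4, 4), (11, 4, 5), (12, 4, 6),
    (13, 5, 4),
    (14, 6, 4), (15, 6, 5),
]

# Variant of the answer's query: the total color count comes from a scalar
# subquery instead of COUNT(*) OVER().
QUERY = """
SELECT P.idCategory
FROM PRODUCTS P
LEFT JOIN COLORS C ON P.idColor = C.idColor
GROUP BY P.idCategory
HAVING COUNT(*) = (SELECT COUNT(*) FROM COLORS)
   AND COUNT(DISTINCT C.idColor) = (SELECT COUNT(*) FROM COLORS)
ORDER BY P.idCategory
"""


def categories_matching(colors):
    """Return the categories whose colors are exactly the rows of COLORS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE COLORS (idColor INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute(
        "CREATE TABLE PRODUCTS ("
        "idProduct INTEGER PRIMARY KEY, idCategory INTEGER, idColor INTEGER)"
    )
    conn.executemany("INSERT INTO COLORS VALUES (?, ?)", colors)
    conn.executemany("INSERT INTO PRODUCTS VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return [row[0] for row in rows]


three_colors = categories_matching([(4, "Yellow"), (5, "Green"), (6, "Red")])
two_colors = categories_matching([(4, "Yellow"), (5, "Green")])
print(three_colors)  # [1, 4]
print(two_colors)    # [6]
```

Changing the contents of `COLORS` changes the result without touching the query, which is the point of the updated answer.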

#2


3  

You can use aggregates to make sure it has all 3 colors, and also to make sure it DOESN'T have any other colors. Something like this:

SELECT *
FROM
(
SELECT idCategory
  , SUM(CASE WHEN idColor IN (4, 5, 6) THEN 1 ELSE 0 END) AS GoodColors
  , SUM(CASE WHEN idColor NOT IN (4, 5, 6) THEN 1 ELSE 0 END) AS BadColors
FROM Products
GROUP BY idCategory
) t0
WHERE GoodColors = 3 AND BadColors = 0

Note: if any of the colors 4, 5, or 6 can appear more than once within an idCategory, a different technique must be used, because the sums above count duplicate rows. From your example, that doesn't appear to be the case.
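
The duplicate case can be illustrated with a short sketch. It assumes SQLite through Python's `sqlite3` module, two hypothetical categories (7 and 8) that are not in the question's data, and the further assumption that repeated colors should simply be ignored; the `COUNT(DISTINCT ...)` variant is one possible alternative, not necessarily the technique the author had in mind.

```python
import sqlite3

# Category 1 is from the question; categories 7 and 8 are hypothetical rows
# added only to show what happens when a color repeats.
PRODUCTS = [
    (1, 1, 4), (2, 1, 5), (3, 1, 6),               # exactly 4, 5, 6
    (16, 7, 4), (17, 7, 4), (18, 7, 5),            # 4 twice, 6 missing
    (19, 8, 4), (20, 8, 4), (21, 8, 5), (22, 8, 6),  # 4, 5, 6 with 4 repeated
]

# The SUM-based query from the answer (ORDER BY added for a stable result).
SUM_QUERY = """
SELECT idCategory
FROM (
    SELECT idCategory,
           SUM(CASE WHEN idColor IN (4, 5, 6) THEN 1 ELSE 0 END) AS GoodColors,
           SUM(CASE WHEN idColor NOT IN (4, 5, 6) THEN 1 ELSE 0 END) AS BadColors
    FROM Products
    GROUP BY idCategory
) t0
WHERE GoodColors = 3 AND BadColors = 0
ORDER BY idCategory
"""

# Variant that counts distinct wanted colors instead of wanted rows.
DISTINCT_QUERY = """
SELECT idCategory
FROM Products
GROUP BY idCategory
HAVING COUNT(DISTINCT CASE WHEN idColor IN (4, 5, 6) THEN idColor END) = 3
   AND SUM(CASE WHEN idColor NOT IN (4, 5, 6) THEN 1 ELSE 0 END) = 0
ORDER BY idCategory
"""


def run(query):
    """Run one query against a fresh in-memory copy of the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Products ("
        "idProduct INTEGER PRIMARY KEY, idCategory INTEGER, idColor INTEGER)"
    )
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return [row[0] for row in rows]


sum_result = run(SUM_QUERY)
distinct_result = run(DISTINCT_QUERY)
print(sum_result)       # [1, 7]
print(distinct_result)  # [1, 8]
```

With duplicates present, the SUM version accepts category 7 (three rows, but color 6 is missing) and rejects category 8, while the distinct-count variant does the opposite. Which behaviour is correct depends on whether a repeated color should disqualify a category.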

#3


0  

I am guessing that you want to drive this from data in a table rather than hardcoding the values 4, 5, and 6 (as some of the other answers do). To that end, my solution uses a dbo.ColorSets table that you can fill with as many different sets of colors as you like; the query then returns every product category that matches one of those color sets. I didn't use your Colors table directly because it looks like a lookup table, complete with color names, so it didn't seem like the right place to pick out one particular set of colors rather than the full list of possible colors.

I used a technique that should keep performing well even on huge amounts of data, compared with query methods that rely on aggregates alone. Whatever method you use, this task will pretty much always require a scan of the entire Products table, because you can't compare all the rows without, well, comparing all the rows. But the row-by-row JOIN is on indexable columns and runs only for candidates whose checksums already match, which have a very good chance of being proper matches, so the amount of work required is greatly reduced.

Here's what the ColorSets table looks like:

CREATE TABLE dbo.ColorSets (
   idSet int NOT NULL,
   idColor int NOT NULL,
   CONSTRAINT PK_ColorSet PRIMARY KEY CLUSTERED (idSet, idColor)
);

INSERT dbo.ColorSets
VALUES
   (1, 4), 
   (1, 5),
   (1, 6), -- your color set: yellow, green, and red
   (2, 4),
   (2, 5),
   (2, 8)  -- an additional color set: yellow, green, and purple
;

And the query (see this working in a SqlFiddle):

WITH Sets AS (
   SELECT
      idSet,
      Grp = Checksum_Agg(idColor)
   FROM
      dbo.ColorSets
   GROUP BY
      idSet
), Categories AS (
   SELECT
      idCategory,
      Grp = Checksum_Agg(idColor)
   FROM
      dbo.Products
   GROUP BY
      idCategory
)
SELECT
   S.idSet,
   C.idCategory
FROM
   Sets S
   INNER JOIN Categories C
      ON S.Grp = C.Grp
WHERE
   NOT EXISTS (
      SELECT *
      FROM
         (
            SELECT *
            FROM dbo.ColorSets CS
            WHERE CS.idSet = S.idSet
         ) CS
         FULL JOIN (
            SELECT *
            FROM dbo.Products P
            WHERE P.idCategory = C.idCategory
         ) P
            ON CS.idColor = P.idColor 
      WHERE
          CS.idColor IS NULL
          OR P.idColor IS NULL
   )
;

Result:

idSet  idCategory
 1       1
 2       3
 1       4
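
The two-step idea above (a cheap per-group fingerprint to find candidates, then an exact row-by-row check) can be sketched outside SQL Server as well. This sketch assumes SQLite through Python's `sqlite3` module, where `CHECKSUM_AGG` is not available, so `SUM(idColor)` plus `COUNT(*)` stand in as the fingerprint, and a pair of `NOT EXISTS` checks replaces the `FULL JOIN`. Category 7 is a hypothetical addition whose colors collide with set 1's fingerprint.

```python
import sqlite3

# Rows from the question, plus hypothetical category 7 (colors 3, 5, 7),
# which has the same sum (15) and row count (3) as the set 4, 5, 6.
PRODUCTS = [
    (1, 1, 4), (2, 1, 5), (3, 1, 6),
    (4, 2, 10), (5, 2, 11), (6, 2, 12),
    (7, 3, 4), (8, 3, 5), (9, 3, 8),
    (10, 4, 4), (11, 4, 5), (12, 4, 6),
    (13, 5, 4),
    (14, 6, 4), (15, 6, 5),
    (16, 7, 3), (17, 7, 5), (18, 7, 7),
]
COLOR_SETS = [(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 8)]

# Step 1: join sets to categories on a cheap fingerprint (stand-in for
# CHECKSUM_AGG). Equal fingerprints do not prove the colors are equal.
CANDIDATES = """
WITH Sets AS (
    SELECT idSet, SUM(idColor) AS Grp, COUNT(*) AS Cnt
    FROM ColorSets GROUP BY idSet
), Categories AS (
    SELECT idCategory, SUM(idColor) AS Grp, COUNT(*) AS Cnt
    FROM Products GROUP BY idCategory
)
SELECT S.idSet, C.idCategory
FROM Sets S
JOIN Categories C ON S.Grp = C.Grp AND S.Cnt = C.Cnt
"""

# Step 2: exact check, evaluated only for the candidate pairs.
VERIFY = """
WHERE NOT EXISTS (  -- no color of the set is missing from the category
    SELECT 1 FROM ColorSets CS
    WHERE CS.idSet = S.idSet
      AND NOT EXISTS (SELECT 1 FROM Products P
                      WHERE P.idCategory = C.idCategory
                        AND P.idColor = CS.idColor)
)
AND NOT EXISTS (    -- no color of the category is missing from the set
    SELECT 1 FROM Products P
    WHERE P.idCategory = C.idCategory
      AND NOT EXISTS (SELECT 1 FROM ColorSets CS
                      WHERE CS.idSet = S.idSet
                        AND CS.idColor = P.idColor)
)
"""
ORDER = "ORDER BY C.idCategory, S.idSet"


def run(query):
    """Run one query against a fresh in-memory copy of the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ColorSets (idSet INTEGER NOT NULL, idColor INTEGER NOT NULL,"
        " PRIMARY KEY (idSet, idColor))"
    )
    conn.execute(
        "CREATE TABLE Products ("
        "idProduct INTEGER PRIMARY KEY, idCategory INTEGER, idColor INTEGER)"
    )
    conn.executemany("INSERT INTO ColorSets VALUES (?, ?)", COLOR_SETS)
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


candidates = run(CANDIDATES + ORDER)
verified = run(CANDIDATES + VERIFY + ORDER)
print(candidates)  # [(1, 1), (2, 3), (1, 4), (1, 7)]
print(verified)    # [(1, 1), (2, 3), (1, 4)]
```

The fingerprint alone lets category 7 through as a false candidate; the exact check removes it. That is why the query keeps the `NOT EXISTS` step even though the fingerprint join already narrows the work.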

#4


-3  

If I understand your question, this should do it:

select distinct idCategory
  from Products 
 where idColor in (4,5,6)
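
For comparison with the expected output in the question, here is a quick sketch of what this filter returns on the sample data. It assumes SQLite through Python's `sqlite3` module, uses the column name `idColor` from the question's table, and adds an `ORDER BY` for a stable result; the function name `any_color_categories` is hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (idProduct, idCategory, idColor).
PRODUCTS = [
    (1, 1, 4), (2, 1, 5), (3, 1, 6),
    (4, 2, 10), (5, 2, 11), (6, 2, 12),
    (7, 3, 4), (8, 3, 5), (9, 3, 8),
    (10, 4, 4), (11, 4, 5), (12, 4, 6),
    (13, 5, 4),
    (14, 6, 4), (15, 6, 5),
]

QUERY = """
SELECT DISTINCT idCategory
FROM Products
WHERE idColor IN (4, 5, 6)
ORDER BY idCategory
"""


def any_color_categories():
    """Return categories that contain at least one of the colors 4, 5, 6."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Products ("
        "idProduct INTEGER PRIMARY KEY, idCategory INTEGER, idColor INTEGER)"
    )
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return [row[0] for row in rows]


result = any_color_categories()
print(result)  # [1, 3, 4, 5, 6]
```

The filter matches any category containing at least one of the three colors, so it also returns categories 3, 5, and 6, which the question's expected output excludes.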
