SQL subquery in the FROM clause

Time: 2021-03-24 22:30:09

I've found a few questions that deal with this problem, and it appears that MySQL doesn't allow it. That's fine, I don't have to have a subquery in the FROM clause. However, I don't know how to get around it. Here's my setup:

I have a metrics table that has 3 columns I want: ControllerID, TimeStamp, and State. Basically, a data gathering engine contacts each controller in the database every 5 minutes and sticks an entry in the metrics table. The table has those three columns, plus a MetricsID that I don't care about. Maybe there is a better way to store those metrics, but I don't know it. Regardless, I want a view that takes the most recent TimeStamp for each of the different ControllerIDs and grabs the TimeStamp, ControllerID, and State. So if there are 4 controllers, the view should always have 4 rows, each with a different controller, along with its most recent state.

I've been able to create a query that gets what I want, but it relies on a subquery in the FROM clause, something that isn't allowed in a view. Here is what I have so far:

SELECT *
FROM
    (SELECT 
    ControllerID, TimeStamp, State
    FROM Metrics
    ORDER BY TimeStamp DESC)
AS t
GROUP BY ControllerID;

Like I said, this works great. But I can't use it in a view. I've tried using the max() function, but as explained in "SQL: Any straightforward way to order results FIRST, THEN group by another column?", max() doesn't work if I want any additional columns besides the GROUP BY and ORDER BY columns. I've confirmed this limitation myself.

I've also tried to alter the metrics table to order by TimeStamp. That doesn't work either; the wrong rows are kept.

Edit: Here is the SHOW CREATE TABLE of the Metrics table I am pulling from:

CREATE TABLE Metrics (
  MetricsID int(11) NOT NULL AUTO_INCREMENT,
  ControllerID int(11) NOT NULL,
  TimeStamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  State tinyint(4) NOT NULL,
  PRIMARY KEY (MetricsID),
  KEY makeItFast (ControllerID,MetricsID),
  KEY fast (ControllerID,TimeStamp),
  KEY fast2 (MetricsID),
  KEY MetricsID (MetricsID),
  KEY TimeStamp (TimeStamp)
) ENGINE=InnoDB AUTO_INCREMENT=8958 DEFAULT CHARSET=latin1

2 Answers

#1


2  

If you want the most recent row for each controller, the following is view friendly:

SELECT m.ControllerID, m.TimeStamp, m.State
FROM Metrics m
WHERE NOT EXISTS (SELECT 1
                  FROM Metrics m2
                  WHERE m2.ControllerID = m.ControllerID
                    AND m2.TimeStamp > m.TimeStamp
                 );

Your query is not correct anyway, because it relies on a MySQL extension to GROUP BY whose result is not guaranteed. The value for State doesn't necessarily come from the row with the largest timestamp. It comes from an arbitrary row in the group.

EDIT:

For best performance, you want an index on Metrics(ControllerID, TimeStamp). The fast key in the SHOW CREATE TABLE output above already has those columns.
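To see the NOT EXISTS approach working as a view without setting up a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL, the view name LatestMetrics is made up for illustration, and the sample rows are invented. The subquery sits in the WHERE clause rather than the FROM clause, which is why the same view definition should also be acceptable to MySQL.

```python
import sqlite3

# SQLite stands in for MySQL; the table mirrors the columns from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Metrics (
    MetricsID    INTEGER PRIMARY KEY AUTOINCREMENT,
    ControllerID INTEGER NOT NULL,
    TimeStamp    TEXT    NOT NULL,
    State        INTEGER NOT NULL
);
CREATE INDEX fast ON Metrics (ControllerID, TimeStamp);

-- LatestMetrics is a hypothetical view name, not one from the question.
CREATE VIEW LatestMetrics AS
SELECT m.ControllerID, m.TimeStamp, m.State
FROM Metrics m
WHERE NOT EXISTS (SELECT 1
                  FROM Metrics m2
                  WHERE m2.ControllerID = m.ControllerID
                    AND m2.TimeStamp > m.TimeStamp);
""")

# Invented sample data: two controllers polled every 5 minutes.
sample_rows = [
    (1, "2024-01-01 00:00:00", 0),
    (2, "2024-01-01 00:00:00", 1),
    (1, "2024-01-01 00:05:00", 1),
    (2, "2024-01-01 00:05:00", 0),
    (1, "2024-01-01 00:10:00", 2),
]
conn.executemany(
    "INSERT INTO Metrics (ControllerID, TimeStamp, State) VALUES (?, ?, ?)",
    sample_rows,
)

# One row per controller, each carrying the State from its newest TimeStamp.
latest = conn.execute(
    "SELECT ControllerID, TimeStamp, State FROM LatestMetrics ORDER BY ControllerID"
).fetchall()
print(latest)
# [(1, '2024-01-01 00:10:00', 2), (2, '2024-01-01 00:05:00', 0)]
```

One thing to keep in mind: if a controller has two rows sharing the same newest TimeStamp, this query returns both of them, so the view can hold more rows than there are controllers.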

#2


1  

Edit: Sorry, I misunderstood your question; I thought you were trying to overcome the nested-query limitation in a view.

You're trying to display the most recent row for each distinct ControllerID. Furthermore, you're trying to do it with a view.

First, let's do it. If your MetricsID column (which I know you don't care about) is an autoincrement column, this is really easy.

SELECT ControllerID, TimeStamp, State
  FROM Metrics m
  WHERE MetricsID IN (
              SELECT MAX(MetricsID) AS MetricsID
                FROM Metrics
               GROUP BY ControllerID)
  ORDER BY ControllerID;

This query uses MAX ... GROUP BY to extract the highest-numbered row for each controller, which is the most recent one as long as rows are inserted in timestamp order. It can be made into a view.

A compound index on (ControllerID, MetricsID) will be able to satisfy the subquery with a highly efficient loose index scan.
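As a rough check of the MAX(MetricsID) approach, the sketch below builds it as a view in an in-memory SQLite database from Python. SQLite is again only a stand-in for MySQL, the view name LatestMetricsById is hypothetical, and the sample rows are invented. It assumes rows are inserted in timestamp order, so the highest MetricsID per controller is also the newest reading.

```python
import sqlite3

# SQLite stands in for MySQL; the table mirrors the columns from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Metrics (
    MetricsID    INTEGER PRIMARY KEY AUTOINCREMENT,
    ControllerID INTEGER NOT NULL,
    TimeStamp    TEXT    NOT NULL,
    State        INTEGER NOT NULL
);
CREATE INDEX makeItFast ON Metrics (ControllerID, MetricsID);

-- LatestMetricsById is a hypothetical view name, not one from the question.
CREATE VIEW LatestMetricsById AS
SELECT ControllerID, TimeStamp, State
FROM Metrics
WHERE MetricsID IN (SELECT MAX(MetricsID)
                    FROM Metrics
                    GROUP BY ControllerID);
""")

# Invented sample data, inserted in time order so MetricsID grows with TimeStamp.
# Controller 2 has two rows with the same newest TimeStamp.
sample_rows = [
    (1, "2024-01-01 00:00:00", 0),
    (2, "2024-01-01 00:00:00", 1),
    (1, "2024-01-01 00:05:00", 1),
    (2, "2024-01-01 00:05:00", 0),
    (2, "2024-01-01 00:05:00", 3),
]
conn.executemany(
    "INSERT INTO Metrics (ControllerID, TimeStamp, State) VALUES (?, ?, ?)",
    sample_rows,
)

# Exactly one row per controller: the one with the highest MetricsID.
rows = conn.execute(
    "SELECT ControllerID, TimeStamp, State FROM LatestMetricsById ORDER BY ControllerID"
).fetchall()
print(rows)
# [(1, '2024-01-01 00:05:00', 1), (2, '2024-01-01 00:05:00', 3)]
```

Because MetricsID is unique, this version always yields one row per controller, even when two readings share a TimeStamp. The trade-off is that it trusts insertion order rather than the TimeStamp column itself.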

The root cause of my confusion: I didn't read your question carefully enough.

The root cause of your confusion: You're trying to take advantage of a pernicious MySQL extension to GROUP BY. Your idea of ordering the subquery may have worked. But your temporary success is an accidental side-effect of the present implementation. Read this: http://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html
