I've found a few questions that deal with this problem, and it appears that MySQL doesn't allow a subquery in the FROM clause of a view. That's fine; I don't have to have a subquery in the FROM clause. However, I don't know how to get around it. Here's my setup:
I have a metrics table that has 3 columns I want: ControllerID, TimeStamp, and State. Basically, a data gathering engine contacts each controller in the database every 5 minutes and sticks an entry in the metrics table. The table has those three columns, plus a MetricsID that I don't care about. Maybe there is a better way to store those metrics, but I don't know it. Regardless, I want a view that takes the most recent TimeStamp for each of the different ControllerIDs and grabs the TimeStamp, ControllerID, and State. So if there are 4 controllers, the view should always have 4 rows, each with a different controller, along with its most recent state.
I've been able to create a query that gets what I want, but it relies on a subquery in the FROM clause, something that isn't allowed in a view. Here is what I have so far:
SELECT *
FROM
    (SELECT ControllerID, TimeStamp, State
     FROM Metrics
     ORDER BY TimeStamp DESC) AS t
GROUP BY ControllerID;
Like I said, this works great, but I can't use it in a view. I've tried using the max() function, but as described in "SQL: Any straightforward way to order results FIRST, THEN group by another column?", max() doesn't work if I want any additional columns besides the GROUP BY and ORDER BY columns. I've confirmed this limitation myself.
I've also tried to alter the metrics table to order by TimeStamp. That doesn't work either; the wrong rows are kept.
Edit: Here is the SHOW CREATE TABLE of the Metrics table I am pulling from:
CREATE TABLE Metrics (
    MetricsID int(11) NOT NULL AUTO_INCREMENT,
    ControllerID int(11) NOT NULL,
    TimeStamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    State tinyint(4) NOT NULL,
    PRIMARY KEY (MetricsID),
    KEY makeItFast (ControllerID,MetricsID),
    KEY fast (ControllerID,TimeStamp),
    KEY fast2 (MetricsID),
    KEY MetricsID (MetricsID),
    KEY TimeStamp (TimeStamp)
) ENGINE=InnoDB AUTO_INCREMENT=8958 DEFAULT CHARSET=latin1
2 Answers
#1
If you want the most recent row for each controller, the following is view friendly:
SELECT m.ControllerID, m.TimeStamp, m.State
FROM Metrics m
WHERE NOT EXISTS (SELECT 1
                  FROM Metrics m2
                  WHERE m2.ControllerID = m.ControllerID
                    AND m2.TimeStamp > m.TimeStamp
                 );
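As a rough sketch, the query above could be wrapped in a view like this. The view name LatestControllerState is made up for illustration; only the Metrics table and its columns come from the question. The subquery sits in the WHERE clause rather than the FROM clause, which is why the view restriction mentioned in the question should not apply.

```sql
-- Hypothetical view name; the SELECT is the NOT EXISTS query from this answer.
CREATE OR REPLACE VIEW LatestControllerState AS
SELECT m.ControllerID, m.TimeStamp, m.State
FROM Metrics AS m
WHERE NOT EXISTS (
    -- No row for the same controller may have a later timestamp.
    SELECT 1
    FROM Metrics AS m2
    WHERE m2.ControllerID = m.ControllerID
      AND m2.TimeStamp > m.TimeStamp
);

-- Usage: one row per controller, ordered at query time.
SELECT ControllerID, TimeStamp, State
FROM LatestControllerState
ORDER BY ControllerID;
```

One caveat with this approach: if two rows for the same controller share the same latest TimeStamp, both rows pass the NOT EXISTS test and the view returns both.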
Your query is not correct anyway, because it relies on a MySQL extension to GROUP BY whose result is not guaranteed. The value for State doesn't necessarily come from the row with the largest timestamp; it comes from an arbitrary row in each group.
EDIT:
For best performance, you want an index on Metrics(ControllerID, TimeStamp).
#2
Edit: Sorry, I misunderstood your question; I thought you were trying to overcome the nested-query limitation in a view.
You're trying to display the most recent row for each distinct ControllerID. Furthermore, you're trying to do it with a view.
First, let's do it. If your MetricsID column (which I know you don't care about) is an autoincrement column, this is really easy.
SELECT ControllerID, TimeStamp, State
FROM Metrics m
WHERE MetricsID IN (
    SELECT MAX(MetricsID) AS MetricsID
    FROM Metrics
    GROUP BY ControllerID)
ORDER BY ControllerID;
This query uses MAX ... GROUP BY to extract the highest-numbered (most recent) row for each controller. It can be made into a view.
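A minimal sketch of that view follows. The name LatestControllerStateById is an assumption made for illustration, and the whole approach assumes rows are inserted in time order, so that the highest MetricsID for a controller is also its most recent reading.

```sql
-- Hypothetical view name; relies on MetricsID being AUTO_INCREMENT
-- and on rows being inserted in chronological order.
CREATE OR REPLACE VIEW LatestControllerStateById AS
SELECT m.ControllerID, m.TimeStamp, m.State
FROM Metrics AS m
WHERE m.MetricsID IN (
    -- Highest MetricsID per controller.
    SELECT MAX(MetricsID)
    FROM Metrics
    GROUP BY ControllerID
);

-- Usage: apply the ordering when reading from the view.
SELECT ControllerID, TimeStamp, State
FROM LatestControllerStateById
ORDER BY ControllerID;
```

Because MetricsID is the primary key, each controller contributes exactly one row, even when two readings carry the same TimeStamp.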
A compound index on (ControllerID, MetricsID) will be able to satisfy the subquery with a highly efficient loose index scan.
The root cause of my confusion: I didn't read your question carefully enough.
The root cause of your confusion: You're trying to take advantage of a pernicious MySQL extension to GROUP BY. Your idea of ordering the subquery may have worked. But your temporary success is an accidental side-effect of the present implementation. Read this: http://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html
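One way to see that the original query leans on this extension is to turn on the ONLY_FULL_GROUP_BY SQL mode for the session and run it again. This is a sketch, not something taken from the question: with that mode enabled, MySQL should reject the statement, since TimeStamp and State are neither aggregated nor listed in GROUP BY.

```sql
-- Note: this replaces every other SQL mode for the current session only.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- The query from the question. Under ONLY_FULL_GROUP_BY it is expected
-- to fail (error 1055) instead of silently picking an arbitrary row.
SELECT *
FROM (SELECT ControllerID, TimeStamp, State
      FROM Metrics
      ORDER BY TimeStamp DESC) AS t
GROUP BY ControllerID;
```

A query that survives this mode does not depend on which row MySQL happens to pick for each group.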