如何使用Group By和self-join返回最小值,最大值,打开和关闭每日价格恢复设置?

时间:2021-03-12 12:27:29

SOLVED

All hail StackOverlow!

欢呼StackOverlow!

While I was gone, people left 2 solutions (thanks guys--what is the protocol for handing out karma for two working solutions?)

当我离开时,人们留下了两个解决方案(谢谢你们 - 为两个工作解决方案分发业力的协议是什么?)

Here is the solution that I came back to post. it is derived from yet ANOTHER StackOver solution:

这是我回来发布的解决方案。它源自另一个StackOver解决方案:

How to fetch the first and last record of a grouped record in a MySQL query with aggregate functions?

如何使用聚合函数获取MySQL查询中分组记录的第一个和最后一个记录?

...and my adaptation is:

......我的改编是:

SELECT
DATE_FORMAT(`DTE`, "%m/%d/%Y") AS trading_day,
MIN(`PRICE`) AS min_price,
MAX(`PRICE`) AS max_price,
SUBSTRING_INDEX( 
    GROUP_CONCAT(
        PRICE
        ORDER BY DTE ASC
        )
, ',', 1 ) AS opn_price,
SUBSTRING_INDEX( 
    GROUP_CONCAT(
        PRICE
        ORDER BY DTE DESC
        )
, ',', 1 ) AS cls_price 
FROM `CHART_DATA` 
GROUP BY trading_day
;

The data the "Q" above starts with is the data with which I am trying to end. Hopefully, this helps someone else since I suspect that my log is fairly common.

上面“Q”开头的数据是我试图结束的数据。希望这有助于其他人,因为我怀疑我的日志相当普遍。

I am willing to bet that one of these three solutions has a performance advantage. If anyone happens to know the inner workings of MySQL and query optimization and cares to recommend the "preferred" solution, that will be useful to know in the future.

我愿意打赌这三种解决方案中的一种具有性能优势。如果有人碰巧知道MySQL的内部工作方式和查询优化并且关心推荐“首选”解决方案,那将来有用的知识。

END SOLVED

UPDATE #2

Tryhing to come at it from yet another direction using this:

尝试使用此方法从另一个方向来:

http://forums.mysql.com/read.php?65,363723,363723

I get:

SELECT
DATE_FORMAT(`DTE`, "%m/%d/%Y") AS trading_day,
MIN(`PRICE`) AS min_price,
MAX(`PRICE`) AS max_price,
(SELECT opn_price FROM 
    (SELECT 
    DATE_FORMAT(`DTE`, "%m/%d/%Y") AS a_day,    
    PRICE AS opn_price,
    MIN(DTE) AS opn
    FROM `CHART_DATA` 
    GROUP BY a_day
    ORDER BY opn ASC LIMIT 1) AS tblO) AS opnqt,
(SELECT cls_price FROM 
    (SELECT 
    DATE_FORMAT(`DTE`, "%m/%d/%Y") AS a_day,    
    PRICE AS cls_price,
    MIN(DTE) AS cls
    FROM `CHART_DATA` 
    GROUP BY a_day
    ORDER BY cls DESC LIMIT 1) AS tblC) AS clsqt    
FROM `CHART_DATA` cht
GROUP BY trading_day;

This suffers from a similar dysfunction as the query in the first UPDATE below; the 'clsqt' (cls_price) value returned is the last closing price found in the data. Blech.

这遭受与下面第一个UPDATE中的查询类似的功能障碍;返回的'clsqt'(cls_price)值是数据中找到的最后一个收盘价。布莱什。

Plus, we are starting to get into "hideously complex" query space again and that CANNOT be good for performance.

此外,我们再次开始进入“可怕的复杂”查询空间,这对性能不利。

But if anyone sees the fix for the 'clsqt' value, I would accept it gladly and solve the performance issue at a later day. :)

但是,如果有人看到'clsqt'值的修复,我会很乐意接受它并在晚些时候解决性能问题。 :)

END UPDATE #2

END UPDATE#2

UPDATE

So close...here's where I am today:

如此接近......这就是我今天所处的位置:

SELECT 
DATE_FORMAT(`cht1`.`DTE`, "%m/%d/%Y") AS trading_day,
MIN(`cht1`.`PRICE`) AS min_price,
MAX(`cht1`.`PRICE`) AS max_price,
    MIN(cht1.DTE) AS opn_date1,
    MIN(cht1.DTE) AS opn_date2,
    `cht2`.`PRICE` AS opn_price,
        MAX(cht1.DTE) AS cls_date1,
        MAX(cht3.DTE) AS cls_date3,
        `cht3`.`PRICE` AS cls_price
FROM `CHART_DATA` cht1
     LEFT JOIN `CHART_DATA` cht2 
     ON cht2.DTE = cht1.DTE
     LEFT JOIN `CHART_DATA` cht3 
     ON cht3.DTE = cht1.DTE
GROUP BY trading_day
HAVING opn_date1 = opn_date2
AND cls_date1 = cls_date3
;

This retuns everything correctly BUT the correct 'cls_price' (it is returnign the same value for 'cls_price' as 'opn_price').

这会正确地返回所有内容但是正确的'cls_price'(它将'cls_price'的值与'opn_price'相同)。

However, 'cls_date1' and 'cls_date3' are the correct values, so I must be close.

但是,'cls_date1'和'cls_date3'是正确的值,所以我必须接近。

Anyone see what I am not?

有人看到我不是吗?

END UPDATE

I have been poring over SO with regard to left joins and self joins...and I must admit that I am not grokking.

关于左联盟和自我加入,我一直在仔细研究......我必须承认我不是很喜欢。

I found this "Q" that seems very close to what I want: left join with condition for right table in mysql

我发现这个“Q”看起来非常接近我想要的东西:在mysql中保留右表连接条件

What I am trying to do is retrieve open, close, min, and max price days from a single table (sample data below).

我要做的是从单个表中检索开盘价,收盘价,最小价和最大价格日(下面的示例数据)。

Min and Max are easy:

Min和Max很简单:

SELECT 
DATE_FORMAT(`DTE`, "%m/%d/%Y") AS trading_day,
MIN(`PRICE`) AS min_price,
MAX(`PRICE`) AS max_price
FROM `CHART_DATA` 
GROUP BY trading_day;

I want the results returned group by date, somthing like:

我希望结果按日期返回,类似于:

'trading_day' 'opn_price' 'min_price' 'max_price' 'cls_price'

Okay, so I try 'baby steps' with just one join...

好的,所以我只用一次加入就试试'婴儿步'

SELECT 
DATE_FORMAT(`cht1`.`DTE`, "%m/%d/%Y") AS trading_day,
MIN(`cht1`.`PRICE`) AS min_price,
MAX(`cht1`.`PRICE`) AS max_price,
`cht2`.`PRICE` AS opn_price
FROM `CHART_DATA` cht1
 LEFT JOIN `CHART_DATA` cht2 
 ON cht2.DTE = MIN(cht1.DTE)
GROUP BY trading_day;

...and I get the message "Invalid use of group function"

...我收到消息“无效使用组功能”

Of course, removing the "GROUP BY" is no help, since I need to return aggegate columns.

当然,删除“GROUP BY”没有任何帮助,因为我需要返回aggegate列。

I have a really complex solution that gets the open and close results, but not the min and max--and they are in separate result sets. I get the feeling that I have made this more complex than is necessary and that fi I could just grasp what is going on with the self joins cited in the "Q" referenced above, that my overall coding would improvie immeasurably. But I have spent something like 12 hours on this during the past weekend and am more confusted than ever.

我有一个非常复杂的解决方案,可以获得开放和结果,但不是最小值和最大值 - 它们位于不同的结果集中。我觉得我已经把它变得比必要的更复杂了,而且我可以理解上面引用的“Q”中引用的自连接的情况,我的整体编码会不可估量地改进。但是在过去的这个周末我花了大约12个小时的时间,而且比以往任何时候都更加安心。

All insight and explantion and observation is welcome at this point...

此时欢迎所有见解,解释和观察......

/* SAMPLE TABLE AND DATA */

CREATE TABLE `CHART_DATA` (
  `ID` varchar(10) DEFAULT NULL,
  `DTE` datetime DEFAULT NULL,
  `PRICE` double DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

/*Data for the table `CHART_DATA` */

INSERT INTO `chart_data`
            (`id`,`dte`,`price`)
VALUES      ('1','2011-01-01 00:10:00',0.65),
            ('2','2011-01-01 06:10:00',0.92),
            ('3','2011-01-01 12:10:00',0.59),
            ('4','2011-01-01 18:10:00',0.16),
            ('5','2011-01-02 00:10:00',0.28),
            ('6','2011-01-02 06:10:00',0.12),
            ('7','2011-01-02 12:10:00',0.92),
            ('8','2011-01-02 18:10:00',0.1),
            ('9','2011-01-03 00:10:00',0.34),
            ('10','2011-01-03 06:10:00',0.79),
            ('11','2011-01-03 12:10:00',1.23),
            ('12','2011-01-03 18:10:00',1.24),
            ('13','2011-01-04 00:10:00',1.12),
            ('14','2011-01-04 06:10:00',0.8),
            ('15','2011-01-04 12:10:00',0.65),
            ('16','2011-01-04 18:10:00',0.78),
            ('17','2011-01-05 00:10:00',0.65),
            ('18','2011-01-05 06:10:00',1.19),
            ('19','2011-01-05 12:10:00',0.89),
            ('20','2011-01-05 18:10:00',1.05),
            ('21','2011-01-06 00:10:00',0.29),
            ('22','2011-01-06 06:10:00',0.43),
            ('23','2011-01-06 12:10:00',0.26),
            ('24','2011-01-06 18:10:00',0.34),
            ('25','2011-01-07 00:10:00',0.22),
            ('26','2011-01-07 06:10:00',0.37),
            ('27','2011-01-07 12:10:00',1.22),
            ('28','2011-01-07 18:10:00',1.16),
            ('29','2011-01-08 00:10:00',0.3),
            ('30','2011-01-08 06:10:00',1.17),
            ('31','2011-01-08 12:10:00',0.62),
            ('32','2011-01-08 18:10:00',0.86),
            ('33','2011-01-09 00:10:00',0.84),
            ('34','2011-01-09 06:10:00',1.11),
            ('35','2011-01-09 12:10:00',0.92),
            ('36','2011-01-09 18:10:00',1.03),
            ('37','2011-01-10 00:10:00',1.13),
            ('38','2011-01-10 06:10:00',0.58),
            ('39','2011-01-10 12:10:00',1.03),
            ('40','2011-01-10 18:10:00',0.21),
            ('41','2011-01-11 00:10:00',0.12),
            ('42','2011-01-11 06:10:00',1.01),
            ('43','2011-01-11 12:10:00',0.19),
            ('44','2011-01-11 18:10:00',1.14),
            ('45','2011-01-12 00:10:00',0.55),
            ('46','2011-01-12 06:10:00',0.75),
            ('47','2011-01-12 12:10:00',0.66),
            ('48','2011-01-12 18:10:00',1.1),
            ('49','2011-01-13 00:10:00',0.68),
            ('50','2011-01-13 06:10:00',0.3),
            ('51','2011-01-13 12:10:00',0.9),
            ('52','2011-01-13 18:10:00',0.88),
            ('53','2011-01-14 00:10:00',0.64),
            ('54','2011-01-14 06:10:00',1.06),
            ('55','2011-01-14 12:10:00',1.12),
            ('56','2011-01-14 18:10:00',0.76),
            ('57','2011-01-15 00:10:00',0.18),
            ('58','2011-01-15 06:10:00',1.08),
            ('59','2011-01-15 12:10:00',0.66),
            ('60','2011-01-15 18:10:00',0.38),
            ('61','2011-01-16 00:10:00',1),
            ('62','2011-01-16 06:10:00',1.18),
            ('63','2011-01-16 12:10:00',1.15),
            ('64','2011-01-16 18:10:00',0.58),
            ('65','2011-01-17 00:10:00',1.04),
            ('66','2011-01-17 06:10:00',0.81),
            ('67','2011-01-17 12:10:00',0.35),
            ('68','2011-01-17 18:10:00',0.91),
            ('69','2011-01-18 00:10:00',0.14),
            ('70','2011-01-18 06:10:00',0.13),
            ('71','2011-01-18 12:10:00',1.03),
            ('72','2011-01-18 18:10:00',0.16),
            ('73','2011-01-19 00:10:00',1.05),
            ('74','2011-01-19 06:10:00',1.13),
            ('75','2011-01-19 12:10:00',1.21),
            ('76','2011-01-19 18:10:00',0.34),
            ('77','2011-01-20 00:10:00',0.63),
            ('78','2011-01-20 06:10:00',0.62),
            ('79','2011-01-20 12:10:00',0.19),
            ('80','2011-01-20 18:10:00',1.21),
            ('81','2011-01-21 00:10:00',0.83),
            ('82','2011-01-21 06:10:00',0.99),
            ('83','2011-01-21 12:10:00',0.83),
            ('84','2011-01-21 18:10:00',0.21),
            ('85','2011-01-22 00:10:00',0.8),
            ('86','2011-01-22 06:10:00',0.69),
            ('87','2011-01-22 12:10:00',0.87);  

2 个解决方案

#1


1  

So you want:

所以你要:

  • The alias C to correspond to all rows in the group for a given day, so you can use MAX() and MIN() over the rows in that group.
  • 别名C对应于给定日期内组中的所有行,因此您可以对该组中的行使用MAX()和MIN()。

  • The alias C2 to correspond to the last row in a given day.
  • 别名C2对应于给定日期的最后一行。

  • The alias C3 to correspond to a row later than C2 on the same day. If none is found, i.e. C3.* is NULL, then C2 is the latest on that day.
  • 别名C3对应于同一天的C2之后的行。如果没有找到,即C3。*为NULL,则C2是当天最新的。

This is often labeled a greatest-n-per-group query, and it comes up frequently on Stack Overflow. Here's a solution that I tested for your test data, but you can follow the tag I added to your question for other solutions and discussion.

这通常被标记为每组最大的查询,并且它经常出现在Stack Overflow上。这是我为您的测试数据测试过的解决方案,但您可以按照我在问题中添加的标记进行其他解决方案和讨论。

edit: I missed the requirement for both opening price and closing price. The following is edited.

编辑:我错过了开盘价和收盘价的要求。以下内容已经过编辑。

SELECT DATE_FORMAT(C.`DTE`, '%m/%d/%Y') AS trading_day, 
  MIN(C.`PRICE`) AS min_price, 
  MAX(C.`PRICE`) AS max_price, 
  Copen.`PRICE` AS opening_price,
  Cclose.`PRICE` AS closing_price 
FROM `CHART_DATA` AS C 
INNER JOIN `CHART_DATA` AS Cclose 
  ON DAY(C.`DTE`) = DAY(Cclose.`DTE`) 
LEFT OUTER JOIN `CHART_DATA` AS Cclose_later 
  ON DAY(C.`DTE`) = DAY(Cclose_later.`DTE`) AND Cclose.`DTE` < Cclose_later.`DTE`
INNER JOIN `CHART_DATA` AS Copen 
  ON DAY(C.`DTE`) = DAY(Copen.`DTE`) 
LEFT OUTER JOIN `CHART_DATA` AS Copen_earlier 
  ON DAY(C.`DTE`) = DAY(Copen_earlier.`DTE`) AND Copen.`DTE` < Copen_earlier.`DTE`
WHERE Cclose_later.`DTE` IS NULL AND Copen_earlier .`DTE` IS NULL 
GROUP BY trading_day;

#2


1  

SELECT
  a.trading_day, a.min_price, a.max_price, 
  b.price as opn_price, 
  c.price as cls_price
FROM
  (SELECT 
     DATE_FORMAT(`DTE`, "%m/%d/%Y") AS trading_day,
     MIN(`PRICE`) AS min_price,
     MAX(`PRICE`) AS max_price,
     MIN(`dte`) AS open_date,
     MAX(`dte`) AS close_date
   FROM `CHART_DATA`
   GROUP BY trading_day) a
LEFT JOIN
  `CHART_DATA` b ON b.dte = a.open_date
LEFT JOIN
  `CHART_DATA` c ON c.dte = a.close_date

Note: this solution may present some problems if your opening or closing entry has the exact same date/time value as another row (i.e. the transaction that came immediately after opening, or immediately before closing). To address this, I would suggest that you add a sequence number that is stored in a way that guarantees uniqueness, and increasing-with-respect-to-time. If you do this, then you would use the sequence value in the place of dte to replace the open_date and close_date I've used as join fields in my example.

注意:如果您的开盘或收盘条目具有与另一行完全相同的日期/时间值(即在开盘后立即或在收盘前立即进行的交易),此解决方案可能会出现一些问题。为了解决这个问题,我建议你添加一个序列号,该序列号以保证唯一性的方式存储,并且随着时间的推移而增加。如果你这样做,那么你将在dte的位置使用序列值来替换我在我的例子中用作连接字段的open_date和close_date。

#1


1  

So you want:

所以你要:

  • The alias C to correspond to all rows in the group for a given day, so you can use MAX() and MIN() over the rows in that group.
  • 别名C对应于给定日期内组中的所有行,因此您可以对该组中的行使用MAX()和MIN()。

  • The alias C2 to correspond to the last row in a given day.
  • 别名C2对应于给定日期的最后一行。

  • The alias C3 to correspond to a row later than C2 on the same day. If none is found, i.e. C3.* is NULL, then C2 is the latest on that day.
  • 别名C3对应于同一天的C2之后的行。如果没有找到,即C3。*为NULL,则C2是当天最新的。

This is often labeled a greatest-n-per-group query, and it comes up frequently on Stack Overflow. Here's a solution that I tested for your test data, but you can follow the tag I added to your question for other solutions and discussion.

这通常被标记为每组最大的查询,并且它经常出现在Stack Overflow上。这是我为您的测试数据测试过的解决方案,但您可以按照我在问题中添加的标记进行其他解决方案和讨论。

edit: I missed the requirement for both opening price and closing price. The following is edited.

编辑:我错过了开盘价和收盘价的要求。以下内容已经过编辑。

SELECT DATE_FORMAT(C.`DTE`, '%m/%d/%Y') AS trading_day, 
  MIN(C.`PRICE`) AS min_price, 
  MAX(C.`PRICE`) AS max_price, 
  Copen.`PRICE` AS opening_price,
  Cclose.`PRICE` AS closing_price 
FROM `CHART_DATA` AS C 
INNER JOIN `CHART_DATA` AS Cclose 
  ON DAY(C.`DTE`) = DAY(Cclose.`DTE`) 
LEFT OUTER JOIN `CHART_DATA` AS Cclose_later 
  ON DAY(C.`DTE`) = DAY(Cclose_later.`DTE`) AND Cclose.`DTE` < Cclose_later.`DTE`
INNER JOIN `CHART_DATA` AS Copen 
  ON DAY(C.`DTE`) = DAY(Copen.`DTE`) 
LEFT OUTER JOIN `CHART_DATA` AS Copen_earlier 
  ON DAY(C.`DTE`) = DAY(Copen_earlier.`DTE`) AND Copen.`DTE` < Copen_earlier.`DTE`
WHERE Cclose_later.`DTE` IS NULL AND Copen_earlier .`DTE` IS NULL 
GROUP BY trading_day;

#2


1  

SELECT
  a.trading_day, a.min_price, a.max_price, 
  b.price as opn_price, 
  c.price as cls_price
FROM
  (SELECT 
     DATE_FORMAT(`DTE`, "%m/%d/%Y") AS trading_day,
     MIN(`PRICE`) AS min_price,
     MAX(`PRICE`) AS max_price,
     MIN(`dte`) AS open_date,
     MAX(`dte`) AS close_date
   FROM `CHART_DATA`
   GROUP BY trading_day) a
LEFT JOIN
  `CHART_DATA` b ON b.dte = a.open_date
LEFT JOIN
  `CHART_DATA` c ON c.dte = a.close_date

Note: this solution may present some problems if your opening or closing entry has the exact same date/time value as another row (i.e. the transaction that came immediately after opening, or immediately before closing). To address this, I would suggest that you add a sequence number that is stored in a way that guarantees uniqueness, and increasing-with-respect-to-time. If you do this, then you would use the sequence value in the place of dte to replace the open_date and close_date I've used as join fields in my example.

注意:如果您的开盘或收盘条目具有与另一行完全相同的日期/时间值(即在开盘后立即或在收盘前立即进行的交易),此解决方案可能会出现一些问题。为了解决这个问题,我建议你添加一个序列号,该序列号以保证唯一性的方式存储,并且随着时间的推移而增加。如果你这样做,那么你将在dte的位置使用序列值来替换我在我的例子中用作连接字段的open_date和close_date。