I've been hammering my head against my desk for the past few days on this, and so I turn to you, Stack Overflow.
The software I'm working on has time-sensitive data. The usual solution for this is effective and expiration dates.
EFF_DT XPIR_DT VALUE
2000-05-01 2000-10-31 100
2000-11-01 (null) 90
This would be easy. Unfortunately, we require data that repeats on a yearly basis arbitrarily far into the future. In other words, each May 1 (starting in 2000) we may want the effective value to be 100, and each November 1 we may want to change it to 90.
This may be true for a long time (>50 years), and so I don't want to just create a hundred records. I.e., I don't want to do this:
EFF_DT XPIR_DT VALUE
2000-05-01 2000-10-31 100
2000-11-01 2001-04-30 90
2001-05-01 2001-10-31 100
2001-11-01 2002-04-30 90
2002-05-01 2002-10-31 100
2002-11-01 2003-04-30 90
...
2049-05-01 2049-10-31 100
2049-11-01 2050-04-30 90
2050-05-01 2050-10-31 100
2050-11-01 2051-04-30 90
These values may also change with time. Values before 2000 might have been constant (no flip-flopping) and values for the coming decade may be different than the values for the last:
EFF_DT XPIR_DT REPEATABLE VALUE
1995-01-01 2000-04-30 false 85
2000-05-01 2010-04-30 true 100
2000-11-01 2010-10-31 true 90
2010-05-01 (null) true 120
2010-11-01 (null) true 115
We already have a text file (from a legacy app) that stores data in a form very close to this, so there are benefits to adhering to this type of structure as closely as possible.
The question then comes on retrieval: which value would apply to today, 2010-03-09?
It seems that the best way to do this would be to find the most recent instance of each effective date (of all the active rows), then see which is the greatest.
EFF_DT MOST_RECENT XPIR_DT VALUE
2000-05-01 2009-05-01 2010-04-30 100
2000-11-01 2009-11-01 2010-10-31 90
The value for today would be 90, since 2009-11-01 is later than 2009-05-01.
On, say, 2007-06-20:
EFF_DT MOST_RECENT XPIR_DT VALUE
2000-05-01 2007-05-01 2010-04-30 100
2000-11-01 2006-11-01 2010-10-31 90
The value would be 100 since 2007-05-01 is later than 2006-11-01.
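The lookup described above can also be done on the Java side once the active rows are loaded. What follows is only a minimal sketch under stated assumptions: the class `RepeatingValues`, the `Row` type, and the methods `mostRecent`, `valueOn`, and `sampleRows` are hypothetical names invented for illustration, XPIR_DT is treated as an inclusive end date (as the sample data suggests), and a null XPIR_DT means "no expiration".

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

public class RepeatingValues {

    /** Hypothetical row type mirroring the EFF_DT / XPIR_DT / REPEATABLE / VALUE columns. */
    public static final class Row {
        public final LocalDate effDt;
        public final LocalDate xpirDt;   // null means "no expiration" (assumption)
        public final boolean repeatable;
        public final int value;

        public Row(String effDt, String xpirDt, boolean repeatable, int value) {
            this.effDt = LocalDate.parse(effDt);
            this.xpirDt = xpirDt == null ? null : LocalDate.parse(xpirDt);
            this.repeatable = repeatable;
            this.value = value;
        }
    }

    /** Latest yearly anniversary of effDt that falls on or before 'on' (requires effDt <= on). */
    public static LocalDate mostRecent(LocalDate effDt, LocalDate on) {
        int years = on.getYear() - effDt.getYear();
        LocalDate candidate = effDt.plusYears(years);
        // If this year's anniversary has not happened yet, fall back to last year's.
        return candidate.isAfter(on) ? effDt.plusYears(years - 1) : candidate;
    }

    /** Value in effect on 'on', or null if no row is active. XPIR_DT is treated as inclusive. */
    public static Integer valueOn(List<Row> rows, LocalDate on) {
        Row best = null;
        LocalDate bestDate = null;
        for (Row r : rows) {
            boolean active = !r.effDt.isAfter(on)
                    && (r.xpirDt == null || !r.xpirDt.isBefore(on));
            if (!active) {
                continue;
            }
            // Non-repeating rows only ever "occur" on their own effective date.
            LocalDate occurrence = r.repeatable ? mostRecent(r.effDt, on) : r.effDt;
            if (bestDate == null || occurrence.isAfter(bestDate)) {
                best = r;
                bestDate = occurrence;
            }
        }
        return best == null ? null : best.value;
    }

    /** The five sample rows from the REPEATABLE table in the question. */
    public static List<Row> sampleRows() {
        return Arrays.asList(
                new Row("1995-01-01", "2000-04-30", false, 85),
                new Row("2000-05-01", "2010-04-30", true, 100),
                new Row("2000-11-01", "2010-10-31", true, 90),
                new Row("2010-05-01", null, true, 120),
                new Row("2010-11-01", null, true, 115));
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        System.out.println(valueOn(rows, LocalDate.parse("2010-03-09"))); // prints 90
        System.out.println(valueOn(rows, LocalDate.parse("2007-06-20"))); // prints 100
    }
}
```

This moves the date math out of MySQL entirely, which may or may not be acceptable depending on how many rows have to be pulled back per lookup.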
Using the MySQL date functions, what's the most efficient way to calculate the MOST_RECENT field?
Or, can anyone think of a better way to do this?
The language is Java, if it matters. Thanks all!
3 Answers
#1
2
Suppose the date you want is '2007-06-20'.
You need to combine the non-repeating rows with the repeating ones, so you could do something like this (untested and it probably needs some tinkering, but it should give you the general idea):
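select * from (
    -- non-repeating rows: active if the target date falls within [EFF_DT, XPIR_DT]
    select t.*, EFF_DT as MOST_RECENT
    from mytable t
    where repeatable = false
      and EFF_DT <= '2007-06-20'
      and (XPIR_DT is null or XPIR_DT >= '2007-06-20')
    union all
    -- repeating rows: shift EFF_DT forward by the number of whole years elapsed
    select t.*, date_add(EFF_DT, interval timestampdiff(year, EFF_DT, '2007-06-20') year) as MOST_RECENT
    from mytable t
    where repeatable = true
      and EFF_DT <= '2007-06-20'
      and (XPIR_DT is null or XPIR_DT >= '2007-06-20')
) as candidates
order by MOST_RECENT desc
limit 1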
#2
1
I've had to do similar things with recurring appointments & events, and you might find that MySQL will be a lot happier with the "static" date style that you don't want - each recurring instance spelled out in hundreds of rows.
If possible, I'd consider creating a separate table to store them flattened out, while keeping the effective/expires dates where they are (to match legacy data & act as a parent), and a 1:many relation between the two tables (i.e. an "event_id" on the flattened data referencing the original's PK). Writing all those records will obviously take longer, but it's directly lightening the load from reading them (where things generally need to be faster).
Creating a stored procedure or external program to handle recalculating a flat start_date / end_date / value table should be fairly basic, given a common interval. Querying the data could then be as simple as WHERE @somedate BETWEEN start_date AND end_date, instead of increasingly complex conversions & date math.
Again, INSERTs and UPDATEs will be slower, but "hundreds of rows" isn't even scratching the surface of what MySQL's capable of. If it's just 2 dates, an int, and some sort of int key, writing a few hundred records shouldn't take but a couple seconds on a sub-par server. If we were talking millions of records then maybe something could be tweaked (do you really need to track 50 years ahead or just the next 5? can recalculation be moved to off-peak times via cron? etc), but even then MySQL will just be that much more effective compared to calculating the difference every time.
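As a rough illustration of the "external program" option, here is a minimal Java sketch that expands yearly switch points into flat start_date / end_date / value rows. It is not a definitive implementation: `FlattenRepeating`, `Rule`, `FlatRow`, and `flatten` are hypothetical names, the year range is an arbitrary choice, and each period is assumed to end the day before the next switch point (inclusive end dates, as in the question's sample data).

```java
import java.time.LocalDate;
import java.time.MonthDay;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FlattenRepeating {

    /** Hypothetical repeating rule: on this month/day each year, the value changes. */
    public static final class Rule {
        public final MonthDay switchDay;
        public final int value;

        public Rule(MonthDay switchDay, int value) {
            this.switchDay = switchDay;
            this.value = value;
        }
    }

    /** Hypothetical flat row: one concrete start_date / end_date / value period. */
    public static final class FlatRow {
        public final LocalDate startDate;
        public final LocalDate endDate;   // inclusive
        public final int value;

        public FlatRow(LocalDate startDate, LocalDate endDate, int value) {
            this.startDate = startDate;
            this.endDate = endDate;
            this.value = value;
        }

        @Override
        public String toString() {
            return startDate + " " + endDate + " " + value;
        }
    }

    /** Expands the rules into one flat row per occurrence, for firstYear..lastYear inclusive. */
    public static List<FlatRow> flatten(List<Rule> rules, int firstYear, int lastYear) {
        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparing((Rule r) -> r.switchDay));
        List<FlatRow> out = new ArrayList<>();
        for (int year = firstYear; year <= lastYear; year++) {
            for (int i = 0; i < sorted.size(); i++) {
                Rule current = sorted.get(i);
                LocalDate start = current.switchDay.atYear(year);
                // The period ends the day before the next switch point:
                // the following rule this year, or the first rule of next year.
                LocalDate nextStart = (i + 1 < sorted.size())
                        ? sorted.get(i + 1).switchDay.atYear(year)
                        : sorted.get(0).switchDay.atYear(year + 1);
                out.add(new FlatRow(start, nextStart.minusDays(1), current.value));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule(MonthDay.of(5, 1), 100),
                new Rule(MonthDay.of(11, 1), 90));
        // First line printed: 2000-05-01 2000-10-31 100
        for (FlatRow row : flatten(rules, 2000, 2002)) {
            System.out.println(row);
        }
    }
}
```

The resulting rows could then be written to the flattened table and regenerated whenever a parent rule changes.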
Also maybe of interest: "What's the best way to model recurring events in a calendar application?" and "Data structure for storing recurring events?"
#3
0
Here is a query you can use to find the most recent EFF_DT in a data set. You will have to fill in the WHERE clause yourself, because I'm not sure how this data is organized.
select EFF_DT from date_table where 1 order by EFF_DT desc limit 1
The flip-flop between 90 and 100 is more complex, but you should be able to handle it with the MySQL date and time functions. This is a tricky one, and I'm not 100% sure what you are trying to do. This query checks whether the month of XPIR_DT is May (the 5th month) or later but before November (the 11th month). If so, the query returns 90; otherwise you get 100.
select if((month(XPIR_DT)>=5) and (month(XPIR_DT)<11),90,100) from date_table where id=1