T-SQL: absence days by month from a start date and end date

Time: 2021-07-15 08:51:06

I have an interesting query to do and am trying to find the best way to do it. Basically, I have an absence table in our personnel database; it records the staff ID and then a start date and end date for the absence. The end date is null if not yet entered (staff not yet returned). I cannot change the design.

They would like a report by month on the number of absences (12 month trend). With staff being off across a month boundary, it may obviously be difficult to calculate.

e.g. for staff off from 25/11/08 to 05/12/08 (dd/MM/yy), I would want the days in November to go into the November count and the ones in December into the December count.

I am currently thinking that, in order to count the number of days, I need to split the start and end date into a record for each day of the absence, assigning each to the month it is in, and then group the data for reporting. As for the ones without an end date, I would assume null is the current date, as they are presently still absent.

What would be the best way to do this?

Any better ways?

Edit: This is currently SQL Server 2000. Hoping for an upgrade soon.

1 solution

#1


I have had a similar issue where there has been a table of start/end dates designed for data storage but not for reporting.

I sought out the "fastest executing" solution and found that it was to create a 2nd table with the monthly values in there. I populated it with the months from Jan 2000 to Jan 2070. I'm expecting it will suffice or that I get a large pay cheque in 2070 to come and update it...

CREATE TABLE months (start DATETIME NOT NULL PRIMARY KEY)
-- Populate with every month start date that may ever be needed
-- The primary key on start also gives the join an index to use
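
One way to populate the table is a simple loop, sketched below. It assumes the months table above already exists with a single DATETIME column named start, and it only uses syntax available on SQL Server 2000 (no inline variable initialisation, one row per INSERT). The variable name @m is made up for this illustration.

```sql
-- Sketch only: fills months.start with the first day of every month
-- from Jan 2000 to Jan 2070 inclusive (841 rows).
DECLARE @m DATETIME
SET @m = '20000101'   -- yyyymmdd literals are not affected by DATEFORMAT

WHILE @m <= '20700101'
BEGIN
    INSERT INTO months (start) VALUES (@m)
    SET @m = DATEADD(month, 1, @m)   -- step to the next month start
END
```

This is a one-off load, so the row-by-row loop should not matter for performance; the reporting query only ever reads the table.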

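SELECT
    months.start,
    data.id,
    -- Days of overlap between the absence and the month.
    -- The absence is clamped to the month on both sides, so absences that
    -- sit inside one month or span several months are counted correctly.
    -- The end date is treated as exclusive; a NULL end means "still absent"
    -- and is replaced by the current date, as suggested in the question.
    SUM(DATEDIFF(DAY,
            CASE WHEN data.start > months.start
                 THEN data.start
                 ELSE months.start END,
            CASE WHEN COALESCE(data.[end], GETDATE()) < DATEADD(month, 1, months.start)
                 THEN COALESCE(data.[end], GETDATE())
                 ELSE DATEADD(month, 1, months.start) END)) AS days
FROM
    data
INNER JOIN
    months
        -- [end] is bracketed because END is a reserved word in T-SQL
        ON data.start < DATEADD(month, 1, months.start)
        AND COALESCE(data.[end], GETDATE()) > months.start
GROUP BY
    months.start,
    data.id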
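
To check the idea against the example from the question (25/11/08 to 05/12/08), here is a small self-contained sketch. The temp tables #months and #data and the column alias end_excl are made up for illustration. It also assumes the stored end date is the last day of the absence (inclusive), so one day is added to turn it into an exclusive upper bound; drop the DATEADD(day, 1, ...) if your end dates are already exclusive.

```sql
-- Hypothetical temp tables, not from the original post.
CREATE TABLE #months (start DATETIME NOT NULL PRIMARY KEY)
INSERT INTO #months (start) VALUES ('20081101')
INSERT INTO #months (start) VALUES ('20081201')

CREATE TABLE #data (id INT NOT NULL, start DATETIME NOT NULL, [end] DATETIME NULL)
INSERT INTO #data (id, start, [end]) VALUES (1, '20081125', '20081205')

SELECT
    m.start AS month_start,
    d.id,
    -- Overlap in days between the absence and the month
    SUM(DATEDIFF(DAY,
            CASE WHEN d.start > m.start THEN d.start ELSE m.start END,
            CASE WHEN d.end_excl < DATEADD(month, 1, m.start)
                 THEN d.end_excl
                 ELSE DATEADD(month, 1, m.start) END)) AS days
FROM
    (SELECT id, start,
            -- Inclusive end date -> exclusive bound; NULL means still absent
            DATEADD(day, 1, COALESCE([end], GETDATE())) AS end_excl
     FROM #data) AS d
INNER JOIN
    #months AS m
        ON d.start < DATEADD(month, 1, m.start)
        AND d.end_excl > m.start
GROUP BY
    m.start,
    d.id
ORDER BY
    m.start

DROP TABLE #data
DROP TABLE #months
```

With the inclusive-end assumption, this returns 6 days for November (25th to 30th) and 5 days for December (1st to 5th).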

That join can be quite slow for various reasons. I'll search out another answer to another question to show why and how to optimise the join.

EDIT:

Here is another answer relating to overlapping date ranges and how to speed up the joins...

Query max number of simultaneous events
