Excel - Calculating the duration of time data spread across multiple rows

Time: 2021-05-22 21:31:34

I have a spreadsheet with a dataset of a number of transactions, each of which is composed of substeps, each of which has the time that it occurred. There can be a variable number and order of steps.

I'd like to find the duration of each transaction. If I can do this in Excel then great, as it's already in that format. If there isn't a straight-forward way to do this in Excel, I'll load it into a database and do the analysis with SQL. If there is an Excel way round this it'll save a few hours setup though :)

A simplified example of my data is as follows:

TransID, Substep, Time
1, step A, 15:00:00
1, step B, 15:01:00
1, step C, 15:02:00
2, step B, 15:03:00
2, step C, 15:04:00
2, step E, 15:05:00
2, step F, 15:06:00
3, step C, 15:07:00
3, step D, 15:08:00
etc.

I'd like to produce a result set as follows:

TransID, Duration
1, 00:02:00
2, 00:03:00
3, 00:01:00
etc.

My initial try was an extra column with a formula subtracting the start time from the end time, but because the number of steps varies and the start and end steps differ between transactions, I'm having difficulty seeing how this formula would work.

I've also tried creating a pivot table based on this data with ID as the rows and Time as the data. I can change the field settings on the time data to return grouped values such as count or max, but am struggling to see how this can be set up to show max(time) - min(time) for each ID, which is why I'm thinking about heading to SQL. If anyone can point out anything obvious I'm missing, though, I'd be very grateful.

As suggested by Hobbo, I've now used a pivot table with TransID as the rows and twice added Time as the data. After setting the field settings on the Time to Max on the first and Min on the second, a formula can be added just outside the pivot table to calculate the differences. One thing I'd been overlooking here is that the same value can be added to the data section more than once!

A follow-on problem was that the formula I add is of the form =GETPIVOTDATA("Max of Time",$A$4,"ID",1)-GETPIVOTDATA("Min of Time",$A$4,"ID",1), which doesn't then increment when copying and pasting. Solutions to this are either to use the pivot table toolbar to turn off GETPIVOTDATA formulae or, rather than clicking on the pivot table when selecting cells for the formula, to type the cell references instead (e.g. =H4-G4).

6 solutions

#1


1  

You were on the right lines with pivot tables. Drag in TransID as a row field then drag in two copies of Time as data fields in the pivot table; right click on each and specify Min as the summarization function for one and Max for the other. To the right of the pivot table add a formula to calculate the difference.
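
The pivot-table recipe boils down to taking the minimum and maximum Time per TransID and subtracting one from the other. As a rough sketch only, the same aggregation can be written in Python. The `rows` list and the `durations_by_id` function are illustrative names that are not part of the original answer, and the times are assumed to be `HH:MM:SS` strings that all fall on the same day.

```python
from datetime import datetime, timedelta

# Sample data from the question: (TransID, Substep, Time).
rows = [
    (1, "step A", "15:00:00"),
    (1, "step B", "15:01:00"),
    (1, "step C", "15:02:00"),
    (2, "step B", "15:03:00"),
    (2, "step C", "15:04:00"),
    (2, "step E", "15:05:00"),
    (2, "step F", "15:06:00"),
    (3, "step C", "15:07:00"),
    (3, "step D", "15:08:00"),
]


def durations_by_id(rows):
    """Return {TransID: max(Time) - min(Time)} (hypothetical helper)."""
    earliest = {}  # plays the role of the "Min of Time" column
    latest = {}    # plays the role of the "Max of Time" column
    for trans_id, _step, time_text in rows:
        t = datetime.strptime(time_text, "%H:%M:%S")
        if trans_id not in earliest or t < earliest[trans_id]:
            earliest[trans_id] = t
        if trans_id not in latest or t > latest[trans_id]:
            latest[trans_id] = t
    # Equivalent of the formula placed beside the pivot table: Max minus Min.
    return {tid: latest[tid] - earliest[tid] for tid in earliest}


for trans_id, duration in durations_by_id(rows).items():
    print(trans_id, duration)
```

The two dictionaries correspond to the two copies of Time dragged into the data area; the final subtraction is the formula added outside the pivot table.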

alt text http://img296.imageshack.us/img296/5866/pivottableey5.jpg

"Looks good, the only problem I have is that the formula I add is of the form =GETPIVOTDATA("Max of Time",$A$4,"ID",1)-GETPIVOTDATA("Min of Time",$A$4,"ID",1). When I copy that to the cells below, the 1 doesn't update to 2, 3 etc., so they all show the same time. – Kris Coverdale"

Use this button on the pivot table toolbar to switch GETPIVOTDATA formulae off.

alt text http://img117.imageshack.us/img117/9937/pivottabletoolbarjn3.jpg

#2


2  

In your formula =GETPIVOTDATA("Max of Time",$A$4,"ID",1)-GETPIVOTDATA("Min of Time",$A$4,"ID",1), the cell reference $A$4 is absolute: the $ symbols fix both its column and its row. When a reference carries the $ symbol and you copy the formula to another cell, the reference is not updated automatically. Hence you get the same time in every row.

Perhaps modify the formula as follows and then copy it to the other cells. The formula should look like:

=GETPIVOTDATA("Max of Time",A4,"ID",1)-GETPIVOTDATA("Min of Time",A4,"ID",1)

Note that the literal item value 1 is a constant, so it also stays the same when the formula is copied; to pick up each row's ID it would have to be replaced with a relative reference to the cell that holds that ID.

Thanks.

#3


1  

Maybe something as simple as a query like this.

SELECT TransID, DateDiff(mi, Min(Time), Max(Time)) AS Duration
FROM MyTable
GROUP BY TransID
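
To try the GROUP BY idea without setting up a database server, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the dialect the query above is written in: SQLite has no `DateDiff`, so the sketch subtracts `strftime('%s', ...)` values instead and reports the duration in seconds. The function name `durations_in_seconds`, the `sample` list, and the column name `MyTime` are assumptions made for illustration.

```python
import sqlite3


def durations_in_seconds(rows):
    """Load (TransID, Substep, MyTime) rows into SQLite and aggregate per TransID."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE MyTable (TransID INTEGER, Substep TEXT, MyTime TEXT)")
    con.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    # HH:MM:SS strings sort chronologically, so MIN/MAX on the text is safe here.
    # strftime('%s', ...) converts a time to seconds; the difference is the duration.
    query = """
        SELECT TransID,
               CAST(strftime('%s', MAX(MyTime)) AS INTEGER)
             - CAST(strftime('%s', MIN(MyTime)) AS INTEGER) AS Duration
        FROM MyTable
        GROUP BY TransID
        ORDER BY TransID
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Sample data from the question.
sample = [
    (1, "step A", "15:00:00"), (1, "step B", "15:01:00"), (1, "step C", "15:02:00"),
    (2, "step B", "15:03:00"), (2, "step C", "15:04:00"), (2, "step E", "15:05:00"),
    (2, "step F", "15:06:00"), (3, "step C", "15:07:00"), (3, "step D", "15:08:00"),
]
print(durations_in_seconds(sample))
```

For the sample data this yields 120, 180 and 60 seconds, matching the 00:02:00, 00:03:00 and 00:01:00 asked for in the question.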

#4


1  

In Excel:

  A     B        C
1 1, step A, 15:00:00
2 1, step B, 15:01:00
3 1, step C, 15:02:00
4 2, step B, 15:03:00
5 2, step C, 15:04:00
6 2, step E, 15:05:00
7 2, step F, 15:06:00
8 3, step C, 15:07:00
9 3, step D, 15:08:00

11 1, =MAX(IF($A$1:$A$9=$A11,$C$1:$C$9,""))-MIN(IF($A$1:$A$9=$A11,$C$1:$C$9,""))
12 2, =MAX(IF($A$1:$A$9=$A12,$C$1:$C$9,""))-MIN(IF($A$1:$A$9=$A12,$C$1:$C$9,""))

Note: these are array formulas, so press Ctrl+Shift+Enter after editing them.
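
For readers who find the array formula hard to parse, the following is a small Python sketch of the same logic: keep only the times whose ID matches, then take the largest minus the smallest. The names `ids`, `times` and `duration_for` are made up for this illustration, and the times are assumed to be same-day `HH:MM:SS` strings.

```python
from datetime import datetime, timedelta

ids = [1, 1, 1, 2, 2, 2, 2, 3, 3]  # column A
times = [                          # column C
    "15:00:00", "15:01:00", "15:02:00",
    "15:03:00", "15:04:00", "15:05:00", "15:06:00",
    "15:07:00", "15:08:00",
]


def duration_for(target_id, ids, times):
    """Mirror MAX(IF(A=id, C)) - MIN(IF(A=id, C)) for a single TransID."""
    # The IF(...) part: keep only the times on rows whose ID matches.
    matching = [
        datetime.strptime(t, "%H:%M:%S")
        for row_id, t in zip(ids, times)
        if row_id == target_id
    ]
    if not matching:
        return None  # no rows for this ID
    return max(matching) - min(matching)


print(duration_for(2, ids, times))
```

Each call scans the whole column, just as each copy of the array formula does, so this is convenient for a few IDs rather than efficient for many.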

#5


1  

To add to Kibbee's post, in reference to the comment, you can use ADO with Excel:

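'From: http://support.microsoft.com/kb/246335
'The Sub name below is arbitrary
Sub TransactionDurations()
    Dim cn As Object, rs As Object
    Dim strFile As String, strCon As String, strSQL As String

    'Treat this workbook as an ADO data source
    strFile = Workbooks(1).FullName
    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
        & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

    Set cn = CreateObject("ADODB.Connection")
    Set rs = CreateObject("ADODB.Recordset")

    cn.Open strCon

    'Minutes ('n') between the first and last step of each TransID
    strSQL = "SELECT TransID, DateDiff('n', Min([MyTime]),Max([MyTime])) AS Duration " _
             & "FROM [Sheet1$] GROUP BY TransID"

    rs.Open strSQL, cn

    'Write out to another sheet
    Worksheets(2).Cells(2, 1).CopyFromRecordset rs

    rs.Close
    cn.Close
End Sub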

EDIT: I have corrected some errors in the original post and changed the name of the time column to MyTime. Time is a reserved word in SQL and causes difficulties in queries. This now works on a very simple test.

#6


0  

Sometimes it is possible to do something once in Excel far more easily than it is to do something repeatably.

Assuming you are just trying to get the answer once or twice, and then throw away the spreadsheet (as opposed to run it every night, or give it to someone else to run), here's how I would do it.

I assume your raw data is in columns A, B and C, with headings in row 1, and data starting in row 2.

Sort the table by TransId as your primary key, and Time as your secondary, both ascending. (The following won't work if this isn't done.)

Add a new column, D, titled Duration, with a formula like this (Excel formulae don't have formatting or comments; I have added those to help explain, but they need to be stripped out):

=IF(A2=A3,           // if this row's TransId is the same as the next one
    "",              // leave this field blank
    C2-              // else find the difference between the last timestamp and...
     VLOOKUP(        // look for the first value
        A2,          // matching this TransId
        A:C,         // within the entire table,
        3,           // return the value in the third column - i.e. timestamp
        FALSE)       // exact match, so the first matching row is returned
    )

Now the data you want is in column D, but not in the format you want.

Select Columns A-D and copy them. Use Paste Special to copy the values only into a new worksheet.

Delete column B and column C in the new worksheet, so all that is left is TransID and Duration.

Sort by Duration, to bring all the rows with values next to each other.

Sort only the rows with values by TransId.

Voila, and there is your solution! Hope you don't need to repeat this!

p.s. This is untested
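
The sort-then-subtract idea in this answer can also be sketched in Python: order the rows by TransID and Time, then take the last timestamp minus the first within each run of equal IDs. This is only an illustration of the approach, not the answerer's method; `durations_from_sorted` and `unsorted_rows` are hypothetical names, and the times are assumed to be same-day `HH:MM:SS` strings.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter


def parse(time_text):
    """Convert an HH:MM:SS string to a datetime so it can be subtracted."""
    return datetime.strptime(time_text, "%H:%M:%S")


def durations_from_sorted(rows):
    """Sort by (TransID, Time), then return [(TransID, last - first), ...]."""
    # Step 1: TransID as the primary sort key, Time as the secondary, both ascending.
    ordered = sorted(rows, key=lambda r: (r[0], parse(r[2])))
    result = []
    # Step 2: within each block of equal TransIDs, the first row holds the
    # earliest time and the last row holds the latest one.
    for trans_id, group in groupby(ordered, key=itemgetter(0)):
        block = list(group)
        result.append((trans_id, parse(block[-1][2]) - parse(block[0][2])))
    return result


# Deliberately out of order to show that the sort step matters.
unsorted_rows = [
    (2, "step F", "15:06:00"), (1, "step C", "15:02:00"), (3, "step D", "15:08:00"),
    (1, "step A", "15:00:00"), (2, "step B", "15:03:00"), (3, "step C", "15:07:00"),
    (2, "step C", "15:04:00"), (1, "step B", "15:01:00"), (2, "step E", "15:05:00"),
]
for trans_id, duration in durations_from_sorted(unsorted_rows):
    print(trans_id, duration)
```

Unlike the manual copy, paste and re-sort steps, the grouping here produces one row per TransID directly, so nothing has to be cleaned up afterwards.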
