Excel VBA,使用数组加速代码

时间:2021-09-29 21:23:49

thanks in advance for any help on this, I have a big spreadsheet I need to parse into other spreadsheets, and I have something working, albeit slowly. I've read that using arrays is a better approach, but I can't seem to get it working, I think I can pull the main spreadsheet into an array, but I can't operate on it like I want. Specifically, I can't grab certain rows from the main array and insert them into another array to copy into a target sheet at the end. Here are the original, working functions:

感谢您在这方面的帮助,我有一个大的电子表格,我需要解析到其他的电子表格,我有一些工作,尽管很慢。我读到过使用数组是一种更好的方法,但是我似乎不能让它工作,我认为我可以将主电子表格拖放到一个数组中,但是我不能像我想的那样操作它。具体地说,我不能从主数组中获取某些行并将它们插入到另一个数组中,以便在末尾复制到目标表中。这是原来的工作功能:

Private Function CopyValues(rngSource As Range, rngTarget As Range)

    rngTarget.Resize(rngSource.Rows.Count, rngSource.Columns.Count).Value = rngSource.Value

End Function

Private Function RESORT(FROMSHEET As Variant, Column As Variant, TOSHEET As Variant, EXTRA1 As Variant, EXTRA2 As Variant, EXTRA3 As Variant)
Set i = Sheets(FROMSHEET)
Set e = Sheets(TOSHEET)


Dim d
Dim j
Dim q
d = 1
j = 2


e.Select
Cells.Select
Selection.Clear
i.Select
Rows(1).Copy
e.Select
Rows(1).PasteSpecial

Do Until IsEmpty(i.Range("G" & j))
    If i.Range(Column & j) = "Total" Then
        i.Select
        Rows(j).Copy
        e.Select
        Rows(2).PasteSpecial
        ' CopyValues i.Rows(j), e.Rows(2)
        Exit Do
    End If
    j = j + 1
Loop

d = 2
j = 2

Do Until IsEmpty(i.Range("G" & j))

    If i.Range(Column & j) = TOSHEET Or i.Range(Column & j) = EXTRA1 Or i.Range(Column & j) = EXTRA2 Or i.Range(Column & j) = EXTRA3 Then
        d = d + 1
        CopyValues i.Range(i.Cells(j, 1), i.Cells(j, 11)), e.Range(e.Cells(d, 1), e.Cells(d, 11)) 'e.Range("A" & d)

    ElseIf i.Range("A" & j) = e.Range("A" & d) And i.Range("I" & j) = "Total" Then
        d = d + 1
        e.Select
        Rows(2).Copy
        Rows(d).PasteSpecial
        ' CopyValues e.Rows(2), e.Rows(d)
    End If
    j = j + 1
Loop
e.Select
Rows(2).Delete
Range("A1").Select

End Function

So, I have two questions. First, am I correct that moving to arrays will speed this up? Second, how do I do the array stuff? Thanks! Here's sort of what I'm hacking on, many different attempts in there, I realize it's ugly:

我有两个问题。首先,移动到数组会加快这个速度吗?第二,我怎么做数组呢?谢谢!这是我在做的,很多不同的尝试,我意识到它很丑:

Private Function RESORT2(FROMSHEET As Variant, Column As Variant, TOSHEET As Variant, EXTRA1 As Variant, EXTRA2 As Variant, EXTRA3 As Variant)
' Set i = Sheets(FROMSHEET)
' Set e = Sheets(TOSHEET)
Dim d
Dim j As Long
Dim i As Long
Dim k As Long

Dim myarray As Variant
Dim arrTO As Variant


d = 1

j = 1

      'myarray = Worksheets(FROMSHEET).Range("a1").Resize(10, 20)
      myarray = Worksheets(FROMSHEET).Range("a1:z220").Value 'Resize(10, 20)
      For i = 1 To UBound(myarray)
        If myarray(i, 9) = TOSHEET Then
        'arrTO = myarray
          '  Worksheets(TOSHEET).Range("A" & j).Resize(1, 20) = Application.WorksheetFunction.Transpose(myarray(i))
            Worksheets(TOSHEET).Range("A" & j).Value = Application.WorksheetFunction.Transpose(myarray)
         '   arrTO = j 'Application.WorksheetFunction.Index(myarray, 0, 1)

            j = j + 1

        End If

      Next
      Worksheets(TOSHEET).Range("a1").Resize(10, 20) = arrTO


End Function

===================================

First Edit

第一次编辑

OK, i tried cleaning up and a the following:

好吧,我试着清理一下,下面是:

Private Function RESORT(FROMSHEET As Variant, Column As Variant, TOSHEET As Variant, EXTRA1 As Variant, EXTRA2 As Variant, EXTRA3 As Variant)
Set FRO = Sheets(FROMSHEET)
Set TOO = Sheets(TOSHEET)

Dim TOO_IND
Dim FRO_IND
Dim TotalRow

TotalRow = 2
TOO_IND = 2
FRO_IND = 2

TOO.Cells.Clear
TOO.Rows(1).Value = FRO.Rows(1).Value

Do Until IsEmpty(FRO.Range("G" & TotalRow))
    If FRO.Range(Column & TotalRow) = "Total" Then
        FRO.Select
        Rows(TotalRow).Copy
        TOO.Select
        Rows(2).PasteSpecial
        ' CopyValues FRO.Rows(j), TOO.Rows(2)
        Exit Do
    End If
    TotalRow = TotalRow + 1
Loop

Do Until IsEmpty(FRO.Range("G" & FRO_IND))

    If FRO.Range(Column & FRO_IND) = TOSHEET Or FRO.Range(Column & FRO_IND) = EXTRA1 Or FRO.Range(Column & FRO_IND) = EXTRA2 Or FRO.Range(Column & FRO_IND) = EXTRA3 Then
        TOO_IND = TOO_IND + 1
        TOO.Rows(TOO_IND).Value = FRO.Rows(FRO_IND).Value
    ElseIf FRO.Range("A" & FRO_IND) = TOO.Range("A" & TOO_IND) And FRO.Range("I" & FRO_IND) = "Total" Then
        TOO_IND = TOO_IND + 1
        TOO.Select
        Rows(2).Copy
        Rows(TOO_IND).PasteSpecial
     '   TOO.Rows(TOO_IND).PasteSpecial = FRO.Rows(2).PasteSpecial  ' this isn't working, I need format and formula, if I just do .formula it doesn't work
    End If
    FRO_IND = FRO_IND + 1
Loop

TOO.Rows(2).Delete
'Range("A1").Select

End Function

So, while it looks cleaner and is more readable, it's actually slower (3.2s vs. 2.86s on my smallest sample set).

所以,虽然它看起来更干净,可读性更好,但实际上它更慢(在我最小的样本集中是3.2秒和2.86秒)。

I think the array is going to be the solution. I run this routine multiple times on the same sample set, but with different qualifiers, if in the main I dump the sample set into an array, then pass this array to this sort routine, I think it'll be faster. But I'm still not sure how to do my operations on arrays, specifically copying one row from array to array.

我认为这个数组就是解。我在同一个示例集上多次运行这个例程,但是使用不同的限定符,如果在主程序中我将示例集转储到一个数组中,然后将这个数组传递给这个排序例程,我认为它会更快。但是我仍然不确定如何对数组进行操作,特别是从数组到数组复制一行。

Thanks everyone, I'm going to keep at it!

谢谢大家,我会继续努力的!

==============================================================

Second Edit Ok, I'm much closer now! What once took ~133seconds, now only takes 10.51seconds!

第二次编辑好,我现在离你更近了!曾经花了13秒的时间,现在只需要10秒51秒!

Here's the latest, please let me know if there are ways to tweak this, I'm still trying to trim up some time. I have not yet coded anything to grab the array once and then pass the array to the RESORT function, I'm looking into that next to see if that will help speed things up.

这是最新的,请让我知道如果有办法调整,我仍在努力削减一些时间。我还没有编写任何代码来获取数组,然后将数组传递给RESORT函数,接下来我将研究它,看看这是否有助于加快速度。

Is there a way to copy the formula and the value into the same array? I don't like the way I do it, but it does work.

是否有办法将公式和值复制到同一个数组中?我不喜欢我做这件事的方式,但它确实有用。

Private Function RESORT(FROMSHEET As Variant, Column As Variant, TOSHEET As Variant, EXTRA1 As Variant, EXTRA2 As Variant, EXTRA3 As Variant)
Set FRO = Sheets(FROMSHEET)
Set TOO = Sheets(TOSHEET)

Dim TotalRow

TotalRow = 2
TOO_IND = 2
FRO_IND = 2

Dim Col As Long
Dim FROM_Row As Long
Dim TO_Row As Long

Const NumCol = 25

Dim myarray As Variant
Dim myarrayform As Variant
Dim arrTO(1 To 1000, 1 To 2000)
Dim arrTotal(1 To 1, 1 To NumCol)

TO_Row = 2
myarray = Worksheets(FROMSHEET).Range("a1:z1000").Value
myarrayform = Worksheets(FROMSHEET).Range("a1:z1000").FormulaR1C1

TOO.Cells.Clear

For Col = 1 To NumCol
    arrTO(1, Col) = myarray(1, Col)
Next

For FROM_Row = 1 To UBound(myarray)
    If myarray(FROM_Row, Column) = "Total" Then
        For Col = 1 To NumCol
            arrTotal(1, Col) = myarrayform(FROM_Row, Col)
        Next
        Exit For
    End If
Next

For FROM_Row = 1 To UBound(myarray)
    If myarray(FROM_Row, Column) = TOSHEET Or myarray(FROM_Row, Column) = EXTRA1 Or myarray(FROM_Row, Column) = EXTRA2 Or myarray(FROM_Row, Column) = EXTRA3 Then
        For Col = 1 To NumCol
            arrTO(TO_Row, Col) = myarray(FROM_Row, Col)
        Next
        TO_Row = TO_Row + 1
    ElseIf myarray(FROM_Row, 1) = arrTO(TO_Row - 1, 1) And myarray(FROM_Row, Column) = "Total" Then
        For Col = 1 To NumCol
            arrTO(TO_Row, Col) = arrTotal(1, Col)
        Next
        TO_Row = TO_Row + 1
    End If
Next
Worksheets(TOSHEET).Range("a1").Resize(1000, 2000) = arrTO

End Function

Thanks for any and all help! Happy New Year!

谢谢大家的帮助!新年快乐!

3 个解决方案

#1


4  

Iterating over arrays in VBA will not necessarily be faster than iterating over the collection objects that your first method uses. The collections are likely implemented as linked lists, so for the purpose of starting at the beginning and looping over them, they will be equally as speedy as arrays.

在VBA中迭代数组并不一定比在第一个方法使用的集合对象上迭代要快。集合很可能是作为链表实现的,因此为了从一开始就开始并在其上循环,它们将与数组一样快速。

The high-level answer is that your sort algorithm will usually be vastly more important than your specific code details. That is, as long as your details don't somehow increase the complexity of running that algorithm.

高层次的回答是,排序算法通常比特定的代码细节重要得多。也就是说,只要您的细节不会增加运行该算法的复杂性。

In my experience, the best way to speed up VBA is to eschew all functions that have an effect on the UI. If your code moves around the selected cell, or switches the actively viewed sheet, etc, that is the biggest timesink. I think those functions Select, Copy(), and PasteSpecial() might be guilty of that. It would be better to store worksheet and range objects, and write directly to their cells as required. You do that in your 2nd method, and I think it is much more important than changing your data type.

根据我的经验,加快VBA的最好方法是避开所有对UI有影响的功能。如果您的代码在选定的单元格周围移动,或者切换活动查看表,等等,这是最大的时间墨水。我认为这些函数Select、Copy()和pastspecial()可能会为此感到内疚。最好是存储工作表和范围对象,并根据需要直接写入它们的单元格。在第二种方法中,我认为这比改变数据类型重要得多。

#2


1  

I agree with @Seth Battin, but have a few additional things to add.

我同意@Seth Battin,但还有一些其他的事情要补充。

While arrays can be faster, if you need to search them they do not scale well. The code you have written will iterate through your dataset n times (where n is the number of TOSHEETs you have). Also your code is writing data to the worksheet once for each row (which is time consuming), It is faster (but more code) to put all the data into a single 2D array and write once.

虽然数组可以更快,但如果需要搜索它们,它们就不能很好地伸缩。您所编写的代码将遍历您的数据集n次(其中n是您拥有的TOSHEETs的数量)。同样,您的代码为每一行(这很耗时)向工作表写入数据一次,将所有数据放入一个2D数组并编写一次,速度更快(但代码更多)。

A better program flow might be

更好的程序流可能是

Read each line of data

读取每一行数据。

Assign it to a data structure (I would use a scripting dictionary containing 2D arrays)

将它分配给数据结构(我将使用包含2D数组的脚本字典)

After all the data is read iterate the scripting dictionary outputting each 2D array

在读取所有数据之后,迭代脚本字典输出每个2D数组

This will minimize both reads and writes to the spreadsheet which is where the preformance bottlenecks are for this type of vba program.

这将最小化对电子表格的读写,电子表格是这类vba程序的性能瓶颈所在。

#3


0  

Yes. You would definitely speed up your code by using arrays instead of collections of cells. This is because accessing the properties of the objects takes time.

是的。您肯定会使用数组而不是单元格集合来加速代码。这是因为访问对象的属性需要时间。

Honestly though, your code would likely not benefit very much from using arrays as your code is more reasonably modified by eliminating unnecessary loops.

老实说,您的代码很可能不会从使用数组中获益,因为通过消除不必要的循环,您的代码得到了更合理的修改。

I've re-written the beginning of your RESORT function in a more Excel centric way avoiding some of the pitfalls like selects. I'd also suggest trying to use variable names that are meaningful, especially for objects.

我以一种更以Excel为中心的方式重写了RESORT函数的开头,避免了一些陷阱,如选择。我还建议尝试使用有意义的变量名,特别是对于对象。

OPTION EXPLICIT
Private Function RESORT(FROMSHEET As Variant, Column As Variant, TOSHEET As Variant, EXTRA1 As Variant, EXTRA2 As Variant, EXTRA3 As Variant)
'Actually indicate variable types.
dim i as worksheet, dim e as worksheet
dim searchRange as Range

Set i = Sheets(FROMSHEET)
Set e = Sheets(TOSHEET)


Dim d as long
Dim j as long
dim lastRow as long 'Using a meaningful variable name
d = 1
j = 2

'I'm assuming you were using PasteSpecial because you only want values.
'I removed your unnecessary selects
e.Cells.Clear
'Move values directly instead of copy paste
i.Rows(1).value = e.Rows(1).value

'Check the first range
If Not IsEmpty(.Range("G" & j)) Then
    'Determine the last row to check.
    'This would break if j is equivalent to the last possible row... 
    'but only an example
    If IsEmpty(.Range("G" & j+1) then
        lastRow = j
    else 
        lastrow = i.Range("G" & j).End(xlDown).Row
    end if
    'Get the search Range
    'We might have used arrays here but it's less complicated to 
    ' use built in functions.
    Set searchRange = i.Range(i.Range(Column & j), _
                      i.Range(Column, lastrow).Find("Total"))
    If Not (searchRange Is Nothing) Then
        'Copy the values of the found row.
        e.Rows(2).value = searchRange.EntireRow.value
    End If
End If

After doing that I realize that the part that might more reasonably use arrays is after where I stopped. If you want to use arrays here, what you need to do is effectively copy all of the relevant area to an array and then reference the array the same way that you would reference cells.

在这样做之后,我意识到可能更合理地使用数组的部分是在我停止的地方。如果你想在这里使用数组,你需要做的就是将所有相关的区域复制到一个数组中,然后像引用单元格一样引用这个数组。

For Example:

例如:

myArray = i.Range("A1:B10")
MsgBox myArray(10, 2) 'Displays value of B10 (10th row, 2nd column)
MsgBox i.Cells(10, 2) 'Displays value of B10 (10th row, 2nd column)

#1


4  

Iterating over arrays in VBA will not necessarily be faster than iterating over the collection objects that your first method uses. The collections are likely implemented as linked lists, so for the purpose of starting at the beginning and looping over them, they will be equally as speedy as arrays.

在VBA中迭代数组并不一定比在第一个方法使用的集合对象上迭代要快。集合很可能是作为链表实现的,因此为了从一开始就开始并在其上循环,它们将与数组一样快速。

The high-level answer is that your sort algorithm will usually be vastly more important than your specific code details. That is, as long as your details don't somehow increase the complexity of running that algorithm.

高层次的回答是,排序算法通常比特定的代码细节重要得多。也就是说,只要您的细节不会增加运行该算法的复杂性。

In my experience, the best way to speed up VBA is to eschew all functions that have an effect on the UI. If your code moves around the selected cell, or switches the actively viewed sheet, etc, that is the biggest timesink. I think those functions Select, Copy(), and PasteSpecial() might be guilty of that. It would be better to store worksheet and range objects, and write directly to their cells as required. You do that in your 2nd method, and I think it is much more important than changing your data type.

根据我的经验,加快VBA的最好方法是避开所有对UI有影响的功能。如果您的代码在选定的单元格周围移动,或者切换活动查看表,等等,这是最大的时间墨水。我认为这些函数Select、Copy()和pastspecial()可能会为此感到内疚。最好是存储工作表和范围对象,并根据需要直接写入它们的单元格。在第二种方法中,我认为这比改变数据类型重要得多。

#2


1  

I agree with @Seth Battin, but have a few additional things to add.

我同意@Seth Battin,但还有一些其他的事情要补充。

While arrays can be faster, if you need to search them they do not scale well. The code you have written will iterate through your dataset n times (where n is the number of TOSHEETs you have). Also your code is writing data to the worksheet once for each row (which is time consuming), It is faster (but more code) to put all the data into a single 2D array and write once.

虽然数组可以更快,但如果需要搜索它们,它们就不能很好地伸缩。您所编写的代码将遍历您的数据集n次(其中n是您拥有的TOSHEETs的数量)。同样,您的代码为每一行(这很耗时)向工作表写入数据一次,将所有数据放入一个2D数组并编写一次,速度更快(但代码更多)。

A better program flow might be

更好的程序流可能是

Read each line of data

读取每一行数据。

Assign it to a data structure (I would use a scripting dictionary containing 2D arrays)

将它分配给数据结构(我将使用包含2D数组的脚本字典)

After all the data is read iterate the scripting dictionary outputting each 2D array

在读取所有数据之后,迭代脚本字典输出每个2D数组

This will minimize both reads and writes to the spreadsheet which is where the preformance bottlenecks are for this type of vba program.

这将最小化对电子表格的读写,电子表格是这类vba程序的性能瓶颈所在。

#3


0  

Yes. You would definitely speed up your code by using arrays instead of collections of cells. This is because accessing the properties of the objects takes time.

是的。您肯定会使用数组而不是单元格集合来加速代码。这是因为访问对象的属性需要时间。

Honestly though, your code would likely not benefit very much from using arrays as your code is more reasonably modified by eliminating unnecessary loops.

老实说,您的代码很可能不会从使用数组中获益,因为通过消除不必要的循环,您的代码得到了更合理的修改。

I've re-written the beginning of your RESORT function in a more Excel centric way avoiding some of the pitfalls like selects. I'd also suggest trying to use variable names that are meaningful, especially for objects.

我以一种更以Excel为中心的方式重写了RESORT函数的开头,避免了一些陷阱,如选择。我还建议尝试使用有意义的变量名,特别是对于对象。

OPTION EXPLICIT
Private Function RESORT(FROMSHEET As Variant, Column As Variant, TOSHEET As Variant, EXTRA1 As Variant, EXTRA2 As Variant, EXTRA3 As Variant)
'Actually indicate variable types.
dim i as worksheet, dim e as worksheet
dim searchRange as Range

Set i = Sheets(FROMSHEET)
Set e = Sheets(TOSHEET)


Dim d as long
Dim j as long
dim lastRow as long 'Using a meaningful variable name
d = 1
j = 2

'I'm assuming you were using PasteSpecial because you only want values.
'I removed your unnecessary selects
e.Cells.Clear
'Move values directly instead of copy paste
i.Rows(1).value = e.Rows(1).value

'Check the first range
If Not IsEmpty(.Range("G" & j)) Then
    'Determine the last row to check.
    'This would break if j is equivalent to the last possible row... 
    'but only an example
    If IsEmpty(.Range("G" & j+1) then
        lastRow = j
    else 
        lastrow = i.Range("G" & j).End(xlDown).Row
    end if
    'Get the search Range
    'We might have used arrays here but it's less complicated to 
    ' use built in functions.
    Set searchRange = i.Range(i.Range(Column & j), _
                      i.Range(Column, lastrow).Find("Total"))
    If Not (searchRange Is Nothing) Then
        'Copy the values of the found row.
        e.Rows(2).value = searchRange.EntireRow.value
    End If
End If

After doing that I realize that the part that might more reasonably use arrays is after where I stopped. If you want to use arrays here, what you need to do is effectively copy all of the relevant area to an array and then reference the array the same way that you would reference cells.

在这样做之后,我意识到可能更合理地使用数组的部分是在我停止的地方。如果你想在这里使用数组,你需要做的就是将所有相关的区域复制到一个数组中,然后像引用单元格一样引用这个数组。

For Example:

例如:

myArray = i.Range("A1:B10")
MsgBox myArray(10, 2) 'Displays value of B10 (10th row, 2nd column)
MsgBox i.Cells(10, 2) 'Displays value of B10 (10th row, 2nd column)