How do I do a regex search and replace in an Excel VBA macro?

Time: 2021-01-10 16:49:44

I'd like to create a VBA macro that replaces every cell in a worksheet whose text matches a time format (given as a regular expression):

(1[0-2]|[1-9]):[0-5][0-9]:[0-5][0-9] [AP]M

with the cell address and worksheet name. I think the call will be akin to:

Cells.Replace What:="1:23:45 AM", _
    Replacement:="=cell(""filename"")&cell(""Address"")", _
    LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:= _
    False, ReplaceFormat:=False

But I'm hoping I can make the "What:=" argument a regex, or at least restrict it to a time format.

How would I go about this?



Test Data: Save the following in CSV format:


00:00,04:27,00:36,04:31,00:00
00:00,00:00,04:18,01:07,10:06
00:00,00:00,00:00,00:00,00:00

Eventually the macro will delete all the zero times, and replace the other times with static text that is the evaluated formula =cell("filename")&"!"&cell("address")



Result of acting on the above input file (I would be saving the sheets as XLSX):


     [    A    ]   [     B     ]  [     C     ]  [     D     ]  [     E     ]
[1]                'Sheet1!$B$1   'Sheet1!$C$1   'Sheet1!$D$1
[2]                               'Sheet1!$C$2   'Sheet1!$D$2   'Sheet1!$E$2
[3]

For brevity, I stripped out the directory and file name that the =cell("filename") function returns, although the above is what I really would like.


3 solutions

#1


5  

I've updated my code formerly hosted here to


  1. Remove any text fields in a user selected range that are '00:00
  2. Replace any text "time fields" with the full path

(nb: In the end the regex is overkill, as a cell test for a value between 0.0 and 1.0 would suffice given the actual data format)

    'Press Alt + F11 to open the Visual Basic Editor (VBE)
    'From the Menu, choose Insert-Module.
    'Paste the code into the right-hand code window.
    'Press Alt + F11 to close the VBE
    'In Xl2003 Goto Tools … Macro … Macros and double-click KillTime  


    Sub KillTime()
    Dim rng1 As Range
    Dim rngArea As Range
    Dim lngRow As Long
    Dim lngCol As Long
    Dim lngCalc As Long
    Dim objReg As Object
    Dim strSht As String
    Dim X()

    On Error Resume Next
    Set rng1 = Application.InputBox("Select range for the replacement of leading zeros", "User select", Selection.Address, , , , , 8)
    If rng1 Is Nothing Then Exit Sub
    On Error GoTo 0

    strSht = ActiveWorkbook.Path & "\[" & ActiveWorkbook.Name & "]" & rng1.Parent.Name & "!"
    'remove '00:00
    rng1.Replace "00:00", vbNullString, xlWhole

    'See Patrick Matthews excellent article on using Regular Expressions with VBA
    Set objReg = CreateObject("vbscript.regexp")
    objReg.Pattern = "^0\.\d+$"    
     'Speed up the code by turning off screenupdating and setting calculation to manual
      'Disable any code events that may occur when writing to cells
    With Application
        lngCalc = .Calculation
        .ScreenUpdating = False
        .Calculation = xlCalculationManual
        .EnableEvents = False
    End With

    'Test each area in the user selected range

    'Non contiguous range areas are common when using SpecialCells to define specific cell types to work on
    For Each rngArea In rng1.Areas
        'The most common outcome is used for the True outcome to optimise code speed
        If rngArea.Cells.Count > 1 Then
           'If there is more than one cell then set the variant array to the dimensions of the range area
           'Using Value2 provides a useful speed improvement over Value. On my testing it was 2% on blank cells, up to 10% on non-blanks
            X = rngArea.Value2
            For lngRow = 1 To rngArea.Rows.Count
                For lngCol = 1 To rngArea.Columns.Count
                   If objReg.test(X(lngRow, lngCol)) Then X(lngRow, lngCol) = strSht & rngArea.Cells(1).Offset(lngRow - 1, lngCol - 1).Address(0, 0)
                Next lngCol
            Next lngRow
            'Dump the updated array back over the initial range
            rngArea.Value2 = X
        Else
            'caters for a single cell range area. No variant array required
               If objReg.test(rngArea.Value) Then rngArea.Value = strSht & rngArea.Address(0, 0)            
        End If
    Next rngArea

    'cleanup the Application settings
    With Application
        .ScreenUpdating = True
        .Calculation = lngCalc
        .EnableEvents = True
    End With

    Set objReg = Nothing
    End Sub
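
The note above says a plain numeric test would do instead of the regex. A minimal sketch of that idea, assuming the times are stored as Excel serial fractions (as they are once the CSV is opened) and that midnight should be excluded. The helper name IsTimeFraction is hypothetical and not part of the original answer:

```vba
' Hypothetical helper: True when v is a Double strictly between 0 and 1,
' i.e. an Excel time-of-day serial other than midnight (00:00).
' Strings, errors, blanks and other types return False.
Function IsTimeFraction(ByVal v As Variant) As Boolean
    If VarType(v) = vbDouble Then
        IsTimeFraction = (v > 0 And v < 1)
    End If
End Function
```

Inside KillTime it could stand in for objReg.test on the Value2 entries, for example If IsTimeFraction(X(lngRow, lngCol)) Then ...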

#2


5  

Since you are trying to replace a format, I'd do a replace based on a format. With a regex it seems like you'd be forced to deal with the underlying number.


I tested this in XL 2003 and 2010:


Sub ReplaceByFormat()
With ActiveSheet.Cells
    .Replace What:="", Replacement:="=cell(""filename"")&cell(""Address"")", _
             SearchFormat:=True, _
             ReplaceFormat:=False, _
             LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
    Application.FindFormat.NumberFormat = "h:mm AM/PM"
End With
End Sub

EDIT


First off I had a mistake above in placing the FindFormat at the end. It needs to be at the beginning (Doh).


There is no OR argument to the Replace function, so below I've just repeated the code for a second type of format.

This code assumes that the dates are all constants. If they are formulas, you could fix that with a find and replace in the VBA (for example, swapping xlCellTypeConstants for xlCellTypeFormulas). If they're a mix, you'll need to extend the code a bit:

Sub ReplaceByFormat()

With ActiveSheet.Cells.SpecialCells(xlCellTypeConstants)
    Application.FindFormat.NumberFormat = "h:mm AM/PM"
    .Replace What:="", Replacement:="=cell(""filename"")&cell(""Address"")", _
             SearchFormat:=True, _
             ReplaceFormat:=False, _
             LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
    Application.FindFormat.NumberFormat = "m/d/yyyy"
    .Replace What:="", Replacement:="=cell(""filename"")&cell(""Address"")", _
             SearchFormat:=True, _
             ReplaceFormat:=False, _
             LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
End With
End Sub
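
The question asks for static text rather than a live formula, and the code above covers constants only. One way to extend it to a mix of constants and formulas is to loop over the used range and compare each cell's NumberFormat directly. This is a minimal sketch, not the answer's own method. StaticCellTag and TagCellsByFormat are hypothetical names, and the tag layout simply mirrors the path\[workbook]sheet!address shape shown in the question:

```vba
' Sketch only: StaticCellTag and TagCellsByFormat are hypothetical names.

' Builds the static text for one cell, e.g. C:\data\[times.xlsx]Sheet1!$B$1
Function StaticCellTag(ByVal bookPath As String, ByVal bookName As String, _
                       ByVal sheetName As String, ByVal addr As String) As String
    StaticCellTag = bookPath & "\[" & bookName & "]" & sheetName & "!" & addr
End Function

' Overwrites every non-empty cell on ws whose number format equals fmt
' (constant or formula alike) with the static tag for that cell.
Sub TagCellsByFormat(ByVal ws As Worksheet, ByVal fmt As String)
    Dim c As Range
    For Each c In ws.UsedRange.Cells
        If Not IsEmpty(c.Value2) Then
            ' NumberFormat is compared as text, so fmt must match exactly
            If c.NumberFormat = fmt Then
                c.NumberFormat = "@"    ' store the tag as plain text
                c.Value = StaticCellTag(ws.Parent.Path, ws.Parent.Name, _
                                        ws.Name, c.Address)
            End If
        End If
    Next c
End Sub
```

It would be called once per format, for example TagCellsByFormat ActiveSheet, "h:mm AM/PM" and then again with "m/d/yyyy".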

#3


0  

Firstly, a better pattern would be:


  /[0-2]?[0-9]:[0-5][0-9]:[0-5][0-9] [A|P]M/

Secondly, the actual code, without anything specific to your problem, would be as simple as this (adapting to your scenario, of course):

Set RegExp = CreateObject("VBScript.RegExp")
RegExp.Pattern = "[0-2]?[0-9]:[0-5][0-9]:[0-5][0-9] [A|P]M"
For i = ......
  'Render the value as text using the cell's own number format, then test that text
  Expr = Format(ActiveSheet.Cells(i, 1).Value, ActiveSheet.Cells(i, 1).NumberFormat)
  If RegExp.Test(Expr) Then Replace....
Next i

This way VBA will treat the values in the cells as they appear.


EDIT


About the pattern also matching "29:00:00 |M": inside a character class, "|" is a literal character rather than "OR", so [A|P] accepts "A", "|" or "P" (the same happens if you use ","). The hour got through because [0-2]?[0-9] accepts anything from 0 to 29. Anyway, a better one surely is:

/^(([0-1]?[0-9])|(2[0-4])):[0-5][0-9]:[0-5][0-9] [AP]M$/
  • Allows 0-24 hours only
  • "^" and "$" make sure the cell contains only a time format value, denoting the beginning and end of the string
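
An anchored test like this can be wrapped in a small function. The following is a sketch rather than the answer's own code. The name IsTimeText is hypothetical, and it assumes the stricter 12-hour pattern from the question (hours 1 to 12, since the value ends in AM or PM) with a plain [AP] character class:

```vba
' Hypothetical helper: True when s is exactly a 12-hour time such as "1:23:45 AM".
' VBScript.RegExp is created late-bound, so no library reference is required.
Function IsTimeText(ByVal s As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' ^ and $ anchor the whole string; [AP] matches only "A" or "P"
    re.Pattern = "^(1[0-2]|0?[1-9]):[0-5][0-9]:[0-5][0-9] [AP]M$"
    re.IgnoreCase = False
    IsTimeText = re.Test(s)
End Function
```

With this pattern "1:23:45 AM" and "12:00:00 PM" pass, while "29:00:00 |M" and "13:00:00 PM" are rejected.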

But in the end, it doesn't matter whether the RegExp pattern matches EXACTLY only time values, because the formatting of the input data is done by Excel, which pre-validates the cell content (if you type "29:00:00" it will convert it to 5 AM of the next day). Being built on Excel, this is an Excel-only solution, not a global one.

In that sense, regular expressions are not even common in Excel: RegExp is a string tester with no evaluation of semantic meaning, and in this context you have other means to validate inputs. For instance, you could do the same with pure VBA:

Function IsTime(rng As Range) As Boolean
  Dim sValue As String
  sValue = rng.Cells(1).Text
  On Error Resume Next
  IsTime = IsDate(TimeValue(sValue))
  On Error GoTo 0
End Function
'Source: http://excel.tips.net/T003292_Checking_for_Time_Input.html
