Extracting a URL using a regular expression

Time: 2021-08-11 13:28:42

I receive 10 or 12 emails from the same website and I would like to extract a particular URL using regex (if possible) and have it pasted into the correct Excel file. The e-mails are in Outlook and I already have a VBA script (that runs from Outlook VBA) that I used to extract the Subject and Sender. However, I really need that particular URL in each e-mail to be the third piece of information extracted.

I have tried to create a series of steps to:

  1. create the RegEx
  2. apply the RegEx to the current email
  3. place the extracted URL into the Excel document.

However, whatever I have created fails miserably. The VBA pasted below always worked until I wrote in the additional RegEx part.

I believe I have the correct pattern:

/http:\/\/www.changedetection.com\/log(.*)/ig

Whenever I run the new VBA script, it doesn't do anything. The old code always worked. The code is in ThisOutlookSession (just to clarify), because the MailItem is passed in when the macro runs as a script.

Const xlUp As Long = -4162
Sub ExportToExcel(MyMail As MailItem)

    Dim strID As String, olNS As Outlook.NameSpace
    Dim olMail As Outlook.MailItem
    Dim strFileName As String
    Dim strBody As String
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match

    Set Reg1 = New RegExp
    With Reg1
        .Pattern = "http://www\.changedetection\.com/log(.*)"
        .IgnoreCase = True
        .Global = True
     End With

    If Reg1.test(olMail.Body) Then

        Set M1 = Reg1.Execute(olMail.Body)
        For Each M In M1
        strBody = M.SubMatches(1)
        Next
    End If

    '~~> Excel Variables
    Dim oXLApp As Object, oXLwb As Object, oXLws As Object
    Dim lRow As Long

    strID = MyMail.EntryID
    Set olNS = Application.GetNamespace("MAPI")
    Set olMail = olNS.GetItemFromID(strID)



    '~~> Establish an EXCEL application object
    On Error Resume Next
    Set oXLApp = GetObject(, "Excel.Application")

    '~~> If not found then create new instance
    If Err.Number <> 0 Then
        Set oXLApp = CreateObject("Excel.Application")
    End If
    Err.Clear
    On Error GoTo 0

    '~~> Show Excel
    oXLApp.Visible = True

    '~~> Open the relevant file
    Set oXLwb = oXLApp.Workbooks.Open("M:\Monitor\Monitor_Test_1.xlsx")

    '~~> Set the relevant output sheet. Change as applicable
    Set oXLws = oXLwb.Sheets("Test")

    lRow = oXLws.Range("A" & oXLApp.Rows.Count).End(xlUp).Row + 1

    '~~> Write to Excel
    With oXLws
        '
        '~~> Code here to output data from email to Excel File
        '~~> For example
        '
        .Range("A" & lRow).Value = olMail.Subject
        .Range("B" & lRow).Value = olMail.SenderName
        .Range("C" & lRow).Value = strBody

        '
    End With

    '~~> Close and Clean up Excel
    oXLwb.Close (True)
    oXLApp.Quit

    Set Reg1 = Nothing
    Set oXLws = Nothing
    Set oXLwb = Nothing
    Set oXLApp = Nothing

    Set olMail = Nothing
    Set olNS = Nothing
End Sub

1 solution

#1


VBScript regex patterns don't use / to mark the start and end of the pattern, nor do they take i or g after the trailing / to indicate case-insensitivity or global matching. Instead, use the IgnoreCase and Global properties.

For example:

With Reg1
    .Pattern = "http://www\.changedetection\.com/log(.*)"
    .IgnoreCase = True
    .Global = True
End With

Here's a great reference if you're looking for more information about the RegExp object.
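
As a rough sketch of how the extraction step might look once the pattern is set this way, the function below pulls the first matching URL out of a block of text. The name ExtractLogUrl is hypothetical (it is not in the posted macro), the RegExp object is created late-bound so that no library reference is needed, and the \S* ending is an assumption that the URL contains no whitespace.

```vba
' Hypothetical helper (not part of the original macro): returns the first
' changedetection.com log URL found in bodyText, or "" if there is none.
Function ExtractLogUrl(ByVal bodyText As String) As String
    Dim re As Object        ' late-bound VBScript.RegExp, so no library reference is required
    Dim matches As Object   ' MatchCollection returned by Execute

    Set re = CreateObject("VBScript.RegExp")
    With re
        ' Assumption: the URL contains no whitespace, so \S* stops the match at
        ' the first space or line break (a greedy .* can also pick up a trailing CR).
        .Pattern = "http://www\.changedetection\.com/log\S*"
        .IgnoreCase = True
        .Global = False     ' only the first match is needed
    End With

    Set matches = re.Execute(bodyText)
    If matches.Count > 0 Then
        ' Value is the whole matched text. If the pattern had a capture group,
        ' it would be read with SubMatches(0), because SubMatches is zero-based.
        ExtractLogUrl = matches.Item(0).Value
    Else
        ExtractLogUrl = ""
    End If
End Function
```

In the macro from the question, a call such as strBody = ExtractLogUrl(olMail.Body) would have to come after the Set olMail = olNS.GetItemFromID(strID) line. As posted, the macro reads olMail.Body before olMail has been assigned, and it asks for SubMatches(1) although the pattern has only one group, which is SubMatches(0).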
