Extract all matches by regex pattern

YasserKhalil
PlatinumLounger
Posts: 4914
Joined: 31 Aug 2016, 09:02

Extract all matches by regex pattern

Post by YasserKhalil »

Hello everyone,
I need template code that extracts all the matches of a specific regex pattern.
In my case, I have some data and a pattern that matches about 22 times in it.
I only need an example; I will try to apply it to my case on my own.
Thanks in advance for the help.

HansV
Administrator
Posts: 78499
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: Extract all matches by regex pattern

Post by HansV »

Here is a generic procedure:

Code:

Sub RegExAll(strData As String, strPattern As String)
    Dim RE As Object
    Dim REMatches As Object
    Dim REMatch As Object
    Set RE = CreateObject("VBScript.RegExp")
    With RE
        .Global = True
        .Pattern = strPattern
        Set REMatches = .Execute(strData)
        For Each REMatch In REMatches
            Debug.Print "Position: " & REMatch.FirstIndex & _
                ", Value: " & REMatch.Value
        Next REMatch
    End With
End Sub
Example of use: find all digits in a text string:

Code:

Sub Test()
    RegExAll "Windows 10 Version 2004", "\d"
End Sub
Result:

Position: 8, Value: 1
Position: 9, Value: 0
Position: 19, Value: 2
Position: 20, Value: 0
Position: 21, Value: 0
Position: 22, Value: 4

(RegEx starts counting at 0)
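
Because FirstIndex is zero-based while VBA string functions such as Mid are one-based, the position needs + 1 before it is used to slice the original string. A minimal sketch of that adjustment (the procedure name ShowMatchInContext and the pattern "\d+" are made up for illustration; Length is the Match property holding the length of the match):

```vba
Sub ShowMatchInContext()
    Dim RE As Object
    Dim REMatch As Object
    Dim strData As String
    strData = "Windows 10 Version 2004"
    Set RE = CreateObject("VBScript.RegExp")
    RE.Global = True
    RE.Pattern = "\d+"
    For Each REMatch In RE.Execute(strData)
        ' FirstIndex is zero-based; Mid expects a one-based start position
        Debug.Print Mid(strData, REMatch.FirstIndex + 1, REMatch.Length)
    Next REMatch
End Sub
```

This should print 10 and 2004 in the Immediate window, the same text that REMatch.Value returns.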
Best wishes,
Hans

Re: Extract all matches by regex pattern

Post by YasserKhalil »

That's amazing, my tutor. Very simple and clear. This is easy to learn from.
Thank you very much. I will try to solve the issue I am working on, and if I get stuck at some point, I will get back to you.
Best and Kind Regards

snb
4StarLounger
Posts: 580
Joined: 14 Nov 2012, 16:06

Re: Extract all matches by regex pattern

Post by snb »

An alternative way of writing it:

Code:

Sub M_snb()
    MsgBox F_snb(Array("\d", "Windows 10 Version 2004"))
End Sub

Function F_snb(sn)
    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = sn(0)
        For Each it In .Execute(sn(1))
           F_snb = F_snb & vbLf & "Position: " & it.FirstIndex & " Value: " & it.Value
        Next
    End With
End Function

Re: Extract all matches by regex pattern

Post by YasserKhalil »

Thank you very much, snb. I like your approach of compacting the code.
Regards

Re: Extract all matches by regex pattern

Post by YasserKhalil »

@HansV
I have the following pattern:

Code:

^[^\S\n]*(\d{1,3})\n\s*(\d{6,})[^\S\n]*\n\s*(\d{14})[^\S\n]*\n(.+)\n(.+)
The pattern splits each full match into five groups:
https://regex101.com/r/AUBup2/2

How can I use the code to put the five groups of each match in one row?

Code:

Sub Test()
    ' ChromeDriver is early-bound here, so the project needs SeleniumBasic installed and referenced
    Dim x, a(1 To 1000, 1 To 5), bot As New ChromeDriver, col As Object, sInput As String, sPattern As String, i As Long, j As Long, cnt As Long
    sPattern = "^\s*\d{1,3}(?:\n(?!\s*\d{1,3}\n).*){4}"
    With bot
        .AddArgument "--headless"
        .Get "file:///C:\Sample.html"
        sInput = .FindElementByCss("table[id='all']").Text
    End With
    With CreateObject("VBScript.RegExp")
        .Global = True: .MultiLine = True: .IgnoreCase = True
        .Pattern = sPattern
        If .Test(sInput) Then
            Set col = .Execute(sInput)
            For i = 0 To col.Count - 1
                ' Each match spans five lines; split it into its parts
                x = Split(col.Item(i), vbLf)
                cnt = cnt + 1
                For j = LBound(x) To UBound(x)
                    a(i + 1, j + 1) = Application.WorksheetFunction.Clean(Trim(x(j)))
                Next j
            Next i
        End If
    End With
    ' Write only when something matched; Resize(0, ...) would raise an error
    If cnt > 0 Then ActiveSheet.Range("A1").Resize(cnt, UBound(a, 2)).Value = a
End Sub
Here's the HTML for the whole page:
https://pastebin.com/nu0dLvch

Copy the contents, paste them into a text file, and save it as Sample.html in the root of drive C:\

The code works fine, but I would like to learn how to work with groups instead of using the Split function to separate the five parts.

LisaGreen
5StarLounger
Posts: 964
Joined: 08 Nov 2012, 17:54

Re: Extract all matches by regex pattern

Post by LisaGreen »

That would seem to be a regex problem rather than a VBx one.

However... my personal favourite regex site is one you may want to try as well, at the very least to compare results:
https://regexr.com/

Groups in regex, as I think you probably know, are bounded by parentheses: ( ).
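
On the VBA side, each Match object returned by VBScript.RegExp exposes the text captured by those parentheses through its SubMatches collection, which is zero-based. A minimal sketch, assuming a made-up two-group pattern and sample string (the procedure name ListGroups is hypothetical too):

```vba
Sub ListGroups()
    Dim RE As Object
    Dim REMatch As Object
    Dim i As Long
    Set RE = CreateObject("VBScript.RegExp")
    RE.Global = True
    ' Two groups: the word before "=" and the digits after it
    RE.Pattern = "(\w+)=(\d+)"
    For Each REMatch In RE.Execute("width=120, height=45")
        ' SubMatches is zero-based: index 0 holds the first group
        For i = 0 To REMatch.SubMatches.Count - 1
            Debug.Print "Group " & i + 1 & ": " & REMatch.SubMatches(i)
        Next i
    Next REMatch
End Sub
```

REMatch.Value still returns the full match; SubMatches only holds the parenthesised parts.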

This is another great regex site!
https://www.regular-expressions.info/refcapture.html

HTH
Lisa

Re: Extract all matches by regex pattern

Post by YasserKhalil »

Thanks a lot, Lisa.
The regex part is about the pattern, and the pattern is already there with the groups defined in it. Please read the post carefully.
What I need is how to handle and extract the groups from each match using VBA.

Re: Extract all matches by regex pattern

Post by LisaGreen »

The generic answer from Hans is perfect.

It allows easy adjustment of the regex expression and hardly depends on VBA at all... VBA is just a shell around the regex process.
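
As a rough sketch of how that generic procedure might be adapted to return one row per match and one column per group, something like the following could work. The names GroupsToArray and TestGroups, the sample data, and the short test pattern are all assumptions made for illustration; MultiLine is switched on so that ^ can match at the start of every line, and the write to ActiveSheet assumes the code runs in Excel.

```vba
' Hypothetical helper: returns a 1-based 2D array, one row per match and
' one column per capture group. Returns Empty if nothing matches.
Function GroupsToArray(strData As String, strPattern As String) As Variant
    Dim RE As Object, REMatches As Object
    Dim a() As Variant
    Dim r As Long, c As Long
    Set RE = CreateObject("VBScript.RegExp")
    RE.Global = True
    RE.MultiLine = True          ' lets ^ match at the start of each line
    RE.Pattern = strPattern
    Set REMatches = RE.Execute(strData)
    If REMatches.Count = 0 Then Exit Function
    ' A pattern without parentheses has no groups to return
    If REMatches(0).SubMatches.Count = 0 Then Exit Function
    ReDim a(1 To REMatches.Count, 1 To REMatches(0).SubMatches.Count)
    For r = 0 To REMatches.Count - 1
        For c = 0 To REMatches(r).SubMatches.Count - 1
            ' SubMatches is zero-based; the output array is one-based
            a(r + 1, c + 1) = Trim(REMatches(r).SubMatches(c))
        Next c
    Next r
    GroupsToArray = a
End Function

Sub TestGroups()
    Dim v As Variant
    ' Sample data: a number on one line, a name on the next
    v = GroupsToArray("1" & vbLf & "Ann" & vbLf & "2" & vbLf & "Bob", "^(\d+)\n(.+)")
    If Not IsEmpty(v) Then
        ActiveSheet.Range("A1").Resize(UBound(v, 1), UBound(v, 2)).Value = v
    End If
End Sub
```

With the sample data, the sheet should receive two rows of two columns each. A five-group pattern would fill five columns the same way, with no Split needed.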

Compacting code IMNSVHO is counterproductive in that you will need to decipher it later, and that takes time. You always, IMHO, have to think about future maintenance, at the very least because it will probably be **you** having to update the code in a few years. How much will you have forgotten by then about that little piece of code? *** Remember!! You will forget!! *** And what about the people who follow you? Will they understand that compressed and cryptic piece of code, or, even more important, how long will it take them to understand it?

There is a very famous quote, attributed to John F. Woods in 1991: "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. Code for readability."

Okay... flame off.

I'd advise you to ask questions about grouping in the many regex forums available. Not what you wanted to hear maybe.

Lisa

Re: Extract all matches by regex pattern

Post by YasserKhalil »