
PowerShell script to run RegEx and Output results

Fidel from Brisbane has posted a very useful PowerShell script here that runs a regular expression search through a text file and extracts only the matches.


Combined with the RegEx search for Bates numbers discussed last night, it can automatically extract a complete list of the Bates numbers in any text file.


So if you start with a text file that has Bates numbers at the end of each paragraph, you can run this PowerShell script:


select-string -Path C:\foofolder\input.txt -Pattern "(\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b|\b\w{1,10}(-|_|\s?)\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b)" -AllMatches | % { $_.Matches } | select-object Value -unique | sort-object Value > C:\foofolder\output2.txt


to pull out the Bates numbers. Note that you need to specify the path of your text file at:


select-string -Path C:\foofolder\input.txt


. . . put in the Regex in quotes at:


-Pattern "(\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b|\b\w{1,10}(-|_|\s?)\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b)" -AllMatches


. . . and then specify an output file at the end:


sort-object Value > C:\foofolder\output2.txt
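Before pointing the command at a real file, it can help to see what the pattern accepts. The snippet below is only a rough sketch: the candidate strings are made up for illustration, and the `$pattern`, `$candidates` and `$results` variable names are my own, not part of the original script.

```powershell
# The pattern from the command above. Single quotes keep PowerShell
# from interpolating anything inside it.
$pattern = '(\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b|\b\w{1,10}(-|_|\s?)\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b)'

# Made-up candidates, not taken from any real production.
$candidates = @(
    'ABC-000123',         # prefix, hyphen, six digits
    'ABC_000123',         # underscore separator
    'ABC 000123',         # single space separator
    'SMITH DOE-0001234',  # two-word prefix (second alternative in the pattern)
    'ABC-123'             # only three digits, fewer than the five required
)

# Record whether each candidate contains a match for the pattern.
$results = foreach ($text in $candidates) {
    [pscustomobject]@{
        Text    = $text
        Matched = $text -match $pattern
    }
}

$results | Format-Table -AutoSize
```

The first four candidates should come back as matched and the last one as not matched, since the pattern asks for 5 to 12 digits after the prefix.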


You should end up with a text file that lists each Bates number just once, and has them sorted as well!
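One thing to watch for: because the command keeps the matches as objects with a Value property, the redirected file starts with a "Value" column header and a row of dashes above the numbers. If you want the numbers and nothing else, one possible variant is sketched below. It is an assumption-laden sketch rather than Fidel's script: the temp-folder file names and sample paragraphs are invented for the demo, and it swaps in `-ExpandProperty`, `Sort-Object -Unique` and `Set-Content` in place of the original `select-object Value -unique` and `>` redirect.

```powershell
# Throwaway files in the temp folder; swap in your own input and output paths.
$tempDir = [System.IO.Path]::GetTempPath()
$inFile  = Join-Path $tempDir 'bates-demo-input.txt'
$outFile = Join-Path $tempDir 'bates-demo-output.txt'

# Made-up sample text with a Bates number at the end of each paragraph.
@(
    'First paragraph of the document. ABC-000124',
    'Second paragraph. ABC-000123',
    'Third paragraph repeats an earlier number. ABC-000124'
) | Set-Content -Path $inFile

$pattern = '(\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b|\b\w{1,10}(-|_|\s?)\b\w{1,10}(-|_|\s?)[0-9]{5,12}\b)'

# -ExpandProperty turns each match into a plain string, so the output
# file gets one Bates number per line and no column header.
Select-String -Path $inFile -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches } |
    Select-Object -ExpandProperty Value |
    Sort-Object -Unique |
    Set-Content -Path $outFile

# Read the result back to check it.
$written = Get-Content -Path $outFile
$written
```

With the sample text above, the output file should contain two lines, ABC-000123 followed by ABC-000124.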







