Let’s say you have a log file. There’s some info in there, like URLs, that you need in a list.
Copy-pasting? Hell no, PowerShell to the rescue!
PowerShell
#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, February 2025
#
# Change the regex to fit your purposes
# and of course the input file
$regEx = 'https?://[^\s/$.?#].[^\s]*'
$inputFile = "C:\logs\mybiglog.txt"
$outputFile = [System.IO.Path]::Combine([System.IO.Path]::GetDirectoryName($inputFile), "out_$([guid]::NewGuid().ToString().Split('-')[0]).txt")

# -Raw reads the whole file into memory as a single string
$content = Get-Content -Path $inputFile -Raw
# $matches is an automatic variable in PowerShell, so use a different name
$urlMatches = [regex]::Matches($content, $regEx)
$urlMatches | ForEach-Object { $_.Value } | Out-File -FilePath $outputFile
This is the easy way. And it works… unless the log file is big, meaning more than a few GB. In that case, trying to fit the whole file in memory (which Get-Content -Raw does) is going to eat all your RAM and bring your system to its knees.
So, what do you do? You stream. No, not like Netflix. Well, kind of:
PowerShell
#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, February 2025
#
# Change the regex to fit your purposes
# and of course the input file
$inputFile = "C:\logs\mybiglog.txt"
$outputFile = "C:\logs\out_$([guid]::NewGuid().ToString().Split('-')[0]).txt"
$regEx = [regex]'https?://[^\s/$.?#].[^\s]*'

# Create a stream reader for the input and a writer for the output
$reader = [System.IO.File]::OpenText($inputFile)
$writer = [System.IO.StreamWriter]::new($outputFile)
try {
    # Compare against $null explicitly: an empty line is falsy
    # and would otherwise end the loop early
    while ($null -ne ($line = $reader.ReadLine())) {
        # $matches is an automatic variable, so it is not used as a name here
        foreach ($match in $regEx.Matches($line)) {
            $writer.WriteLine($match.Value)
        }
    }
}
finally {
    # Always close your streams to release the file locks
    $writer.Dispose()
    $reader.Dispose()
}
Write-Host "Processing complete. Results saved to: $outputFile"
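One detail worth checking in any ReadLine loop like the one above: in PowerShell an empty string counts as false, so a loop condition that relies on the truthiness of the line ends at the first blank line, not at the end of the file. Here is a minimal sketch that shows the difference. It uses a StringReader as a stand-in for a real file, and the sample text and the variable names `$truthy` and `$nullChecked` are made up for the illustration.

```powershell
# Hypothetical sample input: two URLs separated by a blank line
$sample = "GET http://example.com/a`n`nGET https://example.org/b?x=1"
$regEx  = [regex]'https?://[^\s/$.?#].[^\s]*'

# Variant 1: truthiness check. "" is falsy, so the loop stops at the blank line.
$reader = [System.IO.StringReader]::new($sample)
$truthy = [System.Collections.Generic.List[string]]::new()
while ($line = $reader.ReadLine()) {
    foreach ($m in $regEx.Matches($line)) { $truthy.Add($m.Value) }
}
$reader.Dispose()

# Variant 2: explicit null check. ReadLine() returns $null only at the end of input.
$reader = [System.IO.StringReader]::new($sample)
$nullChecked = [System.Collections.Generic.List[string]]::new()
while ($null -ne ($line = $reader.ReadLine())) {
    foreach ($m in $regEx.Matches($line)) { $nullChecked.Add($m.Value) }
}
$reader.Dispose()

"Truthiness check found $($truthy.Count) URL(s)"
"Null check found $($nullChecked.Count) URL(s)"
```

With this sample, the first variant finds only the URL before the blank line, while the second finds both. Real log files often contain blank lines, so the explicit `$null` comparison is the safer form.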
Have fun coding!