This is my input file. Its content is random: the digits can be any numbers, not just 9999, and the letters can be any letters. The lines in the format below always come after a line containing a - (dash).

-
9999 99AKDSLY9ZWSRK99999
9999 99BGRPOE99FTRQ99999

Expected output:

AKDSLY9ZSRK
BGRPOE99TRQ

So I need to remove the first part of each line, always numbers:

9999 99
9999 99

Then remove the not-required characters:

99AKDSLY9ZW → in this case it is the W, but it could be any letter
99BGRPOE99F → in this case it is the F, but it could be any letter

And finally remove the last 5 digits, always numbers:

99999
99999

What I'm trying to use is a regex (my first time using one):

$result = [regex]::Matches($InputFile, '(^\d{4}\s\d{2}[A-Z0-9]\d{5}$)') -replace '\d{4}\s\d{2}', ''
$result

It's not giving me an error message, but $result does not contain the characters I'm expecting.

I was expecting to see something in $result so that I could then start the formatting and delete the characters I don't need.

What could be missing here, please?

Comments:

  • Generalize it: match ^[\d ]*(.+?)[\d ]*$ and replace with $1, then remove the not-required characters in a separate regex operation. Set multi-line mode for both regex operations. Commented Mar 28, 2017 at 18:27
  • Your regular expression says: start of string, four digits, space, two digits, a single letter or number, five digits, end of string. Because of the single letter or number, it will never match anything (and even if you change it to match, the match would be the entire line, which makes it useless). Try '9999 99BGRPOE99FTRQ99999' -replace '^\d{4} \d{2}|\d{5}$' -replace '.(?=...$)' to remove the leading and trailing numbers, then remove the fourth-from-last character (my best guess as to why you are choosing the W and F...). Commented Mar 28, 2017 at 18:33
  • The W and F are extra characters that come randomly with the input file and need to be deleted so that this file can be consumed by another application; if they are not removed, validation will fail. I'm trying both approaches shared here, thanks a bunch. Commented Mar 28, 2017 at 18:54
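The two-step -replace from the second comment can be tried on the sample lines like this. It is only a minimal sketch: $lines and $cleaned are hypothetical variable names, and it assumes the lines after the dash have already been isolated and that the stray letter is always the fourth-from-last character once the trailing digits are gone.

```powershell
# Sample lines copied from the question; $lines is a hypothetical name.
$lines = '9999 99AKDSLY9ZWSRK99999', '9999 99BGRPOE99FTRQ99999'

$cleaned = $lines | ForEach-Object {
    # First -replace: strip the leading "dddd dd" prefix or the trailing five digits.
    # Second -replace: drop the character that is followed by exactly three more characters.
    $_ -replace '^\d{4} \d{2}|\d{5}$' -replace '.(?=...$)'
}

$cleaned
# AKDSLY9ZSRK
# BGRPOE99TRQ
```

Because each element is processed on its own, ^ and $ anchor to a single line here and no multi-line option is needed.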

1 Answer

Try something like this:

$str = (Get-Content ... -Raw) -replace '\r'

$cb = {
  $args[0].Groups[1].Value -replace '(?m)^.{7}' -replace '(?m).(.{3}).{5}$', '$1'
}

$re = [regex]'(?m)^(?<=-\n)((?:\d{4}\s\d{2}[^\n]*\d{5}(?:\n|$))+)'

$re.Replace($str, $cb)

The regular expression $re matches multiline substrings made up of one or more lines with your digit/letter combinations, provided they directly follow a hyphen and a newline. The (?<=...) is a positive lookbehind assertion that ensures you only get a match when the lines with the digit/letter combinations are preceded by a line with a hyphen (without making that line part of the actual match).

The scriptblock $cb is an anonymous callback function that the Regex.Replace() method calls on each match. For each line in a match it removes the first 7 characters from the beginning of the line, and replaces the last 9 characters from the end of the line with the 2nd through 4th of those characters.

For simplicity, the sample code removes carriage return characters (CR, \r) from the string, so that all newlines are plain linefeed characters (LF, \n) instead of the Windows default CR-LF.
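To check the behaviour without a file, the same code can be run against an inline string. This is just a sketch: the sample text below (including the "header" line) is made up for illustration and stands in for the result of Get-Content ... -Raw with the CRs already removed.

```powershell
# Hypothetical sample input with LF-only newlines; "header" is an invented first line.
$str = "header`n-`n9999 99AKDSLY9ZWSRK99999`n9999 99BGRPOE99FTRQ99999`n"

# Callback: per line, drop the first 7 characters, then replace the last
# 9 characters with the 2nd through 4th of them.
$cb = {
  $args[0].Groups[1].Value -replace '(?m)^.{7}' -replace '(?m).(.{3}).{5}$', '$1'
}

# Match the block of data lines that directly follows a line with a hyphen.
$re = [regex]'(?m)^(?<=-\n)((?:\d{4}\s\d{2}[^\n]*\d{5}(?:\n|$))+)'

$out = $re.Replace($str, $cb)
$out
# header
# -
# AKDSLY9ZSRK
# BGRPOE99TRQ
```

Note that the hyphen line and anything before it stay in the result, since only the matched data lines are replaced.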


3 Comments

Hi Ansgar, this is a very nice explanation of the regex usage. I've tried this and it works perfectly when I check the results in the console, but when I write the output to a file using | Out-File, everything ends up on the same line, like text without carriage returns. If I comment out the -replace '\r' part, it does not work either. Kind of tricky.
Split the string at newlines before piping the result to Out-File. If you want to keep the CR in the input string you need to adjust the regular expression to account for the additional characters.
Worked like a charm!!! Thanks @Ansgar Wiechers. And I've got to learn a little bit more about PowerShell, especially about regex.
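For reference, the suggested split before Out-File might look roughly like the sketch below. The variable $text stands in for the string returned by $re.Replace($str, $cb), and the output path is a hypothetical file in the temp directory. The one-line symptom is probably caused by the LF-only line breaks, which some Windows editors do not render, but that is an assumption.

```powershell
# $text stands in for the result of $re.Replace($str, $cb); LF-only newlines assumed.
$text = "-`nAKDSLY9ZSRK`nBGRPOE99TRQ`n"

# Hypothetical output path, chosen only for this example.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'cleaned.txt'

# Trim the trailing newline so the split does not produce an empty last element,
# then split into one array element per line. Out-File writes each element
# on its own line using the platform's line ending.
$text.TrimEnd("`n") -split "`n" | Out-File -FilePath $outPath

Get-Content $outPath
```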
