For context, please check this SO post.

full script:

(get-content file.txt -ReadCount 0) -replace '([^,]")"','$1' | set-content newfile.txt

I am specifically looking for a translation of the logic in this portion of the script:

'([^,]")"','$1' |

Can someone please either explain the logic/syntax, or point me in the right direction?

  • Looks somewhat like a regular expression to me. Commented Oct 29, 2013 at 17:02
  • Can you please explain the operators in the search pattern? I can't find the meaning of the ^ symbol. Commented Oct 29, 2013 at 17:07

3 Answers

3

'([^,]")"' is a regular expression that matches any single character other than a comma, followed by two consecutive double quotes. The parentheses group that character together with the first double quote.

'$1' is a back-reference to the group in the match, which in the replacement means "replace the match with just the first group", e.g. in a string foo""bar the sequence o"" would be replaced with just o", thus removing the second double quote.

| is a pipe that feeds the result of the replacement into the next cmdlet in the pipeline (Set-Content newfile.txt).
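
As a quick check of the explanation above, here is a minimal sketch that runs the same pattern against the foo""bar example instead of a file. The variable names are made up for illustration.

```powershell
# Single-quoted strings keep the double quotes and the $1 literal.
$sample = 'foo""bar'

# Same pattern and replacement as in the question's script.
$result = $sample -replace '([^,]")"', '$1'

$result   # foo"bar
```

The match is o"" (a non-comma character plus two quotes), and $1 puts back only o", so one quote is dropped.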


2
'([^,]")"','$1'

Let's break this into two pieces: the regex pattern '([^,]")"' and the replacement text '$1'. The () in the regex pattern creates an unnamed capture group that is referenced in the replacement text via $1, i.e. it is the first (and in this case only) set of parentheses. The capture group matches any single character except a comma, followed by a double quote. The pattern then requires one more double quote outside the capture group. So it removes the second of two consecutive double quotes unless the first is preceded by a comma.
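
To see the comma exception in action, here is a small sketch on a CSV-style line. The sample line is invented for illustration and is not taken from the original file.

```powershell
# Hypothetical input: the second field has a stray extra quote,
# and the third field is a legitimately empty quoted field.
$line = '"id","name"","",42'

$fixed = $line -replace '([^,]")"', '$1'

$fixed   # "id","name","",42
```

The doubled quote after name is collapsed because it follows the non-comma character e, while the empty field "" survives because its first quote follows a comma.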


1

Outside brackets, the ^ symbol matches the beginning of a string.
Great references here and here that pretty well explain everything.
Intro to Regex in PowerShell here.

When the ^ symbol appears as the first character inside brackets [], it negates the class: the class then matches any single character except the ones that follow the caret.

4 Comments

Ok. Are you able to translate the logic expressed here: '([^,]")"','$1' ?
^ does not match beginning of string when in a character class [...]. Inside the [] it matches a literal caret character.
@Eris Both of you are wrong. Inside square brackets the caret inverts the character class. [^,] means "any character except a comma".
Oh yes, it's negation when it's the first character in a class, but not when it appears later. Magic-meaning ftw!
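
The three roles of the caret discussed in these comments can be checked with a short sketch. The sample strings and variable names are made up for illustration.

```powershell
# 1. Outside brackets, ^ anchors the match to the start of the string.
$anchorHit  = 'abc' -match '^a'    # True
$anchorMiss = 'abc' -match '^b'    # False

# 2. First inside brackets, ^ negates the class: [^,] is "anything but a comma".
$negated = 'a,b' -replace '[^,]', 'x'     # x,x

# 3. Later inside brackets, ^ is a literal caret: [,^] is "a comma or a caret".
$literal = 'a^b,c' -replace '[,^]', '_'   # a_b_c
```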
