Since your text file is mostly JSON, it is better and more robust to remove the non-JSON preamble and use ConvertFrom-Json to parse the JSON into an object whose properties you can access to get the desired information:
# -> 11260434, i.e. a *number*, due to from-JSON parsing.
(
(Get-Content -Raw C:\Folder\scanjson.txt) -replace '(?s)^.+?(?=\{)' |
ConvertFrom-Json
).scanId
Get-Content -Raw reads the file as a whole into a single, multiline string.
-replace '(?s)^.+?(?=\{)' removes everything up to, but excluding, the first { character; the components of the regex are as follows:
(?s) is an inline option that makes . match newline characters too.
^ matches at the start of the string.
.+? matches any nonempty run of characters (.) lazily (+?), i.e. as few as possible, so that the match ends before the first { rather than the last one (a greedy .+ would strip too much if the JSON contains nested objects); if your input file doesn't always have a preamble, use .*? instead.
(?=\{) is a lookahead assertion that matches a { character without including it in the match.
- By not specifying a substitution expression, the match is replaced with the empty string, i.e. effectively removed.
The resulting string is valid JSON that ConvertFrom-Json parses into [pscustomobject] instances, whose properties, such as .scanId, you can access.
- Note that, unlike strictly text-based regex parsing, JSON supports a few data types, causing an unquoted token such as 11260434 to be parsed as a number.
- An integer, as in the case at hand, is parsed as type [int] (System.Int32) in Windows PowerShell vs. as [long] (System.Int64) in PowerShell (Core) 7.
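To try the JSON-based approach without the original file, here is a minimal sketch that uses a made-up, in-memory sample; the preamble text and the nested details property are assumptions, not the actual file content. It uses the lazy form .+? so that stripping stops before the first { even though the sample JSON contains a nested object.

```powershell
# Hypothetical sample: two preamble lines followed by JSON with a nested object.
$sample = @'
Scan report generated by some tool
--- begin payload ---
{
  "scanId": 11260434,
  "details": { "status": "done" }
}
'@

# Remove the preamble (everything before the first '{'), then parse the rest.
$obj = $sample -replace '(?s)^.+?(?=\{)' | ConvertFrom-Json

$obj.scanId                    # the number 11260434
$obj.details.status            # the string 'done'
$obj.scanId.GetType().Name     # Int32 in Windows PowerShell, Int64 in PowerShell 7
```

Because the result is a parsed object, nested properties such as .details.status are reachable with plain property access, which a regex over the raw text does not give you.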
As for what you tried:
Your regex was too permissive; use "scanId"\s*:\s*(?<digits>[0-9]+) instead;[1] i.e.:
# -> '11260434', i.e. a *string*, due to regex parsing.
(
Select-String -List -Path C:\Folder\scanjson.txt '"scanId"\s*:\s*(?<digits>[0-9]+)'
).Matches[0].Groups['digits'].Value
Note the need to drill down into the Select-String output object, which is of type [Microsoft.PowerShell.Commands.MatchInfo], to get the capture-group value of interest.
-List ensures that matching stops after the first match.
Note: In cases where you expect multiple matches, you'd omit -List and pipe to a ForEach-Object call instead of using direct property access on the result (as shown above):
Select-String -Path C:\Folder\scanjson.txt '"scanId"\s*:\s*(?<digits>[0-9]+)' |
ForEach-Object { $_.Matches[0].Groups['digits'].Value }
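If several matches can occur on the same line (for example, in minified JSON), $_.Matches[0] returns only the first match of each line. A minimal sketch of one way to handle that, using Select-String's -AllMatches switch on made-up in-memory input (the sample lines are assumptions):

```powershell
# Hypothetical input: the second line contains two "scanId" properties.
$lines = @(
  '{ "scanId": 1 }'
  '[ { "scanId": 22 }, { "scanId": 333 } ]'
)

# -AllMatches makes .Matches contain every match on a line, not just the first.
$ids = $lines |
  Select-String -AllMatches '"scanId"\s*:\s*(?<digits>[0-9]+)' |
  ForEach-Object { $_.Matches } |
  ForEach-Object { $_.Groups['digits'].Value }

$ids   # '1', '22', '333' (strings)
```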
Because regex-based parsing is used, the result is a string; to get a number, simply cast the entire expression to, e.g., [int].
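The cast can be sketched as follows; the input is a made-up string piped to Select-String rather than the original file, so the variable names and sample text are assumptions:

```powershell
# Hypothetical single-line input standing in for the file's content.
$text = 'preamble { "scanId": 11260434 }'

# The [int] cast applies to the whole expression, i.e. to the captured string.
$id = [int] (
  $text | Select-String '"scanId"\s*:\s*(?<digits>[0-9]+)'
).Matches[0].Groups['digits'].Value

$id.GetType().Name   # Int32
```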
[1] For an explanation of the regex and the option to experiment with it, see this regex101.com page. Note that regex101.com's .NET support is limited to C#, which may require tweaks to PowerShell regexes, such as not using '...' and escaping " chars. as ""; see this answer for guidance.