Parse data from log file using Regex patterns

Question

I have a log file full of entries of this type:

2020-02-04 04:00:31,503 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceType] [ObjectType] - Information about the log

Using regex patterns, I would like to retrieve the time, the last text in brackets ([ObjectType] in the example) and the information message after the hyphen.

Example input:

2020-02-04 04:00:33,435 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeJohn] [ObjectTypeJohn] - Information about the John log
2020-02-04 06:50:34,465 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeBob] [ObjectTypeBob] - Information about the Bob log
2020-02-04 07:20:34,677 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeSam] [ObjectTypeSam] - Information about the Sam log

Desired output:

04:00:33,435 [ObjectTypeJohn] - Information about the John log
06:50:34,465 [ObjectTypeBob] - Information about the Bob log
07:20:34,677 [ObjectTypeSam] - Information about the Sam log

So far I have tried this, without success:

(Get-Content Output.txt) -replace '^(\d\d:\d\d:\d\d).*(\[.*?\] - .*?)$','$1;$2'

Would appreciate any help on this, thanks.

Wiktor Stribiżew · Accepted Answer · 2020-03-11 14:02:22Z


You may use

(Get-Content Output.txt) -replace '^\S+\s+(\S+).*(\[[^][]*])\s*(-.*)', '$1 $2 $3'

See the .NET regex demo

Details

^ - start of the string
\S+ - 1+ chars other than whitespace
\s+ - 1+ whitespaces
(\S+) - Group 1: 1+ chars other than whitespace
.* - any 0+ chars other than newline, as many as possible
(\[[^][]*]) - Group 2: [, 0+ chars other than [ and ] and then a ] char
\s* - 0+ whitespaces
(-.*) - Group 3: - and the rest of the string.
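
As a quick check of the breakdown above, here is a minimal sketch that runs the same pattern against two of the sample lines from the question. The `$lines`, `$pattern` and `$result` variables are illustrative stand-ins (for example, `$lines` replaces `Get-Content Output.txt`) and are not part of the original answer.

```powershell
# Sample lines copied from the question; $lines stands in for (Get-Content Output.txt).
$lines = @(
  '2020-02-04 04:00:33,435 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeJohn] [ObjectTypeJohn] - Information about the John log'
  '2020-02-04 06:50:34,465 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeBob] [ObjectTypeBob] - Information about the Bob log'
)

# Same pattern as in the answer: Group 1 = time, Group 2 = last [...] before the hyphen, Group 3 = message.
$pattern = '^\S+\s+(\S+).*(\[[^][]*])\s*(-.*)'

# -replace is applied to each array element; a line that does not match is returned unchanged.
$result = $lines -replace $pattern, '$1 $2 $3'
$result
# 04:00:33,435 [ObjectTypeJohn] - Information about the John log
# 06:50:34,465 [ObjectTypeBob] - Information about the Bob log
```

Because the `.*` before Group 2 is greedy, the regex engine backtracks from the end of the line, so Group 2 ends up being the last bracketed block that is directly followed by optional whitespace and a hyphen.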


2 Comments

Kimo Over a year ago

Thanks works perfectly! Could you please also show how to retrieve the brackets of [ServiceType] instead of [ObjectType] ?

Wiktor Stribiżew Over a year ago

@Kimo Like this.
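
The demo linked in that reply is not preserved here. One possible variation, offered only as a sketch and not necessarily what the reply linked to, is to let the pattern consume the final bracketed block without capturing it, so that Group 2 lands on the block before it. It assumes every line has the same layout as the sample input; `$line`, `$servicePattern` and `$serviceResult` are illustrative names.

```powershell
# Sample line copied from the question.
$line = '2020-02-04 04:00:33,435 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeJohn] [ObjectTypeJohn] - Information about the John log'

# Assumption: the hyphen is always preceded by two bracketed blocks, [ServiceType] [ObjectType].
# The extra \s*\[[^][]*] after Group 2 matches [ObjectType...] but does not capture it.
$servicePattern = '^\S+\s+(\S+).*(\[[^][]*])\s*\[[^][]*]\s*(-.*)'

$serviceResult = $line -replace $servicePattern, '$1 $2 $3'
$serviceResult
# 04:00:33,435 [ServiceTypeJohn] - Information about the John log
```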

mklement0 · Accepted Answer · 2020-03-11 14:21:44Z


As an alternative to a regex solution, consider use of the unary form of the -split operator, which makes for a conceptually simpler solution:

(Get-Content Output.txt).ForEach({ 
  # Split line into an array of fields by whitespace.
  $fields = -split $_ 
  # Extract the fields of interest by index and re-join with spaces.
  $fields[1, 5 + 6..($fields.Count-1)] -join ' ' 
})

The unary form of -split behaves similarly to the Unix awk utility, in that it tokenizes a line by any non-empty runs of whitespace, ignoring leading and trailing whitespace.

Note that the solution above relies on the fields before the - not containing whitespace themselves, which is true for the sample input.
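
To make the index arithmetic concrete, here is a small sketch that applies the same steps to a single sample line from the question. The `$line`, `$indices` and `$summary` variables are illustrative names introduced for this example only.

```powershell
# Sample line copied from the question.
$line = '2020-02-04 04:00:33,435 [z4y6480f-214b-4253-9223-n02542f706ac] [INFO] [ServiceTypeJohn] [ObjectTypeJohn] - Information about the John log'

# Unary -split: tokenize by runs of whitespace.
$fields = -split $line

# Field layout for this input:
#   0 = date, 1 = time, 2 = id, 3 = level, 4 = service type, 5 = object type,
#   6 = '-', 7.. = the words of the message
$fields[1]   # 04:00:33,435
$fields[5]   # [ObjectTypeJohn]

# ',' and '..' bind more tightly than '+', so this is (1, 5) + (6..last):
# one flat array of indices.
$indices = 1, 5 + 6..($fields.Count - 1)

$summary = $fields[$indices] -join ' '
$summary
# 04:00:33,435 [ObjectTypeJohn] - Information about the John log
```

One consequence of splitting and re-joining is that any run of several spaces inside the message comes back as a single space.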

edited Mar 11, 2020 at 14:21

answered Mar 11, 2020 at 14:09

