Python extracting string

Question

I have a dataframe where one of the columns, stored as strings, looks like this:

    filename
 0  Machine02-2022-01-28_00-21-45.blf.424
 1  Machine02-2022-01-28_00-21-45.blf.425
 2  Machine02-2022-01-28_00-21-45.blf.426
 3  Machine02-2022-01-28_00-21-45.blf.427
 4  Machine02-2022-01-28_00-21-45.blf.428

I want my column to look like this

      filename
 0    2022-01-28 00-21-45 424
 1    2022-01-28 00-21-45 425
 2    2022-01-28 00-21-45 426
 3    2022-01-28 00-21-45 427
 4    2022-01-28 00-21-45 428

I tried this code

df['filename'] = df['filename'].str.extract(r"(\d{4}-\d{1,2}-\d{1,2})_(\d{2}-\d{2}-\d{2}).*\.(\d+)", r"\1 \2 \3")

I get this error: unsupported operand type(s) for &: 'str' and 'int'.
Can anyone please tell me what I am doing wrong?

Not sure why you're getting this error, but here is what it means: & is the bitwise AND operator, which applies AND bit by bit (hence the name). Python can apply it to two ints, but not to a string and an int. You get a similar error when you try to add an int and a string with +.
– white, Commented Mar 23, 2022 at 8:01
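
A likely explanation for the error, offered as a sketch rather than a confirmed diagnosis: Series.str.extract takes the pattern as its first argument and regex flags as its second, so the replacement template r"\1 \2 \3" is treated as flags instead of a template. The exact error text may differ between pandas versions. The example below keeps the pattern from the question, calls str.extract with the pattern only, and joins the captured groups afterwards. The two-row frame is made up for illustration.

```python
import pandas as pd

# Illustrative sample data (assumed; mirrors the rows shown in the question).
df = pd.DataFrame({"filename": [
    "Machine02-2022-01-28_00-21-45.blf.424",
    "Machine02-2022-01-28_00-21-45.blf.425",
]})

# Same pattern as in the question: date, time, and trailing numeric suffix.
pattern = r"(\d{4}-\d{1,2}-\d{1,2})_(\d{2}-\d{2}-\d{2}).*\.(\d+)"

# str.extract accepts a pattern (plus optional flags) and returns a DataFrame
# with one column per capture group, labelled 0, 1, 2 for unnamed groups.
parts = df["filename"].str.extract(pattern)

# Build the target string by joining the three groups with single spaces.
df["filename"] = parts[0] + " " + parts[1] + " " + parts[2]
print(df)
```

Rows that do not match the pattern come back as missing values, which makes unexpected filenames easy to spot.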

Corralien · Accepted Answer · 2022-03-23 07:58:15Z

5

Use str.replace and add .*- to remove strings like Machine02-:

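# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
df['filename'] = df['filename'].str.replace(r".*-(\d{4}-\d{1,2}-\d{1,2})_(\d{2}-\d{2}-\d{2}).*\.(\d+)", r"\1 \2 \3", regex=True)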
print(df)

# Output
                  filename
0  2022-01-28 00-21-45 424
1  2022-01-28 00-21-45 425
2  2022-01-28 00-21-45 426
3  2022-01-28 00-21-45 427
4  2022-01-28 00-21-45 428


7 Comments

Karma_X Over a year ago

I tried this, it didn't work

Corralien Over a year ago

I don't understand, it seems to work with your example, no?

Karma_X Over a year ago

It's not working with my example

Abhigyan Over a year ago

@Vinay_S this does exactly what you want, as advertised, I have tried it out. What error are you facing?

Karma_X Over a year ago

@Corralien Thanks for this input. I had a leading space, that's why it was not working. Now it's working. Thanks once again
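
The thread above traces the failure to stray whitespace in the real data. A minimal sketch, assuming the whitespace sits at the start or end of each value: strip it first, then apply the same replacement. The sample rows, including the Mach2 prefix, are made up for illustration.

```python
import pandas as pd

# Illustrative data with stray whitespace around the values (assumed).
df = pd.DataFrame({"filename": [
    "  Machine02-2022-01-28_00-21-45.blf.424",
    "Mach2-2022-01-28_00-21-45.blf.425 ",
]})

# Pattern from the answer: skip the prefix up to a hyphen, then capture
# the date, the time, and the numeric suffix after the last dot.
pattern = r".*-(\d{4}-\d{1,2}-\d{1,2})_(\d{2}-\d{2}-\d{2}).*\.(\d+)"

df["filename"] = (
    df["filename"]
    .str.strip()                                    # drop leading/trailing whitespace
    .str.replace(pattern, r"\1 \2 \3", regex=True)  # rebuild as "date time suffix"
)
print(df)
```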


prahasanam_boi · Accepted Answer · 2022-03-23 08:03:07Z

4

Please try this:

# n must be passed by keyword on pandas 2.0+
df['filename'] = df['filename'].str.split('-', n=1).apply(lambda x: ' '.join(x[1].split('_')).replace('.blf.', ' '))
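
The one-liner packs three steps into a lambda. As a sketch of the same logic on a single string, the plain-Python function below (reformat_filename is a hypothetical name, not part of the answer) spells the steps out. It assumes every value contains at least one hyphen after the machine prefix.

```python
def reformat_filename(name: str) -> str:
    """Hypothetical helper mirroring the split/join/replace steps of the one-liner."""
    # Step 1: drop everything up to and including the first hyphen (the machine prefix).
    _prefix, rest = name.split("-", 1)
    # Step 2: turn the underscore between date and time into a space.
    rest = " ".join(rest.split("_"))
    # Step 3: turn the ".blf." separator before the numeric suffix into a space.
    return rest.replace(".blf.", " ")


example = reformat_filename("Machine02-2022-01-28_00-21-45.blf.424")
print(example)
```

Because only the first hyphen is used as the split point, the prefix itself can be any text that does not contain a hyphen.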


wwnde · Accepted Answer · 2022-03-23 08:34:03Z

4

Use replace

df['filename'] = df['filename'].str.replace(r'Machine|\.blf\.', ' ', regex=True).str.strip().str.replace(r'^\d+-', '', regex=True)

# Output
                  filename
0  2022-01-28_00-21-45 424
1  2022-01-28_00-21-45 425
2  2022-01-28_00-21-45 426
3  2022-01-28_00-21-45 427
4  2022-01-28_00-21-45 428

or

Extract values between e02 and .blf

df['filename'] = df['filename'].str.extract(r'((?<=[e02])[\w|\-]+(?=[.blf]))', expand=False)

# Output
                 filename
0  02-2022-01-28_00-21-45
1  02-2022-01-28_00-21-45
2  02-2022-01-28_00-21-45
3  02-2022-01-28_00-21-45
4  02-2022-01-28_00-21-45

edited Mar 23, 2022 at 8:34

answered Mar 23, 2022 at 8:05


1 Comment

Corralien Over a year ago

I think you missed the suffix (424, 425, 426, ...)

Ziur Olpa · Accepted Answer · 2022-03-23 08:04:29Z

1

Regexes are nice, but sometimes it is easier and more readable to use a plain replace, if the arguments won't ever change:

df['filename'] = df['filename'].str.replace('Machine02-','',regex=False)
df['filename'] = df['filename'].str.replace('.blf.',' ',regex=False)


1 Comment

Karma_X Over a year ago

Thanks for this answer. I am currently using it, but it only works if the string contains 'Machine02-'. Sometimes I have other names such as M2, Mach2, etc. I have a huge dataset and can't keep checking for each one, so I am trying the other method that I described in the question.
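
For data with varying prefixes such as M2 or Mach2, one option is a pattern that ignores the prefix entirely and reports anything it cannot parse. The sketch below uses only the standard re module. reformat_filename is a hypothetical helper, and it assumes each valid name has the form prefix, hyphen, date, underscore, time, then a final dot followed by digits.

```python
import re

# Prefix-agnostic pattern: anything up to a hyphen, then date, time, numeric suffix.
PATTERN = re.compile(r".*-(\d{4}-\d{1,2}-\d{1,2})_(\d{2}-\d{2}-\d{2}).*\.(\d+)")


def reformat_filename(name):
    """Hypothetical helper: return 'date time suffix', or None if the name does not match."""
    match = PATTERN.fullmatch(name.strip())
    if match is None:
        return None  # lets the caller find rows that need manual attention
    return " ".join(match.groups())


samples = [
    "Machine02-2022-01-28_00-21-45.blf.424",
    "M2-2022-01-28_00-21-45.blf.425",
    "Mach2-2022-01-28_00-21-45.blf.426",
    "notes.txt",  # deliberately malformed
]
results = [reformat_filename(s) for s in samples]
print(results)
```

On a dataframe, the same function could be applied with df['filename'].map(reformat_filename), after which the unparsed rows show up as missing values.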
