I have a column of strings in my dataframe in which many spelling variants refer to the same thing, and I have replaced them all with a single value:

df2['x'] = df2['x'].replace(['APPEAL','AppealNo.','AppealNO.','Co.Appeal','COMP.APPL','Comp.','AppealNo','CoAppealnies','companyappealno.','CompAPPNo.','CApealNo','CApeal','companyappeal'],'CoAppeal', regex=True)

Is there a way in Python to anticipate future variants of these strings, based on the ones given, so that I can replace them without maintaining the list manually?

  • What do you mean by "future combinations"? Some function that takes CoAppeal and returns a list of variations that will likely appear in your data? Commented Jul 21, 2022 at 16:32
  • Yes, any further combinations that occur at some point when I'm fetching data. Commented Jul 21, 2022 at 16:34
  • There is a concept of distance: given an arbitrary string x, you can compute the distance between x and "CoAppeal", and make the replacement if that distance is small enough. The two approaches are equivalent, in the sense that the list you want is the list of all strings within a small enough distance, but defining the distance function is likely easier than searching all possible strings. Commented Jul 21, 2022 at 16:34
  • Can you write down the code? Commented Jul 21, 2022 at 16:35
  • This is more of a "the function exists, but it's not obvious which exact function you need" situation. A simple example is Levenshtein distance, but this is unlikely to be sufficient in your case. Commented Jul 21, 2022 at 16:37
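
The distance idea from the comments could be sketched roughly as below. This is only an illustration under assumptions, not a tested solution for the real data: it uses the standard library's difflib.SequenceMatcher similarity ratio instead of Levenshtein distance, and the helper names (normalize, canonicalize), the letter-only normalization, and the 0.8 threshold are all hypothetical choices that would need tuning.

```python
import re
from difflib import SequenceMatcher

CANONICAL = "CoAppeal"

# Spellings already listed in the question, used as reference points.
KNOWN_VARIANTS = [
    "APPEAL", "AppealNo.", "Co.Appeal", "COMP.APPL", "Comp.",
    "CoAppealnies", "companyappealno.", "CompAPPNo.", "CApealNo",
    "CApeal", "companyappeal",
]


def normalize(text):
    """Lowercase the text and drop every character that is not a letter."""
    return re.sub(r"[^a-z]", "", text.lower())


# Normalized reference spellings, including the canonical form itself.
REFERENCES = {normalize(v) for v in KNOWN_VARIANTS} | {normalize(CANONICAL)}


def canonicalize(value, threshold=0.8):
    """Return CANONICAL if value is similar enough to any reference spelling.

    threshold is an assumed cut-off on difflib's similarity ratio (0 to 1);
    values below it are returned unchanged.
    """
    key = normalize(value)
    if not key:
        return value
    best = max(SequenceMatcher(None, key, ref).ratio() for ref in REFERENCES)
    return CANONICAL if best >= threshold else value
```

On the dataframe from the question this could be applied as df2['x'] = df2['x'].map(canonicalize), assuming the column holds only strings. Short references such as "Comp." make false positives more likely, so it is worth checking what gets replaced on a sample before trusting any threshold.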
