I'm trying to use Python's re module to find filenames matching a certain pattern. I have a small test script with all values hardcoded, although that isn't how it will normally be used:

#!/usr/bin/env python

try:
    import re2 as re
except ImportError:
    import re

filepath1 = "C:\Users\Administrator\AppData\Local\Temp\77ce4ba2a605e22b8699eef874d075fb585d259ed6cade2e503e6dbf58020aa0.exe:Zone.Identifier"
filepath2 = "C:\Users\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier"
re_pattern = re.compile("C\:\\\\Users\\\\[^\\\\]*\\\\AppData\\\\Local\\\\Temp\\\\[^.]*\.exe\:Zone\.Identifier")

print "1: " + str(re_pattern.search(filepath1))
print "2: " + str(re_pattern.search(filepath2))

For some reason this returns None for 1 and a match for 2, but as far as I can work out both should match. It's probably just a stupid mistake, but if someone can spot it that would be awesome.

Basically, the pattern should match any .exe with a Zone.Identifier stream in the %TEMP% directory, regardless of username.

1 Comment

Ever heard of the raw prefix? That would avoid doubling the backslashes. Commented May 28, 2018 at 16:00

2 Answers

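The issue is that the file name in the first path starts with 77. Because the string literal has no raw prefix, Python reads \77 as an octal escape (the character ?), so the string no longer contains the backslash your regex expects. If you try \7 in the console, you'll see that it's interpreted as a character code: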

>>> print("\7")
<some garbage char, bell?>
>>> print(r"\7")
\7

That explains why your regex doesn't match that particular path. With the other path you were "lucky": you're using Python 2, where a backslash followed by an uppercase letter (or by s, as in \svchost) isn't an escape sequence, so it's left unchanged. In Python 3, \U starts a Unicode escape, so the same literal would not even compile.
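To make the effect concrete, here is a minimal sketch in Python 3 using a shortened, hypothetical file name rather than the full path from the question (a non-raw `"C:\Users..."` literal is a syntax error there). It only shows what the octal escape does to the string:

```python
# Shortened, hypothetical file name; only the leading "\77" matters here.
plain = "\77ce4b.exe"   # no raw prefix: "\77" is octal 77, i.e. chr(63), "?"
raw = r"\77ce4b.exe"    # raw prefix: the backslash and both digits are kept

print(repr(plain))
print(repr(raw))

# The non-raw literal lost its backslash, so a regex expecting "\\" before
# the file name can no longer match it.
starts_with_backslash = (plain.startswith("\\"), raw.startswith("\\"))
print(starts_with_backslash)
```

The raw literal is two characters longer, which is exactly the backslash and the extra digit that the escape swallowed.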

Now, for paths, in this simple case you could use fnmatch instead, which matches shell-style wildcards rather than regexes:

import fnmatch

filepath1 = r"C:\Users\Administrator\AppData\Local\Temp\77ce4ba2a605e22b8699eef874d075fb585d259ed6cade2e503e6dbf58020aa0.exe:Zone.Identifier"
filepath2 = r"C:\Users\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier"
filepath3 = r"C:\Urs\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier"

for f in (filepath1,filepath2,filepath3):
    print(f,fnmatch.fnmatch(f,r"C:\Users\*\AppData\*\Temp\*.exe:Zone.Identifier"))

prints:

C:\Users\Administrator\AppData\Local\Temp\77ce4ba2a605e22b8699eef874d075fb585d259ed6cade2e503e6dbf58020aa0.exe:Zone.Identifier True
C:\Users\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier True
C:\Urs\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier False
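One caveat worth checking before relying on this: in fnmatch, `*` matches any run of characters, including backslashes, so the wildcard pattern is looser than the `[^\\]*` in the original regex. The sketch below uses a hypothetical nested path (not from the question) and `fnmatch.fnmatchcase`, which skips the OS-dependent case normalization, to show the difference:

```python
import fnmatch

pattern = r"C:\Users\*\AppData\Local\Temp\*.exe:Zone.Identifier"

# Hypothetical paths for illustration only.
direct = r"C:\Users\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier"
nested = r"C:\Users\Administrator\AppData\Local\Temp\sub\dir\tool.exe:Zone.Identifier"

# fnmatchcase compares case-sensitively on every platform.
direct_ok = fnmatch.fnmatchcase(direct, pattern)
nested_ok = fnmatch.fnmatchcase(nested, pattern)

print(direct_ok, nested_ok)
```

If files in subdirectories of Temp should not match, a regex with an explicit "no backslash" character class is probably the safer choice.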

Sorry, I misunderstood your question.

import re
filepath1 = r"C:\Users\Administrator\AppData\Local\Temp\77ce4ba2a605e22b8699eef874d075fb585d259ed6cade2e503e6dbf58020aa0.exe:Zone.Identifier"
filepath2 = r"C:\Users\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier"

print(re.search(r"C\:\\Users\\(.*)\\AppData\\Local\\Temp\\[a-zA-Z0-9]+\.exe\:Zone\.Identifier$", filepath1))
print(re.search(r"C\:\\Users\\(.*)\\AppData\\Local\\Temp\\[a-zA-Z0-9]+\.exe\:Zone\.Identifier$", filepath2))

Output:

<_sre.SRE_Match object at 0x03176AE0>
<_sre.SRE_Match object at 0x03176AE0>

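Please note the raw-string prefix (r) on the file paths and on the pattern. Without it, Python processes backslash escapes in the literal before the regex ever sees the string.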
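For comparison, the pattern from the question can be kept almost as written once both the paths and the pattern are raw strings. This is a sketch in Python 3, not a drop-in replacement for the original script; the name `ZONE_ID_RE` and the third, deliberately non-matching path are added here for illustration:

```python
import re

# In a raw string, each "\\" is a regex escape for one literal backslash.
ZONE_ID_RE = re.compile(
    r"C:\\Users\\[^\\]*\\AppData\\Local\\Temp\\[^.]*\.exe:Zone\.Identifier"
)

paths = [
    r"C:\Users\Administrator\AppData\Local\Temp\77ce4ba2a605e22b8699eef874d075fb585d259ed6cade2e503e6dbf58020aa0.exe:Zone.Identifier",
    r"C:\Users\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier",
    r"C:\Urs\Administrator\AppData\Local\Temp\svchost.exe:Zone.Identifier",
]

# True where the path matches the pattern, False otherwise.
results = [ZONE_ID_RE.search(p) is not None for p in paths]
print(results)
```

Compiling the pattern once and reusing it also avoids repeating the long literal for every path.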

1 Comment

Thanks! This is what I was looking for. Marked the other as the answer though since the detail might help others :)
