I am trying to run a join command within Python, and I'm being foiled by subprocess. I'm combining thousands of large files iteratively, so a dictionary would require a lot of memory. My rationale is that join only has to deal with two files at a time, so my memory overhead will be lower.
I have tried many different versions of this trying to get subprocess to run. Can anyone explain why this is not working? When I print the cmd and execute it myself on the shell, it runs perfectly.
cmd = "join <(sort %s) <(sort %s)" % (outfile, filename)
with open(out_temp, 'w') as out:
    return_code = subprocess.call(cmd, stdout=out, shell=True)
if return_code != 0:
    print "not working!"
    break
The error produced looks like this:
/bin/sh: -c: line 0: syntax error near unexpected token `('
I have also tried turning the command into a list, but I'm not sure how the command is supposed to be split into list elements. Can anyone explain? Here outfile and filename are variables:
["join", "<(sort", outfile, ") <(sort", filename, ")"]
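On the list form: as far as I understand, each list element is passed to the program as exactly one argument and no shell is involved, so a piece like "<(sort" would reach join as a literal string instead of being interpreted. Below is a minimal sketch of a shell-free variant, assuming it is acceptable to write sorted copies to temporary files first. The function name join_without_shell and the temporary file names are hypothetical, and the sketch is written for Python 3.

```python
import os
import shutil
import subprocess
import tempfile


def join_without_shell(left_path, right_path, out_path):
    """Sort both inputs into temporary files, then join them (no shell)."""
    tmp_dir = tempfile.mkdtemp()
    sorted_left = os.path.join(tmp_dir, "left.sorted")    # hypothetical name
    sorted_right = os.path.join(tmp_dir, "right.sorted")  # hypothetical name
    try:
        # Each list element is one argv entry; nothing is parsed by a shell.
        subprocess.check_call(["sort", "-o", sorted_left, left_path])
        subprocess.check_call(["sort", "-o", sorted_right, right_path])
        with open(out_path, "w") as out:
            # Returns join's exit status; 0 means success.
            return subprocess.call(["join", sorted_left, sorted_right], stdout=out)
    finally:
        shutil.rmtree(tmp_dir)
```

The trade-off is extra disk I/O for the sorted copies, in exchange for not depending on any shell syntax.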
Any help would be appreciated! I'm doing this in Python because I'm heavily parsing filenames upstream to figure out which files to combine.
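One possible explanation, offered as an assumption and not a confirmed diagnosis: the error is reported by /bin/sh, and with shell=True subprocess runs the command through /bin/sh on POSIX systems, while the <(...) process substitution syntax is a bash feature. Here is a minimal sketch that asks subprocess to use bash through the executable argument, assuming bash is installed at /bin/bash. The helper name join_sorted is hypothetical, and shlex.quote requires Python 3.

```python
import shlex
import subprocess


def join_sorted(left_path, right_path, out_path, bash_path="/bin/bash"):
    """Run `join <(sort left) <(sort right)` under bash, writing to out_path."""
    # Quote the paths so spaces or shell metacharacters in them are safe.
    cmd = "join <(sort {}) <(sort {})".format(
        shlex.quote(left_path), shlex.quote(right_path)
    )
    with open(out_path, "w") as out:
        # executable= replaces the default /bin/sh used when shell=True.
        # bash_path is an assumption; adjust it if bash lives elsewhere.
        return subprocess.call(cmd, stdout=out, shell=True, executable=bash_path)
```

This keeps the original one-line command and only changes which shell interprets it.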