9

In Python, I often reuse variables in a manner analogous to this:

files = files[:batch_size]

I like this technique because it helps me cut down on the number of variables I need to track.

I have never had any problems with it, but I am wondering whether I am missing potential downsides, e.g. in performance.

4
  • I don't see what's the question here. What is the alternative to compare against? Using a second variable like files = XYZ; files_head = files[:batch_size]? Why should there be any difference? Commented Jan 29, 2012 at 16:08
  • alternative being something like: new_set_of_files=files[:batch_size] Commented Jan 29, 2012 at 16:09
  • You'd notice the main one right away: Hey! I still need that old value for files!. Commented Jan 29, 2012 at 16:14
  • the alternative is to use tons of extra variables: unused1 = files[0], unused2 = 'foobar', unused3 = -1, veryunused = None. Indeed, that doesn't make the code very readable. But someone might like it. Seriously, what is your question? Commented Jan 29, 2012 at 16:15

4 Answers

8

There is no technical downside to reusing variable names. However, if you reuse a variable and change its "purpose", that may confuse others reading your code (especially if they miss the reassignment).

In the example you've provided, though, realize that you are actually creating an entirely new list when you slice. Until the old list is reclaimed, both lists exist in memory at once (the new one is a shallow copy, so the elements themselves are shared, not duplicated). An alternative is to iterate over the list and stop once you have processed batch_size elements instead of running to the end, or, even more succinctly, to truncate the list in place with del files[batch_size:].
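A minimal sketch of the difference between the two approaches. The helper names take_batch_by_slicing and take_batch_in_place are made up for illustration and do not come from the question. Slicing leaves the caller's list alone and returns a new one, while del shrinks the existing list, which every other reference to that list will also see:

```python
def take_batch_by_slicing(files, batch_size):
    """Return a new list; the caller's list is left untouched."""
    return files[:batch_size]


def take_batch_in_place(files, batch_size):
    """Shrink the caller's list itself; no second list is created."""
    del files[batch_size:]
    return files


original = ["a.txt", "b.txt", "c.txt", "d.txt"]  # placeholder data

sliced = take_batch_by_slicing(original, 2)
print(sliced is original)   # False: two distinct list objects
print(original)             # ['a.txt', 'b.txt', 'c.txt', 'd.txt']

truncated = take_batch_in_place(original, 2)
print(truncated is original)  # True: same object, now shorter
print(original)               # ['a.txt', 'b.txt']
```

The in-place version is only safe when no other part of the program still expects the full list.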


3 Comments

+1 Nice point about how you can avoid creating a new object (though perhaps it's less readable?)
@cheeken: Yet another alternative (probably the most Pythonic) is to create a generator with itertools.islice.
Good Call, @NiklasBaumstark! To whomever downvoted: I would appreciate a comment explaining why so that I might correct it.
5

Some info on that specific example: if you just want to iterate over, map, or filter the result, you can use a lazy iterator to avoid copying the list:

import itertools
files = itertools.islice(files, batch_size)
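One thing to keep in mind with this approach: itertools.islice returns a one-shot iterator, not a list, so len(), indexing, and a second pass over the result will not work. A small sketch with placeholder file names (the variable names here are illustrative):

```python
import itertools

files = ["a.txt", "b.txt", "c.txt", "d.txt"]  # placeholder data
batch_size = 2

batch = itertools.islice(files, batch_size)   # nothing is copied yet

first_pass = list(batch)    # consumes the iterator
second_pass = list(batch)   # the iterator is already exhausted

print(first_pass)    # ['a.txt', 'b.txt']
print(second_pass)   # []
```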

As for the general case: Whether you assign the new value to an already existing name or to a new name should make absolutely no difference (at least from the point of view of the interpreter/VM). Both methods produce almost the exact same bytecode:

Python 2.7.2 (default, Nov 21 2011, 17:25:27) 
[GCC 4.6.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> def func1(files):
...   files = files[:100]
... 
>>> def func2(files):
...   new_files = files[:100]
... 
>>> dis.dis(func1)
  2           0 LOAD_FAST                0 (files)
              3 LOAD_CONST               1 (100)
              6 SLICE+2             
              7 STORE_FAST               0 (files)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        
>>> dis.dis(func2)
  2           0 LOAD_FAST                0 (files)
              3 LOAD_CONST               1 (100)
              6 SLICE+2             
              7 STORE_FAST               1 (new_files)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        

The same can be observed in Python 3.
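The exact opcodes differ between Python versions, so rather than relying on one particular disassembly, a version-independent way to see the single difference is to inspect the code objects. This sketch reuses func1 and func2 from the session above; the only change is that func2 needs a second local variable slot, which is what the differing STORE_FAST argument reflects:

```python
def func1(files):
    files = files[:100]


def func2(files):
    new_files = files[:100]


# co_varnames lists a function's local variable slots, arguments first.
print(func1.__code__.co_varnames)  # ('files',)
print(func2.__code__.co_varnames)  # ('files', 'new_files')

# co_nlocals is the number of those slots.
print(func1.__code__.co_nlocals, func2.__code__.co_nlocals)  # 1 2
```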

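In fact, any difference is likely to be negligible: local variables are resolved to numbered slots at compile time (the 0 and 1 after STORE_FAST above), so no name lookup or lookup cache is involved at run time. If anything, func1 needs one local slot fewer.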

2 Comments

Can I ask what difference that different 1 makes?
@Rik: I think it's the index of the affected local variable (0 in the first case, because this is the first accessed variable, 1 in the second case). I am not 100% sure, though.
1

There really aren't many downsides to reusing variables, but there aren't many advantages either. The old object still has to be reclaimed by Python's memory management once nothing refers to it, and rebinding the name does not avoid allocating the new object, unlike in statically compiled languages such as C, where reusing a variable reuses its storage and no new allocation takes place.
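When the old object is actually reclaimed depends on the implementation. As a rough sketch, assuming CPython (which uses reference counting), the old list can be observed going away as soon as the name is rebound. TrackedList is a hypothetical helper class, needed only because built-in lists cannot be weakly referenced:

```python
import weakref


class TrackedList(list):
    """Plain list subclass; built-in lists cannot be weakly referenced."""


files = TrackedList(["a.txt", "b.txt", "c.txt"])  # placeholder data
old_list = weakref.ref(files)    # a weak reference does not keep the list alive

files = files[:2]                # rebinding drops the last strong reference

# On CPython this prints True; other implementations may reclaim the
# old list later, whenever their garbage collector runs.
print(old_list() is None)
```

Had the slice been bound to a new name instead, the original list would stay alive for as long as files still referred to it.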

Further, you can truly confuse any future readers of your code, who generally expect new objects to have new names (a byproduct of garbage-collected languages).


1

The downside would be that, once files has been reassigned, you can no longer use:

file_rest = files[batch_size:]
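If both parts are needed, one option is to bind them while files still refers to the full list. A small sketch with placeholder data (batch is a hypothetical name; file_rest follows the line above):

```python
files = ["a.txt", "b.txt", "c.txt", "d.txt"]  # placeholder data
batch_size = 3

# Take both slices before the original name is reused.
batch, file_rest = files[:batch_size], files[batch_size:]

print(batch)      # ['a.txt', 'b.txt', 'c.txt']
print(file_rest)  # ['d.txt']
```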

Regarding performance, there is no downside. On the contrary, you might in principle even improve performance slightly by avoiding hash collisions in the same namespace.

There was an SO post regarding this in another context.

