Python function returns different result with memoization

Question

My Python pathfinding function returns different results if I use a memoizing decorator. It returns a correct value on its own, but after being memoized, it returns an incorrect value.

The function I'm talking about looks like this:

@functools.lru_cache(maxsize=None)
def recursiveTraversal(originIndex, target, steps):
    startingVertex = data[originIndex]

    if startingVertex["ID"] == target:
        if steps == 0:
            return Path(0, [])
        else:
            return None
    else:
        if steps == 0:
            return None
        else:
            potentialPaths = []
            for i in startingVertex["Edges"]:
                nextVertex = data[i]
                nextPath = recursiveTraversal(i, target, steps - 1)
                if nextPath == None:
                    continue
                nextPath.weight += int(nextVertex["Weight"])
                nextPath.vertices.append(i)

                potentialPaths.append(nextPath)
            if len(potentialPaths) > 0:
                minPath = min(potentialPaths, key=lambda x: x.weight)
                return minPath
            else:
                return None

A complete runnable example can be found here. The upper part of the file is all data, and the code is at the bottom. To reproduce this, just comment out line 15, and observe that the output is different.

How can I get the memoized version to output the same thing as the normal version?

Can you give us a minimal reproducible example? First, that input should be part of the question, not a comment. More importantly, if you just run recursiveTraversal(0, "1", 3) on the code we're given, it's obviously just going to raise a NameError on data. — abarnert
– abarnert, Commented Jul 29, 2018 at 22:46
Also, you say the data here wasn't changed at all… but aren't those nextPath values just references to some node in data? From a quick glance, I don't see anywhere that could be returning anything else. And you immediately modify whatever's returned by changing its weight and appending another vertex. — abarnert
– abarnert, Commented Jul 29, 2018 at 22:49
nextVertex is a reference to data, but I'm actually changing nextPath, which is a Path object, not a reference to data. — Reubend
– Reubend, Commented Jul 29, 2018 at 22:53
I've just updated this with a runnable example (complete with data and the Path class). — Reubend
– Reubend, Commented Jul 29, 2018 at 23:19

a_guest · Accepted Answer · 2018-07-30 01:31:57Z

1

The problem is that you are modifying attributes of the return values of recursiveTraversal. This function returns Path objects and you modify their attributes weight and vertices. So for the non-cached version each time you call the function with (x, y, z) arguments a fresh Path(0, []) object will be created and its attributes are modified later on in the for loop. But for each (x, y, z) call you are assured to start from a fresh object. Now for the cached version, instead of providing a fresh object by going all the way down the recursive tree, the cache wrapper just provides you with the instance of a previously created Path object (which already has modified weight and vertices attributes) and these are modified even further (i.e. this modifies the cache). This can be seen from the following example:

# Augment `Path` class with `__repr__`.
class Path:
    # Other methods go here.

    def __repr__(self):
        return '{}({}, {})'.format(self.__class__.__name__, repr(self.weight), repr(self.vertices))

data = [
    {'ID': '2', 'Weight': 1, 'Edges': [1]},
    {'ID': '1', 'Weight': 1, 'Edges': []}
]

print(recursiveTraversal(0, '1', 1))  # Prints "Path(1, [1])".
print(recursiveTraversal(1, '1', 0))  # Prints "Path(1, [1])".

Checking the function recursiveTraversal is seems that for steps=0 it should return Path(0, []) in case the target matches. Nevertheless it returns Path(1, [1]). This happens because the previous call to recursiveTraversal already invoked recursiveTraversal(1, '1', 0) and modified the result's weight and vertices attributes. When performing the second explicit call to recursiveTraversal(1, '1', 0) you'll get back the cached reference to that object.

Possible solution

A possible solution is to create copies of cached objects before modifying them further. This prevents that the cache gets modified.

from copy import deepcopy

# Within `recursiveTraversal`:
# ...
nextPath = deepcopy(recursiveTraversal(i, target, steps - 1))
# ...

answered Jul 30, 2018 at 1:31

a_guest

36.7k15 gold badges75 silver badges137 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Reubend Over a year ago

Thank you so much, this is exactly right! For anyone else who encounters this, it seems the cache stores references to the return values, not the return values themselves. That's the root of my problem.

a_guest Over a year ago

@Reubend The cache stores exactly what the function returns. However, in Python, the notion of a variable is different than from example in C++. This post and this post will give you more insight.

Collectives™ on Stack Overflow

Python function returns different result with memoization

1 Answer 1

Possible solution

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Possible solution

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related