
I am implementing a function that involves operations on numpy arrays, and I am getting a MemoryError. I am explicitly stating the dimensions of the numpy arrays that are creating the issue:

import gc

import numpy as np

a = np.random.rand(15239, 1)
b = np.random.rand(1, 329960)
c = np.subtract(a, b)**2  # broadcasts to shape (15239, 329960)
d = np.random.rand(15239, 1)
e = np.random.rand(1, 329960)
del a
gc.collect()
f = np.subtract(d, e)**2  # also (15239, 329960)
del d
gc.collect()
g = np.sqrt(c + f).min(axis=0)
del c, f
gc.collect()

Running these lines gives a MemoryError.
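For reference, a minimal sketch of the same reduction evaluated over column blocks, which assumes only the column-wise minimum g is needed; the block width of 8192 is an arbitrary choice:

import numpy as np

a = np.random.rand(15239, 1)
b = np.random.rand(1, 329960)
d = np.random.rand(15239, 1)
e = np.random.rand(1, 329960)

g = np.empty(b.shape[1])
chunk = 8192  # arbitrary block width; tune to available RAM
for start in range(0, b.shape[1], chunk):
    sl = slice(start, start + chunk)
    # only a (15239, <=8192) block, plus a few temporaries of the
    # same size, is in memory at any one time (~1 GiB each at float64)
    block = (a - b[:, sl]) ** 2 + (d - e[:, sl]) ** 2
    g[sl] = np.sqrt(block, out=block).min(axis=0)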

The function that uses these operations is given below:

import numpy as np
from skimage.segmentation import find_boundaries

# w0 and sigma are weight-map parameters defined globally elsewhere
# (the UNet paper uses w0 = 10, sigma = 5)

def make_weight_map(masks):
    """
    Generate the weight maps as specified in the UNet paper
    for a set of binary masks.

    Parameters
    ----------
    masks: array-like
        A 3D array of shape (n_masks, image_height, image_width),
        where each slice of the matrix along the 0th axis represents one binary mask.

    Returns
    -------
    array-like
        A 2D array of shape (image_height, image_width)

    """
    masks = masks.numpy()  # the input is a torch tensor; convert to numpy
    nrows, ncols = masks.shape[1:]
    masks = (masks > 0).astype(int)
    distMap = np.zeros((nrows * ncols, masks.shape[0]))
    X1, Y1 = np.meshgrid(np.arange(nrows), np.arange(ncols))
    X1, Y1 = np.c_[X1.ravel(), Y1.ravel()].T
    for i, mask in enumerate(masks):
        # find the boundary of each mask,
        # compute the distance of each pixel from this boundary
        bounds = find_boundaries(mask, mode='inner')
        X2, Y2 = np.nonzero(bounds)
        # these intermediates have shape (n_boundary_pixels, nrows * ncols),
        # which is what exhausts memory
        xSum = (X2.reshape(-1, 1) - X1.reshape(1, -1)) ** 2
        ySum = (Y2.reshape(-1, 1) - Y1.reshape(1, -1)) ** 2
        distMap[:, i] = np.sqrt(xSum + ySum).min(axis=0)
    ix = np.arange(distMap.shape[0])
    if distMap.shape[1] == 1:
        d1 = distMap.ravel()
        border_loss_map = w0 * np.exp((-1 * (d1) ** 2) / (2 * (sigma ** 2)))
    else:
        if distMap.shape[1] == 2:
            d1_ix, d2_ix = np.argpartition(distMap, 1, axis=1)[:, :2].T
        else:
            d1_ix, d2_ix = np.argpartition(distMap, 2, axis=1)[:, :2].T
        d1 = distMap[ix, d1_ix]
        d2 = distMap[ix, d2_ix]
        border_loss_map = w0 * np.exp((-1 * (d1 + d2) ** 2) / (2 * (sigma ** 2)))
    xBLoss = np.zeros((nrows, ncols))
    xBLoss[X1, Y1] = border_loss_map
    # class weight map
    loss = np.zeros((nrows, ncols))
    w_1 = 1 - masks.sum() / loss.size
    w_0 = 1 - w_1
    loss[masks.sum(0) == 1] = w_1
    loss[masks.sum(0) == 0] = w_0
    ZZ = xBLoss + loss
    return ZZ

The traceback of the error when the code is used in the function is below. I am using a system with 32 GB of RAM, and I also tested the code on 61 GB of RAM:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-32-0f30ef7dc24d> in <module>
----> 1 img = make_weight_map(img)

<ipython-input-31-e75a6281476f> in make_weight_map(masks)
     34         xSum = (X2.reshape(-1, 1) - X1.reshape(1, -1)) ** 2
     35         ySum = (Y2.reshape(-1, 1) - Y1.reshape(1, -1)) ** 2
---> 36         distMap[:, i] = np.sqrt(xSum + ySum).min(axis=0)
     37     ix = np.arange(distMap.shape[0])
     38     if distMap.shape[1] == 1:

MemoryError:

I have checked the questions below but couldn't find a solution to my problem:
Python/Numpy Memory Error
Memory growth with broadcast operations in NumPy

This is another question with the memmap approach, but I don't know how to apply it in my use case.
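As I understand it, applying memmap here would look roughly like the sketch below: back the full-size intermediate with a file on disk and fill it block by block. The file name dist.tmp and the block width are arbitrary choices.

import numpy as np

a = np.random.rand(15239, 1)
b = np.random.rand(1, 329960)
d = np.random.rand(15239, 1)
e = np.random.rand(1, 329960)

# disk-backed buffer for the (15239, 329960) intermediate (~37 GiB on disk)
buf = np.memmap('dist.tmp', dtype=np.float64, mode='w+',
                shape=(a.shape[0], b.shape[1]))
chunk = 8192  # arbitrary block width
for start in range(0, b.shape[1], chunk):
    sl = slice(start, start + chunk)
    buf[:, sl] = np.sqrt((a - b[:, sl]) ** 2 + (d - e[:, sl]) ** 2)
buf.flush()
g = buf.min(axis=0)  # streams the file; the OS pages blocks in and out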

  • c is large, (15239,329960). Verify that. So is f. c+f produces another array of that size, and sqrt another. The result of min is smaller, like b, though I don't know if internally it has to make a temporary large array or not. Commented Oct 7, 2019 at 16:22
  • Hi, yes, c and f have a shape of (15239,329960), and c+f does too. Commented Oct 7, 2019 at 16:28

1 Answer


No mystery, these are really large arrays. At 64-bit precision, an array of shape (15239,329960) needs...

>>> np.prod((15239, 329960)) * 8 / 2**30
37.46345967054367

...about 37GiB! Things to try:

  • Reduce the bit-depth, e.g. use np.float16, requiring 25% of the memory.
  • Is the data actually dense, or can you use scipy.sparse?
  • Maybe it's time for dask? (See the sketch after this list.)
  • Get more RAM!
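For the dask route, a minimal sketch mirroring the standalone snippet from the question, assuming dask is installed; the chunk width of 8192 is an arbitrary choice:

import numpy as np
import dask.array as da

a = da.from_array(np.random.rand(15239, 1), chunks=(15239, 1))
b = da.from_array(np.random.rand(1, 329960), chunks=(1, 8192))
d = da.from_array(np.random.rand(15239, 1), chunks=(15239, 1))
e = da.from_array(np.random.rand(1, 329960), chunks=(1, 8192))

# lazy expression: each (15239, 8192) block (~1 GiB at float64) is
# computed and reduced independently, so the 37 GiB array never exists
g = da.sqrt((a - b) ** 2 + (d - e) ** 2).min(axis=0)
result = g.compute()  # numpy array of shape (329960,)

Smaller chunks lower the peak memory at the cost of more scheduling overhead.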

2 Comments

  • Getting "overflow occurred in square" when using `float16`. Yes, the data is dense. I guess I have to try using dask now.
  • Can you please have a look at this problem I am having while converting to a dask array: stackoverflow.com/questions/58277168/…
