Run Time of 2 for loops with Gradient calculation

I am trying to run the following code:

for ie in range(100):
    energy = torch.tensor([.1 + .1 * ie], dtype=torch.float32, requires_grad=True)
    # log_01 and log_10 are constants
    xe = .5 + 20 * (torch.log(energy) - log_01) / (log_10 - log_01)
    xe2 = xe * xe

    for it in range(100):
        theta = torch.tensor([it * theta_max / 99], dtype=torch.float32, requires_grad=True)
        # Y depends only on xe, not on theta
        Y[0] = torch.exp(PXmg1_p[0, 0]) + torch.exp(PXmg1_p[0, 1] * torch.pow(xe, PXmg1_p[0, 2]))
        Y[1] = torch.exp(PXmg2_p[0, 0]) + torch.exp(PXmg2_p[0, 1] * torch.pow(xe, PXmg2_p[0, 2]))
        Y[2] = torch.exp(PXmg3_p[0, 0]) + torch.exp(PXmg3_p[0, 1] * torch.pow(xe, PXmg3_p[0, 2]))
        Y[3] = torch.exp(PXmg4_p[0, 0]) + torch.exp(PXmg4_p[0, 1] * torch.pow(xe, PXmg4_p[0, 2]))

        # value of the cubic
        thisp0_mg[ie, it] = solvecubic(energy, theta, 0)

        # first, second, and third derivatives with respect to energy
        thisp0de_mg[ie, it] = solvecubic(energy, theta, 2)
        thisp0de2_mg[ie, it] = solvecubic(energy, theta, 22)
        thisp0de3_mg[ie, it] = solvecubic(energy, theta, 25)
        # first and second derivatives with respect to theta
        thisp0dth_mg[ie, it] = solvecubic(energy, theta, 3)
        thisp0dth2_mg[ie, it] = solvecubic(energy, theta, 32)

where

def solvecubic(energy, theta, mode):
    # Solve A @ B = Y for the cubic coefficients B (A and Y are globals)
    B = torch.linalg.solve(A, Y)

    # Evaluate the cubic val = sum_i B[i] * x**i at the rescaled angle x
    val = 0
    x = .5 + 4 * theta / theta_max

    for i in range(4):
        val += B[i] * x**i

    if mode == 0:
        return val.item()

    elif mode == 2 or mode == 22 or mode == 25:
        if mode == 2:
            # first derivative with respect to energy
            return torch.autograd.grad(val, energy, retain_graph=True)[0].item()

        if mode == 22:
            # second derivative with respect to energy
            first_der = torch.autograd.grad(val, energy, create_graph=True)[0]

            return torch.autograd.grad(first_der, energy, retain_graph=True)[0].item()

        if mode == 25:
            # third derivative with respect to energy
            first_der = torch.autograd.grad(val, energy, create_graph=True)[0]

            second_der = torch.autograd.grad(first_der, energy, create_graph=True)[0]

            # differentiate second_der (not first_der) to get the third derivative
            return torch.autograd.grad(second_der, energy, retain_graph=True)[0].item()

    elif mode == 3 or mode == 32:
        if mode == 3:
            # first derivative with respect to theta
            return torch.autograd.grad(val, theta, retain_graph=True)[0].item()

        if mode == 32:
            # second derivative with respect to theta
            first_der = torch.autograd.grad(val, theta, create_graph=True)[0]

            return torch.autograd.grad(first_der, theta, retain_graph=True)[0].item()

where A is a 4x4 Vandermonde matrix.
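Because val is a cubic in x and x depends linearly on theta, the two theta derivatives (modes 3 and 32) can also be written in closed form, which would avoid autograd for those two calls. A minimal sketch in plain Python, assuming B is already available as four coefficients; the helper name cubic_theta_derivs and the sample numbers are illustrative and not taken from the question:

```python
def cubic_theta_derivs(B, theta, theta_max):
    """Return (val, dval/dtheta, d2val/dtheta2) for val = sum_i B[i] * x**i,
    with x = 0.5 + 4 * theta / theta_max as in solvecubic."""
    x = 0.5 + 4.0 * theta / theta_max
    dx = 4.0 / theta_max  # dx/dtheta is constant, so the chain rule is a plain scaling

    val = B[0] + B[1] * x + B[2] * x**2 + B[3] * x**3
    dval_dx = B[1] + 2.0 * B[2] * x + 3.0 * B[3] * x**2
    d2val_dx2 = 2.0 * B[2] + 6.0 * B[3] * x

    return val, dval_dx * dx, d2val_dx2 * dx * dx


# Hypothetical coefficients and angle range, for illustration only.
B = [1.0, 2.0, 3.0, 4.0]
val, d1, d2 = cubic_theta_derivs(B, theta=0.5, theta_max=4.0)
```

The energy derivatives still need autograd (or a separate derivation), since B depends on energy through Y.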

This code seems to run really slowly. CUDA is not available on my computer, so I cannot use torch.cuda.empty_cache(). I have also tried del theta and del energy once the iteration over each value is done, but that doesn't seem to improve things much.

Is there a way to delete the computational graphs after each iteration so that I save some memory? The code is extremely slow as it stands.

Just to give an idea, I have tried the inner loop with range(1), range(2), and range(3). The run times were 1 min, 2 min 30 sec, and 10 min, respectively.
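One possible contributor to the growing run time is that Y is a single tensor written in place (Y[0] = ...) on every iteration, so its autograd history may keep accumulating instead of being discarded. In addition, each solvecubic call repeats the linear solve and the forward pass. A minimal sketch of an alternative, assuming placeholder values for A, the PXmg*_p parameters, theta_max, and the log_ constants (none of which are given here): build Y out of place with torch.stack so each call gets a fresh graph, and take all derivatives from one forward pass. The helper name all_derivs is hypothetical.

```python
import torch

# Placeholder inputs: none of these values are given in the question.
theta_max = 1.0
log_01, log_10 = torch.log(torch.tensor(0.1)), torch.log(torch.tensor(10.0))
nodes = torch.tensor([1.0, 2.0, 3.0, 4.0])
A = torch.vander(nodes, N=4, increasing=True)  # 4x4 Vandermonde matrix
params = torch.tensor([[0.1, 0.2, 0.3],        # stand-ins for the PXmg1_p..PXmg4_p rows
                       [0.2, 0.1, 0.4],
                       [0.3, 0.3, 0.2],
                       [0.4, 0.2, 0.5]])


def all_derivs(e_val, th_val):
    """One forward pass, then val and its energy/theta derivatives."""
    energy = torch.tensor(e_val, requires_grad=True)
    theta = torch.tensor(th_val, requires_grad=True)

    xe = 0.5 + 20 * (torch.log(energy) - log_01) / (log_10 - log_01)
    # Out-of-place construction: no history is carried over from earlier calls.
    Y = torch.stack([torch.exp(p[0]) + torch.exp(p[1] * torch.pow(xe, p[2])) for p in params])
    B = torch.linalg.solve(A, Y)

    x = 0.5 + 4 * theta / theta_max
    val = sum(B[i] * x**i for i in range(4))

    # create_graph=True keeps each derivative differentiable for the next order.
    de, dth = torch.autograd.grad(val, (energy, theta), create_graph=True)
    (de2,) = torch.autograd.grad(de, energy, create_graph=True)
    (de3,) = torch.autograd.grad(de2, energy, retain_graph=True)
    (dth2,) = torch.autograd.grad(dth, theta)
    return [t.item() for t in (val, de, de2, de3, dth, dth2)]
```

In the double loop, each (ie, it) pair would then call all_derivs once instead of calling solvecubic six times. All tensors are local to the call, so the graph can be garbage-collected as soon as the function returns.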

edited Apr 10 at 15:28

asked Apr 10 at 10:07

Noyanini

Maybe you could use Google Colab, which gives access to machines with a GPU.

– furas
2025-04-10 10:11:40 +00:00

The issue is almost certainly related to the unknown method some_function(xe2, theta). With your code and def some_function(xe2, theta): return xe2 * theta, the result is obtained in 0.2 seconds. To help further, we would need to know more about some_function().

– JonSG
2025-04-10 13:42:57 +00:00

Note that the inner-loop values are the same for every outer iteration, so the outer values and inner values can be precomputed and then passed via itertools.product() to some_function(). Doing this reduces the runtime to 0.1 seconds.

– JonSG
2025-04-10 13:45:06 +00:00

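The precomputation suggested in this comment could look roughly like the following. This is a sketch in plain Python under stated assumptions: theta_max is a placeholder, log_01 and log_10 are assumed to mean log(0.1) and log(10), and some_function is the stand-in from the comment, not the asker's real function.

```python
import itertools
import math

theta_max = 1.0  # placeholder; the real value is not given
log_01, log_10 = math.log(0.1), math.log(10.0)  # assumed meaning of the log_ constants

# Outer values depend only on ie, inner values only on it: compute each list once.
energies = [0.1 + 0.1 * ie for ie in range(100)]
xe2_values = [(0.5 + 20 * (math.log(e) - log_01) / (log_10 - log_01)) ** 2 for e in energies]
thetas = [it * theta_max / 99 for it in range(100)]


def some_function(xe2, theta):
    # Stand-in from the comment; the real function is not shown.
    return xe2 * theta


# One flat pass over all (ie, it) pairs instead of two nested loops.
results = [[0.0] * len(thetas) for _ in energies]
for (ie, xe2), (it, theta) in itertools.product(enumerate(xe2_values), enumerate(thetas)):
    results[ie][it] = some_function(xe2, theta)
```

The pairing itself does the same number of function calls as the nested loops; the saving comes from not rebuilding the per-index values inside the inner loop.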
@JonSG, I have added the parts you asked for, if you want to check. I think the problem is the retain_graph part, but I get an error if I don't use it.

– Noyanini
2025-04-10 15:21:05 +00:00
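The error mentioned here is expected behavior: by default torch.autograd.grad frees the graph's saved intermediate values after one backward pass, so a second call on the same graph fails unless the first one used retain_graph=True. A small self-contained illustration on a toy function (not the question's code):

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = torch.exp(x) * x  # toy function whose backward pass needs saved intermediates

(g1,) = torch.autograd.grad(y, x)  # default: saved buffers are freed after this call
try:
    torch.autograd.grad(y, x)      # second pass over the same, already freed graph
    second_call_failed = False
except RuntimeError:
    second_call_failed = True

# Rebuilding the forward pass gives a fresh graph, so nothing has to be retained.
y = torch.exp(x) * x
(g2,) = torch.autograd.grad(y, x)
```

In the question's setup, retain_graph is most likely needed because solvecubic is called several times against the same Y graph. Recomputing the forward pass per call, or taking all derivatives within one call, would remove that need.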
