GPU cache utilization

Question

For several programs (not just one), I see that for most kernels the cache utilizations (L2 and unified) are low (up to 3 on a scale of 1 to 10). The programs are not toy examples. Is that normal? The device is an M2000.

I would also like to know how cache utilization is measured. I didn't find any explanation of this in the documentation.

Robert Crovella · Accepted Answer · 2019-05-14 20:50:51Z


If the kernel is limited by some other factor, for example if it is compute bound or memory bound, then it is normal for the cache utilization to be low. The only way to get the cache utilization really high (7 or higher) is to have a lot of data reuse in that cache.

The cache utilization should be reported as a fraction of peak cache bandwidth on a scale from 0 to 10 (10 being 100%), apparently with some normalization.

Often (this varies by GPU and is not clearly published) the available L2 cache bandwidth is around 2x or more of the available memory (i.e. GPU DRAM) bandwidth. Therefore, to get a reading above 5 on this metric, the data bandwidth into your code as seen at the L2 would have to be higher than the memory bandwidth. This usually implies data reuse.
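The arithmetic behind that threshold can be written out as a rough sketch. The symbols are introduced here for illustration and are not from the profiler documentation: B_L2 and B_DRAM are the peak L2 and DRAM bandwidths, T_L2 and T_DRAM are the achieved throughputs, u is the reported level, and r is the average number of times a byte fetched from DRAM is served out of the L2. The sketch assumes the scale is linear in achieved bandwidth and that B_L2 is about twice B_DRAM, both of which are only approximations.

```latex
\begin{align*}
u &\approx 10 \cdot \frac{T_{L2}}{B_{L2}}, \qquad B_{L2} \approx 2\,B_{DRAM} \\
\text{no reuse:}\quad T_{L2} &\le B_{DRAM}
  \;\Rightarrow\; u \lesssim 10 \cdot \frac{B_{DRAM}}{2\,B_{DRAM}} = 5 \\
\text{with reuse:}\quad T_{L2} &\approx r \cdot T_{DRAM}
  \;\Rightarrow\; u > 5 \text{ requires } r > 1
\end{align*}
```

Under these assumptions, a kernel that streams its data once cannot read much above 5 even when it saturates DRAM, and a kernel limited by compute will sit well below that.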

It should be possible to write a test microbenchmark to explore this.
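One possible shape for such a microbenchmark is sketched below. It is not from the answer: the kernel name `reread`, the helpers `time_reread` and `effective_gbps`, and the chosen sizes and pass counts are all illustrative assumptions. The idea is to read the same total number of bytes twice, once from a buffer of half the L2 size (reused many times) and once from a buffer 16 times the L2 size (little reuse), and to compare the effective read bandwidth.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CHECK(call)                                                      \
    do {                                                                 \
        cudaError_t e = (call);                                          \
        if (e != cudaSuccess) {                                          \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,           \
                    cudaGetErrorString(e));                              \
            exit(1);                                                     \
        }                                                                \
    } while (0)

// Each thread walks the buffer with a grid-stride loop, `passes` times.
__global__ void reread(const float* in, size_t n, int passes, float* out)
{
    float acc = 0.0f;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (int p = 0; p < passes; ++p)
        for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
            acc += in[i];
    // Store the sum so the compiler cannot discard the loads.
    out[blockIdx.x * blockDim.x + threadIdx.x] = acc;
}

// Effective bandwidth in GB/s for `bytes` moved in `ms` milliseconds.
static double effective_gbps(double bytes, double ms)
{
    return bytes / (ms * 1.0e6);
}

// Times `passes` sweeps over a buffer of `bytes` bytes; returns GB/s read.
static double time_reread(size_t bytes, int passes)
{
    const int threads = 256, blocks = 256;
    size_t n = bytes / sizeof(float);
    float *in = nullptr, *out = nullptr;
    CHECK(cudaMalloc(&in, n * sizeof(float)));
    CHECK(cudaMalloc(&out, (size_t)threads * blocks * sizeof(float)));
    CHECK(cudaMemset(in, 0, n * sizeof(float)));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    reread<<<blocks, threads>>>(in, n, 1, out);  // warm-up launch
    CHECK(cudaEventRecord(start));
    reread<<<blocks, threads>>>(in, n, passes, out);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaGetLastError());

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(in));
    CHECK(cudaFree(out));
    return effective_gbps((double)n * sizeof(float) * passes, ms);
}

int main()
{
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    size_t l2 = (size_t)prop.l2CacheSize;  // bytes, as reported by the runtime
    if (l2 == 0) {
        printf("Device reports no L2 cache size\n");
        return 0;
    }
    // Same total bytes read in both cases: 256 * l2.
    double fits   = time_reread(l2 / 2, 512);  // working set fits in L2
    double spills = time_reread(l2 * 16, 16);  // working set far exceeds L2
    printf("L2 size: %zu bytes\n", l2);
    printf("fits in L2 : %.1f GB/s\n", fits);
    printf("exceeds L2 : %.1f GB/s\n", spills);
    return 0;
}
```

If the reasoning above holds, the small working set should report noticeably higher bandwidth, and profiling the two launches (for example with `nvprof --metrics l2_utilization`) should show a higher L2 utilization level for it. The exact numbers will depend on the GPU, and some of the reuse may be absorbed by the unified L1/texture cache instead of the L2.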

edited May 14, 2019 at 20:50

answered May 14, 2019 at 19:04


1 Comment

mahmood Over a year ago

Thanks for that. I have also seen that for some kernels the reported L2 utilization is n/a. Does that mean the kernel doesn't use the cache at all? The L2 hit rate, however, is a number greater than zero.
