Will the same thread accessing the same memory bank twice cause conflicts?

Question

I am working on a kernel that does a vector reduction. It basically adds up all the positions in the vector and stores the result in position 0.

I'm following this scheme, with blocks of 512 float elements:

reduction scheme

The code:

//scratch[] is a vector located in shared memory with all 512 elements
NUM_ELEMENTS = 512;
for( stride=NUM_ELEMENTS/2; stride>=1; stride = stride/2 ) {
  if (threadIdx.x < stride){
    scratch[threadIdx.x] += scratch[threadIdx.x + stride];
  }
  __syncthreads();
}

The odd thing is, I'm expecting to get shared bank conflicts and I'm not. In the first iteration, thread 0 is adding up position 0 and position 256, which reside in the same bank. Thread 1 is adding up position 1 and position 257, and so on.

All of these operations require each thread in the warp to obtain 2 distinct values from the same bank, yet, I get no conflicts whatsoever:

result

What am I missing?

Greg Smith · Accepted Answer · 2017-01-27 02:09:13Z

3

The calculation for bank conflicts is on a per memory instruction per request basis. The shared load (right hand side) and the shared store (left hand side) are executed as separate instructions many clock cycles apart.

answered Jan 27, 2017 at 2:09

Greg Smith

11.7k2 gold badges39 silver badges39 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

ismarlowe Over a year ago

Thank you. I see what you're saying now and it is embarrassingly obvious. Thank you!

Collectives™ on Stack Overflow

Will the same thread accessing the same memory bank twice cause conflicts?

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related