
I wonder what might be a good way to efficiently poll 21 texels.

The answer is that the efficient way is the one that does not poll 21 texels. Sorry to state the obvious, but mobile devices may not have the necessary bus width to support such kernels. You need to optimize by reducing the size of the texture bound to the sampler, so that the texture cache covers a larger kernel radius.

Also, you could forget about your disk kernel and use a two-pass algorithm: one pass with a purely vertical kernel and another with a purely horizontal one. This way you go from "2D" to "1D", so to speak, which drastically reduces the number of samples and improves cache performance thanks to linear access.
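To make the saving concrete, here is a minimal CPU-side sketch in Python of the two-pass idea, not shader code. It assumes a box (uniform) kernel with clamp-to-edge addressing; the function names are mine and purely illustrative. Note that a true disk kernel is not separable, so the two-pass version produces a square footprint rather than the exact disk result.

```python
def blur_1d(image, radius, horizontal):
    """One box-blur pass along a single axis, clamping at the edges."""
    height, width = len(image), len(image[0])
    taps = 2 * radius + 1
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for k in range(-radius, radius + 1):
                if horizontal:
                    sx, sy = min(max(x + k, 0), width - 1), y
                else:
                    sx, sy = x, min(max(y + k, 0), height - 1)
                total += image[sy][sx]
            out[y][x] = total / taps
    return out


def blur_two_pass(image, radius):
    """Horizontal pass followed by a vertical pass over its result."""
    return blur_1d(blur_1d(image, radius, True), radius, False)


def blur_2d(image, radius):
    """Reference: the same box kernel applied in one 2D pass."""
    height, width = len(image), len(image[0])
    taps = 2 * radius + 1
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for ky in range(-radius, radius + 1):
                sy = min(max(y + ky, 0), height - 1)
                for kx in range(-radius, radius + 1):
                    sx = min(max(x + kx, 0), width - 1)
                    total += image[sy][sx]
            out[y][x] = total / (taps * taps)
    return out


def fetches_per_pixel(radius):
    """Texel reads per output pixel for each approach."""
    taps = 2 * radius + 1
    return {"single_pass_2d": taps * taps, "two_pass_1d": 2 * taps}
```

For a radius of 2, `fetches_per_pixel(2)` gives 25 reads for the single 2D pass against 10 for the two 1D passes, at the cost of an intermediate render target between the passes.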

Vertical fetches should not hurt cache performance, thanks to the Z-order layout in which textures should be arranged in GPU memory. cf. http://en.wikipedia.org/wiki/Z-order_curve
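As a rough illustration of why a Z-order layout helps vertical access, the sketch below computes a Morton index by interleaving the bits of x and y and compares it with a plain row-major index. This is only the textbook curve; actual GPU tiling schemes are vendor-specific, so the helper names and the 16-bit coordinate limit are assumptions for the example.

```python
def morton_index(x, y, bits=16):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)
        index |= ((y >> i) & 1) << (2 * i + 1)
    return index


def row_major_index(x, y, width):
    """Linear index of a texel in a conventional row-by-row layout."""
    return y * width + x
```

In a 1024-wide row-major texture, the texel directly below (0, 0) sits 1024 entries away, whereas `morton_index(0, 1)` is 2. More generally, every texel of an aligned 4x4 block lands in the same run of 16 consecutive indices, so a small vertical kernel tends to stay within data that is already cached.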

v.oddou