edited Apr 13 at 20:22

In addition to @KevinReid's answer, which is spot on, you may want to look into compressing that transform data as much as possible.

If you can get by with 10 bits for each of your x, y, z positional components, for example, then 3 x 10 = 30 bits fit into a single 32-bit value passed to the GPU. That uses 3x less bandwidth than passing a 32-bit float per component (96 bits), or 4x less if those floats are rounded up to 128 bits (16 bytes) for alignment or to carry an additional w-coordinate.
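As a rough illustration of the packing described above, here is a minimal CPU-side sketch in Python. The function names, the bit layout (x in the low 10 bits, then y, then z) and the shared `[lo, hi]` range for all three components are assumptions made for this example, not something prescribed by any particular API; the matching unpack step would normally live in your shader.

```python
BITS = 10
MAX_Q = (1 << BITS) - 1  # largest 10-bit value: 1023


def quantize(value, lo, hi):
    """Map a float in [lo, hi] to an integer in [0, 1023], clamping outliers."""
    t = (value - lo) / (hi - lo)
    t = min(max(t, 0.0), 1.0)
    return int(round(t * MAX_Q))


def pack_position(x, y, z, lo, hi):
    """Pack three quantized components into the low 30 bits of one integer.

    Assumed layout: bits 0-9 = x, bits 10-19 = y, bits 20-29 = z.
    The top 2 bits of the 32-bit word are left unused.
    """
    qx, qy, qz = (quantize(c, lo, hi) for c in (x, y, z))
    return qx | (qy << BITS) | (qz << (2 * BITS))


def unpack_position(packed, lo, hi):
    """Recover approximate floats from a packed value (inverse of pack_position)."""
    scale = (hi - lo) / MAX_Q
    return tuple(
        lo + ((packed >> shift) & MAX_Q) * scale
        for shift in (0, BITS, 2 * BITS)
    )
```

The trade-off is precision: with 10 bits, each component can only take 1024 distinct values across the chosen range, so the round-trip error is up to half a quantization step, i.e. `(hi - lo) / 1023 / 2`. Whether that is acceptable depends on how large your positional range is.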


answered Apr 13 at 16:32

Engineer
