Warning! Wall of text (see <TL;DR> paragraphs below for short version)

I have been noticing something in quite a few games (most recently in cutting-edge RTS games such as Uber Entertainment's Planetary Annihilation, which is amazing, by the way) that I think has room for improvement: when the 3D scene gets heavy enough to drag the frame rate down, the UI and input handling get dragged down with it, even though they are cheap to draw on their own. Here are the two approaches I can think of for decoupling them:

  1. No multithreading. The main GL loop flushes the input queue and draws the UI every frame, but does not necessarily update the 3D scene; the 3D rendering pipeline is designed to draw the scene piecemeal into an alternate render buffer or texture. This sounds a little impractical because it's not clear how the rendering should be split up. Tiles? Scanlines? How big would they be? With a super-heavy workload that normally renders at 2 fps, I'd want to split the scene into 500 ms / 16.667 ms = 30 chunks, but there's no guarantee that each chunk would actually fit in its allotted 16.67 ms. It also sounds like adjusting the chunk count would mean shuffling resources around on the GPU, which basically means a bunch of extra overhead.

  2. <TL;DR #1> Two GL contexts, two threads. Thread #1 flushes the input queue and draws the UI, periodically flips to the texture holding the latest completed 3D frame, draws that texture with a full-screen quad, and handles buffer swapping to keep things vsync-smooth. Thread #2 renders the 3D scene into a texture shared with Thread #1. This requires a ping-pong scheme to share the resources: Thread #2 flips a bit which Thread #1 reads on its next cycle to determine whether it needs to flip the texture. (A rough sketch of this scheme follows the list.)
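To make scheme #2 concrete, here is a minimal sketch of what I have in mind. Everything outside the synchronization is a placeholder: GLContext, makeCurrent, swapBuffers, pumpInputQueue, drawUI, renderHeavySceneToTexture and drawFullScreenQuad stand in for platform/engine code, and the two contexts are assumed to have been created with resource sharing enabled.

```cpp
#include <atomic>
#include <thread>

typedef unsigned int GLuint;                    // supplied by the platform's GL header in real code

// --- placeholders for platform / engine code --------------------------------
struct GLContext;
extern GLContext *uiContext, *renderContext;    // two contexts created with sharing enabled
void makeCurrent(GLContext* ctx);
void swapBuffers();                             // vsync'd present of the window framebuffer
void pumpInputQueue();
void drawUI();
void renderHeavySceneToTexture(GLuint tex);     // draws the scene into an FBO backed by tex
void drawFullScreenQuad(GLuint tex);            // samples tex across the whole window
void glFinish();                                // real GL call, declared only to keep the sketch self-contained
// -----------------------------------------------------------------------------

GLuint sceneTex[2];                  // ping-pong textures shared between the two contexts
std::atomic<int>  latest{-1};        // index of the most recently *completed* 3D frame
std::atomic<bool> quit{false};

// Thread #2: heavy 3D rendering, completely decoupled from vsync.
void renderThread()
{
    makeCurrent(renderContext);
    int target = 0;
    while (!quit) {
        renderHeavySceneToTexture(sceneTex[target]);
        glFinish();                  // or a fence sync, so the UI context only ever sees finished frames
        latest = target;             // "flip the bit" that Thread #1 polls
        target = 1 - target;         // the next frame goes into the other texture
    }
}

// Thread #1: input, UI and presentation, running at the display refresh rate.
void uiThread()
{
    makeCurrent(uiContext);
    int shown = -1;
    while (!quit) {
        pumpInputQueue();
        int ready = latest;          // poll the flag once per UI frame
        if (ready != -1) shown = ready;
        if (shown != -1) drawFullScreenQuad(sceneTex[shown]);   // last finished 3D frame
        drawUI();
        swapBuffers();               // UI and input stay at refresh rate regardless of 3D cost
    }
}

int main()
{
    // ... create the window, both contexts (with sharing), sceneTex[0..1] and their FBOs ...
    std::thread renderer(renderThread);
    uiThread();
    renderer.join();
}
```

The atomic index is the entire "flip a bit" protocol, so a slow 3D frame can never stall input handling. Note, though, that as written Thread #2 can get a full frame ahead and start overwriting a texture Thread #1 is still displaying; that back-pressure problem is exactly the wrinkle I get into below.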

Or, on iOS, you can have regular UI widget overlay views that sit on top of the EAGLView; I would expect those to stay responsive, since they bypass the GL rendering and aren't part of it.

But I am talking about a unified graphics pipeline that has the UI integrated into it, where the UI is drawn with the same context used for drawing the 3D scene. This is for portability reasons. Even though I have labeled the question OpenGL, the topic is obviously applicable to DirectX as well, at least for academic purposes.

However, I am also seeing a few nontrivialities with option 2, because it's not clear how to make it perform properly when the 3D rendering is keeping up with the screen refresh rate: Thread #2 more or less has to wait for Thread #1 to be done flipping the texture before it can start rendering into the other one. It seems like I would have to dynamically switch back to a simpler "basic" pipeline whenever rendering isn't taking too long.

The architecture that is emerging looks to me like an extra layer of front/back buffering. The real front and back buffers are decoupled from the heavy 3D render task, so they keep flipping at the refresh rate as long as the UI rendering completes in time, while the flip rate of the new "virtual" buffer pair is the real frame rate of the 3D scene.
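Here is a sketch of that virtual swap, again with placeholder names (VirtualSwap, publish and acquire are mine, not from any library). Thread #2 calls publish() after finishing a frame and blocks until Thread #1 has actually flipped to it; Thread #1 calls acquire() once per UI frame.

```cpp
#include <condition_variable>
#include <mutex>

// Virtual front/back buffer sitting between the render thread and the UI thread.
struct VirtualSwap {
    std::mutex m;
    std::condition_variable cv;
    int pending = -1;    // texture index finished by Thread #2 but not yet shown
    int front   = -1;    // texture index Thread #1 is currently displaying

    // Thread #2: hand over a finished frame, then wait until it has become "front".
    // This is the wait that bothers me: it throttles the renderer to the UI's flip rate.
    void publish(int finishedTex) {
        std::unique_lock<std::mutex> lock(m);
        pending = finishedTex;
        cv.wait(lock, [this] { return pending == -1; });
    }

    // Thread #1: call once per UI frame; returns the texture to display.
    int acquire() {
        std::lock_guard<std::mutex> lock(m);
        if (pending != -1) {     // a new 3D frame is ready: this is the "virtual flip"
            front   = pending;
            pending = -1;
            cv.notify_one();     // release Thread #2 so it can start the next frame
        }
        return front;            // when the renderer is slow, the same frame is shown again
    }
};
```

With this in place the real front and back buffers keep flipping at refresh rate, while acquire() decides how often the 3D texture behind them changes. The cost is that when the renderer does keep up, publish() effectively locks Thread #2 to the UI's flip rate, which is what makes me think I'd need the fallback to a plain single-threaded pipeline in that case.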

<TL;DR #2> My question is: can this be done in a cleaner way? Is there an engine design that can achieve this, perhaps without requiring two contexts? I know that on iOS, for example, you are required to set up an OpenGL ES context for each thread, and as far as I can tell the only sane way to handle this is with threads.
