


A GPU is designed for a different set of tasks than a CPU.

Design differences

Different cars are designed for completely different purposes - compare a drag racer with an off-road truck. Why shouldn't the same be true of any other machine? DISCLAIMER: Sure, this is not a perfect analogy, but it makes the point.

Here's one of the most important differences. When there is a conditional branch (if or switch) in code, a CPU is designed to handle it in the most efficient way possible. A GPU works differently: it expects to be able to pipeline information uniformly across all of its cores. (To pipeline here means to keep the flow of data uninterrupted, so that the GPU is never left waiting for information.)
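As a rough sketch (my illustration, not part of the original answer), here is what that difference looks like in CUDA. Within a 32-thread warp, a data-dependent branch forces the hardware to run both sides one after the other with the inactive lanes masked off, whereas a CPU's branch predictor typically makes the same branch almost free:

```
#include <cstdio>
#include <cuda_runtime.h>

// Data-dependent branch: threads in the same warp take different paths.
// The warp executes BOTH paths serially, with inactive lanes masked off,
// so the divergent version can cost roughly the sum of both branches.
__global__ void divergent(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (in[i] % 2 == 0)           // even lanes take one path...
        out[i] = in[i] * 2;
    else                          // ...odd lanes take the other
        out[i] = in[i] * 3 + 1;
}

// Branch-free equivalent: every lane executes the same instructions,
// so the warp stays converged and fully pipelined.
__global__ void uniform(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int even = (in[i] % 2 == 0);
    out[i] = even * (in[i] * 2) + (1 - even) * (in[i] * 3 + 1);
}

int main()
{
    const int n = 1 << 20;
    int *in, *out;
    cudaMallocManaged(&in, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    for (int i = 0; i < n; ++i) in[i] = i;

    divergent<<<(n + 255) / 256, 256>>>(in, out, n);
    uniform<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();

    printf("out[3] = %d\n", out[3]);  // 3*3+1 = 10 on both kernels
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

On recent architectures the details differ (e.g. Volta's independent thread scheduling), but divergence still serialises work within a warp.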

Think about a GPU like a drag racing car - it can go very fast, but it can't change direction quickly. And it can't go offroad, nor can it climb a steep hill.

Now let's consider giving a CPU the same task. Every CPU core is hugely more complex in its design (both in itself and within the context of its parent CPU's design) than a typical GPU thread. Guess what the downside is? OVERHEAD. There is overhead for branch prediction, there is overhead that depends on the complexity of the cache hierarchy, there is overhead for SSE/AVX/MMX, and so on - there are probably hundreds of such features. Every time you assign a thread to a CPU core, there is a startup and shutdown cost that can be many CPU cycles - cycles which are lost to processing your code. On the GPU, any startup or shutdown overhead is mitigated (amortised) by the fact that when you start one thread, you are starting several hundred others as well, so the cost per thread becomes negligible.
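To make the amortisation point concrete, here is a hypothetical sketch (illustrative numbers and code, not the author's): creating one OS thread per tiny task pays the per-thread startup cost over and over, while a single CUDA kernel launch pays one fixed cost for roughly a million GPU threads:

```
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

__global__ void addOne(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;   // trivial work per GPU thread
}

int main()
{
    const int n = 1 << 20;        // ~1M tiny tasks

    // CPU anti-pattern: one OS thread per task. Each std::thread costs
    // microseconds to create and join - far more than the work itself.
    // (Shown for a small slice only; doing all 1M would take ages.)
    std::vector<float> host(n, 0.0f);
    std::vector<std::thread> pool;
    for (int i = 0; i < 8; ++i)                  // just 8 of the n tasks
        pool.emplace_back([&host, i] { host[i] += 1.0f; });
    for (auto &t : pool) t.join();

    // GPU pattern: ONE launch creates all ~1M threads; the fixed launch
    // overhead (a few microseconds total) is amortised across all of them.
    float *dev;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    addOne<<<(n + 255) / 256, 256>>>(dev, n);
    cudaDeviceSynchronize();
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("host[0] = %.1f\n", host[0]);  // 2.0: once on CPU, once on GPU
    return 0;
}
```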

It is true that a CPU could do anything, if it had enough cores. (You could say a GPU is not so capable, mainly because it lacks branch prediction.) But a CPU core has traditionally cost on the order of 10x-100x as much as a GPU core, because CPU cores are far more full-featured: they typically have multiple levels of cache, branch prediction, vectorisation, and many other features. A CPU has traditionally been massively more complex than a GPU; a GPU consists of far more repetition and far less systemic complexity than a modern CPU.

In STEM, hybrid solutions often come out better than "pure" solutions - for effectiveness, cost and/or efficiency.

Cost of manufacture

In the real world, money talks. If Option A's manufacturing setup costs me 10x, 5x, or even 2x as much as Option B's to get the same quality of product, why would I choose Option A? At the end of the day, that cost has to be passed on to the consumer. So to do the same tasks we do today on a CPU + GPU, in the world you have proposed we would have been paying maybe 5x-20x more, for the last 20 years, for a CPU that can literally do everything. Would that have been worth it?

So if you could buy 1 car engine vs. 10 motorcycle engines that together apply twice the power for half the cost and weight, which would you buy? It would depend on many factors, such as power vs. torque rating, engine wear and maintainability, acceleration curve, weight, etc. Similar principles apply here.
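To put rough numbers on that trade-off, here is a back-of-envelope sketch using Amdahl's law; the core counts and per-core speeds are invented for illustration, not taken from the answer:

```
#include <cstdio>

// Amdahl's law: speedup(N) = 1 / ((1 - p) + p / N), where p is the
// parallelisable fraction of the program and N the number of cores.
double speedup(double p, double n) { return 1.0 / ((1.0 - p) + p / n); }

int main()
{
    // Hypothetical machines (illustrative numbers only):
    //  "car engine":          8 fast cores, each 4x faster per thread
    //  "motorcycle engines":  2048 slow cores, 1x per thread
    const double fastCores = 8,    fastPerCore = 4.0;
    const double slowCores = 2048, slowPerCore = 1.0;

    for (double p : {0.50, 0.90, 0.99}) {
        double car  = fastPerCore * speedup(p, fastCores);
        double bike = slowPerCore * speedup(p, slowCores);
        printf("parallel fraction %.0f%%: 8 fast cores -> %5.1fx, "
               "2048 slow cores -> %5.1fx\n", p * 100, car, bike);
    }
    // At 50% parallel code the few fast cores win easily; only near
    // 99%+ parallel work do the many slow cores pull ahead.
    return 0;
}
```

Which "engine" wins depends on how parallel your workload is - exactly the kind of factor the analogy is pointing at.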

Sure, some people and companies write inefficient code. Alternatively, you're comparing a game from 20 years ago with one made today. Some technical developers can produce very high-quality optimised code, usually due to (1) a deep understanding of complex CPU architecture, and (2) a lot of time spent on low-level optimisation.

The industry has sped up since the early 2000s. CPUs and GPUs have become a lot more powerful and thus able to carry a bigger load, and consequently many developers no longer give as much attention to deep optimisation as they once did. In today's fast-paced and highly competitive world, it often just isn't economically viable to optimise that much. Before the year 2000, you had no choice if you wanted to write a game that could run reasonably well.



