If you already have such huge problems with just 10 simulated objects, you'll need to optimize the code! With 10 objects your nested loop causes only 10*10 = 100 iterations, of which 10 are skipped (an object paired with itself), leaving 90 iterations of the inner loop per update. If you only achieve 2 FPS, your performance is so bad that you only get through about 180 iterations of the inner loop per second.

I suggest you do the following:

  1. PREPARATION/BENCHMARKING: To know for sure that this routine is the problem, write a small benchmark routine. It should execute the Update() method of the Gravity component many times, e.g. 1000 times, and measure its total time. If you want to achieve 30 FPS with 100 objects, simulate 100 objects and measure the time for 30 executions; it should be less than 1 second. Such a benchmark is essential for reasonable optimization. Otherwise you will probably achieve the opposite and make the code slower, just because you think it must be faster... So I really encourage you to do this! (A minimal benchmark sketch follows after this list.)

  2. OPTIMIZATIONS: While you cannot do much about the O(N²) complexity (meaning: calculation time increases quadratically with the number of simulated objects N), you can improve the code itself.

   a) You use a lot of "associative array" (Dictionary) lookups in your code, and these are slow! For example entityEngine.Entities[e.Key].Position: can't you just use e.Value.Position? This saves one lookup. You do this throughout the whole inner loop to access properties of the objects referenced by e and e2... Change this!

   b) You create a new vector inside the loop: new Vector2( .... ). Every "new" call implies a memory allocation (and later a deallocation), which is even more expensive than a Dictionary lookup. If you only need this vector temporarily, allocate it outside the loops AND reuse it by resetting its values instead of creating a new object.

   c) You use a lot of trigonometric functions (e.g. atan2 and cos) within the loop. If your accuracy doesn't need to be really, really exact, you might try a lookup table instead. To do this, you scale your value to a defined range, round it to an integer, and look it up in a table of pre-calculated results. If you need help with that, just ask.

   d) You often use .Texture.Width / 2. You can pre-calculate this and store the result as .Texture.HalfWidth, or, if the width is always an even positive integer, you can use the bit shift operation >> 1 to divide by two.

Sketches of these ideas follow below.
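Here is a minimal sketch of such a benchmark. The Gravity class and its Update() method are taken from the question; a parameterless Update() is an assumption of mine, so adapt the call if yours takes a GameTime or other arguments:

```csharp
using System;
using System.Diagnostics;

public static class GravityBenchmark
{
    // Executes the gravity update 'iterations' times and reports the time.
    public static void Run(Gravity gravity, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            gravity.Update(); // parameterless Update() assumed here
        }
        watch.Stop();

        Console.WriteLine("{0} updates: {1} ms total, {2:F3} ms per update",
            iterations, watch.ElapsedMilliseconds,
            (double)watch.ElapsedMilliseconds / iterations);
    }
}
```

With 100 simulated objects, GravityBenchmark.Run(gravity, 30) should finish in under 1000 ms if your 30 FPS target is to be reachable.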
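For points a), b) and d), the inner loop might look roughly like this. entityEngine.Entities, Position and Texture follow the question's code; the Entity type name, the reused diff vector and the other variable names are my own illustration, not a definitive implementation:

```csharp
Vector2 diff = Vector2.Zero; // b) one temporary vector, reused every iteration

foreach (var e in entityEngine.Entities)
{
    Entity self = e.Value; // a) resolve the Dictionary lookup once

    foreach (var e2 in entityEngine.Entities)
    {
        if (e.Key.Equals(e2.Key)) continue; // skip pairing an object with itself

        Entity other = e2.Value; // a) instead of entityEngine.Entities[e2.Key]

        // b) overwrite the reused vector instead of "new Vector2(...)"
        diff.X = other.Position.X - self.Position.X;
        diff.Y = other.Position.Y - self.Position.Y;

        // d) half width via a bit shift (or store a precomputed HalfWidth)
        int otherHalfWidth = other.Texture.Width >> 1;

        // ... the actual gravity computation on diff goes here ...
    }
}
```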
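For point c), a cosine lookup table could be built like this (TableSize and the wrapping logic are assumptions of this sketch; a sine or atan2 table works analogously):

```csharp
using System;

public static class FastTrig
{
    const int TableSize = 4096; // resolution: ~0.0015 rad per entry (assumption)

    static readonly float[] cosTable = BuildTable();

    static float[] BuildTable()
    {
        var table = new float[TableSize];
        for (int i = 0; i < TableSize; i++)
            table[i] = (float)Math.Cos(i * 2.0 * Math.PI / TableSize);
        return table;
    }

    // Scale the angle to a table index, wrap it into [0, TableSize)
    // and return the pre-calculated result.
    public static float Cos(float radians)
    {
        int index = (int)(radians * (TableSize / (2.0 * Math.PI)));
        index %= TableSize;
        if (index < 0) index += TableSize; // handle negative angles
        return cosTable[index];
    }
}
```

Replacing Math.Cos(x) with FastTrig.Cos(x) trades a little accuracy (one table step here is about 0.09 degrees) for a much cheaper array access; measure it with the benchmark before keeping it.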

Make only one of these changes at a time and measure it with the benchmark to see how it affected your runtime! Maybe one idea turns out good while another turns out bad (even though I proposed them above!)...

I think these optimizations will serve you much better than trying to gain performance by using multiple threads! You'd have a lot of trouble coordinating the threads so they don't overwrite each other's values, and they would also contend when accessing nearby memory regions. Even if you used 4 CPUs/threads for this job, you could expect a speedup of only 2 to 3 in frame rate.