I've been reading about data-oriented programming in the context of Entity Component Systems. Apparently, a struct-of-arrays layout makes much better use of the CPU cache and can give substantial performance increases: if all of the data you're iterating over is contiguous in memory, cache locality works in your favor.

Because I'll be working in JavaScript, I figured I'd first devise a small benchmark to see how much of a performance increase is possible under ideal conditions. I kept it very simple: the first test measures the speed of iterating over an array of structs, and the second measures the speed of iterating over a struct of arrays.

Here is the code:

function randomInt() { return Math.floor(Math.random() * 100) + 1; }
function randomStr() { return Math.random().toString(36).substring(7); }

let samples = 1000;
let count = 10000000;

function benchmarkArrayOfStructs() {
  let AOS = [];

  for (let i = 0; i < count; i++) {
    AOS.push({ health: randomInt(), name: randomStr(), damage: randomInt() });
  }

  let t1 = performance.now();
  
  let sum = 0;

  for (let x = 0; x < samples; x++) {
    for (let i = 0; i < AOS.length; i++) {
      let item = AOS[i];
    
      sum += item.health + item.damage;
    }
  }
    
  console.log(performance.now() - t1);
}

function benchmarkStructOfArrays() {
  let SOA = { health: [], name: [], damage: [] }

  for (let i = 0; i < count; i++) {
    SOA.health.push(randomInt());
    SOA.name.push(randomStr());
    SOA.damage.push(randomInt());
  }

  let t2 = performance.now();
  
  let sum = 0;

  let h = SOA.health;
  let d = SOA.damage;

  for (let x = 0; x < samples; x++) {
    for (let i = 0; i < count; i++) {  
      sum += h[i] + d[i];
    }
  }

  console.log(performance.now() - t2);
}

benchmarkArrayOfStructs();
benchmarkStructOfArrays();

Interestingly, the latter solution is only 20% or so faster than the first solution. In the various talks that I've watched, they've claimed 10x speed increases for this type of operation. Also, intuitively I feel as if the latter solution should be much faster, but it isn't. Now I'm beginning to wonder if this sort of optimization is even worth integrating into my project, as it severely decreases ergonomics. Have I done something wrong in my benchmark, or is this the actual expected speedup?

  • Synthetic benchmarks are rarely a good guide to what will happen with your real code in real situations; JavaScript engines apply different optimization strategies (or don't apply any at all) based on how the code is used. The benchmark above is quite simplistic (you'd want to do lots of data-retrieval tests, not just one), but that's not the primary problem. The primary problem is that it's a synthetic benchmark in the first place. Commented Jan 24, 2021 at 14:26
  • I'm aware of the pitfalls of benchmarking in general, but this one seems simple enough to be explanatory. Do you have any advice on how to improve the benchmark? Commented Jan 24, 2021 at 14:27
  • As I said, at a minimum you'd repeat the test (within the test, not separate runs) -- a lot. (Like, 10,000+ times.) But are you having a significant performance problem with your project's current code? Commented Jan 24, 2021 at 14:28
  • I'll attempt to modify the code example to do so. There is also no current code to speak of. I'm attempting to design a minimal Entity Component System in JavaScript for a project, and I'm using this benchmark to determine whether it's worth implementing with a Struct of Arrays versus an Array of Structs. The latter is significantly more ergonomic, so I'd prefer to use it, but if the former is as performant as others claim, I may have to sacrifice ergonomics. I'm trying to determine if it's worth it. Commented Jan 24, 2021 at 14:30
  • I've updated the code to take into account samples, but it is still the same percentage difference. Regardless of whether there's 1 or 10,000 samples, the latter is 20% faster. Commented Jan 24, 2021 at 14:40

2 Answers

JavaScript isn't going to auto-vectorize with SIMD while JITting. That's one of the biggest advantages that an SoA layout allows, but you aren't taking advantage of it. (And AFAIK can't easily in JS.)

Also, if your code is the only thread running on an otherwise-idle desktop machine, you have much more memory bandwidth available to your thread than on a typical server, or on a busy machine with all cores competing for memory access. (Intel Xeons have lower max per-core DRAM memory bandwidth due to higher latency interconnects, but higher aggregate bandwidth with all cores busy. That's assuming you miss in private L2 cache.) So your benchmark probably tested a case where you have lots of excess memory bandwidth.

You might also see more benefit from SoA if your objects are bigger. Your AoS loop reads 2 of the 3 fields of every object, so only a small part of the data brought into cache is "wasted". If you try with more fields that your loop doesn't use, you may see a bigger advantage.
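
For example, here is a minimal variation on your AoS benchmark (the function name and the extra fields are hypothetical additions of mine, not from your code) that pads each object with cold fields the hot loop never reads. If cache footprint is what limits the AoS version, the gap between the two layouts should widen as the objects grow:

function benchmarkPaddedArrayOfStructs() {
  let AOS = [];

  for (let i = 0; i < count; i++) {
    // Same two hot fields as before, plus cold fields the loop below never touches.
    AOS.push({
      health: randomInt(),
      damage: randomInt(),
      name: randomStr(),
      level: randomInt(),
      xp: randomInt(),
      gold: randomInt(),
      guild: randomStr(),
    });
  }

  let t1 = performance.now();
  let sum = 0;

  for (let x = 0; x < samples; x++) {
    for (let i = 0; i < AOS.length; i++) {
      // Only health and damage are read; the cold fields still occupy cache lines.
      sum += AOS[i].health + AOS[i].damage;
    }
  }

  // Print sum as well so the JIT can't discard the loop as dead code.
  console.log(performance.now() - t1, sum);
}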

In JavaScript, there is no guarantee that an Array is a true array; for all you know, the engine could be storing elements all over memory, nullifying cache locality. To force the engine to store an array in contiguous memory, you must use a TypedArray.

TypedArrays are also intrinsically faster than regular arrays since the JS engine doesn't have to do any guesswork whatsoever about usage patterns.
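
As a rough sketch of what that might look like for the struct-of-arrays side of the benchmark in the question (it reuses randomInt, randomStr, count, and samples from the question; the function name is mine), the two numeric fields can live in Int32Arrays while the strings stay in a plain array:

function benchmarkTypedStructOfArrays() {
  // Preallocated, contiguous, homogeneous storage for the numeric fields.
  let SOA = {
    health: new Int32Array(count),
    damage: new Int32Array(count),
    name: []  // strings can't go in a typed array, so keep a regular array
  };

  for (let i = 0; i < count; i++) {
    SOA.health[i] = randomInt();
    SOA.damage[i] = randomInt();
    SOA.name.push(randomStr());
  }

  let t = performance.now();
  let sum = 0;
  let h = SOA.health;
  let d = SOA.damage;

  for (let x = 0; x < samples; x++) {
    for (let i = 0; i < count; i++) {
      sum += h[i] + d[i];
    }
  }

  // Print sum too so the loop can't be optimized away as dead code.
  console.log(performance.now() - t, sum);
}

Whether this actually beats the plain-array version depends on the engine; V8, for example, can already store arrays of small integers in a packed, contiguous form, so it's worth measuring before committing to the less ergonomic layout.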
