0

My case is that I need to store a unique key in order to keep track of my millions of models in Javascript, and store the key-value pair in a Map.

There is an existing uuid library which will generate a string key for me, but I am afraid I would need an integer key in order to achieve optimal performance. However it looks like I will have to convert the string uuid to unique integer myself which is not trivial and has overhead too.

Is there a significant difference using string as key for a Map or an integer?

4
  • 2
    Have you measured the performance? Why do you think that string keys and integer keys have significantly different performance? Commented Sep 11, 2021 at 4:32
  • The only time I know of integers being faster is when you use a sparse array instead of a Map. "There is an existing uuid library which will generate a string key for me..." If the uuid's are arbitrary anyway is there any reason you can not start at 0 and increment 1 for each? Commented Sep 11, 2021 at 4:34
  • @Chris Welton Thank you. After consideration it actually seems there is no reason I shouldn't start from 0 and increment 1 each. In this case would it be faster if I use a normal sparse array instead of a Map? Commented Sep 11, 2021 at 4:53
  • I will give you an example... Commented Sep 11, 2021 at 4:55

1 Answer 1

2

Here is a solution based on our conversation.

There are many similar ways to do this, but the key is to store the reverse index for the lookup in the model itself, so each model "knows" where it is in the model index.

Edit: There was a bug in my first example that would have showed up when the array became sparse due to array.length shrinking.

This is a more advanced example that gets rid of the bug and has a dataIndex class that is responsible for indexing, and where models can have reverse lookups for multiple indexes.

class dataIndex {
    constructor(indexId) {
        this.vec = [];
        this.indexId = indexId;
        this.indexNext = 0;
    }
    indexModel(model) {
        this.vec[model.lookup[this.indexId] = this.indexNext++] = model;
        return this;
    }
}

class dataModel {
    constructor(data) {
        this.data = data;
        this.lookup = new Map();
    }
    static compareData(a, b) {
        return (a.data === b.data) ? 0:
            (a.data > b.data) ? 1 : -1;
    }
}

const modelIndex = new dataIndex('primary');
const sortedIndex = new dataIndex('sorted');

for (let i = 0; i < 10; i++) {
    let newModel = new dataModel(Math.random());
    modelIndex.indexModel(newModel);
}

const ordered = modelIndex.vec.sort((a, b) => dataModel.compareData(a, b))
ordered.forEach(model => sortedIndex.indexModel(model));

console.log(ordered);

Output:

[
  dataModel {
    data: 0.08420389624353097,
    lookup: Map(0) { primary: 9, sorted: 0 }
  },
  dataModel {
    data: 0.1528733550120258,
    lookup: Map(0) { primary: 7, sorted: 1 }
  },
  dataModel {
    data: 0.28483626134194595,
    lookup: Map(0) { primary: 8, sorted: 2 }
  },
  dataModel {
    data: 0.3257986769682104,
    lookup: Map(0) { primary: 5, sorted: 3 }
  },
  dataModel {
    data: 0.3409113857134396,
    lookup: Map(0) { primary: 3, sorted: 4 }
  },
  dataModel {
    data: 0.3841968173496322,
    lookup: Map(0) { primary: 1, sorted: 5 }
  },
  dataModel {
    data: 0.40414714769598237,
    lookup: Map(0) { primary: 4, sorted: 6 }
  },
  dataModel {
    data: 0.5817767975980099,
    lookup: Map(0) { primary: 0, sorted: 7 }
  },
  dataModel {
    data: 0.8091360598739015,
    lookup: Map(0) { primary: 2, sorted: 8 }
  },
  dataModel {
    data: 0.8217632650897493,
    lookup: Map(0) { primary: 6, sorted: 9 }
  }
]
Sign up to request clarification or add additional context in comments.

5 Comments

Thank you for the answer. I am wondering is this on average faster or slower than if indexArray is a Map, considering models can be deleted as well. I.E. the case where revIndex starts at 100000 and ends at 999999 with holes in the middle can happen.
When "holes" start happening that is referred to as moving from a "dense" to a "sparse" array. Generally sparse arrays are pretty well optimized in most javascript implementations. (It actually stores them a lot like key, value objects, but internally they can highly optimize "hashing" mostly contiguous integers.) news.qooxdoo.org/…
@cr001 to give you an idea it takes 175ms to create and index 1000000 random numbers in my laptops node.js. How many millions are you looking to keep track of at once?
The problem is that I am not really sure at this point how large it can get to. It is used to track something like socket connections, so it depends on many factors that cannot be simulated now. In Java I would be pretty sure sparse arrays are faster than TreeMaps in the case of integer keys, however I am not sure if this is the case in Javascript.
Well, I don't know of anything that will be faster, and the second benefit to using an array is that it keeps native optimized Array.filter and Array.sort in easy reach, both of which can accept custom callbacks for filtering and sorting. So if you have 10 million records and need to delete the oldest 2 million you have Array.filter, Array.sort, Array.splice, and Array.slice to help you.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.