
I have two object arrays:

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

I want to do an inner join for these two arrays a and b, and create a third array like this (if the position property is not present, then it becomes null):

var result = [
  {id: 4, name: 'Greg', position: null},
  {id: 1, name: 'David', position: null},
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

My approach:

function innerJoinAB(a, b) {
    var result = [];
    a.forEach(function (obj) {
        // For each object in the first array...
        b.forEach(function (obj2) {
            // ...scan the second array for a matching id,
            // and push the merged object if found.
            if (obj.id === obj2.id) {
                result.push(Object.assign({}, obj, obj2));
            }
        });
    });
    return result;
}

But the time complexity is O(N^2). How can I do it in O(N)? My friend told me that we can use reducers and Object.assign.

I'm not able to figure this out. Please help.

  • You have two arrays of objects. Seems like you need to copy all the values of one array into a new one, then merge the second (and subsequent) arrays into it. Array.prototype.reduce might be a good start. What is the primary key, id? Since you're using an array to hold objects, you likely also want to build an index from id to array position so you can look up ids without iterating over the array each time. Commented Feb 24, 2017 at 1:08
  • PS: "inner join" might not be the right term, as that (from my understanding) only gives a result set where there was a match in both sets (so your example would only give the lines with ids 2 and 3). This is more of a typical merge. Commented Feb 24, 2017 at 1:10
  • @NicholasSmith This is JS, not JSON. Commented Feb 24, 2017 at 1:30
  • Possible duplicate of Merge two array of objects based on a key. Commented May 20, 2019 at 20:25
  • What you want, based on your output example, is a full outer join, not an inner join. Commented Mar 17, 2020 at 12:17

8 Answers


I don't know how reduce would help here, but you could use a Map to accomplish the same task in O(n):

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'}];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'}];

var m = new Map();
// Insert all entries keyed by ID into the Map, filling in placeholder
// 'position' since the Array 'a' lacks 'position' entirely:
a.forEach(function(x) { x.position = null; m.set(x.id, x); });

// For values in 'b', insert them if missing, otherwise, update existing values:
b.forEach(function(x) {
    var existing = m.get(x.id);
    if (existing === undefined)
        m.set(x.id, x);
    else
        Object.assign(existing, x);
});

// Extract resulting combined objects from the Map as an Array
var result = Array.from(m.values());

console.log(JSON.stringify(result));

Because Map accesses and updates are O(1) on average (hash collisions and rehashing can make individual operations slower), this makes the whole task O(n+m), where n and m are the lengths of a and b respectively; the naive solution you gave would be O(n*m), using the same meanings for n and m.
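Since the question specifically mentioned reduce and Object.assign, the same O(n+m) Map-based merge can also be phrased with Array.prototype.reduce. This is just a sketch of that alternative phrasing, not part of the original answer:

```javascript
const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'}];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'}];

// Seed the Map from 'a' (filling in a null position), then fold 'b' into it.
// Map.prototype.set returns the Map, so the reducer can return it directly.
const merged = b.reduce(
    (m, x) => m.set(x.id, Object.assign({}, m.get(x.id), x)),
    a.reduce((m, x) => m.set(x.id, Object.assign({position: null}, x)), new Map())
);

const result = Array.from(merged.values());
console.log(result);
```

The logic is identical to the forEach version; reduce just threads the Map through as the accumulator.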


4 Comments

One issue with this: If position is set in array a, it will be lost if it is not duplicated in b.
@Gerrit0: Yar, I noted that assumption (that a always lacks a position) in the comments. You could easily make the setting of x.position conditional when processing a, but the OP's provided inputs indicated that a never had position and b always did. Similarly, this assumes that id is unique on its own (no need for name as part of the key, on the assumption that name will match if id does).
This looks like a left join to me, not an inner join.
@phil: Agreed. The OP asked for inner join, but their desired output was a left join. I provided an answer that would produce their desired output, as it seems clear they were using the wrong terminology.

One of the ways to solve it:

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
];

const r = a.filter(({ id: idv }) => b.every(({ id: idc }) => idv !== idc));
const newArr = b.concat(r).map((v) => v.position ? v : { ...v, position: null });

console.log(JSON.stringify(newArr));

5 Comments

Note that the time complexity of this is still O(N^2) (technically O(N*M) where N and M are the lengths of your two arrays)
Also, it uses name as primary key which may not be intended as names may be non-unique
@FelixDombek Could you give me more info?
In the line that filters, I think you should be comparing ids, not names.
@FelixDombek Agreed, but in this particular case it didn't make any difference (: Anyway, I've changed it.

If you drop the null criterion (many in the community say using null is bad), then there's a very simple solution:

let a = [1, 2, 3];
let b = [2, 3, 4];

a.filter(x => b.includes(x)) 

// [2, 3]
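As an aside, Array.prototype.includes is itself a linear scan, so the one-liner above is O(n*m). Converting b to a Set first gives the same result in O(n+m); a minimal sketch:

```javascript
const a = [1, 2, 3];
const b = [2, 3, 4];

// Set membership checks are O(1) on average, so the whole
// intersection costs O(n + m) rather than O(n * m).
const bSet = new Set(b);
const intersection = a.filter(x => bSet.has(x));

console.log(intersection); // [2, 3]
```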

2 Comments

What about 1 and 4?
Like a full outer join? It should be [].concat( a.filter(x => !b.includes(x)), b.filter(x => !a.includes(x)) )

To reduce the time complexity, using more memory is inevitable.

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]     

var s = new Set();
var result = [];
b.forEach(function(e) {
    result.push(Object.assign({}, e));
    s.add(e.id);
});
a.forEach(function(e) {
    if (!s.has(e.id)) {
      var temp = Object.assign({}, e);
      temp.position = null;
      result.push(temp);
    }
});
console.log(result);

Update

As @Blindman67 mentioned: "You do not reduce the problem's complexity by moving a search into the native code." I consulted the ECMAScript® 2016 Language Specification about the internal procedures of Set.prototype.has() and Map.prototype.get(); unfortunately, it seemed that they both iterate through all the elements they hold.

Set.prototype.has ( value )#

The following steps are taken:

    Let S be the this value.
    If Type(S) is not Object, throw a TypeError exception.
    If S does not have a [[SetData]] internal slot, throw a TypeError exception.
    Let entries be the List that is the value of S's [[SetData]] internal slot.
    Repeat for each e that is an element of entries,
        If e is not empty and SameValueZero(e, value) is true, return true.
    Return false. 

http://www.ecma-international.org/ecma-262/7.0/#sec-set.prototype.has

Map.prototype.get ( key )#

The following steps are taken:

    Let M be the this value.
    If Type(M) is not Object, throw a TypeError exception.
    If M does not have a [[MapData]] internal slot, throw a TypeError exception.
    Let entries be the List that is the value of M's [[MapData]] internal slot.
    Repeat for each Record {[[Key]], [[Value]]} p that is an element of entries,
        If p.[[Key]] is not empty and SameValueZero(p.[[Key]], key) is true, return p.[[Value]].
    Return undefined. 

http://www.ecma-international.org/ecma-262/7.0/#sec-map.prototype.get

Perhaps we can instead use an Object, which can access its properties directly by name, like a hash table or associative array. For example:

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]     

var s = {};
var result = [];
b.forEach(function(e) {
    result.push(Object.assign({}, e));
    s[e.id] = true;
});
a.forEach(function(e) {
    if (!s[e.id]) {
      var temp = Object.assign({}, e);
      temp.position = null;
      result.push(temp);
    }
});
console.log(result);

9 Comments

You misunderstand the spec. The Set section is describing the required logic, not an actual implementation strategy. get and has are required to be sublinear, so at worst they're O(log n), and the suggested implementation is a hash table, O(1) on average. Read earlier in the overall Set docs:
"Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection. The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects. It is not intended to be a viable implementation model."
One question @Y.C: why did you use result.push(Object.assign({}, e))? Why can't we just do result.push(e)? It would give the same result.
@TechnoCorner Object.assign() performs a shallow clone: it copies the values of all enumerable own properties from one or more source objects to a target object. So if you change the objects in the result array, it won't affect the originals, because they're different objects with the same content. But if you don't mind making changes to the original objects, then obviously result.push(e) is more efficient.
@ShadowRanger So you can't be sure which browser uses which implementation or mechanism, but you do know that objects in JS can be used as hash tables or associative arrays.

You do not reduce the problem's complexity by moving a search into the native code. The search must still be done.

Also, the need to set an undefined property to null is one of the many reasons I dislike using null.

So without the null the solution would look like

var a = [
  {id: 4, name: 'Greg',position: '7'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]


function join (indexName, ...arrays) {
    const map = new Map();
    arrays.forEach((array) => {
        array.forEach((item) => {
            map.set(
                item[indexName],
                Object.assign(item, map.get(item[indexName]))
            );
        })
    })
    return [...map.values()];
}

It is called with:

const joinedArray = join("id", a, b);

Joining with a default is a little more complex, but it should prove handy, as it can join any number of arrays and automatically set missing properties to a provided default.

Testing for the defaults is done after the join to save a little time.

function join (indexName, defaults, ...arrays) {
    const map = new Map();
    arrays.forEach((array) => {
        array.forEach((item) => {
            map.set(
                item[indexName], 
                Object.assign( 
                    item, 
                    map.get(item[indexName])
                )
            );
        })
    })
    return [...map.values()].map(item => Object.assign({}, defaults, item));

}

To use

const joinedArray = join("id", {position : null}, a, b);

You could add...

    arrays.shift().forEach((item) => {  // first array is a special case.
        map.set(item[indexName], item);
    });

...at the start of the function to save a little time, but I feel it's more elegant without the extra code.
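Assembled, that optimized variant would look like the following. This is my combination of the two snippets in this answer, not the author's verbatim code; like the original, it mutates the input objects via Object.assign(item, …):

```javascript
function join(indexName, defaults, ...arrays) {
    const map = new Map();
    arrays.shift().forEach((item) => {  // first array is a special case
        map.set(item[indexName], item);
    });
    arrays.forEach((array) => {
        array.forEach((item) => {
            // Earlier-seen values win, matching the merge order of the original.
            map.set(
                item[indexName],
                Object.assign(item, map.get(item[indexName]))
            );
        });
    });
    // Apply defaults last, copying into fresh objects.
    return [...map.values()].map(item => Object.assign({}, defaults, item));
}

const a = [
    {id: 4, name: 'Greg'},
    {id: 1, name: 'David'},
    {id: 2, name: 'John'},
    {id: 3, name: 'Matt'},
];
const b = [
    {id: 5, name: 'Mathew', position: '1'},
    {id: 6, name: 'Gracia', position: '2'},
    {id: 2, name: 'John', position: '2'},
    {id: 3, name: 'Matt', position: '2'},
];

const joined = join("id", {position: null}, a, b);
console.log(joined);
```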



Here is an attempt at a more generic version of a join, which accepts N arrays and merges their objects based on a primary id key.

If performance is critical, you are better off using a specific version like the one provided by ShadowRanger, which doesn't need to dynamically build a list of all property keys.

This implementation assumes that any missing properties should be set to null and that every object in each input array has the same properties (though properties can differ between arrays).

var a = [
    {id: 4, name: 'Greg'},
    {id: 1, name: 'David'},
    {id: 2, name: 'John'},
    {id: 3, name: 'Matt'},
];
var b = [
    {id: 5, name: 'Mathew', position: '1'},
    {id: 600, name: 'Gracia', position: '2'},
    {id: 2, name: 'John', position: '2'},
    {id: 3, name: 'Matt', position: '2'},
];

console.log(genericJoin(a, b));

function genericJoin(...input) {
    //Get all possible keys
    let template = new Set();
    input.forEach(arr => {
        if (arr.length) {
            Object.keys(arr[0]).forEach(key => {
                template.add(key);
            });
        }
    });

    // Merge arrays
    input = input.reduce((a, b) => a.concat(b));

    // Merge items with duplicate ids
    let result = new Map();
    input.forEach(item => {
        result.set(item.id, Object.assign((result.get(item.id) || {}), item));
    });

    // Convert the map back to an array of objects
    // and set any missing properties to null
    return Array.from(result.values(), item => {
        template.forEach(key => {
            item[key] = item[key] || null;
        });
        return item;
    });
}



Here's a generic O(n*m) solution, where n is the number of records and m is the number of keys. This will only work for valid object keys. You can convert any value to base64 and use that if you need to.

const join = ( keys, ...lists ) =>
    lists.reduce(
        ( res, list ) => {
            list.forEach( ( record ) => {
                let hasNode = keys.reduce(
                    ( idx, key ) => idx && idx[ record[ key ] ],
                    res[ 0 ].tree
                )
                if( hasNode ) {
                    const i = hasNode.i
                    Object.assign( res[ i ].value, record )
                    res[ i ].found++
                } else {
                    let node = keys.reduce( ( idx, key ) => {
                        if( idx[ record[ key ] ] )
                            return idx[ record[ key ] ]
                        else
                            idx[ record[ key ] ] = {}
                        return idx[ record[ key ] ]
                    }, res[ 0 ].tree )
                    node.i = res[ 0 ].i++
                    res[ node.i ] = {
                        found: 1,
                        value: record
                    }
                }
            } )
            return res
        },
        [ { i: 1, tree: {} } ]
         )
         .slice( 1 )
         .filter( node => node.found === lists.length )
         .map( n => n.value )

join( [ 'id', 'name' ], a, b )

This is essentially the same as Blindman67's answer, except that it adds an index object to identify the records to join. The records are stored in an array, and the index stores the position of the record for the given key set along with the number of lists it has been found in.

Each time the same key set is encountered, the node is found in the tree, the element at its index is updated, and the number of times it has been found is incremented.

Finally, the idx object is removed from the array with the slice, and any elements that weren't found in every list are filtered out. This makes it an inner join; you could remove this filter and have a full outer join.

Each remaining element is then mapped to its value, and you have the merged array.



From the extensive research I've done on this, there is no way to reduce the joining of two lists below O(n*m).

The classical solution, which I understand most databases use, is to create an index from the smaller list and then scan that index. This essentially pushes the O(n*m) "work" as far down the interpreter chain as possible; i.e., your OS/processor probably has a very optimized way of orchestrating list compares, so you get a performance boost from letting it do the job. This technically makes it O(n*m + n), but it should still be the most efficient.

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

const idx = a.reduce((prev, _) => {
    prev[_.id] = _
    return prev
}, {})

const result = b.reduce((prev, _) => {
  if( idx[_.id] !== undefined ){
    prev.push([_, idx[_.id]])
  }
  return prev
}, [])

Again, as far as I understand, this is the classic solution to this problem. Would love to be wrong.
