I am using MongoDB v3.2 with the native Node.js driver v2.1. When I run an aggregation pipeline on large data sets (1 million+ documents), I get the following error:

 'aggregation result exceeds maximum document size (16MB)'

Here is my aggregation pipeline code:

var eventCollection = myMongoConnection.db.collection('events');
var cursor = eventCollection.aggregate([
    {
        $match: {
            event_type_id: {$eq: 89012}
        }
    },
    {
        $group: {
            _id: "$user_id",
            score: {$sum: "$points"}
        }
    },
    {
        $sort: {
            score: -1
        }
    }
], {
    cursor: {
        batchSize: 500
    },
    allowDiskUse: true,
    explain: false
}, function () {

});

Things I've tried:

// Using cursor event listeners. None of the listeners seem to fire; I always get the 16MB error.
cursor.on("data", function (data) {
   console.log("Some data: ", data);
});
cursor.on("end", function (data) {
   console.log("End of data: ", data);
});

// Using forEach, which I thought would allow results over 16MB because it is used together with batchSize and the cursor.
cursor.forEach(function (item) {

})

I've seen in other answers (How could I write aggregation without exceeds maximum document size?) that I need to have the results returned by a cursor, so how do I properly do that? I just can't seem to get it to work. Any suggestions on what the batchSize should be?

I am using the native mongodb package (https://github.com/mongodb/node-mongodb-native) in a Node.js project, not the mongo command-line shell.

  • The .forEach() has no place here since an "aggregation cursor" is actually just a node "stream" interface, and therefore only the "data" event is actually doing anything. However if that is how your code is actually set up then this would suggest that you have a MongoDB 2.4 or lower server instance, which of course does not support a "cursor/stream" response. I would therefore suggest you upgrade right away, since that would be a "very old" release now. Commented Apr 13, 2016 at 5:20
  • stackoverflow.com/questions/29644587/… Commented Apr 13, 2016 at 6:41
  • I have verified my MongoDB instance is version 3.2.1 Commented Apr 13, 2016 at 14:34
  • I don't believe you. The server is clearly not a capable version or you are not in fact using the node native driver that you claim to be using, or your actual code has different syntax usage to what is here. This simply does not reproduce and the error indicates that any cursor options are being ignored. Commented Apr 14, 2016 at 1:44

1 Answer

OK, I figured it out. It was not working because I was passing a callback function as the last argument to the aggregate method. Passing null instead allowed the stream to work as expected. The change is shown below:

var cursor = eventCollection.aggregate([
    {
        $match: {
            event_type_id: {$eq: 89012}
        }
    },
    {
        $group: {
            _id: "$user_id",
            score: {$sum: "$points"}
        }
    },
    {
        $sort: {
            score: -1
        }
    }
], {
    cursor: {
        batchSize: 500
    },
    allowDiskUse: true,
    explain: false
}, null); // null instead of a callback, so the returned cursor can be streamed
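
With a cursor returned, the results can be read through its stream events. Here is a minimal sketch, assuming the cursor emits 'data', 'end' and 'error' like any Node.js readable stream. consumeCursor is a hypothetical helper name, not part of the driver:

```javascript
// Hypothetical helper (not part of the MongoDB driver): consume a
// stream-style cursor one document at a time.
// onDoc(doc) is called for each document; done(err, count) is called once.
function consumeCursor(cursor, onDoc, done) {
    var count = 0;
    var finished = false;

    // Guard so that done() fires only once, even if both
    // 'error' and 'end' are emitted.
    function finish(err) {
        if (finished) { return; }
        finished = true;
        done(err || null, count);
    }

    cursor.on('data', function (doc) {
        count += 1;
        onDoc(doc);
    });
    cursor.on('error', finish);
    cursor.on('end', function () { finish(null); });
}

// Usage sketch (assumes `cursor` is the value returned by aggregate above):
// consumeCursor(cursor, function (doc) {
//     console.log(doc._id, doc.score);
// }, function (err, count) {
//     if (err) { return console.error(err); }
//     console.log('Processed ' + count + ' documents');
// });
```

Because each document is handled as it arrives, the full result set never has to fit into a single 16MB response document.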