Tips on working with embedded arrays in Mongoose/Mongo

Why Mongoose and MongoDB?

MongoDB stores data as JSON-style documents (document-oriented storage). Mongoose is an object data modeling (ODM) library that bridges Node.js and MongoDB. Running Node.js as the server with Mongoose and MongoDB as the database layer is a popular choice. Its advantages include scalability, the simplicity of a document model that maps nicely onto programming-language data types, and no complex joins of the kind you would find in a relational database.

Queries and updates on a simple document can be straightforward, but as data becomes more complex and nested, they get more complicated, and finding the right query for a specific situation can be difficult. This post offers several tips for working with embedded arrays.

Example: an app that manages creation of meetings

Let's take a simple example: building out methods for interacting with a database that stores information for an app used to create meetings. Our database will have, at a bare minimum, a Users Collection and a Meetings Collection. Collections, in case you are just learning about MongoDB, are roughly the equivalent of a table in a Relational Database Management System (RDBMS). Collections hold documents, which are objects with key/value pairs. As with any plain old JavaScript object, the values can be primitives but can also be objects or arrays. This is quite different from an RDBMS, which spreads related data across tables linked by keys. Since there is no relational mapping in MongoDB, you will often store a lot of data in a nested way.

For example, here is what a document (object) in our Meetings Collection might look like. I'm using RoboMongo here, a tool that lets you interact with and visualize a MongoDB database.

This document has seven key/value pairs:

_id - the auto-generated id for the document
name - String: the name for the Meeting ('test2' in this case)
info - String: used for any meeting specific info
active - Boolean: if the meeting is active or not (might be over in which case we would make it inactive but keep the data)
messages - Array: an array of all messages about the meeting that have occurred in the meeting chats
users - Array: an array of people invited to the meeting
__v - the versionKey, a property Mongoose sets on each document (named __v by default). It is incremented when an embedded array is modified in certain ways and is used to keep two simultaneous updates from conflicting.
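
Written out as a plain JavaScript object, a document with these seven keys might look roughly like the sketch below. Only the name 'test2' comes from the example above; every other value (the id string, the info text, and the entries in messages and users) is a made-up placeholder, and the shape of the users entries is an assumption.

```javascript
// Sketch of one Meetings document as a plain object.
// All values except name are hypothetical placeholders.
var meeting = {
  _id: '552b2a7e0c9f1a2b3c4d5e6f', // an ObjectId in the real database
  name: 'test2',
  info: 'Weekly planning',
  active: true,
  messages: [
    { userId: '555', username: 'Joey', message: 'Hey, I can not make the meeting', date: new Date() }
  ],
  users: [
    { userId: '555', username: 'Joey' } // assumed shape for an invited user
  ],
  __v: 0 // Mongoose's default versionKey name is __v
};

// List the seven top-level keys
console.log(Object.keys(meeting));
```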

Pushing to embedded array

Let's focus on arrays, since updating the other key/value pairs is fairly trivial when they are primitive values. Suppose we want to update the database with a new message captured when someone submits it in the meeting chat room. We could do something like this:

Meetings.findById(MeetingId, function (err, meeting) {
  // handle errors ..
  meeting.messages.push({ userId: userId, username: username, message: message, date: new Date() });
  meeting.save(callback);
});

The MeetingId, userId, username, and message would be available to us from the communication we received from the client, whether by an HTTP POST request or using Socket.io.

You could also do this using the following syntax which is perhaps harder to read:

Meetings.update({ "_id": MeetingId },
    { $push: { "messages": message } },
    function (err, numAffected) {
      if (err) {
        // handle error
      } else {
        // do something depending on the number of documents affected
      }
    });

Again, this is equivalent to pushing to the embedded array as shown in the code above (provided message holds the full message object), though one could argue that it is harder to follow. Basically, we search for the document (object) with the desired meeting id (_id), find its messages array, and push our new message onto that array. The third argument to the update method is a callback that receives the number of documents affected: 0 for none, or 1 if our document was updated. This can be used to tell the client whether the document was successfully updated.
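
To make the "number of documents affected" idea concrete, here is a minimal in-memory sketch of what a $push against a single _id does. It is plain JavaScript, not Mongoose or MongoDB code; the pushToArray helper, the 'm1' id and the sample message are all hypothetical, and the helper assumes the target field already holds an array.

```javascript
// In-memory stand-in for a collection; NOT Mongoose or MongoDB code.
var meetings = [
  { _id: 'm1', name: 'test2', messages: [] }
];

// Hypothetical helper mimicking update({ _id: id }, { $push: { field: value } }).
// Returns the number of documents affected: 1 if a match was found, 0 otherwise.
function pushToArray(collection, id, field, value) {
  for (var i = 0; i < collection.length; i++) {
    if (collection[i]._id === id) {
      collection[i][field].push(value);
      return 1;
    }
  }
  return 0;
}

var newMessage = { userId: '555', username: 'Joey', message: 'Hello', date: new Date() };

console.log(pushToArray(meetings, 'm1', 'messages', newMessage));      // 1
console.log(pushToArray(meetings, 'missing', 'messages', newMessage)); // 0
console.log(meetings[0].messages.length);                              // 1
```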

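The find-modify-save pattern from the first example also works for editing a message that is already in the array, which is covered in the section on updating an embedded array: we find the meeting by its id, create a variable that points to the object in the messages array that we want to update, change the message on that object, then save the meeting back to MongoDB.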

Let's break this down a little bit more to understand the syntax. The basic syntax is the following:

Model.update(conditions, update, options, callback);

The first argument to the update method is conditions, i.e. the search query. The second is the update to apply, the third is an options object, and the fourth is a callback function. We could rewrite our Meetings.update call in a more readable way by first defining our conditions, update, options and callback.

var conditions = { "_id": meetingId };
var update = { $push: { "messages": message } };
// options not included in above code
var options = { multi: false };
var callback = function (err, numAffected) {
  // handle errors
  // numAffected is the number of updated documents
};

So, my method call would simply look like this:

Meetings.update(conditions, update, options, callback);

Updating an embedded array

Now, let's deal with something trickier: updating something that already exists within the embedded array. Suppose the object at index 0 of our messages array looks like this in our DB.

{ userId: '555',
  username: 'Joey',
  message: 'Hey, I can not make the meeting',
  date: 'Sun Apr 12 2015 22:34:40 GMT-0500 (PST)'
  }

Joey later realizes he can make the meeting and wants to update his message, which we allow on the client side.

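Meetings.findById(meetingId, function (err, meeting) {
  // handle errors ..
  // find Joey's message by his userId (takes the first match)
  var message = meeting.messages.filter(function (m) {
    return m.userId === userId;
  })[0];
  message.message = 'Hey, I can go now';
  // if messages is a plain (schema-less) array, Mongoose may also need
  // meeting.markModified('messages') to notice this change
  meeting.save(callback);
});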
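
The lookup-and-edit step can be tried out on a plain object, without a database. The sketch below is only an illustration: editFirstMessage is a hypothetical helper, and the second message (user '777', 'Ann') is invented sample data. It also guards against the case where the user has no message in the array. If the messages are Mongoose subdocuments, looking one up by its own _id with meeting.messages.id(...) is usually more precise than matching on userId.

```javascript
// Plain-object stand-in for a Meetings document; values are illustrative.
var sampleMeeting = {
  messages: [
    { userId: '555', username: 'Joey', message: 'Hey, I can not make the meeting' },
    { userId: '777', username: 'Ann', message: 'See you there' }
  ]
};

// Hypothetical helper: change the text of the first message written by userId.
// Returns true if a message was found and changed, false otherwise.
function editFirstMessage(meeting, userId, newText) {
  for (var i = 0; i < meeting.messages.length; i++) {
    if (meeting.messages[i].userId === userId) {
      meeting.messages[i].message = newText;
      return true;
    }
  }
  return false;
}

console.log(editFirstMessage(sampleMeeting, '555', 'Hey, I can go now')); // true
console.log(editFirstMessage(sampleMeeting, '999', 'Hello'));             // false
console.log(sampleMeeting.messages[0].message);                           // Hey, I can go now
```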

Use $addToSet to insert into an array if value does not exist

The $addToSet operator adds a value to an array unless the value is already present, in which case $addToSet does nothing to that array.

It takes this form:

Meetings.update(
{ _id: meetingId },
{ $addToSet: { messages: 'hey there' } }
)

In this example, we've changed the messages array to be an array of Strings rather than objects. $addToSet will check if 'hey there' already exists in the array and if it doesn't, it will add it to the messages array.
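
For string values, the behavior can be modeled in a few lines of plain JavaScript. This is a rough sketch of the semantics only, not how MongoDB implements the operator; the addToSet helper and the starting array are hypothetical.

```javascript
// Hypothetical helper mimicking $addToSet for primitive values such as strings.
// Returns true if the value was added, false if it was already present.
function addToSet(array, value) {
  if (array.indexOf(value) !== -1) {
    return false;
  }
  array.push(value);
  return true;
}

var messages = ['hello'];
console.log(addToSet(messages, 'hey there')); // true
console.log(addToSet(messages, 'hey there')); // false
console.log(messages);                        // [ 'hello', 'hey there' ]
```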

How about an array of objects?

$addToSet can also work with an array of objects but they need to be documents first. What does this mean? We need to first create a Schema for the objects that will exist in the array so that Mongoose can use special methods on them. It will also create a unique ID for each of the documents in the array. It would look like this for our meetings collection example:

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var Messages = new Schema({
  userId: String,
  username: String,
  message: String,
  date: Date
});

And to include our Messages Schema in our Meetings Collection, we could simply write our Meetings Schema like this (abbreviated for simplicity).

var Meetings = new Schema({
  _id: Schema.Types.ObjectId,
  name: String,
  messages: [Messages],
  ...
});

Here we have included one Schema inside of another, the Messages Schema in the Meetings Schema. This will allow us to use certain methods when performing queries and updates on the messages array that wouldn't otherwise be available. For instance, we can do an $addToSet operation that checks whether the message document (object) already exists and, if not, performs an insert. This could be useful to find out if we've already inserted the message in the DB, perhaps because the user clicked submit twice. Keep in mind that $addToSet compares whole objects, so a freshly generated _id or timestamp makes otherwise identical messages count as different. Another use might be to make a similar schema for Users and, when someone clicks to join the meeting, add them to the users array only if they are not already in it.
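
The whole-object comparison can be modeled in plain JavaScript. MongoDB's $addToSet treats an object as already present only when an existing element has the same fields and values in the same order; the sketch below approximates that with JSON.stringify, which is a simplification. The addObjectToSet helper and the joinedAt field are hypothetical.

```javascript
// Approximates MongoDB's whole-document comparison with JSON.stringify,
// which, like $addToSet, is sensitive to field order. A simplification only.
function addObjectToSet(array, doc) {
  var key = JSON.stringify(doc);
  for (var i = 0; i < array.length; i++) {
    if (JSON.stringify(array[i]) === key) {
      return false; // identical document already present
    }
  }
  array.push(doc);
  return true;
}

var users = [];
console.log(addObjectToSet(users, { userId: '555', username: 'Joey' })); // true
console.log(addObjectToSet(users, { userId: '555', username: 'Joey' })); // false: exact duplicate
// A differing field (here a hypothetical joinedAt value) defeats the duplicate check.
console.log(addObjectToSet(users, { userId: '555', username: 'Joey', joinedAt: 1 })); // true
console.log(users.length); // 2
```

In practice this means the duplicate check is only as good as the fields you send: matching on a stable identifier such as userId in the query conditions is one way to avoid relying on exact object equality.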

This concludes a brief foray into working with embedded arrays when using Mongoose and MongoDB.