Stellar use cases for MongoDB

28 July 2011

MongoDB has a nice wide sweet spot where it’s a very useful persistence platform; however, it’s not for everything. I thought I would quickly enumerate a few great use cases that have come up in the last year and a half, and why they are such a great fit for MongoDB.

Documents: Using MongoDB instead of an XML-based system.

MongoDB is a document-oriented data store. XML is a document language. By moving a traditional XML app to MongoDB one can gain a few key advantages. The typical pattern in XML is to fetch an entire document, work with it, and put it back on the server. This approach has many downsides, including the amount of data transmitted over the wire, collision detection/resolution, data set size, and server-side overhead. In the MongoDB model, a single document can be updated atomically in place, fetched by index, and even partially fetched. As a result, applications can be simpler, faster, and more robust.
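
To make the contrast concrete, here is a minimal sketch in plain JavaScript. It builds a hypothetical `report` document and compares the approximate size of writing the whole document back with the size of a targeted `$set` update. The collection name `reports` and every field name are assumptions for illustration, and JSON length only approximates what actually travels over the wire as BSON.

```javascript
// Hypothetical document; all field names are illustrative only.
const report = {
  _id: 42,
  title: "Q2 summary",
  status: "draft",
  sections: Array.from({ length: 200 }, (_, i) => ({
    heading: "Section " + i,
    body: "lorem ipsum ".repeat(50),
  })),
};

// Fetch-modify-replace style: the whole document travels back to the server.
const fullReplacePayload = JSON.stringify(report);

// Targeted style: a selector plus an update document naming only the changed field.
// In the mongo shell these would be passed as: db.reports.update(selector, update)
const selector = { _id: 42 };
const update = { $set: { status: "published" } };
const targetedPayload = JSON.stringify([selector, update]);

// Partial fetch: a projection asking for two fields instead of the whole document.
// In the mongo shell: db.reports.find(selector, projection)
const projection = { title: 1, status: 1 };

console.log("full replace, approx chars:", fullReplacePayload.length);
console.log("targeted update, approx chars:", targetedPayload.length);
```

The targeted update also sidesteps most of the collision problem: two writers changing different fields of the same document do not overwrite each other's work, because neither one sends the whole document back.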

Metadata storage systems.

Any system that stores metadata can be a great use case for MongoDB. Such systems typically follow a pattern of adding attributes about some type of entity, and then needing to query/sort/filter based on those attributes. The prototypical use case for such a system is tags. The tag implementation in MongoDB is so superior that it almost single-handedly compels one to use MongoDB for any system needing tags. Simply put:

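// Store a metadata document; tags is a plain array
db.mymetadata.save({stuff:"some data here", thing:"/x/foo/bar.mpg", tags:['cats','beach','family']})
// Index the array field; each tag gets its own index entry (a multikey index)
db.mymetadata.ensureIndex({"tags":-1})
// Look up by a single tag; explain() reports the index bounds used
db.mymetadata.find({tags:'cats'}).explain()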
...
"indexBounds" : {
		"tags" : [
			[
				"cats",
				"cats"
			]
		]
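
The index bounds above collapse to the single key "cats", which is what makes the tag lookup cheap. As a rough mental model, here is a toy multikey index in plain JavaScript. It is only a sketch: MongoDB stores its indexes as B-trees rather than hash maps, and the helper `buildTagIndex` and the sample documents beyond the first are hypothetical.

```javascript
// Toy model of a multikey index: one index entry per array element.
function buildTagIndex(docs) {
  const index = new Map(); // tag -> array of _id values
  for (const doc of docs) {
    for (const tag of doc.tags || []) {
      if (!index.has(tag)) index.set(tag, []);
      index.get(tag).push(doc._id);
    }
  }
  return index;
}

// Hypothetical sample documents, shaped like the one saved above.
const docs = [
  { _id: 1, thing: "/x/foo/bar.mpg", tags: ["cats", "beach", "family"] },
  { _id: 2, thing: "/x/foo/baz.jpg", tags: ["dogs", "beach"] },
  { _id: 3, thing: "/x/foo/qux.jpg", tags: ["cats"] },
];

const tagIndex = buildTagIndex(docs);

// Similar in spirit to find({tags: 'cats'}): one key lookup, no collection scan.
const catIds = tagIndex.get("cats") || [];
console.log(catIds); // ids 1 and 3
```

A relational design would usually need a separate tag table and a join to answer the same question; here the array and its index live with the document.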

In many metadata systems the schema varies with the metadata itself. Because MongoDB does not enforce a fixed schema on a collection, this allows for a huge degree of flexibility in the data modeling of applications that store metadata. Imagine a media storage service that stores video and image metadata in the same collection, with different attributes for each type of media. No joins are needed at query time, and the logical I/O path is minimized! MongoDB now supports sparse indexes, so an index on an attribute that is not present in every document only holds entries for the documents that have it, which keeps the index small.
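
The effect of a sparse index can be sketched with a toy model in plain JavaScript. The collection contents, the `durationSec` field, and the helper `buildSparseIndex` are all assumptions made up for this illustration. In the mongo shell, the real index would be requested with something like `db.media.ensureIndex({durationSec: 1}, {sparse: true})`, where `media` is again a hypothetical collection name.

```javascript
// Hypothetical mixed collection: videos carry durationSec, images do not.
const media = [
  { _id: 1, kind: "video", thing: "/x/foo/bar.mpg", durationSec: 95 },
  { _id: 2, kind: "image", thing: "/x/foo/a.jpg", width: 800, height: 600 },
  { _id: 3, kind: "image", thing: "/x/foo/b.jpg", width: 640, height: 480 },
  { _id: 4, kind: "video", thing: "/x/foo/c.mpg", durationSec: 30 },
];

// Toy sparse index: only documents that actually have the field get an entry.
function buildSparseIndex(docs, field) {
  return docs
    .filter((doc) => Object.prototype.hasOwnProperty.call(doc, field))
    .map((doc) => ({ key: doc[field], id: doc._id }))
    .sort((a, b) => a.key - b.key); // keep entries ordered by key, like an index
}

const durationIndex = buildSparseIndex(media, "durationSec");
console.log(durationIndex.length, "index entries for", media.length, "documents");
```

Only the two video documents produce entries, so the index grows with the number of videos rather than with the size of the whole collection.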

Read-intensive systems.

Any system where the write volume is low and the read volume is high is a nice sweet spot for MongoDB. MongoDB scales reads nicely both with replica sets (setting SLAVE_OK so that reads can go to slaves) and with sharding. Combine this with the document model and the metadata storage capabilities, and one has an excellent system for, say, a Gawker clone. Reads can be spread across N shards by, say, story_id, and can be geographically targeted to a nearby slave. Keep your data clustered by key for super fast I/O.
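
The routing idea can be sketched in a few lines of plain JavaScript. This is only a mental model: in a real deployment the routing is done by MongoDB itself and by the driver, not by application code, and the chunk ranges, host names, regions, and helper functions below are all hypothetical.

```javascript
// Hypothetical chunk map: each shard owns a contiguous range of story_id values.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" },
];

// Hypothetical replica set members per shard, labeled by region.
const members = {
  shard0: [{ host: "s0-a", region: "us", primary: true }, { host: "s0-b", region: "eu", primary: false }],
  shard1: [{ host: "s1-a", region: "us", primary: true }, { host: "s1-b", region: "eu", primary: false }],
  shard2: [{ host: "s2-a", region: "us", primary: true }, { host: "s2-b", region: "eu", primary: false }],
};

// Find the shard whose range contains the key (min inclusive, max exclusive).
function shardFor(storyId) {
  return chunks.find((c) => storyId >= c.min && storyId < c.max).shard;
}

// Prefer a slave in the reader's region; fall back to the shard's primary.
function readTarget(storyId, region) {
  const nodes = members[shardFor(storyId)];
  const local = nodes.find((n) => !n.primary && n.region === region);
  return (local || nodes.find((n) => n.primary)).host;
}

console.log(readTarget(1500, "eu")); // s1-b
console.log(readTarget(10, "asia")); // s0-a
```

Because replication is asynchronous, a read served by a slave can lag slightly behind the primary, which is usually an acceptable trade for a read-heavy site where stories rarely change.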