This is a post that I have been putting off for some time now. noSQL is a great topic, but the more you look into it, the more there is. noSQL covers a lot of technologies and ideas, some very new and some that have been around for a bit. I’m hoping this post whets your appetite for knowing more about noSQL. One thing I will say to start off with: noSQL is a buzzword, plain and simple. With a buzzword, it’s up to you to determine whether the technology fits your needs or solves your problems.

As Tim Berglund said in his NoSQL Smackdown talk at NFJS Boston 2011, the term noSQL is a ‘via negativa’, a way of describing something by saying what it is not. In this case, all the name tells us about noSQL is that it doesn’t use a SQL interface to get to your data. If you think that leaves a lot of room for what counts as noSQL, you are right!

The term ‘noSQL’ seems to have two meanings from what I’ve seen online: ‘no SQL’ and ‘not only SQL’. I like ‘not only’, as it’s a less negative view of SQL. It’s a new take on how to store data that differs from the classic relational way, along with a new way of scaling. Because of this, people feel that noSQL is a great fit for the web, since it lets you scale out to what you need. I believe this last part is where the term (or buzzword) ‘web scale’ comes from. If you want a funny way to ensure your boss will never let you use noSQL/MongoDB, pass this video around at your work. Sadly, I did this :( (The video kind of falls under NSFW, but what’s not to love about cute animals talking?)

A good number of big sites have switched to noSQL or are using it for parts of their site. Foursquare, SourceForge, Etsy and The New York Times are using MongoDB for parts of their sites, along with a good number of others. Both Twitter and Facebook are using Cassandra for parts of their sites. Canonical is using CouchDB to back its Ubuntu One system (a service like Dropbox) in Ubuntu, and the BBC is using CouchDB to ensure they can keep the needed system uptime. It seems that some big players in the tech field also plan on switching over to noSQL to solve some of the issues they are having.

I’ve been looking into noSQL for a bit now, trying out some of the different software on small personal projects to get a better idea. So far I have used CouchDB, Redis and MongoDB, and looked into many more. I’m far from an expert, but I feel that I have a good feel for the database systems I have used so far. Both CouchDB and MongoDB fall under a type of noSQL database called document-oriented, as they save data as documents. Think of data that would normally end up spread across many tables and linked together by IDs, now kept together in one place, like information you find in real life on a paper invoice or a business card. This can make reading the data faster, as you only need to pull one document from the database. Documents are very flexible and don’t need to follow a schema to be placed in the database. This is a new way of working with data, and for some applications it works out really well. On a side note, with noSQL it seems more power is given to the developer, rather than needing a database admin to manage things. For a startup, where you are trying to push out your products and updates as fast as you can, this is a nice setup.
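To make the invoice idea concrete, here is a minimal Python sketch. The table contents, field names and the two helper functions are all made up for illustration; it only shows how rows linked by IDs can collapse into one nested document of the kind CouchDB or MongoDB would store.

```python
import json

# Relational style: one invoice spread across three "tables", linked by IDs.
# All names and values here are hypothetical sample data.
customers = {1: {"name": "Ada Example"}}
invoices = {100: {"customer_id": 1, "date": "2011-06-01"}}
line_items = [
    {"invoice_id": 100, "description": "Widget", "qty": 2, "unit_price": 5.00},
    {"invoice_id": 100, "description": "Gadget", "qty": 1, "unit_price": 12.50},
]


def build_invoice_document(invoice_id):
    """Gather everything about one invoice into a single nested document."""
    invoice = invoices[invoice_id]
    items = [
        # Drop the linking ID: inside the document it is no longer needed.
        {key: value for key, value in item.items() if key != "invoice_id"}
        for item in line_items
        if item["invoice_id"] == invoice_id
    ]
    return {
        "_id": str(invoice_id),
        "date": invoice["date"],
        "customer": customers[invoice["customer_id"]],
        "items": items,
    }


def invoice_total(document):
    """Everything needed for the total lives in the one document."""
    return sum(item["qty"] * item["unit_price"] for item in document["items"])


invoice_doc = build_invoice_document(100)
print(json.dumps(invoice_doc, indent=2))
print(invoice_total(invoice_doc))
```

Reading the invoice back is then a single lookup instead of a join across three tables, which is the trade the paragraph above describes.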

As for Redis, it is an in-memory key-value store. Most of the time you’ll see Redis used alongside other database systems to help speed things up. I’m not really going to talk about Redis. It’s not that I don’t like it, but it’s a different way of dealing with data than CouchDB and MongoDB, and I don’t think it fits into this post. It’s still something you should look into, as it’s a great way to speed up your application or deal with data that you don’t need saved to disk.
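As a rough idea of what "used alongside other database systems" can look like, here is a small Python sketch of the cache-aside pattern. A plain dict stands in for Redis, and `slow_database_lookup` is a hypothetical placeholder for the main database, so this is only the shape of the idea and not real Redis client code.

```python
import time

cache = {}  # stand-in for Redis: key -> (value, expiry timestamp)
stats = {"database_reads": 0}  # counts trips to the slow primary database


def slow_database_lookup(user_id):
    """Hypothetical primary database; every call counts as one slow read."""
    stats["database_reads"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id, ttl_seconds=60.0):
    """Cache-aside: try the in-memory store first, fall back to the database."""
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]  # cache hit, no database trip
    value = slow_database_lookup(user_id)
    cache[key] = (value, time.monotonic() + ttl_seconds)
    return value


first = get_user(42)   # miss: goes to the database
second = get_user(42)  # hit: served from memory
print(stats["database_reads"])
```

The second read never touches the primary database, which is where the speed-up comes from.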

Playing around with both CouchDB and MongoDB, you get a good feeling for each project’s intentions. To me, MongoDB feels a lot like MySQL, both on the server side and in how you go about using it in your applications. You need a driver to be able to talk to the database, but you will most likely be able to find one for your language. When dealing with data from MongoDB, most of the time you are using BSON, a binary-encoded form of JSON documents (Binary JSON). MongoDB seems to have worked on its idea of scaling by using things like sharding to let you scale horizontally across multiple nodes. People are saying that MongoDB is going to become the MySQL of the noSQL world, and I think they are on the right path to do so.
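For a feel of what sharding means, here is a toy Python sketch of hash-based partitioning. It is not how MongoDB routes data internally; `SHARD_COUNT`, `shard_for`, `distribute` and the sample check-in documents are all assumptions made up for this illustration. The point is only that a shard key decides which node holds each document.

```python
import hashlib
from collections import defaultdict

SHARD_COUNT = 3  # hypothetical number of nodes


def shard_for(shard_key, shard_count=SHARD_COUNT):
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(documents, key_field):
    """Group documents by the shard their key hashes to."""
    shards = defaultdict(list)
    for document in documents:
        shards[shard_for(document[key_field])].append(document)
    return shards


# Made-up sample data: ten check-ins, sharded by user name.
checkins = [{"user": f"user{i}", "venue": "cafe"} for i in range(10)]
shards = distribute(checkins, "user")
for shard_id in sorted(shards):
    print(shard_id, len(shards[shard_id]))
```

Because the same key always hashes to the same shard, a lookup by user only has to ask one node, and adding nodes spreads the data further.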

As for CouchDB, they have taken a new approach. They use HTTP/RESTful API calls to talk to the database, which means any language able to use RESTful APIs is also able to use CouchDB. You deal with pure JSON data, just like some APIs found on the web. Because CouchDB uses HTTP, you can get some interesting system designs for scaling by using replication. There are some projects that allow CouchDB to scale horizontally, similar to MongoDB, but this is not built in.
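Since it is all plain HTTP and JSON, you can talk to CouchDB with nothing but a standard library. Below is a minimal Python sketch that builds the requests for storing and fetching a document. It assumes a CouchDB server on localhost at the default port 5984 and an existing database called `invoices`; the database name, document ID and helper function names are made up, and a real server will likely also want credentials. The requests are only built here, so the script runs without a server, and the `send` helper is there for when you have one.

```python
import json
import urllib.request

COUCHDB_URL = "http://localhost:5984"  # assumed local server, default port


def build_put_document(database, doc_id, document):
    """Build the HTTP request that stores a JSON document under doc_id."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{COUCHDB_URL}/{database}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_get_document(database, doc_id):
    """Build the HTTP request that reads the document back."""
    return urllib.request.Request(
        url=f"{COUCHDB_URL}/{database}/{doc_id}",
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


put_request = build_put_document(
    "invoices", "invoice-100", {"customer": "Ada Example", "total": 22.5}
)
get_request = build_get_document("invoices", "invoice-100")
print(put_request.get_method(), put_request.full_url)
print(get_request.get_method(), get_request.full_url)
```

With a server running, `send(put_request)` followed by `send(get_request)` would store the document and read it back. No driver is involved, just a URL, an HTTP verb and a JSON body.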

As for my opinion on CouchDB vs MongoDB, I like CouchDB, as it seems like a new approach to doing things. For the type of projects I plan on working on, CouchDB seems to fit in better than MongoDB. But the thing that really got me looking into CouchDB was this quote:

“Let me tell you something: Django may be built for the Web, but CouchDB is built of the Web. I’ve never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated.” - Jacob Kaplan-Moss, Creator of Django. http://jacobian.org/writing/of-the-web/

I believe that all software is going to shift onto the web soon. You see a lot of this idea being used by Google and others. Knowing that CouchDB is made “of the web”, I know that it’s not going to hold me back in my applications for the web. CouchDB has a lot of cool things within it besides just storing your data. Right now I’m trying to learn as much as I can about CouchDB, and I’ve even started going to a CouchDB meetup in Boston. This is a technology that I want to start using more, and it just seems fun to work with.

If you have any questions or comments, please post them. Any suggestions on improving this are also welcome.