Archive for March, 2009

First impressions of CouchDB

Sunday, March 29th, 2009

I’m spending my weekend playing with CouchDB. It’s a “document” database (storing JSON objects) which is pitching itself as more suitable for some web applications than traditional relational databases like MySQL. There are other resources that explain it well so I won’t go into its feature set; this is more a note of what I thought of it after a play.

The first thing you notice is that the built-in UI is pretty slick. Think of a really slimmed down “phpMyAdmin” that’s built into the database. This let me start playing around with the new concepts of CouchDB (documents, map/reduce) without getting bogged down with hooking it up to PHP.

I started creating some documents and it hits you pretty fast that this isn’t anything like MySQL. All the documents are stored within a database (you can have many documents inside a database, and many databases) but there’s no type-differentiation between them: it really is just a big bag full of JSON objects. The convention is to create a “type” property on your documents yourself and once I did that things started to click better.
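To make the “big bag of JSON objects” idea concrete, here is a minimal sketch in plain JavaScript. The documents, their fields, and the countByType helper are all made up for illustration; nothing here is a CouchDB API, and the “type” property only means something because I’ve agreed with myself that it does.

```javascript
// Hypothetical documents living side by side in one database.
// CouchDB itself does not know or care what "type" means.
const exampleDocs = [
  { _id: "a1", type: "person", name: "Alice" },
  { _id: "b2", type: "invoice", total: 120 },
  { _id: "c3", type: "person", name: "Bob" },
];

// Illustrative helper: count documents per "type" value.
// Documents without the property are grouped under "(untyped)".
function countByType(documents) {
  const counts = {};
  for (const doc of documents) {
    const type = doc.type || "(untyped)";
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

console.log(countByType(exampleDocs));
```

The point is that the “person” and “invoice” documents have completely different shapes and nothing stops them sharing a database.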

So having put some documents in, how the heck do I query them? There’s no SQL; instead you need to create views, which use two functions (a “map” and an optional “reduce”) to provide the logic for the query. It took me a good few hours to read up on this and understand the basics. The idea is that for a given view, every document is passed into your “map” function one at a time, and it’s up to your logic (programmed in JavaScript) to decide whether any data from that document is added to the view’s output. A simple example of this is a view to extract documents of a certain “type” (assuming I’ve used the convention of giving my documents a type property).

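function(doc){
  // Only documents following my "type" convention as people are included
  if(doc.type === "person"){
    // key: the document's _id, value: the whole document
    emit(doc._id, doc);
  }
}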

This map function will “emit” (add to the view’s output) all my “person” documents, indexed by the _id property (a unique identifier that each document must have, like a primary key; CouchDB generates a UUID if you don’t supply one). This gives you a fine level of control over the output data (I think more than SQL would) because you’ve got a full programming language available to define the logic, although this worries me a little: with great power comes great responsibility! It took me a while to understand that there will be no query logic in my application any more. That’s a double-edged sword I suppose (good: no dodgy SQL written by juniors; bad: you lose the flexibility of writing arbitrary queries).
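It helped me to picture what the database is doing with that function. The following is only a rough simulation in plain JavaScript, not how CouchDB is actually implemented: buildView, sampleDocs and personMap are names I’ve invented, emit is passed in as a parameter here (in CouchDB it is simply available inside the map function), and the simple key sort stands in for CouchDB’s own collation rules.

```javascript
// Rough simulation of a view: run the map function over every document,
// collect whatever it emits, then order the rows by key.
function buildView(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    // Each emitted row remembers which document it came from.
    const emit = (key, value) => rows.push({ id: doc._id, key, value });
    mapFn(doc, emit);
  }
  // Assumption: plain comparison is enough for these string keys.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Made-up documents, deliberately stored out of key order.
const sampleDocs = [
  { _id: "c3", type: "person", name: "Bob" },
  { _id: "b2", type: "invoice", total: 120 },
  { _id: "a1", type: "person", name: "Alice" },
];

// Same logic as the map function above, with emit passed in explicitly.
const personMap = (doc, emit) => {
  if (doc.type === "person") {
    emit(doc._id, doc);
  }
};

const rows = buildView(sampleDocs, personMap);
console.log(rows.map((row) => row.key)); // [ 'a1', 'c3' ]
```

The invoice never makes it into the output, and the two people come back ordered by key, which is roughly what “indexed by the _id property” means in practice.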

I tried to set up a sample database for a project I’m working on and got stuck really fast. Getting your head around “documents” when you’ve been in “relational database” mode for 10 years is difficult. I found a useful trick was to think about how I’d implement my project in the offline world, with paper-based forms and documents. It definitely feels “wrong” to be breaking the database rules and I think it’s going to take some more time to understand the best ways to set up my data in documents.

I do really like the general concept of a document/object database though and I think it’s a better fit for the majority of the web application work that I do. Currently I feel I’m doing a disproportionate amount of fighting with my relational database (MySQL), particularly on very active projects where I’m often touching the database to implement new functionality; I’m hoping that something like CouchDB will “go with the grain” better.

It’s definitely something I’ll be spending some more time getting to know.

Roy Fielding: English, motherfucker, do you speak it?

Thursday, March 26th, 2009

When I think of a REST web-service, simplicity springs to mind. I understand the web and HTTP, and my development languages (PHP & JavaScript) allow me to easily interact with them. Compared to trying to get my head around “Enterprise SOAP”, and especially from within PHP, REST is easy. Or so I thought.

I stumbled onto a blog post by Roy Fielding in which he bitch-slaps people who call their API “RESTful” when, according to him, it’s not. One might think that as the person who defined the “REST” acronym he is the authoritative voice on the matter. However, I challenge you to read that post and understand what he’s talking about.

A REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification. [Failure here implies that identification is not separated from interaction.]

Insert image of Samuel L. Jackson here!

Fielding posted a response to the criticism of his language, in which he explains (how kind!) that he’s directing his blog post at “specialists”, who in his eyes, should be able to understand it.

However, when I send out a message to API designers, I expect the audience to be reasonably competent in the field. I have to talk to them as a specialist because I want them to understand, as specialists themselves, exactly what I am trying to convey and not some second-order derivatives.

What use is criticising somebody when you do so in language that makes it difficult for them to understand your points? Presumably if the great unwashed have failed to grok the dissertation, then they’re probably not going to react well to more of the same “academic speak”, and the criticism will either fall on deaf ears or not be fully understood (if at all), which totally defeats its purpose.

I think he also contradicts himself in his response. He doesn’t want people to understand “second-order derivatives”, yet he’s pleased that there are enough clever people out there who can explain it to the thickies (thus creating second-order derivatives).

Fortunately, there are more than enough people who are specialist enough to understand what I have written (even when they disagree with it) and care enough about the subject to explain it to others in more concrete terms, provide consulting if you really need it, or just hang out and metablog.

But isn’t this exactly the communication problem that causes people to misunderstand REST? All the designers of RESTful services who don’t understand him will now rely on other people’s translations (i.e. second-order derivatives), which may make the message fuzzy and wrong.

People who create APIs are often not academics. Why the fuck couldn’t he just explain himself in plain English, with some nice “for dummies” examples, and save us all some time? It angers me that instead of saying, “Wow, a lot of people who want to understand me didn’t understand my post; let me rephrase and give you some examples…”, he responds by patronising us. What a dick.