One common task when working with CouchDB is to find out whether a document with a given ID exists. A simple solution is to send an HTTP GET request with the ID to CouchDB and check the response’s HTTP status code.
A GET request, executed for example with curl
curl http://localhost:5984/mydatabase/mydocumentid
returns the document with HTTP status code 200 if it is found. If the document does not exist, CouchDB returns HTTP status code 404 with a short JSON error body instead.
While this works well, it is not the most efficient solution. If the request is successful, the whole document is read from the database, put into the HTTP response and sent over the network, even though the receiver is not interested in the document itself. In many cases this might not be a problem, but imagine a document of several megabytes sent over a slow internet connection.
It can be worse when you’re using one of the many Java APIs for CouchDB. They usually offer high-level APIs for GET requests which do the JSON parsing of the response body automatically – so, even more time is lost parsing the JSON String to create a Java object which isn’t needed.
A better approach is to use HTTP HEAD requests to check for a document’s existence. A HEAD request returns a response that has only header information and no body, so responses are always small. Additionally, CouchDB can handle the request very efficiently: to fill the response’s headers, only document metadata such as the revision number or size needs to be determined. There is no need to read the document itself.
A HEAD request with curl
curl --head http://localhost:5984/mydatabase/mydocumentid
returns (in the case of success) a response like
HTTP/1.1 200 OK
Server: CouchDB/1.2.0 (Erlang OTP/R15B)
ETag: "1335-788fe7a4e1323206fcd0df76dcabd163"
Date: Thu, 28 Feb 2013 19:41:37 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 2705
Cache-Control: must-revalidate
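The same check can be sketched in plain Java. The example below is a minimal illustration, not the code of any particular CouchDB library: it assumes the JDK's built-in HttpURLConnection is acceptable, and the class name DocumentExistence and method name exists are made up for this sketch. It treats 200 as "exists", 404 as "does not exist", and anything else as an error.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class; the name is not taken from any CouchDB library.
final class DocumentExistence {

    /**
     * Sends a HEAD request to the given document URL.
     * Returns true for HTTP 200, false for HTTP 404,
     * and throws an IOException for any other status code.
     */
    static boolean exists(String documentUrl) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(documentUrl).openConnection();
        try {
            // HEAD asks for the headers only, so no document body is transferred.
            connection.setRequestMethod("HEAD");
            int status = connection.getResponseCode();
            if (status == HttpURLConnection.HTTP_OK) {
                return true;
            }
            if (status == HttpURLConnection.HTTP_NOT_FOUND) {
                return false;
            }
            throw new IOException(
                    "Unexpected HTTP status " + status + " for " + documentUrl);
        } finally {
            connection.disconnect();
        }
    }
}
```

With a local CouchDB running, a call such as DocumentExistence.exists("http://localhost:5984/mydatabase/mydocumentid") would perform the same request as the curl command above. Throwing on unexpected status codes is a design choice of this sketch; it keeps authentication or server errors from being silently reported as "document missing".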
Note that some of the Java APIs (Ektorp, LightCouch) offer methods to check for a document’s existence (for example CouchDBClient.contains in LightCouch) which use HEAD requests internally. Others (JCouchDB, JRelax, CouchDB4J) don’t. Fortunately they’re open source, and it’s quite straightforward to implement the missing methods yourself using the underlying Apache HTTPComponents or Restlet APIs.
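As a rough sketch of what such a self-written method could look like, the following example keeps the actual HTTP call behind a small interface, so that it could be backed by whichever client the library already uses. All names here (HeadRequester, ExistenceChecker, contains) are hypothetical and do not come from JCouchDB, JRelax, or CouchDB4J. How the HEAD request is executed with HTTPComponents or Restlet is deliberately left out.

```java
import java.io.IOException;
import java.net.URLEncoder;

/** Hypothetical abstraction over the HTTP client a library already uses. */
interface HeadRequester {
    /** Sends a HEAD request to the URL and returns the HTTP status code. */
    int head(String url) throws IOException;
}

/** Hypothetical helper that adds an existence check on top of a HeadRequester. */
final class ExistenceChecker {
    private final String databaseUrl;
    private final HeadRequester http;

    ExistenceChecker(String databaseUrl, HeadRequester http) {
        // Make sure exactly one slash separates the database URL and the document ID.
        this.databaseUrl = databaseUrl.endsWith("/") ? databaseUrl : databaseUrl + "/";
        this.http = http;
    }

    /** Returns true for HTTP 200, false for HTTP 404, and throws otherwise. */
    boolean contains(String documentId) throws IOException {
        // URLEncoder produces form encoding, where a space becomes "+";
        // inside a URL path a space has to be written as "%20" instead.
        String encodedId = URLEncoder.encode(documentId, "UTF-8").replace("+", "%20");
        int status = http.head(databaseUrl + encodedId);
        if (status == 200) {
            return true;
        }
        if (status == 404) {
            return false;
        }
        throw new IOException("Unexpected HTTP status " + status);
    }
}
```

Because the HTTP call is injected, the status-code logic can be exercised without a running CouchDB, for example by passing a lambda that returns a fixed status code.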