From Single-Document Updates to Multiple-Document Mayhem

When creating or supporting HTTP APIs for database operations, the technique is to connect HTTP methods to database operations. The widely accepted mapping is as follows:

  • GET: Find
  • POST : Insert/Create
  • PUT: Update
  • DELETE: Delete/Remove

(Unless you’ve adopted GraphQL, in which case, what in Middle-earth?!)

In general, everyone agrees on how POST, GET, and DELETE operations are supposed to behave. There isn’t much room for ambiguity there.

But PUT? It’s like a shape-shifting Balrog from the depths of Moria. As long as you have to support the application, it will find new and inventive ways to torment you. Scratch that – even if you are beyond the world of the living, it will find a way to remind you of your failure.

Hold on; we are getting ahead of ourselves. Let us venture into the realm of data, databases, and web services that expose HTTP APIs for managing data. We shall make an effort to understand why things can go so wrong (or so right) with PUT.

When building a POST service, it is linked to a create or insert operation on the database side. You send a payload – JSON/XML/(insert whatever you like) – and it’s clear what needs to be done with it: insert it into the database. If the payload is an array of documents, it inserts multiple documents. It’s straightforward and simple.

Similarly, GET operations fetch documents based on a filter, and DELETE operations remove them. Still simple and easy to understand.

However, PUT operations are not so simple. It’s not just a matter of sending a document identifier and a document. Well, it is most of the time. But the first decision point is -  do you update the whole document or just parts of it? What exactly needs to be done? What if multiple documents need to be updated based on a filter? It might sound like complaining, and trust me, I am!

Let’s take a step back and design PUT APIs from the ground up.

A document needs to be updated. The question is: do you want to update the entire document or just parts of it?

As a rule of thumb, always update the document ONLY for the values that are present in the document. NEVER replace the entire document.
If replacing the entire document is necessary, explicitly ask the user for confirmation, like a prompt. We’ll get to that in a moment.

A simple PUT (or not!)

Consider a document with the fields _id, item, and qty, and we want to update the qty field from 20 to 30. The question is, which is the better option,

  • Send the whole document with the updated value for qty , OR
  • A document with only the qty field with the updated value?

API requests for both these approaches would probably look like this,

Option 1, which involves sending the whole document, is the most common implementation. However, it is not the optimal approach when we consider performance and cost. Why send the entire document over the network if we only need to update one field? This takes up bandwidth and requires additional processing at the backend, which would not be ideal under heavy load.

Option 2, on the other hand, is a better choice. By sending only the updated field, we reduce the amount of data transmitted and have a lighter API footprint. Assuming we are using MongoDB, we can use the updateOne operation with $set to update the document:

To support replacing the whole document, we can introduce a query parameter replace=true/false for the PUT API, with the default value being false. This is the explicit prompt for the API consumer to indicate that they want to replace the whole document. If they wish to, they must set the query parameter to replace=true to replace the whole document.

It would also be helpful to support upsert operations, which attempt to update a document if it exists or create a new document if it doesn’t. We can add another query parameter, upsert=true/false, with the default option being false.

In summary, we can implement a PUT API that solves most problems by using a few query parameters and safe default options. But of course, as with anything in life, there may be exceptions!

Multiple document updates

What if you need to update multiple documents? Making multiple API calls is inefficient; at times, it is like shooting yourself in the foot.

As embarrassing as it may sound, without going into the details, let me tell you that we ended up DDOS-ing ourselves because our front end just kept making multiple PUT API calls. Luckily we caught it during testing. Or did we?

The preferred approach, and a logical one, is to use a PUT request with an array of documents.

This still ends up calling multiple update operations on the DB. But still manages to reduce the API load and can support query parameters like upsert and replace.

In the above case, we are updating multiple documents with different values. But, what if we want to update multiple documents with the same value, for e.g., if item==box update qty to 30. In this case, we have an even simpler variation of multiple-document update by introducing a filter param.

This, ideally, would be one DB call. Taking the MongoDB example, the update operation would look like this.

Since filter={} matches all documents and is a valid filter, just like replace and upsert parameters, the safer approach is not to use filter if it is not explicitly set.

Multiple, multiple document updates

We have encountered an unusual requirement for PUT API, where multiple updates on multiple documents must be supported in a single API call. For example, setting the qty of all item==box to 30 and all item==bag to 50. To address this requirement, we can introduce a key parameter.

For each document in the payload, use the value of item as filter criteria and update the documents.

We can rewrite the multiple-update API that used the _id in this format as,

Just that, in this case, only one document gets updated per row.

Approaches for PUT

Query params

replace=true/false If set to true, the whole document will be replaced with the payload. Otherwise, only the specified fields will be updated.
upsert=true/false If set to true, the document will be updated if it exists; otherwise, the payload will be inserted as a new document.
filter={…} This filter matches multiple documents for a bulk update.

Variations

API DESCRIPTION
PUT /:id {…} Update one document with the specified ID
PUT / [{_id, …}, {_id, …}] Update multiple documents with the same structure
PUT /?filter={…} {…} Update multiple documents with a filter
PUT /?key=<key> [{<key>, …}, {<key>, …}] Update multiple documents based on a key-value pair

Currently reading J. R. R. Tolkien’s Lord of the Rings. Love the book. Hence the unsolicited LotR references.


./J