Last week, I reviewed a pull request from one of my colleagues and found a totally wrong implementation of a mutex lock based on DynamoDB. After pointing it out, I realized there are very few blog posts about the correct implementation of a mutex lock using Amazon DynamoDB, and I personally believe it is a really common use case to share.
In this blog post, I will present the “wrong” implementation, and the actual correct code we have written together during the review.
A bit of theory: optimistic lock vs mutex lock
Optimistic locks are useful when you would like to update a data, assuming that it could have been updated by somebody else at the same time. Basically, optimistic lock work as this:
Record a timestamp marking the transaction’s beginning
Read values, and tentatively write changes
Check whether other transactions have modified data
If there is no conflict, make all changes take effect. If there is a conflict, resolve it.
Mutex locks are intended for MUTual EXclusion. For example, when you have two processes updating the same resource, you can have a single bit telling a process that another is currently updating the resource. Mutex locks work as it:
Read the semaphore value. If value is 0, set to 1, else, fail (lock cannot be acquired)
When lock is acquired, update the data
Set semaphore value to 0 (will release lock)
Although the two locks are similar, we will below present the mutual exclusion algorithm.
An incorrect mutex lock algorithm
Assuming we have a DynamoDB table with ResourceId as hash key, below is the source code I received to review:
First of all, the « lock » operation is here not atomic. Between a get_item and the put_item operation, there could have been another process checking the lock using get_item on his side. Example in workflow below, the two processes will believe to have acquired the lock:
The second problem is intrinsic to DynamoDB databases. Even with one process this algorithm will not behave as expected. The reason is: the get_item operation is eventually consistent. Consequently, it may return a stale item when the lock was actually released, and the process will believe that it cannot acquire the lock.
The mutex lock done right using DynamoDB as backend
Below, is my proposed implementation. It uses the DynamoDB ConditionExpression feature.
# Put item with conditional expression to acquire the lock
dyn_table = boto3.resource('dynamodb').Table('my-lock-table')
# Lock acquired
except botocore.exceptions.ClientError as e:
# Another exception than ConditionalCheckFailedException was caught, raise as-is
if e.response['Error']['Code'] != 'ConditionalCheckFailedException':
# Else, lock cannot be acquired because already locked
How does it work ?
Using put_item with ConditionExpression will tell DynamoDB to perform the put operation ONLY IF the condition is true. This operation is atomic.
In the above source code, I tell DynamoDB to put my item only if there is not already an item with this identifier. If the item exists, an exception is raised and lock is not acquired : probably another process is using the resource. If it worked, the lock is acquired and the item is created, preventing another process to take it.
The unlock works symmetrically, verifying that the resource is currently locked.
The end ?
Developing algorithms with distributed databases is always a challenge. Moreover, because Amazon DynamoDB is so simple to use, we can easily forget about the eventually consistency or basic algorithmic concepts about atomic operations.
If you are reading this article because a colleague pointed something in your source code, do not be mad. I personally love to be wrong, because when somebody have explained me why I mistaken, I have learnt something new.
Renault Digital est une filiale de Renault qui a pour mission d'être un accélér...
A PROPOS DE L'AUTEUR
C'est assez jeune que Nicolas s'est mis à faire travailler les machines à sa place. S'il a commencé à coder en C/C++, il est rapidement tombé dans le Java. Linuxien convaincu, il a pendant longtemps écrit son code avec Vim. Certains disent que c'est un troll, d'autres qu'il pose juste beaucoup de questions...