by Horațiu Dan
One of the most famous deadlocks is the one encountered in the well-known ‘dining philosophers’ problem. Briefly, ‘n’ philosophers sit at a round table set for Chinese food. On the table, there are ‘n’ chopsticks, one between every two philosophers. As the venue is a pleasant and productive one, they are not only eating but also thinking, alternating between the two. In order to eat, each philosopher first needs to acquire the two adjacent chopsticks, eat, then put them back on the table and return to thinking. Without getting into further details, one can easily observe that if every philosopher grabs the chopstick to his right and then waits for the one on his left without releasing the first, a deadlock appears.
Moving to threads, the simplest deadlock example is one thread holding a lock forever while the others, aiming to acquire the same lock, block waiting. Put simply, with two threads, if the first holds lock A and wants lock B while the second holds lock B and wants lock A, both will wait forever.
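To make this concrete, here is a minimal, illustrative Java sketch of that two-lock scenario (the class name, lock objects and sleep durations are made up for the example); the short sleeps only make the unfortunate interleaving, and thus the deadlock, very likely.
public class TwoLockDeadlock {

    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    public static void main(String[] args) {
        // Thread 1 acquires lockA, then tries to acquire lockB.
        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                sleepQuietly(100); // give t2 time to grab lockB
                synchronized (lockB) {
                    System.out.println("t1 acquired both locks");
                }
            }
        });

        // Thread 2 acquires lockB, then tries to acquire lockA - the opposite order.
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                sleepQuietly(100); // give t1 time to grab lockA
                synchronized (lockA) {
                    System.out.println("t2 acquired both locks");
                }
            }
        });

        t1.start();
        t2.start();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}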
Similarly, at the database level, deadlocks appear when two or more concurrent transactions are blocked, each waiting for another to release a resource it needs.
Abstract
A while ago, I was confronted with a database deadlock that unfortunately appeared in one of our production environments. The problem was identified and successfully fixed. Yet, what kept my attention during this investigation was the attempt to reproduce the issue in an isolated manner, by trying to guarantee the concurrent execution of the deadlocking actions. In this article, a “straightforward” deadlock is first fabricated; then, the focus is set on how to execute two actions “as simultaneously as possible” by leveraging two Java synchronization aids – CyclicBarrier and CountDownLatch.
Set-Up
The proof of concept is built with:
- Java 21
- PostgreSQL 13
- PostgreSQL Driver version 42.7.5.
For simplicity, Spring Boot version 3.4.4, Flyway 10.20.1 and Apache Maven 3.9.9 are also used.
The Deadlock
Let Entity1 and Entity2 be two simple entities. They are similar, each described only by a unique identifier (the primary key) and a text.
@Entity
@Table(name = "entity1")
public class Entity1 {

    @Id
    @Column(name = "id")
    private Long id;

    @Column(name = "text", nullable = false)
    private String text;

    ...
}

@Entity
@Table(name = "entity2")
public class Entity2 {

    @Id
    @Column(name = "id")
    private Long id;

    @Column(name = "text", nullable = false)
    private String text;

    ...
}
EntityProcessor is a component that defines two operations, each running in its own transaction, but involving the same two entities, of type Entity1 and Entity2 respectively. Obviously, it is assumed they exist.
1. process1():
- begins the transaction
- reads Entity1 from the database
- modifies its text
- reads Entity2 from the database
- modifies its text
- commits the transaction
2. process2():
- begins the transaction
- reads Entity2 from the database
- modifies its text
- reads Entity1 from the database
- modifies its text
- commits the transaction
@Service
public class EntityProcessor {

    private final Entity1Repository entity1Repo;
    private final Entity2Repository entity2Repo;

    public EntityProcessor(Entity1Repository entity1Repo, Entity2Repository entity2Repo) {
        this.entity1Repo = entity1Repo;
        this.entity2Repo = entity2Repo;
    }

    @Transactional
    public void process1(long entity1Id, long entity2Id) {
        final int index = 1;

        processEntity1(index, entity1Id);
        processEntity2(index, entity2Id);
    }

    @Transactional
    public void process2(long entity1Id, long entity2Id) {
        final int index = 2;

        processEntity2(index, entity2Id);
        processEntity1(index, entity1Id);
    }

    private void processEntity1(int index, long entityId) {
        Entity1 entity1 = entity1Repo.findById(entityId)
                .orElseThrow(() -> new RuntimeException("Entity1 not found"));
        entity1.setText("Set by process " + index);
    }

    private void processEntity2(int index, long entityId) {
        Entity2 entity2 = entity2Repo.findById(entityId)
                .orElseThrow(() -> new RuntimeException("Entity2 not found"));
        entity2.setText("Set by process " + index);
    }
}
As part of this POC, the “business” goal is to modify the text of the two entities twice, irrespective of their order, but in parallel. The above operations (transactions) lead to a deadlock if executed concurrently due to the opposite lock sequences. To resolve the problem, or at least to significantly reduce the risk of blocking, both should lock the two entities in the same order.
When process1() and process2() (as defined above) are run in parallel, the exception below is obtained, highlighting the deadlock.
[virtual-61] DEBUG SqlExceptionHelper#could not execute statement [update entity1 set text=? where id=?]
org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 36964 waits for ShareLock on transaction 29998; blocked by process 48636.
Process 48636 waits for ShareLock on transaction 29999; blocked by process 36964.
Hint: See server log for query details.
Where: while updating tuple (0,7) in relation "entity1"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2733)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2420)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:372)
...
at com.hcd.deadlock.service.EntityProcessor$$SpringCGLIB$$0.process2(<generated>)
The stack trace clearly outlines the issue.
The Parallel Execution
The deadlock was introduced in the previous section. The main goal of this article is to create the proper set-up for guaranteeing the actual concurrent execution of the two transactions.
For this experiment, two Java synchronization aids are used – CyclicBarrier and CountDownLatch. A unit test is created for each, with the following common set-up.
@SpringBootTest
@Rollback(false)
class Test {

    @Autowired
    private EntityProcessor entityProcessor;

    @Autowired
    private Entity1Repository entity1Repo;

    @Autowired
    private Entity2Repository entity2Repo;

    private Entity1 entity1;
    private Entity2 entity2;

    @BeforeEach
    void setUp() {
        entity1 = entity1Repo.save(new Entity1(1L));
        entity2 = entity2Repo.save(new Entity2(2L));
    }

    @AfterEach
    void tearDown() {
        entity1Repo.delete(entity1);
        entity2Repo.delete(entity2);
    }
}
First, Entity1 and Entity2 are created, then EntityProcessor#process1() and EntityProcessor#process2() are executed concurrently. Ultimately, the clean-up is done and the two entities are deleted. Thus, the state of the database remains unaltered. These tests are deliberately not transactional, in order to simulate real use cases as closely as possible.
Aid 1 – The CountDownLatch
A CountDownLatch is “a synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes” [1].
The test prepares and schedules two threads for executing EntityProcessor#process1() and EntityProcessor#process2() respectively, and constructs a CountDownLatch. The CountDownLatch is initialized with a count of 1 and thus acts as a toggle or a simple gate. The behavior of the two threads is similar and is described by the ProcessTask Runnable. They first invoke latch.await() and consequently block in a waiting state until “the gate” is opened by the main thread, which invokes latch.countDown(), brings the count to zero and starts the actual parallel processing.
@Test
void run() {
    CountDownLatch latch = new CountDownLatch(1);

    try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
        Future<?> future1 = exec.submit(new ProcessTask(latch,
                () -> entityProcessor.process1(entity1.getId(), entity2.getId())));
        Future<?> future2 = exec.submit(new ProcessTask(latch,
                () -> entityProcessor.process2(entity1.getId(), entity2.getId())));

        latch.countDown();

        future1.get();
        future2.get();
    } catch (ExecutionException | InterruptedException e) {
        throw new RuntimeException(e);
    }

    log.info("All processors completed.");
}

private record ProcessTask(CountDownLatch latch, Runnable runnable) implements Runnable {

    @Override
    public void run() {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }

        runnable.run();
    }
}
Aid 2 – The CyclicBarrier
A CyclicBarrier “allows a set of threads to all wait for each other to reach a common barrier point” [2], in this case, the start of the actual processing – the execution of EntityProcessor#process1() and EntityProcessor#process2().
Similarly to the above, the test prepares and schedules the two threads and constructs a CyclicBarrier with 3 parties – the two tasks and the main thread – then calls barrier.await() – call number one. The behavior of the threads is similar to the former and is described by the ProcessTask Runnable. They first invoke barrier.await() – calls number two and three. Once all three parties have called barrier.await(), they all proceed together and start the actual parallel processing.
@Test
void run() {
    CyclicBarrier barrier = new CyclicBarrier(3);

    try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
        Future<?> future1 = exec.submit(new ProcessTask(barrier,
                () -> entityProcessor.process1(entity1.getId(), entity2.getId())));
        Future<?> future2 = exec.submit(new ProcessTask(barrier,
                () -> entityProcessor.process2(entity1.getId(), entity2.getId())));

        barrier.await();

        future1.get();
        future2.get();
    } catch (ExecutionException | InterruptedException | BrokenBarrierException e) {
        throw new RuntimeException(e);
    }

    log.info("All processors completed.");
}

private record ProcessTask(CyclicBarrier barrier, Runnable runnable) implements Runnable {

    @Override
    public void run() {
        try {
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            log.error("Could not await().", e);
        }

        runnable.run();
    }
}
Takeaways
In the case of the CountDownLatch, the processing threads run independently and in parallel, but start the actual processing – runnable.run() – only when released by the main thread, which decrements the latch to zero.
When using the CyclicBarrier, the situation is very similar: each of the parties waits at the barrier, and the last one to arrive releases them all.
Unlike CountDownLatch, which is single-use, a CyclicBarrier may be reused multiple times, making it particularly useful when there are multiple phases in which a set of threads must wait for each other and synchronize repeatedly.
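To illustrate that reusability, the hypothetical sketch below (not part of the POC) sends a few worker threads through two phases, reusing a single barrier at each phase boundary; the worker count, phase count and messages are made up for the example.
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class PhasedWorkers {

    public static void main(String[] args) {
        int workers = 3;
        int phases = 2;

        // The optional barrier action runs once per phase, when the last party arrives.
        CyclicBarrier barrier = new CyclicBarrier(workers,
                () -> System.out.println("--- all workers reached the barrier ---"));

        for (int w = 0; w < workers; w++) {
            int id = w;
            new Thread(() -> {
                for (int phase = 1; phase <= phases; phase++) {
                    System.out.println("worker " + id + " finished phase " + phase);
                    try {
                        barrier.await(); // the same barrier instance is reused in every phase
                    } catch (InterruptedException | BrokenBarrierException e) {
                        throw new RuntimeException(e);
                    }
                }
            }).start();
        }
    }
}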
This article presents two ways of testing deadlock situations in isolation, which might prove quite useful when such issues are detected.
When executing either of the above tests, the deadlock appears and the corresponding exception is thrown. Nevertheless, although the purpose here is not to describe how to resolve deadlocks in general, it is worth reiterating that acquiring locks in a consistent order resolves such problems, or at least significantly reduces the risk of deadlocking.
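As a hedged sketch of that advice (an illustration, not necessarily the fix applied in the production incident), process2() could simply touch the entities in the same order as process1() – Entity1 first, then Entity2 – which removes the cyclic wait.
@Transactional
public void process2(long entity1Id, long entity2Id) {
    final int index = 2;

    // Same locking order as process1(): Entity1 first, then Entity2.
    processEntity1(index, entity1Id);
    processEntity2(index, entity2Id);
}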
Resources
[1], [2] – Java 21 Reference Documentation
[3] – POC source code