Camunda Team Blog

Job Prioritization for Asynchronous Processing at Scale

Written by Thorben Lindhauer, under Execution category.

Camunda users process heavy workloads with the process engine. Often this includes asynchronous processing, which is handled by the job executor component. The number of jobs that need to be processed can quickly reach millions per day. To bring order into situations of high job executor load, Camunda BPM 7.4.0 will ship job prioritization. With our first 7.4.0-alpha1 release, you can already have a look at it and give it a try. This article deals with three questions: why job prioritization is needed, how to use it, and what you can expect in 7.4.0.


The Case for Job Prioritization

To date, Camunda BPM is used by a growing number of customers in a variety of industries, each with different requirements for process automation. Among others, we are especially prominent in the financial, insurance, and telecommunications sectors. In these fields, processes tend to be mostly or even fully automated (termed Dunkelverarbeitung in the German insurance industry). To scope units of work in these processes, that is, sets of activities that are executed in one transaction, Camunda provides the concept of asynchronous continuations. Asynchronous continuations manifest themselves as jobs at runtime, each representing the task of executing one unit of work in a running process instance. The process engine's job executor component continuously picks up jobs from the database and schedules them for execution in a dedicated thread pool.

With global and national players using the process engine to automate their processes, the number of jobs present at a time can grow quite large. After the 7.3.0 release, we conducted a survey among community and enterprise edition users and received feedback from some of our most demanding users. Key results were:
  • Users process up to 5 million jobs per day
  • Job creation and execution is subject to peaks, varying in rate and duration

During peaks, the job executor and its thread pool may be temporarily overloaded and need time to work through the existing jobs and reduce the queue to a manageable size again. In previous Camunda versions, the order of job execution is generally non-deterministic, with only limited means to influence it (e.g. preferring execution of timers over asynchronous continuations). In a large set of pending jobs, some jobs may be more important than others. For example:
  • VIP customers are more important than casual customers to a company. In high load situations, VIP orders should be processed with only little delay.
  • Batch operations like housekeeping tasks create a large amount of jobs in a short time, yet their execution is less important than other business processes.
  • In an exceptional condition, an external service may respond slowly. Jobs accessing that service are temporarily less important in order to avoid blocking other jobs.

For use cases like these, job prioritization is the right tool in the Camunda toolbox.

How to use it

Let us apply job prioritization by implementing an example. The following diagram shows a simplified delivery scheduling process. In order to automatically retry scheduling in case of failure, the service task Schedule Delivery is declared asynchronous. In the following, we want to treat VIP customers' deliveries with higher priority so that they are processed sooner under high job load.

Engine Configuration

Before starting, we make sure to configure the job executor to acquire jobs by their priority. In bpm-platform.xml, this looks as follows:
<?xml version="1.0" encoding="UTF-8"?>
<bpm-platform ...>
  <job-executor>
    <job-acquisition name="default" />
  </job-executor>

  <process-engine name="default">
    <job-acquisition>default</job-acquisition>
    ...
    <properties>
      <property name="jobExecutorAcquireByPriority">true</property>
      ...
    </properties>
    ...
  </process-engine>
</bpm-platform>

The job acquisition thread will now acquire jobs strictly by their priority, from highest to lowest. Have a look at the documentation of job acquisition order for a recommended database index.
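To make "strictly by priority" concrete, here is a minimal Java sketch that orders a handful of pending jobs from highest to lowest priority. The `PendingJob` class and the `acquisitionOrder` method are hypothetical and only illustrate the resulting order; they are not engine classes. The engine itself determines the order when it reads jobs from the database, and the sketch makes no claim about how the engine orders jobs of equal priority.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class AcquisitionOrderSketch {

  /** Hypothetical stand-in for a pending job; not an engine class. */
  public static final class PendingJob {
    final String id;
    final int priority;

    public PendingJob(String id, int priority) {
      this.id = id;
      this.priority = priority;
    }
  }

  /** Returns the job ids sorted from highest to lowest priority. */
  public static List<String> acquisitionOrder(List<PendingJob> pending) {
    // Sort a copy so the caller's list is left untouched.
    List<PendingJob> sorted = new ArrayList<>(pending);
    sorted.sort(Comparator.comparingInt((PendingJob job) -> job.priority).reversed());

    List<String> ids = new ArrayList<>();
    for (PendingJob job : sorted) {
      ids.add(job.id);
    }
    return ids;
  }

  public static void main(String[] args) {
    List<PendingJob> pending = new ArrayList<>();
    pending.add(new PendingJob("housekeeping", 0));
    pending.add(new PendingJob("vipOrder", 100));
    pending.add(new PendingJob("regularOrder", 10));

    // prints [vipOrder, regularOrder, housekeeping]
    System.out.println(acquisitionOrder(pending));
  }
}
```

Under sustained load, a strict ordering like this means low-priority jobs wait for as long as higher-priority jobs keep arriving, which is worth keeping in mind when choosing priority values.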

Prioritizing VIP Customers' Jobs

Next, we configure the priority of an asynchronous continuation job in the BPMN 2.0 XML. Priorities are natural numbers in the Integer range and can be either constant values or the result of a JUEL expression. Let us assume that we have a process variable delivery that contains business data related to the delivery, such as the customer's identifier. Furthermore, we have a CDI bean called priorityHandler that is able to calculate a customer's priority. In the BPMN XML of our process, we can configure the service task as follows:
<bpmn:serviceTask id="ScheduleDelivery_1"
  name="Schedule Delivery"
  camunda:asyncBefore="true"
  camunda:jobPriority="${priorityHandler.calculatePriorityFor(delivery.customer)}" />

Every job for that activity is now dynamically assigned a priority by the priorityHandler bean.
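The priorityHandler bean itself is not shown in this article. A minimal sketch of what it could look like follows. The class name, the hard-coded VIP customer set, and the concrete priority values are assumptions made for illustration, and `delivery.customer` is assumed to resolve to a customer identifier string. In a CDI environment the class would additionally carry an annotation such as `@Named("priorityHandler")` so that the expression can resolve it; the annotation is left out here to keep the sketch free of dependencies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class PriorityHandler {

  // Assumed values; any integers work as long as VIP is the higher one.
  public static final int VIP_PRIORITY = 100;
  public static final int DEFAULT_PRIORITY = 10;

  // Hypothetical VIP list; a real bean would look this up in a customer system.
  private final Set<String> vipCustomerIds =
      new HashSet<>(Arrays.asList("customer-42", "customer-77"));

  /** Returns a higher priority for VIP customers than for everyone else. */
  public int calculatePriorityFor(String customerId) {
    return vipCustomerIds.contains(customerId) ? VIP_PRIORITY : DEFAULT_PRIORITY;
  }

  public static void main(String[] args) {
    PriorityHandler handler = new PriorityHandler();
    System.out.println(handler.calculatePriorityFor("customer-42")); // prints 100
    System.out.println(handler.calculatePriorityFor("customer-1"));  // prints 10
  }
}
```

Keeping the calculation in a bean rather than in the expression itself means the prioritization rules can change without touching the process model.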

Overriding Priorities at Runtime

Our solution works fine until one day the delivery service encounters an overload and starts to respond very slowly. In order to get the delivery scheduling jobs "out of the way" of other jobs, we can use the management service to define a priority for the job definition that temporarily overrides the setting in the BPMN XML.
// find the job definition
JobDefinition jobDefinition = managementService
  .createJobDefinitionQuery()
  .activityIdIn("ScheduleDelivery_1")
  .singleResult();

// set an overriding priority
managementService.setOverridingJobPriorityForJobDefinition(jobDefinition.getId(), 0);

Now, every new async job that is created for the Schedule Delivery activity will receive the priority 0. When the delivery service has returned to normal operating conditions, this priority can be cleared again in a similar way:
managementService.clearOverridingJobPriorityForJobDefinition(jobDefinition.getId());

What you can expect in 7.4.0

In the previous sections, we have explored the engine's new job prioritization feature. As you may have noticed, it covers BPMN and the Java API, but there is not yet an integration with Cockpit. Similar to features like job definition suspension, we plan to enable Cockpit users to define overriding priorities dynamically at runtime. This way, operators can immediately respond to exceptional conditions that require re-prioritization. In addition, we will integrate the priority attribute into the graphical camunda Modeler or bpmn.io.

For now, you can have a look at the documentation on job prioritization for a more comprehensive description of the feature. We are eager to receive your feedback, whether prioritization helps you solve use cases, where you see potential for improvement, and if you encounter any bugs or performance issues. Drop us a line in the comments below or on the camunda user forum.