Camunda Team Blog

How to clean up/purge the camunda history tables

Written by Ingo Richtsmeier on , under Execution category.
We get this question very often from customers: How can we get rid of old history data that we don't need anymore? First it's important to ask: "What do you mean by 'old data'?". This is completely business focused and depends on the use of cockpit, who is working with the history data, how reports are generated and maybe even constraints due to legislation.

One of our customers - Hamburger Sparkasse (Haspa) - built a solution that is flexible enough to address the issues raised by the questions above:
  • They created a bunch of SQL scripts
  • Created a parameter for the number of days that completed process instances should remain in the database
  • which then moved the old data into a set of archive tables and marked them with the time-stamp and a sequence number of the cleanup run.
  • This ensures the process can be undone in case data is moved accidentally - or if you recognize later on that you still need it.
The source code is available in our consulting repository (thanks to Haspa for allowing this!):


How can you start it?

Load the scripts from our repo into your oracle database (we tested them with Oracle 12c). There are scripts that must run once to prepare everything (STEP-1 and STEP-2). STEP-3 loads a PL/SQL-Function which can be started as follows:

select ARCHIVE_CAMUNDA_HISTORY(9, 20) from DUAL; -- Archive PI's older then 9 days and max. 20 instances

If you run

select ARCHIVE_CAMUNDA_HISTORY(180, 0) from DUAL;

All process instances, that are completed more than 180 days before today will be moved into the archive tables.

STEP-4 is made for Restore/Rollback. Run the file once in your SQLDeveloper to compile the function and start it with:

select ROLLB_ARCHIVE_CAMUNDA_HISTORY(7, -1, 10) from DUAL;

where 7 is the number of the starting cleanup sequence number, the second parameter is a max cleanup sequence number (-1 means that only the first parameter counts) and the third parameter is the number of process instances that should be restored. 0 for the third parameter restores all process instances from the archive runs:

select ROLLB_ARCHIVE_CAMUNDA_HISTORY(8, 10, 0);

You will find more detailed documentation in the code. Its worth it to have a look.

To get rid of the new tables and functions, there is of course DROP-Script, too.

How does it work? 

The archiving function checks first for process instances in the history that match the archiving arguments. It copys the process instance ids into a temporary table. Then it generates SQL Scripts to insert the data from the history tables into the archive tables and to delete the data from the history. Afterwards it checks for bytearrays referenced from the archived process instances in the VARINST- and DETAIL-Table. If there are any, the data are moved, too.

The restore/rollback function does it the other way around and moves the data from the archive table back into the history table. Now they appear in the cockpit again.