Conditional workflows

Skipping/executing actions on condition

A conditional workflow is a workflow in which some part may be executed or skipped depending on a condition. For instance: IF column A is missing THEN calculate it using an expression. Such conditional execution can be arranged in EasyMorph using the "Skip actions on condition" action, whose settings are depicted below. When the action’s condition is satisfied, EasyMorph skips all the actions that follow it in the current table. The condition can be one of several types, which are shown below.

Skip actions on condition

For instance, in the following screenshot the condition of the "Skip…" action is not fulfilled (i.e. is false), and therefore the actions that follow it have been executed. Their pictograms look as usual.

Actions are not skipped

In the next screenshot, the condition of the "Skip…" action is fulfilled (i.e. is true). The actions that follow the "Skip…" action are skipped (i.e. not executed). Their pictograms have an overlay icon showing that they have been skipped. Note that the pictogram of the "Skip…" action itself also looks different, to signal visually how its condition was evaluated.

Actions are skipped
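
The skipping behaviour described above can be approximated in ordinary code. The following is a minimal Python sketch, not EasyMorph's implementation: `SkipIf`, `run_table`, `has_column_a`, and `calculate_a` are hypothetical names, and the expression `A = B * 2` is an assumption made only for illustration. It models the "IF column A is missing THEN calculate it" example by skipping the calculation when column A is already present.

```python
from typing import Callable, Dict, List, Tuple

Table = List[Dict[str, int]]


class SkipIf:
    """Hypothetical stand-in for the "Skip actions on condition" action."""

    def __init__(self, condition: Callable[[Table], bool]) -> None:
        self.condition = condition


def run_table(table: Table, steps: list) -> Tuple[Table, List[str]]:
    """Run steps in order; once a SkipIf condition holds, skip the rest."""
    log: List[str] = []
    skipping = False
    for step in steps:
        if isinstance(step, SkipIf):
            # The condition is evaluated only if nothing is being skipped yet.
            skipping = skipping or step.condition(table)
        elif skipping:
            log.append(f"skipped {step.__name__}")
        else:
            table = step(table)
            log.append(f"ran {step.__name__}")
    return table, log


def has_column_a(table: Table) -> bool:
    return all("A" in row for row in table)


def calculate_a(table: Table) -> Table:
    # Assumed expression for illustration only: A = B * 2
    return [{**row, "A": row["B"] * 2} for row in table]


steps = [SkipIf(has_column_a), calculate_a]
with_a, log_with_a = run_table([{"A": 1, "B": 5}], steps)
without_a, log_without_a = run_table([{"B": 5}], steps)
```

In the first run column A exists, so the condition is true and `calculate_a` is skipped; in the second run the condition is false and the column is calculated.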

Halting on condition

Sometimes a workflow must be stopped on a condition. Usually, this is required when a critical data quality issue has been detected and the workflow must be stopped to prevent bad data from propagating into other systems or databases. EasyMorph provides two actions to address such cases:

Halting on condition is a powerful technique that helps ensure data consistency. It can be used in various cases, for instance:
  • Check input data for out-of-range, missing, or extra values
  • Verify data types before exporting to a database table or web API
  • Verify data consistency after a merge
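
As a rough illustration of the idea, here is a minimal Python sketch of halting before an export when a range check fails. It is an analogy rather than EasyMorph's mechanism: `WorkflowHalted`, `halt_if`, `check_and_export`, the `score` field, and the 0 to 100 range are all hypothetical.

```python
from typing import Dict, List


class WorkflowHalted(Exception):
    """Hypothetical exception that stops the whole workflow."""


def halt_if(condition: bool, message: str) -> None:
    # Stop everything as soon as the condition is true.
    if condition:
        raise WorkflowHalted(message)


def check_and_export(rows: List[Dict[str, int]], target: List[Dict[str, int]]) -> None:
    # Range check: values outside 0..100 are treated as a critical issue.
    out_of_range = [row for row in rows if not 0 <= row["score"] <= 100]
    halt_if(bool(out_of_range), f"{len(out_of_range)} value(s) out of range")
    # This line is reached only when the check passed.
    target.extend(rows)


exported: List[Dict[str, int]] = []
check_and_export([{"score": 50}], exported)

halted = False
try:
    check_and_export([{"score": 150}], exported)
except WorkflowHalted:
    halted = True
```

The key property is that the bad rows never reach the target: the second call stops before anything is exported.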

Advanced topics

Conditionally derived tables

The "Skip actions on condition" action described above handles simple cases where a group of actions is either executed or not, depending on a condition. However, sometimes there are two (or more) groups of actions executed on mutually exclusive conditions. For instance: IF loaded data has no missing dates THEN export it into a database ELSE send an email to Pete. In such a workflow, only one thing can happen during workflow execution: either data is exported into a database, or an email is sent to Pete. The two condition branches are mutually exclusive and must never be executed in the same workflow run. Depicted graphically, this workflow looks as follows:

Conditional workflow animation

Such conditional branching is arranged in EasyMorph using conditionally derived tables. As a reminder, a regular derived table is a table that replicates the resulting state (data) of its source table and performs certain actions on it. If you're not familiar with derived tables, check out this tutorial article: Derived tables.

A conditionally derived table executes its actions only when a certain condition is satisfied. Otherwise, no data is replicated from its source table and no actions are performed: all actions in such a table are skipped. The example mentioned above looks slightly different when the branching is done using conditionally derived tables in EasyMorph:

Conditional workflow using derived tables

In this example we have two conditionally derived tables. The conditions in the tables are the inverse of each other (i.e. mutually exclusive): in one table the condition is "missing a date?" while in the other it's "NOT missing a date?". Therefore, depending on whether the loaded data is incomplete or not, either one table or the other is entirely skipped.

In an EasyMorph project, the example described above would look as follows:

Conditional workflow using derived tables

The two rightmost derived tables are derived conditionally. Notice that one of the tables is skipped: it has a different title bar color and a derivation icon with a red "X". In this table, actions were not executed and it has no data.

The project depicted above detects whether any dates are missing by calculating the difference between adjacent dates. If no dates are skipped, then the difference from the previous date is always equal to 1, for all dates. If one or more dates are skipped somewhere, then for the date that follows the gap the distance is 2 or more. We calculate the max distance and merge it into the main table as a new column using the "Peek" action. Later, prior to exporting, the column is removed.
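
The gap check itself is simple arithmetic, and the following minimal Python sketch shows it under stated assumptions. The function names `max_gap_days` and `add_distance_column` are hypothetical, the sample dates are invented, and the sketch assumes the dates are unique calendar days. The column name "Distance, days" is taken from the project described above.

```python
from datetime import date
from typing import Dict, List


def max_gap_days(dates: List[date]) -> int:
    """Largest difference, in days, between adjacent dates after sorting."""
    if len(dates) < 2:
        raise ValueError("need at least two dates to measure a gap")
    ordered = sorted(dates)
    return max((later - earlier).days for earlier, later in zip(ordered, ordered[1:]))


def add_distance_column(dates: List[date]) -> List[Dict[str, object]]:
    # Attach the single max distance to every row, as a new column.
    distance = max_gap_days(dates)
    return [{"Date": d, "Distance, days": distance} for d in dates]


complete = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
with_gap = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5)]

gap_complete = max_gap_days(complete)
gap_missing = max_gap_days(with_gap)
rows = add_distance_column(with_gap)
```

Because the same max distance is written to every row, a condition on that column evaluates identically for all rows of the table.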

In the conditionally derived tables the conditions are as follows:

[Distance, days] = 1
for the upper derived table "Export to DB", and

[Distance, days] <> 1
for the lower derived table "Send an email".

As you can see, the conditions are the inverse of each other. Therefore, exactly one of the two derived tables is skipped: they are never both skipped or both executed.

Conditions

The "Derive table" action has a switch that tells it whether it should be unconditional or conditional. In the latter case, there are three possible conditions for derivation:

  • If the given expression evaluates to TRUE for each row in the table
  • If the source table is empty
  • If the source table is NOT empty

In the screenshot below you can see the properties of the "Derive table" action in the conditional mode:

Conditional workflow
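
To make the three derivation conditions concrete, here is a minimal Python sketch of a conditionally derived table. It is an analogy, not EasyMorph's implementation: `derive_table`, the mode names, and `drop_distance` are hypothetical, and the sample rows are invented. How an expression condition behaves on an empty source table is not stated above; this sketch simply inherits Python's `all()`, which returns True for an empty sequence.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Table = List[Row]


def derive_table(
    source: Table,
    actions: List[Callable[[Table], Table]],
    mode: str = "unconditional",
    expression: Optional[Callable[[Row], bool]] = None,
) -> Optional[Table]:
    """Return the derived table, or None when the derivation is skipped."""
    if mode == "unconditional":
        run = True
    elif mode == "expression":
        if expression is None:
            raise ValueError("expression mode needs an expression")
        # True only if the expression holds for each row (assumption: an
        # empty source counts as True because all() of nothing is True).
        run = all(expression(row) for row in source)
    elif mode == "source_empty":
        run = len(source) == 0
    elif mode == "source_not_empty":
        run = len(source) > 0
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not run:
        return None  # skipped: no data replicated, no actions performed
    table = [dict(row) for row in source]  # replicate the source state
    for action in actions:
        table = action(table)
    return table


def drop_distance(table: Table) -> Table:
    # Remove the helper column prior to exporting.
    return [{k: v for k, v in row.items() if k != "Distance, days"} for row in table]


source = [
    {"Date": "2024-01-01", "Distance, days": 1},
    {"Date": "2024-01-02", "Distance, days": 1},
]
export_branch = derive_table(
    source, [drop_distance], "expression", lambda row: row["Distance, days"] == 1
)
email_branch = derive_table(
    source, [], "expression", lambda row: row["Distance, days"] != 1
)
```

With no missing dates in the sample, the export branch runs and the email branch is skipped.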

Example

Download the example from this Community topic: If a file doesn't exist.

More complex workflows

The examples above show how to use conditionally derived tables for the "IF...THEN...ELSE" type of workflows. However, in the same fashion, it is possible to arrange the "SWITCH...CASE" type of branching with three or more branches.
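
A three-way branch works as long as the conditions remain mutually exclusive and together cover every case. The minimal Python sketch below illustrates that requirement; `select_branch`, the branch names, and the thresholds are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, Callable[[int], bool]]

# Hypothetical branches keyed on the max distance between adjacent dates.
CASES: List[Case] = [
    ("export", lambda distance: distance == 1),
    ("warn", lambda distance: 2 <= distance <= 3),
    ("email", lambda distance: distance > 3),
]


def select_branch(value: int, cases: List[Case]) -> str:
    """Return the single branch whose condition holds for the value."""
    matched = [name for name, condition in cases if condition(value)]
    if len(matched) != 1:
        # Zero matches means a gap in coverage; several mean an overlap.
        raise ValueError(f"expected exactly one branch, got {matched}")
    return matched[0]


branch_complete = select_branch(1, CASES)
branch_small_gap = select_branch(2, CASES)
branch_large_gap = select_branch(7, CASES)
```

The explicit check for exactly one match is a design choice of this sketch: it surfaces overlapping or missing conditions early instead of silently running two branches or none.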

Also, if a condition branch requires executing a complex workflow that doesn't fit into one table, the workflow can be moved into a separate module and executed using the "Call" action (explained earlier) in the respective condition branch.

