Tableau Prep: The Flow

I've been a bit quiet lately, but Tableau Prep is out the door and it's time to make a little noise.

Clark recently wrote an excellent post on the basic UX architecture of Prep. Here I'd like to cover a key concept underlying Prep that may be a bit foreign to people coming from Tableau: the flow.

[Figure: an example flow]

This isn't the most glamorous part of Prep, but it is one of the most fundamental concepts in the tool, so it seems worth spending some quality time on.

Strap on your life jacket and read on for more.

Data In; Data Out

To understand flows, we start with steps, which are the conceptual unit of work in Tableau Prep. Every time you take an action on your data in Prep, you're adding a step. For example, if we take the world consumer price index data included with the product and add a filter, we find that a new step is added to the flow:

[Figure: a new step added to the flow]

Each item in the flow pane represents a step, and each step works in the same way: data come in from the left, are modified by the step, and leave to the right:

[Figure: data coming in to and going out of a step, annotated]
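If it helps to see the idea in code, here is a minimal sketch in Python. It is only an analogy, not how Prep is implemented: the `Row` and `Step` types, the `filter_step` helper, and the sample rows are all made up for illustration.

```python
# Hypothetical illustration: a step takes rows in and hands rows out.
# The rows and field names are invented; this is not Prep's API.
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def filter_step(keep: Callable[[Row], bool]) -> Step:
    """Build a step that passes along only the rows for which keep(row) is true."""
    def step(rows: List[Row]) -> List[Row]:
        return [row for row in rows if keep(row)]
    return step


rows_in = [
    {"country": "A", "year": 2015, "cpi": 101.0},
    {"country": "A", "year": 2016, "cpi": 103.5},
    {"country": "B", "year": 2016, "cpi": 98.2},
]

# Data come in, the step modifies them, and new data go out.
only_2016 = filter_step(lambda row: row["year"] == 2016)
rows_out = only_2016(rows_in)
print(rows_out)
```

The input rows are left untouched; the step simply produces a new set of rows for whatever comes next.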

Some steps — cleaning steps — may have multiple sub-steps, or changes. These are just like steps in the flow, but are smaller increments of work. They flow top to bottom:

[Figure: the changes within a cleaning step, annotated]

We group these together to help conceptually simplify the flow, but each change acts just like any other step: rows come in, they're modified, and they go out.
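As a rough sketch of that grouping, a cleaning step can be thought of as a list of small changes applied top to bottom. The change functions and the `cleaning_step` helper below are hypothetical names chosen for this example, not anything from Prep.

```python
# Hypothetical illustration: a cleaning step as an ordered list of changes.
from functools import reduce
from typing import Callable, Dict, List

Row = Dict[str, object]
Change = Callable[[List[Row]], List[Row]]


def trim_country(rows: List[Row]) -> List[Row]:
    """Change 1: strip stray whitespace from the country field."""
    return [{**row, "country": str(row["country"]).strip()} for row in rows]


def drop_missing_cpi(rows: List[Row]) -> List[Row]:
    """Change 2: remove rows that have no CPI value."""
    return [row for row in rows if row["cpi"] is not None]


def cleaning_step(changes: List[Change]) -> Change:
    """Group several changes into one step, applied in the order given."""
    def step(rows: List[Row]) -> List[Row]:
        return reduce(lambda current, change: change(current), changes, rows)
    return step


messy = [
    {"country": " A ", "cpi": 101.0},
    {"country": "B", "cpi": None},
]

clean = cleaning_step([trim_country, drop_missing_cpi])
cleaned = clean(messy)
print(cleaned)
```

From the outside the grouped step looks like any other step: rows in, rows out.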

Some steps — such as joins — have multiple inputs, but they work the same way: two sets of data come in from the left, they're put together, and the result leaves to the right:

[Figure: a join step with two inputs, annotated]
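In the same spirit, a join can be sketched as a step that takes two sets of rows and returns one. The example assumes an inner join on a single shared field purely to keep it short; `join_step` and the sample data are invented for illustration.

```python
# Hypothetical illustration: a join has two inputs and one output.
# An inner join on one key is assumed here for simplicity.
from typing import Dict, List

Row = Dict[str, object]


def join_step(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Combine rows from both inputs wherever their key values match."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[key], [])
    ]


cpi = [
    {"country": "A", "cpi": 101.0},
    {"country": "B", "cpi": 98.2},
    {"country": "C", "cpi": 110.0},
]
regions = [
    {"country": "A", "region": "North"},
    {"country": "B", "region": "South"},
]

joined = join_step(cpi, regions, key="country")
print(joined)
```

Country "C" has no match on the right, so with this inner-join assumption it does not appear in the output.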

And where do they go? On to the next step! Some steps may even have multiple outputs, with the data going to multiple targets:

[Figure: a step with two outputs, annotated]

Step-by-step we build up a flow: an ordered sequence of steps that does what we want.

[Figure: the example flow]
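Putting those pieces together, a flow can be sketched as steps run in order, with one step's output free to feed more than one target. The `run_steps` helper and the step functions are hypothetical, and the data are made up.

```python
# Hypothetical illustration: a flow as an ordered sequence of steps,
# with one output feeding two targets. None of these names come from Prep.
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def run_steps(steps: List[Step], rows: List[Row]) -> List[Row]:
    """Run the steps in order, feeding each output into the next step."""
    for step in steps:
        rows = step(rows)
    return rows


def drop_missing(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["cpi"] is not None]


def keep_food(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["index"] == "food"]


def keep_general(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["index"] == "general"]


source = [
    {"index": "food", "cpi": 100.0},
    {"index": "general", "cpi": 110.0},
    {"index": "food", "cpi": None},
]

# One cleaned output goes on to two different targets.
cleaned = run_steps([drop_missing], source)
food_branch = run_steps([keep_food], cleaned)
general_branch = run_steps([keep_general], cleaned)
print(food_branch, general_branch)
```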

Clarity and Control

That ordering is a key aspect of flows. If you're coming from Tableau, you may be aware that it performs operations in a particular order, but the system doesn't advertise this, and generally you don't need to think about it.

But order sometimes matters, and we designed Prep with those times in mind. The CPI data contain both a food index and a general index. Let's say that we've pivoted the data, and now want to compare each country's CPI to the global average for each year — except we only care about the food index.

To do this, we'll first filter to keep only the food index:

[Figure: the filter step, annotated]

And then we'll aggregate by year:

[Figure: the aggregate step, annotated]

Order matters: if we did the aggregate first, we would have folded in the general CPI as well.
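A tiny worked example makes the difference concrete. The numbers below are invented rather than taken from the CPI data set, and `keep_food` and `average_by_year` are hypothetical stand-ins for the filter and aggregate steps.

```python
# Hypothetical illustration: filter-then-aggregate vs. aggregate-first.
# The CPI values are made up for the example.
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, object]


def keep_food(rows: List[Row]) -> List[Row]:
    """Filter step: keep only the food index."""
    return [row for row in rows if row["index"] == "food"]


def average_by_year(rows: List[Row]) -> Dict[object, float]:
    """Aggregate step: average CPI for each year."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["year"]].append(row["cpi"])
    return {year: sum(values) / len(values) for year, values in groups.items()}


rows = [
    {"year": 2015, "index": "food", "cpi": 100.0},
    {"year": 2015, "index": "general", "cpi": 110.0},
    {"year": 2016, "index": "food", "cpi": 104.0},
    {"year": 2016, "index": "general", "cpi": 120.0},
]

food_first = average_by_year(keep_food(rows))
aggregate_first = average_by_year(rows)

print(food_first)       # {2015: 100.0, 2016: 104.0}
print(aggregate_first)  # {2015: 105.0, 2016: 112.0}
```

Once the aggregate has run, the index field is gone, so there is nothing left to filter on: the general index has already been averaged in.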

This kind of ordering is explicit in Prep. You don't have to guess, and you don't need to coax the system into doing what you want: you just build your flow in the order that fits your problem.

And with Prep, you can always go back and see your data at any point along the flow. Just click back and look. This way you can see and control what the flow is doing to your data every step along the way.
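One way to picture "click back and look" is a flow that keeps the data as it stood after every step. This is only an analogy for the idea, not a description of what Prep does internally; `run_with_snapshots` and the step functions are hypothetical.

```python
# Hypothetical illustration: keep the rows as they stood after each step
# so any point along the flow can be inspected later.
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def keep_food(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["index"] == "food"]


def average_by_year(rows: List[Row]) -> List[Row]:
    groups = defaultdict(list)
    for row in rows:
        groups[row["year"]].append(row["cpi"])
    return [
        {"year": year, "avg_cpi": sum(values) / len(values)}
        for year, values in groups.items()
    ]


def run_with_snapshots(
    steps: List[Tuple[str, Step]], rows: List[Row]
) -> List[Tuple[str, List[Row]]]:
    """Run the steps in order and record the rows after each one."""
    snapshots = [("input", rows)]
    for name, step in steps:
        rows = step(rows)
        snapshots.append((name, rows))
    return snapshots


rows = [
    {"year": 2015, "index": "food", "cpi": 100.0},
    {"year": 2015, "index": "general", "cpi": 110.0},
    {"year": 2016, "index": "food", "cpi": 104.0},
]

snapshots = run_with_snapshots(
    [("keep food", keep_food), ("average by year", average_by_year)], rows
)
for name, data in snapshots:
    print(name, len(data), "rows")
```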

Prep is a Competent Cook

We can add another metaphor: think of a flow as a recipe, and let's take a moment to bake some cookies.

[Image: julia-spoon]

We've already mixed the wet ingredients — the eggs, the vanilla, the butter — when we get to this part of the recipe:

  1. ...
  2. Measure 1.5 cups flour
  3. Add 1/4 teaspoon salt
  4. Add 1/2 teaspoon baking powder
  5. Mix thoroughly
  6. Add dry ingredients to wet ingredients

A competent cook would mix these dry ingredients before adding them to the wet, but they would take the liberty of combining them in any convenient order: they know it's irrelevant.

Tableau Prep is a competent cook. It can figure out many cases where the order won't matter, and can rearrange the operations to make your flow run more efficiently. But it will only do this when the reordering won't affect the results that you intended.

So while the flow gives a conceptual order to the operations, they may not actually be executed in that order at all. The result is that you can ignore order when it doesn't matter, but rely on it when it does.
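Two independent filters are a simple case where order is irrelevant, much like the salt and the baking powder. The sketch below only demonstrates that such steps give the same result either way; it says nothing about how Prep itself decides what is safe to reorder, and the functions and data are made up.

```python
# Hypothetical illustration: two filters that can run in either order
# without changing the result.
from typing import Dict, List

Row = Dict[str, object]


def keep_food(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["index"] == "food"]


def keep_from_2015(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["year"] >= 2015]


rows = [
    {"year": 2014, "index": "food", "cpi": 97.0},
    {"year": 2015, "index": "food", "cpi": 100.0},
    {"year": 2015, "index": "general", "cpi": 110.0},
    {"year": 2016, "index": "food", "cpi": 104.0},
]

food_then_year = keep_from_2015(keep_food(rows))
year_then_food = keep_food(keep_from_2015(rows))

# Same rows either way, so a runner is free to pick the cheaper order.
print(food_then_year == year_then_food)
```

Compare this with the filter and the aggregate from earlier, where swapping the two changes the answer; that is the kind of ordering you can rely on.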

More than Just Flows

The notion of a flow is not unique to Tableau Prep, and it isn't Prep's most distinguishing feature. The way that Prep uses samples to give you immediate feedback, the way we use analytics to help you see what needs to be done, and direct manipulation all contribute more directly to what makes Prep special.

But understanding flows is central to understanding how to make Prep do exactly what you want, and it can be a bit of a leap for folks coming from Tableau Desktop. I hope this helps make that leap a little easier.

Happy hacking!