In this blog post, we will discuss a more or less undocumented way of flattening arrays in a State Machine without using a dedicated Lambda function. That is, transforming an array of arrays (of arrays, and so on) into a plain, flat array. It is especially useful when handling the results of Parallel or Map states.
A simple array as Map result…
If you use Step Functions heavily, you probably have Parallel or Map states somewhere. And you know that the output of those states is a JSON array containing the individual outputs of each sub-state machine.
For example, let us consider this simple State Machine:
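The original definition is not reproduced here, so what follows is only a minimal sketch of what such a state machine could look like in Amazon States Language. DO_WORK and key_to_iterate are the names used in this post; the state name MAP, the placeholder result "some_value" and the use of the Iterator field (newer definitions may use ItemProcessor instead) are assumptions made for illustration.

```json
{
  "Comment": "Sketch only: a Map state that runs one Pass state per item",
  "StartAt": "MAP",
  "States": {
    "MAP": {
      "Type": "Map",
      "ItemsPath": "$.key_to_iterate",
      "Iterator": {
        "StartAt": "DO_WORK",
        "States": {
          "DO_WORK": {
            "Type": "Pass",
            "Result": "some_value",
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```

Started with an input such as {"key_to_iterate": [1, 2, 3]}, this sketch should output an array of three "some_value" strings, one per iteration.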
We use a Map state, but the problem we will tackle and the solution are quite similar in a Parallel state.
Here we use a simple Pass state (DO_WORK) to yield a single value inside our Map. Of course, in the real world it would probably take multiple Task states to produce this value, but for the purpose of this demonstration a simple Pass is enough. We can look at the output of the Map and see that it is an array of the same size as the input array the Map iterates over:
Screenshot 1: The Input of the Map state, with the “key_to_iterate”, an array of 3 elements.
Screenshot 2: The Output of the Map state, an array of the 3 results of each independent iteration (the value returned by DO_WORK).
An ugly array of arrays…
All is well, but let us say that instead of yielding simple values like strings or numbers, each of your sub-state machines yields an array of values. And you would like your Parallel or Map to output a single, flattened array of all those values.
We recently came across such a use-case where we had a Map state that iterates over a list of Instance IDs and creates snapshots of the EBS volumes of each instance. What we want that Map to output is simply the list of the snapshot IDs. But, as each instance can have multiple volumes, the sub-state machine doing the job returns an array of snapshot IDs. Then the Map state returns an array of arrays of snapshot IDs. How can we flatten that into a simple array of snapshot IDs?
To better illustrate the use-case, we can modify our previous example by making the “DO_WORK” Pass state yield an array of values instead of a single value:
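As a sketch of that change, and assuming the rest of the state machine stays as it was, only DO_WORK needs to differ. The two values below are placeholders chosen for illustration, not the ones from the original example.

```json
{
  "DO_WORK": {
    "Type": "Pass",
    "Result": ["value_1", "value_2"],
    "End": true
  }
}
```

With three items to iterate over, a Map wrapping this state should output three copies of ["value_1", "value_2"], nested inside one outer array.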
Screenshot 3: The Output of the Map state, an array of the 3 arrays produced by each of the 3 independent iterations.
As expected, we got ourselves an array of arrays. Argh, ugly! Give me a flattened array with all the values!
When my colleague and I searched the Internet for a solution, we did not find anything (we may be bad at searching), either in the AWS documentation or in other resources. All the solutions pointed toward writing a simple Lambda function called by a Task state to do the job. But you don’t need to do that!
It turns out that you can use the JSONPath wildcard syntax [*] (JSONPath being the path language Step Functions uses for fields such as ResultSelector) for the win 🙂 And it is quite trivial, though not explicitly shown in any example of the AWS documentation.
In the definition of your Map or Parallel state, you only need this little trick:
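Here is a minimal sketch of the trick applied to a Map state whose iterations each return an array. The state name MAP, the placeholder values and the key name flatten are assumptions for illustration; any key name works as long as the ResultSelector and the OutputPath agree on it.

```json
{
  "Comment": "Sketch only: flatten the Map result with a JSONPath wildcard",
  "StartAt": "MAP",
  "States": {
    "MAP": {
      "Type": "Map",
      "ItemsPath": "$.key_to_iterate",
      "Iterator": {
        "StartAt": "DO_WORK",
        "States": {
          "DO_WORK": {
            "Type": "Pass",
            "Result": ["value_1", "value_2"],
            "End": true
          }
        }
      },
      "ResultSelector": {
        "flatten.$": "$[*][*]"
      },
      "OutputPath": "$.flatten",
      "End": true
    }
  }
}
```

The detour through a key is needed because a ResultSelector has to be a JSON object: the flattened array is first stored under flatten, and the OutputPath then keeps only that array as the state output.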
We add a ResultSelector and an OutputPath and… voilà:
Screenshot 4: The Output of the Map state, a flattened array of the values produced by the independent iterations.
No need for an additional State, no need for another Lambda function! And it also works to flatten arrays nested 3, 4 or more levels deep: just keep adding [*] in the ResultSelector! The only important constraint is that the depth (number of levels) must be fixed and the same everywhere.
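For instance, assuming every iteration returns an array of arrays (three levels in total), the two fields to set on the Map or Parallel state might look like this sketch. The key name flatten is again an arbitrary choice.

```json
{
  "ResultSelector": {
    "flatten.$": "$[*][*][*]"
  },
  "OutputPath": "$.flatten"
}
```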
Keeping the input
Of course, if you write exactly what I propose, you will lose the Input of your Map or Parallel state, and that may not be what you want. But you can keep the idea and apply it another way, with a ResultSelector and a ResultPath:
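The sketch below shows one way this could look. Instead of replacing the whole state output, the ResultPath attaches the selected result to the original input. The state name MAP, the placeholder values and the key names results and flatten are hypothetical and only serve the illustration.

```json
{
  "Comment": "Sketch only: keep the input and add the flattened array to it",
  "StartAt": "MAP",
  "States": {
    "MAP": {
      "Type": "Map",
      "ItemsPath": "$.key_to_iterate",
      "Iterator": {
        "StartAt": "DO_WORK",
        "States": {
          "DO_WORK": {
            "Type": "Pass",
            "Result": ["value_1", "value_2"],
            "End": true
          }
        }
      },
      "ResultSelector": {
        "flatten.$": "$[*][*]"
      },
      "ResultPath": "$.results",
      "End": true
    }
  }
}
```

With this variant the output still contains key_to_iterate, and the flattened array is available to later states at $.results.flatten.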
ABOUT THE AUTHOR
Passionate about science in general, and physics and astronomy in particular, Jérémie has been immersed in computing since the age of 12. His background in system administration and PowerShell scripting naturally led him to the AWS cloud. Jérémie is also an AWS Community Hero and an AWS trainer.