As you build a transformation, you may notice a sequence of steps that you want to repeat. This repetitive part can be turned into a mapping.
A mapping is a transformation with placeholder input and output steps. The mapping transformation is executed through the Mapping step in a parent transformation. Because the parent transformation runs a separate transformation through a specific step, the mapping transformation is commonly referred to as a sub-transformation. The sub-transformation must contain the following input and output steps:
- Mapping input: a placeholder where the mapping expects input from the parent transformation.
- Mapping output: a placeholder indicating where the parent transformation can read data from.
Use a mapping when you want to reuse a certain sequence of steps in a transformation. The following image illustrates mapping and the relationship of the parent
Enter the following information in the transformation step fields.
|Step Name||Specifies the unique name of the step on the canvas. You can place a step on the canvas several times; however, each placement represents the same step. The Step Name is set to Mapping (sub-transformation) by default.|
Specify the mapping sub-transformation to execute. Click Browse to display the virtual file system browser in PDI and enter the mapping details.
If you select a transformation that has the same root path as the
current transformation, the variable
If you are working with a repository, specify the name of the transformation. If you are not working with a repository, specify the XML file name of the transformation.
Transformations previously specified by reference are automatically converted to be specified by the transformation name within the Pentaho Repository.
Log lines in Kettle
To differentiate log lines from a mapping, edit the kettle.properties file and set the KETTLE_LOG_MARK_MAPPINGS variable to Y.
With this setting, log lines are preceded by the mapping step name and the mapping itself.
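As a minimal illustration, the corresponding entry in kettle.properties would look like the following fragment. The location of the file depends on your installation, so no path is shown here.

```properties
# Precede log lines from a mapping with the mapping step name and the mapping itself
KETTLE_LOG_MARK_MAPPINGS=Y
```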
The Mapping step features several tabs with fields for setting parameters and defining data flow. Each tab is described below.
You can use the following fields in the Parameters tab to define or pass Kettle variables down to the mapping sub-transformation.
|Variable Name||Add a string you want to assign as a variable.|
|String Value (can include variable expressions)||Add the value you want to assign to this variable name. It is possible to include variable expressions in the string values for the variable names.|
|Inherit all variables from the parent transformation||Select this option to make all variables that are available in the parent transformation available in the sub-transformation, even if they are not explicitly specified in the Parameters tab.|
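The Parameters tab behavior described above can be pictured with a small Python model. This is only a sketch of the semantics, not PDI's implementation: the function names (`resolve`, `build_mapping_variables`) and the sample variables are hypothetical, and leaving unknown `${NAME}` expressions untouched is an assumption.

```python
import re

# Matches variable expressions of the form ${NAME}.
_VAR_PATTERN = re.compile(r"\$\{([^}]+)\}")


def resolve(value, variables):
    """Replace each ${NAME} in value with its definition; leave unknown names untouched."""
    return _VAR_PATTERN.sub(lambda m: variables.get(m.group(1), m.group(0)), value)


def build_mapping_variables(parent_vars, parameters, inherit_all):
    """Return the variables visible inside the sub-transformation.

    parent_vars: variables defined in the parent transformation
    parameters:  rows of the Parameters tab as {variable name: string value}
    inherit_all: state of "Inherit all variables from the parent transformation"
    """
    child_vars = dict(parent_vars) if inherit_all else {}
    for name, raw_value in parameters.items():
        # String values may contain variable expressions, resolved against the parent.
        child_vars[name] = resolve(raw_value, parent_vars)
    return child_vars


# Hypothetical example values.
parent = {"BASE_DIR": "/data", "ENV": "test"}
params = {"INPUT_FILE": "${BASE_DIR}/in.csv"}

print(build_mapping_variables(parent, params, inherit_all=False))
print(build_mapping_variables(parent, params, inherit_all=True))
```

With the option cleared, only the explicitly listed variable reaches the sub-transformation; with it selected, the parent's variables are passed down as well.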
Although one input entry point is available by default, you can create additional input entry points. Each input corresponds to one Mapping input specification step in the mapping sub-transformation. You can have any number of these entry points in a single Mapping step, including none. The following table describes the options in the Input tab.
|Available inputs (Add input)||Use the Plus Sign button to add an input mapping for the specified sub-transformation. You can remove an input by clicking the X icon.|
|Main data path||Select this option if you have only one input mapping. You can then leave the two following fields (Input source step name and Mapping target step name) empty.|
|Input source step name||Specify the name of the step in the parent transformation (not the mapping) to read from. It can be any step in the parent transformation with an outgoing hop connected to the Mapping step. Click Choose to select the source step from a list.|
|Mapping target step name||Specify the name of the Mapping Input specification step inside the sub-transformation that is to receive the rows from the input source step. Click Choose to select the target step from a list.|
|Description||Add a description to this input step mapping.|
|Update mapped fields downstream||Fields are renamed before they are transferred to the mapping transformation. Select this option to rename them back to their original names when they reach the Mapping output step, which makes your sub-transformations more transparent and reusable.|
|Mapping||Click to open the Enter Mapping dialog box. Use this field mappings dialog to specify exactly how the fields from the input source step are connected to the fields of the Mapping target step. When you finish mapping fields, click OK in the Enter Mapping dialog box. Your field mapping will appear in the mapping table located beneath the Description field.|
Add inputs to table
To add inputs into the table, perform the following steps:
1. Click the Mapping button.
2. Click a Source Field you want to map.
3. Click a Target Field to associate with the Source Field.
4. Click Add. If you need to edit or change the Source or Target fields, click Delete.
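The effect of the mapping table on a row can be sketched in Python as a rename from source field names to target field names, optionally reversed on the way out. This is an illustrative model only; the field names and helper functions are hypothetical and do not come from PDI.

```python
def rename_fields(row, mapping):
    """Return a copy of row with keys renamed according to mapping (old name -> new name)."""
    return {mapping.get(name, name): value for name, value in row.items()}


def invert(mapping):
    """Swap the direction of a field mapping (target name -> source name)."""
    return {target: source for source, target in mapping.items()}


# Hypothetical mapping table: Source Field -> Target Field.
field_mapping = {"customer_id": "id", "customer_name": "name"}

parent_row = {"customer_id": 7, "customer_name": "Ada", "country": "UK"}

# Entering the sub-transformation: mapped fields take their target names.
inside = rename_fields(parent_row, field_mapping)

# With "Update mapped fields downstream" selected: original names are restored on output.
restored = rename_fields(inside, invert(field_mapping))

print(inside)
print(restored)
```

Fields that are not listed in the mapping table, such as `country` here, keep their names in both directions.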
By default, one output entry is available; however, you can add more output entries using the Plus Sign button by the Available outputs pane. Each output entry corresponds to one Mapping output specification step in the mapping sub-transformation. That means you can have any number of output entries (or none) in a single Mapping step.
|Available outputs (Add output)||Use the Plus Sign button to add a tab to specify an output mapping for the specified sub-transformation. You can remove an output entry by clicking the X icon.|
|Main data path||Select this option if you have only one output mapping. You can then leave the two following fields (Mapping source step name and Output target step name) empty.|
|Mapping source step name||The name of a Mapping output specification step in the sub-transformation where data will be read from. Use the Choose button to select this step from a list.|
|Output target step name||The name of the step in the current transformation (parent) that is to receive the rows from the Mapping source step. This can be any step whose incoming hop is connected to the Mapping step. Use the Choose button to select this step from a list.|
|Description||Add a description to this output step mapping here.|
|Mapping||Not enabled on the Output tab.|
Mapping Input Specification
This step acts as an input placeholder in a Mapping sub-transformation. It describes the places (0 or more) in the mapping where input is expected to occur. You can think of it as a special input step that receives data from its parent transformation.
To help you create and design the sub-transformation, this dialog box includes the required input fields for the sub-transformation.
|Step name||Name of the step. This name has to be unique in a single transformation.|
|Name||The name of the field as it will be known inside the sub-transformation.|
|Type||The data type of this field.|
|Length||Maximum string length.|
|Precision||Maximum number of decimals.|
|Include unspecified fields, ordered by name||Select this option when the sub-transformation should receive all incoming fields, not only the required ones. This is useful, for example, if you want to remove certain fields from the stream later on while retaining all other fields.|
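One way to picture the Include unspecified fields, ordered by name option is the following Python sketch. It assumes that the specified fields come first in their declared order and that the remaining incoming fields follow sorted by name; that ordering, the function name, and the sample field names other than leftValue and rightValue are assumptions made for illustration.

```python
def mapping_input_layout(incoming_fields, specified_fields, include_unspecified):
    """Return the field order seen inside the sub-transformation.

    Assumption: specified fields come first in their declared order, followed
    by any remaining incoming fields sorted by name.
    """
    layout = list(specified_fields)
    if include_unspecified:
        extras = sorted(f for f in incoming_fields if f not in specified_fields)
        layout.extend(extras)
    return layout


# Hypothetical incoming stream and required fields.
incoming = ["zip", "leftValue", "city", "rightValue"]
required = ["leftValue", "rightValue"]

print(mapping_input_layout(incoming, required, include_unspecified=False))
print(mapping_input_layout(incoming, required, include_unspecified=True))
```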
Mapping Output Specification
This step acts as an output placeholder in a Mapping sub-transformation. It describes the places (0 or more) in the mapping where output is expected to occur. You can think of it as a special output step that allows the parent transformation to receive data from a sub-transformation.
|Step name||Name of the step.|
The following sample transformations demonstrate the capabilities of the Mapping step. These samples are available in the distribution package and are included at the following location: design-tools/data-integration/samples/transformations/mapping
- Mapping: simple mapping.ktr: (an example of a sub-transformation)
- Mapping: use simple mapping.ktr: (an example of a parent transformation)
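If you want to check which placeholder steps a sub-transformation file contains without opening it in the designer, a short Python script can scan the XML. The sketch below rests on assumptions: that each step is stored as a `<step>` element with `<name>` and `<type>` children, and that the placeholder steps use the type identifiers `MappingInput` and `MappingOutput`. The inline XML is a hypothetical, heavily trimmed stand-in, not the content of the sample files.

```python
import xml.etree.ElementTree as ET

# Assumed step type identifiers as stored in the <type> element of a .ktr file.
PLACEHOLDER_TYPES = {"MappingInput", "MappingOutput"}


def find_mapping_placeholders(ktr_xml):
    """Return (step name, step type) pairs for the placeholder steps in a .ktr document."""
    root = ET.fromstring(ktr_xml)
    found = []
    for step in root.iter("step"):
        step_type = step.findtext("type", default="")
        if step_type in PLACEHOLDER_TYPES:
            found.append((step.findtext("name", default=""), step_type))
    return found


# Hypothetical stand-in for a sub-transformation file.
SAMPLE_KTR = """
<transformation>
  <step><name>Mapping input</name><type>MappingInput</type></step>
  <step><name>Calculate res</name><type>ScriptValueMod</type></step>
  <step><name>Mapping output</name><type>MappingOutput</type></step>
</transformation>
"""

print(find_mapping_placeholders(SAMPLE_KTR))
```

To inspect a real file, read its contents into a string and pass that string to `find_mapping_placeholders`.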
The two input fields (strings) that the script needs are: leftValue and rightValue. Define these two strings in the Mapping input step:
The calculated value res is a field we want to pass to the parent transformation, so we add a Mapping output step as well.
The resulting mapping looks like the following example:
Now that the mapping is specified, it's ready to be run.
In this example, two fields, A and B, come into the Mapping step named X=A+B. A mapping is made between:
- A and leftValue
- B and rightValue
- res and X, the result field
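The field correspondences listed above can be traced end to end with a small Python model. It is a sketch under stated assumptions, not the sample's actual logic: the script inside the sub-transformation is not shown here, so summing the two string inputs as numbers is an assumption, as is restoring the names A and B on output. All function names are hypothetical.

```python
INPUT_MAP = {"A": "leftValue", "B": "rightValue"}  # parent field -> mapping field
OUTPUT_MAP = {"res": "X"}                          # mapping field -> parent field


def rename(row, mapping):
    """Return a copy of row with keys renamed according to mapping."""
    return {mapping.get(name, name): value for name, value in row.items()}


def simple_mapping(rows):
    """Stand-in for the sub-transformation: reads leftValue and rightValue, adds res."""
    for row in rows:
        # Assumption: the script sums the two string inputs as numbers.
        res = float(row["leftValue"]) + float(row["rightValue"])
        yield {**row, "res": res}


def run_mapping_step(parent_rows):
    """Stand-in for the Mapping step X=A+B in the parent transformation."""
    renamed = (rename(row, INPUT_MAP) for row in parent_rows)
    restore_inputs = {target: source for source, target in INPUT_MAP.items()}
    for out in simple_mapping(renamed):
        # Assumption: input names are restored before res is exposed as X.
        yield rename(rename(out, restore_inputs), OUTPUT_MAP)


rows = [{"A": "1", "B": "2"}, {"A": "10", "B": "5.5"}]
results = list(run_mapping_step(rows))
print(results)
```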
This mapping is achieved with the Input and Output tabs of the Mapping dialog box, as shown in these sample screens.
In our sample, we use only one input mapping and one output mapping. It is possible, however, to use zero, one, or more input or output mappings in a mapping transformation. That means we need to be able to specify which input or output we are addressing in the various tabs. That is where the various step name choices come from in the screenshot.
In this sample, we checked the Main data path option. The corresponding Output tab shows the example's fieldname values in the mapping and target steps.