Pentaho Documentation

Mapping

As you build a transformation, you may notice a sequence of steps that you want to repeat. This repetitive part can be turned into a mapping.

A mapping is a transformation with placeholder input and output steps. The mapping transformation is executed through the Mapping step in a parent transformation. Because the parent transformation runs a separate transformation through a specific step, the mapping transformation is commonly referred to as a sub-transformation. The sub-transformation must contain the following input and output steps:

  • Mapping input: a placeholder where the mapping expects input from the parent transformation.
  • Mapping output: a placeholder indicating where the parent transformation can read data from.

Use mapping when you want to re-use a certain sequence of steps in a transformation. The following image illustrates mapping and the relationship of the parent transformation to the sub-transformation. 

General

Enter the following information in the transformation step fields.

Field  Description
Step Name

Specifies the unique name of the step on the canvas. You can place a step on the canvas several times; however, each copy represents the same step. The Step Name is set to 'Mapping (sub-transformation)' by default.
Transformation

Specify your mapping sub-transformation to execute by entering its path or clicking Browse.

If you select a transformation that has the same root path as the current transformation, the variable ${Internal.Entry.Current.Directory} is automatically inserted in place of the common root path. For example, if the current transformation's path is /home/admin/transformation.ktr and you select the transformation /home/admin/path/sub.ktr, then the path is automatically converted to ${Internal.Entry.Current.Directory}/path/sub.ktr.

If you are working with a repository, specify the name of the transformation. If you are not working with a repository, specify the XML file name of the transformation. 

Transformations previously specified by reference are automatically converted to be specified by the transformation name within the Pentaho Repository.
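
The automatic path substitution described for the Transformation field can be pictured with a short sketch. This is only an illustration of the documented behavior, not Pentaho's implementation; the function name to_variable_path is hypothetical, and the paths are the ones from the example above.

```python
import posixpath

# Variable name taken from the text above; everything else is illustrative.
CURRENT_DIR_VAR = "${Internal.Entry.Current.Directory}"


def to_variable_path(current_ktr: str, selected_ktr: str) -> str:
    """Replace the directory shared with the current transformation by the variable."""
    current_dir = posixpath.dirname(current_ktr).rstrip("/")
    prefix = current_dir + "/"
    if selected_ktr.startswith(prefix):
        return CURRENT_DIR_VAR + "/" + selected_ktr[len(prefix):]
    # No common root path: keep the path exactly as it was entered.
    return selected_ktr


print(to_variable_path("/home/admin/transformation.ktr", "/home/admin/path/sub.ktr"))
# prints ${Internal.Entry.Current.Directory}/path/sub.ktr
```

A path outside the current transformation's directory is returned unchanged in this sketch.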

Log Lines in Kettle

To differentiate log lines that come from a mapping, edit the kettle.properties file and set the KETTLE_LOG_MARK_MAPPINGS variable to Y. With this setting, log lines are preceded by the mapping step name and the mapping itself.
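
Because kettle.properties is a plain key=value properties file, the setting is a single line: KETTLE_LOG_MARK_MAPPINGS=Y. The following sketch shows one way to add or update that line programmatically. The helper set_property and the sample file content are hypothetical; the sketch handles only simple key=value lines and is not a full properties parser.

```python
def set_property(properties_text: str, key: str, value: str) -> str:
    """Return the properties text with key set to value, replacing an existing entry."""
    lines = properties_text.splitlines()
    updated = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and lines that are not simple key=value pairs.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical existing file content, for illustration only.
existing = "# sample kettle.properties content\nSOME_OTHER_VARIABLE=value\n"
print(set_property(existing, "KETTLE_LOG_MARK_MAPPINGS", "Y"))
```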

Options

The Mapping step features several tabs with fields for setting parameters and defining data flow. Each tab is described below.

Parameters Tab

You can use the following fields in the Parameters tab to define or pass Kettle variables down to the mapping sub-transformation.

Option Description
Variable Name

Add a string you want to assign as a variable.

String Value (can include variable expressions)

Add the value you want to assign to this variable name. The value can include variable expressions.
Inherit all variables from the parent transformation

Select this option to make all variables that are available in the parent transformation available in the sub-transformation, even if they are not explicitly specified in the Parameters tab. 

When this option is not checked, only those variables/values that are specified are passed down to the sub-transformation.
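
The effect of the Parameters tab can be modeled roughly as follows. This is a sketch of the rules described above, not Kettle's code: the function names and sample variables are hypothetical, and leaving unknown ${NAME} expressions untouched is an assumption.

```python
import re

_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve(value, variables):
    """Substitute ${NAME} expressions; unknown names are left as-is (assumption)."""
    return _VARIABLE.sub(lambda m: variables.get(m.group(1), m.group(0)), value)


def sub_transformation_variables(parent_vars, explicit, inherit_all):
    """Compute the variables visible inside the sub-transformation."""
    # With inheritance on, start from every parent variable; otherwise start empty.
    child = dict(parent_vars) if inherit_all else {}
    # Explicitly listed variables are always passed down, after expression resolution.
    for name, value in explicit.items():
        child[name] = resolve(value, parent_vars)
    return child


parent = {"BASE_DIR": "/data", "ENV": "test"}   # hypothetical parent variables
explicit = {"INPUT_FILE": "${BASE_DIR}/in.csv"}  # hypothetical Parameters tab row

print(sub_transformation_variables(parent, explicit, inherit_all=False))
print(sub_transformation_variables(parent, explicit, inherit_all=True))
```

With inheritance off, only INPUT_FILE reaches the sub-transformation; with it on, BASE_DIR and ENV are available as well.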

Input Tab

By default, one Input entry point is available, and you can create additional ones. Each Input corresponds to one Mapping Input specification step in the mapping sub-transformation. You can have any number of these entry points in a single Mapping step, including none. The following table describes the options in the Input tab.

Option Description
Available inputs (Add input) 

Use the + button to add an input mapping for the specified sub-transformation.

You can remove an Input by clicking the X icon.  

Main data path

Select this option if you have only one input mapping. You can then leave the two following fields (Input source step name and Mapping target step name) empty.
Input source step name

Specify the name of the step in the parent transformation (not the mapping) to read from. It can be any step in the parent transformation with an outgoing hop connected to the Mapping step. Click Choose to select the source step from a list.

Mapping target step name

Specify the name of the Mapping Input specification step inside the sub-transformation that is to receive the rows from the Input source step. Click Choose to select the target step from a list.

Description

Add a description to this input step mapping.

Update mapped fields downstream

Select this option to rename fields back to their original names when they reach the Mapping output step. This makes your sub-transformations more transparent and reusable. If not selected, fields get renamed before they are transferred to the mapping transformation.

Mapping

Click to open the Enter Mapping dialog box. Use this field mapping dialog to specify exactly how the fields from the Input source step are connected to the fields of the Mapping target step. When you finish mapping fields, click OK in the Enter Mapping dialog box. Your field mapping appears in the mapping table located beneath the Description field.

To add inputs into the table, perform the following steps:

  1. Click the Mapping button.
  2. Click a Source Field you want to map.
  3. Click a Target Field to associate with the Source Field.
  4. Click Add. If you need to change the Source or Target field of a mapping, click Delete to remove it.
  5. Click OK.
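
Conceptually, the mapping table is a set of source-to-target field renames, and the Update mapped fields downstream option applies the same renames in reverse. The sketch below models that idea with plain dictionaries. It is an illustration only: the helper names are hypothetical, and the field names are borrowed from the sample later on this page.

```python
def rename_fields(row, mapping):
    """Rename the keys of a row according to a source -> target mapping."""
    return {mapping.get(name, name): value for name, value in row.items()}


def invert(mapping):
    """Swap source and target so the renames can be undone."""
    return {target: source for source, target in mapping.items()}


# Rows of the mapping table: Source Field -> Target Field.
input_mapping = {"A": "leftValue", "B": "rightValue"}

parent_row = {"A": "foo", "B": "bar"}

# Names as seen inside the sub-transformation.
inside = rename_fields(parent_row, input_mapping)

# Names after being renamed back to the originals.
restored = rename_fields(inside, invert(input_mapping))

print(inside)
print(restored)
```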

Output Tab

By default, one Output entry is available; however, you can add more Output entries using the "Add Output" + button. Each Output entry corresponds to one Mapping Output specification step in the mapping sub-transformation. That means you can have any number of Output entries (or none) in a single Mapping step.
 

Options Description
Available Outputs (Add output) 

Use this button to add a tab to specify an output mapping for the specified sub-transformation. 

You can remove an Output entry by clicking the X icon. 

Main data path

Select this option if you have only one output mapping. You can then leave the two following fields (Mapping source step name and Output target step name) empty.

Mapping source step name

The name of the Mapping Output specification step in the sub-transformation from which data is read. Click Choose to select this step from a list.

Output target step name

The name of the step in the current (parent) transformation that is to receive the rows from the Mapping source step. This can be any step whose incoming hop is connected to the Mapping step. Click Choose to select this step from a list.

Description

Add a description to this output step mapping.
Mapping

Not enabled on the Output tab.

 

Mapping Input Specification

This step acts as an input placeholder in a Mapping sub-transformation. It describes the places (0 or more) in the mapping where input is expected to occur. You can think of it as a special input step that receives data from its parent transformation.

To help you create and design the sub-transformation, this dialog includes the required input fields for the sub-transformation:  

Options

Option Description
Step Name

Name of the step. 

This name has to be unique in a single transformation.

Name

The name of the field as it will be known inside the sub-transformation.

Type

The data type of this field.

Length

Maximum string length.

Precision

Maximum number of decimals.
Include unspecified fields, ordered by name

In some cases, you want the sub-transformation to receive not only the required fields but all fields from the parent transformation. Select this option to allow for this situation.

This is the case, for example, if you want to remove certain fields from the stream later on, retaining all other fields.

During design time, none of these unspecified fields will be available in field picklists etc., since these fields will be available only during runtime, not design time.
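
The row layout that a Mapping Input specification step produces can be sketched as follows. Two points are assumptions rather than documented behavior: that the specified fields come first, with the unspecified fields appended in name order (a reading of the option's name), and that a missing required field is treated as an error. The function name is hypothetical.

```python
def mapping_input_layout(incoming_fields, specified_fields, include_unspecified):
    """Return the field order seen inside the sub-transformation."""
    missing = [name for name in specified_fields if name not in incoming_fields]
    if missing:
        # Assumption: a required field that the parent does not supply is an error.
        raise ValueError(f"required fields not supplied by parent: {missing}")
    layout = list(specified_fields)
    if include_unspecified:
        # Assumption: unspecified fields follow the specified ones, sorted by name.
        layout += sorted(name for name in incoming_fields if name not in specified_fields)
    return layout


incoming = ["zip", "leftValue", "city", "rightValue"]  # hypothetical parent fields
required = ["leftValue", "rightValue"]

print(mapping_input_layout(incoming, required, include_unspecified=False))
print(mapping_input_layout(incoming, required, include_unspecified=True))
```

Only the required fields are known when the sub-transformation is designed, which is why the extra fields do not show up in field picklists.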

Mapping Output Specification

This step acts as an output placeholder in a Mapping sub-transformation. It describes the places (0 or more) in the mapping where output is expected to occur. You can think of it as a special output step that allows the parent transformation to receive data from a sub-transformation.

Options

Option Description
Step Name

Name of the step.

This name has to be unique in a single transformation.

Samples

The following sample transformations demonstrate the capabilities of the Mapping step. These samples are included in the distribution package at the following location:

design-tools/data-integration/samples/transformations/mapping

  • Mapping - simple mapping.ktr: (an example of a sub-transformation)

  • Mapping - use simple mapping.ktr:  (an example of a parent transformation)

Suppose we have a JavaScript step that we want to use multiple times in several transformations. This sample uses a simple concatenation to demonstrate mapping.

The two input fields (strings) that the script needs are: leftValue and rightValue. Define these two strings in the Mapping Input step:

The calculated value res is a field we want to pass to the parent transformations, so we add a Mapping Output step as well.

Mapping Input and Mapping Output are placeholders; they contain no actual logic.

The resulting mapping looks like the following example: 

Now that the mapping is specified, it's ready to be run. 

In this example, there are two fields coming into the Mapping step "X=A+B": A and B. A "mapping" is made between:

  • "A" and "leftValue"
  • "B" and "rightValue"
  • "res" and "X" the result field

This mapping is achieved with the "Input" and "Output" tabs of the Mapping dialog, as shown in these sample screens. 
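
The data flow of this sample can be imitated outside Pentaho with a few lines of Python. This sketch only mirrors the field mappings listed above; the function names and input values are hypothetical, and the concatenation stands in for the JavaScript step of the sub-transformation.

```python
def rename(rows, mapping):
    """Apply a source -> target field mapping to every row."""
    for row in rows:
        yield {mapping.get(name, name): value for name, value in row.items()}


def simple_mapping(rows):
    """Stand-in for the sub-transformation: concatenate leftValue and rightValue into res."""
    for row in rows:
        yield {**row, "res": row["leftValue"] + row["rightValue"]}


def run_mapping_step(parent_rows):
    """Stand-in for the Mapping step "X=A+B" in the parent transformation."""
    input_mapping = {"A": "leftValue", "B": "rightValue"}  # Input tab
    output_mapping = {"res": "X"}                          # Output tab
    mapped_in = rename(parent_rows, input_mapping)
    results = simple_mapping(mapped_in)
    return list(rename(results, output_mapping))


rows = run_mapping_step([{"A": "Hello, ", "B": "world"}])
print(rows[0]["X"])  # prints Hello, world
```

The sub-transformation never sees the names A, B, or X, which is what lets the same mapping be reused from parents with different field names.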

In our sample, we use only one input mapping and one output mapping. It is possible, however, to use zero, one, or more input or output mappings in a mapping transformation, so each tab must identify which input or output it addresses. That is why the step name fields appear in the screenshot.

In this sample, we checked the Main data path option. The corresponding output tab shows the example's fieldname values in the mapping and target steps.