
Hitachi Vantara Lumada and Pentaho Documentation

MapReduce Input

This step defines the key/value pairs for Hadoop input, and indicates the injection point of the transformation that receives input from the MapReduce framework. The rest of the transformation operates on the fields that came from this step.

AEL considerations

When using the MapReduce Input step with the Adaptive Execution Layer (AEL), the following factor affects performance and results:

  • Spark processes null values differently than the Pentaho engine. You must adjust your transformation so that null values are processed correctly under Spark's rules.
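The difference can be sketched as follows. By default, the Pentaho (Kettle) engine treats a null and an empty string as equivalent when comparing values, whereas Spark's SQL semantics keep them distinct and never treat a null operand as equal to anything. The function names below are illustrative only and are not part of either product's API:

```python
# Illustrative sketch only; neither function is a Pentaho or Spark API.
# It models the default comparison behavior of each engine for nulls.

def pentaho_style_equals(a, b):
    """Kettle-style default: None and "" normalize to the same value."""
    normalize = lambda v: None if v == "" else v
    return normalize(a) == normalize(b)

def spark_style_equals(a, b):
    """Spark-style semantics: a null operand never compares equal."""
    if a is None or b is None:
        return False
    return a == b

pentaho_style_equals(None, "")  # True under the Pentaho engine's defaults
spark_style_equals(None, "")    # False under Spark semantics
```

A transformation that relies on the Pentaho behavior (for example, filtering rows where a field "equals" an empty string) may therefore produce different results when executed on Spark through AEL.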

Options

Enter the following information in the transformation step fields.

  • Step name: Specifies the unique name of the MapReduce Input step on the canvas. A MapReduce Input step can be placed on the canvas several times; however, every instance represents the same MapReduce Input step. You can customize the name or leave it as the default.
  • Key field: The Hadoop input field and data type that represent the key in MapReduce.
  • Value field: The Hadoop input field and data type that represent the value in MapReduce.
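As an illustration of the key/value contract described above, the following sketch mimics a text-input split, where each record's key is the byte offset of a line and its value is the line text (a common Hadoop convention, e.g. LongWritable/Text). The function name and offsets here are assumptions for illustration, not part of the step's configuration:

```python
# Hypothetical sketch of the key/value pairs the MapReduce framework hands to
# the transformation via the MapReduce Input step. For a text split, the key
# is typically the byte offset of each line and the value is the line itself.

def split_records(data: str):
    """Yield (key, value) pairs: (byte offset, line text) for each line."""
    offset = 0
    for line in data.splitlines(keepends=True):
        yield offset, line.rstrip("\n")  # one (key, value) record
        offset += len(line)

rows = list(split_records("alpha\nbeta\n"))
# rows -> [(0, "alpha"), (6, "beta")]
```

The rest of the transformation then operates on these two fields, using whatever field names and data types you configure in the step.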

Metadata injection support

All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.