Hitachi Vantara Lumada and Pentaho Documentation

Kinesis Producer

The Kinesis Producer step pushes data from a PDI transformation to an existing stream in a specific region within the Amazon Kinesis Data Streams (KDS) service. You can use this step to publish a stream of data to KDS for storage in S3, Redshift, or EMR HDFS. For more information about the Amazon Kinesis Data Streams protocol, see https://aws.amazon.com/kinesis/.

For the Kinesis Producer step, configure the connection to Amazon and specify the target Amazon region and KDS stream. Only one region can be selected.

AEL considerations

When you use the Kinesis Producer step with the Adaptive Execution Layer (AEL), the following consideration may affect performance and results.

  • When running the Kinesis Producer step on AEL Spark, use HDP 3.x. Earlier versions of HDP are not supported.

General

Enter the following information:

  • Step name: Specifies the unique name of the Kinesis Producer step on the canvas. You can customize the name or leave Kinesis Producer as the default.

Options

The Kinesis Producer step includes a tab for setting up the connection to the Amazon Kinesis Data Streams service and a tab for adding configuration options. Each tab is described below.

Setup tab

In the Setup tab, define the connection to the Amazon region, the target stream in the Amazon Kinesis Data Streams (KDS) service, the partition key field, and the message field:

Kinesis Producer setup tab
  • Region: Specify the Amazon geographical region where the stream is located. You can select only one region.

  • Stream name: Specify the Amazon KDS stream name by selecting one of the following methods:

      • Specify stream name: Specify the Stream name to which the data will be published.

      • Get data from field: Specify the Field name from another step that is generating data in the transformation stream. Using the drop-down list, select the name of the field to use.

  • Partition key field: Select the PDI field that contains the partition key. The partition key is used to group records into shards within the stream.

  • Message field: Select the PDI field that contains data to write to the stream.
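
The Setup tab options can be pictured as a mapping from each incoming row to one Kinesis record. The following Python sketch is illustrative only and is not the step's implementation: the function names, the row layout, and the sample field names are hypothetical. The dictionary keys (StreamName, PartitionKey, Data) follow the Kinesis PutRecord API, and the hashing reflects how Kinesis maps a partition key to a 128-bit value with MD5. The shard_index helper additionally assumes that the hash key space is split evenly across shards, which may not hold after a stream is resharded.

```python
import hashlib


def build_put_record_request(row, partition_key_field, message_field,
                             stream_name=None, stream_name_field=None):
    """Map one row (field name -> value) to PutRecord-style arguments.

    Hypothetical helper: exactly one of stream_name ("Specify stream name")
    or stream_name_field ("Get data from field") must be given.
    """
    if (stream_name is None) == (stream_name_field is None):
        raise ValueError("set exactly one of stream_name or stream_name_field")
    target = stream_name if stream_name is not None else row[stream_name_field]
    message = row[message_field]
    # Kinesis record data is a byte blob; text is encoded as UTF-8 here.
    data = message if isinstance(message, bytes) else str(message).encode("utf-8")
    return {
        "StreamName": target,
        "PartitionKey": str(row[partition_key_field]),
        "Data": data,
    }


def partition_hash_key(partition_key):
    """Return the 128-bit integer obtained by MD5-hashing the partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key, shard_count):
    """Pick a shard, ASSUMING the hash key space is split into equal ranges."""
    return partition_hash_key(partition_key) * shard_count // 2**128


# Example row with hypothetical field names.
row = {"device_id": "sensor-17", "payload": '{"temp": 21.5}', "target": "telemetry"}
request = build_put_record_request(
    row, "device_id", "payload", stream_name_field="target"
)
print(request["StreamName"], shard_index(request["PartitionKey"], 4))
```

Rows that share a partition key value always hash to the same value, so they land in the same shard. Choosing a field with many distinct values spreads records more evenly across shards.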

Options tab

In the Options tab, define the Value for the following write and connection timeout settings:

Kinesis Producer options tab
  • Write Timeout Seconds: Specify the timeout, in seconds, for the Kinesis Producer step to wait when writing to the KDS stream. The default value is 30.

  • Connection Timeout Seconds: Specify the timeout, in seconds, for the Kinesis Producer step to wait while trying to connect to the server. The default value is 2.

  • Connection Acquisition Timeout Seconds: Specify the timeout, in seconds, for the Kinesis Producer step to wait while trying to connect after the initial connection is made. The default value is 10.
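
As a quick reference, the three timeout settings and their documented defaults can be written as a small configuration object. This Python sketch is only an illustration: the class and attribute names are hypothetical, and the rule that each value must be a positive whole number of seconds is an assumption, not documented behavior of the step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProducerTimeouts:
    """Hypothetical container for the Options tab values (all in seconds)."""

    write_timeout_seconds: int = 30                    # documented default
    connection_timeout_seconds: int = 2                # documented default
    connection_acquisition_timeout_seconds: int = 10   # documented default

    def __post_init__(self):
        # Assumption: only positive whole seconds are meaningful here.
        for name, value in vars(self).items():
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


defaults = ProducerTimeouts()
# Example override for a slow network link; the other values keep their defaults.
slow_link = ProducerTimeouts(connection_timeout_seconds=10)
```

Overriding one value at a time, as in slow_link, makes it easier to see which setting resolves a timeout when tuning the step.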

Metadata injection support

All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.