Pentaho Documentation

SSTable Output

The SSTable Output step writes incoming rows to a filesystem directory in Apache Cassandra SSTable format, using CQL (Cassandra Query Language) version 3.x.

Note: This step supports Cassandra 2.2 and later.

AEL considerations

When using the SSTable Output step with the Adaptive Execution Layer, the following factor affects performance and results:

  • Spark processes null values differently than the Pentaho engine. You will need to adjust your transformation to successfully process null values according to Spark's processing rules.
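As one illustration of the kind of adjustment this can involve: by default the Pentaho engine treats an empty string as null (unless KETTLE_EMPTY_STRING_DIFFERS_FROM_NULL is set to Y), whereas Spark keeps the two distinct. The sketch below is a hypothetical, engine-independent helper, not part of PDI or Spark, that normalizes empty strings to nulls so that both engines see the same values. The function name and the sample row are assumptions made for this example.

```python
from typing import Any, Dict


def normalize_nulls(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row with empty strings replaced by None.

    Hypothetical helper: mirrors the Pentaho default of treating an empty
    string as null, so that downstream logic behaves the same under Spark.
    """
    return {
        name: (None if isinstance(value, str) and value == "" else value)
        for name, value in row.items()
    }


# Sample row (made-up field names, for illustration only).
sample = {"id": 1, "name": "", "city": "Oslo"}
normalized = normalize_nulls(sample)
print(normalized)
```

In a real transformation the equivalent logic would live in a step placed before SSTable Output; which null cases actually need handling depends on your data.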

Options

The following options are available for the SSTable Output transformation step.

Option | Description
Step name | Specify the unique name of the SSTable Output step on the canvas. You can customize the name or leave it as the default.
Cassandra yaml file | Specify the location of the cassandra.yaml file, the main configuration file for Cassandra. It defines node and cluster configuration details.
Directory | Specify the directory where the output is written. The directory path identifies the target table to load and must match the Keyspace and Table fields.
Keyspace | Specify the name of the keyspace (database) that contains the target table. The name must match the Directory field.
Table | Specify the table (column family) to load. The step assumes that the metadata for this table is already defined in Cassandra. The table name must match the Directory field.
Incoming fields to use as the key | Specify which incoming fields to use as the key. You can use Set Fields to choose the key from the names of incoming PDI transformation fields.
Set Fields | Select from a list of incoming PDI transformation fields to populate the Incoming fields to use as the key option.
Buffer (MB) | Specify the buffer size to use, in megabytes. A new table file is written every time the buffer is full.
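The requirement that Directory match Keyspace and Table can be checked before running a transformation. The following is a minimal sketch, assuming that "match" means the last two path components are the keyspace and table names, which is the layout Cassandra's sstableloader expects. The article does not spell out the rule, so treat this as an assumption. The second function gives a rough estimate of how many table files a given Buffer (MB) setting would produce, assuming one file per full buffer and ignoring compression and per-row overhead. All function names and paths are hypothetical.

```python
import math
from pathlib import PurePosixPath


def directory_matches(directory: str, keyspace: str, table: str) -> bool:
    """Return True if the path ends in <keyspace>/<table>.

    Assumption: the step expects the same layout as sstableloader,
    where the last two directory names are the keyspace and the table.
    """
    parts = PurePosixPath(directory).parts
    return len(parts) >= 2 and parts[-2] == keyspace and parts[-1] == table


def estimated_file_count(data_mb: float, buffer_mb: int) -> int:
    """Roughly estimate how many table files are written.

    Assumption: one file per full buffer, plus one for any remainder.
    """
    if buffer_mb <= 0:
        raise ValueError("buffer_mb must be positive")
    return math.ceil(data_mb / buffer_mb)


# Hypothetical values for illustration only.
ok = directory_matches("/tmp/sstables/sales/orders", "sales", "orders")
bad = directory_matches("/tmp/sstables/sales", "sales", "orders")
files = estimated_file_count(100, 16)
print(ok, bad, files)
```

A smaller buffer reduces memory use but yields more, smaller files to load later. The estimate is only indicative, because on-disk SSTable sizes differ from in-memory buffer sizes.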