Pentaho Documentation

SSTable Output

The SSTable Output step writes incoming rows to a filesystem directory as Apache Cassandra SSTable files, using CQL (Cassandra Query Language) version 3.x.

This step supports Cassandra 2.2 and later.

AEL Considerations

When using the SSTable Output step with the Adaptive Execution Layer (AEL), the following factors affect performance and results:

  • Spark processes null values differently from the Pentaho engine. Adjust your transformation so that null values are handled according to Spark's processing rules.
  • Metadata injection is not supported for steps running on AEL.

Options

SSTable Output Properties Dialog Box

The following options are available for the SSTable Output transformation step:

Option Description

Step name

Specify the unique name of the SSTable Output step on the canvas. You can customize the name or leave it as the default.

Cassandra yaml file

Specify the location of the cassandra.yaml file. This file is the main configuration file for Cassandra and defines node and cluster configuration details.

Directory

Specify the directory where the output is written. The directory identifies the target table to load and must match the values in the Keyspace and Table fields.

Keyspace

Specify the keyspace (database) name of the target table to load. The name specified must match the Directory field.

Table

Specify the table (column family) to load. The step assumes that the metadata for this table has already been defined in Cassandra. The table specified must match the Directory field.

Incoming field to use as the row key

Specify which incoming field to use as the row key. You can use Set fields to choose the key from the names of the incoming PDI transformation fields.

Set fields

Select from a list of incoming PDI transformation fields to specify as the Incoming field to use as the row key.

Buffer (MB)

Specify the buffer size to use, in megabytes. A new table file is written every time the buffer is full.
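
The Directory, Keyspace, and Table options must agree with one another, and the Buffer (MB) value determines how often a new table file is written. The following Python sketch shows one way to sanity-check these settings before running a transformation. It is not part of PDI. It assumes, for illustration only, that the directory path ends in <keyspace>/<table>, and that the amount of buffered data is known in megabytes. The function names and sample values are hypothetical.

```python
import math
from pathlib import PurePosixPath


def directory_matches(directory: str, keyspace: str, table: str) -> bool:
    """Return True if the last two path components are <keyspace>/<table>.

    Assumption: the <base>/<keyspace>/<table> layout is illustrative.
    The step documentation only states that the three fields must match.
    """
    parts = PurePosixPath(directory).parts
    return len(parts) >= 2 and parts[-2:] == (keyspace, table)


def estimated_table_files(buffered_mb: float, buffer_mb: int) -> int:
    """Rough count of table files: one per full buffer, plus a final partial one.

    Assumption: buffered_mb is the in-memory size of the data. The size of
    the files on disk can differ from this figure.
    """
    if buffer_mb <= 0:
        raise ValueError("Buffer (MB) must be a positive number")
    return math.ceil(buffered_mb / buffer_mb)


# Hypothetical values for illustration only.
print(directory_matches("/data/sstables/sales/orders", "sales", "orders"))     # True
print(directory_matches("/data/sstables/sales/orders", "sales", "customers"))  # False
print(estimated_table_files(1000, 128))                                        # 8
```

A check like this can catch a mismatch between the Directory, Keyspace, and Table values early. The file count is only an estimate, because the documentation does not describe how buffered data maps to on-disk size.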