Hitachi Vantara Lumada and Pentaho Documentation

Strings cut

The Strings cut step returns a substring of an input string based on a range of character locations. For example, to extract the time “11:00” from the filename “11:00 am update”, use Strings cut to return the substring that starts at index 0 and ends before index 5.

Note: If you specify locations outside of the character length of the string, Strings cut returns a blank string.
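
As a rough analogy only, Python slice notation follows the same zero-based, end-exclusive convention, and it also yields an empty string when both locations lie past the end of the input. The helper name `cut` is hypothetical and is not part of PDI; the sketch assumes non-negative locations and does not reproduce the step's actual implementation.

```python
def cut(value: str, cut_from: int, cut_to: int) -> str:
    """Return the characters from cut_from up to, but not including, cut_to.

    Hypothetical helper that imitates the documented behavior with a Python
    slice. It assumes non-negative locations and is not Pentaho code.
    """
    return value[cut_from:cut_to]


filename = "11:00 am update"
time_part = cut(filename, 0, 5)       # characters at indexes 0 through 4
out_of_range = cut(filename, 40, 45)  # both locations are past the end

print(repr(time_part))
print(repr(out_of_range))
```

Printing with `repr` makes the empty result visible as a pair of quotes rather than a blank line.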

General

Enter the following information in the transformation step field:

  • Step name

    Specify the unique name of the Strings cut step on the canvas. You can customize the name or leave it as the default.

The fields to cut

The fields to cut table in the Strings cut step

Use The fields to cut table to specify what fields to cut and where to cut them.

The table contains the following columns:

  • In stream field

    Name of the field containing the string to cut. Use Get fields to populate the table with fields from the incoming PDI data stream.

  • Out stream field

    (Optional) A new outgoing PDI field containing the resulting substring. If you do not specify a value for this field, the In stream field is replaced by the resulting substring.

  • Cut from

    The character location at the starting point of the substring. This value is zero-based: the first character of the entire input string has an index of 0.

  • Cut to

    The character location after the ending point of the substring, such that the substring extends up to but does not include the location entered. This value is zero-based and exclusive. For example, setting Cut to to a value of 1 (with Cut from set to 0) returns the first character in In stream field.

The maximum length of the resulting string is Cut to minus Cut from.
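
The way the four columns interact can be sketched as follows. This is a minimal illustration under stated assumptions: a row is modeled as a plain Python dict, and `apply_cut` along with its parameter names is hypothetical, chosen to mirror the column names rather than any PDI API.

```python
from typing import Optional


def apply_cut(row: dict, in_field: str, cut_from: int, cut_to: int,
              out_field: Optional[str] = None) -> dict:
    """Model one line of the fields-to-cut table on a dict-based row.

    Hypothetical sketch: the dict row and this function are assumptions
    made for illustration, not PDI code.
    """
    result = dict(row)  # copy so the incoming row is left untouched
    substring = row[in_field][cut_from:cut_to]
    # With no Out stream field, the In stream field itself is overwritten.
    result[out_field or in_field] = substring
    return result


row = {"filename": "11:00 am update"}
replaced = apply_cut(row, "filename", 0, 5)                 # no Out stream field
added = apply_cut(row, "filename", 0, 5, out_field="time")  # new outgoing field

print(replaced)
print(added)
```

In both calls the result is at most Cut to minus Cut from (here 5) characters long.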

Example

Given the string “example”, you can use Strings cut to parse out the first letter “e” by understanding the index of each character in the string, as shown in the following table:

Character   e   x   a   m   p   l   e
Index       0   1   2   3   4   5   6

The Cut from position of the first letter “e” is 0. The Cut to position is 1.

To parse out “am”, set Cut from to 2, the index of “a”. Set Cut to to 4, which is one past the index of “m” (3), because the Cut to location itself is not included in the substring.
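
The same two cuts can be checked with Python slices, which share the zero-based, end-exclusive convention described here. This is an analogy for verifying index arithmetic, not the step's implementation; the variable names are illustrative.

```python
word = "example"

# Pair every character with its zero-based index, as in the table above.
indexed = list(enumerate(word))

first_letter = word[0:1]  # Cut from 0, Cut to 1
middle = word[2:4]        # Cut from 2, Cut to 4

print(indexed)
print(first_letter, middle)
```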

Metadata injection support

All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.