Pentaho Documentation

Run Work Items on Pentaho Worker Nodes

After you install and set up Pentaho Worker Nodes, you can run transformations and jobs as work items that scale out across clusters when your workload increases or when you process Kafka streaming data. You can run work items in real time from the PDI client or the User Console, or schedule them to run during off-peak hours or on a recurring basis. Regardless of which Pentaho application and method you use to run your work items, the processed transformation and job results reside within the Pentaho platform.

These tasks are intended for Pentaho administrators who know where the data is stored, how to connect to it, and details about the computing environment. To run work items, you must have Pentaho Worker Nodes installed and set up on your system. For more information, see Install Pentaho Worker Nodes on a Single Instance of HCI and Set Up Worker Nodes on the Pentaho Server.

Run Work Items using the PDI Client

When using the PDI client, you must set up the Run configuration of a transformation or job to run it as a work item. Afterward, you can run those saved work items immediately or schedule the items to run at regular intervals, on certain dates and times, or with different parameters.

Perform the following actions to run work items on worker nodes using the PDI client:

  1. Make sure that you are connected to the Pentaho Repository.
  2. Create or edit the run configuration for the transformation or job through the Run configurations folder in the View tab:
  • To create a new run configuration, right-click the Run configurations folder and select New, as shown in the folder structure below:

PDI_RunConfig_New_Dialog.png

  • To edit an existing run configuration, open the file, right-click the run configuration that you want to change, and then select Edit, as shown in the folder structure below:

PDI_RunConfig_Edit_Dialog.png

  3. In the Run configuration dialog box, enter or select the options shown in the table below.

PDI_RunConfig_Dialog.png

The Run configuration dialog box contains the following options when Pentaho is selected as the Engine for running a transformation or job:

Option       Description
Name         Specify the name of the run configuration.
Description  Optionally, specify details of your configuration.
Engine       Select Pentaho.
Settings     Select Pentaho server.
  4. Click OK. The PDI client is ready to run the work item using worker nodes. To run the work items at specific times, on a recurring basis, with different parameters, or to manage scheduled items, see PDI Client Scheduling for details.
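
The options in the table above can be summarized as a small data model. The following Python sketch is illustrative only: `RunConfiguration` and its `problems` method are hypothetical names invented for this example and are not part of any Pentaho API. It simply records the four dialog options and checks that Engine and Settings hold the values this procedure calls for.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunConfiguration:
    """Hypothetical model of the Run configuration dialog box options."""

    name: str                          # Name: required
    description: str = ""              # Description: optional
    engine: str = "Pentaho"            # Engine: Pentaho
    settings: str = "Pentaho server"   # Settings: Pentaho server

    def problems(self) -> List[str]:
        """Return a list of reasons this configuration would not run
        as a work item according to the table above (empty if none)."""
        issues = []
        if not self.name.strip():
            issues.append("Name is required.")
        if self.engine != "Pentaho":
            issues.append("Engine must be set to Pentaho.")
        if self.settings != "Pentaho server":
            issues.append("Settings must be set to Pentaho server.")
        return issues


if __name__ == "__main__":
    config = RunConfiguration(name="worker-nodes", description="Run as a work item")
    print(config.problems())
```

A checklist of this kind can be handy when reviewing several run configurations, but the dialog box itself remains the place where the configuration is actually created.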

Run Work Items using the User Console

When using the User Console, you can launch saved work items to run on worker nodes immediately, at a scheduled time, or at a regular interval.

Perform these initial steps to run work items on worker nodes using the User Console:

  1. Connect to the Pentaho Repository and then save your transformation or job file to a folder in the Pentaho Repository.
  2. Log in to the User Console, and then click the Browse Files button.
  3. In the Folders pane, click the folder containing the file that you want to run.
  4. In the File pane, click the file that you want to run.
  5. Next, proceed according to how and when you want to run the file:

PUC_BrowseFiles_Screen.png

Running Work Items in the Background

Perform the following actions to run a file immediately:

  1. In the File Actions pane, select Run in background. The Run In Background dialog box appears.

PUC_RunInBackground_Screen.png

  2. Enter your selections for the following options in the Run In Background dialog box:
Option                      Description
Schedule Name               Specify a name for the schedule, which will also be the name of the generated content.
Generated Content Location  Specify a location for the generated content.
  3. Click OK. The work item now runs on worker nodes, and its content is delivered to your specified location.

Running Work Items using Scheduling

Perform the following actions to run a file on a specific date and time or at a recurring interval:

  1. In the File Actions pane, select Schedule. The New Schedule dialog box appears.

PUC_NewSchedule_ScheduleName_Dialog.png

  2. The New Schedule dialog box contains the following options when scheduling a file:
Option                      Description
Schedule Name               Specify a name for the schedule, which will also be the name of the generated content.
Generated Content Location  Specify a location for the generated content.
  3. Enter your selections and then click Next.
  4. Options for customizing your schedule appear. Enter your selections for these options.

PUC_NewScheduleTime_Dialog.png

Option      Description
Recurrence  Specifies a recurring period in which the file is run. Options include:
  • Run Once - Runs the file one time.
  • Seconds - Runs the file repeatedly: specify the Recurrence pattern (in seconds) and the Range of recurrence (Start and End date).
  • Minutes - Runs the file repeatedly: specify the Recurrence pattern (in minutes) and the Range of recurrence (Start and End date).
  • Hours - Runs the file repeatedly: specify the Recurrence pattern (in hours) and the Range of recurrence (Start and End date).
  • Daily - Runs the file repeatedly: specify the Recurrence pattern (in days) and the Range of recurrence (Start and End date).
  • Weekly - Runs the file repeatedly: specify the Recurrence pattern (the day of the week) and the Range of recurrence (Start and End date).
  • Monthly - Runs the file repeatedly: specify the Recurrence pattern (the day of the month) and the Range of recurrence (Start and End date).
  • Yearly - Runs the file repeatedly: specify the Recurrence pattern (the month of the year) and the Range of recurrence (Start and End date).
  • Cron - Runs the file according to the Cron String: specify the Cron attributes and the Range of recurrence (Start and End date).
Start Time  Specify a start time to run the file.
Start Date  Specify a start date to run the file.
  5. Click Finish. The work item runs on worker nodes at the scheduled time and at the specified recurrence rate, and its content is output to your selected location. To manage scheduled work items in the User Console, see Manage Schedules for details.
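
To get a feel for how a Recurrence pattern combines with a Range of recurrence, the following Python sketch lists the run times that an interval-based schedule (Seconds, Minutes, Hours, or Daily) would produce between a start and an end date. It also splits a Cron String into named fields. Both helpers, `expand_recurrence` and `split_cron`, are hypothetical names written for this illustration and are not part of Pentaho. The field order in `QUARTZ_FIELDS` assumes a Quartz-style cron expression (seconds first, optional year last), so check the scheduler documentation for your Pentaho version before relying on it.

```python
from datetime import datetime, timedelta

# Assumed Quartz-style field order; the seventh field (year) is optional.
QUARTZ_FIELDS = (
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
)


def expand_recurrence(start, end, every, limit=1000):
    """Return the run times from start to end (inclusive), spaced by `every`.

    `limit` caps the list so that a very short interval over a long
    range does not produce an unbounded result.
    """
    if every <= timedelta(0):
        raise ValueError("The recurrence pattern must be a positive interval.")
    runs = []
    current = start
    while current <= end and len(runs) < limit:
        runs.append(current)
        current += every
    return runs


def split_cron(expression):
    """Split a cron string into a dict of named fields (6 or 7 fields)."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError("Expected 6 or 7 space-separated fields.")
    return dict(zip(QUARTZ_FIELDS, parts))


if __name__ == "__main__":
    # Every 2 hours during an off-peak window.
    window = expand_recurrence(
        start=datetime(2024, 1, 1, 2, 0),
        end=datetime(2024, 1, 1, 8, 0),
        every=timedelta(hours=2),
    )
    for run_time in window:
        print(run_time.isoformat())

    # Under the Quartz-style assumption, this reads as 02:00:00 every day.
    print(split_cron("0 0 2 * * ?"))
```

Listing the run times before you click Finish can help confirm that a short interval over a long range does not schedule far more runs than intended.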

Administer the Pentaho Worker Nodes Product 

After running work items, use the following article to learn how to monitor work items: