Pentaho Documentation

Pentaho Worker Nodes System Recommendations

There are several hardware, networking, and operating system recommendations for running the Pentaho Worker Nodes Product on one or more instances.

Resource Recommendations

This section provides basic resource requirements for running the Pentaho Worker Nodes Product on an HCI instance. You can scale your worker node environment based on your work item load. When aligning your available resources with the work item load you want to run, keep the following guidelines in mind:

  • A single instance of HCI requires 8 GB of RAM and 2 cores per machine, plus 50 GB of available disk space.

    Resource                Required to run a single HCI instance
    RAM                     8 GB
    CPU                     2 cores
    Available disk space    50 GB

  • A single worker node is configured for 8 GB of RAM and 2 cores from the cluster, plus 8 GB of available disk space.

    Resource                Required to run a single worker node
    RAM                     8 GB
    CPU                     2 cores
    Available disk space    Approximately 8 GB

Given the above guidelines, a machine with 32 GB of RAM and 8 cores must reserve 8 GB and 2 cores for running the required HCI instance. You can, therefore, allocate the remaining 24 GB and 6 cores to running three worker nodes simultaneously.

Resource                Required to run 3 worker nodes simultaneously on a single HCI instance
RAM                     32 GB = 8 GB (HCI instance) + 24 GB (8 GB per worker node)
CPU                     8 cores = 2 cores (HCI instance) + 6 cores (2 cores per worker node)
Available disk space    Approximately 75 GB = 50 GB (HCI instance) + approximately 25 GB (approximately 8 GB per worker node)
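The sizing arithmetic above can be sketched as a small calculation. This is only an illustration built from the figures in this section; the function name and constants are hypothetical and are not part of HCI or Pentaho.

```python
# Figures taken from the guidelines in this section.
HCI_RAM_GB, HCI_CORES, HCI_DISK_GB = 8, 2, 50
NODE_RAM_GB, NODE_CORES, NODE_DISK_GB = 8, 2, 8  # disk figure is approximate


def max_worker_nodes(ram_gb, cores, disk_gb):
    """Estimate how many worker nodes fit on one machine after
    reserving the resources for the required HCI instance."""
    spare_ram = ram_gb - HCI_RAM_GB
    spare_cores = cores - HCI_CORES
    spare_disk = disk_gb - HCI_DISK_GB
    if min(spare_ram, spare_cores, spare_disk) < 0:
        return 0  # the machine cannot even host the HCI instance
    # The scarcest resource decides how many worker nodes fit.
    return min(
        spare_ram // NODE_RAM_GB,
        spare_cores // NODE_CORES,
        spare_disk // NODE_DISK_GB,
    )


print(max_worker_nodes(32, 8, 75))  # 3
```

Because the result is the minimum across RAM, CPU, and disk, adding RAM alone to the 32 GB machine does not allow a fourth worker node; the 8 cores remain the limit.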

Use these guidelines to scale your resources. The more worker nodes you run simultaneously, the more hardware and resources you may need.

Single-Instance Systems Versus Multi-Instance Systems

An HCI system can have either a single instance or four or more instances. Each instance must meet the minimum RAM, CPU, and disk space requirements. 

Three instances are sufficient to perform leader election for distributing work. However, a multi-instance system requires a minimum of four instances because, with the minimum hardware requirements, three instances are not sufficient for running all HCI services at their recommended distributions. 

Single-Instance System

A single-instance system is useful for testing and demonstration purposes. It requires only a single server and can perform all HCI functionality. 
However, a single-instance system has the following drawbacks: 

  • It has a single point of failure. If the instance hardware fails, you lose access to HCI. 
  • With no additional instances, you cannot choose where to run HCI services. All services run on that one instance. 

Multi-Instance System

A multi-instance system is recommended for use in a production environment because it offers the following advantages: 

  • You can control how services are distributed across the multiple instances, providing improved service redundancy, scale-out, and availability. 
  • A multi-instance system can survive instance outages. For example, with a four-instance system running the default distribution of services, the system can lose one instance and remain available. 
  • Performance is improved since work is performed in parallel across instances. 
  • You can add additional instances to the system at any time. 

You cannot convert a single-instance system to a production-ready multi-instance system by adding new instances, because HCI does not support adding master instances. Master instances are special instances that run a particular set of HCI services. A single-instance system has one master instance. A multi-instance system has a minimum of three master instances. 

If you add instances to a single-instance system, the system still has only one master instance, which remains a single point of failure for the essential services that only a master instance can run. 

A multi-instance system should have a minimum of three master instances. You can add a non-master instance or a worker node to a multi-instance system only if the system started with at least three master instances.

Determine the three master instance IP addresses before you run the Setup script. After HCI is installed, any IP change, such as changing single-instance IP values to multi-instance IP values, requires the complete removal and reinstallation of HCI.
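Because IP values cannot be changed after installation, it may help to sanity-check a planned layout before running the Setup script. The following is a minimal sketch of the instance-count rules described in this section. The function name and the example addresses are hypothetical, and the check is not part of the HCI installer.

```python
import ipaddress


def check_topology(instance_ips, master_ips):
    """Return a list of problems found in a planned HCI layout.

    Rules taken from this section: a system has either one instance
    (with one master) or at least four instances (with at least three
    masters). Everything else here is an illustrative assumption.
    """
    problems = []
    for ip in sorted(set(instance_ips) | set(master_ips)):
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            problems.append(f"not a valid IP address: {ip}")
    if len(set(instance_ips)) != len(instance_ips):
        problems.append("instance IP addresses must be distinct")
    if not set(master_ips) <= set(instance_ips):
        problems.append("every master IP must belong to a listed instance")

    instances, masters = len(instance_ips), len(set(master_ips))
    if instances == 0:
        problems.append("no instances listed")
    elif instances == 1:
        if masters != 1:
            problems.append("a single-instance system has exactly one master")
    elif instances < 4:
        problems.append("a multi-instance system needs at least four instances")
    elif masters < 3:
        problems.append("a multi-instance system needs at least three masters")
    return problems


plan = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]  # example values only
print(check_topology(plan, plan[:3]))  # []
```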

For information on adding instances to an existing HCI system, see the HCI Administrator Help, which is available from the Administration App.

Docker and Operating System Requirements 

To be an HCI instance, each server you provide must meet the following requirements: 

  • Must have Docker version 1.10.3 or later installed 
  • Must run a 64-bit Linux distribution 

You must install the current Docker version suggested by your operating system, unless that version is earlier than 1.10.3. HCI cannot run with Docker versions prior to 1.10.3. 

For more information about the Docker versions suggested by various operating systems, refer to the HCI Install Guide included with your installation.
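The minimum-version rule can be checked with a short script. This sketch assumes the usual `docker --version` output of the form `Docker version 1.10.3, build 20f81dd`; the helper names are hypothetical.

```python
import re

MIN_DOCKER = (1, 10, 3)  # minimum version stated in this section


def parse_docker_version(text):
    """Extract (major, minor, patch) from `docker --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def docker_version_ok(text):
    """Return True if the reported version is 1.10.3 or later."""
    return parse_docker_version(text) >= MIN_DOCKER


print(docker_version_ok("Docker version 1.10.3, build 20f81dd"))  # True
print(docker_version_ok("Docker version 1.9.1, build a34a1d5"))   # False
```

Comparing integer tuples rather than strings matters here: as plain text, "1.9.1" sorts after "1.10.3", which would wrongly accept the older release.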

Docker Considerations

Ensure that the Docker storage driver is configured correctly on each instance before installing HCI. After HCI is installed, changing the Docker storage driver requires a reinstallation of HCI. 

To view the current Docker storage driver on an instance, run the command: 

docker info 

Do not run the Docker Device Mapper storage driver in loop-lvm mode on a production system, because it can slow system performance. On certain Linux distributions, your system may not have enough space to run it.
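To check several instances, you could capture the `docker info` output and inspect it with a script. The sketch below assumes that the output contains a `Storage Driver:` line and that, for the Device Mapper driver, a `Data loop file:` line indicates loop-lvm mode, as described in Docker's Device Mapper documentation. Verify both assumptions against your Docker version; the function name is hypothetical.

```python
def storage_driver_report(docker_info_text):
    """Return (driver_name, uses_loop_lvm) from `docker info` output."""
    driver = None
    loop_lvm = False
    for line in docker_info_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "Storage Driver":
            driver = value.strip()
        elif key == "Data loop file":
            # Assumption: this line appears only in loop-lvm mode.
            loop_lvm = True
    return driver, loop_lvm


# Abbreviated sample output, for illustration only.
SAMPLE = """Containers: 0
Storage Driver: devicemapper
 Pool Name: docker-202:1-8413957-pool
 Data file: /dev/loop0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
"""

print(storage_driver_report(SAMPLE))  # ('devicemapper', True)
```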

The Docker installation directory on each instance must have at least 20 GB available for storing the HCI Docker images.
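A quick way to confirm the free space is to query the file system that holds the Docker installation directory. This sketch takes the directory as a parameter because its location varies by system (`/var/lib/docker` is a common default, but that is an assumption). It also treats 1 GB as 2^30 bytes, and the function names are hypothetical.

```python
import shutil

REQUIRED_GB = 20  # minimum stated in this section


def free_gb(path):
    """Return free space, in GB, on the file system containing `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room_for_images(path, required_gb=REQUIRED_GB):
    """Return True if `path` has at least `required_gb` GB available."""
    return free_gb(path) >= required_gb


# Example: replace "." with the Docker installation directory on the instance.
print(has_room_for_images("."))
```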

Networking 

The following describes the network usage and requirements for both system instances and services.

Notes
  • You must configure the network settings for each service when you install the system. You cannot change these settings after the system is up and running.
  • If your networking environment changes after you deploy HCI, such that HCI can no longer function with its current networking configuration, you need to reinstall the HCI system. For more information about networking, refer to the HCI Install Guide included with your installation.

For more information about adding network security, see Enabling Secure Communication for Pentaho Worker Nodes.

Instance IP Address Requirements 

All instance IP addresses must be static, including both internal and external network IP addresses, if applicable to your system. 

If the IP address of any instance changes, refer to the HCI Install Guide included with your installation.

Network Types 

Each HCI service can bind to one type of network, either internal or external, for receiving incoming traffic. If your network infrastructure supports having two networks, you may want to isolate the traffic for most system services to a secured internal network that has limited access. You can then leave only the Search-App and Admin-App services on your external network for user access. 

You can use either a single network type for all services or a mix of both types. If you want to use both types, every instance in your system must be addressable by two IP addresses: one on your internal network and one on your external network. If you use only one network type, each instance needs only one IP address. 
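The addressing rule above can be expressed as a simple check on a planned address list. This is an illustrative sketch only: the data layout, the instance names, and the function name are hypothetical.

```python
import ipaddress


def check_addresses(instances, uses_internal, uses_external):
    """Check that each instance has an IP address for every network type in use.

    `instances` maps an instance name to a dict such as
    {"internal": "10.0.0.1", "external": "192.0.2.1"}.
    """
    needed = [
        kind
        for kind, used in (("internal", uses_internal), ("external", uses_external))
        if used
    ]
    problems = []
    for name, addrs in instances.items():
        for kind in needed:
            value = addrs.get(kind)
            if value is None:
                problems.append(f"{name}: missing {kind} IP address")
                continue
            try:
                ipaddress.ip_address(value)
            except ValueError:
                problems.append(f"{name}: invalid {kind} IP address {value!r}")
    return problems


plan = {
    "instance-1": {"internal": "10.0.0.1", "external": "192.0.2.1"},
    "instance-2": {"internal": "10.0.0.2"},  # no external address planned
}
print(check_addresses(plan, uses_internal=True, uses_external=True))
# ['instance-2: missing external IP address']
```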

Allowing Access to External Resources

Regardless of whether you are using a single network type or a mix of types, you need to configure your network environment to ensure that all instances have outgoing access to the external resources you want to use, including:

  • The data sources where your data is stored. 
  • Identity providers for user authentication. 
  • Email servers that you want to use for sending email notifications. 
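One way to spot-check outgoing access from an instance is to attempt a TCP connection to each external resource. This sketch only tests whether a TCP connection can be opened; it does not verify authentication or the application protocol. The function name and the host names in the usage comment are hypothetical.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical hosts; substitute your own resources):
# for host, port in [("ldap.example.com", 389), ("smtp.example.com", 25)]:
#     print(host, port, can_reach(host, port))
```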

Ports

Each service binds to a number of ports for receiving incoming traffic. 

Before installing HCI, you can configure the services to use different ports, or use the default values shown below. For more information, see Optional: Set Up Networking for System Services.

System-External Ports 

The following table contains information about the service ports that users use to interact with the system. On every instance in the system, each of these ports must be accessible from: 

  • Any network that requires administrative or search access to the system. 
  • Every other instance in the system.

Default Port Value    Service      Purpose
8000                  Admin-App    Access to administrative interfaces:
                                     • Administration App
                                     • Administrative REST API
                                     • Administrative CLI

If you are enabling security, you will need to indicate a port value for secure communication. See Enabling Secure Communication for Pentaho Worker Nodes for more information.

System-Internal Ports

Determine which ports each HCI service should use. You can use the default ports for each service or specify different ones. In either case, these restrictions apply:

  • Every port must be accessible from all instances in the system.
  • Some ports must be accessible from outside the system.
  • All port values must be unique; no two services, whether system services or HCI services, can share the same port.
  • For information on port usage and requirements for each HCI service, see Ports.
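The uniqueness restriction is easy to verify on a planned port list before installation. In this sketch, only the Admin-App default of 8000 comes from this document. The other service names and port numbers are placeholders, and the function name is hypothetical.

```python
from collections import defaultdict


def find_port_conflicts(assignments):
    """Return {port: [services]} for every port claimed more than once.

    `assignments` is an iterable of (service_name, port) pairs.
    """
    by_port = defaultdict(list)
    for service, port in assignments:
        by_port[port].append(service)
    return {port: names for port, names in by_port.items() if len(names) > 1}


plan = [
    ("Admin-App", 8000),
    ("placeholder-service-a", 9000),  # hypothetical service and port
    ("placeholder-service-b", 8000),  # hypothetical; clashes with Admin-App
]
print(find_port_conflicts(plan))
# {8000: ['Admin-App', 'placeholder-service-b']}
```

Include the ports used by other software running on each instance in the same list, since the restriction covers system services as well as HCI services.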

You can find more information on how these ports are used in the documentation for the third-party software underlying each service. Refer to “Appendix B: Services” in the HCI Install Guide included with your installation.

Set Up HCI and Pentaho for Worker Nodes

Complete the instructions in the following articles to set up HCI and Pentaho to use worker nodes:

  1. Packaging the Pentaho Worker Nodes Product
  2. Install Pentaho Worker Nodes on a Single Instance of HCI
  3. Set Up Pentaho Worker Nodes on Pentaho Server

Run and Administer the Pentaho Worker Nodes Product

Use the following articles to assist you in running and administering Pentaho Worker Nodes: