Pentaho Documentation

Configure Static and Dynamic Carte Clusters

If you want to speed up the processing of your transformations, consider setting up a Carte cluster. A Carte cluster consists of two or more Carte slave servers and a Carte master server. When you run a transformation, its parts are distributed across the Carte slave server nodes for processing, while the Carte master server node tracks progress.

Configure a Static Carte Cluster

Follow the directions below to set up static Carte slave servers.

  1. Copy over any required JDBC drivers and PDI plugins from your development instances of PDI to the Carte instances.
  2. Run the Carte script with the IP address, hostname, or domain name of this server and the port number on which you want Carte to be available.
    ./carte.sh 127.0.0.1 8081
  3. If you will be executing content stored in a DI Repository, copy the repositories.xml file from the .kettle directory on your workstation to the same location on your Carte slave. Without this file, the Carte slave will be unable to connect to the DI Repository to retrieve content.
  4. Ensure that the Carte service is running as intended, accessible from your primary PDI development machines, and that it can run your jobs and transformations.
  5. To start this slave server every time the operating system boots, create a startup or init script to run Carte at boot time with the same options you tested with.

Configure a Dynamic Carte Cluster

This procedure is only necessary for dynamic cluster scenarios in which one Carte server will control multiple slave Carte instances.

The following instructions explain how to create the carte-master-config.xml and carte-slave-config.xml files. You can give these files different names, but their content must follow the instructions below.

Configure Carte Master Server

Follow the process below to configure the Carte Master Server.

  1. Copy over any required JDBC drivers from your development instances of PDI to the Carte instances. 
  2. Create a carte-master-config.xml configuration file using the following example as a template:
<slave_config>
<!-- on a master server, the slaveserver node contains information about this Carte instance -->
    <slaveserver>
        <name>Master</name>
        <hostname>yourhostname</hostname>
        <port>9000</port>
        <username>cluster</username>
        <password>cluster</password>
        <master>Y</master>
    </slaveserver>
</slave_config>

The <name> of the Master server must be unique among all Carte instances in the cluster.

  3. Run the Carte script with the carte-master-config.xml parameter. If you placed the carte-master-config.xml file in a different directory than the Carte script, add the path to the file to the command.
    ./carte.sh carte-master-config.xml

  4. Ensure that the Carte service is running as intended.
  5. To start this master server every time the operating system boots, create a startup or init script to run Carte at boot time.
You now have a Carte master server to use in a dynamic cluster. Next, configure the Carte slave servers.

Tuning Options

The table below shows the three configurable settings for schedule and remote execution logging in the slave-server-config.xml file.

To make modifications to slave-server-config.xml, you must stop the DI Server.

Property | Values | Description
max_log_lines | Any value of 0 (zero) or greater. 0 indicates that there is no limit. | Truncates the execution log when it exceeds this many lines.
max_log_timeout_minutes | Any value of 0 (zero) or greater. 0 indicates that there is no timeout. | Removes lines from each log entry if they are older than this many minutes.
object_timeout_minutes | Any value of 0 (zero) or greater. 0 indicates that there is no timeout. | Removes entries from the list if they are older than this many minutes.

The following code block is an example of the slave-server-config.xml file:

<slave_config>
  <max_log_lines>0</max_log_lines>
  <max_log_timeout_minutes>0</max_log_timeout_minutes>
  <object_timeout_minutes>0</object_timeout_minutes>
</slave_config>  
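
As a rough illustration of bounded logging, the sketch below sets non-zero limits for all three properties. The specific values (10000 lines, and 1440 minutes, which is one day) are assumptions chosen for this example, not recommended defaults; adjust them to your own retention needs.

```xml
<slave_config>
  <!-- illustrative value: truncate each execution log beyond 10000 lines -->
  <max_log_lines>10000</max_log_lines>
  <!-- illustrative value: remove log lines older than 1440 minutes (24 hours) -->
  <max_log_timeout_minutes>1440</max_log_timeout_minutes>
  <!-- illustrative value: remove list entries older than 1440 minutes (24 hours) -->
  <object_timeout_minutes>1440</object_timeout_minutes>
</slave_config>
```

By contrast, setting all three properties to 0 removes the line limit and both timeouts.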

Configure Carte Slave Servers

Follow the directions below to set up dynamic Carte slave servers.

  1. Follow the process to configure the Carte Master Server.
  2. Make sure the Master server is running.
  3. Copy over any required JDBC drivers from your development instances of PDI to the Carte instances.
  4. In the /pentaho/design-tools/ directory, create a carte-slave-config.xml configuration file using the following example as a template:
<slave_config>
    <!-- the masters node defines one or more load balancing Carte instances that will manage this slave -->
    <masters>
        <slaveserver>
            <name>Master</name>
            <hostname>yourhostname</hostname>
            <port>9000</port>
            <!-- uncomment the next line if you want the DI Server to act as the load balancer -->
            <!-- <webAppName>pentaho-di</webAppName> -->
            <username>cluster</username>
            <password>cluster</password>
            <master>Y</master>
        </slaveserver>
    </masters>
    <report_to_masters>Y</report_to_masters>
    <!-- the slaveserver node contains information about this Carte slave instance -->
    <slaveserver>
        <name>SlaveOne</name>
        <hostname>yourhostname</hostname>
        <port>9001</port>
        <username>cluster</username>
        <password>cluster</password>
        <master>N</master>
    </slaveserver>
</slave_config>

The slaveserver <name> must be unique among all Carte instances in the cluster.

  5. If you want a slave server to use the same Kettle properties as the master server, add the <get_properties_from_master> and <override_existing_properties> tags between the <slaveserver> and </slaveserver> tags for the slave server. Put the name of the master server between the <get_properties_from_master> and </get_properties_from_master> tags. For example:
<!-- the slaveserver node contains information about this Carte slave instance -->
    <slaveserver>
        <name>SlaveOne</name>
        <hostname>yourhostname</hostname>
        <port>9001</port>
        <username>cluster</username>
        <password>cluster</password>
        <master>N</master>
        <get_properties_from_master>Master</get_properties_from_master>
        <override_existing_properties>Y</override_existing_properties>
    </slaveserver>
  6. Save and close the file.
  7. Run the Carte script with the carte-slave-config.xml parameter. If you placed the carte-slave-config.xml file in a different directory than the Carte script, add the path to the file to the command.
    ./carte.sh carte-slave-config.xml
  8. If you will be executing content stored in a DI Repository, copy the repositories.xml file from the .kettle directory on your workstation to the same location on your Carte slave. Without this file, the Carte slave will be unable to connect to the DI Repository to retrieve PDI content.
  9. Stop, then start the master and slave servers.
  10. Stop, then start the DI Server.
  11. Ensure that the Carte service is running as intended. If you want to start this slave server every time the operating system boots, create a startup or init script to run Carte at boot time.
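
Because every Carte instance in the cluster needs a unique name, each additional slave gets its own configuration file that points at the same master. The following is a minimal sketch of such a file for a second slave. The name SlaveTwo, the port 9002, and the file name carte-slave-two-config.xml are hypothetical values chosen for illustration.

```xml
<!-- carte-slave-two-config.xml (hypothetical file name) -->
<slave_config>
    <!-- same master entry as in the first slave's configuration -->
    <masters>
        <slaveserver>
            <name>Master</name>
            <hostname>yourhostname</hostname>
            <port>9000</port>
            <username>cluster</username>
            <password>cluster</password>
            <master>Y</master>
        </slaveserver>
    </masters>
    <report_to_masters>Y</report_to_masters>
    <!-- this instance: the name must differ from Master and SlaveOne -->
    <slaveserver>
        <name>SlaveTwo</name>
        <hostname>yourhostname</hostname>
        <!-- assumed port; it must be free on this host -->
        <port>9002</port>
        <username>cluster</username>
        <password>cluster</password>
        <master>N</master>
    </slaveserver>
</slave_config>
```

A distinct port is needed only when two instances share a host. Slaves on separate machines could reuse the same port number as long as their names differ.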

Changing Jetty Server Parameters

Carte runs on an embedded Jetty server. You do not need to configure the Jetty server for Carte to work. However, if you want to change the default connection parameters, complete the steps in one of the subsections that follow.
 
Jetty Server Parameter | Definition
acceptors | The number of threads dedicated to accepting incoming connections. The number of acceptors should be less than or equal to the number of CPUs.
acceptQueueSize | The number of connection requests that can be queued before the operating system starts to send rejections.
lowResourcesMaxIdleTime | Allows the server to rapidly close idle connections in order to gracefully handle high load situations.

To learn more about these options, see the Jetty documentation: http://wiki.eclipse.org/Jetty/Howto/Configure_Connectors#Configuration_Options. For more information about a high load setup, read this article: https://wiki.eclipse.org/Jetty/Howto/High_Load.

Setting the Jetty Server Parameters in the carte-slave-config.xml file

To change the Jetty Server parameters in the carte-slave-config.xml file, complete these steps.
  1. In the /pentaho/design-tools/ directory, open the carte-slave-config.xml file and add these lines between the <slave_config> and </slave_config> tags.
<slave_config>
...
    <!-- Carte uses an embedded jetty server. Include this next section only if you want to change the default jetty configuration options.-->
    <jetty_options>
        <acceptors>2</acceptors>
        <acceptQueueSize>2</acceptQueueSize>
        <lowResourcesMaxIdleTime>2</lowResourcesMaxIdleTime>
    </jetty_options>
</slave_config>
  2. Adjust the values for the parameters as necessary, then save and close the file.
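
For a sense of what adjusted values might look like, here is a hedged sketch for a hypothetical 4-CPU host. All three numbers are assumptions for illustration only. The millisecond unit for lowResourcesMaxIdleTime comes from the Jetty connector documentation rather than from this page, so verify it against the Jetty version bundled with your Carte installation.

```xml
<slave_config>
    <!-- existing masters and slaveserver nodes stay as they are -->
    <jetty_options>
        <!-- assumed 4-CPU host: keep acceptors at or below the CPU count -->
        <acceptors>4</acceptors>
        <!-- illustrative number of connection requests allowed to queue -->
        <acceptQueueSize>200</acceptQueueSize>
        <!-- illustrative idle limit, assumed to be in milliseconds -->
        <lowResourcesMaxIdleTime>5000</lowResourcesMaxIdleTime>
    </jetty_options>
</slave_config>
```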

Setting the Jetty Server Parameters in the kettle.properties file

To change the Jetty Server parameters in the kettle.properties file, set the following variables to the numeric values you want. See Set Kettle Variables if you need more information on how to do this.
Kettle Variable in kettle.properties | Jetty Server Parameter
KETTLE_CARTE_JETTY_ACCEPTORS | acceptors
KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE | acceptQueueSize
KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME | lowResourcesMaxIdleTime