Pentaho Documentation

Connect to a Pentaho Data Service

Overview

Explains how to use the Pentaho Data Service in a Pentaho or non-Pentaho tool.

A Pentaho Data Service is a virtual table that contains the output of a PDI transformation step.  You can connect to and query the Pentaho Data Service from any Pentaho tool, such as Report Designer, the PDI client (Spoon), and Analyzer.  You can also connect to and query it from a non-Pentaho tool, like RStudio or SQuirreL.  To learn more about the Pentaho Data Service, refer to the Turn Transformation Step Results Into a Pentaho Data Service and Create a Pentaho Data Service articles.

To connect to and query the Pentaho Data Service, you need permission to run the transformation and to access the Pentaho Server where it is published.

Access a Pentaho Data Service

Once you've created a Pentaho Data Service, you can share it with others. Here is how they can connect to the service.

Connect to the Pentaho Data Service from a Pentaho Tool

Connecting to the data service from another Pentaho tool is similar to connecting to a database. For information on connecting to a database, refer to Specify Data Connections for the DI Server.  The following table provides values for the typical parameters that you'll need to connect.

Required Parameters    Description
Connection Name        Name that you specify.
Connection Type        Pentaho Data Services
Access                 Native (JDBC)
Hostname               Hostname or IP address of the DI Server. The default is localhost if you are running the Pentaho Server locally.
Port Number            Port number of the Pentaho Server that the data service runs on. The default is 9080.
Username               Name of a user who has permission to run the data service.
Password               Password for a user who has permission to run the data service.
Webappname             Name of the web application, typically pentaho-di. Specify this in the Options section of the Kettle database connection window if you want to connect with PDI to a Pentaho Server.

You can also set the following optional parameters.

Optional Parameters             Description
proxyhostname                   Proxy server for HTTP connections.
proxyport                       Proxy server port.
nonproxyhosts                   Hosts that do not use the proxy server. If there is more than one host name, separate them with commas.
debugtrans                      Optional name of the file where the generated transformation is stored so that you can debug it. Specify the name of the transformation or a path plus the name of the transformation. Example: /tmp/debug.ktr
debuglog                        Set this parameter to "true" if you want the log data from the transformation to be written to the general logging channel that appears in Spoon.
PARAMETER_[optionname]=value    Sets the value for a parameter in the transformation. [optionname] is the name of the parameter, and value is the value assigned to it. PARAMETER_ is placed before the option name. For example, if the name of the parameter is "model", set the parameter as follows: PARAMETER_model=E6530
secure                          Set this parameter to TRUE to use the secure HTTPS protocol to connect to the data service. If you omit this parameter or set it to FALSE, the standard unsecured HTTP protocol is used.
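
As a rough illustration of how the optional parameters fit together, the Python sketch below collects a few of them into a set of option names and values, adding the PARAMETER_ prefix to each transformation parameter. The helper name build_options and its arguments are hypothetical and are not part of any Pentaho API. Only the option names and the PARAMETER_ prefix come from the table above.

```python
def build_options(transformation_parameters=None, secure=False, debuglog=False):
    """Return a dict of option names and values for a data service connection.

    Hypothetical helper for illustration only; it is not part of Pentaho.
    """
    options = {}
    if secure:
        # TRUE selects HTTPS; omitting the option leaves plain HTTP in use.
        options["secure"] = "TRUE"
    if debuglog:
        # "true" sends the transformation log to the general logging channel.
        options["debuglog"] = "true"
    for name, value in (transformation_parameters or {}).items():
        # Each transformation parameter is passed as PARAMETER_<name>=<value>.
        options["PARAMETER_" + name] = str(value)
    return options


# Example: the "model" parameter from the table, sent over HTTPS.
for name, value in build_options({"model": "E6530"}, secure=True).items():
    print(f"{name}={value}")
```

Each printed name=value pair corresponds to one entry that you would add in the Options section of the connection window, or to one option in a JDBC connection string.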

Install Pentaho Data Service JDBC Driver Files on a Non-Pentaho Tool

To connect to and run a Pentaho Data Service from a non-Pentaho tool, like SQuirreL or Beaker, you need to install the service driver files, then create a connection to the data service.  Pentaho Data Service JDBC driver files are available with installations of Pentaho.

  1. Go to the pentaho/data-integration/Data Service JDBC Driver directory on a computer that has Pentaho Data Integration installed, and copy the files in it.
  2. Paste the files to the directory in your application where driver files are kept.
  3. If necessary, stop and restart the application.
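
If you prefer to script the copy in steps 1 and 2, a minimal Python sketch might look like the following. The function name copy_driver_files is hypothetical, and both directory paths are placeholders that depend on where Pentaho Data Integration and your application are installed.

```python
import shutil
from pathlib import Path


def copy_driver_files(source_dir, target_dir):
    """Copy every file in source_dir into target_dir and return the file names.

    Hypothetical helper: source_dir stands for the
    "Data Service JDBC Driver" directory and target_dir for the directory
    where your application keeps its driver files.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    # Create the target directory if the application has not created it yet.
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():  # subdirectories are skipped
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied
```

For example, copy_driver_files("pentaho/data-integration/Data Service JDBC Driver", "/path/to/your/tool/lib") would copy the driver files, where the second path is an assumed location that you need to replace with your tool's driver directory.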

Connect to the Pentaho Data Service from a Non-Pentaho Tool

Once the driver is installed, you need to create the connection to the Pentaho Data Service.  For many tools, you'll do this by specifying a connection object.  Review the connection details and optional parameters in Connect to the Pentaho Data Service from a Pentaho Tool.

You'll probably also need the JDBC Driver class from the following table.

Parameter Value
JDBC Driver Class org.pentaho.di.trans.dataservice.jdbc.ThinDriver

Example of JDBC Connection String

The JDBC connection string uses this format:

jdbc:pdi://hostname:port/kettle?option=value&option=value

Here is an example of a connection string.  The webappname is required.

jdbc:pdi://localhost:9080/kettle?webappname=pentaho-di
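
The format above can be sketched in Python as a pair of helpers, one that assembles a connection string and one that checks an existing string for the required webappname option. The function names build_jdbc_url and parse_jdbc_url are hypothetical. Whether the driver expects option values to be URL-encoded is an assumption here, so treat this as a sketch rather than a reference implementation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_jdbc_url(hostname, port, webappname, **options):
    """Assemble jdbc:pdi://hostname:port/kettle?option=value&option=value."""
    # webappname is required, so it is always placed in the query string.
    query = {"webappname": webappname, **options}
    return f"jdbc:pdi://{hostname}:{port}/kettle?{urlencode(query)}"


def parse_jdbc_url(url):
    """Split a connection string into (hostname, port, options)."""
    prefix = "jdbc:"
    if not url.startswith(prefix + "pdi://"):
        raise ValueError("expected a string starting with jdbc:pdi://")
    # Drop the leading "jdbc:" so the remainder parses like an ordinary URL.
    parts = urlsplit(url[len(prefix):])
    options = {name: values[0] for name, values in parse_qs(parts.query).items()}
    if "webappname" not in options:
        raise ValueError("the webappname option is required")
    return parts.hostname, parts.port, options


print(build_jdbc_url("localhost", 9080, "pentaho-di"))
```

Calling build_jdbc_url("localhost", 9080, "pentaho-di") reproduces the example connection string shown above.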

Query a Pentaho Data Service

You can query the data service using SQL, in the same way that you would normally run queries in the tool you are using.  You cannot query a Pentaho Data Service with MongoDB aggregation pipelines: MongoDB can be used as an input source, but you can only query the data service with SQL.

  • To find the name of the table to query, connect to the data service, then use your tool's explorer to find the name of the table.  The name of the table is usually the same as the name of the data service.
  • Note that there are SQL limitations for queries.
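
As an illustration of querying from a script rather than from a graphical tool, the Python sketch below runs one SQL statement through a JDBC bridge. It assumes the third-party JayDeBeApi package and a Java runtime are installed, which this article does not cover. The function name run_query, the connect argument, and the table name in the usage note are hypothetical. Only the driver class name comes from this article.

```python
DRIVER_CLASS = "org.pentaho.di.trans.dataservice.jdbc.ThinDriver"


def run_query(sql, url, username, password, driver_jars, connect=None):
    """Run one SQL query against a data service and return all rows.

    driver_jars is a list of paths to the Pentaho Data Service JDBC driver
    files. connect is any callable with the same arguments as
    jaydebeapi.connect; it defaults to that function.
    """
    if connect is None:
        # Assumption: JayDeBeApi is installed (it is not part of Pentaho).
        import jaydebeapi
        connect = jaydebeapi.connect
    connection = connect(DRIVER_CLASS, url, [username, password], driver_jars)
    try:
        cursor = connection.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        # Always release the connection, even if the query fails.
        connection.close()
```

A call might look like run_query("SELECT * FROM my_data_service", "jdbc:pdi://localhost:9080/kettle?webappname=pentaho-di", "user", "password", jars), where my_data_service stands for the name of your data service and jars is the list of driver files you copied. Keep the SQL limitations for queries in mind when you write the statement.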