Data Integration Operations Mart
The PDI Operations Mart is a centralized data mart that stores job or transformation log data for auditing, reporting, and analysis. The PDI Operations Mart enables you to collect and query Data Integration log data and then use the Pentaho Server tools to examine the log data in reports, charts, and dashboards. The data mart is a collection of tables organized as a data warehouse using a star schema. Together, the dimension tables and a fact table represent the logging data. These tables must be created in the PDI Operations Mart database. Pentaho provides SQL scripts to create these tables for the PostgreSQL database. A Data Integration job populates the time and date dimensions.
Getting started
Installation of the Data Integration Operations Mart depends on the following conditions and prerequisites:
Database requirement
- PostgreSQL
- MySQL
- Oracle
- MS SQL Server
If you need to review the Pentaho Server installation method, see Pentaho installation.
Existing 8.3 installation
Installation Steps
To install the Data Integration Operations Mart, you will perform the following steps:
- Step 1: Get the Data Integration Operations Mart files
- Step 2: Run the setup script
- Step 3: Set the global Kettle logging variables
- Step 4: Add logging and Operations Mart connections
- Step 5: Add a JNDI connection for the Pentaho Server
- Step 6: Add the Data Integration Operations Mart ETL solution and sample reports to the Pentaho Repository
- Step 7: Initialize the Data Integration Operations Mart
- Step 8: Verify the Data Integration Operations Mart is working
Step 1: Get the Data Integration Operations Mart files
Procedure
On the Customer Portal home page, sign in using the Pentaho support user name and password provided in your Pentaho Welcome Packet.
Click Downloads, then click Pentaho 8.3 GA Release in the 8.x list.
On the bottom of the Pentaho 8.3 GA Release page, browse the folders in the Box widget to find the files you need, located in the Operations Mart folder:
- pentaho-operations-mart-8.3.0-dist.zip
Unzip the Pentaho Operations Mart file.
Inside are the packaged Operations Mart installations file.Unpack the installation file by running the installer file for your environment.
In the IZPack window, read the license agreement, select I accept the terms of this license agreement, and then click Next.
In the Select the installation path text box, browse to or enter the directory location where you want to unpack the files, then click Next.
If you chose an existing directory, a warning message that the directory already exists appears. Click Yes.
Any existing files in the directory will be retained.When the installation progress is complete, click Quit.
Your directory will contain the setup scripts and files used to create the default content in the following steps.
Step 2: Run the Setup Script
The pentaho-operations-mart-ddl-8.3.x.yy.zip file contains folders for each database type listed with the scripts that are needed.
Database Type | Script Name | Located in the Directory |
PostgreSQL |
| /pentaho-server/data/postgresql |
MySQL |
| /pentaho-server/data/mysql5 |
Oracle |
| /pentaho-server/data/oracle10g |
Microsoft SQL Server |
| /pentaho-server/data/sqlserver |
Step 3: Set the global Kettle logging variables
When you run PDI for the first time, the kettle.properties file is created and stored in the $USER_HOME/.kettle.properties directory.
Procedure
In the PDI client, select
Add or edit the variables and values to reflect the values shown in the following table:
NoteFor Oracle and Microsoft SQL Server, leave Value blank with Variables that contain SCHEMA in the name.Variable Value KETTLE_CHANNEL_LOG_DB live_logging_info KETTLE_CHANNEL_LOG_TABLE channel_logs KETTLE_CHANNEL_LOG_SCHEMA pentaho_dilogs KETTLE_JOBENTRY_LOG_DB live_logging_info KETTLE_JOBENTRY_LOG_TABLE jobentry_logs KETTLE_JOBENTRY_LOG_SCHEMA pentaho_dilogs KETTLE_JOB_LOG_DB live_logging_info KETTLE_JOB_LOG_TABLE job_logs KETTLE_JOB_LOG_SCHEMA pentaho_dilogs KETTLE_METRICS_LOG_DB live_logging_info KETTLE_METRICS_LOG_TABLE metrics_logs KETTLE_METRICS_LOG_SCHEMA pentaho_dilogs KETTLE_STEP_LOG_DB live_logging_info KETTLE_STEP_LOG_TABLE step_logs KETTLE_STEP_LOG_SCHEMA pentaho_dilogs KETTLE_TRANS_LOG_DB live_logging_info KETTLE_TRANS_LOG_TABLE trans_logs KETTLE_TRANS_LOG_SCHEMA pentaho_dilogs KETTLE_TRANS_PERFORMANCE_LOG_DB live_logging_info KETTLE_TRANS_PERFORMANCE_LOG_TABLE transperf_logs KETTLE_TRANS_PERFORMANCE_LOG_SCHEMA pentaho_dilogs
Step 4: Add logging and operations mart connections
Procedure
Navigate to the pentaho/design-tools/data-integration/simple-jndi directory.
Open the jdbc.properties file with a text editor.
Depending on your repository database type, update the values accordingly (URL, users, password) as shown in the samples:
PostgreSQL:
PDI_Operations_Mart/type=javax.sql.DataSource PDI_Operations_Mart/driver=org.postgresql.Driver PDI_Operations_Mart/url=jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_operations_mart PDI_Operations_Mart/user=hibuser PDI_Operations_Mart/password=password live_logging_info/type=javax.sql.DataSource live_logging_info/driver=org.postgresql.Driver live_logging_info/url=jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_dilogs live_logging_info/user=hibuser live_logging_info/password=password
MySQL:
PDI_Operations_Mart/type=javax.sql.DataSource PDI_Operations_Mart/driver=com.mysql.jdbc.Driver PDI_Operations_Mart/url=jdbc:mysql://localhost:3306/pentaho_operations_mart PDI_Operations_Mart/user=hibuser PDI_Operations_Mart/password=password live_logging_info/type=javax.sql.DataSource live_logging_info/driver=com.mysql.jdbc.Driver live_logging_info/url=jdbc:mysql://localhost:3306/pentaho_dilogs live_logging_info/user=hibuser live_logging_info/password=password
Oracle:
PDI_Operations_Mart/type=javax.sql.DataSource PDI_Operations_Mart/driver=oracle.jdbc.OracleDriver PDI_Operations_Mart/url=jdbc:oracle:thin:@localhost:1521/XE PDI_Operations_Mart/user=pentaho_operations_mart PDI_Operations_Mart/password=password live_logging_info/type=javax.sql.DataSource live_logging_info/driver=oracle.jdbc.OracleDriver live_logging_info/url=jdbc:oracle:thin:@localhost:1521/XE live_logging_info/user=pentaho_dilogs live_logging_info/password=password
Microsoft SQL Server:
PDI_Operations_Mart/type=javax.sql.DataSource PDI_Operations_Mart/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver PDI_Operations_Mart/url=jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart PDI_Operations_Mart/user=pentaho_operations_mart PDI_Operations_Mart/password=password live_logging_info/type=javax.sql.DataSource live_logging_info/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver live_logging_info/url=jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_dilogs live_logging_info/user=dilogs_user live_logging_info/password=password
Step 5: Add a JNDI connection for the Pentaho Server
Procedure
Navigate to the pentaho/server/pentaho-server/tomcat/webapps/Pentaho/META-INF/ folder.
Open the context.xml file with a text editor.
Depending on your database type, edit the file to reflect the values in the applicable example:
PostgreSQL:
<Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0" maxWait="10000" username="hibuser" password="password" driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate" validationQuery="select 1"/> <Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0" maxWait="10000" username="hibuser" password="password" driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate" validationQuery="select 1"/> <Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0" maxWait="10000" username="hibuser" password="password" driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_dilogs" validationQuery="select 1"/>
MySQL:
<Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5" maxWait="10000" username="hibuser" password="password" driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_operations_mart" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/> <Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5" maxWait="10000" username="hibuser" password="password" driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_operations_mart" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/> <Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5" maxWait="10000" username="hibuser" password="password" driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_dilogs" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>
Oracle:
<Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/orcl" driverClassName="oracle.jdbc.OracleDriver" password="password" username="pentaho_operations_mart" initialSize="0" maxActive="20" maxIdle="10" maxWait="10000" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" type="javax.sql.DataSource" auth="Container" connectionProperties="oracle.jdbc.J2EE13Compliant=true" name="jdbc/pentaho_operations_mart"/> <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/orcl" driverClassName="oracle.jdbc.OracleDriver" password="password" username="pentaho_operations_mart" initialSize="0" maxActive="20" maxIdle="10" maxWait="10000" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" type="javax.sql.DataSource" auth="Container" connectionProperties="oracle.jdbc.J2EE13Compliant=true" name="jdbc/PDI_Operations_Mart"/> <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" driverClassName="oracle.jdbc.OracleDriver" password="password" username="pentaho_dilogs" maxWaitMillis="10000" maxIdle="5" maxTotal="20" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" auth="Container" name="jdbc/live_logging_info"/>
Microsoft SQL Server:
<Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5" maxWaitMillis="10000" username="pentaho_operations_mart" password="password" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart" validationQuery="select 1"/> <Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5" maxWaitMillis="10000" username="pentaho_operations_mart" password="password" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart" validationQuery="select 1"/> <Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5" maxWaitMillis="10000" username="dilogs_user" password="password" jdbcInterceptors="ConnectionState" defaultAutoCommit="true" driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_dilogs" validationQuery="select 1"/>
Step 6: Add the Data Integration Operations Mart ETL solution and sample reports to the Pentaho Repository
Stop the Pentaho Server.
Depending on your repository database type, copy the following ETL solution and sample reports (downloaded in Step 1: Get the Data Integration Operations Mart files) to $PENTAHO_HOME/pentaho-server/pentaho-solution/default-content.
NoteThe file pentaho-operations-mart-etl-8.3.0-dist.zip may already be in this directory. If you are using a repository database type other than PostgreSQL, remove it.- PostgreSQL: pentaho-operations-mart-etl-8.3.0-dist.zip
- MySQL: pentaho-operations-mart-etl-mysql5-8.3.0-dist.zip
- Oracle: pentaho-operations-mart-etl-oracle10g-8.3.0-dist.zip
- Microsoft SQL Server: pentaho-operations-mart-etl-mssql-8.3.0-dist.zip
Place these two files in the directory as well:
- DI Operations Mart sample reports: pentaho-operations-mart-operations-di-8.3.0-dist.zip
- BA Operations Mart sample reports: pentaho-operations-mart-operations-bi-8.3.0-dist.zip
Start the Pentaho Server.
Step 7: Initialize the Data Integration Operations Mart
Launch the PDI client (Spoon).
Connect to the Pentaho Repository via the Pentaho Server.
At the main menu, select
.Select
job file and run it.At the main menu, select
.Select
job file and run it.
Step 8: Verify the Data Integration Operations Mart is working
From the Pentaho User Console, select
and open it.You should see the Jobs and Transformations that were run in Step 7.
Give users access to the PDI Operations Mart
To give users access to view the PDI Operations Mart, assign the Pentaho Operations role to those users as follows:
Procedure
From within the Pentaho User Console, select the Administration tab.
From the left panel, select
.Select the Roles tab.
Add the new role called Pentaho Operations by following the instructions in Adding Roles.
Assign the appropriate users to the new role, as described in Adding Users to Roles.
Advise these users to log in to the Pentaho User Console, create a Pentaho Analyzer or Pentaho Interactive Reports, and ensure that they can view the Pentaho Operations Mart in the Select a Data Source dialog box.
Charts, reports, and dashboards using PDI Operations Mart data
To help understand the contents of the log, see Data Integration Operations Mart Reference.
Clean up operations mart tables
Perform the following steps to clean up the PDI Operations Mart:
Procedure
Using the PDI client (Spoon), open either Clean_up_PDI_Operations_Mart.kjb for jobs or the Clean_up_PDI_Operations_Mart_fact_table.ktr for transformations.
Set the following parameters:
max.age.days (required)
the maximum age in days of the data.
schema.prefix (optional)
for PostgreSQL databases, enter the schema name followed by a period (.), this will be applied to the SQL statements. For other databases, leave the value blank
Run the job or transformation.
This will delete Job and transformation data older than the maximum age from the data mart.
Next steps