Pentaho Documentation

Data Integration Operations Mart

The Data Integration Operations Mart is a centralized data mart that stores job or transformation log data for auditing, reporting, and analysis. You can use the Data Integration Operations Mart to collect and query Data Integration log data. Then use the Pentaho Server tools to examine the log data in reports, charts, and dashboards.

The data mart is a collection of tables organized as a data warehouse using a star schema. Together, the dimension tables and a fact table represent the logging data. These tables must be created in the Data Integration Operations Mart database. Pentaho provides SQL scripts to create these tables for the PostgreSQL database. A Data Integration job populates the time and date dimensions.

Note: For optimal performance, be sure to clean the operations mart periodically.

Instructions for installing the DI Ops Mart depend on the method you used to install Pentaho:

Installing DI Ops Mart for a Pentaho Archive Installation.

Installing DI Ops Mart for a Pentaho Manual Installation.

Charts, reports, and dashboards using the DI Operations Mart data

Once you have created and populated your Data Integration Operations Mart with log data, you can use the features of the Pentaho User Console to examine this data and create reports, charts, and dashboards. We provide many pre-built reports, charts, and dashboards that you can modify.

To help you understand the contents of the log data, see Data Integration Operations Mart Reference.

Clean up DI Operations Mart tables

Cleaning the DI Operations Mart consists of running either a job or a transformation that deletes data older than a specified maximum age. Both the job and the transformation for cleaning up the DI Operations Mart can be found in the etl folder.

Perform the following steps to clean up the DI Operations Mart:

Procedure

  1. Using the PDI client (Spoon), open either the job Clean_up_PDI_Operations_Mart.kjb or the transformation Clean_up_PDI_Operations_Mart_fact_table.ktr.

  2. Set the following parameters:

    • max.age.days (required)

      The maximum age, in days, of the data to keep. Data older than this age is deleted.

    • schema.prefix (optional)

      For PostgreSQL databases, enter the schema name followed by a period (.). This prefix is applied to the SQL statements. For other databases, leave the value blank.

  3. Run the job or transformation.

    Running the job or transformation deletes the data that is older than the maximum age from the data mart.
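
The procedure above uses the PDI client, but the same clean-up job can in principle be started from a script. The following Python code is a minimal sketch, not an official Pentaho utility. It assumes that the Kitchen command-line tool is available and that it accepts the -file= and -param:name=value options. The kitchen.sh path, the location of the etl folder, and the schema name my_schema are placeholders for illustration only. The helper only builds the argument list; it does not run anything.

```python
import shlex


def build_cleanup_command(kitchen, job_file, max_age_days, schema_prefix=""):
    """Build a Kitchen argument list for the DI Operations Mart clean-up job.

    kitchen       -- path to the Kitchen launcher (assumed, e.g. ./kitchen.sh)
    job_file      -- path to Clean_up_PDI_Operations_Mart.kjb
    max_age_days  -- value for the required max.age.days parameter
    schema_prefix -- value for the optional schema.prefix parameter;
                     leave empty for databases that do not need it
    """
    if max_age_days < 0:
        raise ValueError("max_age_days must not be negative")

    # The documentation asks for the schema name followed by a period.
    if schema_prefix and not schema_prefix.endswith("."):
        schema_prefix += "."

    command = [
        kitchen,
        f"-file={job_file}",
        f"-param:max.age.days={max_age_days}",
    ]
    if schema_prefix:
        command.append(f"-param:schema.prefix={schema_prefix}")
    return command


if __name__ == "__main__":
    # Placeholder paths and schema name; adjust to your installation.
    cmd = build_cleanup_command(
        "./kitchen.sh",
        "etl/Clean_up_PDI_Operations_Mart.kjb",
        max_age_days=90,
        schema_prefix="my_schema",
    )
    # Print the command as a shell-quoted string for review.
    print(shlex.join(cmd))
```

To execute the command, the list could be passed to subprocess.run(cmd, check=True). Review the printed command first, because the job deletes data from the data mart.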

Next steps

To schedule regular clean-up of the DI Operations Mart, see Schedule perspective in the PDI client.