Explains how to maintain and update the DI Operations Mart.
Override Kettle Logging Variable Settings
If you want to override the Kettle logging variable settings for where logging data is stored in a specific transformation or job, complete these steps. Even if you override those settings, logging information will be used to create Operations Mart reports.
- From within Spoon, open a job or transformation.
- Select the View tab, then right-click the job or transformation and select Transformation Settings or Job Settings.
- Select the Logging tab and choose the appropriate selection from the left pane.
- For jobs, select Job.
- For transformations, select Transformation.
- In the Log Connection field, enter or select the appropriate database connection. If you are using the default settings, choose live_logging_info. Otherwise, enter or choose the database connection that reflects your environment's configuration.
- In the Log table name field
- For jobs, enter LOG_JOB.
- For transformations, enter LOG_TRANS.
- In order to collect row input or output information for jobs or transformations, for instance for throughput calculations, specify an Input and output step for each transformation that collects external input or output data.
- For LINES_INPUT, specify the step collecting external input data.
- For LINES_OUTPUT, specify the step collecting output data.
- Ensure all entries under Fields To Log are selected. If the LOG_JOB or LOG_TRANS table has not been created in the database, click the SQL button and then click the Execute button in the subsequent dialog box.
- Click OK.
- In the Monitoring tab in the Transformation Properties or Job Properties window, check the box labeled Enable step performance monitoring?
- Click OK to exit the dialog box, then save the job or transformation.
The DI Operations Mart is configured to collect ETL logging data.
Update Operations Mart
You can monitor the latest performance of your ETL operations by updating the logging data within the DI Operations Mart. As a prerequisite, the Operations Mart must have previously been created and configured with the logging data you want to collect.
Your data logs need to be updated if you modified these types of data.
- Logging table
- Database connection
- Transformation step
- Job entry
You must update and then populate the executor and log dimensions table if you want to log the most current data.
If you modified the logging table, database connection, operations mart transformation steps or job entries, you need to manually update the DI operations mart executor, log dimension tables, and then refresh the data.
- From within Spoon, select Tools > Repository > Explore.
- Select pdi_operations_mart.
- Choose the appropriate job or transformation from the table.
|If you want to||Choose|
|Update the executor and log dimension tables.||Update Executor and Log Table Dimensions.ktr|
|Populate the Pentaho Operations Mart with the logging information without updating executor and log dimension tables.||Update_Logging_Datamart.kjb|
|Update the executor and log dimension tables with the latest logging data. Then, update the Pentaho Operations Mart with that new data.||Update_Dimensions_then_Logging_Datamart.kjb|
- Schedule the Update_Logging_Datamart.kjb job to run periodically. For more information on how to schedule the job, see Scheduling Transformations and Jobs From Spoon.
The job or transformation runs. The Operations Mart updates and/or populates with the latest logging data.
Clean Up Operations Mart Tables
Cleaning up the PDI Operation Mart consists of running a job or transformation that deletes data older than the specified maximum age. The transformation and job for cleaning up the PDI Operations Mart can be found in the "etl" folder.
- In Spoon, open Clean_up_PDI_Operations_Mart.kjb, then set these parameters.
- max.age (required)—the maximum age in days of the data. Job and transformation data older than the maximum age will be deleted from the datamart.
- schema.prefix (optional)—for PostgreSQL databases, enter the schema name followed by a period (.), this will be applied to the SQL statements. For other databases, leave the value blank.
- Data that was not within the specified date range is now deleted.