Skip to main content
Pentaho Documentation

Reference Tables for DI Operations Mart

Overview

Quisque mattis rutrum ligula et pellentesque. Pellentesque massa enim, pulvinar quis quam quis, aliquet dapibus augue. Aliquam feugiat suscipit turpis, nec ultrices dolor adipiscing ut.

DI Operations Mart consists of several tables, which are referenced in this section.

Logging Tables Status for the Data Integration Operations Mart

Transformation Log Tables

The transformation tables have a status column, these are descriptions of the values that can be found in that column.

Status Display Description
start Indicates the transformation was started and remains in this status until the transformation ends when no logging interval is set.
end Transformation ended successfully.
stop Indicates the user stopped the transformation.
error Indicates an error occurred when attempting to run the transformation.
running A transformation is only in this status directly after starting and does not appear without a logging interval.
paused Indicates the transformation was paused by the user and does not appear without a logging interval.

Jobs Log Tables

The job log tables have a status column, these are descriptions of the values that can be found in that column.

Status Display Description
start Indicates the job was started and keeps in this status until the job ends, and when no logging interval is set.
end Job ended successfully.
stop Indicates the user stopped the job.
error Indicates an error occurred when attempting to run the job.
running A job is only in this status directly after starting and does not appear without a logging interval.
paused Indicates the job was paused by the user, and does not appear without a logging interval.

Logging Dimensions and Metrics for the Data Integration Operation Mart

These tables are references that identify the various dimensions and metrics that can be used to create new ETL log charts and reports.

Fact Table

(fact_execution)

Field Name Description
execution_date_tk A technical key (TK) linking the fact to the date when the transformation/job was executed.
execution_time_tk A technical key (TK) linking the fact to the time-of-day when the transformation/job was executed.
batch_tk A technical key (TK) linking the fact to batch information for the transformation/job.
execution_tk A technical key (TK) linking the fact to execution information about the transformation/job.
executor_tk A technical key (TK) linking the fact to information about the executor (transformation or job).
parent_executor_tk A technical key (TK) linking the fact to information about the parent transformation/job).
root_executor_tk A technical key (TK) linking the fact to information about the root transformation/job.
execution_timestamp The date and time when the transformation/job was executed.
duration The length of time (in seconds) between when the transformation was logged (LOGDATE) and the maximum dependency date (DEPDATE)
rows_input The number of lines read from disk or the network by the specified step. Can be input from files, databases, etc.
rows_output The number of rows output during the execution of the transformation/job.
rows_read The number of rows read in from the input stream of the the specified step.
rows_written The number of rows written during the execution of the transformation/job.
rows_rejected The number of rows rejected during the execution of the transformation/job.
errors The number of errors that occurred during the execution of the transformation/job.
failed Indicates if the job or transformation has failed. 0 means the job or transformation completed successfully. 1 indicates the job or transformation failed.

Batch Dimension

(dim_batch)

Field Name Description
batch_tk A technical key (TK) for linking facts to batch information.
batch_id The ID number for the batch.
logchannel_id A string representing the identifier for the logging channel used by the batch.
parent_logchannel_id A string representing the identifier for the parent logging channel used by the batch.

Date Dimension

(dim_date)

Field Name Description
date_tk A technical key (TK) for linking facts to date information.
date_field A Date object representing a particular day (year, month, day).
ymd A string representing the date value in year-month-day format.
ym A string representing the date value in year-month format.
year An integer representing the year value.
quarter An integer representing the number of the quarter (1-4) to which this date belongs.
quarter_code A 2-character string representing the quarter (Q1-Q4) to which this date belongs.
month An integer representing the number of the month (1-12) to which this date belongs.
month_desc A string representing the month (“January”..”December”) to which this date belongs.
month_code A string representing the shortened month code (“JAN”..”DEC”) to which this date belongs.
day An integer representing the day of the month (1-31) to which this date belongs.
day_of_year An integer representing the day of the year (1-366) to which this date belongs.
day_of_week An integer representing the day of the week (1-7) to which this date belongs.
day_of_week_desc A string representing the day of the week (“Sunday”..”Saturday”) to which this date belongs.
day_of_week_code A string representing the shortened day-of-week code (“SUN”..”SAT”) to which this date belongs.
week An integer representing the week of the year (1-53) to which this date belongs.

Execution Dimension

(dim_execution)

Field Name Description
execution_tk A technical key (TK) for linking facts to execution information.
execution_id A unique string identifier for the execution.
server_name The name of the server associated with the execution.
server_host The name of the server associated with the execution.
executing_user The name of the user who initiated the execution.
execution_status The status of the execution (start, stop, end, error).
client The name of the client that triggered the execution.

Executor Dimension This table contains information about an executor that is a job or transformation (dim_executor).

Field Name Description
executor_tk A technical key (TK) for linking facts to executor information
version An integer corresponding to the version of the executor
date_from A date representing the minimum date for which the executor is valid
date_to A date representing the maximum date for which the executor is valid
executor_id A string identifier for the executor
executor_source The source location (either file- or repository-relative) for the executor
* executor_environment File server, repository name, related to the executor_source. *Reserved for future use.
executor_type The executor type (“job” or “transformation”)
executor_name The name of the executor (transformation name, e.g.)
executor_desc A string description of the executor (job description, e.g.)
executor_revision A string representing the revision of the executor (“1.3”, e.g.)
executor_version_label A string representing a description of the revision (i.e. change comments)
exec_enabled_table_logging Whether table logging is enabled for this executor. Values are “Y” if enabled, “N” otherwise.
exec_enabled_detailed_logging Whether detailed (step or job entry) logging is enabled for this executor. Values are “Y” if enabled, “N” otherwise.
exec_enabled_perf_logging Whether performance logging is enabled for this executor. Values are “Y” if enabled, “N” otherwise.
exec_enabled_history_logging Whether historical logging is enabled for this executor. Values are “Y” if enabled, “N” otherwise.
last_updated_date The date the executor was last updated
last_updated_user The name of the user who last updated the executor

Log Table Dimension

This is a “junk dimension” containing log table information (dim_log_table).

Field Name Description
log_table_tk A technical key (TK) for linking.
object_type The type of PDI object being logged (“job”, “transformation”, “step”, e.g.)
table_connection_name The name of the database connection corresponding to the location of the transformation/job logging table
table_name The name of the table containing the transformation/job logging information
schema_name The name of the database schema corresponding to the location of the transformation/job logging table
step_entry_table_conn_name The name of the database connection corresponding to the location of the step/entry logging table
step_entry_table_name The name of the table containing the step/entry logging information
step_entry_schema_name The name of the database schema corresponding to the location of the step/entry logging table
perf_table_conn_name The name of the database connection corresponding to the location of the performance logging table
perf_table_name The name of the table containing the performance logging information
perf_schema_name The name of the database schema corresponding to the location of the performance logging table

Time-Of-Day-Dimension

This dimension contains entries for every second of a day from midnight to midnight (dim_time).

Field Name Description
time_tk A technical key (TK) for linking facts to time-of-day information
hms A string representing the time of day as hours-minutes-seconds ("00:01:35", e.g.)
hm A string representing the time of day as hours-minutes ("23:59", e.g.)
ampm A string representing whether the time-of-day is AM or PM. Values are “am” or “pm”.
hour The integer number corresponding to the hour of the day (0-23)
hour12 The integer number corresponding to the hour of the day with respect to AM/PM (0-11)
minute The integer number corresponding to the minute of the hour (0-59)
second The integer number corresponding to the second of the minute (0-59)

Step Fact Table

This fact table contains facts about individual step executions (fact_step_execution).

Field Name Description
execution_date_tk A technical key (TK) linking the fact to the date when the step was executed.
execution_time_tk A technical key (TK) linking the fact to the time-of-day when the step was executed.
batch_tk A technical key (TK) linking the fact to batch information for the step.
executor_tk A technical key (TK) linking the fact to information about the executor (transformation).
parent_executor_tk A technical key (TK) linking the fact to information about the parent transformation.
root_executor_tk A technical key (TK) linking the fact to information about the root transformation/job.
execution_timestamp The date and time when the step was executed.
step_tk A technical key (TK) linking the fact to information about the step.
step_copy The step copy number. This is zero if there is only one copy of the step, or (0 to N-1) if N copies of the step are executed.
rows_input The number of lines read from disk or the network by the step. Can be input from files, databases, etc.
rows_output The number of lines written to disk or the network by the step. Can be output to files, databases, etc.
rows_read The number of rows read in from the input stream of the step.
rows_written The number of rows written to the output stream of the step.
rows_rejected The number of rows rejected during the execution of the step.
errors The number of errors that occurred during the execution of the step.

Step Dimension

This dimension contains information about individual steps and job entries (dim_step) .

Field Name Description
step_tk A technical key (TK) for linking facts to step/entry information
step_id The string name of the step/entry
* original_step_name The name of the step/entry template used to create this step/entry (“Table Input”, e.g.). *Reserved for future use.

Job Entry Fact Table

This fact table contains facts about individual job entry executions (fact_jobentry_execution).

Field Name Description
execution_date_tk A technical key (TK) linking the fact to the date when the job entry was executed.
execution_time_tk A technical key (TK) linking the fact to the time-of-day when the job entry was executed.
batch_tk A technical key (TK) linking the fact to batch information for the job entry.
executor_tk A technical key (TK) linking the fact to information about the executor (transformation or job).
parent_executor_tk A technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tk A technical key (TK) linking the fact to information about the root transformation/job.
step_tk A technical key (TK) linking the fact to information about the job entry.
execution_timestamp The date and time when the job entry was executed.
rows_input The number of lines read from disk or the network by the job entry. Can be input from files, databases, etc.
rows_output The number of lines written to disk or the network by the job entry. Can be output to files, databases, etc.
rows_read The number of rows read in from the input stream of the job entry.
rows_written The number of rows written to the output stream of the job entry.
rows_rejected The number of rows rejected during the execution of the job entry.
errors The number of errors that occurred during the execution of the job entry.
result Whether the job entry finished successfully or not. Values are “Y” (successful) or “N” (otherwise).
nr_result_rows The number of result rows after execution.
nr_result_files The number of result files after execution.

Execution Performance Fact Table

This fact table contains facts about the performance of steps in transformation executions (fact_perf_execution).

Field Name Description
execution_date_tk A technical key (TK) linking the fact to the date when the transformation was executed.
execution_time_tk A technical key (TK) linking the fact to the time-of-day when the transformation was executed.
batch_tk A technical key (TK) linking the fact to batch information for the transformation.
executor_tk A technical key (TK) linking the fact to information about the executor (transformation).
parent_executor_tk A technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tk A technical key (TK) linking the fact to information about the root transformation/job.
step_tk A technical key (TK) linking the fact to information about the transformation/job.
seq_nr The sequence number. This is an identifier differentiating performance snapshots for a single execution.
step_copy The step copy number. This is zero if there is only one copy of the step, or (0 to N-1) if N copies of the step are executed.
execution_timestamp The date and time when the transformation was executed.
rows_input The number of rows read from input (file, database, network, ...) during the interval
rows_output The number of rows written to output (file, database, network, ...) during the interval
rows_read The number of rows read from previous steps during the interval.
rows_written The number of rows written to following steps during the interval.
rows_rejected The number of rows rejected by the steps error handling during the interval.
errors The number of errors that occurred during the execution of the transformation/job.
input_buffer_rows The size of the step’s input buffer in rows at the time of the snapshot.
output_buffer_rows The size of the output buffer in rows at the time of the snapshot.