Pentaho Documentation

Maintain logging

Pentaho enables you to maintain your system logs by rotating log files and by monitoring the execution status of transformations and jobs.

Log rotation

This procedure assumes that you do not have or do not want to use an operating system-level log rotation service. If you are using such a service on your Pentaho Server, connect to the Pentaho Server and use that instead of implementing this solution.

The Pentaho Server uses the Apache log4j Java logging framework to store server feedback. With the default settings in the log4j.xml configuration file, logging may be too verbose and the log files may grow too large for some production environments. Follow these instructions to modify the settings so that Pentaho Server log files are rotated and compressed:

Procedure

  1. Stop all relevant servers.

  2. Download a ZIP archive of the Apache Extras Companion for log4j package from the Apache Logging Services website.

  3. Unpack the apache-log4j-extras.jar file from the ZIP archive, and copy it into: server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/lib/

  4. Edit the log4j.xml settings file for the Pentaho Server. This XML file is located in: server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/classes/

  5. Remove all PENTAHOCONSOLE appenders from the configuration.

  6. Modify the PENTAHOFILE appenders to match the log rotation conditions that you prefer.

    You may need to consult the log4j documentation to learn more about configuration options. Many Pentaho customers find the following examples useful:

    Daily (date-based) log rotation with compression:
    <appender name="PENTAHOFILE" class="org.apache.log4j.rolling.RollingFileAppender">
        <!-- The active file to log to; this example is for Pentaho Server.-->
        <param name="File" value="../logs/pentaho.log" />
        <param name="Append" value="false" />
        <rollingPolicy class="org.apache.log4j.rolling.TimeBasedRollingPolicy">
            <!-- See javadoc for TimeBasedRollingPolicy -->
            <param name="FileNamePattern" value="../logs/pentaho.%d.log.gz" />
        </rollingPolicy>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
        </layout>
    </appender>
    Size-based log rotation with compression:
    <appender name="PENTAHOFILE" class="org.apache.log4j.rolling.RollingFileAppender">
        <!-- The active file to log to; this example is for Pentaho Server.-->
        <param name="File" value="../logs/pentaho.log" />
        <param name="Append" value="false" />
        <rollingPolicy class="org.apache.log4j.rolling.FixedWindowRollingPolicy">
            <param name="FileNamePattern" value="../logs/pentaho.%i.log.gz" />
            <param name="maxIndex" value="10" />
            <param name="minIndex" value="1" />
        </rollingPolicy>
        <triggeringPolicy class="org.apache.log4j.rolling.SizeBasedTriggeringPolicy">
            <!-- size in bytes -->
            <param name="MaxFileSize" value="10000000" />
        </triggeringPolicy>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d %-5p [%c] %m%n" />
        </layout>
    </appender>
  7. Save and close the file, then start all affected servers to test the configuration.

Results

You have an independent log rotation system in place for all modified servers.
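
As an illustration of step 5, removing the PENTAHOCONSOLE appender usually also means removing any reference to it from the root logger, because log4j typically logs a configuration error when an appender-ref names an appender that is no longer defined. The following is a minimal sketch, assuming your log4j.xml contains a root element that references both appenders. The ERROR level shown is an assumption, so keep whatever level your file already uses:

```xml
<!-- Sketch only: root logger after the PENTAHOCONSOLE reference is removed. -->
<root>
    <!-- Assumed level; keep the value already present in your log4j.xml. -->
    <priority value="ERROR" />
    <!-- Only the rotating file appender remains. -->
    <appender-ref ref="PENTAHOFILE" />
</root>
```

If other logger or category elements in your file also reference PENTAHOCONSOLE, those references need the same treatment.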

Execution status

You can view remotely executed and scheduled job and transformation details, including the date and time that they were run, and their status and results, through the PDI Status page. To view it, navigate to the /pentaho/kettle/status page on your Pentaho Server (change the host name and port to match your configuration):

http://localhost:8080/pentaho/kettle/status
You must be logged in to view this page; if you are not, you are redirected to the login page.

You can get to a similar page in the PDI client by using the Monitor function of a slave server. This page clears when the server is restarted, or at the interval specified by the object_timeout_minutes setting.

On Carte

Any action performed through the Carte server embedded in the Pentaho Server is controlled through the /pentaho/server/pentaho-server/pentaho-solutions/system/kettle/slave-server-config.xml file. Notice the Configuration details table at the bottom of the PDI Status page. It shows the three configurable settings for scheduled and remote execution logging.
Note: To make modifications to slave-server-config.xml, you must stop the Pentaho Server.
The three configurable options in the slave-server-config.xml file are explained in the following table:
Property | Values | Description
max_log_lines | Any value of 0 (zero) or greater. 0 indicates that there is no limit. | Truncates the execution log when it goes beyond this many lines.
max_log_timeout_minutes | Any value of 0 (zero) or greater. 0 indicates that there is no timeout. | Removes lines from each log entry if it is older than this many minutes.
object_timeout_minutes | Any value of 0 (zero) or greater. 0 indicates that there is no timeout. | Removes entries from the list if they are older than this many minutes.
The following code block is an example of the slave-server-config.xml file:
<slave_config>
  <max_log_lines>0</max_log_lines>
  <max_log_timeout_minutes>0</max_log_timeout_minutes>
  <object_timeout_minutes>0</object_timeout_minutes>
</slave_config>
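
For comparison, the following sketch sets a limit for each option instead of leaving every setting unlimited. The values are illustrative assumptions, not recommendations from this article; choose limits that suit your own retention and auditing needs:

```xml
<slave_config>
  <!-- Illustrative value: truncate each execution log beyond 10,000 lines. -->
  <max_log_lines>10000</max_log_lines>
  <!-- Illustrative value: remove log lines older than 24 hours (1,440 minutes). -->
  <max_log_timeout_minutes>1440</max_log_timeout_minutes>
  <!-- Illustrative value: remove entries from the status list once they are older than 24 hours. -->
  <object_timeout_minutes>1440</object_timeout_minutes>
</slave_config>
```

With a non-zero object_timeout_minutes, entries disappear from the PDI Status page after the interval elapses, so pick a value long enough to review the results you care about.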

Best practices for logging

Kettle logging is flexible: you can determine log locations, granularity, and what information is captured. Here are some best practices for setting up logging in your environment. For more information on logging, see Logging and Monitoring Operations.
  • Store logs in a centralized database. By default, log files are stored locally. The PDI client, Carte, and Pentaho Server logs are stored separately. To make log information easier to find, place logs in a central database. Centralized logging also makes PDI’s performance monitoring easier to use.
  • Obtain full insert access for tables. Logging can fail if you do not have the appropriate access rights to the log tables. To learn more about table access, consult the documentation for your database.
  • Install JDBC drivers locally and on each server. This helps you avoid Driver not found errors. For a list of supported drivers, see JDBC drivers reference.
  • Use implied schemas when possible. Implied schemas result in fewer places to troubleshoot should logging fail. Of course, you can still specify a schema if needed.
  • Make templates for transformation and job files. Include logging configurations in the template so that they can be reused with ease.
  • Use Kettle global logging variables when possible. To avoid the work of adding logging variables to each transformation or job, consider using global logging variables instead. You can override logging variables by adding information to individual transformations or jobs as needed.
If you choose to use the kettle.properties file, observe the following best practices.
  • Back up your kettle.properties files. If you are making many changes to the kettle.properties file, consider backing up the file first. This makes it easier to restore the file should issues occur.
  • Maintain a master copy of the kettle.properties file. It is usually easiest to use the kettle.properties file for the PDI client as the master, then to overwrite the Carte and Pentaho Server copies when changes are made. Make sure that values that might differ between environments, such as directory paths, are preserved in each copy.
  • Test thoroughly. Test your settings by saving your kettle.properties file locally, then restarting the PDI client. Make sure the kettle.properties file loads properly. Execute a transformation or job that uses the settings. Try executing the transformation locally and remotely on the Pentaho Server to ensure that log variables reference the same places.