Pentaho Documentation

Result Set Caching

When rendered, a parameterized report must account for every dataset required for every parameter. Every time a parameter field changes, every dataset is recalculated, which can negatively impact performance.

You can avoid unnecessary dataset recalculations by caching parameter datasets. This is accomplished through the Ehcache framework built into the BA Server. You can configure cache settings for published reports by editing the ehcache.xml file in the /WEB-INF/classes/ directory inside pentaho.war.

Not every result set is cached. Results that contain complex objects (CLOB and BLOB data types) are skipped, as are results from a scripting dataset, a Java method call, a table data source, an external data source (computed in an action sequence), or a CDA data source. In each of these cases, either caching would cost more than recalculating, or the parameters involved do not provide enough hints.

The relevant element in ehcache.xml is:

<!--
    Defines a cache used by the reporting engine to hold small datasets.
    This cache can be configured to have a separate instance for each
    logged-in user via the global report configuration. This per-user
    cache is required if role or other security and filter information
    is used in ways invisible to the reporting engine.
  -->
  <cache name="report-dataset-cache"
        maxElementsInMemory="50"
        eternal="false"
        overflowToDisk="false"
        timeToIdleSeconds="300"
        timeToLiveSeconds="600"
        diskPersistent="false"
        diskExpiryThreadIntervalSeconds="120"
    />
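
The two expiry attributes in this element work together: in Ehcache, timeToIdleSeconds limits how long an entry may go without being read, timeToLiveSeconds limits how long it may exist at all, and the entry is dropped as soon as either limit is exceeded. The Python sketch below is only a toy model of that rule, not the reporting engine's code. The class and function names are hypothetical, the exact boundary behavior is simplified, and keying the cache on the query plus its parameter values is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedDataset:
    """One cached parameter dataset plus the timestamps expiry depends on."""
    rows: list
    created_at: float      # seconds: when the entry was stored
    last_access_at: float  # seconds: when the entry was last read


class DatasetCache:
    """Toy model of a cache with time-to-idle and time-to-live expiry."""

    def __init__(self, time_to_idle=300, time_to_live=600):
        # Defaults mirror timeToIdleSeconds and timeToLiveSeconds above.
        self.time_to_idle = time_to_idle
        self.time_to_live = time_to_live
        self._entries = {}

    def _expired(self, entry, now):
        # Assumption: an entry expires once either limit is reached.
        too_old = now - entry.created_at >= self.time_to_live
        too_idle = now - entry.last_access_at >= self.time_to_idle
        return too_old or too_idle

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        if self._expired(entry, now):
            del self._entries[key]
            return None
        entry.last_access_at = now  # a read resets the idle clock only
        return entry.rows

    def put(self, key, rows, now):
        self._entries[key] = CachedDataset(rows, now, now)


def load_dataset(cache, query, params, now, run_query):
    """Return cached rows when still valid, otherwise rerun the query."""
    # Hypothetical cache key: the query text plus the sorted parameter values.
    key = (query, tuple(sorted(params.items())))
    rows = cache.get(key, now)
    if rows is None:
        rows = run_query(query, params)
        cache.put(key, rows, now)
    return rows
```

With the default limits, a dataset loaded at 0 seconds and read again at 200 and 450 seconds is served from the cache both times. A read at 620 seconds reruns the query because the entry has outlived its 600-second time to live, and a further read at 930 seconds reruns it again because the refreshed entry has sat idle for more than 300 seconds.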

The trade-off is that if cached entries live too long, updates to the data source may not appear in the report output, because the report is still served from stale cached data. Tune timeToIdleSeconds and timeToLiveSeconds to balance performance against accuracy for your needs.
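
Before tuning, it can help to check which values are currently configured. The following Python sketch reads the cache element from the text of an ehcache.xml file and reports the longest time a cached dataset can survive. The function names are hypothetical, and the sample document is a trimmed stand-in for a real ehcache.xml, which contains other elements as well. The sketch relies on two Ehcache rules: eternal="true" disables both timeouts, and a timeToLiveSeconds of 0 means no time-to-live limit.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for WEB-INF/classes/ehcache.xml (assumption: a real file
# holds more elements; only the cache discussed here is reproduced).
SAMPLE_EHCACHE_XML = """
<ehcache>
  <cache name="report-dataset-cache"
         maxElementsInMemory="50"
         eternal="false"
         overflowToDisk="false"
         timeToIdleSeconds="300"
         timeToLiveSeconds="600"
         diskPersistent="false"
         diskExpiryThreadIntervalSeconds="120"
  />
</ehcache>
"""


def read_cache_settings(xml_text, cache_name="report-dataset-cache"):
    """Return the attributes of the named <cache> element as a dict."""
    root = ET.fromstring(xml_text.strip())
    for cache in root.iter("cache"):
        if cache.get("name") == cache_name:
            return dict(cache.attrib)
    raise KeyError(f"no <cache> element named {cache_name!r}")


def max_staleness_seconds(settings):
    """Longest time an entry can be served, or None if there is no limit."""
    if settings.get("eternal", "false") == "true":
        return None  # eternal entries ignore both timeouts
    ttl = int(settings.get("timeToLiveSeconds", "0"))
    return ttl if ttl > 0 else None  # 0 means no time-to-live limit


if __name__ == "__main__":
    settings = read_cache_settings(SAMPLE_EHCACHE_XML)
    print("time to idle:", settings["timeToIdleSeconds"], "seconds")
    print("max staleness:", max_staleness_seconds(settings), "seconds")
```

With the settings shown in this article, a report can show data that is up to 600 seconds old. Lowering timeToLiveSeconds narrows that window at the cost of more frequent recalculation.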