Pentaho Documentation

Paginated Exports

A pageable report generates a stream of pages. Each page has the same height, even if it is not completely filled with content. When a page is full, the laid-out page is passed to the output target, which renders it in either a Graphics2D or a streaming output (PDF, plain text, HTML, and so on) context.

Page break methods

When the content contains a manual page break, the page is considered full. If the page break is a before-print break, the break is converted to an after-break, the internal report states are rolled back, and report processing restarts to regenerate the layout with the new constraints. A similar rollback happens if the current band does not fit on the page. Because a before-print break triggers this rollback, you would generally prefer break-after over break-before.

So for large reports, you might consider removing manual page breaks and limiting the width of bands.
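As a rough illustration of why manual breaks and overflowing bands are costly, the following toy model counts how often a layout pass would have to roll back. It is a simplified sketch, not the reporting engine's algorithm: the class `RollbackToyModel`, the method `paginate`, and the rule that every before-print break or overflowing band costs exactly one rollback are assumptions made for this example.

```java
// Toy model only: all names here are hypothetical and not part of the reporting engine.
class RollbackToyModel {

    /**
     * Places bands on pages of a fixed height.
     * Returns a two-element array: {pages used, rollbacks counted}.
     * Assumption: a before-print break or a band that does not fit on the
     * current page each cost one rollback, as the text above describes.
     */
    static int[] paginate(int[] bandHeights, boolean[] breakBefore, int pageHeight) {
        int pages = 1;
        int used = 0;       // height already consumed on the current page
        int rollbacks = 0;
        for (int i = 0; i < bandHeights.length; i++) {
            // A break on an empty page is ignored in this simplified model.
            boolean manualBreak = breakBefore[i] && used > 0;
            boolean overflow = used > 0 && used + bandHeights[i] > pageHeight;
            if (manualBreak || overflow) {
                rollbacks++;
                pages++;
                used = 0;
            }
            used += bandHeights[i];
        }
        return new int[] { pages, rollbacks };
    }

    public static void main(String[] args) {
        int[] heights = { 200, 200, 200, 200 };
        int[] plain = paginate(heights, new boolean[] { false, false, false, false }, 800);
        int[] broken = paginate(heights, new boolean[] { false, true, false, true }, 800);
        System.out.println("no manual breaks: pages=" + plain[0] + " rollbacks=" + plain[1]);
        System.out.println("two manual breaks: pages=" + broken[0] + " rollbacks=" + broken[1]);
    }
}
```

With four 200-unit bands on an 800-unit page, the sketch needs no rollback without manual breaks and two rollbacks once two before-print breaks are added.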

Page states

When processing a pageable report, the reporting engine assumes that the report will be run in interactive mode, which allows for parameterization control. To make browsing through the pages faster, a number of page states will be stored to allow report end-users to restart output processing at the point in the report where they adjust the parameters.

Reports that are run to export all pages in full usually do not need to store those page states. A series of report engine settings controls the number and frequency of the page states stored:

  • org.pentaho.reporting.engine.classic.core.performance.pagestates.PrimaryPoolSize=20
  • org.pentaho.reporting.engine.classic.core.performance.pagestates.SecondaryPoolFrequency=4
  • org.pentaho.reporting.engine.classic.core.performance.pagestates.SecondaryPoolSize=100
  • org.pentaho.reporting.engine.classic.core.performance.pagestates.TertiaryPoolFrequency=10

The reporting engine uses three lists to store page states. The default configuration works as follows:

  1. The first 20 states (pages 1 to 20) are stored in the primary pool. All states are stored with strong references and will not be garbage collected.
  2. The next 400 states (pages 21 to 420) are stored in the secondary pool. Of those, every fourth state is stored with a strong reference and cannot be garbage collected as long as the report processor is open.
  3. All subsequent states (pages > 420) are stored in the tertiary pool, and every tenth state is stored with a strong reference.

So for a 2000-page report, about 278 states (20 + 100 + 158) will be stored with strong references.
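The idea of keeping only every n-th state strongly reachable can be pictured with a small Java sketch. This is not the engine's implementation: the class `SparseStatePool` and its methods are hypothetical, and the use of `WeakReference` for the remaining states is an assumption, since the text only says which states are held by strong references.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; the real engine classes are not shown here.
class SparseStatePool {
    private final int frequency;
    // States that can never be garbage collected while the pool is alive.
    private final Map<Integer, Object> strong = new HashMap<>();
    // States the garbage collector may reclaim (assumed to be weak references).
    private final Map<Integer, WeakReference<Object>> weak = new HashMap<>();

    SparseStatePool(int frequency) {
        this.frequency = frequency;
    }

    /** Stores every frequency-th state strongly, all others weakly. */
    void store(int index, Object state) {
        if (index % frequency == 0) {
            strong.put(index, state);
        } else {
            weak.put(index, new WeakReference<>(state));
        }
    }

    /** Returns the state, or null if it was never stored or has been collected. */
    Object lookup(int index) {
        Object state = strong.get(index);
        if (state != null) {
            return state;
        }
        WeakReference<Object> ref = weak.get(index);
        return ref == null ? null : ref.get();
    }

    int strongCount() {
        return strong.size();
    }

    public static void main(String[] args) {
        // 400 pages with a frequency of 4, like the default secondary pool.
        SparseStatePool pool = new SparseStatePool(4);
        for (int i = 0; i < 400; i++) {
            pool.store(i, "state-" + i);
        }
        System.out.println(pool.strongCount()); // prints 100
    }
}
```

If a weakly held state has been collected, a caller would presumably fall back to the nearest strongly held state before it and reprocess from there; that fallback is not modeled in this sketch.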

In server mode, the settings could be cut down to:

org.pentaho.reporting.engine.classic.core.performance.pagestates.PrimaryPoolSize=1
org.pentaho.reporting.engine.classic.core.performance.pagestates.SecondaryPoolFrequency=1
org.pentaho.reporting.engine.classic.core.performance.pagestates.SecondaryPoolSize=1
org.pentaho.reporting.engine.classic.core.performance.pagestates.TertiaryPoolFrequency=100

This reduces the number of states stored for a 2000-page report to 22, cutting the memory consumed by page states to roughly one tenth.
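The state counts quoted in this section can be checked with a short calculation. The sketch below is a simplified model built from the pool description above, not engine code: the class `PageStateEstimate`, the method `strongStates`, and the choice to round partial groups up are assumptions made for illustration.

```java
// Hypothetical helper that estimates how many page states are held strongly.
class PageStateEstimate {

    /** Integer division rounded up (an assumption about partial groups). */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    static long strongStates(long pages, long primarySize, long secondaryFrequency,
                             long secondarySize, long tertiaryFrequency) {
        // Primary pool: every state is held strongly.
        long primary = Math.min(pages, primarySize);
        long remaining = pages - primary;

        // Secondary pool: secondarySize strong states, one per secondaryFrequency pages.
        long secondaryPages = Math.min(remaining, secondarySize * secondaryFrequency);
        long secondary = ceilDiv(secondaryPages, secondaryFrequency);
        remaining -= secondaryPages;

        // Tertiary pool: one strong state per tertiaryFrequency pages.
        long tertiary = ceilDiv(remaining, tertiaryFrequency);

        return primary + secondary + tertiary;
    }

    public static void main(String[] args) {
        // Default settings: 20 + 100 + 158
        System.out.println(strongStates(2000, 20, 4, 100, 10)); // prints 278
        // Reduced server settings: 1 + 1 + 20
        System.out.println(strongStates(2000, 1, 1, 1, 100));   // prints 22
    }
}
```

Under this model the reduced settings keep 22 of the 278 strongly referenced states, which is about 8 percent.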

Note: In the current version, full exports do not generate page states, so these settings have no effect on such exports. They still affect interactive mode.