Pentaho Documentation

About Data Modeling in Pentaho


This section helps Business Analytics (BA) administrators prepare their data for use with Pentaho Analyzer and Pentaho Report Designer. The collection of analysis components in Pentaho Business Analytics helps you visualize data trends and reveal useful information about your business. You can gain these insights in several ways: by creating static reports from an analysis data source, by traversing an analysis cube through an Analyzer report, by comparing data points in charts, and by monitoring the status of specific trends and thresholds with dashboards.

Before you can begin using any client tools, you must prepare your data:

  1. Consolidate data from disparate sources into one canonical source and optimize it for the metrics you want to analyze.
  2. Create an analysis schema to describe the data.
  3. Iteratively improve that schema so that it meets your users' needs.
  4. Create aggregation tables for frequently computed views.

Once your data is prepared, you can begin using Pentaho Business Analytics.
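
As a rough illustration of step 4, the sketch below pre-computes a frequently requested view (total sales per month) into an aggregation table, using Python's built-in sqlite3 module so that it runs anywhere. The table names, column names, and sample rows are hypothetical and do not come from the Pentaho documentation. The fact_count column reflects a common convention for aggregate tables and is also an assumption here.

```python
import sqlite3

# Hypothetical star-schema fragment: all table and column names are
# illustrative assumptions, not names taken from Pentaho.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (product_id INTEGER, month TEXT, amount REAL);
    INSERT INTO sales_fact VALUES
        (1, '2024-01', 100.0),
        (1, '2024-01', 50.0),
        (2, '2024-01', 75.0),
        (1, '2024-02', 20.0);

    -- Pre-compute a frequently requested view: total sales per month.
    -- fact_count records how many fact rows were rolled into each row.
    CREATE TABLE agg_sales_by_month AS
        SELECT month, SUM(amount) AS amount, COUNT(*) AS fact_count
        FROM sales_fact
        GROUP BY month;
""")

# Queries at month granularity can now read the small aggregate table
# instead of scanning every row of the fact table.
agg_rows = conn.execute(
    "SELECT month, amount, fact_count FROM agg_sales_by_month ORDER BY month"
).fetchall()
print(agg_rows)
```

In a real deployment the aggregate would be built in your warehouse database, typically by an ETL job, and refreshed whenever the fact table changes.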

Pentaho Business Analytics and OLAP

Pentaho Business Analytics is built on the Mondrian online analytical processing (OLAP) engine. OLAP relies on a multidimensional data model that, when queried, returns a dataset that resembles a grid. The rows and columns that describe and bring meaning to the data in that grid are dimensions, and the hard numerical values in each cell are the measures or facts. In Pentaho Analyzer, dimensions are shown in yellow and measures are in blue.
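
To make the grid idea concrete, here is a minimal sketch in plain Python that rolls a handful of fact rows up into a grid, with one dimension on the rows, another on the columns, and a measure in each cell. The region, year, and sales data are invented for illustration. This is not how Mondrian evaluates queries internally, only a picture of the shape of the result.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, year, sales).
# Region and year act as dimensions; sales is the measure.
facts = [
    ("East", 2023, 120.0),
    ("East", 2024, 150.0),
    ("West", 2023, 90.0),
    ("West", 2024, 60.0),
    ("West", 2024, 40.0),
]

# Aggregate the measure for every (row member, column member) pair.
grid = defaultdict(float)
for region, year, sales in facts:
    grid[(region, year)] += sales

regions = sorted({region for region, _, _ in facts})
years = sorted({year for _, year, _ in facts})


def render(grid, regions, years):
    """Lay the aggregated cells out as a text grid."""
    lines = ["Region".ljust(8) + "".join(str(y).rjust(8) for y in years)]
    for region in regions:
        cells = "".join(f"{grid[(region, y)]:8.1f}" for y in years)
        lines.append(region.ljust(8) + cells)
    return "\n".join(lines)


print(render(grid, regions, years))
```

The row and column headers come from the dimension members, and each cell holds the summed measure for that combination of members.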

OLAP requires a properly prepared data source in the form of a star or snowflake schema. Once you have your initial data structure in place, you must design a descriptive layer for it in the form of a Mondrian schema, which defines a logical multidimensional database, consisting of one or more cubes, hierarchies, and members, and maps it to the physical database model. Your data is ready for end-user tools like Pentaho Analyzer only after you have tested and optimized this Mondrian schema. See Workflow Overview for a more comprehensive overview of the Pentaho Analysis data preparation workflow, including which Pentaho tools you will need to execute this process.
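
The sketch below shows roughly what a very small Mondrian schema might look like, generated with Python's standard xml.etree.ElementTree module. The element and attribute names (Schema, Cube, Table, Dimension, Hierarchy, Level, Measure) follow the Mondrian 3-style schema format as commonly documented. The schema name, cube name, tables, and columns are hypothetical. Treat this as an assumption-laden outline and validate any real schema in Schema Workbench.

```python
import xml.etree.ElementTree as ET


def build_schema():
    """Build a minimal, hypothetical Mondrian-style schema document."""
    schema = ET.Element("Schema", name="SalesSketch")

    # A cube maps onto one fact table of the star schema.
    cube = ET.SubElement(schema, "Cube", name="Sales")
    ET.SubElement(cube, "Table", name="sales_fact")

    # A dimension joins the fact table to a dimension table by key.
    dim = ET.SubElement(cube, "Dimension", name="Product", foreignKey="product_id")
    hier = ET.SubElement(dim, "Hierarchy", hasAll="true", primaryKey="product_id")
    ET.SubElement(hier, "Table", name="product_dim")
    ET.SubElement(
        hier, "Level",
        name="Product Name", column="product_name", uniqueMembers="true",
    )

    # A measure aggregates a numeric column of the fact table.
    ET.SubElement(
        cube, "Measure",
        name="Sales Amount", column="amount",
        aggregator="sum", formatString="#,##0.00",
    )
    return schema


schema = build_schema()
ET.indent(schema)  # pretty-printing helper; requires Python 3.9 or later
schema_xml = ET.tostring(schema, encoding="unicode")
print(schema_xml)
```

In practice you would normally author the schema in Schema Workbench rather than generate it by hand. The point here is only to show how cubes, hierarchies, levels, and measures refer back to physical tables and columns.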

For concise definitions of OLAP terms, refer to Mondrian Schema Element Quick Reference and the individual element pages it references.

Pentaho Analysis Enterprise Edition Features

Pentaho offers expanded functionality for Pentaho Analysis Enterprise Edition customers, including:

  • The Pentaho Analyzer visualization tool.
  • A pluggable Enterprise Cache with support for highly scalable, distributable cache implementations including Infinispan and Memcached.

To use these features, you must install a Pentaho Analysis Enterprise Edition license on the Pentaho Server and on workstations that have Schema Workbench and Metadata Editor. You must also install a special Pentaho Server package; this process is covered in the Installation documentation.

All relevant configuration options for these features are covered in this section.