This section contains instructions for configuring and controlling the cache infrastructure that the Pentaho Analysis engine uses for OLAP data. This information is useful for keeping your OLAP cubes up to date when your data warehouse is refreshed, and for performance tuning.
The Analysis engine does not ship with a segment cache, but it can use third-party cache systems. If you've installed Pentaho Analysis Enterprise Edition, then you have a default configuration for the JBoss Infinispan distributed cache, though the actual Infinispan software is not included and must be downloaded separately. Infinispan supports a wide variety of sub-configurations and can be adapted to cache in memory, to disk, to a relational database, or (the default setting) to a distributed cache cluster.
The Infinispan distributed cache is a highly scalable solution that distributes cached data across a self-managed cluster of Mondrian instances. All Mondrian instances running the Analysis Enterprise Edition plugin on a local network automatically discover each other using UDP multicast. A configurable number of copies of each data segment is stored across the available nodes. The total size of the cache is the sum of all of the nodes' capacities, divided by the number of copies to maintain. All of this is configurable; the options are explained later in this section.
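The capacity rule above is simple arithmetic, and a short sketch can make it concrete. The function name, the megabyte units, and the sample node sizes below are illustrative assumptions, not part of Pentaho or Infinispan; the snippet only restates the formula "sum of node capacities divided by number of copies".

```python
def effective_cache_capacity(node_capacities_mb, copies):
    """Return the usable cache size in MB for a cluster.

    node_capacities_mb: per-node cache capacities (hypothetical values, in MB).
    copies: number of copies of each segment kept across the cluster.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Every segment occupies space on `copies` nodes, so the usable
    # capacity is the raw total divided by the copy count.
    return sum(node_capacities_mb) / copies


if __name__ == "__main__":
    # Example: four nodes with 512 MB each, keeping two copies per segment.
    print(effective_cache_capacity([512, 512, 512, 512], copies=2))
```

Under these assumed numbers, keeping more copies improves resilience to node loss but lowers the usable capacity proportionally, which is the trade-off to keep in mind when choosing the copy count.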
Other supported segment cache configurations include, but are not limited to:
- Memcached, which uses an existing Memcached infrastructure to cache and share the segment data among Mondrian peers.
- Pentaho Platform Delegating Cache, which relies on the Pentaho BA Server to delegate segment data storage to the BA Server's native caching capabilities, thus leveraging the existing caching configuration. Some people may prefer this configuration because it keeps the BA Server and Analysis engine manageable as a single entity.