Learn about new features in Pentaho 5.3.
Pentaho 5.3 delivers many exciting and powerful features that help you quickly and securely access, blend, transform, and explore data. Highlights include new Analyzer APIs and documentation, Amazon Redshift and Cloudera Impala improvements, management of Hadoop cluster configurations, support for additional Hadoop distributions, better support for high-load Carte environments, and minor functionality improvements to Pentaho Interactive Reports and the Hadoop steps and entries.
Pentaho Business Analytics 5.3
Our new features and improvements will help you work with Analyzer APIs, explore the Streamlined Data Refinery, and set up multi-tenancy with Pentaho Business Analytics.
New Analyzer APIs & Documentation Updates
We have exposed a new set of APIs that provide more control over Analyzer when it is embedded in another application. These APIs allow for more fine-grained interaction with Analyzer reports and data. The Analyzer extensibility APIs are now documented in a single place, along with introductory material and samples.
Documentation for Multi-Tenancy
We have created documentation to guide you through the Multi-Tenancy process for Pentaho Business Analytics. We cover data multi-tenancy, content multi-tenancy, and UI multi-tenancy within the documentation.
Updates to Streamlined Data Refinery
We have updated the Streamlined Data Refinery to improve the modeling process, added security and data source improvements, and added support for Amazon Redshift and Cloudera Impala.
Pentaho Data Integration 5.3
These new and powerful features will help you quickly and securely access, blend, transform, and explore data with Pentaho Data Integration.
Manage Cluster Configuration with the Hadoop Clusters Feature
Simplify deployment, maintenance, and configuration of Big Data clusters with the Hadoop Clusters feature. The Hadoop Clusters feature lets you store cluster configuration information, such as host names and port numbers. You can then reuse that information in your HBase Input, HBase Output, Pentaho MapReduce, Oozie Job Executor, Hadoop Job Executor, and Pig Script Executor transformation steps and job entries. With Hadoop Clusters, maintenance is a breeze! You only need to change cluster configuration in one place instead of many.
New Developer Documentation
Peruse our new PDI Server API documentation to learn which web services are available to developers. File management, user management, and Carte web services are documented for use. The PDI SDK is also available for download from the Pentaho Support site.
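For example, Carte exposes a status service that returns the state of the slave server as XML. The sketch below builds an authenticated request for that service; the host, port, and the default cluster/cluster credentials are assumptions for illustration, so substitute the values for your own deployment.

```python
# Hedged sketch: building a request for Carte's XML status service.
# Assumes a Carte instance at localhost:8081 with the default
# cluster/cluster credentials (change both for a real deployment).
import base64
import urllib.request


def carte_status_request(host: str, port: int, user: str = "cluster",
                         password: str = "cluster") -> urllib.request.Request:
    """Build an authenticated request for Carte's /kettle/status service."""
    url = f"http://{host}:{port}/kettle/status/?xml=Y"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    return req


req = carte_status_request("localhost", 8081)
# urllib.request.urlopen(req) would return the server status as XML.
print(req.full_url)
```

The same pattern applies to the other documented endpoints; only the path and query parameters change.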
New Cloudera and MapR Big Data Hadoop Distribution Support
We now support CDH 5.2 and MapR 4.0.1. To learn about other Hadoop Distributions we support, check out the 5.3 support matrix.
Better Support for High Load Environments and Large Deployments with Carte
We've made some improvements to Carte to help you better use PDI in high-load environments and large deployments. Carte slave servers can now use the same kettle variable values as the master server. This simplifies the deployment of additional server nodes because you no longer need to copy kettle.properties variable information to each node. You can also easily adjust Carte's Jetty server settings to support high-load environments.
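The Jetty settings are adjusted through kettle variables. A minimal sketch of a kettle.properties fragment is shown below; the variable names are the Carte Jetty tuning variables we believe were introduced in this release, and the values are placeholders you should tune for your own load, so verify both against the documentation for your version.

```properties
# kettle.properties on the Carte server (sketch; tune values for your load).
# Number of Jetty acceptor threads handling incoming connections.
KETTLE_CARTE_JETTY_ACCEPTORS=5
# Size of the queue for connections waiting to be accepted.
KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE=5000
# Maximum idle time (in milliseconds) for a connection.
KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME=5000
```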
Minor Functionality Changes
Minor functionality changes cover changes to the software that might impact your upgrade or migration experience. If you are migrating from a version earlier than 5.2, check the What's New articles and Minor Functionality Changes for each intermediate version of the software.
Manually Migrating Big Data Cluster Configurations Stored in Hadoop Steps and Entries
If you are migrating or upgrading to PDI 5.3 or greater, and you have transformations or jobs that use the following Big Data steps or entries, you might need to convert the existing cluster configuration information to use the Hadoop Clusters feature.
- HBase Input
- HBase Output
- Pentaho MapReduce
- Oozie Job Executor
- Hadoop Job Executor
- Pig Script Executor
You only need to perform the conversion process if you edit one of the above steps or entries in Spoon. You can continue to run scheduled transformations and jobs without the conversion, as long as you do not manually edit one of these steps or entries.
To convert, specify a new Hadoop cluster configuration using the instructions in the Managing Reusable Hadoop Cluster Configurations article.
Interactive Reports Performance Improvements
We have implemented a number of performance improvements for Pentaho Interactive Reports (PIR), including the ability for system administrators to set a system-wide maximum row limit, a way to extend PIR to show repository toolbar buttons, and the incorporation of the query metadata collection capabilities of Pentaho Report Designer (PRD) into PIR.
System-wide Row Limit
We have incorporated the capability to set a system-wide row limit for Pentaho Interactive Reports (PIR). Once you have set it, your users cannot override this row limit, although they can set their own, smaller, row limit through the query settings. This improves performance when the returned data set is large, and also adds the ability to run the full report in the background.
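As a sketch of how an administrator might configure this, the fragment below shows a row-limit setting in the Interactive Reports plugin's settings.xml. The file path and element names here are assumptions for illustration; check the configuration reference for your installation before editing.

```xml
<!-- Sketch: pentaho-solutions/system/pentaho-interactive-reporting/settings.xml -->
<!-- Element names are assumed; verify against your installation. -->
<settings>
  <!-- System-wide maximum number of rows a report query may return. -->
  <query-limit>50000</query-limit>
  <!-- Whether users may set their own (smaller) limit in the query settings. -->
  <query-limit-ui-enabled>true</query-limit-ui-enabled>
</settings>
```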
Toolbar Buttons for Embedding
We have extended PIR so that you can show the buttons that interact with the repository in the PIR toolbar. This allows you to embed PIR without having to use a third-party tool to hook into the callbacks. This function is triggered by passing a parameter on the URL of the PIR plugin.
Query Metadata Collection
We have changed PIR to take advantage of the query metadata collection improvements made for Report Designer in Pentaho 5.2. This query metadata feature improves the design-time performance of Interactive Reports.