Pentaho 6.0 delivers many exciting and powerful features to help you quickly and securely access, blend, transform, and explore your data. Highlights include new features and improvements for Analyzer, Data Integration, the Platform itself, and the Help system:
Pentaho Business Analytics 6.0
Your experience with Analyzer has been enhanced with inline model editing, calculated measure updates, additional data source connection options, and an improved prompting experience. With inline model editing, your updates are reflected in your data source when you save your changes. Moreover, your calculated measures can now be saved to the data model directly from Analyzer. You can also connect to your new Pentaho Data Services, and working with prompts has become a clean, seamless experience.
Inline Model Editing
Pentaho now features inline model editing, a self-service ability which allows you to modify your data source from Analyzer. Previously, model changes would have to be submitted to your IT group, and you would have to wait for your request to be applied. Now, you can rename a measure or create a calculated measure directly from Analyzer. When you save these changes, they become a part of the data source for others to immediately access. From Analyzer, you can update the properties on a base measure, including renaming the measure, changing the aggregation type, or adjusting the format. These changes will then be available to other users who are creating their reports based on this data source.
Calculated Measure Updates
As part of the new self-service inline model editing feature, you can save your calculated measure to the data model while creating or editing your report in Analyzer. Previously, if you wanted to add a particularly useful calculated measure to the model for others to use in their reports, you had to submit a request to IT and wait until it was applied. Now, you can save the calculated measure to the model from within Analyzer so other users can benefit immediately from your work.
More Data Source Connection Options
Two new data source connection options now appear in the Pentaho User Console's Database Connection window: Cloudera Impala and Pentaho Data Services. The Cloudera Impala option enables you to use the Cloudera JDBC Driver for Impala 2.5.24 (the Simba driver) when you access a Cloudera Distribution for Hadoop (CDH) 5.4 cluster. For more details, see the Version-Specific Notes section of the Set up Pentaho to Connect to a Cloudera Cluster article. The Pentaho Data Services option enables you to easily connect to and query a data service from a Pentaho tool, like PRD or PIR. You no longer have to use the Generic Database option. For more information, see the Pentaho Data Services section in this article.
Improved Prompting Experience
Adding and editing prompts in Interactive Reporting and Dashboards is now a clean, seamless experience. In addition, prompts such as the Date Picker are now consistent and uniform across the product.
Pentaho Data Integration 6.0
In 6.0, you can now change your transformations into data services, work with SNMP monitoring trap events, and analyze data lineage along with various other enhancements. These new and powerful features will help you quickly and securely access, blend, transform, and explore data with Pentaho Data Integration.
Pentaho Data Services
Sometimes, building and maintaining a data warehouse is impractical or costly, especially when you need to quickly blend and visualize fast-moving or quickly evolving data sets. Instead of building a data warehouse, you can turn a transformation into a Pentaho Data Service, which empowers you to quickly analyze and visualize results from a virtual table. A Pentaho Data Service is a transformation hosted on the DI Server that can be queried as a virtual database table.
This powerful feature turns an ordinary transformation into a JDBC data source, which can be queried with simple SQL statements. You can query a Pentaho Data Service from any JDBC-compliant tool, such as Pentaho Analyzer, Report Designer, or Interactive Reporting, as well as compatible non-Pentaho tools like R Studio or SQuirreL.
You can also monitor these data services from your browser. For more information, see Monitor the Data Service in the Create a Pentaho Data Service article.
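As a rough illustration of what querying a data service as a virtual table involves, the following Python sketch builds a JDBC connection URL and a simple SQL statement. The driver class name, the jdbc:pdi URL scheme, the webappname parameter, and the sales_by_region service name are assumptions made for this example and are not stated in this article; confirm them against the Pentaho Data Services documentation for your release.

```python
from urllib.parse import urlencode

# Assumption: class name of the Pentaho thin JDBC driver. Verify it against
# the driver JAR shipped with your installation.
DRIVER_CLASS = "org.pentaho.di.trans.dataservice.jdbc.ThinDriver"


def data_service_jdbc_url(host, port, webapp="pentaho-di"):
    """Build a JDBC URL for a DI Server that hosts data services.

    The URL layout is an assumption for illustration only.
    """
    query = urlencode({"webappname": webapp})
    return f"jdbc:pdi://{host}:{port}/kettle?{query}"


def select_from_service(service_name, columns):
    """Build a simple SELECT against the virtual table named after the service."""
    quoted = ", ".join(f'"{column}"' for column in columns)
    return f'SELECT {quoted} FROM "{service_name}"'


if __name__ == "__main__":
    print(DRIVER_CLASS)
    print(data_service_jdbc_url("localhost", 9080))
    print(select_from_service("sales_by_region", ["region", "total"]))
```

The resulting URL and statement can be pasted into any JDBC-compliant client, which is what makes the transformation look like an ordinary database table to that tool.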
You can now set up the Data Integration (DI) and Carte servers, along with client tools (such as Pan, Kitchen, and Spoon), to expose common DI events, like the start and the end of a transformation or job, through Simple Network Management Protocol (SNMP) traps. This setup enables integration with third-party enterprise monitoring tools, like Nagios, Icinga, PRTG Network Monitor, and others, so you can use their functionality for operational purposes.
The DI server only needs to know the IP address of your third-party monitoring server to start sending trap events. By default, the DI server sends all event types that occur when a Kettle job or transformation is executed, or you can specify the types of events you want to monitor through SNMP traps.
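Before pointing the DI server at a full monitoring tool, it can help to confirm that trap datagrams actually reach the monitoring host. The following Python sketch is a minimal UDP listener for that purpose, not a replacement for Nagios or a similar tool. It does not decode the trap contents; it only checks that a datagram arrived and that it starts with the BER SEQUENCE tag that SNMP messages use. The function names are hypothetical.

```python
import socket


def open_trap_socket(host="127.0.0.1", port=0):
    """Bind a UDP socket. Port 0 lets the OS choose a free port for testing;
    SNMP traps are conventionally sent to UDP port 162, which usually
    requires elevated privileges to bind."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(sock, timeout=5.0):
    """Wait for a single datagram and return (payload, sender_ip)."""
    sock.settimeout(timeout)
    payload, (sender_ip, _sender_port) = sock.recvfrom(65535)
    return payload, sender_ip


def looks_like_snmp(payload):
    """SNMP messages are BER-encoded SEQUENCEs, so the first byte is 0x30."""
    return len(payload) > 0 and payload[0] == 0x30


if __name__ == "__main__":
    listener = open_trap_socket()
    print("listening on UDP port", listener.getsockname()[1])
    try:
        data, sender = receive_one(listener, timeout=30.0)
        print(sender, len(data), "bytes, SNMP-like:", looks_like_snmp(data))
    except socket.timeout:
        print("no datagram received")
    finally:
        listener.close()
```

To listen for real traps, bind to the address and port that the DI server is configured to send to.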
Inline Model Editing SDR Updates
In addition to supporting annotations made in PDI, the Streamlined Data Refinery (SDR) process now supports annotations made via inline model editing within Analyzer. When you make changes to measures in Analyzer, these changes can be included when you republish the model during the SDR process.
Also, you can now add calculated measures as annotations in the Annotate Stream step. Select the Add Calculated Measure button to add your calculated measure as an annotation on the model.
Data Lineage Analysis
Pentaho now offers you the ability to analyze the end-to-end flow of your data across PDI transformations and jobs, providing you with valuable insights to help you maintain meaningful data. This ability to track your data from source systems to target applications enables you to take advantage of third-party tools, such as Meta Integration Technology's (MITI's) Model Bridge and yEd, to track and view specific data.
Once lineage tracking is enabled, PDI will generate a .graphml file every time you run a transformation or job. You can then open this file using a third-party tool, such as yEd, to view a tree diagram of the data. By parsing through and teasing out the different parts of the graph, you can gain an end-to-end view into a specific element of data from origin to target.
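Because GraphML is plain XML, you can also inspect a lineage file with a short script instead of a graphical tool. The following Python sketch reads the node and edge elements and lists everything downstream of a chosen node. The sample document and its node IDs are hypothetical; the files PDI generates carry additional attributes that this sketch ignores.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict, deque

# Standard GraphML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical sample standing in for a generated lineage file.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="source_table"/>
    <node id="filter_step"/>
    <node id="target_table"/>
    <node id="unrelated"/>
    <edge source="source_table" target="filter_step"/>
    <edge source="filter_step" target="target_table"/>
  </graph>
</graphml>
"""


def load_edges(graphml_text):
    """Return a mapping of node id -> list of direct successor node ids."""
    root = ET.fromstring(graphml_text)
    successors = defaultdict(list)
    for edge in root.iterfind(".//g:edge", GRAPHML_NS):
        successors[edge.get("source")].append(edge.get("target"))
    return dict(successors)


def downstream(successors, start):
    """Breadth-first walk that lists every node reachable from start."""
    reached = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                reached.append(nxt)
                queue.append(nxt)
    return reached


if __name__ == "__main__":
    print(downstream(load_edges(SAMPLE), "source_table"))
```

To trace from a target back to its origins, build the mapping from target to source instead and run the same walk.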
Big Data Improvements
We have expanded our support for Big Data and improved the configuration experience.
New Hadoop Distribution Support
Pentaho now offers support for Cloudera Distribution for Hadoop, version 5.4. To learn more about the other Hadoop Distributions supported, see our Component Reference.
Improved Hadoop Configuration Experience
When you set up a connection to the Hadoop Cluster in Spoon, you can now test the configuration, which makes it easier to troubleshoot common configuration problems. You can also set the active shim in Spoon. A shim is an adapter you configure to connect to a Hadoop Cluster.
Many improvements and fixes were made to improve the PDI user experience.
Scheduling and Menu Updates
We have streamlined transformation and job menu items so it is easier to find the options you want. You can also now schedule a transformation or job to run yearly.
PDI Step and Entry Changes
The following steps, entries, and components have been moved to the marketplace:
- Entries: SSH2 Get Entry, SSH2 Put Entry
- Steps: Aggregate Rows Step, Get Previous Row Fields Step, LucidDB Bulk Loader Step, Streaming XML Input Step, XML Input Step, Google Analytics Input Step
If you have a job or transformation from a previous version of PDI, and it contains these steps or entries, you need to either replace them or download the step or entry plugin from the marketplace. Step and entry replacement recommendations are in the Transformation Step and Job Entry References.
A few other notable changes have occurred:
- Salesforce steps now support TLS 1.2 encryption.
- The Start a YARN Kettle Cluster and Stop a YARN Kettle Cluster entries have been renamed to Start PDI Cluster on YARN and Stop PDI Cluster on YARN.
New Perspectives Menu
You can now get to different perspectives using a simple menu to the top right of the Spoon window. We have removed Instaview and MonetDB to streamline processes.
Multiple License Installation
You can now install more than one license at a time.
SAP HANA Bulk Loader Step
If you are an SAP HANA user, you can now bulk load data into your SAP HANA database table using the SAP HANA Bulk Loader transformation step. Bulk loading enables you to significantly speed up the process of setting up systems or restoring previously collected data. The SAP HANA Bulk Loader step is shipped as a plug-in, which you can manually install.
Minor Functionality Changes
Minor functionality changes cover changes to the software impacting your migration experience. If you are migrating from a version earlier than 6.0, check the What's New articles and Minor Functionality changes for each intermediate version of the software.
Deprecated Steps, Entries, and Components
The following items are deprecated in version 6.0 of the software. Although deprecated items will be phased out in a future version of PDI, they remain in the software. No new features or development work will be performed on deprecated items. If you are using a deprecated item, we strongly suggest you use a replacement step or entry. Replacement steps and entries appear in the Transformation Step and Job Entry References.
The following steps and entries have been deprecated:
- Entries: MS Access Bulk Load Entry
- Steps: LucidDB Streaming Loader Step, Greenplum Bulk Loader Step
New PDI Internal Variable
The Internal.Entry.Current.Directory is a new variable you can use to indicate the current filename or repository directories in a PDI transformation step or job entry. This variable can be used in place of these four variables:
These variables will still work in 6.0. The new variable is easier to use because a single variable covers all of these cases.
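To show how a single directory variable keeps path expressions uniform, the following Python sketch mimics ${...} variable substitution on a path. It is a toy model for illustration, not PDI's own resolver; the resolve function, the directory value, and the load_sales.ktr file name are hypothetical.

```python
import re

# Matches ${Variable.Name} placeholders.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve(text, variables):
    """Replace each ${name} with its value. In this sketch, unknown names
    are left untouched."""
    def substitute(match):
        name = match.group(1)
        return variables.get(name, match.group(0))
    return _VARIABLE.sub(substitute, text)


if __name__ == "__main__":
    # Hypothetical value; at run time PDI supplies the actual directory.
    variables = {"Internal.Entry.Current.Directory": "/home/etl/jobs"}
    print(resolve("${Internal.Entry.Current.Directory}/load_sales.ktr", variables))
```

The same expression can be reused unchanged wherever a path relative to the current entry is needed.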
Unused Connection No Longer Saved by Default
In PDI 6.0, the Only save used connections to XML option is set to True by default. Thus, when a transformation is saved or exported to an XML file, connection information used in the transformation is included in the XML file, but unused connection information is not. Before PDI 6.0, this option was set to False, so all connection information was included in the saved or exported file, even if it was not used in the transformation.
If you want to include both used and unused connection information in a saved or exported XML file, start Spoon, go to Tools > Options, then set the Only save used connections to XML option to False.
We have made a number of improvements and upgrades to components within the Pentaho Platform, as well as to the platform itself. Highlights include an easier upgrade and repository experience, JBoss and Tomcat upgrades, an update for Jackrabbit, and better security capabilities.
Simplified Upgrade Process
We have streamlined the entire upgrade process for Pentaho 6.0. You will now be able to upgrade to the next major release with a simplified process, starting with Pentaho 5.x to Pentaho 6.0. Our latest release contains multiple new features, platform upgrades, and fixes to improve the efficiency of your Pentaho system.
Complete Repository Backup and Restore
You now have a couple of ways to do a complete backup and restore of your Pentaho repository. You can do it either through the command line interface or through a pair of new REST APIs. When you do a backup, the process exports all content from the Pentaho repository, including data connections, users and roles, Mondrian schemas, metadata entries, and all reports and schedules.
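As a sketch of how the REST route might be scripted, the following Python code prepares an authenticated backup request and streams the response to a file. The endpoint path /api/repo/files/backup, the use of HTTP Basic authentication, and the host, user name, and password shown are assumptions for illustration; check the REST API reference for your release before relying on them.

```python
import base64
import shutil
import urllib.request


def build_backup_request(base_url, user, password):
    """Prepare (but do not send) a GET request for a full repository backup.

    The endpoint path is an assumption; verify it in the API reference.
    """
    url = base_url.rstrip("/") + "/api/repo/files/backup"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url, headers={"Authorization": "Basic " + token}, method="GET"
    )


def download_backup(request, out_path):
    """Send the request and stream the response body to out_path."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)


if __name__ == "__main__":
    # Hypothetical server and credentials.
    req = build_backup_request("http://localhost:8080/pentaho", "admin", "password")
    print(req.get_method(), req.full_url)
    # download_backup(req, "pentaho_backup.zip")  # uncomment against a live server
```

Separating request construction from sending makes it easy to review the URL and headers before touching a production repository.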
JBoss, Tomcat, and Java 8 Upgrades
We have integrated newer versions of JBoss and Tomcat along with an upgrade to Java 8 into our platform to take advantage of the numerous improvements available in them.
Apache Jackrabbit Update
We have incorporated an updated version of Apache Jackrabbit into Pentaho 6.0.
Virtual File System Update
We have incorporated version 2.0 of VFS into Pentaho 6.0.
Existing plug-ins using VFS 1.x may need to be modified for VFS 2.0.
Enhanced Security Capabilities
We have improved our enterprise security capabilities by upgrading both Spring Security and the Java framework to provide improved authentication, authorization, and other security features.
APIs for Managing Pentaho Security Users and Roles
This new group of REST APIs allows you to work with Pentaho Security users and roles in the Business Analytics platform. You will be able to create or delete users or roles, assign roles to users, list members or roles, assign permissions to roles, or change user passwords.
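The following Python sketch shows what a call to create a user could look like. The endpoint path /api/userroledao/createUser, the PUT method, and the userName and password JSON fields are assumptions for illustration and should be confirmed against the REST API reference; the server address and account values are hypothetical.

```python
import json
import urllib.request


def build_create_user_request(base_url, auth_header, user_name, password):
    """Prepare (but do not send) a request that creates a Pentaho Security user.

    The endpoint path and JSON field names are assumptions; verify them in
    the API reference for your release.
    """
    url = base_url.rstrip("/") + "/api/userroledao/createUser"
    body = json.dumps({"userName": user_name, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Authorization": auth_header},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical server, credentials, and new account.
    req = build_create_user_request(
        "http://localhost:8080/pentaho", "Basic YWRtaW46cGFzc3dvcmQ=", "jsmith", "s3cret"
    )
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req)  # uncomment against a live server
```

The other operations, such as deleting users or assigning roles, would follow the same pattern with their own endpoints and payloads.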
Help Site Improvements
Our Help site now has an auto-faceted search tool and can be accessed directly from Pentaho products.
Auto-Faceted Help Search Tool
The search tool is now auto-faceted. If you begin your search on a specific article page, the results returned will only be within the category of that article. Use the search carousel or click on All Results to expand your search to the rest of the help system. You can expand the carousel to include other guides, categories, and versions.
In-Product Access to Pentaho Help
You can now access Pentaho Help from inside the Pentaho products, instead of from a PDF. This gives you access to the latest Pentaho help documentation.