Working with the Streamlined Data Refinery

Overview

Create a single data refinery by streamlining all of your data sources through a central processing hub.

The Streamlined Data Refinery (SDR) is a simplified, ad hoc ETL refinery composed of a series of Pentaho Data Integration (PDI) jobs. These jobs take raw data, augment and blend it as directed by the request form, and then publish it for report designers to use in Analyzer.

The Movie Ratings-SDR Sample form described here was developed by Pentaho, is based on CTools, and is provided as an example to help you become familiar with the structure.

SDR Sample Structure and Components

The components that make up the data refinery are PDI for parameter entry and an app for refining the data. The app calls the Data Integration (DI) Server to run the main job, which refines the data through Spoon using the new Build Model job entry and then publishes the data source back to the Business Analytics (BA) Server through the Publish Model job entry. Once published, the refined data is available for creating Analyzer reports.

SDR Workflow

App Builder, Community Dashboard Editor, and CTools

App Builder is an application builder for people who may not have Java knowledge, but who may have plenty of interesting ideas for new plugins. All that is required to use App Builder is knowledge of CTools and PDI.

Community Dashboard Editor (CDE), when integrated with the Pentaho User Console (PUC), simplifies the process of creating, refining, and previewing Pentaho dashboards. You can use CDE to design dashboards, either from scratch or using a template. The layout panel allows you to style your dashboard and add elements such as text or images.

Installing and Configuring the SDR Sample

This section walks you through each part of installing the SDR sample. If you are evaluating Pentaho 5.2 software, we recommend using the graphical installation method for a quick setup. If you already have Pentaho 5.2 installed, you can skip to the section on downloading and installing the sample.

The sample that we are using here was developed by Pentaho, based on CTools, and is provided for example and testing purposes only.

Install Pentaho Software

  1. Install the latest version of Pentaho software.
    1. Evaluation method using Postgres: choose the default installation, and then enter password when prompted for the Postgres password information.
  2. Log in to PUC, and verify that it is working properly.
  3. Log out of PUC, then stop the BA Server.
    1. Windows: Go to Start > Programs > Pentaho Enterprise Edition > Server Management. Double-click the Stop BA Server icon; this runs the script automatically. You might want to create a shortcut to Server Management on your desktop.
    2. Mac OS: At the command line, type cd INSTALL_DIR/ and then run this script to stop the BA Server:
      - ./ctlscript.sh stop baserver
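
The command-line stop step above can be wrapped in a small helper. This is a minimal sketch, not part of the Pentaho distribution: the function name ba_server_ctl is hypothetical, INSTALL_DIR stands for your own installation path, and the start action is an assumption based on the later step that restarts the BA Server.

```shell
#!/bin/sh
# Sketch: run ctlscript.sh for the BA Server from the installation directory.
# ba_server_ctl is a hypothetical helper name; pass your real install path as $1.
ba_server_ctl() {
    install_dir="$1"
    action="$2"   # "stop" comes from the step above; "start" is an assumption

    # Accept only the two actions this sketch knows about.
    case "$action" in
        start|stop) ;;
        *) echo "usage: ba_server_ctl INSTALL_DIR start|stop" >&2; return 2 ;;
    esac

    # Fail early if the control script is not where we expect it.
    if [ ! -x "$install_dir/ctlscript.sh" ]; then
        echo "ctlscript.sh not found in $install_dir" >&2
        return 1
    fi

    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$install_dir" && ./ctlscript.sh "$action" baserver )
}
```

For example, ba_server_ctl /path/to/INSTALL_DIR stop runs the same script as the manual step, but reports a clear error if the path is wrong.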

If you are using Vertica, you will need to install the Vertica JDBC driver at this point.

Download and Install the Sample

These steps will get the sample installed and running.

  1. Download the sample SDR.zip and save it to your desktop.
  2. Extract the contents of the SDR.zip file to this directory: pentaho/server/biserver-ee/pentaho-solutions/system. It should create an SDR folder in the directory.
  3. If you are using the default evaluation method, start the BA Server and log in to PUC.
  4. Go to Tools at the top of PUC and verify that Movie Ratings-SDR Sample appears in the dropdown menu.
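
The extraction in step 2 can also be done from a command line. The following is a rough sketch under a few assumptions: the unzip utility is installed, the function name install_sdr_sample is hypothetical, and both paths are supplied by you (the archive location and the pentaho-solutions/system directory of your own installation).

```shell
#!/bin/sh
# Sketch: extract the SDR sample into the BA Server system directory.
# install_sdr_sample is a hypothetical helper; both arguments are your own paths.
install_sdr_sample() {
    zip_file="$1"     # for example, the SDR.zip saved to your desktop
    system_dir="$2"   # for example, pentaho/server/biserver-ee/pentaho-solutions/system

    [ -f "$zip_file" ] || { echo "missing archive: $zip_file" >&2; return 1; }
    [ -d "$system_dir" ] || { echo "missing directory: $system_dir" >&2; return 1; }

    # -q: quiet, -o: overwrite existing files, -d: target directory
    unzip -q -o "$zip_file" -d "$system_dir" || return 1

    # Step 2 expects an SDR folder to appear in the system directory.
    if [ -d "$system_dir/SDR" ]; then
        echo "SDR folder created in $system_dir"
    else
        echo "SDR folder not found after extraction" >&2
        return 1
    fi
}
```

The final check mirrors the expectation in step 2 that an SDR folder exists after extraction; the menu check in step 4 still has to be done in PUC.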

If you are evaluating Pentaho and used the default installation instructions, you are ready to start experimenting with the sample form. If you installed Pentaho any other way, follow the instructions in the Advanced Settings section before continuing with the sample form.

How to Use the Movie Ratings-SDR Sample Form

The Movie Ratings-SDR Sample form is easy to use. Choose the filters that you want to apply, along with member profile information, a date range, and a name for your new data source. You can then start using your refined data in Analyzer, or use the form to create more data sets.

For best results, we recommend that you select at least some filters or profile items.

  1. Log in to PUC using a login with administrator permissions.
  2. Open the Movie Ratings-SDR Sample form.

    The All Requests Processed field contains a list of up to ten instances of dataset activity. You can click an item in the list to work with that dataset, or create a new one.

  3. Click to select the filters that you want to use from the Movie Review Filter panel.
  4. Click to select profile items in the Member Profile panel to narrow down your data to specific points.

  5. Enter the Start Date for the data set, or use the date picker to choose the start date.
  6. Enter the End Date for the data set, or use the date picker to choose the end date.
  7. Enter a data source name in the field.
  8. Click the Let's do this button to run the dataset.

  9. If you keep the check box selected, Analyzer launches with your dataset parameters in place. Otherwise, click Go to Analyzer when you are ready to begin creating reports.

Learn More about Our Tools

Here are a few ways to find out more about the Streamlined Data Refinery and other Pentaho products.