Skip to main content
Pentaho Documentation

Data Source Wizard

Before you can create Pentaho reports and analysis, the Pentaho Server needs to know where to find your data and how to connect to it, what data you want to use, how the data is structured, and what the data is like. The Data Source Wizard helps you define a data source that contains this information and guides you through the creation of your first relational or multidimensional data models for use in creating reports and analysis.

After you define a data source, you can make it available to people who create reports and analysis so they can simply pick the data source they want to use from a list and begin creating their reports or analysis. One data source can be used by any number of reports or analysis.

Get Started with the Data Source Wizard

When you configured your Pentaho Server, you defined the connection information, including where the data is stored and what protocol or driver to use to connect to it. Configure Pentaho Systems shows you how to change or add connection information.

To create a data source, just pick one of the connections that you have already defined, then use the wizard's interactive data modeling tool to select tables and columns to drag onto the model canvas. You can then use the Data Source Model Editor to refine the model further.

The Data Source Wizard guides you through setting up CSV files, as well as relational and multidimensional data models, as data sources for building Interactive and Analysis Reports. The initial data models you create with this tool enable you to immediately see how your data looks in Interactive Reporting and Analyzer. Your results appear in Interactive Reporting and Analyzer as you change model structures, and add tables and columns.

If you are not logged in with permissions to create, edit, and delete data sources, you are limited to view-only permissions and do not see any icons associated with adding, editing, or deleting data sources.

ssManageDataSourcesNewButtonx.png

There are several ways to access the Data Source Wizard from within the User Console, so you do not have to backtrack to create a data source.

If You Are In Then Follow These Steps
Home perspective Click Create New > Data Source > New Data Source button > Data Source Wizard
Home perspective Click Manage Data Sources > New Data Source button > Data Source Wizard
Home perspective Click Create New > Analysis Report > Add icon > Data Source Wizard
Home perspective Click Create New > Interactive Report> Add icon > Data Source Wizard
Opened perspective Click New > New Data Source > Data Source Wizard
Any perspective, from the menu bar Click File > New > Data Source > Data Source Wizard
Any perspective, from the menu bar Click File > Manage Data Sources > New Data Source button > Data Source Wizard
Dashboard pane, creating a chart or data table Click the Add icon in the Select Data Source dialog box


ssGettingStartedDataSourceWizardEntry.png

Eventually, you might want to add more security, localization, or make other advanced modifications to your model. To do that, export the initial model created with the Data Source wizard, and import it into either Metadata Editor for relational data models and Schema Workbench for multidimensional data models.

Note: Relational and multidimensional models that have been edited in Metadata Editor and Schema Workbench can no longer be edited in the Data Source Model Editor.

After you have accessed the Data Source Wizard, you are ready to begin creating your first data sources.

Choose a Data Source Type

The Data Source Wizard supports several data source types for maximum flexibility. Each data source type is defined on where or how you store data. This table helps you decide which data source type is best for you.

Table 1. Data Source Type Decision Table
Explore Considerations Choose Options
CSV Files SQL Queries Database Table Data Sources
Summary

A file with data that is separated by commas or some other specified delimiter character.

You can save an Excel spreadsheet as a CSV file and use this file to stage a database table in the Pentaho Repository.

If the data in the CSV file changes, you have to upload the updated file through the Data Source Wizard again.

CSV files can be uploaded directly to the Pentaho Repository without a data connection.

This option creates a multidimensional model for Analyzer only and a relational model for Interactive Reports, Dashboard reports, and Analyzer.

An SQL query written against a relational database defines the data you want to use in Interactive Reports or Analyzer.

You can customize how columns are presented to users who are building queries against the new data source; for example, you define column names and select options that indicate how data is aggregated (sum, min., max., etc.), and more.

This option creates a multidimensional model for Analyzer only and a relational model for Interactive Reports, Dashboard reports, and Analyzer.

SQL Queries are typically used for prototyping or small production environments, not for enterprise production environments.

Database Table Data Sources allow you to create relational database tables containing the data you want to use in Interactive Reports or Analyzer.

This data source type supports a broad range of use cases including relational reporting directly against an operational data store (ODS) or creating a data source for multidimensional analysis against a star schema.

Database Table data sources are used in production environments.

Expertise Knowledge of basic database structures. Knowledge of basic database structures and SQL. Knowledge of relational and/or multidimensional database structures.
Time Approximately 10 minutes Approximately 20 minutes Approximately 30 minutes to 1 hour
Recommendation This data source makes it easier to evaluate or create development prototypes. This data source type works well when you are evaluating or creating development prototypes.

SQL Queries are used for prototyping, not for production.

To use this data source in production, create the query using Metadata Editor.

Use this data source for production.

To use it for evaluating or creating development prototypes, limit the data in the tables to sample data that is representative of your production data.