Skip to main content
Pentaho Documentation

Data Source Wizard

Parent article

Before you can create Pentaho reports and analysis, the Pentaho Server needs to know where to find your data and how to connect to it, what data you want to use, how the data is structured, and what the data is like. The Data Source Wizard helps you define a data source that contains this information and guides you through the creation of your first relational or multidimensional data models for use in creating reports and analysis.

After you define a data source, you can make it available to people who create reports and analysis so they can simply pick the data source they want to use from a list and begin creating their reports or analysis. One data source can be used by any number of reports or analysis.

Get started with the Data Source Wizard

When you configured your Pentaho Server, you defined the connection information, including where the data is stored and what protocol or driver to use to connect to it. Pentaho configuration shows you how to change or add connection information.

To create a data source, just pick one of the connections that you have already defined, then use the wizard's interactive data modeling tool to select tables and columns to drag onto the model canvas. You can then use the Data Source Model Editor to refine the model further.

The Data Source Wizard guides you through setting up CSV files, as well as relational and multidimensional data models, as data sources for building interactive and analysis reports. The initial data models you create with this tool enable you to immediately see how your data looks in Interactive Reports and Analyzer. Your results appear in Interactive Reports and Analyzer as you change model structures, and add tables and columns.

Create New menu in the Pentaho User Console

If you are not logged in with permissions to create, edit, and delete data sources, you are limited to view-only permissions and do not see any icons associated with adding, editing, or deleting data sources.

Manage Data Sources dialog box

There are several ways to access the Data Source Wizard from within the User Console, so you do not have to backtrack to create a data source.

If You Are InThen Follow These Steps
Home perspectiveClick Create New Data Source, then the New Data Source button and Data Source Wizard
Home perspectiveClick New Data Source button in the Manage Data Sources dialog box, then Data Source Wizard
Home perspectiveClick Create New Analysis Report, then the Add icon and Data Source Wizard
Home perspectiveClick Create New Interactive Report, then the Add icon and Data Source Wizard
Opened perspectiveClick New New Data Source Data Source Wizard
Any perspective, from the menu barClick File New Data Source Data Source Wizard
Any perspective, from the menu barClick File Manage Data Sources, then the New Data Source button and Data Source Wizard
Dashboard pane, creating a chart or data tableClick the Add icon in the Select Data Source dialog box
Data Source Wizard dialog box

Eventually, you might want to add more security, localization, or make other advanced modifications to your model. To do that, export the initial model created with the Data Source wizard, and import it into either Pentaho Metadata Editor for relational data models and Pentaho Schema Workbench for multidimensional data models.

NoteRelational and multidimensional models that have been edited in Metadata Editor and Schema Workbench can no longer be edited in the Data Source Model Editor.

After you have accessed the Data Source Wizard, you are ready to begin creating your first data sources.

Choose a data source type

The Data Source Wizard supports several data source types for maximum flexibility. Each data source type is defined on where or how you store data. This table helps you decide which data source type is best for you.

Table 1: Data source type decision table
Explore ConsiderationsChoose Options
CSV FilesSQL QueriesDatabase Table Data Sources
Summary

A file with data that is separated by commas or some other specified delimiter character.

You can save an Excel spreadsheet as a CSV file and use this file to stage a database table in the Pentaho Repository.

If the data in the CSV file changes, you have to upload the updated file through the Data Source Wizardagain.

CSV files can be uploaded directly to the Pentaho Repository without a data connection.

This option creates a multidimensional model for Analyzer only and a relational model for Interactive Reports, Dashboard Designer, and Analyzer.

An SQL query written against a relational database defines the data you want to use in Interactive Reports or Analyzer.

You can customize how columns are presented to users who are building queries against the new data source; for example, you define column names and select options that indicate how data is aggregated (sum, min., max., etc.), and more.

This option creates a multidimensional model for Analyzer only and a relational model for Interactive Reports, Dashboard reports, and Analyzer.

SQL Queries are typically used for prototyping or small production environments, not for enterprise production environments.

Database Table data sources allow you to create relational database tables containing the data you want to use in Interactive Reports or Analyzer.

This data source type supports a broad range of use cases including relational reporting directly against an operational data store (ODS) or creating a data source for multidimensional analysis against a star schema.

Database Table data sources are used in production environments.

ExpertiseKnowledge of basic database structures.Knowledge of basic database structures and SQL.Knowledge of relational and/or multidimensional database structures.
TimeApproximately 10 minutesApproximately 20 minutesApproximately 30 minutes to 1 hour
RecommendationThis data source makes it easier to evaluate or create development prototypes.This data source type works well when you are evaluating or creating development prototypes.

SQL Queries are used for prototyping, not for production.

To use this data source in production, create the query using Pentaho Metadata Editor.

Use this data source for production.

To use it for evaluating or creating development prototypes, limit the data in the tables to sample data that is representative of your production data.