Skip to main content
Pentaho Documentation

Use Oracle as Your Repository Database

Overview

Explains how to initialize and configure the Oracle DI Repository.

Before you prepare your DI Repository, complete the tasks in Prepare Environment.

The DI Repository resides on the database that you installed during the Prepare Environment step, and consists of four repositories: Jackrabbit, Quartz, Hibernate, and Pentaho Operations Mart.

  • Jackrabbit contains the solution repository, examples, security data, and content data from reports that you use Pentaho software to create.
  • Quartz holds data that is related to scheduling reports and jobs.
  • Hibernate holds data that is related to audit logging.
  • Pentaho Operations Mart reports on system usage and performance.

Initialize Oracle DI Repository Database

To initialize Oracle so that it serves as the DI Repository, you will need to run a few SQL scripts to create the Hibernate, Quartz, Pentaho Operations mart, and Jackrabbit (also known as the JCR) databases.

Use the ASCII character set when you run these scripts. Do not use UTF-8 because there are text string length limitations that might cause the scripts to fail.

These sections take you through the steps to initialize the Oracle DI repository database.

Step 1: Change Default Passwords

Pentaho recommends that you change the default passwords in the SQL script files. If you are evaluating Pentaho, you might want to skip this step.

If you do decide to change the passwords, go to the pentaho/server/data-integration-server/data/oracle10g directory and use any text editor to change the passwords in these files:

  • create_jcr_ora.sql
  • create_quartz_ora.sql
  • create_repository_ora.sql
  • pentaho_mart_oracle.sql

Step 2: Run SQL Scripts

Once you change the passwords, you will need to run these SQL scripts. You will need administrator permissions on the server in order to run these scripts. The process for running SQL scripts is the same for Windows or Linux machines. The list of  SQL scripts is shown in the table below.

If you have a different port or different password, make sure that you change the password and port number in these examples to match the ones in your configuration.

Run these scripts from the Command Prompt window or from a Terminal window that runs SQL*Plus.

Action SQL Script
Create Quartz > start <your filepath>/create_quartz_ora.sql
Create Hibernate repository > start <your filepath>/create_repository_ora.sql
Create Jackrabbit > start <your filepath>/create_jcr_ora.sql
Create Pentaho Operations mart > start <your filepath>/pentaho_mart_oracle.sql

Step 3: Verify Oracle Initialization

After you run the scripts, this list will help you verify that databases and user roles have been created.

  1. Open the Terminal or Command Prompt window that is running SQL*Plus.
  2. Make sure that users have been created by running SELECT USERNAME FROM DBA_USERS.
  3. If your databases do not appear, go to the beginning of these instructions and try running the scripts again.
  4. Exit from the SQL*Plus.

Configure Oracle DI Repository Database

Now that you have initialized your repository database, you will need to configure Quartz, Hibernate, Jackrabbit, and Pentaho Operations Mart for an Oracle database.

By default, the examples in this section are for an Oracle database that runs on port 1521. The default password is also in these examples.

If you have a different port or different password, make sure that you change the password and port number in these examples to match the ones in your configuration.

Step 1: Set Up Quartz on Oracle DI Repository Database

Event information, such as scheduled reports, is stored in the Quartz JobStore. During the installation process, you must indicate where the JobStore is located, by modifying the quartz.properties file.

  1. Open the pentaho/server/data-integration-server/pentaho-solutions/system/quartz/quartz.properties file in any text editor.
  2. Locate the #_replace_jobstore_properties section and set the org.quartz.jobStore.driverDelegateClass as shown here.
    org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
    
  3. Locate the # Configure Datasources section and set the org.quartz.dataSource.myDS.jndiURL equal to Quartz, like this.
    org.quartz.dataSource.myDS.jndiURL = Quartz
    
  4. Save the file and close the text editor.

Step 2: Set Hibernate Settings for Oracle

Modify the hibernate settings file to specify where Pentaho should find the DI Repository’s hibernate config file. The hibernate config file specifies driver and connection information, as well as dialects and how to handle connection closes and timeouts.

The files in this section are located in the pentaho/server/data-integration-server/pentaho-solutions/system/hibernate directory.

  1. Open the hibernate-settings.xml file in a text editor. Find the <config-file> tags and change postgresql.hibernate.cfg.xml to oracle10g.hibernate.cfg.xml as shown.
    From:
    <config-file>system/hibernate/postgresql.hibernate.cfg.xml</config-file>
    
    To:
    <config-file>system/hibernate/oracle10g.hibernate.cfg.xml</config-file>
    
  2. Save and close the file.
  3. Open the oracle10.hibernate.cfg.xml file in a text editor.
  4. Make sure that the password and port number match the ones you specified in your configuration. Make changes if necessary, then save and close the file.

Step 3: Replace Default Version of Audit Log File with Oracle Version

Since you are using Oracle to host the DI Repository, you need to replace the audit_sql.xml file with one that is configured for Oracle.

  1. Locate the pentaho-solutions/system/dialects/oracle10g/audit_sql.xml file.
  2. Copy it into the pentaho-solutions/system directory.

Step 4: Modify Jackrabbit Repository Information for Oracle

There are parts of code that you will need to alter in order to change the default jackrabbit repository to Oracle.

  1. Navigate to the pentaho/server/data-integration-server/pentaho-solutions/system/jackrabbit and open the repository.xml file with any text editor.
  2. Following the table below, locate and change the code so that the Oracle lines are not commented out, but the PostgreSQL, MS SQL Server, and MySQL lines are commented out.

If you changed your password when you initialized the database during the Prepare Environment step, or if your database is on a different port, edit the url and password parameters in each section accordingly.

Item: Code Section:
Repository
<FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
    <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
    <param name="user" value="di_jcr_user"/>
    <param name="password" value="password"/>
   ...
    <param name="tablespace" value="di_pentaho_tablespace"/>
</FileSystem>
DataStore
<DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
    <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
    <param name="driver" value="oracle.jdbc.OracleDriver"/>
    <param name="user" value="di_jcr_user"/>
    <param name="password" value="password"/>
    <param name="databaseType" value="oracle"/>
   ...
  </DataStore>
Workspaces
 <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
      <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
      <param name="user" value="di_jcr_user"/>
      <param name="password" value="password"/>
     ...
      <param name="tablespace" value="di_pentaho_tablespace"/>
    </FileSystem>

PersistenceManager

(1st part)

<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
      <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
      <param name="driver" value="oracle.jdbc.OracleDriver"/>
      <param name="user" value="di_jcr_user"/>
      <param name="password" value="password"/>
      <param name="schema" value="oracle"/>
      <param name="schemaObjectPrefix" value="${wsp.name}_pm_ws_"/>
      <param name="tablespace" value="di_pentaho_tablespace"/>
    </PersistenceManager>
Versioning
<FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
      <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
      <param name="user" value="di_jcr_user"/>
      <param name="password" value="password"/>
    ...
      <param name="tablespace" value="di_pentaho_tablespace"/>
    </FileSystem>

PersistenceManager

(2nd part)

<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
      <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
      <param name="driver" value="oracle.jdbc.OracleDriver"/>
      <param name="user" value="di_jcr_user"/>
      <param name="password" value="password"/>
      ...
      <param name="tablespace" value="di_pentaho_tablespace"/>
    </PersistenceManager>
DatabaseJournal
<Journal class="org.apache.jackrabbit.core.journal.OracleDatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log"/>
    <param name="driver" value="oracle.jdbc.OracleDriver"/>
    <param name="url" value="jdbc:oracle:thin:@localhost:1521/XE"/>
    <param name="user" value="di_jcr_user"/>
    <param name="password" value="pentaho"/>
    <param name="schema" value="oracle"/>
    <param name="schemaObjectPrefix" value="J_C_"/>
</Journal>

Perform Tomcat-Specific Connection Tasks

After your repository has been configured, you must configure the web application servers to connect to the DI Repository. In this step, JDBC and JNDI connections are made to the Hibernate, Jackrabbit, and Quartz databases.

By default, the DI Server software is configured to be deployed and run on the Tomcat server. As such, connections have already been specified and only the Tomcat context.xml file must be modified.

The next couple of sections guide you through the process of working with the JDBC drivers and connection information for Tomcat.

Step 1: Download Drivers and Install with the JDBC Distribution Tool

To connect to a database, including the BA Repository or DI Repository database, you will need to download and install a JDBC driver to the appropriate places for Pentaho components as well as on the the web application server that contains the Pentaho Server(s). Fortunately, the JDBC Distribution Tool makes this process easy. 

Due to licensing restrictions, Pentaho cannot redistribute some third-party database drivers. This is why you have to download the file yourself and install it yourself.

  1. Download a JDBC driver JAR from your database vendor or a third-party driver developer.
  2. Copy the JDBC driver JAR you just downloaded to the pentaho/jdbc-distribution directory.
  3. Open a cmd prompt or shell tool, navigate to the pentaho/jdbc-distribution directory and enter one of the following:

​Windows:

  distribute-files.bat <name of JDBC driver JAR>

Linux:

  ./ distribute-drivers.sh <name of JDBC driver JAR>
  1. If you have run this utility as part of the installation process, you are done. Go to the next step of the installation instructions.
  2. If you have run this utility so that you can connect to a new repository, restart the BA or DI Server and Design tools, then try to connect to the new repository. If you cannot connect, verify that the drivers are installed as shown in this table. Restart your Pentaho Server(s) and Client tools.
List of Products and Corresponding Locations for JDBC Drivers
Server or Design Tool Directory
Data Integration (DI) Server pentaho/server/data-integration-server/tomcat/lib
Pentaho Data Integration (Spoon) pentaho/design-tools/data-integration/lib

Step 2: Modify JDBC Connection Information in the Tomcat context.xml File

Database connection and network information, such as the username, password, driver class information, IP address or domain name, and port numbers for your DI Repository database are stored in the context.xml file. Modify this file to reflect the database connection and network information to reflect your operating environment. You also modify the values for the validationQuery parameters in this file if you have chosen to use an DI Repository database other than PostgreSQL.

If you have a different port, password, user, driver class information, or IP address, make sure that you change the password and port number in these examples to match the ones in your configuration environment.

  1. Consult your database documentation to determine the JDBC class name and connection string for your DI Repository database.
  2. Go to the server/data-integration-server/tomcat/webapps/pentaho-di/META-INF directory and open the context.xml file with any file editor.
  3. Comment out the resource references that refer to databases other than Oracle, such as PostgreSQL, MS SQL Server, and MySQL. Then, add the following code to the file if it does not already exist.

    Be sure to adjust the port numbers and passwords to reflect your environment, if necessary.

    <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" driverClassName="oracle.jdbc.OracleDriver" password="password" username="di_hibuser" maxWait="10000" maxIdle="5" maxActive="20" factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" auth="Container" name="jdbc/Hibernate"/>
    
    <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" driverClassName="oracle.jdbc.OracleDriver" password="password" username="di_hibuser" maxWait="10000" maxIdle="5" maxActive="20" factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" auth="Container" name="jdbc/Audit"/>
    
    <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" driverClassName="oracle.jdbc.OracleDriver" password="password" username="di_quartz" maxWait="10000" maxIdle="5" maxActive="20" factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" auth="Container" name="jdbc/Quartz"/>
    
    <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" driverClassName="oracle.jdbc.OracleDriver" password="pentaho_operations_mart" username="pentaho_operations_mart" maxWait="10000" maxIdle="5" maxActive="20" factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" auth="Container" name="jdbc/pentaho_operations_mart"/>
    
    <Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" driverClassName="oracle.jdbc.OracleDriver" password="pentaho_operations_mart" username="pentaho_operations_mart" maxWait="10000" maxIdle="5" maxActive="20" factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" auth="Container" name="jdbc/PDI_Operations_Mart"/>
    
  4. Make sure that the validationQuery variable for your database is set to this: validationQuery="select 1 from dual".
  5. Save the context.xml file, then close it.

Step 3: Configure http and https Ports on Tomcat

Since the DI Server is configured to run on 9080 (http) and 9443 (https), you need to configure Tomcat for these ports as well.

  1. Use a text editor to open the server.xml file, which is located in pentaho/server/data-integration-server/<your tomcat installation directory/conf directory.
  2. Modify the connector port settings for http and https to reflect the DI Server ports (9080 and 9443).
  3. Save the changes and close the file.

Next Steps

Now it is time to start your server.