Managing the Pentaho Repository

Pentaho shares files and folders across teams and products through the Pentaho Repository. The Pentaho Repository is an environment for collaborative analysis and ETL (Extract, Transform, and Load) development. From the Pentaho User Console (PUC), you can upload files to and download files from the repository. From Pentaho Data Integration (PDI), you can import and export it.

The Pentaho Repository also enables you to track changes and revert to previous file states using version history. The PDI client maintains a version history while you are developing transformations and jobs within the Pentaho Repository. You can turn version tracking on by editing the Pentaho Repository properties file.

As the Pentaho Repository grows, it may become large enough to degrade system performance. If a hardware upgrade is too costly, consider purging some of the data within the repository using the purge command-line utility.

Another concern is data loss through machine failure, theft, disaster, or accidental change. Routine backups minimize this loss. You can back up or restore the Pentaho Repository through a command-line interface.
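
As a rough illustration of what a routine backup might look like, the following Python sketch builds the argument list for one timestamped backup so that a scheduler can run it at regular intervals. The tool name (import-export.sh), its flags, the server URL, and the helper build_backup_command are assumptions made for this example and are not confirmed by this page; check them against the backup documentation for your Pentaho version before relying on them.

```python
from datetime import datetime
from pathlib import Path


def build_backup_command(tool, server_url, username, password, backup_dir, now=None):
    """Return (argv, backup_file) for a single timestamped repository backup.

    ASSUMPTION: the flag names below follow the pattern of Pentaho's
    import-export script; verify them for your installation.
    """
    now = now or datetime.now()
    # A timestamp in the file name keeps each scheduled run from
    # overwriting the previous backup.
    backup_file = Path(backup_dir) / f"pentaho-repo-{now:%Y%m%d-%H%M%S}.zip"
    argv = [
        str(tool),
        "--backup",
        f"--url={server_url}",
        f"--username={username}",
        f"--password={password}",
        f"--file-path={backup_file}",
    ]
    return argv, backup_file


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    argv, target = build_backup_command(
        tool="./import-export.sh",
        server_url="http://localhost:8080/pentaho",
        username="admin",
        password="password",
        backup_dir="backups",
    )
    print("Would write:", target)
    print("Would run:", " ".join(argv))
```

To perform the backup, the returned list could be passed to subprocess.run(argv, check=True) from a cron job or other scheduler. Because the password appears on the command line in this sketch, a production setup would likely read it from a protected source instead.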