Pentaho Documentation

Purge Transformations, Jobs, and Shared Objects from the Pentaho Repository

Overview

Explains how to use the purge utility to permanently delete content, revisions, and shared objects from the Pentaho Repository.

The Purge Utility allows you to permanently delete shared objects (servers, clusters, and databases) stored in the Pentaho Repository as well as content (transformations and jobs). You can also delete revision information for content and shared objects.

Purging is permanent. Purged items cannot be restored. 

To use the Purge Utility, complete these steps.

  1. Make sure the Pentaho Repository is running.
  2. Open a shell tool, command prompt window, or terminal window, and navigate to the pentaho/design-tools/data-integration directory.

  3. At the prompt, enter the purge utility command. The command format, a table that describes each parameter, and example commands follow.

The command must contain the url, user, and password parameters, as well as one of these parameters: versionCount, purgeBeforeDate, purgeFiles, or purgeRevisions.

Windows:

purge-utility.bat [-url] [-user] [-password] [-purgeSharedObjects] [-versionCount] 
[-purgeBeforeDate] [-purgeFiles] [-purgeRevisions] [-logFileName] [-logLevel]

Linux:

purge-utility.sh [-url] [-user] [-password] [-purgeSharedObjects] [-versionCount] 
[-purgeBeforeDate] [-purgeFiles] [-purgeRevisions] [-logFileName] [-logLevel]

| Option | Required? | Description |
| --- | --- | --- |
| -url | Y | URL address for the Pentaho Repository. By default, the Pentaho Server is installed at this URL: http://localhost:8080/pentaho. |
| -user | Y | Username for an account that can access the Pentaho Server as an administrator. |
| -password | Y | Password for the account used to access the Pentaho Server. |
| -purgeSharedObjects | N | When set to TRUE, purges shared objects from the repository. This parameter must be used with the purgeFiles parameter. If you try to purge shared objects without including purgeFiles in the command line, an error occurs. When set to FALSE, shared objects are not purged. If you include the parameter without setting it to TRUE or FALSE, the Purge Utility assumes TRUE. |
| -versionCount | One of four (see note) | Deletes the entire version history except for the last versionCount versions. Set this value to an integer. |
| -purgeBeforeDate | One of four (see note) | Deletes all versions before purgeBeforeDate. The date must be in the format mm/dd/yyyy. |
| -purgeFiles | One of four (see note) | When set to TRUE, transformations and jobs are permanently and physically removed. Shared objects (such as database connections) are NOT removed. To remove shared objects as well, also include the purgeSharedObjects parameter. When set to FALSE, files are not purged. If you include the parameter without setting it to TRUE or FALSE, the Purge Utility assumes TRUE. |
| -purgeRevisions | One of four (see note) | When set to TRUE, all revisions are purged, but the current file remains unchanged. When set to FALSE, revisions are not purged. If you include the parameter without setting it to TRUE or FALSE, the Purge Utility assumes TRUE. |
| -logFileName | N | Specifies the file name for the log file. If this parameter is not present, the log is written to a file with this name format: purge-utility-log-YYYYMMdd-HHmmss.txt, where YYYYMMdd-HHmmss is the date and time that the log file was created (for example, purge-utility-log-20140313-154741.txt). |
| -logLevel | N | Indicates the types and levels of detail the logs should contain. Values are: ALL, DEBUG, ERROR, FATAL, TRACE, INFO, OFF, and WARN. The default is INFO. See the Log4J documentation for details on the logging level definitions: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |

Note: You must include only one of these: versionCount, purgeBeforeDate, purgeFiles, or purgeRevisions.

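
Because purgeBeforeDate expects mm/dd/yyyy, a date kept in ISO form (yyyy-mm-dd) has to be reordered before it is passed to the utility. The following is a minimal sketch for a POSIX shell. The helper name iso_to_purge_date is hypothetical and is not part of the Purge Utility.

```shell
# Hypothetical helper (not shipped with Pentaho): converts an ISO date
# (yyyy-mm-dd) into the mm/dd/yyyy form that -purgeBeforeDate expects.
iso_to_purge_date() {
  itpd_year=${1%%-*}          # text before the first hyphen
  itpd_rest=${1#*-}           # text after the first hyphen (mm-dd)
  itpd_month=${itpd_rest%%-*}
  itpd_day=${itpd_rest#*-}
  printf '%s/%s/%s\n' "$itpd_month" "$itpd_day" "$itpd_year"
}
```

For example, iso_to_purge_date 2009-01-11 prints 01/11/2009, which can then be supplied as -purgeBeforeDate=01/11/2009. The helper only reorders the fields; it does not check that the input is a valid calendar date.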
  • In this example, the last five revisions of transformations and jobs are kept. All earlier revisions are deleted.


purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -versionCount=5
  • In this example, all revisions before 01/11/2009 are deleted. Logging is set to the WARN level.


purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -purgeBeforeDate=01/11/2009 -logLevel=WARN
  • In this example, all transformations, jobs, and shared objects are deleted. Because purgeFiles and purgeSharedObjects are assumed to be TRUE when no value is given, you do not need to set them to TRUE explicitly. Logging is turned OFF.


purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -purgeFiles -purgeSharedObjects -logLevel=OFF
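
Since purging is permanent, it can help to check a command line against the parameter rules before running it. The following is a minimal sketch for a POSIX shell, assuming the -name=value syntax used in the examples. The function check_purge_args is hypothetical and is not part of the Purge Utility. It only looks at which parameters are present, and it treats purgeSharedObjects as requested when the parameter is bare or written exactly as =TRUE.

```shell
# Hypothetical pre-flight check (not shipped with Pentaho).
# Returns 0 when the arguments satisfy the documented rules, 1 otherwise.
check_purge_args() {
  cpa_url=0 cpa_user=0 cpa_password=0
  cpa_modes=0 cpa_files=0 cpa_shared=0
  for cpa_arg in "$@"; do
    case "$cpa_arg" in
      -url=?*)      cpa_url=1 ;;
      -user=?*)     cpa_user=1 ;;
      -password=?*) cpa_password=1 ;;
      -versionCount=?*|-purgeBeforeDate=?*|-purgeRevisions|-purgeRevisions=?*)
        cpa_modes=$((cpa_modes + 1)) ;;
      -purgeFiles|-purgeFiles=?*)
        cpa_modes=$((cpa_modes + 1)); cpa_files=1 ;;
      -purgeSharedObjects|-purgeSharedObjects=TRUE)
        cpa_shared=1 ;;
    esac
  done
  # Rule 1: url, user, and password are all required.
  if [ "$cpa_url" -ne 1 ] || [ "$cpa_user" -ne 1 ] || [ "$cpa_password" -ne 1 ]; then
    echo "missing -url, -user, or -password" >&2
    return 1
  fi
  # Rule 2: exactly one of the four purge parameters must be present.
  if [ "$cpa_modes" -ne 1 ]; then
    echo "include exactly one of -versionCount, -purgeBeforeDate, -purgeFiles, -purgeRevisions" >&2
    return 1
  fi
  # Rule 3: purgeSharedObjects is only valid together with purgeFiles.
  if [ "$cpa_shared" -eq 1 ] && [ "$cpa_files" -ne 1 ]; then
    echo "-purgeSharedObjects requires -purgeFiles" >&2
    return 1
  fi
  return 0
}
```

On Linux, a call such as check_purge_args -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -versionCount=5 succeeds, and the same arguments could then be passed to purge-utility.sh. The check does not validate parameter values, so an explicit FALSE on purgeFiles still counts as present.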
  4. When the purge finishes, check the logs for any problems.
  5. To see the results of the purge process, disconnect from the Pentaho Repository, then reconnect. In the Repository Explorer, in the Browse tab, verify that the items you specified in your purge utility command were purged.
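
The log check in the steps above can be partly scripted. The following is a minimal sketch for a POSIX shell with two hypothetical helpers that are not part of the Purge Utility. It assumes the default log file name format, that you know the directory the log was written to, and that the Log4J level names WARN, ERROR, and FATAL appear as text in the log lines. The exact log line layout is not described here, so treat the pattern as an assumption.

```shell
# Hypothetical helpers (not shipped with Pentaho).

# Prints the newest default-named purge log in the given directory.
# Default names embed YYYYMMdd-HHmmss, so name order is chronological.
latest_purge_log() {
  lpl_dir=${1:-.}
  lpl_latest=
  for lpl_file in "$lpl_dir"/purge-utility-log-*.txt; do
    [ -e "$lpl_file" ] || continue   # skip the unexpanded pattern if no match
    lpl_latest=$lpl_file
  done
  [ -n "$lpl_latest" ] && printf '%s\n' "$lpl_latest"
}

# Prints how many lines in the log mention WARN, ERROR, or FATAL.
# Assumption: level names appear literally in each log line.
count_purge_problems() {
  grep -c -E 'WARN|ERROR|FATAL' "$1"
}
```

A typical use is count_purge_problems "$(latest_purge_log .)", run from the directory that holds the log. A count of 0 only means none of the three level names appeared, which is also what you would see if logLevel was set to OFF, so it is not a substitute for verifying the results in the Repository Explorer.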