Some transformation steps and job entries have virtual filesystem (VFS) dialog boxes in place of the traditional local filesystem windows. VFS file dialog boxes enable you to specify a VFS URL in lieu of a typical local path. Each URL begins with a scheme that identifies the protocol used to access the file. See http://commons.apache.org/vfs/apidocs/index.html for VFS scheme documentation. Your files can be local or remote, and can reside in archives or compressed formats such as .tar or .zip.
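As a rough illustration of how the scheme prefix of a VFS URL selects the access method, the following Python sketch splits a VFS-style URL into its scheme and, for archive schemes, the archive location and the path inside the archive. The `!` separator and the `zip`, `jar`, `tar`, `tgz`, and `tbz2` scheme names follow Apache Commons VFS conventions. Whether a particular PDI dialog box accepts each of them is an assumption you should verify, and `split_vfs_url` is a hypothetical helper, not part of PDI. The sketch handles only one archive layer.

```python
# Archive schemes as named in Apache Commons VFS (assumed to apply here).
ARCHIVE_SCHEMES = {"zip", "jar", "tar", "tgz", "tbz2"}


def split_vfs_url(url):
    """Return (scheme, archive_url, entry_path) for a VFS-style URL.

    For non-archive schemes, archive_url and entry_path are None.
    Hypothetical helper for illustration only.
    """
    scheme, sep, rest = url.partition(":")
    if not sep:
        raise ValueError("not a VFS URL (no scheme): " + url)
    scheme = scheme.lower()
    if scheme in ARCHIVE_SCHEMES:
        # Everything before '!' locates the archive; everything after
        # it is the path of the entry inside the archive.
        archive_url, _, entry_path = rest.partition("!")
        return scheme, archive_url, entry_path or "/"
    return scheme, None, None


# A file inside a local .zip archive (example path is made up):
print(split_vfs_url("zip:file:///tmp/data.zip!/input/sales.csv"))
# A plain remote URL with no archive layer:
print(split_vfs_url("s3://key:secret@s3/MyBucket/path"))
```

The first call separates the archive location (`file:///tmp/data.zip`) from the entry path (`/input/sales.csv`); the second returns only the `s3` scheme because no archive layer is involved.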
To access your files with the VFS browser, complete the following steps:
- Select File > Open URL in the PDI client to open the VFS browser.
- Choose your file system type in the Location drop-down. The supported file systems are:
  - Local – Opens files on your local machine. Use the folders in the Name panel of the Open File dialog box to select a resource.
  - Hadoop Cluster – Opens files on any Hadoop cluster except S3. Click the Hadoop Cluster drop-down box to select your desired cluster, then select the resource you want to access.
  - S3 – Opens files on Amazon Simple Storage Service (S3), part of Amazon Web Services. Select S3, enter your Access Key and Secret Key in the Open File dialog box, then click Connect.
  - HDFS – Opens files on any Hadoop distributed file system except MapR. Click the Hadoop Cluster drop-down box to select your desired cluster, then select the resource you want to access.
  - MapRFS – Opens files on the MapR file system. Use the folders in the Name panel of the Open File dialog box to select a MapR resource.
The following addresses are example VFS URLs:
S3: s3://<AWS Access Key>:<AWS Secret Access Key>@s3/MyBucket/path
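The S3 pattern above can be sketched in Python as follows. AWS secret keys may contain characters such as `/` or `+`, which would be misread as URL delimiters if pasted in directly, so this sketch percent-encodes the credentials. That PDI decodes percent-encoded credentials is an assumption to verify in your environment. `build_s3_vfs_url` and `describe_vfs_url` are hypothetical helpers, and the key values are placeholders.

```python
from urllib.parse import quote, unquote, urlsplit


def build_s3_vfs_url(access_key, secret_key, bucket, path=""):
    """Assemble an s3:// URL in the pattern shown above (hypothetical helper)."""
    # Percent-encode the credentials so '/', '+', ':' and '@' inside a key
    # cannot be confused with the URL's own delimiters.
    user = quote(access_key, safe="")
    password = quote(secret_key, safe="")
    suffix = "/" + path.lstrip("/") if path else ""
    return f"s3://{user}:{password}@s3/{bucket}{suffix}"


def describe_vfs_url(url):
    """Split a VFS URL into its parts, decoding any encoded credentials."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "access_key": unquote(parts.username) if parts.username else None,
        "secret_key": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        "path": parts.path,
    }


# Placeholder credentials, not real keys.
url = build_s3_vfs_url("AKIAEXAMPLE", "abc/def+ghi", "MyBucket", "path")
print(url)
print(describe_vfs_url(url))
```

Because the URL embeds the secret key in plain text, avoid pasting such URLs into logs or shared transformation files.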
Add and Delete Folders or Files
You can also use the VFS browser to create folders and to delete files or folders on your file system. A default filter is applied so that initially only Kettle transformation and job files are displayed. To view other files, click the Filter drop-down and select the type of file you want to view. Once you have selected the file or folder you want to delete, click the X in the upper-right corner of the VFS browser to delete your selection. To create a new folder, click the + in the upper-right corner of the VFS browser, enter the new folder name, and click OK.
Supported Steps and Entries
Supported transformation steps and job entries open the VFS browser instead of the traditional file open dialog box. With the VFS browser, you specify a VFS URL instead of a file path to access your resources.
The following steps and entries support the VFS browser:
- File Exists
- Mapping (sub-transformation)
- ETL Metadata Injection
- Hadoop Copy Files
- Hadoop File Input
- Hadoop File Output
VFS dialog boxes are configured through certain transformation parameters. Refer to Configure SFTP VFS for more information on configuring options for SFTP.
Configure VFS Options
The VFS browser can be configured to set variables as parameters to use at runtime. A sample transformation, VFS Configuration Sample.ktr, containing some of the parameters you can set is in the /design-tools/data-integration/samples/transformations folder. For more information on setting the variables, see VFS Properties. For an example of configuring an SFTP VFS connection, see Configure SFTP VFS.