Pentaho Documentation

Fix Big Data Configuration Problems

Overview

This section explains how to fix common configuration and cluster connection problems for the CDH, EMR, HDP, and MapR Hadoop distributions.

General Configuration Problems

This section lists the symptoms of common configuration problems, along with their likely causes and resolutions.

Shim and Configuration Issues

Symptom: No shim

Common Causes:
  • Active shim was not selected.
  • Shim was installed in the wrong place.
  • Shim name was not entered correctly in the plugin.properties file.

Common Resolutions:
  • Verify that the plugin name in the plugin.properties file matches the directory name in the pentaho-big-data-plugin/hadoop-configurations directory.
  • Make sure the shim is installed in the correct place.
  • Check the instructions for your Hadoop distribution in the Set Up Pentaho to Connect to an Apache Hadoop Cluster article for more details on how to verify the plugin name and shim installation directory.

Symptom: Shim doesn't load

Common Causes:
  • Required licenses are not installed.
  • You tried to load a shim that is not supported by your version of Pentaho.
  • If you are using MapR, the client might not have been installed correctly.
  • Configuration file changes were made incorrectly.

Common Resolutions:
  • Verify the required licenses are installed and have not expired.
  • Verify that the shim is supported by your version of Pentaho. Find your version of Pentaho, then look for the corresponding support matrix for more details. For example, if you are running Pentaho 6.0, see the Components Reference topic, which is the support matrix for Pentaho 6.0.
  • Verify that configuration file changes were made correctly. Contact your Hadoop Administrator or see the Set Up Pentaho to Connect to an Apache Hadoop Cluster article.
  • If you are connecting to MapR, verify that the client was properly installed. See MapR documentation for details.
  • Restart Spoon, then test again.
  • If this error continues to occur, files might be corrupted. Download a new copy of the shim from the Pentaho Customer Support Portal.

Symptom: The file system's URL does not match the URL in the configuration file.

Common Causes:
  • Configuration files (*-site.xml files) were not configured properly.

Common Resolutions:
  • Verify that the configuration files were configured correctly.
  • Verify that the core-site.xml file is configured correctly.
  • See the instructions for your Hadoop distribution in the Set Up Pentaho to Connect to an Apache Hadoop Cluster article for details.
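
Two of the checks in this section, the shim name and the file system URL, can be scripted. The following is a minimal Python sketch and not a Pentaho tool. It assumes that the active shim is named by an active.hadoop.configuration key in plugin.properties and that the cluster URL is stored under the Hadoop fs.defaultFS property in core-site.xml; both key names are assumptions to verify against your installation, and the function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def read_properties(path):
    """Parse a simple key=value .properties file into a dict."""
    props = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            if sep:
                props[key.strip()] = value.strip()
    return props


def active_shim_problem(plugin_dir, key="active.hadoop.configuration"):
    """Return a description of the shim-name problem, or None if none is found.

    plugin_dir is the pentaho-big-data-plugin directory. The key name is an
    assumption; change it if your plugin.properties uses a different one.
    """
    props = read_properties(os.path.join(plugin_dir, "plugin.properties"))
    shim = props.get(key, "")
    if not shim:
        return "no active shim is set in plugin.properties"
    shim_dir = os.path.join(plugin_dir, "hadoop-configurations", shim)
    if not os.path.isdir(shim_dir):
        return "no directory named %r under hadoop-configurations" % shim
    return None


def read_site_property(site_xml_path, name="fs.defaultFS"):
    """Return the value of one property from a Hadoop *-site.xml file, or None."""
    root = ET.parse(site_xml_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None
```

Calling active_shim_problem with the path to your pentaho-big-data-plugin directory returns None when the configured name matches a shim directory. The value returned by read_site_property for core-site.xml can then be compared by eye with the URL you entered in Pentaho.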

Connection Problems

Symptom: Hostname is incorrect or does not resolve properly.

Common Causes:
  • No hostname has been specified.
  • Hostname/IP address is incorrect.
  • Hostname is not resolving properly in the DNS.

Common Resolutions:
  • Verify that the hostname/IP address is correct.
  • Check the DNS to make sure the hostname resolves properly.

Symptom: Port number is incorrect.

Common Causes:
  • No port number has been specified.
  • Port number is incorrect.
  • Port number is not numeric.

Common Resolutions:
  • Verify that the port number is correct.
  • If you don't have a port number, determine whether your cluster has been enabled for high availability. If it has, then you do not need a port number.

Symptom: Can't connect.

Common Causes:
  • Firewall is a barrier to connecting.
  • Other networking issues are occurring.

Common Resolutions:
  • Verify that a firewall is not impeding the connection and that there aren't other network issues.
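
The three connection checks above (hostname resolution, a numeric port, and a reachable service) can be tried from the machine that runs Pentaho. The following is a minimal Python sketch using only the standard library; the function names are hypothetical, and a failed TCP connection does not tell you whether a firewall or another network issue is the cause.

```python
import socket


def parse_port(text):
    """Return the port as an int, or None if it is missing, non-numeric, or out of range."""
    text = (text or "").strip()
    if not (text.isascii() and text.isdigit()):
        return None
    port = int(text)
    return port if 1 <= port <= 65535 else None


def resolves(hostname):
    """Return True if the hostname (or IP address) resolves to at least one address."""
    if not hostname:
        return False
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except (OSError, UnicodeError):
        return False


def can_connect(hostname, port, timeout=5.0):
    """Return True if a TCP connection to hostname:port succeeds within the timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except (OSError, UnicodeError):
        return False
```

Run the checks in order: if resolves fails, fix the hostname or DNS first; if parse_port returns None, fix the port; only then does the result of can_connect say anything about firewalls or other networking issues.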

Directory Access or Permissions Issues

Symptom: Can't access directory.

Common Causes:
  • Authorization and/or authentication issues.
  • Directory is not on the cluster.

Common Resolutions:
  • Make sure the user has been granted read, write, and execute access to the directory.
  • Ensure security settings for the cluster and shim allow access.
  • Verify that the hostname and port number are correct for the Hadoop File System's namenode.

Symptom: Can't create, read, update, or delete files or directories.

Common Causes:
  • Authorization and/or authentication issues.

Common Resolutions:
  • Make sure the user has been granted execute access to the directory.
  • Ensure security settings for the cluster and shim allow access.
  • Verify that the hostname and port number are correct for the Hadoop File System's namenode.

Symptom: Test file cannot be overwritten.

Common Causes:
  • Pentaho test file is already in the directory.

Common Resolutions:
  • A file with the same name as the Pentaho test file is already in the directory. The test file is used to make sure that the user can create, write, and delete in the user's home directory.
  • The test was run, but the file was not deleted. You will need to manually delete the test file. Check the log for the test file name.
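
One way to check the read, write, and execute access mentioned above is to look at the permission string that hdfs dfs -ls -d prints for the directory, for example drwxr-x---. The following minimal Python sketch reports which of the three permissions are missing for the owner, group, or other class. It is an illustration only: the function name is hypothetical, it ignores ACLs and superuser rights, and you must decide yourself which class applies to the Pentaho user.

```python
def missing_access(mode, who, needed="rwx"):
    """Return the permissions in `needed` that `mode` does not grant to `who`.

    mode is a 10-character string such as 'drwxr-x---'.
    who is 'owner', 'group', or 'other'.
    """
    offsets = {"owner": 1, "group": 4, "other": 7}
    if who not in offsets:
        raise ValueError("who must be 'owner', 'group', or 'other'")
    if len(mode) < 10:
        raise ValueError("expected a permission string like 'drwxr-x---'")
    start = offsets[who]
    triplet = mode[start:start + 3]
    granted = set()
    if triplet[0] == "r":
        granted.add("r")
    if triplet[1] == "w":
        granted.add("w")
    # A lowercase 't' in the last position means sticky bit plus execute.
    if triplet[2] in ("x", "t"):
        granted.add("x")
    return [flag for flag in needed if flag not in granted]
```

For the string drwxr-x---, missing_access reports nothing missing for the owner, write access missing for the group, and all three permissions missing for other users.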

Oozie Issues

Symptom: Can't connect to Oozie.

Common Causes:
  • Firewall issue.
  • Other networking issues.
  • Oozie URL is incorrect.

Common Resolutions:
  • Verify that the Oozie URL was correctly entered.
  • Verify that a firewall is not impeding the connection.
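
Before looking at firewalls, it can help to confirm that the Oozie URL is at least well formed. The following minimal Python sketch parses a URL with the standard library and lists obvious problems. The function name is hypothetical, and the check says nothing about whether an Oozie server is actually listening at that address.

```python
from urllib.parse import urlsplit


def oozie_url_problems(url):
    """Return a list of obvious problems with the URL; an empty list means none were found."""
    problems = []
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return ["URL could not be parsed"]
    if parts.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if not parts.hostname:
        problems.append("hostname is missing")
    try:
        # Accessing .port raises ValueError for a non-numeric or out-of-range port.
        parts.port
    except ValueError:
        problems.append("port is not a valid number")
    return problems
```

For example, a URL such as http://oozie.example.com:11000/oozie (a placeholder host) produces an empty list, while the same text without the http:// prefix is reported as having a bad scheme.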

ZooKeeper Problems

Symptom: Can't connect to ZooKeeper.

Common Causes:
  • Firewall is hindering connection with the ZooKeeper service.
  • Other networking issues.

Common Resolutions:
  • Verify that a firewall is not impeding the connection.

Symptom: ZooKeeper hostname or port not found or doesn't resolve properly.

Common Causes:
  • Hostname/IP address or port number is missing or incorrect.

Common Resolutions:
  • Try to connect to the ZooKeeper nodes using ping or another method.
  • Verify that the hostname/IP address and port numbers are correct.
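
As an alternative to ping, the node list itself can be checked and each node tried over TCP. The following minimal Python sketch assumes the nodes are written as a comma-separated host:port list and that a node without a port uses 2181, ZooKeeper's default client port; both are assumptions about your setup, and the function names are hypothetical.

```python
import socket

# Assumption: nodes without an explicit port use ZooKeeper's default client port.
DEFAULT_ZK_PORT = 2181


def parse_quorum(quorum, default_port=DEFAULT_ZK_PORT):
    """Split 'host1:2181,host2' into (host, port) pairs; raise ValueError on bad entries."""
    nodes = []
    for entry in quorum.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port_text = entry.partition(":")
        if not host:
            raise ValueError("missing hostname in %r" % entry)
        if sep:
            if not (port_text.isascii() and port_text.isdigit()):
                raise ValueError("port is not numeric in %r" % entry)
            port = int(port_text)
            if not 1 <= port <= 65535:
                raise ValueError("port is out of range in %r" % entry)
        else:
            port = default_port
        nodes.append((host, port))
    return nodes


def unreachable_nodes(nodes, timeout=3.0):
    """Return the (host, port) pairs that do not accept a TCP connection."""
    failed = []
    for host, port in nodes:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except (OSError, UnicodeError):
            failed.append((host, port))
    return failed
```

A ValueError from parse_quorum points to a typing mistake in the hostname or port. Nodes returned by unreachable_nodes parsed correctly but could not be reached, which points to DNS, firewall, or other networking issues.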