Once you have all of your business tables created, you must create relationships between the tables so that the query generators and SQL generators that work with Pentaho Metadata can create the data queries correctly. This is similar to drawing a relational diagram to show primary and foreign key relationships; however, relational links are not the only relationships that can be modeled. You can create a relationship between any two tables, link any two columns between them and dictate what the relationship is (one to many, many to many, and so on). The important pieces of information to know before you try to create a relationship are:
- What two business tables would you like to associate with this relationship?
- What columns in the business tables identify the relationship?
- What type of relationship is it — one to one, one to many, many to one, and so on?
Create Relationships in the Navigation Pane
To create a new relationship between business tables in the navigation pane, first make sure that the model you want to add this relationship to is selected, and that the Relationships node is visible.
- Right-click Relationships in the navigation pane.
- Select New Relationship The Relationship Properties dialog box appears.
- Select a business table from the From Table/Field list. This is the first relationship
- Select a business table from the To Table/Field list. This steps sets up a relationship between two tables.
- Now, specify the business columns (from the adjacent lists) from each business table that identify this relationship. If the business column names are similar, click Guess Matching Fields and let Pentaho Metadata Editor attempt to determine the columns for you.
- Define the relationship from the Relationship drop down list. The relationships that can be chosen are defined with examples:
Relationship Description 1:N A one-to-many mandatory relationship is the most common relationship in databases. The primary key table contains only one record that relates to none, one, or many records in the related table. This relationship is similar to the one between you and one of your parents. You have one mother, but your mother may have several children. N:1 A many-to-one is opposite of one to many (1:N) relationship. 1:1 In a one-to-one relationship, both tables are limited to one record only on either side of the relationship. Each primary key value relates to a single record, or no record, in the associated table. They are like spouses — you may be married, or not; however, if you are married, both you and your spouse can have only one partner. Most one-to-one relationships are forced by business rules. If you do not have a business rule, you can, in most cases, combine both tables into one table without breaking normalization rules. 0:N A zero to many optional relationship indicates that a person may have no phone, one phone, or many phones, and that the phone may not be "owned," but can only be owned by a maximum of one person. N:0 Opposite of a zero to many relationship 0:1 A zero to one relationship might indicate that a person may be a programmer, but a programmer must be a person. It is assumed that the mandatory side of the relationship is the dominant. 1:0 Opposite of a zero to one relationship N:N In a many to many relationship each record in both tables can relate to an unlimited number of records (or no records) in the other table. For example, if you have many siblings, your siblings also have many siblings. Many-to-many relationships must have a third table, referred to as an associate or linking table, because relational systems cannot accommodate the relationship directly. 0:0 A zero to zero optional relationship indicates that a person may occupy one parking space, but that a person is not necessary to have a space and a space does not need to have a person.
- Click OK when you are done. You should see a new relationship line drawn between the two tables on the Editor Graph, and the relationship represented in the navigation pane.
Important: Note that complex joins appear in the WHERE clause of the SQL statement, so currently any joining that takes place in the FROM clause of the SQL statement is not supported. An example of a complex join might be TABLE_A.COL_A=TABLE_B.COL_A AND TABLE_A.COL_B=TABLE_B.COL_B. This represents a join of two tables based on two key columns versus a single join column. Also note, the complex join expression provided must use the names of the physical tables and physical columns, not business tables and business column names.
Create Relationships Using the Editor Graph
In the Editor Graph, creating a new relationship is somewhat simplified because you select the two business tables on the canvas, and the Relationship Properties dialog box is pre-populated with your selections. Before you start, make sure that the model you want to add a relationship to is selected, and that the business tables are displayed in the Editor Graph.
- Select the two business tables you want to include in the new relationship, either by clicking and dragging a marquee around the tables or by holding the <SHIFT+CTRL> keys, then clicking on the tables.
- Once your business tables are selected, right-click on the selection. Click Add Relationship... in the popup menu.
- When the Relationship Properties dialog box appears, continue following steps 3 through 7 in Creating Relationships in the Navigation Pane to finish creating the relationship.
Hadoop Hive-Specific SQL Limitations
There are a few key limitations in Hive that prevent some regular Metadata Editor features from working as intended, and limit the structure of your SQL queries in Report Designer:
- Outer joins are not supported.
- Each column can only be used once in a SELECT clause. Duplicate columns in SELECT statements cause errors.
- Conditional joins can only use the = conditional unless you use a WHERE clause. Any non-equal conditional in a FROM statement forces the Metadata Editor to use a cartesian join and a WHERE clause conditional to limit it. This is not much of a limitation, but it may seem unusual to experienced Metadata Editor users who are accustomed to working with SQL databases.