After connecting to most data sources, a query needs to be created and assigned a name to retrieve your data. This named query can be assigned to retrieve data for the master report, sub report, or parameter selections. A JDBC data source is used to access data from relational databases using the SQL query language. Pentaho also provides two data modeling layers, Pentaho metadata and Pentaho Analysis. The Pentaho metadata layer focuses on reporting. The other layer focuses on analysis (OLAP). Pentaho metadata relies on Pentaho’s MQL query language where Pentaho Analysis uses Microsoft Multidimensional Expression (MDX) query language.
The default setting for the query is to pull from session-based cache. If you do not want your query to use session-based cache, go to the Master Report tab then the Attributes tab to change the data-cache field to False so that every time the query is run or the report opens, the query will refresh.
Hadoop Hive-Specific SQL Limitations
A few of the following key limitations in Hive prevent some regular Metadata Editor features from working as intended, and limit the structure of your SQL queries in Report Designer:
- Outer joins are not supported.
- Each column can only be used once in a SELECT clause. Duplicate columns in SELECT statements cause errors.
- Conditional joins can only use the = conditional unless you use a WHERE clause. Any non-equal conditional in a FROM statement forces the Metadata Editor to use a Cartesian join and a WHERE clause conditional to limit it.