Analytic Query Support

Overview

Elide includes a semantic modeling layer and analytic query API for OLAP style queries against your database.

A semantic model is the view of the data you want your users to understand. It is typically non-relational (for simplicity) and consists of concepts like tables, measures, and dimensions. End users refer to these concepts by name only (they are not expected to derive formulas or know about the physical storage or serialization of data).

A virtual semantic layer maps a semantic model to columns and tables in a physical database. Elide’s virtual semantic layer accomplishes this mapping through a Hjson configuration language. Hjson is a human friendly adaptation of JSON that allows comments and a relaxed syntax among other features. Elide’s virtual semantic layer includes the following information:

  • The defintions of tables, measures, and dimensions you want to expose to the end user.
  • Metadata like descriptions, categories, and tags that better describe and label the semantic model.
  • For every table, measure, and dimension, a SQL fragment that maps it to the physical data. These fragements are used by elide to generate native SQL queries against the target database.

Elide leverages the AggregationDataStore store to expose the read-only models defined in the semantic model. Model attributes represent either metrics (for aggregating, filtering, and sorting) and dimensions (for grouping, filtering, and sorting). Models exposed through the aggregation store are flat and do not contain relationships to other models.

The Aggregation store includes a companion store, the MetaDataStore, which exposes metadata about the Aggregation store models including their metrics and dimensions. The metadata store models are predefined, read-only, and served from server memory.

There are two mechanisms to create models in the Aggregation store’s semantic layer:

  1. Through Hjson configuration files that can be maintained without writing code or rebuilding the application.
  2. Through JVM language classes annotated with Elide annotations.

The former is preferred for most use cases because of better ergonomics for non-developers. The latter is useful to add custom Elide security rules or life cycle hooks.

With the introduction of the Aggregation store, Elide now integrates with Yavin - a companion UI framework that provides data visualization and analytics.

Querying

Models managed by the AggregationDataStore can be queried via JSON-API or GraphQL similar to other Elide models. There are a few important distinctions:

  1. If one or more metrics are included in the query, every dimension will be used to aggregate the selected metrics.
  2. If only dimensions (no metrics) are included in the query, Elide will return a distinct list of the requested dimension value combinations.
  3. Every elide model includes an ID field. The ID field returned from aggregation store models is not a true identifier. It represents the row number from a returned result. Attempts to load the model by its identifier will result in an error.

Analytic Queries

Similar to other Elide models, analytic models can be sorted, filtered, and paginated. A typical analytic query might look like:

/playerStats?fields[playerStats]=highScore,overallRating,countryIsoCode&sort=highScore
{
  playerStats(sort: "highScore") {
    edges {
      node {
        highScore 
        overallRating 
        countryIsoCode
      }
    }
  }
}

Conceptually, these queries might generate SQL similar to:

SELECT MAX(highScore), overallRating, countryIsoCode FROM playerStats GROUP BY overallRating, countryIsoCode ORDER BY MAX(highScore) ASC;

Here are the respective responses:

{
    "data": [
        {
            "type": "playerStats",
            "id": "0",
            "attributes": {
                "countryIsoCode": "HKG",
                "highScore": 1000,
                "overallRating": "Good"
            }
        },
        {
            "type": "playerStats",
            "id": "1",
            "attributes": {
                "countryIsoCode": "USA",
                "highScore": 1234,
                "overallRating": "Good"
            }
        },
        {
            "type": "playerStats",
            "id": "2",
            "attributes": {
                "countryIsoCode": "USA",
                "highScore": 2412,
                "overallRating": "Great"
            }
        }
    ]
}
  {
    "data": {
        "playerStats": {
            "edges": [
                {
                    "node": {
                        "highScore": 1000,
                        "overallRating": "Good",
                        "countryIsoCode": "HKG"
                    }
                },
                {
                    "node": {
                        "highScore": 1234,
                        "overallRating": "Good",
                        "countryIsoCode": "USA"
                    }
                },
                {
                    "node": {
                        "highScore": 2412,
                        "overallRating": "Great",
                        "countryIsoCode": "USA"
                    }
                }
            ]
        }
    }
}

Metadata Queries

A full list of available table and column metadata is covered in the configuration section. Metadata can be queried through the table model and its associated relationships. To enable the metadata query APIs, we have to turn on the MetaDataStore. The enableMetaDataStore flag is described here.

/table/playerStats?fields[table]=name,category,description,requiredFilter,tags,metrics,dimensions,timeDimensions
{
  table(ids: ["playerStats"]) {
    edges {
      node {
        name 
        category 
        description 
        requiredFilter 
        tags
        metrics {edges {node {id}}} 
        dimensions {edges {node {id}}} 
        timeDimensions {edges {node {id}}}
      }
    }
  }
}

Here are the respective responses:

{
    "data": {
	"type": "table",
	"id": "playerStats",
	"attributes": {
	    "category": "Sports Category",
	    "description": "Player Statistics",
	    "name": "playerStats",
	    "requiredFilter": "",
	    "tags": [
		"Game",
		"Statistics"
	    ]
	},
	"relationships": {
	    "dimensions": {
		"data": [
		    {
			"type": "dimension",
			"id": "playerStats.id"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.playerName"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.player2Name"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.playerLevel"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.overallRating"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.countryIsInUsa"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.countryIsoCode"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.countryUnSeats"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.countryNickName"
		    },
		    {
			"type": "dimension",
			"id": "playerStats.subCountryIsoCode"
		    }
		]
	    },
	    "metrics": {
		"data": [
		    {
			"type": "metric",
			"id": "playerStats.lowScore"
		    },
		    {
			"type": "metric",
			"id": "playerStats.highScore"
		    },
		    {
			"type": "metric",
			"id": "playerStats.highScoreNoAgg"
		    }
		]
	    },
	    "timeDimensions": {
		"data": [
		    {
			"type": "timeDimension",
			"id": "playerStats.updatedDate"
		    },
		    {
			"type": "timeDimension",
			"id": "playerStats.recordedDate"
		    }
		]
	    }
	}
   }
}
{
    "data": {
	"table": {
	    "edges": [
		{
		    "node": {
			"name": "playerStats",
			"category": "Sports Category",
			"description": "Player Statistics",
			"requiredFilter": "",
			"tags": [
			    "Game",
			    "Statistics"
			],
			"metrics": {
			    "edges": [
				{
				    "node": {
					"id": "playerStats.highScoreNoAgg"
				    }
				},
				{
				    "node": {
					"id": "playerStats.lowScore"
				    }
				},
				{
				    "node": {
					"id": "playerStats.highScore"
				    }
				}
			    ]
			},
			"dimensions": {
			    "edges": [
				{
				    "node": {
					"id": "playerStats.countryUnSeats"
				    }
				},
				{
				    "node": {
					"id": "playerStats.overallRating"
				    }
				},
				{
				    "node": {
					"id": "playerStats.countryNickName"
				    }
				},
				{
				    "node": {
					"id": "playerStats.player2Name"
				    }
				},
				{
				    "node": {
					"id": "playerStats.countryIsoCode"
				    }
				},
				{
				    "node": {
					"id": "playerStats.playerName"
				    }
				},
				{
				    "node": {
					"id": "playerStats.playerLevel"
				    }
				},
				{
				    "node": {
					"id": "playerStats.id"
				    }
				},
				{
				    "node": {
					"id": "playerStats.countryIsInUsa"
				    }
				},
				{
				    "node": {
					"id": "playerStats.subCountryIsoCode"
				    }
				}
			    ]
			},
			"timeDimensions": {
			    "edges": [
				{
				    "node": {
					"id": "playerStats.recordedDate"
				    }
				},
				{
				    "node": {
					"id": "playerStats.updatedDate"
				    }
				}
			    ]
			}
		    }
		}
	    ]
	}
   }
}

Configuration

Feature Flags

There are feature flags that enable Hjson configuration, analytic queries, and Metadata queries respectively:

Configuration Description Default
dynamic-config.enabled Enable model creation through the Hjson configuration files. false
aggregation-store.enabled Enable support for data analytic queries. false
aggregation-store.enableMetaDataStore Enable the metadata query APIs exposing the metadata about the Aggregation store models including their metrics and dimensions. false
# application.yaml
elide:
  dynamic-config:
    enabled: true
  aggregation-store:
    enabled: true
    enableMetaDataStore: true
//Override ElideStandaloneSettings

@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
    return new ElideStandaloneAnalyticSettings() {
        @Override
        public boolean enableDynamicModelConfig() {
            return true;
        }
        @Override
        public boolean enableAggregationDataStore() {
            return true;
        }
        @Override
        public boolean enableMetaDataStore() {
            return true;
        }
    };
}

File Layout

Analtyic model configuration can either be specified through JVM classes decorated with Elide annotations or Hjson configuration files. Hjson configuration files can be sourced either from the local filesystem or the classpath. If Hjson configuration is found in the classpath, the filesystem is ignored. All Hjson configuration must conform to the following directory structure:

CONFIG_ROOT/
  ├── models/
  |  ├── tables/
  |  |  ├── model1.hjson
  |  |  ├── model2.hjson
  |  ├── security.hjson
  |  └── variables.hjson
  ├── db/
  |  ├── sql/
  |  |  ├── db1.hjson
  |  ├── variables.hjson
  1. Analytic model files are stored in /models/tables. Multiple models can be grouped together into a single file.
  2. Security rules are stored in /models/security.hjson.
  3. Model and security Hjson files support variable substitution with variables defined in /models/variables.hjson.
  4. Data source configurations are stored in /db/sql. Multiple configurations can be grouped together into a single file.
  5. Data source Hjson files support variable substitution with variables defined in /db/variables.hjson.

CONFIG_ROOT can be any directory in the filesystem or classpath. The root configuration location can be set as follows:

# application.yaml

elide:
  dynamic-config:
    path: src/resources/configs
//Override ElideStandaloneSettings

@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
    return new ElideStandaloneAnalyticSettings() {
        @Override
        public String getDynamicConfigPath() {
            return File.separator + "configs" + File.separator;
        }
    };
}

Data Source Configuration

The Aggregation Data Store does not leverage JPA, but rather uses JDBC directly. By default, Elide will leverage the default JPA configuration for establishing connections through the Aggregation Data Store. However, more complex configurations are possible including:

  1. Using a different JDBC data source other than what is configured for JPA.
  2. Leveraging multiple JDBC data sources for different Elide models.

For these complex configurations, you must configure Elide using the Aggregation Store’s Hjson configuration language. The following configuration file illustrates two data sources. Each data source configuration includes:

  1. A name that will be referenced in your Analytic models (effectively binding them to a data source).
  2. A JDBC URL
  3. A JDBC driver
  4. A user name
  5. An Elide SQL Dialect. This can either be the name of an Elide supported dialect or it can be the fully qualified class name of an implementation of an Elide dialect.
  6. A map of driver specific properties.
{
  dbconfigs:
  [
    {
      name: Presto Data Source
      url: jdbc:presto://localhost:4443/testdb
      driver: com.facebook.presto.jdbc.PrestoDriver
      user: guestdb2
      dialect: PrestoDB
    }
    {
      name: Hive Data Source
      url: jdbc:hive2://localhost:4444/dbName
      driver: org.apache.hive.jdbc.HiveDriver
      user: guestmysql
      dialect: com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.HiveDialect
      propertyMap:
      {
        sslEnabled : true
      }
    }
  ]
}

By default, Elide uses HikariCP’s DataSource for JDBC connection pool. A custom DataSourceConfiguration can be configured by the following override:

//Override the following bean:

@Bean
public DataSourceConfiguration getDataSourceConfiguration() {
    return new DataSourceConfiguration() {
        @Override
        public DataSource getDataSource(DBConfig dbConfig, DBPasswordExtractor dbPasswordExtractor) {
            HikariConfig config = new HikariConfig();

            config.setJdbcUrl(dbConfig.getUrl());
            config.setUsername(dbConfig.getUser());
            config.setPassword(dbPasswordExtractor.getDBPassword(dbConfig));
            config.setDriverClassName(dbConfig.getDriver());
            dbConfig.getPropertyMap().forEach((k, v) -> config.addDataSourceProperty(k, v));

            return new HikariDataSource(config);
        }
    };
}
//Override ElideStandaloneSettings:

@Override
public DataSourceConfiguration getDataSourceConfiguration() {
    return new DataSourceConfiguration() {
        @Override
        public DataSource getDataSource(DBConfig dbConfig, DBPasswordExtractor dbPasswordExtractor) {
            HikariConfig config = new HikariConfig();

            config.setJdbcUrl(dbConfig.getUrl());
            config.setUsername(dbConfig.getUser());
            config.setPassword(dbPasswordExtractor.getDBPassword(dbConfig));
            config.setDriverClassName(dbConfig.getDriver());
            dbConfig.getPropertyMap().forEach((k, v) -> config.addDataSourceProperty(k, v));

            return new HikariDataSource(config);
        }
    };
}

Data Source Passwords

Data source passwords are provided out of band by implementing a DBPasswordExtractor:

public interface DBPasswordExtractor {
    String getDBPassword(DBConfig config);
}

A custom DBPasswordExtractor can be configured by the following override:

//Override the following bean:

@Bean
public DBPasswordExtractor getDBPasswordExtractor() {
    return new DBPasswordExtractor() {
        @Override
        public String getDBPassword(DBConfig config) {
            return StringUtils.EMPTY;
        }
    };
}
//Override ElideStandaloneSettings:

@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
    return new ElideStandaloneAnalyticSettings() {
        @Override
        public DBPasswordExtractor getDBPasswordExtractor() {
            return new DBPasswordExtractor() {
                @Override
                public String getDBPassword(DBConfig config) {
                    return StringUtils.EMPTY;
                }
            };
        }
    };
}

Dialects

A dialect must be configured for Elide to correctly generate analytic SQL queries. Elide supports the following dialects out of the box:

Friendly Name Class
H2 com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.H2Dialect
Hive com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.HiveDialect
PrestoDB com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.PrestoDBDialect
Postgres com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.PostgresDialect
MySQL com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.MySQLDialect
Druid com.yahoo.elide.datastores.aggregation.queryengines.sql.dialects.impl.DruidDialect

If not leveraging Hjson configuration, a default dialect can be configured for analytic queries:

# application.yaml

elide:
  aggregation-store:
    default-dialect: H2
//Override ElideStandaloneSettings

@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
    return new ElideStandaloneAnalyticSettings() {
        @Override
        public String getDefaultDialect() {
            return "Hive";
        }
    };
}

Model Configuration

Concepts

Elide exposes a virtual semantic model of tables and columns that represents a data warehouse. The virtual semantic model can be mapped to one or more physical databases, tables, and columns through configuration by a data analyst. The analyst maps virtual tables and columns to fragments of native SQL queries that are later assembled into complete SQL statements at query time.

Analytic models are called Tables in Elide. They are made up of:

  1. Metrics - Numeric columns that can be aggregated, filtered on, and sorted on.
  2. Dimensions - Columns that can be grouped on, filtered on, and sorted on.
  3. TimeDimension - A type of Dimension that represents time. Time dimensions are tied to grain (a period) and a timezone.
  4. Columns - The supertype of Metrics, Dimensions, and TimeDimensions. All columns share a set of common metadata.
  5. Joins - Even though Elide analytic models are flat (there are no relationships to other models), individual model columns can be sourced from multiple physical tables. Joins provide Elide the information it needs to join other database tables at query time to compute a given column.

Some metrics have FunctionArguments. They represent parameters that are supplied by the client to change how the metric is computed.

Example Configuration

{
  tables: [{
      name: PlayerStats
      table: playerStats
      dbConnectionName: Presto Data Source
      friendlyName: Player Stats
      description:
      '''
      A long description
      '''
      category: Sports
      cardinality : large
      readAccess : '(user AND member) OR (admin.user AND NOT guest user)'
      filterTemplate : createdOn>={{start}};createdOn<{{end}}
      isFact : true
      tags: ['Game', 'Player']
      joins: [
          {
              name: playerCountry
              to: PlayerCountry
              kind: toOne
              type: left
              definition: '{{playerCountry.id}} = {{country_id}}'
          }
      ]
      measures : [
          {
          name : highScore
          type : INTEGER
          definition: 'MAX({{highScore}})'
          friendlyName: High Score
          }
      ]
      dimensions : [
          {
              name : name
              type : TEXT
              definition : '{{name}}'
              cardinality : large
          },
          {
              name : countryCode
              type : TEXT
              definition : '{{playerCountry.isoCode}}'
              friendlyName: Country Code
          },
          {
              name : gameType
              type : TEXT
              definition : '{{game_type}}'
              friendlyName: Game Type
          },
          {
              name : gameOn
              type : TIME
              definition : '{{game_on}}'
              grains:
              [
                  {
                      type: MONTH
                      sql: PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM'), 'yyyy-MM')
                  },
                  {
                      type: DAY
                      sql: PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')
                  },
                  {
                      type: SECOND
                      sql: PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss')
                  }
              ]
          },
          {
              name : createdOn
              type : TIME
              definition : '{{created_on}}'
              grains:
              [{
                  type :  DAY
                  sql :  '''
                  PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')
                  '''
              }]
          },
          {
              name : updatedOn
              type : TIME
              definition : '{{updated_on}}'
              grains:
              [{
                  type :  MONTH
                  sql :  '''
                  PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM'), 'yyyy-MM')
                  '''
              }]
          }
      ]
  }]
}
@Include
@VersionQuery(sql = "SELECT COUNT(*) from playerStats")
@EqualsAndHashCode
@ToString
@FromTable(name = "playerStats", dbConnectionName = "Presto Data Source")
@TableMeta(description = "A long description", category = "Sports", tags = {"Game", "Player"}, filterTemplate = "createdOn>={{start}};createdOn<{{end}}", size = CardinalitySize.LARGE, friendlyName = "Player Stats")
@ReadPermission(expression = "(user AND member) OR (admin.user AND NOT guest user)")
public class PlayerStats {

    public static final String DATE_FORMAT = "PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')";
    public static final String YEAR_MONTH_FORMAT = "PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM'), 'yyyy-MM')";

    @Id
    private String id;

    @MetricFormula("MAX({{highScore}})")
    @ColumnMeta(friendlyName = "High Score")
    private long highScore;

    @ColumnMeta(size = CardinalitySize.LARGE)
    private String name;

    @Join("{{country_id}} = {{playerCountry.id}}", type = JoinType.LEFT)
    private Country playerCountry;

    @DimensionFormula("{{playerCountry.isoCode}}")
    @ColumnMeta(friendlyName = "Country Code")
    private String countryCode;

    @DimensionFormula("{{game_type}}")
    @ColumnMeta(friendlyName = "Game Type")
    private String gameType;

    @Temporal(grains = {
          @TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT),
          @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT)
    }, timeZone = "UTC")
    @DimensionFormula("{{game_on}}")
    private Time gameOn;

    @Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT) }, timeZone = "UTC")
    @DimensionFormula("{{created_on}}")
    private Time createdOn;

    @Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT) }, timeZone = "UTC")
    @DimensionFormula("{{updated_on}}")
    private Time updatedOn;
}

Tables

Tables must source their columns from somewhere. There are three, mutually exclusive options:

  1. Tables can source their columns from a physical table.
  2. Tables can source their columns from a SQL subquery.
  3. Tables can extend (override or add columns to) an existing Table. More details can be found here.

These options are configured via the ‘table’, ‘sql’, and ‘extend’ properties.

Table Properties

Tables include the following properties:

Hjson Property Explanation Example Hjson Value Annotation/Java Equivalent
name The name of the elide model. It will be exposed through the API with this name. tableName @Include(type="tableName")
version If leveraging Elide API versions, the API version associated with this model. 1.0 @ApiVersion(version="1.0")
friendlyName The friendly name for this table. Unicode characters are supported. ‘Player Stats’ @TableMeta(friendlyName="Player Stats")
description A description of the table. ‘A description for tableName’ @TableMeta(description="A description for tableName")
category A free-form text category for the table. ‘Some Category’ @TableMeta(category="Some Category")
tags A list of free-form text labels for the table. [‘label1’, ‘label2’] @TableMeta(tags={"label1","label2"})
cardinality tiny, small, medium, large, huge - A hint about the number of records in the table. small @TableMeta(size=CardinalitySize.SMALL)
dbConnectionName The name of the physical data source where this table can be queried. This name must match a data source configuration name. MysqlDB @FromTable(dbConnectionName="MysqlDB")
schema The database schema where the physical data resides schemaName @FromTable(name=schemaName.tableName)
table Exactly one of table, sql, and extend must be provided. Provides the name of the physical base table where data will be sourced from. tableName @FromTable(name=tableName)
sql Exactly one of table, sql, and extend must be provided. Provides a SQL subquery where the data will be sourced from. ‘SELECT foo, bar FROM blah;’ @FromSubquery(sql="SELECT foo, bar FROM blah;")
extend Exactly one of table, sql, and extend must be provided. This model extends or inherits from another analytic model. tableName class Foo extends Bar
readAccess An elide permission rule that governs read access to the table. ‘member and admin.user’ @ReadPermission(expression="member and admin.user")
filterTemplate An RSQL filter expression template that must directly match or be included in the client provided filter. countryIsoCode=={{code}} @TableMeta(filterTemplate=”countryIsoCode=={{code}}”)
hidden The table is not exposed through the API. true @Exclude
isFact Is the table a fact table. Models annotated using FromTable or FromSubquery or TableMeta or configured through Hjson default to true unless marked otherwise. Yavin will use this flag to determine which tables can be used to build reports. true @TableMeta(isFact=false)

Columns

Columns are either measures, dimensions, or time dimensions. They all share a number of common properties. The most important properties are:

  1. The name of the column.
  2. The data type of the column.
  3. The definition of the column.

Column definitions are templated, native SQL fragments. Columns definitions can include references to other column definitions or physical column names that are expanded at query time. Any part of the column definition enclosed in double curly braces ({{foo}}) is interpreted either as:

  • Another column in the current table (assuming the parameter matches another column name in the table).
  • A column in the underlying physical table (assuming either the parameter does not match any columns in the current table or it matches the current column name).
  • Another column in a different table. The parameter is a dot (‘.’) separated path where each segment of the path represents a join to another table (denoted by the join name) ending with the destination column name ({{player.team.name}}).

Column expressions can be defined in Hjson or Java:

{
  measures : [
    {
      name : highScore
      type : INTEGER
      definition: 'MAX({{highScore}})'
    }
  ]
  dimensions : [
    {
      name : name
      type : TEXT
      definition : '{{name}}'
    },
    {
      name : countryCode
      type : TEXT
      definition : '{{playerCountry.isoCode}}'
    }
  ]
}
//A Dimension
@DimensionFormula("CASE WHEN {{name}}  = 'United States' THEN true ELSE false END")
private boolean inUsa;

//A metric
@MetricFormula("{{wins}} / {{totalGames}}  * 100")
private float winRatio;

Column Properties

Columns include the following properties:

Hjson Property Explanation Example Hjson Value Annotation/Java Equivalent
name The name of the column. It will be exposed through the API with this name. columnName String columnName;
friendlyName The friendly name for this column to be displayed in the UI. ‘Country Code’ @ColumnMeta(friendlyName = "Country Code")
description A description of the column. ‘A description for columnA’ @ColumnMeta(description="A description for columnA")
category A free-form text category for the column. ‘Some Category’ @ColumnMeta(category="Some Category")
tags A list of free-form text labels for the column. [‘label1’, ‘label2’] @ColumnMeta(tags={"label1","label2"})
cardinality tiny, small, medium, large, huge - A hint about the dimension’s cardinality. small @ColumnMeta(size=CardinalitySize.SMALL)
readAccess An elide permission rule that governs read access to the column. ‘admin.user’ @ReadPermission(expression="admin.user")
definition A SQL fragment that describes how to generate the column. MAX({{sessions}}) @DimensionFormula(“CASE WHEN {{name}} = ‘United States’ THEN true ELSE false END”)
type The data type of the column. One of ‘INTEGER’, ‘DECIMAL’, ‘MONEY’, ‘TEXT’, ‘COORDINATE’, ‘BOOLEAN’ ‘BOOLEAN’ String columnName;
hidden The column is not exposed through the API. true @Exclude

Non-time dimensions include the following properties that describe where a discrete list of values can be sourced from (for type-ahead search or other uses) :

Hjson Property Explanation Example Hjson Value Annotation/Java Equivalent
values An optional enumerated list of dimension values for small cardinality dimensions [‘Africa’, ‘Asia’, ‘North America’] @ColumnMeta(values = {"Africa", "Asia", "North America")
tableSource The semantic table and column names where to find the values (tableName.columnName). continent.name @ColumnMeta(tableSource = "continent.name")

Time Dimensions & Time Grains

Time dimensions represent time and include one or more time grains. The time grain determines how time is represented as text in query filters and query results. Supported time grains include:

Grain Text Format
SECOND “yyyy-MM-dd HH:mm:ss”
MINUTE “yyyy-MM-dd HH:mm”
HOUR “yyyy-MM-dd HH”
DAY “yyyy-MM-dd”
WEEK “yyyy-MM-dd”
ISOWEEK “yyyy-MM-dd”
MONTH “yyyy-MM”
QUARTER “yyyy-MM”
YEAR “yyyy”

When defining a time dimension, a native SQL expression may be provided with the grain to convert the underlying column (represented as {{}}) to its expanded SQL definition:

{
    name : createdOn
    type : TIME
    definition : "FORMATDATETIME({{createdOn}}, 'yyyy-MM')"
    grains:
    [{
        type :  MONTH
        sql :  '''
        PARSEDATETIME({{}}, 'yyyy-MM')
        '''
    }]
}
public static final String DATE_FORMAT = "PARSEDATETIME({{}}, 'yyyy-MM')";


@Temporal(grains = {
    @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = DATE_FORMAT)
}, timeZone = "UTC")
@DimensionFormula("FORMATDATETIME({{createdOn}}, 'yyyy-MM')")
private Date createdOn;

Elide would expand the above example to this SQL fragment: PARSEDATETIME(FORMATDATETIME(createdOn, 'yyyy-MM'), 'yyyy-MM').

Time grain definitions are optional and default to type ‘DAY’ and an empty SQL expression.

Joins

Table joins allow column expressions to reference fields from other tables. At query time, if a column requires a join, the join will be added to the generated SQL query. Each table configuration can include zero or more join definitions:

joins: [
    {
       name: playerCountry
       to: country
       kind: toOne
       type: left
       definition: '{{country_id}} = {{playerCountry.id}}'
    },
    {
       name: playerTeam
       to: team
       kind: toMany
       type: full
       definition: '{{team_id}} = {{playerTeam.id}}'
    }
]
private Country country;
private Team team;

@Join("{{country_id}} = {{playerCountry.id}}", type = JoinType.LEFT)
public Country getCountry() {
    return country;
}

@Join("{{team_id}} = {{playerTeam.id}}", type = JoinType.FULL)
public Team getTeam() {
    return team;
}

Join Properties

Each join definition includes the following properties:

Hjson Property Explanation
name A unique name for the join. The name can be referenced in column definitions.
to The name of the Elide model being joined against. This can be a semantic model or a CRUD model.
kind ‘toMany’ or ‘toOne’ (Default: toOne)
type ‘left’, ‘inner’, ‘full’ or ‘cross’ (Default: left)
definition A templated SQL join expression. See below.

Join Definition

Join definitions are templated SQL expressions that represent the ON clause of a SQL statement:

definition: "{{ orderId}} = {{delivery.orderId}} AND {{ delivery.delivered_on }} > '1970-01-01'"

Column references must be wrapped in curly braces and are replaced at query time with the correctly qualified SQL names. A column reference can either refer to:

  1. A logical column in the current model that should be expanded by its corresponding SQL definition.
  2. A physical column in the current table.
  3. A reference to logical or physical column in the join table. The reference consists of the join name, a period, and finally the column name in the join table.

Inheritance

Tables can extend another existing Table. The following actions can be performed:

  • New columns can be added.
  • Existing columns can be modified.
  • Table properties can be modified.

The Table properties listed below can be inherited without re-declaration. Any Table property not listed below, has to be re-declared.

  • dbConnectionName
  • schema
  • table
  • sql

Unlike Table properties, Column properties are not inherited. When overriding a Column in an extended Table, the column properties have to be redefined.

Hjson inheritance vs Java inheritance

Hjson inheritance and Java inheritance differ in one key way. Hjson inheritance allows the type of a measure or dimension to be changed in the subclassed model. Changing the type of an inherited measure or dimension in Java might generate a compilation error.

Example Extend Configuration

The sample below uses the Example Configuration as its parent model. Let’s assume we are a club that exposes the Player Stats from the intra-squad practice games and the tournament games to coaches using the PlayerStats model. We want to expose the data from the same persistent store to the general public with below differences:

  • Exclude the intra-squad games from highScore calculation.
  • Modify the Grain of game_on column from DAY to YEAR.
  • Accessible by Admins and Guest users.

To avoid the compilation error highlighted above, we will have to write the new JVM class with all the columns and properties instead of inheriting unchanged ones from the Parent model. With the Hjson extend, it will be a few lines of simple changes to inherit from the Parent model without duplication as highlighted in the example below.

{
  tables: [{
      name: TournamentPlayerStats
      extend: PlayerStats
      readAccess : 'admin.user OR guest user'
      measures : [
          {
          name : highScore
          type : INTEGER
          definition: MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)
          }
      ],
      dimensions : [
          {
              name : gameOn
              type : TIME
              definition : '{{game_on}}'
              # Change Type from MONTH, DAY & SECOND to YEAR & MONTH
              grains:
              [
                  {
                      type: YEAR
                      sql: PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy'), 'yyyy')
                  },
                  {
                      type: MONTH
                      sql: PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM'), 'yyyy-MM')
                  }
              ]
          }
      ]
  }]
}
@Include
@VersionQuery(sql = "SELECT COUNT(*) from playerStats")
@EqualsAndHashCode
@ToString
@ReadPermission(expression = "admin.user OR guest user")
@FromTable(name = "playerStats", dbConnectionName = "Presto Data Source")
public class TournamentPlayerStats {

    public static final String DATE_FORMAT = "PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')";
    public static final String YEAR_MONTH_FORMAT = "PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy-MM'), 'yyyy-MM')";
    public static final String YEAR_FORMAT = "PARSEDATETIME(FORMATDATETIME({{}}, 'yyyy'), 'yyyy')";

    @Id
    private String id;

    // Change formula to filter on Tournament Games
    @MetricFormula("MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)")
    @ColumnMeta(friendlyName = "High Score")
    private long highScore;

    @ColumnMeta(size = CardinalitySize.LARGE)
    private String name;

    @Join("{{country_id}} = {{playerCountry.id}}", type = JoinType.LEFT)
    private Country playerCountry;

    @DimensionFormula("{{playerCountry.isoCode}}")
    @ColumnMeta(friendlyName = "Country Code")
    private String countryCode;

    @DimensionFormula("{{game_type}}")
    @ColumnMeta(friendlyName = "Game Type")
    private String gameType;

    @Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT) }, timeZone = "UTC")
    @DimensionFormula("{{updated_on}}")
    private Time updatedOn;

    @Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT) }, timeZone = "UTC")
    @DimensionFormula("{{created_on}}")
    private Time createdOn;

    // Change types of gameOn from Day & Month to Day, Month & Year
    @Temporal(grains = {
          @TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT),
          @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT)
          @TimeGrainDefinition(grain = TimeGrain.YEAR, expression = YEAR_FORMAT)
    }, timeZone = "UTC")
    @DimensionFormula("{{game_on}}")
    private Time gameOn;
}

We can use Java’s inheritance, if the goal does not involve changing the type of columns. Hjson extend will still require a few lines of simple changes.

{
  tables: [{
      name: TournamentPlayerStats
      extend: PlayerStats
      readAccess : 'admin.user OR guest user'
      measures : [
          {
          name : highScore
          type : INTEGER
          definition: MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)
          }
      ]
  }]
}
@Include
@VersionQuery(sql = "SELECT COUNT(*) from playerStats")
@EqualsAndHashCode
@ToString
@ReadPermission(expression = "admin.user OR guest user")
public class TournamentPlayerStats extends PlayerStats {

    // Change formula to filter on Tournament Games
    @MetricFormula("MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)")
    @ColumnMeta(friendlyName = "High Score")
    private long highScore;
}

Security Configuration

The list of available security roles can be defined in the security.hjson file:

{
    roles : [
        admin.user
        guest user
        member
        user
    ]
}

Each role defined generates an Elide user check that extends RoleMemberCheck.

These roles can then be referenced in security rules applied to entire tables or individual columns in their respective Hjson configuration:

readAccess = 'member OR guest user'

The readAccess table and column attribute can also reference Elide checks that are compiled with your application to implement row level security or other more complex security rules.

Variable Substitution

To avoid repeated configuration blocks, all Hjson files (table, security, and data source) support variable substitution. Variables are defined in the variables.hjson file:

{
   foo: [1, 2, 3]
   bar: blah
   hour: hour_replace
   measure_type: MAX
   name: PlayerStats
   table: player_stats
}

The file format is a simple mapping from the variable name to a JSON structure. At server startup, Elide will replace any variable name surrounded by ‘<%’ and ‘%>’ tags with the corresponding JSON structure.

Caching

The Aggregation data store supports a configurable caching strategy to cache query results. More details can be found in the performance section.

Bypassing Cache

Elide JAX-RS endpoints (elide-standalone) and Spring conrollers (Spring) support a Bypass Cache header (‘bypasscache’) that can be set to true for caching to be disabled on a per query basis. If no bypasscache header is specified by the client or a value other than true is used, caching is enabled by default.

Configuration Validation

All Hjson configuration files are validated by a JSON schema. The schemas for each file type can be found here:

  1. Table Config
  2. Data Source Config
  3. Security Config
  4. Variable File

Hjson configuration files can be validated against schemas using a command-line utility following these steps:

  1. Build your Elide project to generate a Fat JAR. Make sure to include a Fat JAR build configuration in your POM file.

    mvn clean install

  2. Using the generated JAR for validation:

    java -cp elide-*-example.jar com.yahoo.elide.modelconfig.validator.DynamicConfigValidator --help java -cp elide-*-example.jar com.yahoo.elide.modelconfig.validator.DynamicConfigValidator --configDir <Path for Config Directory>

  3. The config directory needs to adhere to this file layout.