VDM Public

Public information; No registration required

vdmgen - Code Generator

vdmgen is a code generator based on the same principle as vdmiter, however it is intended to generate code libraries from manifests and templates, not to execute generated code on the fly. Various versions of this script exist and others can be customized, focused on more specialized and typically more complex tasks. Here are some examples of its functionality:

  • Generates DDL and DML for system or user-maintained MQTs (MQTGEN.) Coupled with the VDM Aggregation Strategy, a minimal set of optimal MQTs can be defined, generated and maintained.
  • Allows expansion of cascading templates that can be "macro-included" in other templates.
  • Handles structured comments as well as variable substitution
VDM Access: 

vdmiter - A Swiss Army Knife

The vdmiter command is a simple mechanism for iterating over the rows of a manifest and macro-substituting values in a template script or an in-line SQL query. Here is an overview of the command parameters: vdmiter -m manifestfile -t templatefile -vx SQLtext

The Manifest File

VDM Access: 

Lexical Dimensions

Lexical are dimensions that combine and give a name to combinations of two or more attributes. A common use is for hierarchical codes, such as the ZIP code system, or breakdown codes. A lexical dimesions essentially encodes all combinations of its attributes into a single key that is used in the fact. It is also possible that the fact contains the individual keys that the lexical dimension table is made of. In those cases the table is not acting as a dimension but can be used to drive drop-downs and pick-lists in a progressive selection of the user interface so that the user sees only valid combinations of these attributes.

VDM Access: 

VDMGEN Archtiecture Overview

Simply put, the VDMGEN code generator reads manifest files and templates, and produces SQL/DDL and DB2 command language data-specific script libraries.

Manifests are files that list a series of names in the first (header) row and corresponding values in each subsequent row. VDMGEN reads-in the header names and processes each subsequent row. Processing entails the following:

  1. Identify and read appropriate templates
  2. Construct substitution tokens from manifest header names and values
  3. Scan each template and apply substitutions
  4. Store output to the appropriate library directories
  5. Repeat through all nested/associated manifests and templates

Templates look like regular SQL scripts, including two special kinds of literals:

VDM Access: 

Release Notes

Release 2.2 adds templates for incremental maintenance and includes the following enhancements:

  1. Improvements in the use of ksh93 in AIX and Linux and generalization of the vdmgen code generator
  2. Conversion of the directory structure to be compliant with Unix best practices
  3. Extensions and refinements in the manifest structure and the ability to support many families of objects
  4. The ability to maintain two tiers of dependent structures, e.g. Data Mart and their MQTs
  5. New templates for MQT definition compliant with Cubing Services, incremental maintenance of data marts and MQTs, support for DB2-maintained MQTs in deferred mode and various new utilities

The following templates from (VDMGEN) release 2.1 will be supported:

VDM Access: 

Overview and Example

The problem addressed is illustrated by the following example:

  • The data warehouse (DW) contains a large multi-billion transaction fact and five major dimensions
  • Cubing Services accesses seven relational data marts (DMs) built from a subset of the DW data. Private dimensions are introduced, specific to each data mart, to support the cube models. These data marts are identical in structure and functionality, but defined over separate subsets of the fact after being transformed through views, one for each data mart.
  • Three MQTs are defined over each of the data marts to enhance performance of queries issued from Cubing Services
  • Two additional MQTs are defined directly over the DW fact to support reporting off the DW fact
VDM Access: 

A Word on Recasting

Aggregates and MQTs that are based on facts and dimensions are impacted by three types of changes:

  1. New facts and dimensions
  2. Update or deletion of existing facts
  3. Existing dimension member updates requiring recasting of affected aggregations
VDM Access: 

Derived Data Maintenance Architecture and Methodology Supported

The framework assumes the existence of one or more prime facts and dimensions that are maintained from external sources. Transformations and aggregations are encapsulated in defining views which in turn are materialized into physical tables. Two types of such derived structures are supported:

  • First tier data marts, aggregates or MQTs over the prime facts and dimensions.
  • Second tier MQTs or aggregations built on top of first tier objects.

Methods for maintaining the first tier of data marts and derived tables can be:

Full Refresh

Periodically refresh the target from the defining view over the source. This approach has the following characteristics

  • Simplicity
  • Robustness and reliability
VDM Access: 

The VDMGEN Framework

The VDMGEN 2.2 framework supports the definition and maintenance of data marts and MQTs in DB2 release 9.7 using incremental or full-refresh methodologies. It contains three major components:

  1. The VDMGEN code generator which builds DB2 command language and SQL script libraries suitable for DB2 9.7
  2. The VDMGEN Customizable Script Templates used by the code generator
  3. Top level scripts and utilities suitable for AIX 5.2 and ksh93
VDM Access: 

An Example

Let's take a simple example to illustrate the approach. Our database contains five facts, and each of them has millions or billions of rows. The facts, although they share dimensions were developed at different times by different teams, so the naming conventions are not 100% consistent. Our objective is to list the count of rows and distinct count of load-batches and distinct count of accounts for each of the facts. Ideally we could write a simple query like the following, and only change the name of the table we run it against:


 
VDM Access: 

Pages