VDMETL Framework

The VDMETL© framework is a customizable, open set of scripts and processes that uses VDMGEN© capabilities to create, maintain and execute ETL data provisioning activities with simple, readily available Unix tools. The run-time environment can range from the basic parallel, scalable framework that is provided, which relies on communicating Unix processes, to a Hadoop/HDFS streaming framework. A completely redesigned Spark-based solution is under development. Parsers are generated from mapping specification documents that are both human- and machine-readable. The framework supports error handling, generates and loads relational or XML tables, and handles code lookups, surrogate keys, and simple logical and transformation expressions. It is a low-cost alternative or supplement to other ETL tools, suited to many high-volume, low- to medium-complexity cases:

  • Absorb spikes of activity, e.g. history or corrective loads
  • Rapidly test and refine mapping specifications, streamlining more expensive ETL development
  • Deliver early data on which to base front-end development while production ETL is being developed
  • Run dual-ETL loads at a second site, avoiding a second set of active licenses
  • Run regular low-cost, limited-resource production ETL for medium-complexity, high-volume data or special cases such as XML tables
  • Enforce standards and generate working archetypes or components in a commercial ETL script language to jump-start development

The general philosophy follows the "convention over configuration" approach, as exemplified in more complex frameworks such as Rails and Maven.
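
To make the row-level work described above more concrete (code lookups, surrogate keys, simple transformation expressions and error handling driven by a mapping specification), here is a minimal sketch in Python. It is not VDMETL or VDMGEN code and does not reflect their actual specification format: the `MAPPING` layout, the rule names, the `transform` function and all field names are hypothetical, chosen only to illustrate the idea.

```python
import csv
import io
import itertools

# Hypothetical mapping specification: target column -> (source column, rule).
# The real VDMETL mapping document format is not shown in this text.
MAPPING = {
    "customer_sk": ("cust_id", "surrogate"),   # assign a surrogate key
    "country_name": ("country_cd", "lookup"),  # resolve a code via lookup
    "full_name": ("name", "upper"),            # simple transformation
}

# Hypothetical code lookup table.
COUNTRY_LOOKUP = {"US": "United States", "FR": "France"}


def transform(rows, mapping, lookup):
    """Yield (target_row, error) pairs, one per source row.

    A row that fails a lookup yields (None, message) instead of
    stopping the run, so rejected rows can be routed to an error file.
    """
    next_key = itertools.count(1)
    surrogate_keys = {}  # natural key -> surrogate key
    for row in rows:
        out, error = {}, None
        for target, (source, rule) in mapping.items():
            value = row[source]
            if rule == "surrogate":
                # Reuse the key if this natural key was seen before.
                if value not in surrogate_keys:
                    surrogate_keys[value] = next(next_key)
                out[target] = surrogate_keys[value]
            elif rule == "lookup":
                if value not in lookup:
                    error = f"unknown code {value!r} in {source}"
                    break
                out[target] = lookup[value]
            elif rule == "upper":
                out[target] = value.upper()
        yield (out if error is None else None), error


# Made-up sample input, standing in for a delimited source file.
SOURCE = "cust_id,country_cd,name\nC1,US,ann\nC2,XX,bob\nC1,FR,ann\n"
results = list(transform(csv.DictReader(io.StringIO(SOURCE)),
                         MAPPING, COUNTRY_LOOKUP))
for target_row, error in results:
    print(target_row if error is None else f"REJECTED: {error}")
```

In this sketch the second row is rejected because `XX` is not in the lookup table, and the third row reuses the surrogate key already assigned to `C1`. Because the function consumes and yields rows one at a time, the same logic could in principle sit behind a Unix pipe or a streaming job, though how VDMETL actually structures its generated parsers is not described here.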
VDM Access: