Automated Software Modernization

Model Driven Modernization

The automated software modernization market is highly fragmented, with tools and services from many vendors that vary widely in quality, accuracy and price. Even for a single language pair, automated software transformation is challenging and skill-intensive. With over 500 legacy and modern languages and computing environments, it is hardly surprising that automated modernization solutions are not available for many languages. The primary reason "canned" tools don't work is that source and target languages, environments and customer requirements are highly variable, and it is difficult to create a tool that handles all possible variations in its inputs, called the input space. Historically, automated language transformation tools got a bad rap because of inaccurate, imprecise and low-quality translations, especially from the "free" tools, which did not cover much of the input space. High-quality, affordable automated modernization solutions do exist for an increasing number of languages, but they are usually offered as services.

A few solution service providers have developed powerful technology frameworks which they can rapidly adapt to new input and output spaces with near-perfect accuracy and automation levels exceeding 99.99%. The cost to modify a proven tool to handle an additional language is covered by a "Set Up" charge, which can be a separate fee or absorbed into other charges on larger projects. The set-up cost is recovered through the return on investment (ROI) when the tool is applied: it saves hundreds of thousands of man-hours and produces high-quality code without the errors usually introduced by humans.

The figure to the right depicts the model-driven modernization (MDM) approach to automated software modernization. In contrast to early translators that employed regular expressions and string manipulation to carry out language translation, the MDM approach is purely model-based: the code is manipulated by rules applied to a data structure that represents the code as a model, called the Abstract Syntax Tree (AST).
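The difference between string manipulation and rule-based manipulation of an AST can be illustrated with a small sketch. It uses Python's standard `ast` module (Python 3.9 or later for `ast.unparse`) purely as a stand-in for a modernization framework's model; the `RenameVariable` rule and both helper functions are hypothetical names chosen for illustration, not part of any vendor's tool.

```python
import ast
import re

SOURCE = 'count = 1\nmessage = "count is high"\nprint(count)\n'


class RenameVariable(ast.NodeTransformer):
    """Rule: rename an identifier wherever it occurs as a Name node."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Only identifier nodes are touched; string literals are Constant
        # nodes in the model, so this rule never sees them.
        if node.id == self.old:
            node.id = self.new
        return node


def rename_with_regex(source, old, new):
    """String-based approach: replaces every textual match, context-blind."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)


def rename_with_ast(source, old, new):
    """Model-based approach: parse, apply the rule to the tree, print."""
    tree = ast.parse(source)
    tree = RenameVariable(old, new).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(rename_with_regex(SOURCE, "count", "total"))
    print(rename_with_ast(SOURCE, "count", "total"))
```

The regular expression also rewrites the word inside the string literal, whereas the AST rule changes only the identifier, because the model distinguishes a variable reference from text that happens to contain the same characters.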

Transforming between 3rd generation languages and 4th generation languages (3GL and 4GL) is accomplished in a multi-step process. First, a grammar in the form of Extended BNF (EBNF) rules is defined for the source 3GL and for the target 4GL; then a parser and a printer are generated for each language. The language-specific AST models are called Specific Abstract Syntax Tree Models (SASTM). For efficiency, the 3GL SASTM is transformed into a Generic Abstract Syntax Tree Model (GASTM) before it is transformed into the 4GL SASTM. Because the GASTM acts as a common intermediate language, many transformation rules are reusable.
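The multi-step process above can be sketched as a toy pipeline: parse source text into a source-specific model, map it to a generic model, map that to a target-specific model, and print it. Everything here is an assumption made for illustration: the two-statement COBOL-like source syntax, the node classes, and the hand-written parser and printer (a real framework would generate these from EBNF grammars).

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Move:                 # source SASTM node: MOVE <value> TO <dest>
    value: str
    dest: str


@dataclass
class Add:                  # source SASTM node: ADD <amount> TO <dest>
    amount: str
    dest: str


@dataclass
class GenericAssign:        # GASTM node: language-neutral assignment
    target: str
    expression: List[str]   # tokens, e.g. ["X", "+", "1"]


@dataclass
class TargetAssign:         # target SASTM node
    lhs: str
    rhs: str


def parse_source(text: str) -> List[Union[Move, Add]]:
    """Toy parser for the hypothetical source language."""
    nodes: List[Union[Move, Add]] = []
    for line in text.strip().splitlines():
        words = line.split()
        if len(words) != 4 or words[2] != "TO" or words[0] not in ("MOVE", "ADD"):
            raise ValueError(f"cannot parse: {line!r}")
        verb, operand, _, dest = words
        nodes.append(Move(operand, dest) if verb == "MOVE" else Add(operand, dest))
    return nodes


def to_generic(node: Union[Move, Add]) -> GenericAssign:
    """Source SASTM -> GASTM: both statements become plain assignments."""
    if isinstance(node, Move):
        return GenericAssign(node.dest, [node.value])
    return GenericAssign(node.dest, [node.dest, "+", node.amount])


def to_target(node: GenericAssign) -> TargetAssign:
    """GASTM -> target SASTM; reusable for any source mapped to GASTM."""
    return TargetAssign(node.target, " ".join(node.expression))


def print_target(nodes: List[TargetAssign]) -> str:
    """Toy printer for the hypothetical target language."""
    return "\n".join(f"{n.lhs} = {n.rhs};" for n in nodes)


def modernize(text: str) -> str:
    generic = [to_generic(n) for n in parse_source(text)]
    return print_target([to_target(g) for g in generic])


if __name__ == "__main__":
    print(modernize("MOVE 5 TO X\nADD 1 TO X"))
```

In this sketch only `parse_source` and `to_generic` know about the source language, so supporting a second source language would leave `to_target` and `print_target` unchanged, which is the kind of rule reuse a common intermediate model is meant to provide.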

 

[Figure: 3rd Generation Languages Transformed Into 4th Generation Languages]

Transformation between language generations, such as the 3GL to 4GL transformation depicted above, requires mapping the feature set allowed in one language onto the feature set permitted in another. For efficiency, the mapping between feature sets is accomplished by a language-neutral transformation between 3GL GASTM and 4GL GASTM constructs. A subsequent transformation from the 4GL GASTM to a 4GL SASTM completes the model-to-model mappings between the source language and the target language.
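One way to picture a feature-set mapping is as a table of rules keyed by construct type. The sketch below is a hedged illustration only: `FilterLoop`, `SelectQuery`, `Goto` and the `RULES` registry are hypothetical constructs invented here, on the assumption that a procedural filtering loop on the 3GL side corresponds to a declarative query on the 4GL side.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FilterLoop:           # hypothetical 3GL-side generic construct
    record_var: str         # loop variable, e.g. "r"
    source: str             # collection being iterated
    condition: str          # filter applied to each record
    output_field: str       # field collected from matching records


@dataclass
class Goto:                 # hypothetical 3GL feature with no rule below
    label: str


@dataclass
class SelectQuery:          # hypothetical 4GL-side generic construct
    field: str
    source: str
    predicate: str


def map_filter_loop(node: FilterLoop) -> SelectQuery:
    # The explicit loop variable disappears: the declarative construct
    # has no notion of step-by-step iteration.
    return SelectQuery(node.output_field, node.source, node.condition)


# Rule table: one entry per source feature that has a target equivalent.
RULES: Dict[type, Callable] = {FilterLoop: map_filter_loop}


def map_feature(node):
    """Apply the rule registered for this construct, or fail loudly."""
    rule = RULES.get(type(node))
    if rule is None:
        raise NotImplementedError(f"no mapping rule for {type(node).__name__}")
    return rule(node)


if __name__ == "__main__":
    loop = FilterLoop("r", "customers", "balance > 0", "name")
    print(map_feature(loop))
```

Because the rules operate on generic constructs rather than on the syntax of a particular language, a construct without a registered rule (such as `Goto` here) surfaces as an explicit gap in the feature mapping instead of being silently mistranslated.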

In addition to language transformation, the framework also supports mapping the 3GL and 4GL ASTMs to UML models which depict the source and target systems' design and architecture. Target UML models and target 4GL code are exported into Integrated Development Environments (IDEs) and MDA development tools such as Microsoft's Visual Studio and IBM's Rational Rose.