Extracting and Generating Resource Twins

Abstract - In previous blog posts, we presented the ideas and goals of the TwinSpace project and its technical building blocks: the LPDL as a modelling language, and the extractors and generators that produce LPDL models and resource twins. The LPDL makes it possible to specify the non-functional properties of software components and software systems (e.g., the execution time, the number and kinds of operations used, memory usage, communication behaviour, …) without disclosing concrete algorithms. These models can either be extracted automatically from existing software components or written manually by engineers. The twin generators use the information in the LPDL models to produce suitable resource twins, which can then be used to assess different hardware platforms or compilers. Today, we show how a software component is converted into a resource twin, using a small example. Later blog posts will describe our work on the system level and on metrics for assessing the twins, as well as the embedding into an automotive development process.

A Small Example

Our running example is a simple, synthetic stand-in for a typical software component in an embedded system. The component consists of a single routine and is shown in Figure 1. Depending on the system’s state, it performs several floating-point computations. In TwinSpace, we consider software components that are available as C source code as well as those that are only available as object code. Because we want to apply a binary code extractor, the running example is compiled for Infineon AURIX.

Figure 1: A simple software component

Extracting LPDL Models using Static Program Analysis

The extraction of the LPDL models for the resource twin starts with the reconstruction of the control-flow graph (CFG). In most cases, this happens fully automatically. However, if the targets of function pointers cannot be resolved statically, the user can guide the extractor with annotations that specify the targets. The CFG of our example software component is shown on the left of Figure 4. One can see the two branches that originate from the use of “if” in the source code. The loop that is present in the source code has been unrolled by the compiler.
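To make the idea of control-flow reconstruction more concrete, the following is a minimal sketch of the textbook first step: splitting a linear instruction sequence into basic blocks by marking "leaders". It is not the TwinSpace extractor. The `Instr` type, the `NO_TARGET` marker and the function `mark_leaders` are hypothetical names introduced for illustration, and a real binary code extractor also has to decode instructions and handle calls and returns. A branch with `NO_TARGET` stands for an indirect jump whose target is unknown, which is the situation where annotations would be needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker for a branch whose target is not statically known. */
#define NO_TARGET ((size_t)-1)

/* Hypothetical, heavily simplified view of one decoded instruction. */
typedef struct {
    bool   is_branch; /* conditional or unconditional jump */
    size_t target;    /* index of the jump target, or NO_TARGET */
} Instr;

/*
 * Marks the first instruction of every basic block ("leader") and
 * returns the number of basic blocks. Leaders are: the first
 * instruction, every known branch target, and every instruction
 * that directly follows a branch.
 */
size_t mark_leaders(const Instr *code, size_t n, bool *leader)
{
    size_t blocks = 0;

    assert(code != NULL && leader != NULL);

    for (size_t i = 0; i < n; ++i)
        leader[i] = false;
    if (n > 0)
        leader[0] = true;

    for (size_t i = 0; i < n; ++i) {
        if (!code[i].is_branch)
            continue;
        /* Unresolved targets are skipped; a user annotation would
         * have to supply them in a real tool. */
        if (code[i].target != NO_TARGET && code[i].target < n)
            leader[code[i].target] = true;
        if (i + 1 < n)
            leader[i + 1] = true;
    }

    for (size_t i = 0; i < n; ++i)
        if (leader[i])
            ++blocks;
    return blocks;
}
```

For a sequence of six instructions in which instruction 1 is a conditional branch to instruction 4, the sketch marks instructions 0, 2 and 4 as leaders, which corresponds to the three basic blocks one would expect from a single “if”.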

After control-flow reconstruction, the extractor classifies the instructions in each basic block. For the LPDL models, we are especially interested in the mathematical and logical instructions. Moreover, the extractor collects all variables that are either read or written. The coarse structure of the CFG as well as the number and kinds of operations are stored in the XML-based LPDL models. An excerpt of the model that has been extracted from our example component is shown in Figure 2. This LPDL model is then used in the next step to generate a resource twin.

Figure 2: Part of the LPDL model extracted from the example software component
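The classification step described above can be pictured as a per-block histogram of operation kinds. The sketch below is an assumption-laden illustration, not the extractor's actual implementation: the `OpKind` categories, the `OpHistogram` type and the mnemonic table are hypothetical, and the mnemonic strings are illustrative placeholders rather than the extractor's real input format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation classes; the real LPDL categories may differ. */
typedef enum {
    OP_FADD,   /* floating-point addition or subtraction */
    OP_FMUL,   /* floating-point multiplication */
    OP_FDIV,   /* floating-point division */
    OP_LOGIC,  /* bitwise logical operation */
    OP_OTHER,  /* everything the model does not track separately */
    OP_KIND_COUNT
} OpKind;

/* Number of instructions per operation class in one basic block. */
typedef struct {
    unsigned count[OP_KIND_COUNT];
} OpHistogram;

/* Maps a mnemonic to an operation class. The table is illustrative. */
OpKind classify(const char *mnemonic)
{
    static const struct { const char *name; OpKind kind; } table[] = {
        { "add.f", OP_FADD }, { "sub.f", OP_FADD },
        { "mul.f", OP_FMUL }, { "div.f", OP_FDIV },
        { "and", OP_LOGIC },  { "or", OP_LOGIC }, { "xor", OP_LOGIC },
    };

    assert(mnemonic != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (strcmp(mnemonic, table[i].name) == 0)
            return table[i].kind;
    }
    return OP_OTHER;
}

/* Counts the operation classes of all instructions in one basic block. */
OpHistogram summarize_block(const char *const *mnemonics, size_t n)
{
    OpHistogram h = { { 0 } };

    for (size_t i = 0; i < n; ++i)
        h.count[classify(mnemonics[i])]++;
    return h;
}
```

A histogram like this, stored per basic block together with the block's position in the CFG, is the kind of information the text says ends up in the model. How the real LPDL schema encodes it is not shown here.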

Generating Resource Twins

The LPDL model contains all the necessary information to describe the non-functional properties of our example software component. Besides the coarse structure, this includes the number and kinds of mathematical and logical operations used. In our example, these are mostly floating-point operations. The twin generators use this information to generate C code that simulates the non-functional behaviour of the software component. Figure 3 shows the code that has been produced for the operations inside the “if” of the original software component. The twin has the same number of floating-point operations (for example, 2 divisions and 3 multiplications) but abstracts from the concrete variables and computation steps. For example, the constants used in the original source code are not preserved.

Figure 3: Part of the resource twin’s generated C code
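As a rough illustration of the generation step, the sketch below turns operation counts into C statements that perform the same number of floating-point multiplications and divisions on placeholder data. It is a minimal sketch under stated assumptions, not the TwinSpace generator: `BlockOps` and `emit_twin_block` are hypothetical names, the constants `1.5f` and `1.25f` are arbitrary, and declaring the accumulator `volatile` is just one possible way to discourage a compiler from removing the operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical summary of one block of a model: operation counts only. */
typedef struct {
    unsigned fmul; /* number of floating-point multiplications */
    unsigned fdiv; /* number of floating-point divisions */
} BlockOps;

/*
 * Writes C statements for one twin block into buf.
 * Returns the number of characters written (excluding the
 * terminating NUL), or -1 if the buffer is too small.
 */
int emit_twin_block(char *buf, size_t size, BlockOps ops)
{
    size_t used;
    int n;

    assert(buf != NULL && size > 0);

    /* Placeholder variable; the original's variables are not reproduced. */
    n = snprintf(buf, size, "volatile float acc = 1.5f;\n");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* First all multiplications, then all divisions. */
    for (unsigned i = 0; i < ops.fmul + ops.fdiv; ++i) {
        const char *op = (i < ops.fmul) ? "*" : "/";

        n = snprintf(buf + used, size - used, "acc = acc %s 1.25f;\n", op);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Called with `fmul = 3` and `fdiv = 2`, the function emits one declaration followed by three multiplication statements and two division statements. The operation mix matches the counts, while the concrete computation of the original is not disclosed.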

From Original to Twin

As a last step, we compiled the twin for Infineon AURIX, just as we did with the original software component. This allows us to compare the CFGs of the original and the twin (Figure 4). Both have a similar structure. However, the twin uses fewer instructions because it does not need to compute the constants used in the original.

A dedicated working group within the TwinSpace project is concerned with suitable metrics for comparing resource twins with their original software components. These metrics are also used to improve the extractors and generators so that the twins become as precise as possible. Details on these efforts will be part of a later blog post.

Figure 4: Control-flow graphs of the original software component (on the left) and its twin (on the right)