Extractors: From Program to LPDL

Abstract - This article introduces the various TwinSpace extractors used to generate load profiles for programs. The extractors work on different abstraction levels, such as source code, object files, and trace data, each offering specific advantages in analyzing non-functional properties like energy consumption and runtime. Analysis at the source code level allows consideration of high-level language features, while object files provide a more accurate estimate of the compiled program. Hardware-related analysis using trace data ultimately offers the highest accuracy. By combining these approaches, a comprehensive picture of a program's non-functional properties can be obtained, which is crucial for planning and efficiency improvement in various applications.

TwinSpace Extractors at Component Level

The load profiles presented in the blog post on LPDL are not written by hand but are generated by so-called extractors. An extractor is a program that automatically extracts characteristic properties from source code or compiled code and represents them in the Load Profile Description Language (LPDL). LPDL describes load profiles in a way that enables a better understanding and optimization of energy consumption and runtime, which is essential for planning and efficiency improvements in many applications.
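
To make this more concrete, the following minimal C sketch shows what the final, emitting step of such an extractor could conceptually look like. The record structure, the field names, and the textual output format are purely illustrative placeholders and do not reflect actual LPDL syntax.

    #include <stdio.h>

    /* Hypothetical summary of one analyzed code region; the fields stand in
     * for the characteristic properties an extractor collects. */
    struct region_profile {
        const char *name;         /* e.g. function or loop name            */
        unsigned    instructions; /* estimated instruction count           */
        double      energy_nj;    /* estimated energy per execution, in nJ */
    };

    /* Emit the collected properties in textual form. In the real tool chain
     * this step would generate LPDL; the format below is only illustrative. */
    static void emit_profile(const struct region_profile *p, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            printf("region %s: instructions=%u energy=%.1f nJ\n",
                   p[i].name, p[i].instructions, p[i].energy_nj);
    }

    int main(void)
    {
        struct region_profile demo[] = {
            { "init",      120,  35.0 },
            { "main_loop", 4800, 1400.0 },
        };
        emit_profile(demo, sizeof demo / sizeof demo[0]);
        return 0;
    }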


At TwinSpace, extractors are used at different abstraction levels: from the source code, through object files, down to trace data. Each level offers different trade-offs between the effort required to create a profile and the profile's accuracy. This concerns both the non-functional properties themselves, such as energy consumption and runtime, and the ability to map features of the original programming language.


The following sections present each extractor in more detail.


C → LPDL: High-Level Programming Language

At the source code level (e.g., C), many implementation hints are available that are lost at lower levels. These hints are essential because they help in understanding the original intent of the code and in identifying optimization opportunities. For example, embedded processors often lack a floating-point unit. If floating-point numbers are used nonetheless, the calculations are carried out with integer arithmetic and control structures. In the compiled program, there is no longer any direct hint that floating-point numbers were used. At this level, however, details of the final machine program are not yet fixed. While it is clear that some programs run longer or consume more energy due to their algorithmic complexity, a concrete estimate is difficult. Optimizations during compilation play a significant role here and can greatly influence the program's runtime.
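
A small C example illustrates the point. The floating-point multiplication below is explicit in the source code, but on a target without an FPU a compiler typically lowers it to a call to an integer-based helper routine (for instance __aeabi_fmul on ARM EABI targets), so the compiled program carries no direct hint of the float usage anymore. The function itself is just an illustration.

    /* At source level, the use of float is explicit and easy to detect. */
    float scale(float sample)
    {
        return sample * 0.5f;   /* a single floating-point multiplication */
    }

    /* On a processor without a floating-point unit, the compiler typically
     * replaces this multiplication with a call to an integer-based helper
     * routine (e.g. __aeabi_fmul on ARM EABI targets). In the resulting
     * machine code, only the call and the surrounding integer arithmetic
     * remain visible; the original float operation is not. */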


Object Files → LPDL: After Compilation

Many of the previously open parameters become concrete once the program has been compiled into object files, which increases the accuracy of the load profiles with respect to the compiled program. At this level, important decisions have already been made, such as the use of special instructions, the code size, or the structuring of loops. Even small decisions made by the compiler, such as memory alignment or the choice between alternative optimizations, can significantly influence the final result.
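
The simple loop below illustrates this. Whether the compiler unrolls it, emits multiply-accumulate or vector instructions, or keeps a compact scalar loop only becomes visible in the object file, for example by disassembling it with a tool such as objdump; the function and the build commands are just an illustrative sketch.

    /* At source level the loop structure is clear, but the generated code is
     * not: depending on target and optimization level the compiler may unroll
     * the loop, use multiply-accumulate or vector instructions, or emit a
     * compact scalar loop. */
    int dot(const int *a, const int *b, int n)
    {
        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i] * b[i];
        return sum;
    }

    /* Only the object file shows the concrete instruction sequence and code
     * size, e.g. after
     *     cc -O2 -c dot.c
     * the result can be inspected with a disassembler such as
     *     objdump -d dot.o */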


Trace Data → LPDL: Hardware-Oriented Analysis

At an even more hardware-oriented level, the program can be analyzed during its execution. Much relevant information, such as the actual loop bounds or the cache behavior, can in practice only be determined through simulation. This information allows a program's performance to be characterized far more precisely. To obtain it, we use emulation to record a concrete program run and can thus determine the program's properties more accurately than with the previous methods.
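
As an illustrative example, consider the search loop below: its iteration count depends on the input data and therefore cannot be read off the source or object code alone. An emulated run with concrete inputs records exactly how often the loop body executes and which memory accesses occur, which in turn determines the cache behavior.

    #include <stddef.h>

    /* The number of iterations depends on the data, not on the program text.
     * A static analysis can only give the worst-case bound n, whereas an
     * emulated run with a concrete input records the actual iteration count
     * and the resulting memory access pattern. */
    size_t find_first(const int *data, size_t n, int key)
    {
        size_t i = 0;
        while (i < n && data[i] != key)
            i++;
        return i;   /* equals n if key is not present */
    }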


Conclusion

In summary, the various extractors on different abstraction levels each offer specific advantages. Analysis at the programming language level allows consideration of high-level language features and implementation details, while object files provide a more accurate estimate of the compiled program. Hardware-related analysis using trace data ultimately offers the highest accuracy since it considers the actual program run. Each level has its place in the load profile creation process, and by combining these approaches, we can obtain a comprehensive picture of a program’s non-functional properties.