The presentation created by the presentation generators contains only the outside view (the language presentation and behavior) of a series of functions. To create the actual implementations of these presentations, the presentation must be fed into a back end code generator. There are several distinct back ends, all based on a common library that does most of the work. The library and the specific back ends are located under the c/pbe directory.
The back end library is the base implementation for code generation, which is then specialized to a particular encoding scheme and runtime. The library provides a great deal of infrastructure, such as handling command line arguments and loading the pres_c input file. However, its primary job is to translate and optimize the input pres_c descriptions into C or C++ code. The task of interpreting the pres_c tree falls to the mu_state family of classes, which provide a number of functions for walking the tree and interpreting its nodes. These functions can then be overridden by specific back ends to implement their own functionality. In addition to generating code from the pres_c description of stubs, the back end can also employ scml code to generate formatted C and C++ source code. This scml code is usually defined externally and then invoked explicitly by the back end. Calling scml macros can be done manually, as is done for creating the header and footer of output files, or through the presentation implementation and collection classes described later.
The declarations for the library classes and functions can be found in the mom/c/be directory and the source in c/pbe/lib.
The first set of classes we'll look at are used for maintaining state in the back end and managing the top level flow of control. The classes and functions work together using a simple event-based model that allows for easy expansion. The core of this model is a handler object that can process an event; it does not have to process every event, only the ones it is interested in. These handlers are prioritized and collected into an object that can distribute a single event amongst its set of handlers. The resulting system allows code to be added to the root control path without having to tamper directly with the library.
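To make the pattern concrete, here is a minimal sketch of the handler/dispatch idea in C++; the types and names below (Event, Handler, HandlerSet) are illustrative only and do not match the library's actual declarations in mom/c/be.

    #include <vector>
    #include <algorithm>

    struct Event { int kind; void *data; };

    struct Handler {
        int priority;                        // lower value runs earlier
        bool (*func)(const Event &e);        // returns true if it handled the event
    };

    struct HandlerSet {
        std::vector<Handler> handlers;

        void add(Handler h) {
            handlers.push_back(h);
            std::sort(handlers.begin(), handlers.end(),
                      [](const Handler &a, const Handler &b)
                      { return a.priority < b.priority; });
        }

        // Offer the event to every handler in priority order; a handler
        // that is not interested in this event simply ignores it.
        void dispatch(const Event &e) const {
            for (const Handler &h : handlers)
                h.func(e);
        }
    };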
Types:
Functions:
The current set of handler functions can be found in c/pbe/lib/arg_handlers.cc, c/pbe/lib/state_handlers.cc, and c/pbe/lib/file_handlers.cc. The handlers are all relatively simple: each reacts only to the events it is interested in and can then choose what to do from there. Currently, only the IIOPXX back end installs its own handler: it squelches the output of any definitions from the orb.idl file after the presentation has been read in by the earlier handlers.
Once the flow of control comes down from the high level handler functions, we finally get to generate the contents of a file. The handlers in file_handlers.cc take care of this by calling stub generation functions for each stub described in the pres_c. The functions provided by the back end library are mostly empty, since they are expected to be implemented by the specific back end: information about the runtime is required in order to generate correct code.
Type marshaling and unmarshaling stubs are generated a bit differently than regular client and server stubs. Since they are rarely used, we generate them on demand to prevent excessive output. Whenever one of these stubs is needed (a call to the stub is generated in the code), we create an mu_stub_info_node and record a reference to the stub and how it is going to be used (parameter direction and byteswap flag). This node is then put into a list kept by the be_state. After all other stubs have been generated, the set of required marshal/unmarshal stubs is generated by traversing the list.
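A rough sketch of that bookkeeping, assuming simplified stand-ins for the real mu_stub_info_node and be_state declarations (the names with a _sketch suffix are hypothetical):

    #include <list>

    enum param_dir { DIR_IN, DIR_OUT, DIR_INOUT };

    struct mu_stub_info_node_sketch {
        int       stub_index;   // which marshal/unmarshal stub is needed
        param_dir dir;          // how the stub will be used
        bool      swap;         // whether byteswapping is required
    };

    struct be_state_sketch {
        std::list<mu_stub_info_node_sketch> needed_stubs;

        // Called at each point where code that *calls* the stub is emitted.
        void note_stub_use(int stub, param_dir dir, bool swap) {
            needed_stubs.push_back({stub, dir, swap});
        }
    };

    // After all regular stubs have been written, the back end walks
    // needed_stubs and emits a body for each recorded stub exactly once.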
The mu_state C++ class is the primary mechanism for creating marshaling and unmarshaling code. An object of this class is created, initialized with the necessary data structures from the pres_c, and then set in motion to create the code. Essentially, code creation is done by interpreting the intentions of the pres_c nodes in the presence of specific state options, set at initialization or by context, to produce the actual code. The class itself is simply a set of functions, each handling a single kind of pres_c node. Each function does some processing on the node, which may include using the mu_state to process any child nodes. Flexibility thus comes from overriding these functions to specialize them to the needs of a back end. Any code produced is stored in cast blocks rather than being written directly to the file, since we may want to perform some post-processing on the generated code. Finally, the user of the mu_state takes whatever cast is left after processing and wraps it with whatever boilerplate code is necessary for a complete construct.
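The following sketch shows only the shape of the recursion and of a back end override; the method name and signature here are hypothetical, since the real mu_state interface is declared in mom/c/be.

    struct pres_c_mapping;           // opaque here
    struct cast_block { /* C statements being accumulated */ };

    struct mu_state_sketch {
        cast_block *current_block;

        // One handler per node kind: process the node, emit statements
        // into current_block, and recurse into any child mappings.
        virtual void mu_mapping(pres_c_mapping *node) { (void) node; }
        virtual ~mu_state_sketch() {}
    };

    // A back end specializes behavior by overriding individual handlers
    // rather than rewriting the whole tree walk.
    struct my_be_mu_state : mu_state_sketch {
        void mu_mapping(pres_c_mapping *node) override {
            // handle the node kinds this back end cares about, then defer
            mu_state_sketch::mu_mapping(node);
        }
    };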
Processing a pres_c node usually requires more information than is contained in the node itself. This is due to the structure of the pres_c tree; since it does not always encode references to cast and mint structures, these need to be walked in parallel. This is why some functions require arguments for cast and mint structures; however, a pres_c node does not necessarily have to traverse both structures. For example, a PRES_C_MAPPING_POINTER will pass down the C type that the pointer refers to, but the mint structure is not descended since it has no representation of pointers. The functions that do not require a cast object instead work off a collection of cast objects accessed through an inline_state. The inline_state is used to map indices from a PRES_C_INLINE_ATOM to a C structure, union, or function encoded in cast. Once the PRES_C_INLINE_ATOM is executed, the mu_state goes into "mapping" mode and the selected cast object is passed down to the mapping nodes. Once we've figured out which objects we're trying to process, we need to know what we're supposed to do with them. The op slot in the mu_state is used for this; it can be set with several flags that influence the generated code to do what the user needs. The current set of flags is:
Stubs generated for non-trivial rpcs generally need some kind of marshaling buffer or stream, into which messages are marshaled or from which messages are unmarshaled. Sometimes a message can use multiple marshaling buffers. The format of the marshaled data in the buffer(s) depends on the encoding scheme being used by the back end (e.g., xdr is an encoding scheme) and on the mint interface definition for the rpc in question. The format of the marshaled data does not usually depend on any aspect of the presentation.
Some transport mechanisms can transfer an "unlimited" amount of data in one message (e.g., Mach 3); others impose some arbitrary limit (e.g., Mach 4). Flick stubs are generally expected to be able to handle an unlimited amount of data, regardless of any limitations of the transport mechanism. Fixed-length arrays are generally handled as merely a degenerate case of variable-length arrays: the two types of arrays are identical except that fixed-length arrays use a "degenerate integer" data type with only one possible value as their length data type. Thus, fixed and variable-length arrays can usually be marshaled in exactly the same way; the simple-integer-marshaling code will automatically handle the degenerate length "variable" in fixed-length arrays (see mu_mapping_direct.cc).
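As an illustration (not generated Flick output), a single marshaling loop can serve both kinds of arrays; marshal_u32 and struct stream below are hypothetical stand-ins:

    struct stream;
    void marshal_u32(struct stream *s, unsigned v);   /* hypothetical helper */

    /* For a fixed-length array the length "variable" is a degenerate
       integer with a single possible value, so the ordinary integer
       handling code and the loop below work unchanged. */
    void marshal_elems(struct stream *s, const unsigned *elems, unsigned len)
    {
        for (unsigned i = 0; i < len; i++)
            marshal_u32(s, elems[i]);
    }

    /* variable-length: marshal_elems(s, seq_buffer, seq_length); */
    /* fixed-length[10]: marshal_elems(s, arr, 10);               */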
The back end support library assumes that code generation can be done by traversing the type trees in a certain "natural order":
This ordering is not mandated; it is the responsibility of the back end to choose an appropriate on-the-wire layout for the data, as the needs of a particular transport mechanism dictate. For example, just because the type graphs are traversed in the order specified above does not mean the data must always appear in memory or on the wire in exactly this order; however, it is generally easier to write the back end if the layouts match.
The actual stub code that gets generated is made up of a number of macro calls and some control flow code. The macros are all defined in the runtime header files and follow similar naming and calling conventions across implementations.
When the mu_state generates these macros, it uses the op flags and the names of the presentation, encoding scheme, and protocol (link) to determine the names of the macros. For example, the macro to encode an 8-bit character in cdr would be flick_cdr_encode_char8(), and the macro to decode one would be flick_cdr_decode_char8(). Unfortunately, this approach can cause some problems because it restricts what parameters we can pass to the macros, especially for presentation-specific macros, which can vary greatly.
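For example, a fragment of generated code might look like the following sketch; the macro names follow the pattern above, but the argument lists are illustrative guesses rather than the runtime's real signatures.

    /* Placeholder definitions: the real macros live in the Flick runtime
       headers; these stand-ins only show the shape of the generated calls. */
    #define flick_cdr_encode_char8(buf, val)  /* runtime-specific */
    #define flick_cdr_decode_char8(buf, val)  /* runtime-specific */

    void stub_fragment(char *buf_current, char c)
    {
        /* encode side: the op flags select "encode", the back end supplies
           the encoding name ("cdr") and the primitive type ("char8") */
        flick_cdr_encode_char8(buf_current, c);

        /* decode side of the matching server skeleton */
        flick_cdr_decode_char8(buf_current, c);
    }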
Also note that much of Flick's flexibility is actually due to the implementation of the macro calls. While the names change between back ends, often the series of macro calls is very similar, and thus a significant amount of the implementation comes from the definitions of the macros.
The client_mu_state and target_mu_state C++ classes inherit from the base mu_state class and describe how to marshal the client and target object references, respectively. Most presentations do not have a client reference, and most protocols do not have an explicit representation for them. The client reference is necessary in the decomposed stub presentation, and thus must be handled by the protocols and runtimes that support decomposed stubs.
The reason these are special states is that the client and target references are often encoded differently than other object references; for instance, they may be placed at a known location within the message rather than encoded in the midst of other parameter data. Since a single mu_state can only specify one method for handling object references (e.g., the handling of an arbitrary object reference parameter), the special client and target states provide the mechanism by which the client and target references can be handled specially and separately.
The mem_mu_state class is an extension of the basic mu_state class, intended for use by typical back ends where parameters are marshaled (at least partly) into variable-length memory buffers of some kind. It deals with things like buffer management, alignment, endian conversion, etc.
For code optimization purposes, marshaled messages are logically divided into globs, then subdivided into chunks.
A chunk is a sequence of bytes in the marshaled message whose length and internal format are known. For example, a fixed-length array of bytes could be one big chunk; an array with variably sized elements would not be a chunk, but each individual element in that array could be. Chunks are used primarily for optimization of data packing/unpacking code: once a chunk is started, no alignment checks or pointer adjustments need to be done between successive primitives in the chunk.
A glob is a part of a message whose maximum possible length is a "smallish" compile-time constant, even though its actual length may not be constant. For example, a variable-length array of bytes with a maximum length of 32 bytes would be a good candidate for being lumped into a single glob, whereas an array with variably sized elements and an unlimited length would not be: instead, each individual element of that array could be a separate glob. (What counts as "smallish" is determined by the back end: each back end defines some maximum glob size, usually on the order of a few kilobytes.) Globs are used to optimize marshaling buffer management: once "enough" buffer memory is allocated at the beginning of a glob, the marshaling code within the glob can simply bump through with a pointer, without having to worry about buffer space again until the next glob.
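A schematic example of what the glob/chunk optimization buys (plain illustrative code, not the macros Flick actually emits; ensure_buffer_space is a hypothetical helper, and 4-byte ints are assumed):

    struct stream { char *ptr; };
    void ensure_buffer_space(struct stream *s, int bytes);  /* hypothetical */

    /* Space for the whole glob is ensured once; within the single chunk
       the pointer is simply bumped, with no further alignment or bounds
       checks between the primitives. */
    void marshal_point(struct stream *s, int x, int y, int z)
    {
        ensure_buffer_space(s, 12);      /* glob: max size is a constant */

        *(int *)(s->ptr + 0) = x;        /* one chunk: three 4-byte      */
        *(int *)(s->ptr + 4) = y;        /* primitives at known offsets  */
        *(int *)(s->ptr + 8) = z;
        s->ptr += 12;                    /* single pointer bump          */
    }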
The mu_state_arglist class is used to help process some pres_c nodes by coalescing information from their children. For example, an allocation context node has children representing the length, buffer pointer, and possibly other attributes, which need to be used together in order to do the correct allocation. However, since this information is only known by the children, the arglist provides a way of passing it back up to the parent (and/or subsequently down to other siblings) so it can be used. The actual filling of the arglist is taken care of by a PRES_C_MAPPING_ARGUMENT node on the path to the child, which captures the current cast expression and type and stores them in a particular arglist and argument.
The mu_abort_block class is used for tracking any error handling code in a stub, so that we can properly recover from any error that occurs during stub execution, such as memory exhaustion or a runtime error. To accomplish this, the class provides functionality for adding code to handle an error and then getting an "abort label" to which the stub can jump during execution if the error occurs. The abort handling code is dumped at the end of blocks and linked together with gotos. Finally, the code is reduced so that it contains only those blocks that are reachable.
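The generated error handling therefore takes roughly the following shape (a schematic example, not actual mu_abort_block output):

    #include <cstdlib>

    void example_stub(unsigned len, unsigned count)
    {
        char *buf;
        int  *elems;

        buf = (char *) std::malloc(len);
        if (!buf)
            goto abort_alloc_buf;
        elems = (int *) std::malloc(count * sizeof(*elems));
        if (!elems)
            goto abort_alloc_elems;

        /* ... marshal, send, receive, unmarshal ... */

        std::free(elems);
        std::free(buf);
        return;

        /* abort blocks are dumped at the end and chained with gotos; the
           final reduction pass keeps only the reachable ones */
    abort_alloc_elems:
        std::free(buf);
    abort_alloc_buf:
        return;   /* report the failure through the presentation's mechanism */
    }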
The mu_msg_span class is used for putting buffer space checks into the stub code so it does not try to decode something that is not there. The term "span" is used to describe a message segment of an exact size that can be larger than a chunk, but not an approximation like a glob. The class is used to describe each span in the message and join them together into a tree which can then be reduced in a second pass. This reduction pass is used to merge checks together and to lift them out of array loops, if possible. For example, a union where each arm is the same size can be reduced to a single check before unmarshaling the union, rather than a check for each arm.
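Schematically, the union example reduces to something like the following; check_span is a hypothetical stand-in for the real runtime check:

    struct stream;
    void check_span(struct stream *s, int bytes);   /* hypothetical stand-in */

    void decode_union(struct stream *s, int disc)
    {
        /* one hoisted check covers the discriminator plus any arm,
           because every arm of this union decodes the same 8 bytes */
        check_span(s, 4 + 8);

        switch (disc) {
        case 1: /* decode arm one: 8 bytes, no further check */ break;
        case 2: /* decode arm two: 8 bytes, no further check */ break;
        }
    }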
How the spans are used depends on the format of the message being unmarshaled at the time. Normally, simple message segments are handled automatically by piggy-backing on the chunk functions in the mem_mu_state. This lets us figure out how big a segment is, but we do not necessarily know how to join it into the rest of the message. This merging is done by creating another mu_msg_span that acts as a parent to a set of segments, and then setting the kind of the span to one of the formats described below.
The presentation generator is able to create "presentation functions" which have simple implementations and do not require optimization. However, the presentation generator cannot provide the implementations since they may be dependent on the runtime or other things specific to a back end. The current solution is to store a simple description of the presentation function, a semantically loaded string, in the presentation attributes tag lists. These tag lists can then be processed by the back end to produce the needed code. However, writing printfs for each function can become complicated and tedious, so the back end library provides a set of classes for semi-automatically passing this work off to scml.
This approach is not always necessary, since the scml interpreter can also be called explicitly to execute macros. See the file prologue and epilogue handlers in file_handlers.cc for an example.
The back end library is not able to produce working code by itself, so a separate back end program must be created that provides its own functionality in addition to the library's, specializing the generic code generated by the library into the specific code necessary to implement the encoding and transport layers required by a particular protocol and runtime system. These back ends are located in the c/pbe directory and are all structured very similarly.
The common tasks in a back end are to create subclasses of be_state and the various mu_states that implement the correct behavior for the back end's encoding and runtime. Once these classes have been filled out, the stub writers are overridden to do some actual work. Coding these stub generators is simply a matter of using the mu_states to generate cast blocks, which are then bracketed by the appropriate runtime code. Unfortunately, since all of this is relatively similar across all back ends, it has become common practice to copy an existing back end and then change all the names and printfs to suit your needs. This results in a fast start up time, but it quickly turns into a maintenance nightmare. Hopefully, the redundant code in the back ends will eventually be consolidated so that maintenance is easier.
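In outline, a back end's specialization looks roughly like the sketch below; the class and method names here are hypothetical, and the real interfaces are those declared in mom/c/be.

    struct cast_block;                    // code produced by the mu_state

    struct my_mu_state /* : mem_mu_state */ {
        // Override only the node handlers whose output must differ for
        // this encoding: alignment rules, glob size, swap macros, etc.
        cast_block *marshal_params(/* the stub's pres_c description */)
        {
            // walk the pres_c tree, emitting this runtime's encode
            // macros into a cast block
            return 0;
        }
    };

    void write_client_stub(/* stub description, output file */)
    {
        my_mu_state encoder;              // op flags would select "encode"
        cast_block *body = encoder.marshal_params();

        // Bracket `body` with runtime boilerplate: buffer setup before
        // it, transport send/receive and error recovery after it, then
        // print the result to the output file.
        (void) body;
    }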
Decomposed stub generation is an option of the presentation generator that causes regular client stubs to be split into separate stubs (see Section 11.1.2 for more information). Handling this kind of presentation is done simply through implementation of the appropriate stub writers (for a detailed list, refer back to Section 12.1.2), assuming the runtime can support it. These stubs work basically the same as the regular client and server stubs, with the functionality split into separate functions. However, only the iiop and Khazana back ends are currently able to handle this presentation.
The IIOPXX back end is responsible for generating code specific to TAO, so anything that is implementation dependent must be handled before outputting any code. For example, a string inside a structure needs to be of the "TAOManagedString" type and not a "Stringvar", which is what the PG will create since it does not know the implementation. The solution is to use the presentation implementation facilities to do any preprocessing; this code has all been collected in the file c/pbe/iiopxx/tao_impl.cc. Currently, it ranges from generating CORBA::TypeCode to modifying the structs and classes created by the presentation generator. Any associated scml code is located in runtime/headers/flick/pres/tao_*.scml. Eventually, it would be nice to organize the mess in tao_impl.cc, but exactly how best to do so is unknown.
The back end code generators are one of the most complex pieces of Flick, mainly because of the number of choices that must (finally) be made. While the current suite of back ends is impressive, particularly in their variety and capabilities, there are a few design changes that might make them even more flexible and useful. Following are a few ideas for improvement: