Software integration approaches

Software architecture in a heterogeneous context

Any piece of software can be thought of as a set of largely independent modules that communicate with one another. Each task is performed by a module specialized in a given semantic domain.

Generally speaking, all programming languages, whether imperative or functional, follow this approach or can be adapted to support it with very little effort.

For example, C++ classes and objects separate the internal organization of data from the algorithms that use it, whereas the Lisp package primitive provides a logical separation between different groups of routines.

The software architecture is quite simple when the software has been developed in a single programming language: every module is specialized in one application area, and modules communicate with each other using only native language constructs, which cannot be exported outside the language. There is a single communication bus shared by all modules, and the expressiveness of the language depends mainly on the architecture of this bus.

The reason for integration

Today, however, software architecture is more structured and complex. It is widely accepted that programmers should not only develop their own code but should first of all consider integrating available software libraries.

The naive approach of "writing the code we need when we need it" is probably one of the principal causes of weak software. The key idea is to avoid rewriting any procedure that someone else has already written, probably with better tools than the ones we know. For example, no one would consider writing, in a functional language, a complex numerical library that is meant to make intensive use of hardware resources. Conversely, no sensible programmer would build a software prototype in a low-level programming language, which is probably the best choice for optimizing instructions but not for meeting tight development schedules.

Two different approaches: network-based and local-based

There are two different approaches by which a programmer can achieve the required software cooperation:

[Local-based cooperation]. Modules written in different languages cooperate while running in the same process. The modules are "physically" linked to one another using the appropriate operating system tools (that is, static or dynamic linking, which produces a single memory-resident executable image), and all modules share the same memory address space, which leads to shared data structures.
In other words, each module has read and write access (subject to the policies and limitations of each programming language) to the data values of the other modules, which are in general binary incompatible. A local-based approach therefore requires some ad hoc procedures that map and unmap data values from one programming language to the other.
[Network-based cooperation]. The modules are independent, distinct processes that do not share the same memory address space, and they need not be located on the same computer.
The cooperation of these modules is achieved through a network-based infrastructure, which consists mainly of a communication channel used to send messages and a standardized communication protocol that formally defines the communication between software components. Every module acts on the network as a server from which client modules request particular services. This is the approach adopted by the two leading standards for software integration: CORBA and DCOM.
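
The map and unmap procedures required by the local-based approach can be illustrated with a minimal C++ sketch. Everything in it is an assumption made for illustration: SchemeValue is a hypothetical stand-in for an interpreter value, and make_integer, make_real, to_native_double, half and call_half are invented names, not part of any real interpreter.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an interpreter value: a tag plus a payload.
struct SchemeValue {
    enum Kind { Integer, Real, Text } kind;
    long integer;
    double real;
    std::string text;
};

// "Map": wrap native C++ data so that the interpreter can consume it.
SchemeValue make_integer(long i) { return SchemeValue{SchemeValue::Integer, i, 0.0, ""}; }
SchemeValue make_real(double d) { return SchemeValue{SchemeValue::Real, 0, d, ""}; }

// "Unmap": convert an interpreter value into a native C++ double.
double to_native_double(const SchemeValue& v) {
    switch (v.kind) {
        case SchemeValue::Integer: return static_cast<double>(v.integer);
        case SchemeValue::Real:    return v.real;
        default: throw std::invalid_argument("value is not numeric");
    }
}

// A native library routine that knows nothing about the interpreter.
double half(double x) { return x / 2.0; }

// Glue procedure: unmap the argument, call native code, map the result.
SchemeValue call_half(const SchemeValue& arg) {
    return make_real(half(to_native_double(arg)));
}
```

Calling call_half(make_integer(3)) unmaps the integer to a double, runs the native routine and maps the result back to a real value. Both sides live in the same process, so no message has to be serialized or sent.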
 
 


The embedding of a functional interpreter in C++

Having explained, from a very general point of view, the different approaches for getting two or more programming languages to cooperate, it is now useful to focus on one specific integration context.

The subject of this document is the integration of some non-trivial, deeply structured geometric C++ libraries into a functional interpreter, adopting an innovative and original local-based approach.

Even though the solutions are based on the Scheme language (and were designed for it from the beginning), all the implemented algorithms are general enough to be easily adapted to a different context: the approach is functional-language independent, in the sense that the algorithms could be quickly ported to work inside the interpreter of any other functional language.

The real innovation is that cooperation is obtained using only local procedures that rely on memory sharing and on the conversion of in-memory data structures. The cooperation takes place in a local context, without a large and probably not strictly necessary network support layer (mainly a network communication bus, an Interface Repository and a standardized network protocol).

This stand-alone solution reduces application start-up costs and also requires fewer hardware resources than the network-based one, since all modules are linked inside the same executable file.

Description of the environment and original solutions

Since C++ is a low-level programming language, the developer deals with concepts and operations closely tied to the computer hardware: memory is accessible as a collection of typed variables, each with an associated lvalue (the memory location) and rvalue (the data stored there). Passing a C++ object by value creates a new object initialized by the copy constructor; passing a C++ object by reference, on the contrary, uses the lvalue of the original variable directly.
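
The difference between the two passing modes can be observed directly with a small C++ sketch. The Tracked type and the two bump functions are hypothetical names introduced only for this illustration.

```cpp
// Hypothetical type that counts how often its copy constructor runs.
struct Tracked {
    static int copies;   // number of copy-constructor invocations so far
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// By value: the parameter is a new object built by the copy constructor,
// so the increment is applied to the copy and the caller's object is untouched.
void bump_copy(Tracked t) { t.value += 1; }

// By reference: the parameter names the caller's own memory location,
// so the increment is visible to the caller and no copy is made.
void bump_in_place(Tracked& t) { t.value += 1; }
```

An FFI has to reproduce both behaviours: a by-value argument can be converted and copied, whereas a by-reference argument must keep pointing at the original memory location.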

In a pure functional language there are no mutable variables, only functions returning data values that can in turn be passed as arguments to other functions.

To allow complete integration between two such different programming environments, it is necessary:

[1] To give the C++ language access to the interpreter's data structures. This is in general a simple and immediate task whenever the functional language has itself been implemented in C++ and its source code (header and library files) is available.

[2] To give the functional language a set of methods for accessing filtered C++ data structures. It is important to cover all the "operative situations" in which a C++ programmer usually works; this implies implementing entirely new methods (quite foreign to any functional semantics) that intrinsically have side effects and that directly access hardware memory locations. In the following pages this set of methods is referred to as a Foreign Function Interface (FFI), to emphasize the central role these methods play in integrating routines that belong to different programming languages.

Some examples of cooperation

To clarify the concept just introduced, it is useful to show how a simple C function could be used from a Scheme interpreter (note: the syntax and semantics of the examples given here differ slightly from those actually used in my CScheme software). The following C function:

int add(int,int);
can be invoked by Scheme programmers as follows (assuming the implemented FFI correctly maps and unmaps data structures from one language to the other):

(define x (set-value (set-type (new-cvar) "int") 5))
(define y (set-value (set-type (new-cvar) "int") 3))
(define args (append (append (new-args) x) y))
(define ret (dynamic-c-caller "add" args))
(get-value ret)
This code is really simple. First we build the individual input arguments (specifically, C/C++ variables); then we build the sequence of arguments that will be the actual parameters of the function; finally we invoke the "dynamic caller" to obtain the return value.
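
On the C++ side, one possible way to support such a dynamic caller is a table that maps exported names to wrapper functions. The sketch below is an assumption made for illustration and is not the CScheme implementation: CVar, Args, Wrapper, add_wrapper and registry are hypothetical names, the body of add is invented since the text only declares it, and only integer variables are modelled.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical C variable as seen by the FFI: a type name plus a value.
struct CVar {
    std::string type;  // e.g. "int"
    long value;        // only integers are modelled in this sketch
};

using Args = std::vector<CVar>;
using Wrapper = std::function<CVar(const Args&)>;

// The native function from the text (its body is assumed here).
int add(int a, int b) { return a + b; }

// Wrapper: check and unmap the arguments, call add, map the result back.
CVar add_wrapper(const Args& args) {
    if (args.size() != 2 || args[0].type != "int" || args[1].type != "int")
        throw std::invalid_argument("add expects (int, int)");
    int r = add(static_cast<int>(args[0].value), static_cast<int>(args[1].value));
    return CVar{"int", r};
}

// Table from exported names to wrappers (filled by hand in this sketch).
std::map<std::string, Wrapper>& registry() {
    static std::map<std::string, Wrapper> table = {{"add", add_wrapper}};
    return table;
}

// Look the function up by name and invoke it with the packed arguments.
CVar dynamic_c_caller(const std::string& name, const Args& args) {
    auto it = registry().find(name);
    if (it == registry().end())
        throw std::out_of_range("unknown function: " + name);
    return it->second(args);
}
```

In this sketch the wrapper rejects arguments whose declared type does not match the C signature; a real FFI could instead choose to convert them.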

An FFI should also support C/C++ values passed not only by value but also by reference. For example, the value returned by the get function:

double& get(int i, int j);
is a value that is supposed to be shared by two or more functions. For example, when the returned value is modified:

(define args (append (append (new-args) 1) 2))
(define ret (dynamic-c-caller "get" args))
(set-value ret (+ (get-value ret) 1.0))
the value of ret has to change both in the Scheme environment and in the C/C++ library. The two programming languages therefore share memory, rather than merely converting data between their representations.
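
A minimal C++ sketch of this kind of sharing follows. It assumes a hypothetical 3x3 grid as the storage behind get, and DoubleRef and call_get are invented names: the handle given to the interpreter stores the address of the C++ object rather than a copy of its value.

```cpp
// Hypothetical library-owned storage standing in for the data behind get().
static double grid[3][3] = {};

// Same signature as in the text: returns a reference into library memory.
double& get(int i, int j) { return grid[i][j]; }

// Hypothetical FFI handle for a by-reference result: it keeps the address
// of the C++ object, so reads and writes go straight to library memory.
struct DoubleRef {
    double* target;
    double get_value() const { return *target; }
    void set_value(double v) { *target = v; }
};

// What the dynamic caller could return for a function yielding double&.
DoubleRef call_get(int i, int j) { return DoubleRef{&get(i, j)}; }
```

With ret = call_get(1, 2), the call ret.set_value(ret.get_value() + 1.0) changes the element that a later get(1, 2) sees on the library side, which is the shared-memory behaviour described above.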

These concepts can easily be extended to the C++ language; there are only some technical details to consider, because a class can contain one or more constructors, a destructor, some instance methods and some static methods.

For example, a simple instance of the Base class:


class Base{
public:
  Base();
  void action();
};
could be created using the following script:

(define obj (new-cvar "Base"))
(dynamic-c-caller obj "action" (new-args))
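
A possible C++ counterpart of these two calls is sketched below, under stated assumptions. ObjectHandle is a hypothetical handle type, the actions counter and the body of action are invented so that the effect of the call can be observed, and the dispatch is written by hand for the single Base class rather than generated automatically.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

class Base {
public:
    Base() : actions(0) {}
    void action() { ++actions; }  // assumed body; the text only declares it
    int actions;                  // assumed member, used to observe the call
};

// Hypothetical handle the interpreter would hold for a C++ object.
struct ObjectHandle {
    std::string class_name;
    std::shared_ptr<void> instance;  // type-erased pointer to the object
};

// Counterpart of (new-cvar "Base"): run the constructor and wrap the object.
ObjectHandle new_cvar(const std::string& class_name) {
    if (class_name == "Base")
        return ObjectHandle{class_name, std::make_shared<Base>()};
    throw std::out_of_range("unknown class: " + class_name);
}

// Counterpart of (dynamic-c-caller obj "action" ...): dispatch on the
// class name and method name, then call the instance method.
void dynamic_c_caller(const ObjectHandle& obj, const std::string& method) {
    if (obj.class_name == "Base" && method == "action") {
        static_cast<Base*>(obj.instance.get())->action();
        return;
    }
    throw std::out_of_range("unknown method: " + method);
}
```

The shared_ptr keeps the object alive for as long as the interpreter holds the handle and runs the destructor when the last handle goes away, which covers the destructor case mentioned above.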