Foreign Function Interface

The term Foreign Function Interface (FFI) first appeared in the functional language Lisp, where it denoted a mechanism for calling external functions written in another language such as C. These functions were foreign to Lisp itself, which is where the name comes from.

Over time, as the problems of making different languages work together grew, FFI came to mean a more general tool for integrating functions written in different programming languages (for example, the functional language Scheme and the imperative language C++).

The most important difference from network-based collaboration solutions (such as CORBA and Microsoft DCOM) is that an FFI makes the collaboration work in a local context: the two sides communicate by sharing a memory address space instead of going through a network channel.


A simple example of FFI

The best way to understand how an FFI works is through an example. Consider the following C function:


int fact(int n) {
    if (n <= 1)
        return 1;
    else
        return n*fact(n-1);
}
To integrate the fact function into a scripting language, some glue code is needed, such as the following wrapper, which makes fact callable from a Tcl shell:

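#include <stdlib.h>
#include <tcl.h>

int fact(int n);  /* the C function shown above */

/* Glue code: exposes fact() as a Tcl command that takes one integer. */
int wrap_fact(ClientData clientData,
    Tcl_Interp *interp,
    int argc, const char *argv[]) {
    int _result, _arg0;
    if (argc != 2) {
        Tcl_SetResult(interp, (char *)"wrong # args", TCL_STATIC);
        return TCL_ERROR;
    }
    _arg0 = atoi(argv[1]);    /* convert the Tcl string to a C int */
    _result = fact(_arg0);    /* call the foreign function */
    Tcl_SetObjResult(interp, Tcl_NewIntObj(_result));  /* hand the result back */
    return TCL_OK;
}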
FFI problems

The main problems an FFI must solve are:

[Memory handlers] The two languages being integrated often allocate and release memory with different techniques. For example, Python uses reference counting, whereas Scheme and Lisp use garbage collection algorithms.

[Mapping and Unmapping] Data structures differ widely from one language to another. They therefore have to be converted (through mapping and unmapping operations) before they can be used in a context other than the one where they were created.

[Side effects] Some programming languages handle very large data structures by passing them by reference, so the callee can modify the caller's data. Pure functional languages generally avoid this style.

[Error handlers] Each programming language has its own mechanism for trapping error conditions.

[First class objects] In high-level functional languages functions are first-class objects, so they can be held in variables like any other object in memory. In C++, by contrast, functions can be handled only through pointers.
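
As an illustration of the error-handler item, here is a minimal sketch of a boundary wrapper. It assumes a hypothetical foreign function checked_fact that reports bad input by throwing a C++ exception, and a hypothetical wrapper call_checked_fact that turns the exception into a status code and a message, which is the form an interpreter typically expects. None of these names belong to the FFI described on this page.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical foreign function: signals bad input with a C++ exception.
int checked_fact(int n) {
    if (n < 0)
        throw std::invalid_argument("negative argument");
    return n <= 1 ? 1 : n * checked_fact(n - 1);
}

// Status codes understood by the (hypothetical) interpreter side.
enum FfiStatus { FFI_OK = 0, FFI_ERROR = 1 };

// Boundary wrapper: no exception may escape into the interpreter, so every
// failure is translated into a status code plus an error message.
FfiStatus call_checked_fact(int n, int &result, std::string &error) {
    try {
        result = checked_fact(n);
        return FFI_OK;
    } catch (const std::exception &e) {
        error = e.what();
        return FFI_ERROR;
    }
}
```

The interpreter can then raise its own native error from the status and message, without knowing anything about C++ exceptions.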

FFI Actions: library loading

The interpreter of a functional language can access the foreign functions of a C++ library either statically or dynamically. In the static case, the library's binary code is simply linked with the interpreter, so that the C++ functions end up in the interpreter's own address space.

The dynamic case is more complex, because the C++ library has to be loaded at run time. My FFI supports both modes thanks to the load-extension primitive of the MzScheme implementation.
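
To give an idea of what run-time loading involves underneath, here is a minimal sketch based on the POSIX calls dlopen and dlsym. It is not how MzScheme's load-extension is implemented. The helper call_by_name and the assumed signature IntFromString are hypothetical, and the sketch looks a symbol up among the libraries already loaded into the process; a real extension loader would pass the path of the library to dlopen instead of a null pointer. On some systems the program must be linked with -ldl.

```cpp
#include <dlfcn.h>   // POSIX: dlopen, dlsym, dlclose

// Signature we assume the looked-up function has (an assumption of this sketch).
typedef int (*IntFromString)(const char *);

// Look up `symbol` at run time and call it with `arg`.
// Returns `fallback` if the handle or the symbol cannot be obtained.
int call_by_name(const char *symbol, const char *arg, int fallback) {
    // A null path yields a handle on the running program itself, which
    // also covers the shared libraries loaded at program startup.
    void *handle = dlopen(nullptr, RTLD_NOW);
    if (handle == nullptr)
        return fallback;

    void *address = dlsym(handle, symbol);
    int result = fallback;
    if (address != nullptr) {
        // The address is only known now, at run time, not at link time.
        IntFromString fn = reinterpret_cast<IntFromString>(address);
        result = fn(arg);
    }
    dlclose(handle);
    return result;
}
```

For example, call_by_name("atoi", "42", -1) resolves the C library's atoi by name, provided the program is dynamically linked against the C library.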

FFI Actions: address bindings

The C/C++ compiler and linker automatically perform many steps that most developers never need to think about.

A C++ toolchain can produce two kinds of library: static libraries (*.lib on Windows and *.a on Unix) and dynamic libraries (*.dll on Windows and *.so on Unix).

In the first case the linker turns every function call into a direct jump (goto <address>) to the function's address, and all addresses are resolved while the executable file is being generated.

With dynamic linking, by contrast, cross references are bound at run time, and this process is much more complicated (I do not know of any algorithm for it that is independent of the operating system).


Linker role in address binding problem

Note that the compiler "translates" symbolic names into "binary" names (a process known as name mangling). Overloading also has to be taken into account: two functions can share the same symbolic name, while their binary names differ according to their formal parameters.
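
The following sketch shows the overloading problem and one common workaround. The function names are hypothetical. The mangled names quoted in the comments are those of the Itanium C++ ABI used by GCC and Clang; other compilers use different schemes.

```cpp
// Two overloads share the symbolic name `twice`. The compiler gives each
// one a distinct binary name derived from its parameter types, for example
// _Z5twicei and _Z5twiced under the Itanium C++ ABI.
int twice(int x) { return 2 * x; }
double twice(double x) { return 2.0 * x; }

// extern "C" switches mangling off, so these wrappers keep predictable
// binary names that a foreign caller can look up. C linkage does not allow
// overloading, hence one distinct name per overload.
extern "C" int twice_int(int x) { return twice(x); }
extern "C" double twice_double(double x) { return twice(x); }
```

An FFI that binds by name has to either reproduce the compiler's mangling scheme or rely on wrappers of this kind.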

In my FFI implementation I approach the address binding problem statically: I generate a container of function calls for each method, and it is still the linker that produces the goto instructions. In the future I hope to find a solution that also supports dynamic cross-reference resolution.
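
One way to picture such a statically bound container is a table that pairs each exported name with the address of a function, where every address is filled in by the linker at build time. This is only a sketch under that assumption, not the code my FFI actually generates; Entry, kExports, lookup and square are hypothetical names.

```cpp
#include <cstring>

// Foreign functions to export; fact is the example used earlier.
int fact(int n) { return n <= 1 ? 1 : n * fact(n - 1); }
int square(int n) { return n * n; }

// A uniform signature lets every entry sit in the same table.
typedef int (*Wrapper)(int);

struct Entry {
    const char *name;   // name visible from the interpreter
    Wrapper call;       // address resolved by the linker at build time
};

// The "container of function calls": one entry per exported function.
static const Entry kExports[] = {
    {"fact", fact},
    {"square", square},
};

// Find a function by name; returns a null pointer if the name is unknown.
Wrapper lookup(const char *name) {
    for (const Entry &e : kExports)
        if (std::strcmp(e.name, name) == 0)
            return e.call;
    return nullptr;
}
```

The interpreter only ever searches the table by name, so no run-time symbol resolution is needed. The price is that the table must be regenerated and relinked whenever the set of exported functions changes.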

FFI Actions: Input parameters and output parameters

The FFI must push onto the stack all the parameters that the foreign function expects as input. This implies that the parameters may first have to be converted so that the foreign language can manipulate them.

For a concrete idea of what this involves, consider a typical fragment of FFI code that uses inline assembler:


ffi_call_explicit(...) {
    switch (ParamsTypes[i]) {
    case T_INT:
        asm {
            mov eax, Params[i]
            push eax
        }
        break;
    [...]
Scheme and C++ objects are much more complex than the INT value shown here; therefore the push in the T_INT case of the switch must be preceded by some mapping instructions.
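
The mapping step can be sketched in portable C++ as follows. The boxed Value type and the helpers map_to_int, unmap_int and call_fact are hypothetical and stand in for whatever representation the interpreter really uses. In this sketch the compiler, not hand-written assembler, places the converted argument on the stack.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical boxed value, as an interpreter might represent it.
enum Tag { T_INT, T_STRING };
struct Value {
    Tag tag;
    int i;           // meaningful when tag == T_INT
    std::string s;   // meaningful when tag == T_STRING
};

// Mapping: turn a boxed interpreter value into a native C int.
int map_to_int(const Value &v) {
    switch (v.tag) {
    case T_INT:
        return v.i;
    case T_STRING:
        return std::stoi(v.s);   // throws if the text is not a number
    }
    throw std::invalid_argument("unsupported tag");
}

// Unmapping: box a native result for the interpreter.
Value unmap_int(int n) { return Value{T_INT, n, ""}; }

int fact(int n) { return n <= 1 ? 1 : n * fact(n - 1); }

// Map the argument, call the foreign function, unmap the result.
Value call_fact(const Value &arg) {
    return unmap_int(fact(map_to_int(arg)));
}
```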


Mapping and unmapping process

My FFI supports every argument-passing mode (both by reference and by value). For example, a C integer:

int value;
could be the output (the return value) of a getref function:
int& getref()
This int& object is mapped to a suitable Scheme structure, so that a modification of its "internal" value affects both the C variable and the Scheme value.
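
A minimal sketch of this idea, assuming the Scheme-side structure simply stores the address of the C integer instead of a copy. IntRefBox and map_int_ref are hypothetical names, not the actual structures used by my FFI.

```cpp
// The C side: a variable and a function returning a reference to it.
static int value = 10;
int &getref() { return value; }

// Hypothetical Scheme-side box: it keeps the address of the C int rather
// than a copy, so reads and writes go straight to the original variable.
struct IntRefBox {
    int *target;
    int get() const { return *target; }
    void set(int n) { *target = n; }
};

// Mapping step: wrap the reference returned by the foreign function.
IntRefBox map_int_ref(int &ref) { return IntRefBox{&ref}; }
```

After IntRefBox box = map_int_ref(getref()), a call to box.set(42) changes value itself, and a later assignment to value is visible through box.get().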