A Guide to the S-Lang Language: Functions

9. Functions

There are essentially two classes of functions that may be called from the interpreter: intrinsic functions and S-Lang functions.

An intrinsic function is one that is implemented in C or some other compiled language and is callable from the interpreter. Nearly all of the built-in functions are of this variety. At the moment the basic interpreter provides nearly 300 intrinsic functions. Examples include the trigonometric functions sin and cos, string functions such as strcat, etc. Dynamically loaded modules such as the png and pcre modules provide additional intrinsic functions.

The other type of function is written in S-Lang and is known simply as a ``S-Lang function''. Such a function may be thought of as a group of statements that work together to perform a computation. The specification of such functions is the main subject of this chapter.

9.1 Calling Functions

The most important rule to remember in calling a function is that if the function returns a value, do something with it. While this might sound like a trivial statement, it is the number one issue that trips up novice users of the language.

To elaborate on this point further, consider the fputs function, which writes a string to an open file. This function can fail when, e.g., a disk is full, or the file is located on a network share and the network goes down, etc.

S-Lang supports two mechanisms that a function may use to report a failure: raising an exception or returning a status code. The latter mechanism is used by the S-Lang fputs function; that is, it returns a value to indicate whether or not it was successful. Many users familiar with this function either seem to forget this fact, or assume that the function will succeed and do not bother to handle the return value. While some languages silently remove such values from the stack, S-Lang regards the stack as a dynamic data structure that programs can utilize. As a result, the value will be left on the S-Lang stack and can cause problems later on.
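
The effect of an ignored return value can be observed directly with the intrinsics _stkdepth and _pop_n. The following is only a minimal sketch; the names returns_status and count_leaked are hypothetical and were chosen for illustration:

```slang
% Hypothetical function that, like fputs, returns a status code.
define returns_status ()
{
   return 0;
}

% Calls returns_status without using its value and reports how many
% items were left behind on the stack.
define count_leaked ()
{
   variable before, leaked;

   before = _stkdepth ();
   returns_status ();                % the return value is ignored
   leaked = _stkdepth () - before;   % number of stray items
   _pop_n (leaked);                  % clean up before returning
   return leaked;
}

variable nleaked = count_leaked ();
```

Under these assumptions nleaked should be 1: the ignored status code stayed on the stack until it was explicitly removed.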

There are a number of correct ways of ``doing something'' with the return value from a function. Of course the recommended procedure is to use the return value as it was meant to be used. In the case of fputs, the proper thing to do is to check the return value, e.g.,


     if (-1 == fputs ("good luck", fp))
       {
          % Handle the error
       }

Other acceptable ways to ``do something'' with the return value include assigning it to a dummy variable,


     dummy = fputs ("good luck", fp);

or simply ``popping'' it from the stack:


     fputs ("good luck", fp);  pop();

The latter mechanism can also be written as


     () = fputs ("good luck", fp);

The last form is a special case of the multiple assignment statement, which is discussed in more detail below. Since this form is simpler than assigning the value to a dummy variable or explicitly calling the pop function, it is recommended over the other two mechanisms. Finally, this form has the redeeming feature that it presents a visual reminder that the function is returning a value that is not being used.

9.2 Declaring Functions

Like variables, functions must be declared before they can be used. The define keyword is used for this purpose. For example,


      define factorial ();

is sufficient to declare a function named factorial. Unlike the variable keyword used for declaring variables, the define keyword does not accept a list of names.

Usually, the above form is used only for recursive functions. In most cases, the function name is followed by a parameter list and the body of the function:

define function-name (parameter-list) { statement-list }

The function-name is an identifier and must conform to the naming scheme for identifiers discussed in the chapter on Identifiers. The parameter-list is a comma-separated list of variable names that represent parameters passed to the function, and may be empty if no parameters are to be passed. The variables in the parameter-list are implicitly declared; thus, there is no need to declare them via a variable declaration statement. In fact, any attempt to do so will result in a syntax error.

The body of the function is enclosed in braces and consists of zero or more statements (statement-list). While there are no imposed limits upon the number of statements that may occur within an S-Lang function, it is considered poor programming practice if a function contains many statements. This notion stems from the belief that a function should have a simple, well defined purpose.
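
As a sketch of how the two forms fit together, the declaration of factorial may be followed by its full definition. The pair is_even and is_odd (hypothetical names, not part of the language) shows a case where the declaration is needed because each function calls the other:

```slang
define factorial ();          % declaration only

define factorial (n)
{
   if (n < 2) return 1;
   return n * factorial (n - 1);
}

define is_odd ();             % declared so that is_even can call it

define is_even (n)
{
   if (n == 0) return 1;
   return is_odd (n - 1);
}

define is_odd (n)
{
   if (n == 0) return 0;
   return is_even (n - 1);
}

variable f5 = factorial (5);  % 120
variable e4 = is_even (4);    % 1
```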

9.3 Parameter Passing Mechanism

Parameters to a function are always passed by value and never by reference. To see what this means, consider


     define add_10 (a) 
     {
        a = a + 10;
     }
     variable b = 0;
     add_10 (b);

Here a function add_10 has been defined, which when executed, adds 10 to its parameter. A variable b has also been declared and initialized to zero before being passed to add_10. What will be the value of b after the call to add_10? If S-Lang were a language that passed parameters by reference, the value of b would be changed to 10. However, S-Lang always passes by value, which means that b will retain its value of zero during and after the function call.
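
When the caller needs the modified value, the simplest approach is to return it. The following sketch contrasts the two styles; add_10_local and add_10_value are hypothetical names used to keep the two variants apart:

```slang
define add_10_local (a)
{
   a = a + 10;          % changes only the function's own copy
}

define add_10_value (a)
{
   return a + 10;       % hands the new value back to the caller
}

variable b = 0;
add_10_local (b);                  % b is still 0
variable c = add_10_value (b);     % c is 10, b is still 0
```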

S-Lang does provide a mechanism for simulating pass by reference via the reference operator. This is described in greater detail in the next section.

If a function is called with a parameter in the parameter list omitted, the corresponding variable in the function will be set to NULL. To make this clear, consider the function


     define add_two_numbers (a, b)
     {
        if (a == NULL) a = 0;
        if (b == NULL) b = 0;
        return a + b;
     }

This function must be called with two parameters. However, either of them may be omitted by calling the function in one of the following ways:


     variable s = add_two_numbers (2,3);
     variable s = add_two_numbers (2,);
     variable s = add_two_numbers (,3);
     variable s = add_two_numbers (,);

The first example calls the function using both parameters, but at least one of the parameters was omitted in the other examples. If the parser recognizes that a parameter has been omitted by finding a comma or right-parenthesis where a value is expected, it will substitute NULL for the missing value. This means that the parser will convert the latter three statements in the above example to:


     variable s = add_two_numbers (2, NULL);
     variable s = add_two_numbers (NULL, 3);
     variable s = add_two_numbers (NULL, NULL);

It is important to note that this mechanism is available only for function calls that specify more than one parameter. That is,


     variable s = add_10 ();

is not equivalent to add_10(NULL). The reason for this is simple: the parser can only tell whether or not NULL should be substituted by looking at the position of the comma character in the parameter list, and only function calls that indicate more than one parameter will use a comma. A mechanism for handling single parameter function calls is described later in this chapter.
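
One common use of this substitution is to give a trailing parameter a default value. The following is a minimal sketch; join_pair is a hypothetical function, and the default separator is an arbitrary choice made for the example:

```slang
% Join two strings, using a single space when no separator is given.
define join_pair (a, b, sep)
{
   if (sep == NULL) sep = " ";
   return a + sep + b;
}

variable s1 = join_pair ("good", "luck", "-");   % "good-luck"
variable s2 = join_pair ("good", "luck", );      % "good luck"
```

Because the call still contains commas, the parser can see that the third value is missing and passes NULL in its place.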

9.4 Referencing Variables

One can achieve the effect of passing by reference by using the reference (&) and dereference (@) operators. Consider again the add_10 function presented in the previous section. This time it is written as:


     define add_10 (a)
     {  
        @a = @a + 10;
     }
     variable b = 0;
     add_10 (&b);

The expression &b creates a reference to the variable b and it is the reference that gets passed to add_10. When the function add_10 is called, the value of the local variable a will be a reference to the variable b. It is only by dereferencing this value that b can be accessed and changed. So, the statement @a=@a+10 should be read as ``add 10 to the value of the object that a references and assign the result to the object that a references''.

The reader familiar with C will note the similarity between references in S-Lang and pointers in C.
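
A function that exchanges the values of two variables is a typical use of this technique. The sketch below uses the hypothetical name swap_values:

```slang
define swap_values (a, b)
{
   variable tmp = @a;     % value of the object that a references
   @a = @b;
   @b = tmp;
}

variable p = 1, q = 2;
swap_values (&p, &q);     % now p is 2 and q is 1
```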

References are not limited to variables. A reference to a function may also be created and passed to other functions. As a simple example from elementary calculus, consider the following function which returns an approximation to the derivative of another function at a specified point:


     define derivative (f, x)
     {
        variable h = 1e-6;
        return ((@f)(x+h) - (@f)(x)) / h;
     }
     define x_squared (x)
     {
        return x^2;
     }
     variable dydx = derivative (&x_squared, 3);

When the derivative function is called, the local variable f will be a reference to the x_squared function. The x_squared function is called with the specified parameters by dereferencing f with the dereference operator.
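
The same pattern works for any function that takes another function as a parameter, including references to intrinsic functions. In this sketch, apply_twice and add_3 are hypothetical names:

```slang
% Apply the function referenced by f to x, then to the result.
define apply_twice (f, x)
{
   variable y = (@f)(x);
   return (@f)(y);
}

define add_3 (x)
{
   return x + 3;
}

variable r1 = apply_twice (&add_3, 10);    % add_3 (add_3 (10)) = 16
variable r2 = apply_twice (&sqrt, 16.0);   % sqrt (sqrt (16.0)) = 2.0
```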

9.5 Functions with a Variable Number of Arguments

When an S-Lang function is called with parameters, those parameters are placed on the run-time stack. The function accesses those parameters by removing them from the stack and assigning them to the variables in its parameter list. The details of this operation are for the most part hidden from the programmer. But what happens when the number of parameters in the parameter list is not equal to the number of parameters passed to the function? If the number passed to the function is less than what the function expects, a StackUnderflow error could result as the function tries to remove items from the stack. If the number passed is greater than the number in the parameter list, then the extras will remain on the stack. The latter feature makes it possible to write functions that take a variable number of arguments.

Consider the add_10 example presented earlier. This time it is written


     define add_10 ()
     {
        variable x;
        x = ();
        return x + 10;
     }
     variable s = add_10 (12);  % ==> s = 22;

For the uninitiated, this example looks as if it is destined for disaster. The add_10 function appears to accept zero arguments, yet it was called with a single argument. On top of that, the assignment to x looks strange. The truth is, the code presented in this example makes perfect sense, once you realize what is happening.

First, consider what happens when add_10 is called with the parameter 12. Internally, 12 is pushed onto the stack and then the function is called. Now, consider the function add_10 itself. In it, x is a local variable. The strange looking assignment `x=()' causes whatever is on the top of the stack to be assigned to x. In other words, after this statement, the value of x will be 12, since 12 is at the top of the stack.
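
With more than one argument the order of removal matters, because the last argument passed is the one at the top of the stack. A small sketch, using the hypothetical name subtract:

```slang
define subtract ()
{
   variable x, y;
   y = ();        % the last argument is on top of the stack
   x = ();        % the first argument lies underneath it
   return x - y;
}

variable d = subtract (10, 3);   % 10 - 3 = 7
```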

A generic function of the form


    define function_name (x, y, ..., z)
    {
       .
       .
    }

is transformed internally by the parser to


    define function_name ()
    {
       variable x, y, ..., z;
       z = ();
       .
       .
       y = ();
       x = ();
       .
       .
    }

before further parsing. (The add_10 function, as defined above, is already in this form.) With this knowledge in hand, one can write a function that accepts a variable number of arguments. Consider the function:


    define average_n (n)
    {
       variable x, y;
       variable s;
       
       if (n == 1) 
         {
            x = ();
            s = x;
         }
       else if (n == 2)
         {
            y = ();
            x = ();
            s = x + y;
         }
       else throw NotImplementedError;

       return s / n;
    }
    variable ave1 = average_n (3.0, 1);        % ==> 3.0
    variable ave2 = average_n (3.0, 5.0, 2);   % ==> 4.0

Here, the last argument passed to average_n is an integer reflecting the number of quantities to be averaged. Although this example works fine, its principal limitation is obvious: it only supports one or two values. Extending it to three or more values by adding more else if constructs is rather straightforward but hardly worth the effort. There must be a better way, and there is:


   define average_n (n)
   {
      variable s, x;
      s = 0;
      loop (n) 
        {
           x = ();    % get next value from stack
           s += x;
        }
      return s / n;
   }

The principal limitation of this approach is that one must still pass an integer that specifies how many values are to be averaged. Fortunately, a special variable exists that is local to every function and contains the number of values that were passed to the function. That variable has the name _NARGS and may be used as follows:


   define average_n ()
   {
      variable x, s = 0;
      
      if (_NARGS == 0) 
        usage ("ave = average_n (x, ...);");

      loop (_NARGS)
        {
           x = ();
           s += x;
        }
      return s / _NARGS;
   }

Here, if no arguments are passed to the function, the usage function will generate a UsageError exception along with a simple message indicating how to use the function.
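
The _NARGS variable also makes it possible to handle a call with a single optional parameter, a case that the NULL substitution described earlier cannot cover. The following sketch assumes that the function is called with at most one argument and that 0 is an acceptable default:

```slang
define add_10 ()
{
   variable a = 0;      % assumed default when no argument is passed

   if (_NARGS == 1)
     a = ();            % take the single argument from the stack
   return a + 10;
}

variable t0 = add_10 ();     % 10
variable t1 = add_10 (5);    % 15
```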

9.6 Returning Values

As stated earlier, the usual way to return values from a function is via the return statement. This statement has the simple syntax

return expression-list ;

where expression-list is a comma-separated list of expressions. If the function does not return any values, the expression list will be empty. A simple example of a function that can return multiple values (two in this case) is:


        define sum_and_diff (x, y)
        {
            variable sum, diff;

            sum = x + y;  diff = x - y;
            return sum, diff;
        }

9.7 Multiple Assignment Statement

In the previous section an example of a function returning two values was given. That function can also be written somewhat more simply as:


       define sum_and_diff (x, y)
       {
          return x + y, x - y;
       }

This function may be called using


      (s, d) = sum_and_diff (12, 5);

After the above line is executed, s will have a value of 17 and the value of d will be 7.

The most general form of the multiple assignment statement is


     ( var_1, var_2, ..., var_n ) = expression;

Here expression is an arbitrary expression that leaves n items on the stack, and var_k represents an l-value object (one that permits assignment). The assignment statement removes those values and assigns them to the specified variables. Usually, expression is a call to a function that returns multiple values, but it need not be. For example,


     (s,d) = (x+y, x-y);

produces results that are equivalent to the call to the sum_and_diff function. Another common use of the multiple assignment statement is to swap values:


     (x,y) = (y,x);
     (a[i], a[j], a[k]) = (a[j], a[k], a[i]);

If an l-value is omitted from the list, then the corresponding value will be removed from the stack. For example,


     (s, ) = sum_and_diff (9, 4);

assigns the sum of 9 and 4 to s and the difference (9-4) is removed from the stack. Similarly,


     () = fputs ("good luck", fp);

causes the return value of the fputs function to be discarded.
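
The forms described in this section can be combined freely. As a sketch, consider a hypothetical div_mod function that returns an integer quotient and a remainder:

```slang
define div_mod (a, b)
{
   return a / b, a mod b;
}

variable quot, rem;
(quot, rem) = div_mod (17, 5);   % quot = 3, rem = 2
(quot, ) = div_mod (23, 5);      % quot = 4; the remainder is discarded
(, rem) = div_mod (23, 5);       % rem = 3; the quotient is discarded
```

In each statement the number of positions in the list, filled or empty, matches the number of values that div_mod returns, so nothing is left behind on the stack.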

It is possible to create functions that return a variable number of values instead of a fixed number. Although such functions are discouraged, it is easy to cope with them. Usually, the value at the top of the stack will indicate the actual number of return values. For such functions, the multiple assignment statement cannot directly be used. To see how such functions can be dealt with, consider the following function:


     define read_line (fp)
     {
        variable line;
        if (-1 == fgets (&line, fp))
          return -1;
        return (line, 0);
     }

This function returns either one or two values, depending upon the return value of fgets. Such a function may be handled using:


      status = read_line (fp);
      if (status != -1)
        {
           s = ();
           .
           .
        }

In this example, the last value returned by read_line is assigned to status and then tested. If it is not -1, the remaining return value (the line that was read) is assigned to s. In particular, note the empty set of parentheses in the assignment to s. This simply indicates that whatever is on the top of the stack when the statement is executed will be assigned to s.
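
The same convention can be tried without a file. In the sketch below, find_first and index_or_default are hypothetical functions; find_first returns either a status of -1 alone, or an index followed by a status of 0:

```slang
define find_first (a, x)
{
   variable i;
   for (i = 0; i < length (a); i++)
     {
        if (a[i] == x)
          return (i, 0);     % two values: index, then status
     }
   return -1;                % one value: status only
}

define index_or_default (a, x, dflt)
{
   variable status, idx;

   status = find_first (a, x);
   if (status == -1)
     return dflt;
   idx = ();                 % retrieve the value still on the stack
   return idx;
}

variable i1 = index_or_default ([10, 20, 30], 20, -99);   % 1
variable i2 = index_or_default ([10, 20, 30], 50, -99);   % -99
```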

9.8 Exit-Blocks

An exit-block is a set of statements that get executed when a function returns. They are very useful for cleaning up when a function returns via an explicit call to return from deep within a function.

An exit-block is created by using the EXIT_BLOCK keyword according to the syntax

EXIT_BLOCK { statement-list }

where statement-list represents the list of statements that comprise the exit-block. The following example illustrates the use of an exit-block:


      define simple_demo ()
      {
         variable n = 0;

         EXIT_BLOCK { message ("Exit block called."); }

         forever
          {
            if (n == 10) return;
            n++;
          }
      }

Here, the function contains an exit-block and a forever loop. The loop will terminate via the return statement when n is 10. Before it returns, the exit-block will get executed.

A function can contain multiple exit-blocks, but only the last one encountered during execution will actually get used. For example,


      define simple_demo (n)
      {
         EXIT_BLOCK { return 1; }
         
         if (n != 1)
           {
              EXIT_BLOCK { return 2; }
           }
         return;
      }

If 1 is passed to this function, the first exit-block will get executed because the second one would not have been encountered during the execution. However, if some other value is passed, the second exit-block would get executed. This example also illustrates that it is possible to explicitly return from an exit-block, but nested exit-blocks are illegal.
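
As a sketch of the clean-up use mentioned at the start of this section, the following function has two return statements but only one place where the clean-up code is written. The global counter Cleanup_Count and the function first_negative are hypothetical; a real program would more likely close a file or release some other resource in the exit-block:

```slang
variable Cleanup_Count = 0;      % counts how often the exit-block ran

define first_negative (a)
{
   variable i;

   EXIT_BLOCK { Cleanup_Count++; }

   for (i = 0; i < length (a); i++)
     {
        if (a[i] < 0) return i;   % early return from inside the loop
     }
   return -1;                     % no negative element was found
}

variable k1 = first_negative ([3, -1, 4]);   % 1
variable k2 = first_negative ([3, 1, 4]);    % -1
```

Both calls leave through a return statement, so the counter should have been incremented once per call.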
