/* A subroutine declaration */
int play(char* filename, double frequency, int times, audio_stream* stream) {
...
}
/* A subroutine call */
play("happy_birthday.mid", 35, 2, &s);
A subroutine's signature specifies the number, order, names, modes and types of its parameters and return values. Example:
// Declaration specifies the subroutine name and its signature sub f(p1: m1 T1, p2: m2 T2, ... pn: mn Tn) returns T0; // Here is an example call T0 x = f(e1, e2, ... ek);
How do we know if a call matches a signature?
One very common rule is:
k=n and the type of each ei must be compatible with the corresponding Ti
but many languages have many different rules.
How do we know which argument goes with which parameter? Many answers to this question!
Arguments are usually associated with parameters positionally, but in Ada you can name the parameters in the call:
procedure P(X: Integer; Y: Float; C: Duration; B: Boolean); ... P(X => 4, B => False, C => 4.0, Y => 35.66); P(B => True, C => 12.0, Y => -11.4 * Z, X => 1); P(C => 12.0, Y => -11.4, B => Found or not Odd(Q), X => 1); P(8, 4.2, B => True, C => Stop_Time - End_Time); ... procedure Push(On_The_Stack: Stack, The_Value: Character); ... Push(On_The_Stack => S, The_Value => '$'); Push(The_Value => '^', On_The_Stack => T);
In some languages you can "fake" named association by passing in hash literals:
>> def f(h); puts h[:x], h[:y]; end => nil >> f(:y=>3, :x=>2) 2 3 => nil
With overloaded subroutines the compiler has to figure out which subroutine is being called; this is called overload resolution.
How to deal with ambiguity? Is the return type considered? Are the modes? How about the number of arguments, when variadic subroutines are in the mix? Should we search in outer scopes, even when subroutines of the same name are "closer" to the call? Should ambiguities be detected at the point of call, or at the point of declaration?
A kind of overloading, actually.
-- Ada function F(X: Integer; Y: Integer := 1; Z: Integer := 0) return Integer is begin return X * Y + Z; end F; -- F(4, 3, 9) returns 21 -- F(4, 3) returns 12 -- F(4) returns 4
// C++
int f(int x, int y = 1, int z = 0) {
return x * y + z;
}
// f(4, 3, 9) returns 21
// f(4, 3) returns 12
// f(4) returns 4
; Common Lisp (defun f (x &optional (y 1) (z 0)) (+ (* x y) z)) ; (f 4 3 9) returns 21 ; (f 4 3) returns 12 ; (f 4) returns 4
You fake them in Perl because all the arguments are passed in one big list:
sub f {
die "At least one argument required" if not @_;
my $x = shift;
my $y = @_ ? shift : 1;
my $z = @_ ? shift : 0;
return $x * $y + $z;
}
Some languages have de facto defaults -- if you don't call the subroutine with enough arguments, the "missing" parameters automatically get some kind of undefined or null value.
A variadic subroutine can accept an arbitrary number of parameters.
In Lisp, you make a function variadic by marking the last parameter with &rest.
(defun f (x &rest y) (* x (apply #'+ y))) ; (f 2) means x=2 and y=() ; (f 3 5) means x=3 and y=(5) ; (f 2 8 7 5 3 4) means x=2 and y=(8 7 5 3 4)
In Perl, all subroutines are variadic, since all arguments are collected into the array @_.
sub f {
my $x = shift;
my $result = 0;
$result += $_ for (@_);
return $x * $result;
}
f 2 == 0 or die;
f 3, 5 == 15 or die;
f 2, 8, 7, 5, 3, 4 == 54 or die;
In Java, you can mark the last parameter of a method with "..." to mean that it is really an array and the caller can pass in a variable number of arguments without the clunkiness of putting them in an array.
public static int f(int x, int... y) {
int result = 0;
for (int a : y) result += a;
return x * result;
}
; f(2) means x=2 and y=new int[]{} and returns 0
; f(3,5) means x=3 and y=new int[]{5} and returns 15
; f(2,8,7,5,3,4) means x=2 and y=new int[]{8,7,5,3,4} and returns 54
In C, you need the stdarg.h library module and a lot of code.
#include <assert.h>
#include <stdarg.h>
/*
* f(k, x, y1, y2, y3, ..., yk) returns the value of
* x * sum(y1 ... yk). Note that because this is C,
* you have to pass in the argument count!
*/
int f(int k, int x, ...) {
int result = 0;
va_list ap;
va_start(ap, x); /* x is the param just before the extras */
while (k--) { /* simple count down */
int y = va_arg(ap, int);
result += y; /* sum up the extras */
}
va_end(ap);
return x * result;
}
int main() {
assert( f(2, 2, 5, 9) == 28 );
assert( f(3, 2, 1, 6, 2) == 18 );
assert( f(0, 2) == 0 );
assert( f(1, 2, 50) == 100 );
assert( f(9, 4, 1, 2, 3, 2, 1, 1, 1, 1, 5) == 68 );
}
Some design considerations:
There are many ways parameter passing can be defined. In:
subroutine f (x) begin ... end f; ... f (y);
[Semantic] Can x be written to in the body of f, and if so, will the change be made in y, and if y does change, does the change happen right away or only after f returns?
[Pragmatic] What exactly happens at the call? Is a copy of y made and passed to x? Are x and y to be aliases? Is the "address" of y passed? Is a block of code to evaluate y passed?
| Implementation | Allowable operations on parameter |
Can argument change? | Aliasing? | |
|---|---|---|---|---|
| in | any | R | No | Maybe |
| out | any | W | Yes | Maybe |
| in out | any | RW | Yes | Maybe |
| value | value | RW | No | No |
| value-result | value | RW | Yes | No |
| sharing | any | RW | Yes | Yes |
| reference | address | RW | Yes | Yes |
| name | closure | RW | Yes | Yes |
When calling a subroutine, you pass an object. What if you wanted to pass a type?
Not an issue in ML (polymorphism), Lisp (dynamic typing).
In C++ and Ada the compiler generates code for every instance of a "generic" subroutine. In C++ each actual subroutine is implicitly generated when needed:
// ---------------------------------------------------------------------------
// mintemplate.cpp
//
// This program illustrates the declaration of a template function and how
// it can be called with many different types, including user-defined ones.
// ---------------------------------------------------------------------------
#include <iostream>
using namespace std;
template <class T>
T minimum(T x, T y) {
return x < y ? x : y;
}
class A {
int x;
int y;
public:
A(int x, int y): x(x), y(y) {}
bool operator<(A other) {return x + y < other.x + other.y;}
friend ostream& operator<<(ostream& s, A a) {
return s << "(" << a.x << "," << a.y << ")";
}
};
// Show it off
int main() {
cout << minimum(4, 5) << "\n";
cout << minimum(3.3, 4.1) << "\n";
cout << minimum("dog", "bat") << "\n";
cout << minimum(A(1,3), A(6,0)) << "\n";
cout << minimum(A(5,5), A(6,0)) << "\n";
cout << minimum(A(9,1), A(2,1)) << "\n";
cout << minimum(A(2,3), A(-6,0)) << "\n";
return 0;
}
But in Ada you have to implicitly instantiate them:
------------------------------------------------------------------------------
-- generic_min_demo.adb
--
-- This program illustrates the declaration of a generic function and how
--/ it can be called with many different types, including user-defined ones.
------------------------------------------------------------------------------
with Ada.Text_IO, Ada.Integer_Text_IO, Ada.Float_Text_IO;
use Ada.Text_IO, Ada.Integer_Text_IO, Ada.Float_Text_IO;
procedure Generic_Min_Demo is
-- Define a generic Minimum function callable on any type.
-- The second parameter is a default parameter. Cool, eh?
generic
type T is limited private;
with function "<"(X: T; Y: T) return Boolean is <>;
function Minimum(X: T; Y: T) return T;
function Minimum(X: T; Y: T) return T is
begin
if X < Y then
return X;
else
return Y;
end if;
end Minimum;
-- Make a user defined type called A, with "<" and Put operations.
type A is record
X: Integer;
Y: Integer;
end record;
function "<"(X: A; Y: A) return boolean is
begin
return X.X + X.Y < Y.X + Y.Y;
end "<";
procedure Put (X: A) is
begin
Put ("(");
Put (X.X, 0);
Put (",");
Put (X.Y, 0);
Put (")");
end Put;
-- Instantiate a few
function Min_Integer is new Minimum(Integer);
function Min_Float is new Minimum(Float);
function Min_A is new Minimum(A);
-- Show off the calls
begin
Put(Min_Integer(4, 5), 0); New_Line;
Put(Min_Float(3.3, 4.1), 0); New_Line;
Put (Min_A(A'(1,3), A'(6,0))); New_Line;
Put (Min_A(A'(5,5), A'(6,0))); New_Line;
Put (Min_A(A'(9,1), A'(2,1))); New_Line;
Put (Min_A(A'(2,3), A'(-6,0))); New_Line;
end Generic_Min_Demo;
Here's an example of a Java generic method:
public static <T, S extends T> boolean in(S x, T[] a) {
for (T y : a) {
if (y.equals(x)) return true;
}
return false;
}
In some languages, a subroutine s may be "just another object," like an integer, string or list. If so, we can ask these questions:
In ML, Lisp, Scheme, Haskell and other "functional" languages, then answer to all 5 questions is definitely yes.
Other languages answer many of these questions no, but others might say the answer is technically no but allow references to subroutines everywhere. However the answer to question 5 is often no. Except in Perl:
# Some higher order functions in Perl
use strict;
use warnings;
# ----------------------------------------------------------------------------
# compose($f, $g) returns the function which is the composition
# of &$f and &$g.
# ----------------------------------------------------------------------------
sub compose {
my ($f, $g) = @_;
return sub { $f->($g->($_[0])); }
}
# ----------------------------------------------------------------------------
# twice($f) returns the function which applies &$f to an argument twice.
# ----------------------------------------------------------------------------
sub twice {
return compose($_[0], $_[0]);
}
# ----------------------------------------------------------------------------
# TESTING
# ----------------------------------------------------------------------------
sub square { $_[0] * $_[0]; }
sub addSix { $_[0] + 6; }
my $fun = compose(\&square, \&addSix);
$fun->(9) == 225 or die;
twice(\&square)->(3) == 81 or die;
A closure is a subroutine that refers to variables defined in an enclosing scope.
function counterFrom(start) {
var number = start;
return function(increment) {
number += increment;
return number;
}
}
var c1 = counterFrom(20);
var c2 = counterFrom(100);
alert(c1(2));
alert(c1(5));
alert(c2(3));
alert(c1(4));
alert(c2(1));
They are interesting when the enclosing scope itself is a function, because when the enclosing function goes out of scope, its local variables must be retained because instances of the inner function might still need them! This is both powerful, but you have to be careful not to fill up memory since a lot of stuff becomes ineligible for garbage collection.
Study:
Each invocation of a subroutine has a record associated with it at runtime, holding some or all of the following:
Note that some of the stack frame may be in registers and some of it may be in memory.
Activation records are also called stack frames. When the runtime supports closures or concurrency, something like a spaghetti stack is required.
Calling conventions specify:
The C-calling convention on the IA-32 is:
int example(int x, int y) {
int a, b, c;
...
f(y, a, b, b, x);
...
}
The prolog/epilog for this function would look like:
push ebp ; must save old ebp mov ebp, esp ; point ebp to this frame sub esp, ___ ; make space for locals ... mov esp, ebp ; clean up locals pop ebp ; restore old ebp ret
with stack frame
+---------+
ebp-12 | a |
+---------+
ebp-8 | b |
+---------+
ebp-4 | c |
+---------+
ebp | old ebp |
+---------+
ebp+4 | retaddr |
+---------+
ebp+8 | x |
+---------+
ebp+12 | y |
+---------+
The SPARC uses register windows
! ----------------------------------------------------------------------------
! writeint
!
! A small program that contains a function to write an unsigned integer value
! to standard output.
! ----------------------------------------------------------------------------
.global _start
.text
! writeint(unsigned int n) writes n to standard output as text. It is
! inefficient since it is recursive and invokes a system call to write
! each digit.
!
! It is equivalent to the C function
!
! void writeint(unsigned n) {
! if (n >= 10) writeint(n / 10);
! putchar(n + '0');
! }
writeint:
save %sp, -96, %sp
mov 10, %l0 ! Keep 10 in %l0
cmp %i0, %l0 ! Is it less than 10?
bl writedigit ! If so just write the digit
nop
mov %g0, %y
udiv %i0, %l0, %l1 ! n / 10 --> %l1
umul %l1, %l0, %l2 ! n * (n/10) --> %l2
sub %i0, %l2, %l2 ! n mod 10 --> %l2
call writeint
mov %l1, %o0 ! call writeint(n / 10)
mov %l2, %i0 ! Put digit into %i0 before writing
writedigit: ! Writes the digit in %i0
add 48, %i0, %l1 ! Character value of the digit
set digit, %o1
stb %l1, [%o1] ! Must be in memory to write
mov 4, %g1 ! System call write
mov 1, %o0 ! stdout
mov 1, %o2 ! number of characters to write
ta 0
ret
restore
! A starting point for the program that just makes a couple of calls
! to the writeint function.
_start:
call writeint ! writeint(5)
mov 5, %o0
call writeint ! writeint(1312)
mov 1312, %o0
sethi %hi(7308841), %o0 ! writeint(7308841)
call writeint
or %o0, %lo(7308841), %o0
mov 1, %g1
ta 0
.data
.align 4
digit: .skip 4
A leaf subroutine makes no calls. A good compiler can generate code to make a leaf subroutine's activation use the stack frame of its caller. The SPARC has direct support for this.
! write_stars(n) writes n stars followed by a newline write_stars: mov %o0, %l0 ! save the argument in %l0 mov 4, %g1 ! syscall write mov 1, %o0 ! stdout set star, %o1 ! message to write mov 1, %o2 ! one character in message write_star: tst %l0 ! no more characters to write be write_newline ! write terminating newline if so nop ta 0 ba write_star ! go see if more to write dec %l0 ! count off one more char as written write_newline: set newline, %o1 retl ta 0 ! write the newline (before returning) star: .ascii "*" newline: .ascii "\n"
A compiler inlines a subroutine call when it replaces the call with the body of the called subroutine (with all argument-parameter bindings resolved, of course). If all calls are inlined, there is no need to generate separate code for the subroutine. Inlining should be faster in general, since there is no call/return overhead (which is substantial for short subroutines). But if there are thousands of such calls you can end up with more compiled code.
And compilers have to be careful when inlining recursive subroutines....
Given
function f(a, b) {
var x = 1;
function g(a, y) {
var z = 3;
function h(z) {
print x;
}
function k(c) {
var x = c;
if (x < 5) k(c+1); else h(1);
}
if (a == 0) g(1, 0); else k(2);
}
g(0, 0);
}
The call sequence goes f → g → g → k → k → k → k → h. What gets printed? It depends on whether we have static or dynamic scoping.
If dynamically scoped, use a-lists or central reference tables.
If statically scoped, use static links (or displays). In this case we rely on the compiler to generate code that says:
"To find x from the body of k, take two steps down the static chain and look in offset -4".
