| Construct | Example |
| variables | c: char |
| constants | 7: int |
| parameters | fun f(s: string) = ... |
| fields | {name: string, birthday: date, id: int} |
| subroutines | length: string->int |
| expressions | (length(s)-5, "hello"): int*string |
In programming language theory, a language's type system defines:
Languages will differ as to which types are built-in, and which have to be defined using definition mechanisms. Some kinds of types are:
Type
Data
Simple
Discrete
Ordinal
Integer
Signed
Unsigned (Modular)
Character
Boolean
Other Enumeration
Pointer
Real
Floating-Point
Fixed-Point
Composite (Aggregate)
String
Array
List
Set
Bag
Hierarchy
Graph
Dictionary
Stack
Queue
Priority Queue
Record
Union
Class
File
Computational
Synchronous
Procedure
Function
Anonymous Block of Code
Asynchronous
Process
thread (Task)
Java has eight built-in types: boolean, byte, char, short, int, long, float, double. All other types are defined with the array or class definition mechanism.
Some languages allow type "operators" to make new types. Examples in ML include "*" and list. Users can define their own:
7 int (4, "dog") int * string [5, 3, 2, 7] int list [(3, 4.0),(1, 5.5)] (int * real) list [ [], [3,3,3], [], [1] ] int list list [5, [2.2, 1.0E~4]] int * real list datatype primary_color = Red | Blue | Green Red primary_color (Red, 4) primary_color * int fn x => x + 4 int -> int fn x => fn y => x ^ Int.toString(y) string -> int -> string
Question:
When are two types the same?
Several ways to answer this. Consider:
typedef struct {
int a;
int b;
} Point;
typedef struct {
int a;
int b;
} Pair;
Point x;
Pair y;
Do x and y have the same type? We could say yes (because they have the same structure), or no (because their types have different names, and furthermore appear in different declarations).
| Structural Equivalence | Name Equivalence |
| Check equivalence by expanding structures all the way down to basic types | Strict: Every type declaration defines a new type |
| Loose: Factor out aliases |
type alink = pointer to cell; subtype blink = alink; p, q : pointer to cell; r : alink; s : blink; t : pointer to cell; u : alink; -- If structural: [p q r s t u] -- If strict name: [p q] [r u] [s] [t] -- If loose name: [p q] [r s u] [t]
Loose name equivalence is pretty common: we like to distinguish things like Point and Pair above but want the flexibility of aliasing. In Ada we can do both explicitly:
type Height is new Integer; type Weight is new Integer; -- Can't add heights and weights subtype Height is Integer; subtype Weight is Integer; -- Now you can freely mix them
ML seems to use structural equivalence for anonymous types and also for types named in a type declaration:
type pair = int * int;
type point = int * int;
val x: point = (1, 4);
val y: pair = x;
type person = {name: string, age: int};
type school = {name: string, age: int};
val p: person = {name="Alice", age=22};
val s: school = p;
that's because "type" does not define a new type! Only a datatype or abstype declaration creates a new type.
abstype person = P of string * int with fun new_person (s, i) = P(s, i) fun name (P(s, i)) = s fun age (P(s, i)) = i end;
Question:
When can a value of type A be used in a context that expects type B?
Possible answers:
int a;
real b;
real c;
c = a + b;
Is this
A language that allows this is said to coerce ints to reals, and we say "int is compatible with real".
Question:
How do we determine the type of an expression?
Answer: Look at all the types of the primitive expressions, then at the types of arguments that the subroutines and operators expect, and work your way out to the whole expression.
1 fun fib (n) = 2 let fun fib_helper (f1, f2, i) = 3 if i = n then f2 4 else fib_helper (f2, f1+f2, i+1) 5 in 6 fib_helper (0, 1, 0) 7 end;
Here are some functions and their types
fun I x = x; 'a -> 'a fun p x y = y 'a -> 'b -> 'b fun first (x, y) = x; 'a * 'b -> 'a fun q x = (hd o hd) x; 'a list list -> 'a fun c x y = if x = y then "y" else "n"; ''a -> ''a -> string
(In ML, a type variable beginning with two primes can only be instantiated with a type that admits equality.)
Question:
What if you use an expression in a manner inconsistent with its type, like trying to compute the age of a string (instead of a person). Should the evaluation result in an error, or return a "best guess"?
| Strong Typing | Weak Typing |
| Type clashes among unrelated types result in errors. they may be compile time, or even run time (e.g. NoSuchMethodError, or ClassCastException) but they are errors. | Type clashes don't really exist... the language implementation will try to cast the argument into some reasonable (!) type and carry out the operation. |
| Alternate definition: you can't really circumvent the type system. Example: In Java, even if you explicitly add a cast to cast a string to an int, you get an error. | Alternate definition: you can easily circumvent the type system. Example: In C++, you can cast a pointer to an int to a void*, then to a string*, and you get away with it. |
| Static Typing | Dynamic Typing |
| Type checking done at compile time
Once a variable's type is known, it can only be assigned expressions of that type (including related types that can be coerced to it, of course). |
Type checking done at run time
Because checking is deferred until run-time a variable can be assigned an expression of one type and then later an expression of another type. var x; x = 2; print x + 5; x = "dog"; print concat(x, "house"); |
| Manifest Typing | Implicit Typing |
Types of variables must be given in their
declarations.
void f(int x, list<int> z) {
int y = x * x;
z.push_back(y);
}
|
the types of variables will be inferred from context
fun f x z = let y = x * x in z @ [y] endx must be an int because of the "*" operator, so y must be an int as well, and then z must be an int list, and finally f must be int -> int list -> int list. |
Suppose f has type 'a * int -> 'b * 'b -> string. then the following expressions are legal
f(4, 5)("sd","de")
f(1,1)(2,2)
f([],5)([],[4,4])
but these are not
f(3,2)(4,"p")
f((3,2),5)([5.5,2.2],[1,6])
the literature on strong/weak, static/dynamic, and manifest/implicit typing is enormous, as are the debates and flame wars. Some good reading:
Usually by array we mean a collection of elements indexed by integers or some other enumerated type, that has constant time access of any element given its index. Contrast this with a list, which needn't have constant time access, and with a map (a.k.a dictionary or associative array), which can be indexed by anything, not just integers or related enumerated types.
| Static Bounds | Bounds Set at Elaboration | Bounds Can be Changed (Dynamic Shape) |
|
|---|---|---|---|
| Global Lifetime |
C globals | ||
| Local Lifetime |
C locals | Ada locals | |
| Arbitrary Lifetime |
Java | Perl, APL |
Several languages (Perl, APL, Fortran 90) have nice syntax for slices. Fortran 90 allows (these examples are from Scott):
matrix(3:6, 4:7) columns 3-6, rows 4-7
matrix(6:, 5) columns 6-end, row 5
matrix(:4, 2:8:2) columns 1-4, every other row from 2-8
matrix(:, /2, 5, 9/) all columns, rows 2, 5, and 9
A pointer is, more or less, an object through which you reference some other object.
int x = 5;
int* p = &x;
int* q = NULL;
int* r = new int;
int* s = new int[100];
cout << *p << *r << s[20];
cout << *q; // crash
Lisp:
(R (X () ()) (Y (Z ()()) (W ()())))
ML:
datatype tree = empty | node of 'a * 'a tree * 'a tree;
val x_zw = node ('R',
node ('X', empty, empty),
node ('Y',
node ('Z', empty, empty),
node ('W', empty, empty)));
Can be always explicit, always implicit, or sometimes-implicit (like in Ada).

If the pointer is called p (or $p in Perl), then
| In | the referent is called | and the field is called |
|---|---|---|
| C | *p | (*p).x or p->x |
| Ada | p.all | p.all.x or p.x |
| Modula-2 | p^ | p^.x |
| Java | p | p.x |
| Perl | %$p | $$p{x} or $p->{x} |
If the referent is called p (or %p in Perl), then
| In | the pointer is called | and the field is called |
|---|---|---|
| C | &p | p.x |
| Ada | p'access | p.x |
| Perl | \%p | $p{x} |
Only a problem in languages td at require/allow the programmer to explicitly deallocate.
Can there be run time checks to ensure pointers cannot outlive referenced objects?
Covered separately. For now, you can read Wikipedia's article on garbage collection, and an interesting developerworks article on Java performance and garbage collection.
this can get confusing. After all,
e1[e2] is the same as *(e1 + e2)
For definitions these are
completely different things:
int *x; /* is totally different from: */
int x[100];
int *a[n]; /* is totally different from: */
int a[n][100];
But for declarations, at least in parameter declarations, you can blur the distinction:
void f(int* a) { ... }
void g(int b[]) { ... }
An array of ints, or pointer to an int, can be passed to either.
A stream
Subroutines are often objects with their own types. We'll see these later when we discuss subroutines.
Will be covered later when we discuss concurrency
Can an object of any type...
Are types themselves objects?
Issues in Orthogonality