ML is really the name for a family of languages; the two important ones are Standard ML and Objective CAML (a.k.a O'Caml or ocaml). These notes only cover Standard ML (the latest version, SML 97).
You will love ML because it:
More reasons: Read this overview of features.
The traditional "Hello, world" program:
print "Hello, World\n";
You can "run" the code by dropping it into an interactive shell, or by using a compiler to produce executables.
Just type your declarations and expressions into the interactive shell and the system will respond. Here is an example session. (Assume the file hello.sml contains the code above.)
C:\lmu\web\code\ML>mosml Moscow ML version 2.00 (June 2000) Enter `quit();' to quit. - 4; > val it = 4 : int - "hello " ^ " there"; > val it = "hello there" : string - fun fact n = if n = 0 then 1 else n * fact(n - 1); > val fact = fn : int -> int - fact 5; > val it = 120 : int - val x = 12 and y = ~5; > val x = 12 : int val y = ~5 : int - x - y * 2; > val it = 22 : int - use "hello.sml"; [opening file "hello.sml"] Hello, World > val it = () : unit [closing file "hello.sml"] > val it = () : unit - length [4, 3, 8, 7, 3]; > val it = 5 : int - quit();
Note how nice the system is.... If you don't specify types, it figures them out for you (as long as what you wrote makes sense). If you type in an expression directly, it turns it into a declaration binding the result to the special name it.
The Moscow ML compiler is called mosmlc. It won't compile our "Hello, World" program because it doesn't like standalone expressions: it requires declarations:
val _ = print "Hello, World\n";
Then on Windows you can enter
mosmlc helloworld.sml -o helloworld.exe
and on Unix you can enter
mosmlc -standalone helloworld.sml -o helloworld
and you have your executable.
Here's a slightly more interesting "program"
(*
* reverseargs.sml
*
* A little program that writes the command line arguments
* in reverse, to standard output. It is longer than necessary
* because it is a teaching example.
*)
(* Returns the reversal of a list. Warning: wheel reinvention. *)
fun reverse [] = []
| reverse (h :: t) = reverse t @ [h];
fun concat_space s = s ^ " ";
(* Prints each command line arg, suffixed with a space. *)
val _ =
let
val args = CommandLine.arguments()
in
map (print o concat_space) (reverse args);
print "\n"
end;
A "program" is a sequence of declarations. Declarations begin with:
• val declares a new value binding • fun declares a new function • type introduces a type abbreviation • datatype defines a new type with visible constructors • abstype defines a new type with hidden constructors • exception defines a new exception constructor • signature defines a new signature • structure defines a new structure • functor defines a new function • infix gives infix, left-associative status to an operator • infixr gives infix, right-associative status to an operator • nonfix declares an operator to be prefix • local makes a declaration that uses private local declarations • open allows all items in a structure to be used without qualification
Most program entities (types, exceptions, values, etc.) are defined in structures. For example, the structure Timer contains the type cpu_timer and the functions startCPUTimer and checkCPUTimer, among other things, so you need to write Timer.cpu_timer and Timer.checkCPUTimer, etc. But entities in the package General (which contains types like int, string, and real and functions like <=, not, and mod do not have to be qualified with the package name.
(*
* timerExample.sml
*
* Trivial example of timing.
*)
fun count(n) =
if (n > 0) then count(n - 1) else ();
val _ =
let
val t = Timer.startCPUTimer()
in
count(10000000);
print (Time.toString(#usr(Timer.checkCPUTimer(t))) ^ "\n")
end;
abstype and andalso as case do datatype else
end eqtype exception fn fun functor handle if
in include infix infixr let local nonfix of
op open orelse raise rec sharing sig signature
struct structure then type val where with withtype
while ( ) [ ] { } ,
: :> ; ... _ | = =>
-> #
Are either
But reserved words are excluded. And notice no mixing of alphanumeric and symbolic characters! You can't have the identifier |-o-|, though you can have |-*-|.
Between (* and *) and they can nest!
Here are some of the types in Standard ML
| Name | Defined in | Examples |
|---|---|---|
| unit | General | () |
| bool | Bool, General | true false |
| int | General, Int | 0 ~4 999999 0xfa33bc ~0x1fcf5 |
| word | General, Word, Word8 | 0w0 0w4 0wx4ccab |
| real | General, Real, Math | 0.7 ~0.7 3.32E5 3E~7 ~3E~7 |
| char | Char, General | #"s" |
| string | General, String | "hello" |
| α -> β | the type of functions taking an argument of type α and returning a value of type β | fn x => x + 1 fn x => fn y => 1 - x / y |
| record types | a built-in type former | {name="Masha", age=25} {re=6.2, im=~3.0} {1=4, 2="d", 3=false} |
| α * β | abbreviation for records whose fields are numbers starting with 1 | (4, "d", false) |
| α list | General, List | nil 4::3::12::5::nil [4, 3, 12, 5] [ ] |
| α option | General, Option | SOME 2 NONE |
| exn | General (This is a special datatype because new constructors can be added at runtime) |
Div Fail "dunno" |
| α ref | General | ref 2 |
| α frag | General | QUOTE "hello" ANTIQUOTE 9 |
The type declaration does not make a new type, just a type abbreviation.
type intpair = int * int;
type person = {name: string, birthday: date, weight: real};
type simplefun = real -> real;
type text = string;
type intset = bool list;
type coin_flipping_results = bool list;
Here coin_flipping_results and intset are exactly the same type.
You need to use datatype or abstype to create a new type, distinct from any other. Datatypes are defined with constructors.
datatype weekday = Monday | Tuesday | Wednesday | Thursday | Friday;
datatype file_descriptor = Stdin | Stdout | Stderr | Other of int;
datatype 'a tree = Empty
| Node of 'a * 'a tree * 'a tree;
datatype Shape = Circle of {radius: real}
| Rectangle of {height: real, width: real}
| Polygon of (real * real) list;
abstype 'a stack = Data of 'a list
with
exception Underflow;
val newStack = Data []
fun push (Data s) x = Data (x::s)
fun pop (Data []) = raise Underflow
| pop (Data (_::t)) = Data t
fun top (Data []) = raise Underflow
| top (Data (x::t)) = x
fun listToStack s = Data s
fun stackToList (Data s) = s
end;
The abstract type is just a datatype that hides its constructors. But its probably better to use structures anyway.
Functions are just regular values who happen to have a type like 'a -> 'b. The syntax of a function is:
fn pattern => exp ('|' pattern => exp)*
Function calls are straighforward:
(fn x => x + 6) 20; (fn [] => false | (h::t) => true) [4, 5, ~3]; (fn _ => 0) (4, "dog", [4.5, 1.1], (1,1)); (fn 5 => 0 | x => x) 5; (fn 5 => 0 | x => x) 12;
Note that each match is tried in order. Order matters, so be careful.
Because functions are values, you declare them just like any other value. You can take advantage of the fact that each declaration can use declarations that precede it.
val two = 2; val addSix = fn x => x + two + 4; val square = fn x:real => x * x; val magnitude = fn (x,y) => Math.sqrt(square x + square y); val rec fact = fn n => if n = 0 then 1 else n * fact (n-1);
You can use fun for convenience (it's basically val rec):
fun addSix x = x + 6; fun square (x:real) = x * x; fun magnitude (x,y) = Math.sqrt(square x + square y); fun fact n = if n = 0 then 1 else n * fact (n-1);
You can use multiple patterns with fun too; it sure makes writing recursive functions easy. And pattern-based definitions are much preferred over if-expressions.
fun fact 0 = n | fact n = n * fact(n - 1); fun pow (0, _) = 1.0 | pow (n, x) = x * pow(n - 1, x); fun size Empty = 0 | size (Node (x, left, right)) = 1 + size left + size right; fun preorder Empty = [] | preorder (Node (x, left, right)) = [x] @ preorder left @ preorder right;
The functions fn (x,y) => x + y and fn x => fn y => x + y are similar. The former is the uncurried version; the latter is curried. Curried functions allow "partial application":
fun add x y = x + y; fun addSix = add 6; add 6 5; (* returns 11 *) addSix 5; (* also returns 11 *)
A higher order function is one that takes a function as a parameter and/or returns a function as its result. The curried plus function above is a higher order function, because it takes in an integer and returns a function.
Operators are really just functions that are given precedence and fixity in a infix, infixr or nonfix directive. These are some of the infix operators defined in the standard library. (Note that many of the built-in operators are overloaded, so the specification has "fake types" — wordint means int, word or word8; num means wordint or real; numtxt means num, char or string.)
| Precedence | Id | Type | Description | Associativity |
|---|---|---|---|---|
| 7 | * | num * num -> num | product | L |
| / | real * real -> real | floating-point quotient | L | |
| div | wordint * wordint -> wordint | quotient (rounded toward -infinity) | L | |
| mod | wordint * wordint -> wordint | remainder (of div) | L | |
| 6 | + | num * num -> num | sum | L |
| - | num * num -> num | difference | L | |
| ^ | string * string -> string | concatenation | L | |
| 5 | :: | 'a * 'a list -> 'a list | push to head of list | R |
| @ | 'a list * 'a list -> 'a list | append lists | R | |
| 4 | = | 'a * 'a -> bool | equal to | L |
| <> | 'a * 'a -> bool | not equal to | L | |
| < | numtxt * numtxt -> bool | less than | L | |
| <= | numtxt * numtxt -> bool | less than or equal to | L | |
| > | numtxt * numtxt -> bool | greater than | L | |
| >= | numtxt * numtxt -> bool | greater than or equal to | L | |
| 3 | := | 'a ref * 'a -> unit | assignment | L |
| o | ('b*'c) * ('a*'b) -> ('a*'c) | composition | L | |
| 0 | before | 'a * 'b -> 'a | return first argument | L |
And you can define your own:
fun |-*-| (x, y) = 2 * x + 3 * y; infix 5 |-*-|; 10 |-*-| 4; (* returns 32 *) 1 |-*-| 2 |-*-| 3; (* returns 25 *) 1 |-*-| (2 |-*-| 3); (* returns 41 *) 5 + 3 |-*-| 6 before 8; (* returns 34 *)
You can abuse the "built-in" operators:
- infix 2 *; > infix 2 * - 4 + 8 * 1 + 4; > val it = 60 : int - val op+ = op-; > val + = fn : int * int -> int - 5 + 2; > val it = 3 : int
Remember, useful programming languages provide modules for
structure Set =
struct
datatype 'a set = Items of 'a list;
val empty = Items [];
fun add (Items xs) x =
if List.exists (fn y => y = x) xs then Items xs else Items(x::xs);
fun remove (Items xs) x =
Items (List.filter (fn y => y <> x) xs);
fun cardinality (Items xs) = length xs;
fun contains (Items xs, x) = List.exists (fn y => y = x) xs;
end;
Example of using the structure
val s = Set.add Set.empty 7; val s = Set.add s 8; val s = Set.remove s 2; Set.cardinality s; Set.contains (s, 7); Set.Items [8,8,4]; (**** uggh ****)
The previous structure exposes too much information. We can use a signature to restrict the view. We should write:
signature SET =
sig
type 'a set;
val empty: 'a set;
val add: ''a set -> ''a -> ''a set;
val remove: ''a set -> ''a -> ''a set;
val cardinality: 'a set -> int;
val contains: ''a set * ''a -> bool;
end;
Then we say
structure Set :> SET =
struct
datatype 'a set = Items of 'a list;
val empty = Items [];
fun add (Items xs) x =
if List.exists (fn y => y = x) xs then Items xs else Items(x::xs);
fun remove (Items xs) x =
Items (List.filter (fn y => y <> x) xs);
fun cardinality (Items xs) = length xs;
fun contains (Items xs, x) = List.exists (fn y => y = x) xs;
end;
Now Set.Items doesn't exist! We can now even say
signature READ_ONLY_SET =
sig
type 'a set;
val empty: 'a set;
val cardinality: 'a set -> int;
val contains: ''a set * ''a -> bool;
end;
structure ReadOnlySet :> READ_ONLY_SET = Set;
now ReadOnlySet.empty exists, but ReadOnlySet.remove does not.
If we change the set representation from a list to a binary search tree, we would have to specify how the set elements were to be ordered. This "less than" relation is an input to the structure. The way to parameterize structures in SML is with functors.
functor MakeSet (eqtype t; val lt: t * t -> bool):
sig
type ''a set
val empty: t set
val add: t set -> t -> t set
val remove: t set -> t -> t set
val cardinality: t set -> int
val contains: t set * t -> bool
end =
struct
datatype ''a set = Empty | Node of ''a * ''a set * ''a set
val empty = Empty;
fun add Empty x = Node (x, Empty, Empty)
| add (s as Node(y, left, right)) x =
if (x = y) then s
else if lt(x, y) then Node(y, add left x, right)
else Node(y, left, add right x);
fun remove s x = raise (Fail "remove not implemented");
fun cardinality Empty = 0
| cardinality (Node(_, left, right)) =
1 + cardinality left + cardinality right;
fun contains (Empty, x) = false
| contains (Node(y, left, right), x) =
if lt(x, y) then contains(left, x)
else if lt(y,x) then contains(right, x)
else true;
end;
Then
structure IntSet = MakeSet(type t = int; val lt = op<); val s = IntSet.add IntSet.empty 8; IntSet.contains(s, 7); ...
The following structures comprise the Standard Basis Library
| Array | mutable constant-time-access arrays |
| Array2 | two-dimensional arrays |
| Binaryset | binary tree implementation of finite sets |
| Bool | Booleans |
| Byte | character-byte conversion |
| Char | characters |
| CharArray | arrays of characters |
| CharVector | vectors of characters (= strings) |
| CommandLine | program name and arguments |
| Date | manipulation of calendar dates |
| FileSys | interaction with the file system |
| General | various top-level primitives |
| Int | operations on integers |
| List | classic list manipulation functions |
| ListPair | operations on pairs of lists |
| Math | trigonometric functions etc. |
| OS | operating system information |
| Option | partial functions |
| Path | file-system independent path manipulation |
| Process | manipulating processes |
| Real | arithmetics on floating-point numbers |
| Signal | Unix signals |
| SML90 | top-level compatibility with SML'90 |
| String | string manipulation |
| StringCvt | conversion to and from strings |
| Substring | manipulation of constant-time substrings |
| TextIO | text input-output streams (imperative) |
| Time | time points and durations |
| Timer | measuring real time and cpu time |
| Unix | starting concurrent subprocesses under Unix |
| Vector | immutable constant-time-access vectors |
| Word | words (31-bit unsigned integers) |
| Word8 | bytes (8-bit unsigned integers) |
| Word8Array | arrays of bytes |
| Word8Vector | vectors of bytes |
Different installations will add many more structures. For example Moscow ML adds Arraysort, Binarymap, BinIO, Dynarray, Dynlib, Intset, Meta, Mosmlcgi, Mysql, Regex, Signal, Splaymap, Gdimage, among others.
The Moscow ML Library Reference is online.