Squid stands for Simple Quadruple-based Intermediate Description. It is a fairly useful and extensible intermediate language for imperative, block-structured languages.
A Squid description consists of a collection of subroutines, each of which is either an internal subroutine or an external subroutine. A subroutine has a list of tuples. Tuples consist of an operator and zero to three arguments. Arguments are literals, variables, temporaries, subroutines, or labels.
The set of Squid tuples is not fixed. It consists of a basic set described below, but anyone can add more if desired.
| Tuple | Rendered As... | Description |
|---|---|---|
| (COPY, x, y) | y := x | Simple copy |
| (COPY_FROM_DEREF, x, y) | y := *x | Copy the contents of memory at address x into y |
| (COPY_TO_DEREF, x, y) | *y := x | Copy x into the memory at address y |
| (COPY_FROM_OFS, x, y, z) | z := *(x+y) | Copy the contents of memory at address x+y into z |
| (COPY_TO_OFS, x, y, z) | *(y+z) := x | Copy x into the memory at address y+z |
| (ADD, x, y, z) | z := x + y | Sum |
| (SUB, x, y, z) | z := x - y | Difference |
| (MUL, x, y, z) | z := x * y | Product |
| (DIV, x, y, z) | z := x / y | Quotient |
| (MOD, x, y, z) | z := x mod y | Modulo |
| (REM, x, y, z) | z := x rem y | Remainder |
| (SHL, x, y, z) | z := x << y | Left shift |
| (SHR, x, y, z) | z := x >> y | Logical right shift |
| (SAR, x, y, z) | z := x >> y | Arithmetic right shift |
| (AND, x, y, z) | z := x & y | Bitwise conjunction |
| (OR, x, y, z) | z := x | y | Bitwise disjunction |
| (XOR, x, y, z) | z := x xor y | Bitwise exclusive or |
| (NOT, x, y) | y := !x | Logical complement |
| (NEG, x, y) | y := -x | Negation |
| (COMP, x, y) | y := ~x | Bitwise complement |
| (ABS, x, y) | y := abs x | Absolute value |
| (SIN, x, y) | y := sin x | Sine |
| (COS, x, y) | y := cos x | Cosine |
| (ATAN, x, y, z) | z := atan x, y | Arctangent |
| (LN, x, y) | y := ln x | Natural logarithm |
| (SQRT, x, y) | y := sqrt x | Square root |
| (INC x) | inc x | Increment |
| (DEC, x) | dec x | Decrement |
| (INC_DEREF x) | inc *x | Increment a memory location given its address |
| (DEC_DEREF, x) | dec *x | Decrement a memory location given its address |
| (LT, x, y, z) | z := x < y | 1 if x is less than y, 0 otherwise |
| (LE, x, y, z) | z := x <= y | 1 if x is less than or equal to y, 0 otherwise |
| (EQ, x, y, z) | z := x == y | 1 if x is equal to y, 0 otherwise |
| (NE, x, y, z) | z := x != y | 1 if x is not equal to y, 0 otherwise |
| (GE, x, y, z) | z := x >= y | 1 if x is greater than or equal to y, 0 otherwise |
| (GT, x, y, z) | z := x > y | 1 if x is greater than y, 0 otherwise |
| (LABEL, L) | L: | Label |
| (JUMP, L) | goto L | Unconditional jump to a label |
| (JZERO, x, L) | if x == 0 goto L | Jump if zero / Jump if false |
| (JNZERO, x, L) | if x != 0 goto L | Jump if zero / Jump if true |
| (JLT, x, y, L) | if x < y goto L | Jump if less / Jump if not greater or equal |
| (JLE, x, y, L) | if x <= y goto L | Jump if less or equal / Jump if not greater |
| (JEQ, x, y, L) | if x == y goto L | Jump if equal |
| (JNE, x, y, L) | if x != y goto L | Jump if not equal |
| (JGE, x, y, L) | if x >= y goto L | Jump if greater or equal / Jump if not less |
| (JGT, x, y, L) | if x > y goto L | Jump if greater / Jump if not less |
| (STARTCALL, f) | startcall f | Begin the call to function f: a STARTCALL tuple should be followed by zero or more PARAM tuples and then a CALLP or CALLF tuple. This operation is provided for languages that allow nested functions and normally pass static links before parameters. |
| (PARAM, x) | param x | Pass x as an argument |
| (CALLP, f, n) | call f, n | Call procedure (non-value returning function) f, using the last n bytes passed as the arguments |
| (CALLF, f, n, x) | x := call f, n | Call function f, using the last n bytes passed as the arguments, catching return value in x |
| (RETP) | ret | Return from procedure |
| (RETF, x) | ret x | Return x from this function |
| (INT_TO_STRING, x, y) | y := to_string x | String representation of an integer (one can add a new tuple to take a base) |
| (FLOAT_TO_STRING, x, y) | y := to_string x | String representation of a float (one can add a new tuple taking a format string) |
| (BOOL_TO_STRING, x, y) | y := to_string x | String representation of a boolean (localized?) |
| (CHAR_TO_STRING, x, y) | y := to_string x | String representation of a character |
| (ALLOC, x, y) | y := alloc x | Allocate x bytes of memory, returning the address in y (or 0 if the memory could not be allocated). |
| (TO_FLOAT, x, y) | y := to_float x | Integer to real |
| (NULL_CHECK, x) | assert x != 0 | Do something if x is null |
| (ASSERT_POSITIVE, x) | assert x > 0 | Do something if x is not positive |
| (BOUND, x, y, z) | assert y <= x < z | Do something if x is not between y (inclusive) and x (inclusive) |
| (NO_OP) | nop | Do nothing |
| (EXIT) | exit | terminate the program |
Squid is designed to be pretty flexible, so some aspects of this definition are left unspecified. For example, what happens if an assertion fails? That depends on the source language; therefore, a given translator will fill in these details.
Tuple arguments in Squid are limited to
By non-structured we mean items that fit into a single fixed-size machine unit. Strings, arrays, and structs are always referenced through a pointer.
Squid does not specify the kinds or sizes of the value data types. It assumes, of course, integers and floating-point values, but allows for a several different types of various sizes of these. A specific translation would define these types, and a back end would arrange for type conversions. For example:
(ADD, x, i, y)
where x and y are floating point and i is an integer, is an allowable tuple, and would be turned into backend code that explicitly or implicitly convertes i to a floating-point value (of the right size) before doing a floating-point addition.
Similarly, a particular translation may include unsigned
integer types (as one would in C). A backend has to take care of
these cases. For example, with an x86 backend, a JGE tuple would
become a jge instruction with signed arguments, but
a jae with unsigned arguments.
Because the set of data types is not specified in Squid, many tuples take arguments that represent sizes in bytes. It is up to a specific translation to compute these values.
Squid has integer, floating-point, and string literals. Enumeration, character, and boolean literals in a source language should be turned into simple integer literals by a front-end.
Squid has no other literals, such as for arrays or structs.
Squid variables represent run-time storage locations. Squid does not care if these locations are in static memory, in a stack frame, or the heap, but it does keep track of the static nesting level for a variable, which is useful for languages with arbitrarily nested subroutines. In languages without subroutine nesting, such as C, global varaibles would have level 0 and all others would have level 1.
Squid allows two types of variables, integer variables and floating point variables, since most hardware makes the same distinction. As of right now, these are the only two variable types. Any further interpretation is up to the compiler front or back end using Squid.
Temporaries differ from variables in that variables come from source language variables that are actually declared, while temporaries are the result of evaluating expressions. For example, if a source language featured the expression
x + y * z
the corresponding Squid tuple sequence would be:
(MUL, y, z, r0) (ADD, x, r0, r1)
where r0 and r1 are temporaries. Temporaries come in two flavors: value temporaries and address temporaries. An address temporary would arise from evaluating an expression like:
a[i].x
since the address of the corresponding storage location has to be computed at run-time (assuming the value of i is not known at compile-time).
Value temporaries can be either integer or floating-point, while address temporaries are always themselves integer-valued, since their value is an address, but internally they track whether they are referencing an integer or floating-point object.
Squid distinguishes between subroutines defined in the current translation unit (user subroutines) and those defined elsewhere (external subroutines).
A user subroutine denotes a function in the current translation unit. It contains the following properties:
External subroutines represent top-level functions callable by the current translation unit but defined elsewhere. They are known only by name. Example:
print(x+4); |
add i0, 4, r3
param r3
param s0
call __print, 8 |
Labels are used to hold the structured control graph in a linearized list of tuples.
while (x < 3) {
y = 10;
x = x * 5;
}
|
L1:
if i0 >= 3 goto L2
i1 := 10
r3 := i0 * 5
i0 := r3
goto L1
L2:
|
A simple example:
string f(real i) {
return "dog" unless i >= 3.3;
}
string test = f(cos(2.2));
|
p0: ; level=0 parent=null params=[] locals=[i0] f0 := cos 2.2 startcall p1 f1 := cos 2.2 param f1 r1 := call p1, 8 i0 := r1 exit p1: ; level=1 parent=p0 params=[x1] locals=[] r0 := x1 >= 3.3 if r0 != 0 goto L0 ret s0 L0: call __terminate_thread, 0 s0: [100, 111, 103] |
An example with heap-allocated arrays and structures:
int[] a;
a[6] = 20;
real x = a[80];
a = new int[]{4, 2, 1};
struct Point {
int x; int y;
}
Point[] polygon;
print($ polygon[5].y);
|
p0: ; level=0 parent=null params=[] locals=[i0, x1, i1] assert i0 != 0 r0 := *(i0+-4) assert 0 <= 6 < r0 r1 := 6 * 4 r2 := i0 + r1 *r2 := 20 assert i0 != 0 r3 := *(i0+-4) assert 0 <= 80 < r3 r4 := 80 * 4 r5 := i0 + r4 r6 := *r5 x1 := r6 r7 := alloc 16 *r7 := 3 r7 := r7 + 4 *(r7+0) := 4 *(r7+4) := 2 *(r7+8) := 1 i0 := r7 assert i1 != 0 r8 := *(i1+-4) assert 0 <= 5 < r8 r9 := 5 * 4 r10 := i1 + r9 r11 := *r10 assert r11 != 0 r12 := r11 + 4 r14 := *r12 r13 := int_to_str r14 param r13 call __print, 4 exit |
An example with threads:
void f() {
for (int i = 0; i < 20000; i++) {
print ($i + " ");
}
}
thread t1 = start f();
thread t2 = start f();
join(t1);
join(t2);
|
p0: ; level=0 parent=null params=[] locals=[i1, i2] param 0 param p1 r3 := call __create_thread, 8 i1 := r3 param 0 param p1 r4 := call __create_thread, 8 i2 := r4 param i1 call __join, 4 param i2 call __join, 4 exit p1: ; level=1 parent=p0 params=[] locals=[i0] i0 := 0 L0: if i0 >= 20,000 goto L1 r1 := int_to_str i0 param s0 param r1 r2 := call __concat_string, 8 param r2 call __print, 4 inc i0 goto L0 L1: ret s0: [32] |