Introduction to Compilers

It would appear a lot of people drop the term compiler without a real sense of what one is. Let's look at compilers and their relationship to interpreters. And let's see why anyone would care about compilers at all.

Programs, Interpreters and Translators

A program is something that computes:

program.png

An interpreter is a program whose input is a program P and some other data x; it "executes" P on x:

interpreter.png

That is, I(P,x) = P(x). The fact that a single interpreter exists that is capable of executing all possible programs (Turing-computable functions) is a very deep and significant discovery of Turing's.

Note that a CPU is, essentially, an interpreter for its machine language programs.

A translator is a program which takes as input a program in some language S and outputs a program in a language T such that the two programs have the same "semantics."

translator.png

where forall x, P(x) = Q(x).

Some translators have really strange names

Assembler
Translates assembly language programs into machine language programs
Compiler
Translates programs in a "high-level" language into programs in a lower-level language

We need translators because not every language has an interpreter. Sometimes we translate (compile) source code all the way down to something interpretable, such as machine language, for example:

staticcompilation.png

However, there are also languages that allow you to execute code during compilation time and compile new code at run time.

Exercise: Give examples of compilation during runtime and execution during compile time.

Translator Notation

There're three languages involved in translation:

transnotation.gif

Exercise: Explain how to use a self-compiler and a cross-compiler to port a compiler to an architecture that has no compiler.
Exercise: A compiler is called an optimizing compiler if it works really hard to produce the most efficient target code that is feasibly possible. What can you use an optimizing self-compiler for?

Translation Phases

Translation consists of analysis and synthesis phases.

analsyn.png

There are two reasons for this:

Exercise: Give the most outrageous example you can think of where word-for-word translation between two natural languages comes out "so wrong".
Exercise: Try out the site Lost in Translation. Try at least ten sentences. Use a lot of English language idioms, like "There's no such thing as a free lunch" and "He had an ace up his sleeve". Bring your three favorite translation "disasters" to the next class to share.

Why Study Compilers

Even if you never write a complete production compiler for a conventional programming language in order to generate assembly or machine language, compiler technology is good to know because its concepts are used everywhere, for example:

Also, if you need to write preprocessors, linkers, loaders, debuggers, or profiles, you go through many of the same tasks you do in writing a compiler.

Also, you learn how to write better programs since writing a compiler for a language means you better understand its intricacies and obscurities. Studying general principles of translation also makes you a better language designer. Does it really matter how "cool" your language is if it can't be implemented efficiently?

Compiler technology spans many different areas of computer science:

Also keep this in mind:

A compiler course is really just a Programming Languages II course. In Programming Languages I you learn about specifying and using languages. In Programming Languages II you learn about designing and implementing languages. Implementation is interpretation and translation.

Designing a Compiler

Some issues in designing a real compiler: