=============================
CMSI 488 Course Review and Preparation for the Final
=============================

CMSI 386 was about programming language design and specification. This class, CMSI 488, is about programming language implementation and translation.

Course Objectives

* Improve understanding of programming languages
* Understand implications of language design on performance and efficiency
* Learn more about what is under the hood
* Improve software development skills via lots of practice writing and deploying a complex app (using Eclipse, ant, cvs)
* Learn about cool tools like JavaCC

Why study compilers?

* Writing a compiler helps you understand the compilers you use every day
* Everyone writes "little" translators all the time, like XML parsers or code that reads little configuration files that are part of bigger apps
* Knowing what compilers do and how they work can help you write more efficient code, since you know what a compiler can and cannot optimize
* A compiler makes an awesome capstone project

The topics we covered, as an outline

    What is a compiler
        Translators vs. interpreters
        T-diagrams
        Overall structure of translation
            Analysis
                Lexical (characters to tokens)
                Syntax (tokens to ASTs)
                Semantic (decoration of ASTs)
            Generation
                Production of abstract assembly language
                Machine independent optimization
                Generation of real assembly language
                    Address assignment
                    Instruction selection
                    Register allocation
                Low-level optimization
    Language Description
        Aspects
            Syntax (structure)
            Semantics (meaning)
            Pragmatics (usage)
        Theoretical Tools
            Language Theory
            Grammars
                String Grammars
                Tree Grammars
            Ambiguity and inherent ambiguity
        Examples
            Iki
            Carlos
            Hana
        Syntax
            Context Freedom vs. Context Sensitivity
            The need for micro vs. macro syntax
            Specification
                Regexes
                CFG
                BNF
                EBNF (and its various flavors)
                Syntax Diagrams
            Concrete vs. Abstract Syntax
                Differences between parse trees and ASTs
                Multiple concrete syntaxes for a given abstract syntax
        Semantics
            Static vs. Dynamic semantics
            What is syntax and what is static semantics?
            Specifying semantics
    Translation
        Issues in lexical analysis
        Issues in parsing
            Two systems
                LL - top/down - expand/match (e.g. JavaCC)
                LR - bottom/up - shift/reduce (e.g. Lex/Yacc)
            Recovery from parsing errors (not covered in this class)
        Attributes for (both string and tree) grammars (IMPORTANT)
            Specifying attribute evaluation rules
                Semantic function approach (tag grammar rules)
                Action routine approach (embed evaluation code)
        JavaCC
    Intermediate Representations
        Why have them?
            Analysis/Synthesis is inherent to translation
            Break down complex problem
            Retargetability
            For machine independent optimizations
        High-level vs. Medium-level vs. Low-level
        Styles
            Semantic graph
            "Abstract assembly language" (instructions called tuples)
                Control flow graph (Basic blocks with tuple sequences)
                One big tuple sequence with jump tuples
            Stack code
        List of well-known IRs
        Case study: Squid
    Architecture
        How machines work (review)
        IA-32 case study
        Review of NASM (esp. calling conventions, floating-point)
    Code Generation
        Goals
        Approaches
            Naive
            Interpretive
            CGG (not covered in this class)
        Understanding the runtime system for block-structured languages
            Stack frames
            Dynamic links
            Static links
            Register save area
            Register spilling
    Code Optimization
        Machine independent vs. machine dependent
        Constant folding
        Strength reductions
        Algebraic simplifications
        Operand reordering
        Unreachable code elimination
        Dead code elimination
        Copy propagation
        CSE
        Loop unrolling
        Special purpose instructions, e.g. muladd, range, conditional jump
        Loop invariant factoring
        Tail recursion elimination
        Induction variable simplification
        Static frame allocation
        Stack frame simplification
        Low-level optimizations
            Special instructions
            Alignment
            Cache
            Removing conditional jumps
            Scheduling to remove load delays and similar things

To prepare for the final, make sure you know how to:

* Write something in JavaCC.
* Make ASTs.
* Write NASM functions that do floating point computations.
* Determine whether certain things are lexical errors, syntax errors, static semantic errors, dynamic semantic errors, or not errors.
* Implement short-circuit operators efficiently.
* Identify "properties" of given grammars.
* Write attribute grammars.
* Apply strength reductions.
* Write regular expressions.
* Write JavaCC grammars.
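
As a study aid for the constant folding, algebraic simplification, and strength reduction items above, here is a minimal sketch of those three rewrites on a toy expression tree. It is not taken from the course compilers (Iki, Carlos, Hana). The tuple-based AST, the `optimize` function, and the `is_power_of_two` helper are all hypothetical names made up for this example, and every variable is assumed to hold an integer so that multiplying by a power of two can be replaced by a left shift.

```python
# Hypothetical mini-AST (an assumption for this sketch, not a course IR):
#   an expression is an int literal, a variable name (str),
#   or a tuple (op, left, right) with op one of "+", "-", "*".
# The optimizer may also emit ("<<", expr, k) nodes for shifts.


def is_power_of_two(n):
    """True for positive ints with exactly one bit set: 1, 2, 4, 8, ..."""
    return isinstance(n, int) and n > 0 and n & (n - 1) == 0


def optimize(expr):
    """Return an equivalent expression, assuming integer-valued variables."""
    if not isinstance(expr, tuple):
        return expr  # literals and variable names need no work

    op, left, right = expr
    left, right = optimize(left), optimize(right)  # optimize children first

    # Constant folding: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "-":
            return left - right
        if op == "*":
            return left * right

    if op == "*":
        # Multiplication is commutative, so move a constant to the right.
        if isinstance(left, int):
            left, right = right, left
        # Algebraic simplification: x * 1 is just x.
        if right == 1:
            return left
        # Strength reduction: x * 2**k becomes the cheaper x << k.
        if is_power_of_two(right):
            return ("<<", left, right.bit_length() - 1)

    # Algebraic simplification: adding zero changes nothing.
    if op == "+" and right == 0:
        return left
    if op == "+" and left == 0:
        return right

    return (op, left, right)


print(optimize(("*", "x", ("+", 3, 5))))  # prints ('<<', 'x', 3)
```

The inner `3 + 5` is folded to `8` first, which is what exposes the multiplication to strength reduction. Rewrites such as `x * 0` becoming `0` are left out on purpose, because they are only safe when evaluating `x` has no side effects.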