CMSI 488/588
Homework #3
Partial Answers
  1. Most C compilers represent ints in 32-bit two's-complement notation and represent doubles with the IEEE 754 64-bit representation. The latter has 52 stored bits of mantissa (53 bits of precision counting the implicit leading 1), so it represents every integer of magnitude up to 2^53 exactly. That is a vastly larger range of integers than the int type offers (more than two million times as many). Programmers who need to count beyond the billions might want to use doubles.
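
    As a quick check of the 53-bit claim, here is a minimal Python sketch. It assumes that Python's float is an IEEE 754 double, which holds on mainstream platforms but is not something the homework states. The names INT32_COUNT, EXACT_LIMIT, and is_exact are illustrative only.

```python
# Assumption: Python floats are IEEE 754 doubles on this platform.
INT32_COUNT = 2 ** 32   # number of distinct 32-bit int values
EXACT_LIMIT = 2 ** 53   # 52 stored mantissa bits plus 1 implicit bit


def is_exact(n):
    """True if the integer n survives a round trip through a double."""
    return int(float(n)) == n


# Every integer up to 2**53 is exact; 2**53 + 1 is the first that is not.
limit_ok = is_exact(EXACT_LIMIT)
next_fails = not is_exact(EXACT_LIMIT + 1)

# Size of the non-negative exact range relative to the number of int values.
ratio = EXACT_LIMIT // INT32_COUNT   # 2**21
```

    The ratio 2^21 is where "more than two million" comes from.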
  2. On a processor with a sign flag N, an overflow flag O, and a zero flag Z, we can evaluate the comparison x op y by computing x + (-y) and examining the flags as follows:

    Operation   Signed                      Unsigned
    <           N xor O                     N
    ≤           (N xor O) or Z              N or Z
    =           Z                           Z
    ≠           not Z                       not Z
    >           not Z and not(N xor O)      not N and not Z
    ≥           not(N xor O)                not N

    This answer is done under the most reasonable assumption that the processor has a single ADD instruction that sets flags according to a signed addition interpretation. If you had a processor that distinguished between signed and unsigned additions, then it would never set the N flag and would use O to signal that an unsigned addition caused an overflow. Replacing N with O throughout the unsigned case, therefore, is an acceptable answer, but not the most realistic.
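
    The signed column can be checked exhaustively on a small word size. The sketch below is only a model: it assumes an 8-bit machine and assumes that O is set exactly when the true difference x - y does not fit in the register. It checks the signed column only and makes no attempt to model the unsigned one. All names are illustrative.

```python
import operator

BITS = 8   # small word size so every pair of operands can be tried
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def flags_for_compare(x, y):
    """Model the N, O, Z flags after a signed BITS-bit computation of x - y."""
    exact = x - y                                  # mathematically exact result
    wrapped = (exact - LO) % (1 << BITS) + LO      # what the register holds
    return wrapped < 0, wrapped != exact, wrapped == 0


# The signed column of the table; "n != o" is N xor O for booleans.
SIGNED_RULES = {
    "<": lambda n, o, z: n != o,
    "<=": lambda n, o, z: (n != o) or z,
    "==": lambda n, o, z: z,
    "!=": lambda n, o, z: not z,
    ">": lambda n, o, z: (not z) and not (n != o),
    ">=": lambda n, o, z: not (n != o),
}
EXPECTED = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, ">=": operator.ge,
}


def mismatches():
    """Return every (x, op, y) where the flag rule disagrees with Python."""
    bad = []
    for x in range(LO, HI + 1):
        for y in range(LO, HI + 1):
            n, o, z = flags_for_compare(x, y)
            for name, rule in SIGNED_RULES.items():
                if rule(n, o, z) != EXPECTED[name](x, y):
                    bad.append((x, name, y))
    return bad
```

    Under these assumptions mismatches() returns an empty list, so every signed rule agrees with the ordinary integer comparison.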

  3. Given
        1. r1 := K
        2. r4 := &A
        3. r6 := &B
        4. r2 := r1 × 4
        5. r3 := r4 + r2
        6. r3 := *r3
        7. r5 := *(r3 + 12)
        8. r3 := r6 + r2
        9. r3 := *r3
       10. r7 := *(r3 + 12)
       11. r3 := r5 + r7
       12. S := r3
    

    This code adds the fields at offset 12 of the structures that A[K] and B[K] point to, and stores the sum in S.

    Flow dependencies are

    Anti-dependencies are

    Output dependencies are
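
    All three kinds of dependency can be derived mechanically from each instruction's read and write sets. The sketch below is one way to do that, under stated conventions that may differ from the ones intended in class: only registers are tracked (memory operands are ignored), and each dependency is reported only between a write or read and the nearest later instruction that conflicts with it. Pairs are (earlier instruction, later instruction), using the instruction numbers of the original listing.

```python
# (number, register written or None, registers read); memory is ignored.
CODE = [
    (1, "r1", []),
    (2, "r4", []),
    (3, "r6", []),
    (4, "r2", ["r1"]),
    (5, "r3", ["r4", "r2"]),
    (6, "r3", ["r3"]),
    (7, "r5", ["r3"]),
    (8, "r3", ["r6", "r2"]),
    (9, "r3", ["r3"]),
    (10, "r7", ["r3"]),
    (11, "r3", ["r5", "r7"]),
    (12, None, ["r3"]),   # S := r3 writes memory, not a register
]


def dependencies(code):
    """Return (flow, anti, output) lists of (earlier, later) pairs."""
    flow, anti, output = [], [], []
    last_writer = {}   # register -> instruction that last wrote it
    readers = {}       # register -> instructions that read the current value
    for num, dest, reads in code:
        for reg in reads:
            if reg in last_writer:
                flow.append((last_writer[reg], num))      # read after write
            readers.setdefault(reg, []).append(num)
        if dest is not None:
            for reader in readers.get(dest, []):
                if reader != num:
                    anti.append((reader, num))            # write after read
            if dest in last_writer:
                output.append((last_writer[dest], num))   # write after write
            last_writer[dest] = num
            readers[dest] = []
    return flow, anti, output


flow, anti, output = dependencies(CODE)
```

    With these conventions the anti-dependencies come out as 7 to 8 and 10 to 11, both on r3, and the output dependencies form the chain 5, 6, 8, 9, 11 on r3.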

    Load delays occur after instructions 6, 9, and 10. Only the first can be filled, like this:

        1. r1 := K
        2. r4 := &A
        3. r2 := r1 × 4
        4. r3 := r4 + r2
        5. r3 := *r3
        6. r6 := &B
        7. r5 := *(r3 + 12)
        8. r3 := r6 + r2
        9. r3 := *r3
       10. r7 := *(r3 + 12)
       11. r3 := r5 + r7
       12. S := r3
    

    To fill the one after instruction 9, we realize that we should rename r3 to r8 for all the instructions from line 8 down. This means that instruction 7 can move down to fill the slot.

        1. r1 := K
        2. r4 := &A
        3. r2 := r1 × 4
        4. r3 := r4 + r2
        5. r3 := *r3
        6. r6 := &B
        7. r8 := r6 + r2
        8. r8 := *r8
        9. r5 := *(r3 + 12)
       10. r7 := *(r8 + 12)
       11. r8 := r5 + r7
       12. S := r8
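
    The effect of each rescheduling can be checked by counting unfilled load delay slots. The sketch below assumes a one-instruction load delay, treats r1 := K as a load, and counts a stall whenever the instruction right after a load reads the loaded register. The tuple layout and the names ORIGINAL, FIRST_FILL, and RENAMED are illustrative only.

```python
def stalls(code):
    """Count loads whose result is read by the very next instruction."""
    count = 0
    for (dest, _, is_load), (_, next_reads, _) in zip(code, code[1:]):
        if is_load and dest in next_reads:
            count += 1
    return count


# Each entry: (destination, registers read, is this a load?)
ORIGINAL = [
    ("r1", [], True), ("r4", [], False), ("r6", [], False),
    ("r2", ["r1"], False), ("r3", ["r4", "r2"], False),
    ("r3", ["r3"], True), ("r5", ["r3"], True),
    ("r3", ["r6", "r2"], False), ("r3", ["r3"], True),
    ("r7", ["r3"], True), ("r3", ["r5", "r7"], False),
    ("S", ["r3"], False),
]

# After moving r6 := &B into the first delay slot.
FIRST_FILL = [
    ("r1", [], True), ("r4", [], False), ("r2", ["r1"], False),
    ("r3", ["r4", "r2"], False), ("r3", ["r3"], True),
    ("r6", [], False), ("r5", ["r3"], True),
    ("r3", ["r6", "r2"], False), ("r3", ["r3"], True),
    ("r7", ["r3"], True), ("r3", ["r5", "r7"], False),
    ("S", ["r3"], False),
]

# After renaming r3 to r8 and moving the load of r5 down.
RENAMED = [
    ("r1", [], True), ("r4", [], False), ("r2", ["r1"], False),
    ("r3", ["r4", "r2"], False), ("r3", ["r3"], True),
    ("r6", [], False), ("r8", ["r6", "r2"], False),
    ("r8", ["r8"], True), ("r5", ["r3"], True),
    ("r7", ["r8"], True), ("r8", ["r5", "r7"], False),
    ("S", ["r8"], False),
]
```

    Under these assumptions the three listings have three, two, and one unfilled slots respectively; the remaining one follows the load of r7, whose value is needed immediately by the addition.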
    
  4. Given a misprediction penalty of 3 cycles, with correctly predicted branches incurring no penalty, and 20% of instructions being conditional branches:

    1. 75% of branches are predicted correctly. 25% are mispredicted. 25% of the 20% add three cycles. That is, 5% of the instructions have 3 extra cycles.
    2. 90% of branches are predicted correctly. 10% are mispredicted. 10% of the 20% add three cycles. That is, 2% of the instructions have 3 extra cycles.
    3. Suppose cycles per instruction (CPI) is 1.5 with perfect prediction. With 75% prediction, 5% of the instructions take an average of 4.5 cycles. (95*1.5 + 5*4.5)/(100*1.5) = 1.1, meaning the code takes 10% longer.
    4. Suppose cycles per instruction (CPI) is 0.6 with perfect prediction. With 75% prediction, 5% of the instructions take an average of 3.6 cycles. (95*0.6 + 5*3.6)/(100*0.6) = 1.25, meaning the code takes 25% longer.
    5. Mispredictions have a much larger impact on superscalar processors.
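
    The arithmetic above generalizes to a one-line formula. The following sketch uses the figures from the problem (20% branches, 3-cycle penalty) as defaults; the function name and parameter names are illustrative.

```python
def slowdown(base_cpi, accuracy, branch_fraction=0.20, penalty=3):
    """Ratio of run time with mispredictions to run time with perfect prediction."""
    mispredicted = branch_fraction * (1 - accuracy)   # fraction of all instructions
    return (base_cpi + mispredicted * penalty) / base_cpi


scalar_75 = slowdown(1.5, 0.75)        # about 1.10
superscalar_75 = slowdown(0.6, 0.75)   # about 1.25
```

    The same extra cycles are divided by a smaller base CPI on the superscalar machine, which is why the relative cost is larger there.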
  5. C as an intermediate language? An excellent summary of advantages and disadvantages appears in http://c2.com/cgi/wiki?CeeAsAnIntermediateLanguage. To summarize the disadvantages:
    1. You're TOO FAR from the processor! How do you get to the carry and overflow flags, or any flags for that matter? Being able to dig down low might be very beneficial for certain constructs in your cool high-level language but there might be no way to express them at all in C, e.g. a vector-operator which might map nicely to vector hardware.
    2. Flat namespace; you have to implement your own name-mangling if your source language has overloading.
    3. Dumb linkers that can't handle long names (not really a big problem these days)
    4. Before C99, sizes and alignments of standard machine types were NOT specified by the language. If your language requires an N-bit integer type, how would you specify that in C?
    5. Concurrency, synchronization, exceptions, coroutines, iterators, (efficient) garbage collection, etc. missing from the language.

    Most of these disadvantages are not terrible. The first might be the most severe but is overcome by using a version of C that allows embedded assembly language. Points 2, 3, and 5 are problems with raw assembly language as well, so "so what?" For point 4, just use C99.

  6. Answers vary. Interesting observations include (1) a jitter might avoid exhaustive-search type exponential-time optimizations since that would cause nasty pauses in an application at runtime, and (2) a jitter can be set up to recompile code that isn't performing up to expectations. Good stuff here.

  7. I love this problem. Here's the insight: graph objects are (references to) nodes. Nodes contain sequences of instructions. The test instruction has the form

        test reg trueGraph falseGraph
    

    where trueGraph and falseGraph are (obviously) node references; if you draw pictures, draw them as pointers. The goto instruction has the form

        goto Graph
    

    Again, draw it as a pointer. Also, write [i] to denote the graph node with a single instruction i, and [] for the node with a zero-length sequence of instructions. Finally, introduce the notation

        g1 ^ g2
    

    to mean the graph whose instruction sequence is all of g1's instructions followed by g2's.

    This problem isn't so bad if you've been trained to think functionally, but it is still very difficult to get right. Knowing where and when to use the goto instructions makes all the difference. (Note how the goto and the test instructions "finish off" a node.) Here's the grammar. (Oh, notice I used "nfr" for next_free_reg, which is strange because I usually hate abbreviation... Go figure! And I simplified the 'stp' thing too.)

        global reg_names = ["r1", "r2", ..., "rk"]
    
        program -> id stmt
            stmt.nfr := 0
            program.name := id.name
            program.graph := stmt.graph
    
        assign: stmt1 -> id expr stmt2
            expr.nfr := stmt2.nfr := stmt1.nfr
            stmt1.graph := expr.graph ^ [id.name := expr.reg] ^ stmt2.graph
    
        if: stmt1 -> expr stmt2 stmt3 stmt4
            expr.nfr := stmt2.nfr := stmt3.nfr := stmt4.nfr := stmt1.nfr
            stmt1.graph :=
                expr.graph ^
                    [test expr.reg
                        stmt2.graph^[goto stmt4.graph]
                        stmt3.graph^[goto stmt4.graph]]
    
        while: stmt1 -> expr stmt2 stmt3
            expr.nfr := stmt2.nfr := stmt3.nfr := stmt1.nfr
            stmt1.graph := [goto g]
                where g =
                    expr.graph ^
                        [test expr.reg stmt2.graph^[goto g] stmt3.graph]
    
        null: stmt ->
            stmt.graph := []
    
    The rest of this problem is straightforward; change "code" to
    "graph", "+" to "^", "next_free_reg" to "nfr" and so on.
    
  8. The trick here is to do exactly what it says to do in the text: give each boolean expression two inherited attributes (called, for example, trueloc and falseloc) saying where to jump to if the expression evaluates to true and where to jump if false. Of course there are variations: sometimes you want to fall through, and sometimes you need to save the value of the expression anyway.

    To get started, let's look at the simple case where we don't worry about spilling and don't worry about saving the result.

        if : stmt1 -> expr stmt2 stmt3 stmt4
            expr.nfr := stmt2.nfr := stmt3.nfr := stmt4.nfr := stmt1.nfr
            THEN_LABEL := new_label(); ELSE_LABEL := new_label(); END_LABEL := new_label()
            expr.trueloc := THEN_LABEL
            expr.falseloc := ELSE_LABEL
            expr.do_not_save := true
            stmt1.code := expr.code
                + [THEN_LABEL ":"] + stmt2.code + ["goto" END_LABEL]
                + [ELSE_LABEL ":"] + stmt3.code + [END_LABEL ":"]
                + stmt4.code
    
    
        > : expr1 -> expr2 expr3
            expr1.code := expr2.code + expr3.code
                + ["if" expr2.reg "<=" expr3.reg "goto" expr1.falseloc]
                + ["goto" expr1.trueloc]
    
        -- The other relational operators are analogous
    
        or : expr1 -> expr2 expr3
            expr1.code := expr2.code
                + ["if" expr2.reg "goto" expr1.trueloc]
                + expr3.code
                + ["if !" expr3.reg "goto" expr1.falseloc]
                + ["goto" expr1.trueloc]
    
        and : expr1 -> expr2 expr3
            expr1.code := expr2.code
                + ["if !" expr2.reg "goto" expr1.falseloc]
                + expr3.code
                + ["if !" expr3.reg "goto" expr1.falseloc]
                + ["goto" expr1.trueloc]
    

    Integrating saving and spilling should be done as part of this assignment, too. Also, knowing whether to test for false or true could be sent into an inherited attribute. There are many ways to do all this.
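
    One of the many variations can be sketched in Python as follows. Unlike the rules above, this version compiles the operands of and/or as jumping code themselves, passing trueloc and falseloc down, so no boolean value is ever stored in a register. It assumes that operands of relational operators are already in registers, and it ignores saving and spilling. The tuple encoding of expressions and the names label_source and jump_code are illustrative only.

```python
import itertools

RELOPS = {"<", "<=", ">", ">=", "==", "!="}


def label_source(prefix="L"):
    """Return a function that yields fresh labels L1, L2, ..."""
    counter = itertools.count(1)
    return lambda: f"{prefix}{next(counter)}"


def jump_code(expr, trueloc, falseloc, new_label):
    """Emit code that jumps to trueloc if expr is true, else to falseloc."""
    op = expr[0]
    if op == "reg":                      # ("reg", "r3"): a boolean in a register
        return [f"if {expr[1]} goto {trueloc}", f"goto {falseloc}"]
    if op in RELOPS:                     # ("<", "r1", "r2")
        _, left, right = expr
        return [f"if {left} {op} {right} goto {trueloc}", f"goto {falseloc}"]
    if op == "not":                      # swap the two targets
        return jump_code(expr[1], falseloc, trueloc, new_label)
    if op in ("or", "and"):
        middle = new_label()             # where the second operand starts
        if op == "or":                   # true short-circuits, false falls on
            first = jump_code(expr[1], trueloc, middle, new_label)
        else:                            # false short-circuits, true falls on
            first = jump_code(expr[1], middle, falseloc, new_label)
        second = jump_code(expr[2], trueloc, falseloc, new_label)
        return first + [f"{middle}:"] + second
    raise ValueError(f"unknown operator {op!r}")


# Usage: (r1 < r2) or (r3 and (r4 > r5))
example = jump_code(
    ("or", ("<", "r1", "r2"), ("and", ("reg", "r3"), (">", "r4", "r5"))),
    "THEN", "ELSE", label_source())
```

    This naive version emits a goto to the label that immediately follows it; avoiding those is exactly the "sometimes you want to fall through" variation mentioned earlier.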

  9. A Prime Number Application TODO
  10. Hana analyze methods

    For StartExpression

        public void analyze(Log log, SymbolTable table) {
    
            // Analyze all the arguments
            for (Expression e: args) {
                e.analyze(log, table);
            }
    
            // Find out which function we're calling.  If it turns out
            // to be null, an error will be logged anyway, but there's
            // nothing that has to be done here.
            function = table.lookupFunction(functionName, args, log);
    
            type = Type.THREAD;
        }
    

    For DieStatement

        public void analyze(Log log, SymbolTable table, Function f, boolean inLoop) {
            argument.analyze(log, table);
            if (argument.type != Type.STRING) {
                log.error("die_arg_must_be_string");
            }
        }
    

    For UnlessStatement

        public void analyze(Log log, SymbolTable table, Function f, boolean inLoop) {
            condition.analyze(log, table);
            condition.assertBoolean("unless_condition_not_boolean", log);
            body.analyze(log, table, f, inLoop);
        }
    

    For SliceVariable

        public void analyze(Log log, SymbolTable table) {
            array.analyze(log, table);
            low.analyze(log, table);
            high.analyze(log, table);
    
            array.assertArrayOrString("[...]", log);
            low.assertInteger("[...]", log);
            high.assertInteger("[...]", log);
            type = array.type;
        }