CMSI 386
Homework #6

Readings: [Scott], Chapter 1 and Chapter 2, up through the section on specifying syntax. Skim the section on scanning, enough so that you can get a good feel for what a finite state machine is. Consider the "Check Your Understanding" questions as part of your reading assignment: work them out but don't turn in your answers.

  1. Give examples (in any language you like) of each of the following:
    1. A lexical error
    2. A syntax error
    3. A static semantic error
    4. A dynamic semantic error detected by code generated by the compiler
    5. An error that the compiler cannot catch nor easily generate code to catch (a real error, not a program bug)
  2. Consider these two C functions.
    unsigned gcd1(unsigned i, unsigned j) {
        while (i != j) {
            if (i > j) i = i - j; else j = j - i;
        }
        return i;
    }
    
    unsigned gcd2(unsigned i, unsigned j) {
        while (i != j) {
            if (i > j) i = i % j; else j = j % i;
        }
        return i;
    }
    
    Is the second one correct? If not, show how to fix it. Under what circumstances would you expect one or the other to be faster?
  3. Write, from scratch, Ruby and C versions of map and reduce, together with a few interesting unit tests.
  4. For the following four sets of strings
    1. Strings of "8-bit" characters beginning and ending with a double quote character, that do not contain control characters, and for which the backslash is used to escape the next character. (These are similar to, but not exactly the same as, string literals in C.)
    2. The paren-star form of comments in Pascal: strings beginning with (* and ending with *) that do not contain *). Note comments "don't nest".
    3. Floating-point constants in Ada (you'll have to look this one up yourself)
    4. All non-empty sequences of letters other than "read", "red", and "real"
    do the following Unit tests are required.
  5. Based on your work in the previous problem on strings that are not "red", "real", or "read", comment on the practicality of using regular expressions for defining the set of allowed identifiers in a real programming language. (Identifiers are generally alphanumeric strings other than the keywords.)
  6. Write an unambiguous grammar for the language of all well-formed classic regular expressions (those involving only symbols, alternation, Kleene star and catenation). Assume the existence of a token called SYMBOL. Make all operators left-associative. Give Kleene closure the highest precedence and alternation the lowest precedence.
  7. The language Pascal has an ambiguous grammar; a fragment of which follows:
        STMT -> ASSIGNMENTSTMT | CALL | IFSTMT | WHILESTMT | 'begin' STMT* 'end'
        IFSTMT -> 'if' EXP 'then' STMT ('else' STMT)?
    
    1. Show two parse trees for "if e1 then if e2 then s1 else s2"
    2. Rewrite this grammar fragment to remove the ambiguity by disallowing if statements immediately after a "then" (as in Algol 60). Make sure you allow "if e1 then begin if e2 then s1 end else s2" as well as "if e1 then begin if e2 then s1 else s2 end"
  8. The syntax for type casts in C and its descendants introduces potential ambiguity: is (x)-y a subtraction, or the unary negation of y cast to type x? How do C, C++, and Java deal with this potential ambiguity?
  9. Design the syntax (in EBNF) of a small language consisting (only) of

Your sources should be organized as follows:

homework
  cmsi386
    src
      main
        docs
          hw4.tex
          <diagram source files>, if any
          <images>, if any
        c
          mapreduce.c (Problem 3)
        java
          edu
            lmu
              cs
                PatternUtils.java (Problem 4)
        ruby
          pattern.rb (Problem 4)
          mapreduce.rb (Problem 3)
      test
        c
          mapreducetest.c (Problem 3)
        java
          edu
            lmu
              cs
                PatternUtilsTest.java (Problem 4, JUnit test)
        ruby
          mapreducetest.rb (Problem 3)

Your LaTeX file will contain solutions to each problem, numbered, and in order. You may embed source code in the document, or give the answer as "See <filename>" and submit the sources for these answers at the end of the document. Generate a pdf from the LaTeX file and hand in a printed copy.

Notes