ELSE IFS

Motivation

Nearly every student of programming languages encounters the so-called dangling else problem. That is, the code

if e1 then if e2 then s1 else s2

can seem to mean one of two possible things:

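[Figure danglingelse.png: the two possible parse trees, one attaching the else to the inner if and one attaching it to the outer if]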

Similar ambiguities occur with statements like

if e1 then while e2 do if e3 then s1 else s2

This problem arises in languages with (ambiguous) grammars like Pascal's:

STMT      -> ASSIGNMENT | IFSTMT | WHILESTMT | BLOCK | ...
IFSTMT    -> 'if' EXP 'then' STMT ('else' STMT)?
WHILESTMT -> 'while' EXP 'do' STMT
BLOCK     -> 'begin' STMT* 'end'

or C's:

STMT      -> ASSIGNMENT | IFSTMT | WHILESTMT | BLOCK | ...
IFSTMT    -> 'if' '(' EXP ')' STMT ('else' STMT)?
WHILESTMT -> 'while' '(' EXP ')' STMT
BLOCK     -> '{' STMT* '}'

There are many ways to deal with this problem, but most lead to rather unsatisfactory ways to handle the true multiway conditional. We want a design that has no dangling else problems and allows (no, requires) a nice multiway conditional statement.

Existing Solutions

Here are seven existing approaches:

1) Resolve with a semantic rule

In Pascal and C, the syntax is ambiguous, but a semantic rule says that dangling elses match the closest if, regardless of indentation! (This can be very confusing to beginners.) So,

(* Without blocks, 'else'
tied to second 'if' *)
if cond1 then
    if cond2 then
        stmt1
else (* Goes w/ cond2 *)
    stmt2

(* MUST use block to tie
'else' to first 'if' *)
if cond1 then begin
    if cond2 then
        stmt1
end else (* Goes w/ cond1 *)
    stmt2
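The nearest-if rule is also what a hand-written recursive-descent parser does if it simply grabs an 'else' whenever it sees one. The following is a minimal sketch of that behavior, not any real compiler's code: the function name parse_stmt, the tuple-shaped tree, and the simplification that expressions and simple statements are single identifiers are all assumptions made for illustration.

```python
# Hypothetical mini-parser for the ambiguous rule
#   STMT -> 'if' EXP 'then' STMT ('else' STMT)? | NAME
# Expressions and simple statements are single identifier tokens here.

def parse_stmt(tokens, pos=0):
    """Return (tree, next_pos) for the statement starting at tokens[pos]."""
    if tokens[pos] == "if":
        cond = tokens[pos + 1]
        assert tokens[pos + 2] == "then", "expected 'then'"
        then_part, pos = parse_stmt(tokens, pos + 3)
        else_part = None
        # Greedy choice: the innermost 'if' still being parsed sees the
        # 'else' first and takes it, so the outer 'if' never gets a chance.
        if pos < len(tokens) and tokens[pos] == "else":
            else_part, pos = parse_stmt(tokens, pos + 1)
        return ("if", cond, then_part, else_part), pos
    return tokens[pos], pos + 1  # a simple statement


tree, _ = parse_stmt("if e1 then if e2 then s1 else s2".split())
print(tree)  # ('if', 'e1', ('if', 'e2', 's1', 's2'), None)
```

The printed tree shows s2 attached to the inner if (the one testing e2), while the outer if ends up with no else part at all.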

Purists will probably hate how this solution relies on a semantic rule to handle something so inherently structural! Anyway, let's see how multiway conditionals look:

if cond1 then
    stmt1
else if cond2 then
    stmt2
else if cond3 then
    stmt3
else if cond4 then
    stmt4
else
    stmt5
 
if cond1 then begin
    stmts1
end else if cond2 then begin
    stmts2
end else if cond3 then begin
    stmts3
end else if cond4 then begin
    stmts4
end else begin
    stmts5
end

Kind of ugly. But it can be a lot worse: programmers often mix up their conditional arms by sometimes using blocks and sometimes not, and get the ugly code they deserve. The worst part of the dangling else syntax is that there seem to be hundreds of different ways to format multiway if statements. Take a look at some old Pascal textbooks if you want to be grossed out.

2) Complicate the grammar to remove the ambiguity

With a lot of effort ... TODO

This grammar is no longer ambiguous, and every 'else' is automatically connected to the nearest 'if'. But it does nothing to prevent the confusion arising from misindented code. Java uses this approach.

3) Require the 'else' part

If there always has to be an else part, there is never any ambiguity, but you'll have funny looking code whether your empty statement is blank or a special null keyword.

if cond1 then
    if cond2 then
        stmt1
    else
        null
else
    stmt2

if cond1 then
    if cond2 then
        stmt1
    else
        stmt2
else
    null

Not pretty. It works, though, even if the while statement (and other compound statements) aren't fully bracketed, for example:

if e1 then while e2 do if e3 then s1 else null else s2

if e1 then while e2 do if e3 then s1 else s2 else null

4) Require bracketing

Ultimately, languages that have compound statements where the bodies are "trailing statements" leave too many formatting choices and look ugly and unbalanced to many people. Defining a language's syntax to require bracketed compound statements is a Good Thing for that reason, and as a bonus it removes the dangling else problem completely. The idea is that a block should not be a kind of statement, and that all compound statements use blocks for bodies, never simple statements! This grammar fragment:

STMT      -> ASSIGNMENT | IFSTMT | WHILESTMT | ...
IFSTMT    -> 'if' '(' EXP ')' BLOCK ('else' BLOCK)?
WHILESTMT -> 'while' '(' EXP ')' BLOCK
BLOCK     -> '{' STMT* '}'

yields code like

if (cond1) {
    if (cond2) {
        stmts1
    } else {
        stmts2
    }
}

if (cond1) {
    if (cond2) {
        stmts1
    }
} else {
    stmts2
}

"Bracketing" can also be done with just a terminating 'end' instead of curly braces or begin-end pairs:

STMT      -> ASSIGNMENT | IFSTMT | WHILESTMT | ...
IFSTMT    -> 'if' EXP 'then' STMT+ ('else' STMT+)? 'end'
WHILESTMT -> 'while' EXP 'do' STMT+ 'end'

yielding rather clean code like this:

if cond1 then
    if cond2 then
        stmts1
    else
        stmts2
    end
end

if cond1 then
    if cond2 then
        stmts1
    end
else
    stmts2
end

But, wait, if we really require bracketing, won't that make multiway conditionals ugly? Either you start indenting too much or you get a bunch of "}"s (or ENDs) at the end. Like this, right?

if cond1 {
    stmts1
} else {
    if cond2 {
        stmts2
    } else {
        if cond3 {
            stmts3
        } else {
            if cond4 {
                stmts4
            } else {
                stmts5
            }
        }
    }
}

if cond1 {
    stmts1
} else { if cond2 {
    stmts2
} else { if cond3 {
    stmts3
} else { if cond4 {
    stmts4
} else {
    stmts5
}}}}

5) Make indentation matter

In Python, there is no dangling else problem since the indentation makes things clear:

if cond1:
    if cond2:
        stmts1
    else:
        stmts2

if cond1:
    if cond2:
        stmts1
else:
    stmts2

But, in general, required indentation might lead to funny looking code in multiway conditionals:

if cond1:
    stmts1
else:
    if cond2:
        stmts2
    else:
        if cond3:
            stmts3
        else:
            if cond4:
                stmts4
            else:
                stmts5

While the code above is legal Python, real Python programmers use the next solution....

6) Introduce a new keyword to simplify required bracketing or indentation

In most languages where bracketing (or indentation) is required, the multiway conditional is described syntactically as a single if-statement, usually with the help of a special keyword (called elsif in Ada, Ruby, and Perl; elif in bash and Python; and elseif in PHP). Curly-brace style:

IFSTMT -> 'if' '(' EXP ')' BLOCK
          ('elsif' '(' EXP ')' BLOCK)*
          ('else' BLOCK)?
BLOCK -> '{' STMT* '}'

and terminating-end style:

IFSTMT -> 'if' EXP 'then' STMT+
          ('elsif'  EXP 'then' STMT+)*
          ('else' STMT+)?
          'end'

Code looks like this:

# Ruby
if cond1
    stmts1
elsif cond2
    stmts2
elsif cond3
    stmts3
elsif cond4
    stmts4
else
    stmts5
end

# Python
if cond1:
    stmts1
elif cond2:
    stmts2
elif cond3:
    stmts3
elif cond4:
    stmts4
else:
    stmts5
 
// PHP
if (cond1) {
    stmts1
} elseif (cond2) {
    stmts2
} elseif (cond3) {
    stmts3
} elseif (cond4) {
    stmts4
} else {
    stmts5
}
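In Python, at least, the special keyword is pure surface syntax. The standard ast module has no node type for elif; per its documentation, an elif clause appears as an extra If node inside the orelse of the previous one. The sketch below compares the deeply indented form with the elif form. The names nested and flat are just illustrative variables, and the condition and statement names are placeholders that are parsed but never executed.

```python
import ast

# The "funny looking" form: each else opens a new indentation level.
nested = """
if cond1:
    stmts1
else:
    if cond2:
        stmts2
    else:
        stmts3
"""

# The idiomatic form using the dedicated keyword.
flat = """
if cond1:
    stmts1
elif cond2:
    stmts2
else:
    stmts3
"""

# ast.dump leaves out line and column attributes by default,
# so this compares tree structure only.
same = ast.dump(ast.parse(nested)) == ast.dump(ast.parse(flat))
print(same)  # True
```

So the keyword changes only how the source is laid out, not what the parser builds.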

7) Lisp's COND

Because Lisp is basically written in abstract syntax trees, you won't ever have a dangling else, and the multiway conditional is already handled by COND:

(COND
    (condition1 block1)
    (condition2 block2)
    (condition3 block3)
    (condition4 block4)
    (T block5))
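The shape of COND, a flat list of (condition, block) pairs tried in order, can be imitated in other languages. Here is a rough Python sketch, assuming a hypothetical helper named cond that takes pairs of zero-argument functions so that nothing is evaluated until its turn comes. It is an illustration of the idea, not how any Lisp implements it.

```python
def cond(*clauses):
    """Run the action of the first clause whose test is true.

    Each clause is a (test, action) pair of zero-argument callables.
    Returns None when no test succeeds.
    """
    for test, action in clauses:
        if test():
            return action()
    return None


def classify(n):
    return cond(
        (lambda: n < 0, lambda: "negative"),
        (lambda: n == 0, lambda: "zero"),
        (lambda: True, lambda: "positive"),  # plays the role of the T clause
    )


print(classify(-5), classify(0), classify(3))  # negative zero positive
```

Because the clauses are a flat sequence, there is nothing for an else to dangle from.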
 

New Ideas

There is a way to require bracketing without resorting to special words like elsif or elif and without ending up with a whole mess of terminators at the end. The solution is amazingly simple:

IFSTMT -> 'if' '(' EXP ')' BLOCK
          ('else' 'if' '(' EXP ')' BLOCK)*
          ('else' BLOCK)?
BLOCK -> '{' STMT* '}'

Why is this not popular? The only thing I can think of is that top-down parsers will need a two-token lookahead when encountering an 'else'. Why should this be a big deal? Just peek ahead to see if the next token is an 'if' (or a '{').
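To show how little machinery the peek requires, here is a minimal recursive-descent sketch for the grammar above. Everything in it is an assumption for illustration: the class name Parser, the whitespace-separated tokens, single-identifier expressions and simple statements, and the tuple-shaped result.

```python
class Parser:
    def __init__(self, text):
        # Assumes every token is separated by whitespace.
        self.toks = text.split()
        self.pos = 0

    def peek(self, k=0):
        """Token k positions ahead, or None past the end."""
        i = self.pos + k
        return self.toks[i] if i < len(self.toks) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def block(self):
        # BLOCK -> '{' STMT* '}'
        self.expect("{")
        stmts = []
        while self.peek() != "}":
            stmts.append(self.stmt())
        self.expect("}")
        return stmts

    def stmt(self):
        if self.peek() == "if":
            return self.if_stmt()
        tok = self.peek()
        if tok is None or tok in {"{", "}", "(", ")", "else"}:
            raise SyntaxError(f"unexpected {tok!r}")
        self.pos += 1
        return tok  # a simple statement is a single identifier here

    def guarded_block(self):
        # 'if' '(' EXP ')' BLOCK, with EXP a single identifier
        self.expect("if")
        self.expect("(")
        cond = self.peek()
        self.pos += 1
        self.expect(")")
        return (cond, self.block())

    def if_stmt(self):
        arms = [self.guarded_block()]
        # Two-token lookahead: 'else' followed by 'if' continues the chain.
        while self.peek() == "else" and self.peek(1) == "if":
            self.pos += 1  # consume 'else'
            arms.append(self.guarded_block())
        otherwise = None
        if self.peek() == "else":
            self.pos += 1
            otherwise = self.block()  # a block must start with '{'
        return ("if", arms, otherwise)


src = "if ( c1 ) { s1 } else if ( c2 ) { s2 } else { s3 }"
tree = Parser(src).if_stmt()
print(tree)  # ('if', [('c1', ['s1']), ('c2', ['s2'])], ['s3'])
```

The whole multiway conditional comes back as one flat list of arms rather than a nest of if-statements, and the only place the parser looks two tokens ahead is the while condition in if_stmt.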

How can we do this for a terminating-end syntax? This doesn't work very well:

IFSTMT -> 'if' EXP 'then' STMT+
          ('else' 'if'  EXP 'then' STMT+)*
          ('else' STMT+)?
          'end'

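because the amount of required lookahead is unbounded: after 'else' 'if', a top-down parser cannot tell whether it is reading another arm of the same statement or a nested if-statement that begins the final 'else' part, and no fixed number of tokens settles it. What about rejecting an 'if' statement at the start of the final 'else' part?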

STMT   -> IFSTMT | NONIFSTMT
IFSTMT -> 'if' EXP 'then' STMT+
          ('else' 'if'  EXP 'then' STMT+)*
          ('else' NONIFSTMT STMT*)?
          'end'

As long as NONIFSTMT cannot be empty and cannot start with 'if', we're parsable top-down with a lookahead of 2.
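A small sketch of a parser for this terminating-end grammar follows. As before, the function names (tok, parse_stmts, parse_arm, parse_if), the token-list input, and the single-identifier expressions and simple statements are assumptions for illustration only.

```python
def tok(toks, i):
    """Token at index i, or None past the end."""
    return toks[i] if i < len(toks) else None


def parse_stmts(toks, i):
    """STMT+ : statements up to (not including) the next 'else' or 'end'."""
    out = []
    while tok(toks, i) not in ("else", "end", None):
        if toks[i] == "if":
            node, i = parse_if(toks, i)
        else:
            node, i = toks[i], i + 1  # NONIFSTMT: a single identifier here
        out.append(node)
    if not out:
        raise SyntaxError("expected at least one statement")
    return out, i


def parse_arm(toks, i):
    """'if' EXP 'then' STMT+ with EXP a single identifier; i points at 'if'."""
    if tok(toks, i) != "if" or tok(toks, i + 2) != "then":
        raise SyntaxError("malformed 'if' arm")
    body, j = parse_stmts(toks, i + 3)
    return (toks[i + 1], body), j


def parse_if(toks, i=0):
    """Parse one if-statement; return (tree, index just past its 'end')."""
    arm, i = parse_arm(toks, i)
    arms = [arm]
    # Lookahead of 2: 'else' 'if' always means another arm of this statement.
    while tok(toks, i) == "else" and tok(toks, i + 1) == "if":
        arm, i = parse_arm(toks, i + 1)
        arms.append(arm)
    otherwise = None
    if tok(toks, i) == "else":
        # The next token is not 'if' here, so this part starts with a NONIFSTMT.
        otherwise, i = parse_stmts(toks, i + 1)
    if tok(toks, i) != "end":
        raise SyntaxError("expected 'end'")
    return ("if", arms, otherwise), i + 1


src = "if c1 then s1 else if c2 then s2 else s3 if c4 then s4 end end"
tree, end = parse_if(src.split())
print(tree)
```

In the example, the first 'else' 'if' is read as a second arm of the outer statement, while the if that follows s3 is an ordinary nested statement with its own 'end'. An if-statement is still allowed in the final part, just not as its first statement.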

If the objection to 'elsif' and related words is just that they are made up, we could give the if-statement a makeover, using more reasonable words, or symbols, even. Let's see:

when cond1 do
    stmts1
or when cond2 do
    stmts2
or when cond3 do
    stmts3
or when cond4 do
    stmts4
or else do
    stmts5
end

try cond1 do
    stmts1
or cond2 do
    stmts2
or cond3 do
    stmts3
or cond4 do
    stmts4
otherwise
    stmts5
end

try [ cond1 ] =>
    stmts1
[ cond2 ] =>
    stmts2
[ cond3 ] =>
    stmts3
[ cond4 ] =>
    stmts4
[] =>
    stmts5
end
