Let us take our simple language

    L = { a, a+a, a+a+a, a+a+a+a, ... }

and add balanced parentheses to it, so L becomes

    L = { a, (a), a+a, (a)+a, a+(a), (a+a), a+a+a, (a+a)+a, a+(a+a), ((a+a)+a)+a, ... }

-----------------------------------------------------------------------------


4.) Here is an ambiguous BNF grammar for this new language.


BNF:    expr -> expr '+' expr
              | '(' expr ')'
              | 'a'
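
To see how ambiguous this grammar is, we can count the parse trees it gives
to an unparenthesized string a+a+...+a. The following is only a sketch; the
class name AmbiguityCount and the method countTrees() are made up for this
illustration. It uses the fact that the production expr -> expr '+' expr lets
any one of the + signs be the root of the parse tree.

```java
// Hypothetical helper, not part of the parser code in these notes.
final class AmbiguityCount
{
   // Number of parse trees for a string of n a's (n >= 1) joined by '+',
   // using only the productions  expr -> expr '+' expr | 'a'.
   static long countTrees(int n)
   {
      long[] trees = new long[n + 1];
      trees[1] = 1;                       // the single token "a"
      for (int k = 2; k <= n; ++k)
      {
         // Choose which '+' is the root: i a's to its left, k-i to its right.
         for (int i = 1; i < k; ++i)
            trees[k] += trees[i] * trees[k - i];
      }
      return trees[n];
   }

   public static void main(String[] args)
   {
      for (int n = 1; n <= 5; ++n)
         System.out.println(n + " a's: " + countTrees(n) + " parse tree(s)");
   }
}
```

For a+a+a the count is 2, one tree for (a+a)+a and one for a+(a+a). An
unambiguous grammar must give every string exactly one parse tree.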

-----------------------------------------------------------------------------


5.) Here is an unambiguous, right associative, BNF grammar
(and EBNF grammars derived from it).


BNF:    expr -> 'a' '+' expr
              | '(' expr ')' '+' expr
              | '(' expr ')'
              | 'a'


EBNF:   expr -> 'a' [ '+' expr ]
              | '(' expr ')' [ '+' expr ]


EBNF:   expr -> ( 'a' | '(' expr ')' ) [ '+' expr ]


Here are simplified versions of these grammars.

BNF:   expr -> term '+' expr | term
       term -> 'a' | '(' expr ')'

EBNF:  expr -> term [ '+' expr ]     // "factor" out term
       term -> 'a' | '(' expr ')'

EBNF:  expr -> term ( '+' term )*    // replace right recursion with iteration
       term -> 'a' | '(' expr ')'

Notice how we can use two non-terminals to simplify this grammar,
but they are not required for disambiguating the grammar.


Here is what getExpr() would look like for this EBNF production from above.

EBNF:   expr -> ( 'a' | '(' expr ')' ) [ '+' expr ]

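void getExpr(tokens)
{
   tk = tokens.peek();
   if ( tk.equals("a") )
   {
      tokens.match("a");
   }
   else if ( tk.equals("(") )
   {
      tokens.match("(");
      getExpr(tokens);     // recursion
      if (! tokens.match(")") )
         throw new ParseException();
   }
   else
      throw new ParseException();

   // The [ '+' expr ] part is optional, so look ahead for '+'.
   // Any other token, such as ")", is left for the caller to match.
   // After the outermost call, the caller checks that no tokens remain.
   if ( tokens.hasToken() && tokens.peek().equals("+") )
   {
      tokens.match("+");
      getExpr(tokens);     // recursion
   }
}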


Here is what the mutually recursive getExpr() and getTerm() would
look like for this EBNF grammar from above.

EBNF:  expr -> term [ '+' expr ]
       term -> 'a' | '(' expr ')'

void getExpr(tokens)
{
   getTerm(tokens);
   // The [ '+' expr ] part is optional, so look ahead for '+'.
   // Any other token, such as ")", is left for the caller to match.
   if ( tokens.hasToken() && tokens.peek().equals("+") )
   {
      tokens.match("+");
      getExpr(tokens);     // recursion
   }
}

void getTerm(tokens)
{
   tk = tokens.peek();
   if ( tk.equals("a") )
   {
      tokens.match("a");
   }
   else if ( tk.equals("(") )
   {
      tokens.match("(");
      getExpr(tokens);
      if (! tokens.match(")") )
         throw new ParseException();
   }
   else
      throw new ParseException();
}

You should modify this code so that it produces a parse tree
or an abstract syntax tree.
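
As a starting point for that modification, here is a minimal sketch in Java
that builds an abstract syntax tree for the grammar

    expr -> term [ '+' expr ]
    term -> 'a' | '(' expr ')'

Everything named here (RightAssocParser, Node, nextIs(), parse()) is invented
for this sketch. It also assumes that every token is a single character and
it throws IllegalArgumentException where the code above throws ParseException.

```java
import java.util.ArrayList;
import java.util.List;

// A sketch only, not the token stream class used in these notes.
final class RightAssocParser
{
   // AST node: a leaf "a", or a "+" node with two children.
   static final class Node
   {
      final String label;
      final Node left, right;
      Node(String label, Node left, Node right)
      { this.label = label; this.left = left; this.right = right; }

      @Override public String toString()
      { return left == null ? label : "(" + left + " + " + right + ")"; }
   }

   private final List<String> tokens = new ArrayList<>();
   private int pos = 0;

   private RightAssocParser(String input)
   {
      for (char c : input.toCharArray())
         if (!Character.isWhitespace(c)) tokens.add(String.valueOf(c));
   }

   private boolean nextIs(String tk)
   { return pos < tokens.size() && tokens.get(pos).equals(tk); }

   private void match(String tk)
   {
      if (!nextIs(tk))
         throw new IllegalArgumentException("expected " + tk + " at token " + pos);
      ++pos;
   }

   // expr -> term [ '+' expr ]
   private Node getExpr()
   {
      Node left = getTerm();
      if (nextIs("+"))
      {
         match("+");
         Node right = getExpr();          // recursion builds the right subtree
         return new Node("+", left, right);
      }
      return left;
   }

   // term -> 'a' | '(' expr ')'
   private Node getTerm()
   {
      if (nextIs("a"))
      {
         match("a");
         return new Node("a", null, null);
      }
      match("(");                         // throws if the token is not "("
      Node inner = getExpr();             // parentheses are not kept in an AST
      match(")");
      return inner;
   }

   static Node parse(String input)
   {
      RightAssocParser p = new RightAssocParser(input);
      Node tree = p.getExpr();
      if (p.pos != p.tokens.size())       // leftover tokens are an error
         throw new IllegalArgumentException("unexpected token at " + p.pos);
      return tree;
   }

   public static void main(String[] args)
   {
      System.out.println(parse("a+a+a"));     // (a + (a + a))
      System.out.println(parse("(a+a)+a"));   // ((a + a) + a)
   }
}
```

The printed form puts parentheses around every + node, so the grouping of
a+a+a to the right is visible in the output.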


Question: What goes wrong if we try to write a recursive descent
parser using this grammar? (Remember that this is supposed to
be right associative.)

EBNF:  expr -> term ( '+' term )*    // replace right recursion with iteration
       term -> 'a' | '(' expr ')'

-----------------------------------------------------------------------------


6.) Here is an unambiguous, left associative, BNF grammar
(and EBNF grammars derived from it).


BNF:    expr -> expr '+' 'a'
              | expr '+' '(' expr ')'
              | '(' expr ')'
              | 'a'


EBNF:   expr -> [ expr '+' ] 'a'
              | [ expr '+' ] '(' expr ')'


EBNF:   expr -> [ expr '+' ] ( 'a' | '(' expr ')' )


EBNF:   expr -> ( 'a' | '(' expr ')' ) ( '+' ( 'a' | '(' expr ')' ) )*


Here are simplified grammars.

BNF:   expr -> expr '+' term | term
       term -> 'a' | '(' expr ')'

EBNF:  expr -> [ expr '+' ] term
       term -> 'a' | '(' expr ')'

EBNF:  expr -> ( term '+' )* term
       term -> 'a' | '(' expr ')'

EBNF:  expr -> term ( '+' term )*
       term -> 'a' | '(' expr ')'



Here is what getExpr() would look like for this EBNF production from above.

EBNF:   expr -> ( 'a' | '(' expr ')' ) ( '+' ( 'a' | '(' expr ')' ) )*

void getExpr(tokens)
{
   tk = tokens.peek();
   if ( tk.equals("a") )
   {
      tokens.match("a");
   }
   else if ( tk.equals("(") )
   {
      tokens.match("(");
      getExpr(tokens);     // recursion
      if (! tokens.match(")") )
         throw new ParseException();
   }
   else
      throw new ParseException();

   // Repeat only while the next token is '+'. Any other token,
   // such as ")", is left for the caller to match.
   while ( tokens.hasToken() && tokens.peek().equals("+") )
   {
      tokens.match("+");

      tk = tokens.peek();
      if ( tk.equals("a") )
      {
         tokens.match("a");
      }
      else if ( tk.equals("(") )
      {
         tokens.match("(");
         getExpr(tokens);     // recursion
         if (! tokens.match(")") )
            throw new ParseException();
      }
      else
         throw new ParseException();
   }
}


Notice that the above code has a repeated block of code. We can fix that by
using a simplified form of the grammar. Here is what getExpr() and getTerm()
would look like for this EBNF grammar from above.

EBNF:  expr -> term ( '+' term )*
       term -> 'a' | '(' expr ')'

void getExpr(tokens)
{
   getTerm(tokens);
   // Repeat only while the next token is '+'. Any other token,
   // such as ")", is left for the caller to match.
   while ( tokens.hasToken() && tokens.peek().equals("+") )
   {
      tokens.match("+");
      getTerm(tokens);
   }
}

void getTerm(tokens)
{
   tk = tokens.peek();
   if ( tk.equals("a") )
   {
      tokens.match("a");
   }
   else if ( tk.equals("(") )
   {
      tokens.match("(");
      getExpr(tokens);
      if (! tokens.match(")") )
         throw new ParseException();
   }
   else
      throw new ParseException();
}

You should modify this code so that it produces a parse tree
or an abstract syntax tree.
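
As a starting point for that modification, here is a minimal sketch in Java
that builds an abstract syntax tree for the grammar

    expr -> term ( '+' term )*
    term -> 'a' | '(' expr ')'

The names LeftAssocParser, Node, nextIs(), and parse() are invented for this
sketch. It assumes the input is a string with one character per token and no
whitespace, and it throws IllegalArgumentException where the code above
throws ParseException.

```java
// A sketch only, not the token stream class used in these notes.
final class LeftAssocParser
{
   // AST node: a leaf 'a', or a '+' node with two children.
   static final class Node
   {
      final char label;
      final Node left, right;
      Node(char label, Node left, Node right)
      { this.label = label; this.left = left; this.right = right; }

      @Override public String toString()
      { return left == null ? "a" : "(" + left + " + " + right + ")"; }
   }

   private final String input;   // one character per token, no whitespace
   private int pos = 0;

   private LeftAssocParser(String input) { this.input = input; }

   private boolean nextIs(char c)
   { return pos < input.length() && input.charAt(pos) == c; }

   private void match(char c)
   {
      if (!nextIs(c))
         throw new IllegalArgumentException("expected '" + c + "' at position " + pos);
      ++pos;
   }

   // expr -> term ( '+' term )*
   private Node getExpr()
   {
      Node tree = getTerm();
      while (nextIs('+'))
      {
         match('+');
         Node right = getTerm();
         tree = new Node('+', tree, right);   // tree so far becomes the left child
      }
      return tree;
   }

   // term -> 'a' | '(' expr ')'
   private Node getTerm()
   {
      if (nextIs('a'))
      {
         match('a');
         return new Node('a', null, null);
      }
      match('(');                             // throws if the token is not '('
      Node inner = getExpr();
      match(')');
      return inner;
   }

   static Node parse(String input)
   {
      LeftAssocParser p = new LeftAssocParser(input);
      Node tree = p.getExpr();
      if (p.pos != input.length())            // leftover tokens are an error
         throw new IllegalArgumentException("unexpected token at position " + p.pos);
      return tree;
   }

   public static void main(String[] args)
   {
      System.out.println(parse("a+a+a"));     // ((a + a) + a)
      System.out.println(parse("a+(a+a)"));   // (a + (a + a))
   }
}
```

The left associativity comes from the loop body. Each pass makes the tree
built so far the left child of a new + node.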


------------------------------------------------------------------------------

Since we are now using parentheses in our language, we can specify the following
interesting property. We can force + to be a non-associative operator. This
means that the string "a+a+a" would not be allowed in the language. Instead,
parentheses would have(!) to be used when there is more than one + operator.
(For example, in the programming language Maple, the exponentiation
operator, ^, is non-associative and 2^3^4 has no meaning. On the other hand,
in Fortran the exponentiation operator is right associative but in Matlab
it is left associative.)


BNF:  expr -> 'a'
            | 'a' '+' 'a'
            | 'a' '+' '(' expr ')'
            | '(' expr ')' '+' 'a'
            | '(' expr ')' '+' '(' expr ')'
            | '(' expr ')'


Some language designers make certain operators non-associative when they feel
that those operators are not universally assumed to be either left or right
associative (like the exponentiation operator). Forcing programmers to use
parentheses is thought to be safer than letting them omit the parentheses
and then having the parser use the associativity opposite to the one the
programmer had in mind.
See https://codeplea.com/exponentiation-associativity-options


Here is a simplified non-associative operator grammar.

BNF:  expr -> term '+' term | term
      term -> 'a' | '(' expr ')'

Or, in EBNF.

EBNF: expr -> term [ '+' term ]
      term -> 'a' | '(' expr ')'

What happens when you try to parse "a+a+a"?
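
One way to explore this question is to run a small recognizer for the EBNF
grammar above. The following Java sketch only answers yes or no; it does not
build a tree. The names NonAssocRecognizer and accepts() are invented here,
and the input is assumed to have one character per token and no whitespace.

```java
// A sketch only, written for this illustration.
final class NonAssocRecognizer
{
   private final String input;   // one character per token, no whitespace
   private int pos = 0;

   private NonAssocRecognizer(String input) { this.input = input; }

   // Consume c if it is the next token; report whether it was.
   private boolean match(char c)
   {
      if (pos < input.length() && input.charAt(pos) == c)
      {
         ++pos;
         return true;
      }
      return false;
   }

   // expr -> term [ '+' term ]
   private boolean getExpr()
   {
      if (!getTerm()) return false;
      if (match('+')) return getTerm();   // at most one '+' at this level
      return true;
   }

   // term -> 'a' | '(' expr ')'
   private boolean getTerm()
   {
      if (match('a')) return true;
      return match('(') && getExpr() && match(')');
   }

   static boolean accepts(String input)
   {
      NonAssocRecognizer r = new NonAssocRecognizer(input);
      return r.getExpr() && r.pos == input.length();   // no leftover tokens
   }

   public static void main(String[] args)
   {
      String[] tests = { "a+a", "a+a+a", "(a+a)+a", "a+(a+a)" };
      for (String s : tests)
         System.out.println(s + " : " + accepts(s));
   }
}
```

For "a+a+a", getExpr() returns after the second a with "+a" still unread, so
the check for leftover tokens rejects the string. Both parenthesized forms
are accepted.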


Question: How does the C/C++/Java family of languages handle the
associativity of exponentiation?


Here is another example of operators that should not be associative, the
relational operators,
      ==, !=, <=, >=, <, >.
A string like
      3 < 4 < 5
should not parse. The reason is that the value of a relational operator is
a boolean. If the above expression were, for example, left associative,
then it would parse and evaluate as
      (3 < 4) < 5 --> (true) < 5
and it does not make sense to compare true with 5 (but what about the language C?).

But notice that the relational operators are not quite like the exponentiation
operator. While we may not want
      a ^ b ^ c
to parse, we do want the parenthesized expressions (a ^ b) ^ c and a ^ (b ^ c)
to both parse. On the other hand, while we do not want
      a < b < c
to parse, we also do not want either parenthesized string (a < b) < c or
a < (b