
      A Language of Expressions with Local and Global Variables


An expression with local and global variables is defined by the following grammar.

       Prog ::= Exp
              | '(' 'prog' Exp+ Exp ')'

        Exp ::= Begin
              | Var
              | Print
              | BExp
              | AExp              
              | INTEGER
              | BOOLEAN
              | VARIABLE

      Begin ::= '(' 'begin' Exp+ Exp ')'

        Var ::= '(' 'var' VARIABLE Exp ')'

      Print ::= '(' 'print' Exp ')'

       BExp ::= '(' '||'  Exp Exp+ ')'
              | '(' '&&'  Exp Exp+ ')'
              | '(' '!'   Exp ')'
              | '(' RelOp Exp Exp ')'
              | '('  EqOp Exp Exp ')'

      RelOp ::= '<' | '>' | '<=' | '>='
       EqOp ::= '==' | '!='

       AExp ::= '(' '+' Exp Exp* ')'
              | '(' '-' Exp Exp? ')'
              | '(' '*' Exp Exp+ ')'
              | '(' '/' Exp Exp  ')'
              | '(' '%' Exp Exp  ')'
              | '(' '^' Exp Exp  ')'
              
    INTEGER ::= [0-9]+
    BOOLEAN ::= 'true' | 'false'
   VARIABLE ::= [a-zA-Z][a-zA-Z0-9]*


This language adds one new expression to Language_3, the Begin expression, and it slightly changes the semantics of the Var expression. All the other expressions in this language have the same syntax and semantics as in Language_3.

So to describe the semantics of this language we need to describe the semantics for Begin expressions and the changed semantics for Var expressions.

Before going into the details, let's look at the idea behind the new Begin expression. This expression introduces the idea of "scope" into our languages. The Begin expression is meant to be similar to the curly-brace blocks of code in C/C++/Java/JavaScript/PHP or the let-expressions of functional languages like Lisp/Scheme/ML/Haskell. A Begin expression gathers lines of code together into a unit that shares local variables. For example,

   (prog
      (var x 1)
      (var y 2)
      (var z 3)
      (print (+ x y z))      // ==> 6
      (begin
         (var x 3)           // a new, local x (hides the global x)
         (var y 4)           // a new, local y (hides the global y)
         (print (+ x y z))   // ==> 10
      )
      (print (+ x y z)))     // ==> 6

Notice how the x and y from the "inner block" effectively "hide" the x and y from the "outer block". We say that 'begin' starts a new scope and any Var expressions inside the 'begin' go "out of scope" when you get to the closing parenthesis of the 'begin'.


Now for the details of the semantics of Begin and Var.

The value of a Begin expression is the value of its last Exp.

Besides having a value, each Begin expression also has a side-effect. When the interpreter evaluates a Begin expression, the first thing the interpreter will do is create an empty Environment object (which is called the "local" Environment object) and link that object to the previous (outer) Environment object. Nested Begin terms will create a chain, or linked list, of Environment objects. So the side effect of a Begin expression is to change the chain of Environment objects.

After creating and linking the local Environment object, the interpreter will evaluate all the expressions in the Begin expression. Any variable reference found within any of those expressions will be looked up in the current environment chain. After evaluating all the expressions, the interpreter will de-allocate the local environment object it created at the beginning of the Begin expression and then the interpreter will return the value of the last expression.

Another way to think about this is that a series of nested begins creates a chain of linked Environment objects. Entering a Begin expression adds an environment object (a local scope) to the beginning of the environment chain, and leaving a Begin deletes the environment object at the beginning of the chain (kind of like a stack of environment objects).

When the interpreter is first called, it creates an empty, global Environment object. This global environment object is always at the tail end of the chain of environment objects. It is also called the "global scope".
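
The way entering and leaving a Begin grows and shrinks the environment chain can be sketched in a few lines of Python. This is only an illustration, not the interpreter's actual code: the field names `variables` and `outer`, the `depth` method, the `eval_begin` function, and the use of Python callables in place of parsed Exp nodes are all assumptions made for the example.

```python
class Environment:
    """One scope: a table of variables plus a link to the enclosing scope."""

    def __init__(self, outer=None):
        self.variables = {}   # the <variable, value> pairs of this scope
        self.outer = outer    # enclosing Environment, or None for the global scope

    def depth(self):
        """Count the Environment objects from this one to the global one."""
        return 1 if self.outer is None else 1 + self.outer.depth()


def eval_begin(body, env):
    """Evaluate each expression of a 'begin' in a fresh local scope.

    `body` is a list of Python callables standing in for Exp nodes;
    each one takes the current Environment and returns a value.
    """
    local = Environment(env)     # create the local scope and link it to the outer one
    value = None
    for exp in body:
        value = exp(local)       # every expression sees the extended chain
    # Nothing is freed explicitly: returning drops the only reference to
    # `local`, so the caller continues with `env` as the front of the chain.
    return value                 # the value of the last expression


global_env = Environment()       # created once, when the interpreter starts

depths = []                      # chain length seen by each expression

def record_depth(env):
    depths.append(env.depth())
    return env.depth()

# Models (begin <exp> (begin <exp>)) evaluated at the top level.
result = eval_begin(
    [record_depth, lambda env: eval_begin([record_depth], env)],
    global_env,
)
print(depths, result)            # [2, 3] 3
```

The expression in the outer 'begin' sees a chain of two environments, the one in the nested 'begin' sees three, and after both have returned the global environment is once again the only one left.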


Besides having a value, each Var expression also has a side-effect. A Var expression either puts a new <variable, value> pair into the local Environment object or it updates the value in an already existing <variable, value> pair in the local Environment object. In other words, when the interpreter evaluates a Var term, the interpreter will mutate the local Environment object and also compute a value.

Notice that a Var term cannot mutate any Environment object other than the local one. In particular, a Var term cannot mutate a variable that has been declared in an outer scope.
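
As a rough sketch of this rule, a Var term can be modeled as a method that writes only into the table of the local Environment object. The method name `define` and the field names are hypothetical, and returning the stored value as the value of the Var expression is an assumption carried over for illustration.

```python
class Environment:
    def __init__(self, outer=None):
        self.variables = {}   # the <variable, value> pairs of this scope
        self.outer = outer    # enclosing Environment, or None for the global scope

    def define(self, name, value):
        """Var: add or update the pair in this scope only; never touch self.outer."""
        self.variables[name] = value
        return value          # assumed: a Var expression evaluates to the stored value


global_env = Environment()
global_env.define("x", 1)            # (var x 1) at the top level

local_env = Environment(global_env)  # entering a 'begin'
local_env.define("x", 3)             # (var x 3) creates a new, local x
local_env.define("x", 5)             # (var x 5) updates that local x

print(global_env.variables)          # {'x': 1}
print(local_env.variables)           # {'x': 5}
```

The global x keeps its value 1 no matter how many times the local x is declared or updated.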

When a variable is referenced in an expression, its value is looked up in the current chain of Environment objects (starting with the current local scope). The first occurrence of the variable in an Environment chain is the one that determines its value (so the declaration of a variable in a nested Begin expression can "hide" an occurrence of that variable in an enclosing Begin).


We should also notice that the semantics of a variable reference have changed slightly from Language_3. When the interpreter comes across a variable reference (in any expression other than a Var expression) the interpreter should look up the value of the variable in the whole current Environment chain (not just the current local environment object). This means that the implementation of the Environment class should now do a lookup in a recursive way that traverses the current Environment chain.

It is an error to reference a variable in an expression before the variable has been declared by a Var term in the same scope-chain as the expression using the variable. In other words, it is an error if the interpreter cannot find the referenced variable anywhere in the current environment chain.
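
A recursive lookup along the chain might look like the following sketch. The names `lookup`, `define`, `variables`, and `outer` are assumptions, and raising Python's NameError is just one possible way to report an undeclared variable. The data reproduces the x, y, z example from earlier in this document.

```python
class Environment:
    def __init__(self, outer=None):
        self.variables = {}   # the <variable, value> pairs of this scope
        self.outer = outer    # enclosing Environment, or None for the global scope

    def define(self, name, value):
        self.variables[name] = value
        return value

    def lookup(self, name):
        """Return the value of the first occurrence of `name` in the chain."""
        if name in self.variables:
            return self.variables[name]          # found in this scope
        if self.outer is None:
            raise NameError(f"undeclared variable: {name}")
        return self.outer.lookup(name)           # keep searching outward


global_env = Environment()
for name, value in (("x", 1), ("y", 2), ("z", 3)):
    global_env.define(name, value)

local_env = Environment(global_env)              # entering the 'begin'
local_env.define("x", 3)
local_env.define("y", 4)

def sum_xyz(env):
    return env.lookup("x") + env.lookup("y") + env.lookup("z")

print(sum_xyz(global_env))   # 6
print(sum_xyz(local_env))    # 10

try:
    local_env.lookup("w")
except NameError as err:
    print(err)               # undeclared variable: w
```

Inside the 'begin', x and y are found in the local scope while z is found only after following the link to the global scope.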



There is another possible semantics for this language that changes the meaning of the Var expression. We could define the Var expression as follows. If the variable from the <variable, value> pair is already defined somewhere in the current chain of Environment objects, then the Var term mutates the value of the variable in the Environment object that contains it. If the variable is not defined anywhere in the current chain, then the Var term puts the <variable, value> pair into the current local Environment object. With this semantics, a nested Begin expression cannot hide a variable from an outer Begin expression (as opposed to the current semantics, where a nested 'begin' cannot mutate a variable from an outer 'begin'). Implementing this alternate semantics is left as an exercise for the reader.


And like Language_3, this language can also be given the semantics where Var can only declare variables and cannot mutate them. In that case, this would be a language of immutable variables which can hide each other (which is the case in most functional programming languages).
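
One way the declare-only semantics could be sketched is shown below. The method name `declare` is hypothetical, and treating a second declaration of the same name in the same scope as an error is an assumption about what "cannot mutate" should mean; other readings are possible.

```python
class Environment:
    def __init__(self, outer=None):
        self.variables = {}   # the <variable, value> pairs of this scope
        self.outer = outer    # enclosing Environment, or None for the global scope

    def declare(self, name, value):
        """Declare-only Var: a name may be bound at most once per scope."""
        if name in self.variables:
            raise NameError(f"variable already declared in this scope: {name}")
        self.variables[name] = value
        return value


global_env = Environment()
global_env.declare("x", 1)

local_env = Environment(global_env)
local_env.declare("x", 3)        # allowed: hides the global x, does not mutate it

try:
    local_env.declare("x", 5)    # rejected: would mutate the local x
    redeclared = True
except NameError:
    redeclared = False

print(global_env.variables, local_env.variables, redeclared)
# {'x': 1} {'x': 3} False
```

Hiding still works because the check only looks at the local table, never at the outer scopes.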