Simon Gay, Department of Computing Science, University of Glasgow, 2006/07
We want to understand how to convert the formal specification of a type system into an implemented typechecker. We will build typecheckers for the simple expression language (SEL) and the simple functional language (SFL), and use them as the basis for implementations of more complex type systems later. The process is fairly straightforward, but we need to take care with some details. We will use Java for our implementations. You might find it interesting to look at Pierce’s implementations in OCaml.
Typechecking consists of traversing the AST, checking that the typing rules are obeyed. This requires establishing the type of each expression. Later stages of compilation require type information so the type of each expression must be stored. Variables must be matched with declarations - so scoping rules are also checked. The process is often called contextual analysis - perhaps this means more than typechecking, but we’ll just refer to typechecking. The process of establishing the type of every expression is sometimes called elaboration.
let
  var n: Integer;
  var c: Char
in begin
  c := '&';
  n := n + 1
end
[Figures not reproduced: the abstract syntax tree of this program, the traversal of the tree, and the checks performed during the traversal.]
The details depend on the representation of ASTs, which in turn depends partly on the implementation language. For example, in a functional language we define a datatype corresponding to the abstract syntax of the language. In ML the datatype for SEL might look like this:
The typechecker is a function from expr to … what? We can define another datatype for elaborated ASTs. In general this must represent the type of every expression, but in SEL the only expression whose type is not obvious is the conditional. Compare this with Pierce’s approach.
The typechecker is a function from expr to typed_expr * ty.
We’re going to use Java, which means that the representation of ASTs uses a natural OO style. Watt and Brown’s book describes this in detail.
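As an illustration of the OO style, here is a minimal sketch in which each AST node class checks itself and stores its own type. It assumes that SEL has integer and boolean literals, addition and a conditional; the names Type, TypeError, Expr, IntLit, BoolLit, Plus, If and check() are invented for this sketch and are not taken from the course implementation.

```java
// Hypothetical SEL-like AST: all names here are illustrative assumptions.
enum Type { INT, BOOL }

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

abstract class Expr {
    Type type;  // filled in by check(), so later phases can read it

    // Checks this expression, stores its type in the node and returns it.
    abstract Type check();
}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    Type check() { return type = Type.INT; }
}

class BoolLit extends Expr {
    final boolean value;
    BoolLit(boolean value) { this.value = value; }
    Type check() { return type = Type.BOOL; }
}

class Plus extends Expr {
    final Expr left, right;
    Plus(Expr left, Expr right) { this.left = left; this.right = right; }
    Type check() {
        // Both operands are checked before the test, so both get elaborated.
        Type l = left.check();
        Type r = right.check();
        if (l != Type.INT || r != Type.INT)
            throw new TypeError("operands of + must be int");
        return type = Type.INT;
    }
}

class If extends Expr {
    final Expr cond, thenBranch, elseBranch;
    If(Expr cond, Expr thenBranch, Expr elseBranch) {
        this.cond = cond;
        this.thenBranch = thenBranch;
        this.elseBranch = elseBranch;
    }
    Type check() {
        if (cond.check() != Type.BOOL)
            throw new TypeError("condition must be bool");
        Type t = thenBranch.check();
        if (elseBranch.check() != t)
            throw new TypeError("branches must have the same type");
        return type = t;  // the conditional takes the type of its branches
    }
}
```

Storing the result in the type field plays the role of elaboration: after check() succeeds, every node that was visited carries its type.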
A more object-oriented approach is to use the visitor design pattern. (See Watt and Brown for more details.)
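A minimal sketch of the visitor approach follows. It uses a generic Visitor interface, which is one common variant and not necessarily the formulation in Watt and Brown. The node classes and the operations assumed for SEL (literals, addition, conditional) are illustrative assumptions, as are all the names.

```java
// Hypothetical SEL-like AST with a visitor: all names are illustrative assumptions.
enum Type { INT, BOOL }

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

// One visit method per node class; R is the result type of the traversal.
interface Visitor<R> {
    R visitIntLit(IntLit e);
    R visitBoolLit(BoolLit e);
    R visitPlus(Plus e);
    R visitIf(If e);
}

abstract class Expr {
    // Each node class dispatches to the visit method for its own class.
    abstract <R> R accept(Visitor<R> v);
}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    <R> R accept(Visitor<R> v) { return v.visitIntLit(this); }
}

class BoolLit extends Expr {
    final boolean value;
    BoolLit(boolean value) { this.value = value; }
    <R> R accept(Visitor<R> v) { return v.visitBoolLit(this); }
}

class Plus extends Expr {
    final Expr left, right;
    Plus(Expr left, Expr right) { this.left = left; this.right = right; }
    <R> R accept(Visitor<R> v) { return v.visitPlus(this); }
}

class If extends Expr {
    final Expr cond, thenBranch, elseBranch;
    If(Expr cond, Expr thenBranch, Expr elseBranch) {
        this.cond = cond;
        this.thenBranch = thenBranch;
        this.elseBranch = elseBranch;
    }
    <R> R accept(Visitor<R> v) { return v.visitIf(this); }
}

// The typechecker is one visitor; all the typing rules live in this class.
class TypeChecker implements Visitor<Type> {
    public Type visitIntLit(IntLit e) { return Type.INT; }
    public Type visitBoolLit(BoolLit e) { return Type.BOOL; }
    public Type visitPlus(Plus e) {
        Type l = e.left.accept(this);
        Type r = e.right.accept(this);
        if (l != Type.INT || r != Type.INT)
            throw new TypeError("operands of + must be int");
        return Type.INT;
    }
    public Type visitIf(If e) {
        if (e.cond.accept(this) != Type.BOOL)
            throw new TypeError("condition must be bool");
        Type t = e.thenBranch.accept(this);
        if (e.elseBranch.accept(this) != t)
            throw new TypeError("branches must have the same type");
        return t;
    }
}
```

The design choice is that the AST classes contain no typechecking code at all: another traversal (a code generator, say) can be added as a further Visitor implementation without touching them.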
If we want to implement a typechecker (for SEL or SFL, say) then we also need a parser. It is convenient to use an automated tool to generate as much as possible of the front-end machinery. We will use SableCC, a compiler construction tool developed at McGill University in Canada. SableCC is given an annotated grammar and generates front-end code from it.
A grammar for SEL, suitable for SableCC, begins with a specification of tokens:
Followed by the productions:
Exercise: Draw a parse tree for the expression 1 + (2 + 3). Why are the brackets necessary and why has the grammar been defined in a way that makes them necessary?
For each non-terminal in the grammar, SableCC generates an abstract class, for example: The names of these classes are systematically generated from the names of the non-terminals.
For each production, SableCC generates a class, for example:
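To give a feel for the naming scheme, the following is a hand-written imitation, not actual SableCC output. It assumes a hypothetical grammar fragment with a non-terminal expr having alternatives named plus and term, and a non-terminal term with an alternative named num. SableCC conventionally prefixes non-terminal classes with P, alternative classes with A and token classes with T; the real generated classes contain considerably more (parent links, setters, visitor hooks).

```java
// Hand-written imitation of the naming scheme only; NOT generated code.
// Assumed (hypothetical) grammar fragment:
//   expr = {plus} expr plus term | {term} term;
//   term = {num} number;
abstract class Node {}

// Tokens: T + token name (simplified here to extend Node directly).
class TPlus extends Node {}
class TNumber extends Node {
    private final String text;
    TNumber(String text) { this.text = text; }
    String getText() { return text; }
}

// Non-terminals: one abstract class each, P + non-terminal name.
abstract class PExpr extends Node {}
abstract class PTerm extends Node {}

// Alternatives: one concrete class each, A + alternative name + non-terminal name,
// with one getter per element on the right-hand side.
class APlusExpr extends PExpr {
    private final PExpr expr;
    private final TPlus plus;
    private final PTerm term;
    APlusExpr(PExpr expr, TPlus plus, PTerm term) {
        this.expr = expr;
        this.plus = plus;
        this.term = term;
    }
    PExpr getExpr() { return expr; }
    TPlus getPlus() { return plus; }
    PTerm getTerm() { return term; }
}

class ATermExpr extends PExpr {
    private final PTerm term;
    ATermExpr(PTerm term) { this.term = term; }
    PTerm getTerm() { return term; }
}

class ANumTerm extends PTerm {
    private final TNumber number;
    ANumTerm(TNumber number) { this.number = number; }
    TNumber getNumber() { return number; }
}
```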
We define
An implementation of a typechecker for SEL can be found on the course web page. You should study the implementation, the accompanying notes, and Worksheet 3. Any questions about the implementation of the typechecker can be dealt with in a future tutorial.
An implementation of a typechecker for the Simple Functional Language (SFL) can be found on the course web page and is described in the accompanying notes. You should study them in comparison with the SEL typechecker. The SFL typechecker is based on the SEL typechecker, with two main differences. There are of course also some changes to the grammar, including the fact that there is now syntax for the types int and bool.
An environment is essentially a lookup table, indexed by strings (identifier names) and containing two kinds of entry: a variable with its type, and a function name with its parameter types and result type. We can use a Hashtable.
We must deal with nesting of scopes. Even though SFL does not have nested functions, there is still a global scope (containing type information for all functions) and a local scope within each function. The class Env implements a stack of Hashtables. To look up a variable or function name, first look in the Hashtable on top of the stack. If it is not there, keep looking down the stack. We will be able to use the same Env class for environments in languages with full scope nesting.
{ int x; bool b;
  { float x; int y;
    code…x…y…b…
  }
  code…x…b…
}
Inner scope: x : float, y : int
Outer scope: x : int, b : bool
openScope() creates a new Hashtable on the stack
closeScope() removes the top Hashtable
put(String n, Env.Entry e), get(String n)
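A minimal sketch of such an Env class is given below. Only the method names openScope, closeScope, put and get and the type name Env.Entry come from the slides; the VarEntry and FunEntry subclasses, the use of strings to represent types, and the choice to start with one open (global) scope are assumptions made for this sketch, and the course implementation may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Hashtable;
import java.util.List;

class Env {
    // Two kinds of entry (hypothetical representation; types are plain strings here).
    static abstract class Entry {}

    static class VarEntry extends Entry {
        final String type;
        VarEntry(String type) { this.type = type; }
    }

    static class FunEntry extends Entry {
        final List<String> paramTypes;
        final String resultType;
        FunEntry(List<String> paramTypes, String resultType) {
            this.paramTypes = paramTypes;
            this.resultType = resultType;
        }
    }

    // The stack of scopes; the innermost scope is at the head of the deque.
    private final Deque<Hashtable<String, Entry>> scopes =
            new ArrayDeque<Hashtable<String, Entry>>();

    // Assumption: a new environment starts with one (global) scope open.
    Env() { openScope(); }

    void openScope() { scopes.push(new Hashtable<String, Entry>()); }

    void closeScope() { scopes.pop(); }

    // Adds an entry to the innermost scope, hiding any outer entry of the same name.
    void put(String n, Entry e) { scopes.peek().put(n, e); }

    // Searches from the innermost scope outwards; returns null if n is not declared.
    Entry get(String n) {
        for (Hashtable<String, Entry> scope : scopes) {
            Entry e = scope.get(n);
            if (e != null) return e;
        }
        return null;
    }
}
```

With the nested block above, put("x", …) in the inner scope hides the outer x until closeScope() is called, after which get("x") finds the outer entry again and get("y") returns null.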
The SEL typechecker makes a single traversal of the syntax tree. If we want to typecheck SFL in a single pass, then in order to support mutually recursive functions we need to follow Pascal (forward declarations) or Standard ML (mutually recursive functions declared together with "and"). Instead, to stick closely to the formal definition of SFL, we use two passes: the first just looks at function definitions and builds an initial environment containing their type information. The second pass then typechecks each function body in that environment.
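The two-pass idea can be sketched as follows. This is a heavily simplified illustration, not the course's SFL typechecker: function bodies here are only integer literals or calls, types are plain strings, parameters and local scopes are omitted, and all class names (Expr, IntLit, Call, FunDef, TwoPassChecker) are invented for the sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, drastically simplified SFL-like syntax.
abstract class Expr {}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
}

class Call extends Expr {
    final String function;
    final List<Expr> args;
    Call(String function, Expr... args) {
        this.function = function;
        this.args = Arrays.asList(args);
    }
}

class FunDef {
    final String name;
    final List<String> paramTypes;
    final String resultType;
    final Expr body;
    FunDef(String name, List<String> paramTypes, String resultType, Expr body) {
        this.name = name;
        this.paramTypes = paramTypes;
        this.resultType = resultType;
        this.body = body;
    }
}

class TwoPassChecker {
    private final Map<String, FunDef> signatures = new HashMap<String, FunDef>();

    void check(List<FunDef> program) {
        // Pass 1: record every function's signature without looking at any body.
        for (FunDef f : program) {
            if (signatures.put(f.name, f) != null)
                throw new IllegalStateException("duplicate function " + f.name);
        }
        // Pass 2: check each body; calls to later functions now resolve.
        for (FunDef f : program) {
            String actual = typeOf(f.body);
            if (!actual.equals(f.resultType))
                throw new IllegalStateException(
                    f.name + " declared " + f.resultType + " but body has type " + actual);
        }
    }

    private String typeOf(Expr e) {
        if (e instanceof IntLit) return "int";
        Call c = (Call) e;  // the only other kind of expression in this sketch
        FunDef callee = signatures.get(c.function);
        if (callee == null)
            throw new IllegalStateException("undefined function " + c.function);
        if (callee.paramTypes.size() != c.args.size())
            throw new IllegalStateException("wrong number of arguments to " + c.function);
        for (int i = 0; i < c.args.size(); i++) {
            if (!typeOf(c.args.get(i)).equals(callee.paramTypes.get(i)))
                throw new IllegalStateException("argument type mismatch in call to " + c.function);
        }
        return callee.resultType;
    }
}
```

Because the body of each function is examined only after all signatures have been recorded, a function f may call g even when g is defined after f, with no forward declaration needed.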
We have a formal definition of the syntax, operational semantics and type system of SFL and we have proved that the type system is sound. When we have seen how to introduce functions properly, we’ll go on to look at structured data types (e.g. records).
In the tutorial we will discuss these exercises and any further details of the SEL example.