Project 3: Parsing and Type Checking Solution

Starting from:

$35

Your goal is to write a predictive parser and write a type checker for a given language. The parser checks the syntax of the input and the type checker enforces the semantic rules of the language. The semantic rules that we are interested in specify which assignments are valid and which expressions are valid.

The input to your code will be a program and the output will be:

an error message if there is a type mismatch or syntax error or

lists of symbols of different types if there is no error.

Grammar Description

program

scope

scope list

scope list

scope list

scope list

declaration

type decl

type name

var decl

id list

id list

stmt list

stmt list

stmt

stmt

assign stmt

while stmt

expr

expr

term

term

term

factor

factor

factor

factor

condition

condition

primary

primary

primary

relop

relop

relop

relop

relop

! scope
To be added

! LBRACE scope list RBRACE

! stmt
scope_list → scope
! declaration
scope_list → scope scope_list
! stmt scope list

! declaration scope list

! type decl | var decl

! TYPE id list COLON type name SEMICOLON

! REAL | INT | BOOLEAN | STRING | LONG | ID

! VAR id list COLON type name SEMICOLON

! ID

! ID COMMA id list

! stmt

! stmt stmt list

! assign stmt

! while stmt

! ID EQUAL expr SEMICOLON

! WHILE condition LBRACE stmt list RBRACE

! term

! term PLUS expr

! factor

! factor MULT term
To be deleted
!
factor DIV term

LPAREN expr RPAREN

NUM

REALNUM

ID

ID

primary relop primary

ID

NUM

REALNUM

GREATER

GTEQ

LESS

NOTEQUAL

LTEQ

The tokens used in the grammar description are:

TYPE
= TYPE
VAR
= VAR
REAL
= REAL
INT
= INT
BOOLEAN
= BOOLEAN
STRING
= STRING
LONG
= LONG
WHILE
= WHILE
COMMA
= ,
COLON
= :
SEMICOLON
= ;
LBRACE
= {
RBRACE
= }
LPAREN
= (
RPAREN
= )
EQUAL
= =
PLUS
= +
MULT
= *
DIV
= /
GREATER
=
GTEQ
= =
LESS
= <
LTEQ
= <=
NOTEQUAL
= <
ID
= letter (letter + digit)*
NUM
= 0 + (pdigit digit*)
REALNUM
= NUM \. digit digit*

Language Semantics

2.1. Scoping Rules

Lexical scoping is used. Every scope defines a scope.

2.2. Types

The language has five built-in types: INT , REAL , BOOLEAN , STRING , and LONG . Programmers can also declare new types with a type decl which can appear anywhere in the program, except in

the statement list of a while stmt .

2.3. Variables

Programmers declare variables explicitly in var decl . The names of the declared variables appear in the id list and their type is the type name .

Example

Consider the following program written in our language:

{

TYPE a : INT;

TYPE b : a;

2

VAR x : b;

VAR y : c;

y = x;

}

This program has two declared variables: x and y .

2.4. Declaration vs. Use

Any appearance of a name (type or variable) in the program is either a declaration or a use. The following lists all possible declarations of a name:

Any appearance of a name in the id list part of a type decl

Any appearance of a name in the id list part of a var decl

Any other appearance of a name is considered a use of that name. Note that the above definitions exclude the built-in type names.

Given the following example (the line numbers are not part of the input):

01 {

TYPE a : INT;

TYPE b : a;

VAR x : b;

VAR y : c;

y = x;

}

We can categorize all appearances of names as declaration or use:

Line 2, the appearance of name a is a declaration

Line 3, the appearance of name b is a declaration

Line 3, the appearance of name a is a use

Line 4, the appearance of name x is a declaration

Line 4, the appearance of name b is a use

Line 5, the appearance of name y is a declaration

Line 5, the appearance of name c is a declaration

Line 6, the appearance of name y is a use

Line 6, the appearance of name x is a use

2.5. Type System

Our language uses structural equivalence for checking type equivalence.

Here are all the type rules/constraints that your type checker will enforce (constraints are labeled from

C1 to C4 for reference):

C1: The left hand side of an assignment should have the same type as the right hand side of that assignment

3

C2: The operands of an operation ( PLUS , MINUS , MULT , and DIV ) should have the same type (it can be any type, including STRING and BOOLEAN )

C3: The operands of a relational operator (see relop in grammar) should have the same type (it can be any type, including STRING and BOOLEAN )

C4: condition should be of type BOOLEAN

The type of an expr is the same as the type of its operands

The result of p1 relop p2 is of type BOOLEAN (assuming that p1 and p2 have the same type)

NUM constants are of type INT

REALNUM constants are of type REAL

If two types cannot be determined to be the same according to the above rules, the two types are different

Parser

You must write a parser for the grammar, If your parser detects a syntax error in the input, it should output the following message and exit:

Syntax Error

You can start coding by writing the parser first and then move on to implementing the type checking part. You should make sure that your parser generates a syntax error message if the input program does not follow the proper syntax.

We recommend that you check your code on the submission website to make sure it passes all the test cases in the parsing category before moving on to implementing the type checking part.

Our grammar is not LL(1) i.e. it does not satisfy the conditions for predictive parser with one token lookahead, however, it is still possible to write a recursive descent parser with no backtracking. We can do this by looking ahead at more than one token in some cases and left-factoring some rules. In particular, parsing condition requires more than one token lookahead. Also, two rules like

stmt list ! stmt

stmt list ! stmt stmt list

can be parsed by first parsing stmt and then checking if the next token is the beginning of stmt list or in the FOLLOW of stmt list . In fact the two rules are equivalent to

stmt list ! stmt stmt list1

stmt list1 ! stmt list | ✏

but you do not need to explicitly left-factor the rules by introducing stmt list1 to parse stmt list .

4
Output

Your program will check for the following semantic errors and output the correct message when it en-counters that error. Note that there will only be at most one error per test case.

4.1. Redeclaration Errors

Errors involving programmer-defined types:

Programmer-defined type declared more than once (error code 1.1)

A type is declared more than once if it appears in more than one id list of type decl in the same scope. Declaring the same type name is allowed in non-overlapping scopes and in nested scopes (lexical scoping).

Programmer-defined type redeclared as a variable (error code 1.2)

A type name is redeclared as a variable if the name appears first in an id list of a type decl and appears again in id list of var decl in the same scope. Using the same programmer defined name for a variable and a type is allowed in non-overlapping scopes and in nested scopes (lexical scoping)

Programmer-defined type used as variable (error code 1.3)

If resolving a variable name returns a type declaration, the type is used as a variable.

Undeclared type (error code 1.4)

If resolving a type name returns no declaration, the type is undeclared.

Errors involving variable declarations:

Variable declared more than once (error code 2.1)

An explicitly declared variable can be declared again explicitly by appearing as part of an id list in a variable declaration and resolving the name at the site of the new declaration returns the first declaration.

Programmer-defined variable redeclared as a type (error code 2.2)

A variable name is redeclared as a type if the name appears first in an id list of a var decl and appears again in id list of type decl in the same scope. Using the same programmer defined name for a variable and a type is allowed in non-overlapping scopes and in nested scopes (lexical scoping)

Variable used as a type (error code 2.3)

If the type name in a variable declaration resolves to a variable declaration, the variable is used as a type.

Undeclared variable (error code 2.4)

If resolving a variable name that appears in a statement other than a declaration returns no declaration, the variable is undeclared.

Errors involving built-in type:

Redeclaration or use of a built-in type name If a built-in type name appears in id list of a variable declaration or a type declaration or appears in a statement other than a declaration statement, your parser should output syntax error.

For each error involving variable declarations and errors involving type declarations, your type checker should output one line in the following format:

ERROR CODE <code <symbol_name

in which <code should be replaced with the proper code (see the error codes listed above) and

<symbol name should be replaced with the name of the type or variable that caused the error. Since

the test cases will have at most one error each, the order in which these error messages are printed does not matter.

5
4.2. Type Mismatch

If any of the type constraints (listed in the Type System section above) is violated in the input program, then the output of your program should be:

TYPE MISMATCH <line_number <constraint

Where <line number is replaced with the line number that the violation occurs and <constraint should be replaced with the label of the violated type constraint (possible values are C1 through C4, see section on Type System for details of each constraint). Note that you can assume that anywhere a violation can occur it will be on a single line.

4.3. No Semantic Errors

If there are no semantic errors in the program, then your program should output for each use of a programmer-defined type name and variable name, in the order in which the use appears in the pro-gram, the line number in which the use appears and the line number of the declaration to which the use of the name resolves. For this part, the format should be the following

<name_used_1 <line_no_use_1 <line_no_declared_1 <name_used_2 <line_no_use_2 <line_no_declared_2 ...

Where <name use i is the name of a variable or a type and corresponds to i’th name use in the pro-gram. <line no use i is the line number in which the i’th use appears and <line no declared i is the line number of the declaration corresponding to the i’th use.

Examples

Given the following example (the line numbers are not part of the input):

01 {

TYPE a, b, c, b : INT;

VAR x : b;

VAR y : c;

y = x;

}

The output will be the following:

ERROR CODE 1.1 b

6

Given the following example (the line numbers are not part of the input):

01 {

TYPE a : INT;

VAR x : INT;

VAR b, a : STRING;

x = 10;

}

The output will be the following:

ERROR CODE 1.2 a

Given the following example (the line numbers are not part of the input):

01 {

02 VAR x : INT;

VAR y, x : STRING;

x = 10;

}

The output will be the following:

ERROR CODE 2.1 x

Given the following example (the line numbers are not part of the input):

01 {

04 VAR x : INT;

TYPE y, x : STRING;

x = 10;

}

The output will be the following:

ERROR CODE 2.2 x

Given the following example (the line numbers are not part of the input):

01 {

02 VAR x1 : INT;

VAR y, x2 : STRING;

y = x1;

}

7

The output will be the following:

TYPE MISMATCH 4 C1

Given the following example (the line numbers are not part of the input):

01 {

02 VAR

x : INT;

VAR y : REAL;

WHILE x < 10

{

x = x + y;

}

}

The output will be the following:

TYPE MISMATCH 7 C2

Given the following example (the line numbers are not part of the input):

01 {

TYPE t : INT;

VAR a, b : t;

04
{
VAR a
: INT;
05

WHILE
a b
06

{

07

a = a + b;
08

}

}

}

The output will be the following:

t 3 2

a 5 4

b 5 3

a 7 4

a 7 4

b 7 3

8
Evaluation

Your submission will be graded on passing the automated test cases.

The test cases (there will be multiple test cases in each category, each with equal weight) will be broken down in the following way (out of 100 points):

Parsing: 35 points

Errors involving programmer-defined types (error codes 1.x): 15 points

Errors involving variable declarations (error codes 2.x): 20 points

Type mismatch errors and no semantic error cases: 30 points

The parsing category is not partially graded, you need to pass all test cases in that category to get the 35 points. All other categories are partially graded.

9

More products

Execute the script file LargeCo_oralceScript.sql Solution (Best price and Top Grade Guaranteed)

$26.99

Buy now

Use SQL to create the tables described in table_descriptions Solution (Best price and Top Grade Guar

$26.99

Buy now

Create a program with the following OO structure Solution (Best price and Top Grade Guaranteed)

$26.99

Buy now