$24
In the previous assignment, you implemented a lexical analyzer, or scanner, to tokenize program statements. Your scanner accepted statements and returned a string of tokens corresponding to the statement. For example, your scanner accepts the statement num1 = num2/4 + num3/21 and returns the following token string: identi er:num1 assignment identi-er:num2 operator:/ number:4 operator:+ identi er:num3 operator:/ number:21.
In this assignment, you will be writing a recognizer, i.e. a computer program that recognizes token lists that can be gen-erated by a given grammar, and therefore constitute a valid statement in the language de ned by that grammar. Thus, your program accepts a string of tokens and categorizes it as either a valid statement, or an invalid statement. The grammar that you will be using for this purpose is shown below.
statement ! assign stmt j expression
assign stmt ! identif ier assignment expression
expression ! (identif ier j number) operator expression j (identif ier j number)
To perform this task, your program implements the following algorithm.
1. Takes the token string and removes the details, like identi er name etc. (highlighted in magenta above), retaining only the tokens
2. Assume that t1,t2,t3,t4,...tN represents the string of tokens being processed.
3. Starting from the right end of the string of tokens, your program attempts to nd the longest token substring that matches the right hand side of some production in the grammar.
(a) If it nds a match, it replaces that substring of tokens with the left hand side of the corresponding production.
(b) If no match is found, it attempts to the do the same with the next smallest substring starting from the right end.
4. This process is repeated till the token string has been reduced to the non-terminal statement, or no further reduction is possible. The rst case corresponds to a valid statement, and the second one corresponds to an invalid statement.
Example: a = b tokenizes as identi er assignment identi er
1. Check to see if the RHS of any rule matches identi er assignment identi er. It does not.
2. Forget the left most token identi er, and check to see if the RHS of any rule matches assignment identi er. It does not.
3. Forget the left most token, assignment, and check to see if the RHS of any rule matches identi er. Rule 3, expression -> identi er matches.
4. Replace identi er (RHS) with expression (LHS), and repeat the process for identi er assignment expression
5. RHS of rule 2 matches identi er assignment expression
6. Replace with LHS, assign stmt, and repeat the process.
7. RHS of rule 1 matches assign stmt
8. Replace with LHS, statement. Success!
Test your program on each statement in the test le provided. When executed, your program must read each statement in the le, and tokenize it using the tokenize function you wrote for assignment 1. If the tokenization has no error, the string of tokens is forwarded to the recognizer which classi es each statement as either a valid, or an invalid, statement. Submit both your program and the output of your program for each statement in the le.