Starting from:
$30

$24

CSE Assignment 5: Machine Independent Code Generator for tinyC Solution


    • Preamble { tinyC

The Lexical Grammar (Assignment 3) and the Phase Structure Grammar (As-signment 4) for tinyC have already been de ned as subsets of the C language speci cation from the International Standard ISO/IEC 9899:1999 (E).

In this assignment you will write the semantic actions in Bison to translate a tinyC program into an array of 3-address quad’s, a supporting symbol table, and other auxiliary data structures. The translation should be machine-independent, yet it has to carry enough information so that you can later target it to a speci c architecture (x86 / IA-32 / x86-64).

    • Scope of Machine-Independent Translation

Focus on the following from the di erent phases to write actions for translation.

2.1    Expression Phase

Support all arithmetic, shift, relational, bit, logical (boolean), and assignment expressions excluding:

    1. sizeof operator

    2. Comma (,) operator

    3. Compound assignment operators

*= /= %= += -= <<= >>= &= ^= |=

Support only simple assignment operator (=)

4. Structure component expression

2.2    Declarations Phase

Support for declarations should be provided as follows:

        1. Simple variable, pointer, array, and function declarations should be sup-ported. For example, the following would be translated:

float d = 2.3; int i, w[10]; int a = 4, *p, b;

void func(int i, float d); char c;


        2. Consider only void, char, int, and oat type-speci ers. As speci ed in C, char and int are to be taken as signed.

For computation of o set and storage mapping of variables, assume the following sizes1 (in bytes) of types:

1Using hard-coded sizes for types does not keep the code machine-independent. Hence you may want to use constants like size of char, size of int, size of oat, and size of pointer for sizes that can be de ned at the time of machine-dependent targeting.




1

Type
Size
Remarks






void
unde ned

char
1

int
4

oat
8

void *
4
All pointers have same size




It may also help to support an implicit bool (boolean) type with con-stants 1 (TRUE) and 0 (FALSE). This type may be inferred for a logical expression or for an int expression in logical context. Note that the users cannot de ne, load, or store variables of bool type explicitly, hence it is not storable and does not have a size.

    3. Initialization of arrays may be skipped.

    4. storage-class-speci er, enum-speci er, type-quali er, and function-speci er may be skipped.

    5. Function declaration with only parameter type list may be skipped. Hence, void func(int i, float d);

should be supported while void func(int, float); may not be.

2.3    Statement Phase

Support all statements excluding:

    1. Declaration within for.

    2. All Labelled statements (labeled-statement).

    3. switch in selection-statement.

    4. All Jump statements (jump-statement) except return.

2.4    External De nitions Phase

Support function de nitions and skip external declarations.

    • The 3-Address Code

Use the 3-Address Code speci cation as discussed in the class. For easy reference the same is reproduced here. Every 3-Address Code:

    • Uses only up to 3 addresses.

    • Is represented by a quad comprising { opcode, argument 1, argument 2, and result; where argument 2 is optional.

3.1    Address Types

    • Name: Source program names appear as addresses in 3-Address Codes.

    • Constant: Many di erent types and their (implicit) conversions are al-lowed as deemed addresses.

    • Compiler-Generated Temporary: Create a distinct name each time a tem-porary is needed { good for optimization.






2
3.2    Instruction Types

For Addresses x, y, z, and Label L

    • Binary Assignment Instruction: For a binary op (including arithmetic, shift, relational, bit, or logical operators):

x = y op z

    • Unary Assignment Instruction: For a unary operator op (including unary minus or plus, logical negation, bit, and conversion operators):

x = op y

    • Copy Assignment Instruction: x = y

    • Unconditional Jump: goto L

    • Conditional Jump: { Value-based:

if x goto L ifFalse x goto L

{ Comparison-based: For a relational operator op (including <, >, ==, !=, , ):

if x relop y goto L

    • Procedure Call: A procedure call p(x1, x2, ..., xN) having N 0 pa-rameters is coded as (for addresses p, x1, x2, and xN):

param x1 param x2

...

param xN

y = call p, N

Note that N is not redundant as procedure calls can be nested.

    • Return Value: Returning a return value and / or assigning it is optional. If there is a return value v it is returned from the procedure p as:

return v

    • Indexed Copy Instructions:

x = y[z]

x[z] = y

    • Address and Pointer Assignment Instructions:

x = &y

x = *y

*x = y









3
    • Design of the Translator

Lexer & Parser Use the Flex and Bison speci cations (if required you may correct your speci cations) you had developed in Assignments 3 and 4 respectively and write semantic actions for translating the subset of tinyC as speci ed in Section 2. Note that many grammar rules of your tinyC parser may not have any action or may just have propagate-only actions. Also, some of the lexical tokens may not be used.

Augmentation Augment the grammar rules with markers and add new grammar rules as needed for the intended semantic actions. Justify your augmentation decisions within comments of the rules.

Attributes Design the attributes for every grammar symbol (terminal as well as non-terminal). List the attributes against symbols (with brief justi cation) in comment on the top of your Bison speci cation le. Highlight the inherited attributes, if any.

Symbol Table Use symbol tables for user-de ned (including arrays and pointers) vari-ables, temporary variables and functions.




Name
Type

Initial

Size
O set

Nested













Value











Table































...

...


...



...

...


...




























For example, for



















float d = 2.3;



















int i, w[10];






















int a = 4, *p, b;



















void func(int i, float d);













char c;

























the Symbol Tables will look like:













ST(global)






This is the Symbol Table for global symbols





















Name



Type



Initial


Size
O set

Nested











Value










Table






















d

oat



2.3



8


0

null























i

int






null


4


8

null



















w

array(10, int)

null


40


12

null
























a

int






4




4


52

null





















p

ptr(int)



null


4


56

null























b

int






null


4


60

null





















func

function



null


0


64

ptr-to-ST(func)























c

char






null


1


64

null















ST(func)
This is the Symbol Table for function func
















Name

Type

Initial
Size

O set




Nested









Value











Table



















i



int

null

4



0

null



















d



oat

null

8



4

null

















retVal

void

null

0



12

null





























The Symbol Tables may support the following methods:

lookup(...)
A method to lookup an id (given its name or lexeme)

in the Symbol Table.  If the id exists, the entry is

returned, otherwise a new entry is created.


gentemp(...)
A static method to generate a new temporary, insert

it to the Symbol Table, and return a pointer to the

entry.


update(...)
A method to update di erent  elds of an existing

entry.


print(...)
A method to print the Symbol Table in a suitable

format.



4
Note:

    • The elds and the methods are indicative. You may change their name, functionality and also add other elds and / or methods that you may need.

    • It should be easy to extend the Symbol Table as further features are supported and more functionality is added.

    • The global symbol table is unique.

    • Every function will have a symbol table of its parameters and au-tomatic variables. This symbol table will be nested in the global symbol table.

    • Symbol de nitions within blocks are naturally carried in separate symbol tables. Each such table will be nested in the symbol table of the enclosing scope. This will give rise to an implicit stack of symbol tables (global one being the bottom-most) the while symbols are processed during translation. The search for a symbol starts from the top-most (current) table and goes down the stack up to the global table.

Quad Array The array to store the 3-address quad’s. Index of a quad in the array is the address of the 3-address code. The quad array will have the following elds (having usual meanings)

op
arg 1
arg 2
result




...
...
...
...





Note:

    • arg 1 and / or arg 2 may be a variable (address) or a constant.

    • result is variable (address) only.

    • arg 2 may be null.

For example, if

int i = 10, a[10], v = 5;

...

do i = i - 1; while (a[i] < v);

translates to

    100: t1 = i - 1

    101: i = t1

    102: t2 = i * 4

    103: t3 = a[t2]

    104: if t3 < v goto 100

the quad’s are represented as:

Index
op
arg 1
arg 2
result





...
...
...
...
...





100
{
i
1
t1





101
=
t1

i





102
*
i
4
t2





103
=[]
a
t2
t3





104
<
t3
v
100






The Quad Array may support the following methods:

emit(...)
An overloaded static method to add a (newly gen-

erated) quad of the form: result = arg1 op arg2

where op usually is a binary operator.  If arg2 is

missing, op is unary. If op also is missing, this is a

copy instruction.


print(...)
A method to print the quad array in a suitable for-

mat.



5
For example the above state of the array may be printed (with the symbol information) as:

void main()

{

int i = 10;

int a[10];

int v = 5;

int t1;

int t2;

int t3;

L100: t1 = i - 1;

L101: i = t1;

L102: t2 = i * 4;

L103: t3 = a[t2];

L104: if (t3 < v) goto L100;

}

Note:

    • The elds and the methods are indicative. You may change their name, functionality and also add other elds and / or methods that you may need.

Global Functions Following (or similar) global functions and more may be needed to imple-ment the semantic actions:


makelist(i)

A global function to create a new list containing only i, an index into the array of quad’s, and to return a pointer to the newly created list.

merge(p1, p2)

A global function to concatenate two lists pointed to by p1 and p2 and to return a pointer to the concatenated list.

backpatch(p, i)

A global function to insert i as the target label for each of the quad’s on the list pointed to by p.

typecheck(E1, E2)














A
global
function
to
check
if  E1  &

E2
have
same
types
(that  is,
if  <type

of

E1> = <type

of

E2>).
If
not,
then
to
check
if  they  have  compatible  types
(that
is,
one
can
be
converted

to
the
other),
to
use
an
appropri-

ate conversion function conv<type of E1>2<type of E2>(E) or conv<type of E2>2<type of E1>(E) and to make the necessary changes in the Symbol Table entries. If not, that is, they are of incompatible types, to throw an exception during translation.

conv<type1>2<type2>(E)

A global function to converta an expression E from its current type type1 to target type type2, to adjust the attributes of E accordingly, and nally to generate additional codes, if needed.


    • It is assumed that this function is called from typecheck(E1, E2) and hence the conversion is possible.

Naturally, these are indicative and should be adopted as needed. For every function used clearly explain the input, the output, the algorithm, and the purpose with possible use at the top of the function.










6
    • The Assignment

        1. Write a 3-Address Code translator based on the Flex and Bison speci ca-tions of tinyC. Assume that the input tinyC le is lexically, syntactically, and semantically correct. Hence no error handling and / or recovery is expected.

        2. Prepare a Make le to compile and test the project.

        3. Prepare test input les ass5 roll test<number>.c to test the semantic ac-tions and generate the translation output in ass5 roll quads<number>.out.

        4. Name your  les as follows:


File

Naming




Flex Speci cation
ass5 roll.l



Bison Speci cation
ass5
roll.y



Data Structures (Class De nitions) &
ass5
roll translator.h
Global Function Prototypes





Data Structures, Function Implementa-
ass5
roll translator.cxx
tions & Translator main()





Test Inputs
ass5
roll test<number>.c


Test Outputs
ass5 roll quads<number>.out





    5. Prepare a tar-archive with the name ass5 roll.tar containing all the les and upload to Moodle.


6
Credits

Design of Grammar Augmentations:
5

Explain the augmentations in the production rules in Bison

Design of Attributes:
5

Explain the attributes in the respective %token and %type in Bison

Design and Implementation of Symbol Table &

Supporting Data Structures:
10

Explain with class de nition of ST & other Data Structures

Design and Implementation of Quad Array:
5

Explain with class de nition of QA

Design and Implementation of Global Functions:
10

Explain i/p, o/p, algorithm & purpose for every function

Design and Implementation of Semantic Actions:


Explain with every action in Bison


Expression Phase:
15

Correct handling of operators, type checking & conversions


Declaration Phase:
10

Handling of variable declarations, function de nitions in ST


Statement Phase:
15

Correct handling of statements


External De nition Phase:
5

Correct handling of function de nitions

Design of Test  les and correctness of outputs:
20

Test at least 5 i/p  les covering all rules


Shortcoming and / or bugs, if any, should be highlighted










7

More products