• Preamble – tinyC
This assignment follows the lexical specification of the C language from the International Standard ISO/IEC 9899:1999 (E). To keep the assignment within the required scope, we have chosen a subset of the specification, given below. We shall refer to this language as tinyC and, in a later assignment, specify its grammar from the Phrase Structure Grammar given in the C Standard.
The lexical specification quoted here is written using a precise yet compact notation typically used for writing language specifications. We first outline the notation and then present the Lexical Grammar that we shall work with.
• Notation
In the syntax notation used here, syntactic categories (non-terminals) are indicated by italic type, and literal words and character set members (terminals) by bold type. A colon (:) following a non-terminal introduces its definition. Alternative definitions are listed on separate lines, except when prefaced by the words "one of". An optional symbol is indicated by the subscript opt (written here as the suffix _opt), so that the following indicates an optional expression enclosed in braces.
    { expression_opt }
• Lexical Grammar of tinyC
1. Lexical Elements
token:
    keyword
    identifier
    constant
    string-literal
    punctuator
2. Keywords
keyword: one of
    auto      enum      restrict  unsigned
    break     extern    return    void
    case      float     short     volatile
    char      for       signed    while
    const     goto      sizeof    _Bool
    continue  if        static    _Complex
    default   inline    struct    _Imaginary
    do        int       switch
    double    long      typedef
    else      register  union
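Because every keyword also matches the identifier pattern, a lexer has to tell the two apart. One common approach, sketched here in plain C, is to match the lexeme as an identifier first and then look it up in a keyword table. The function name `is_tinyc_keyword` and the table name are hypothetical, and the three underscore keywords use their C99 spellings (`_Bool`, `_Complex`, `_Imaginary`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 37 tinyC keywords from the "keyword: one of" list.
   Table and function names are illustrative, not part of the handout. */
static const char *const tinyc_keywords[] = {
    "auto", "enum", "restrict", "unsigned", "break", "extern", "return",
    "void", "case", "float", "short", "volatile", "char", "for", "signed",
    "while", "const", "goto", "sizeof", "_Bool", "continue", "if", "static",
    "_Complex", "default", "inline", "struct", "_Imaginary", "do", "int",
    "switch", "double", "long", "typedef", "else", "register", "union"
};

/* Returns 1 if lexeme is exactly one of the tinyC keywords, 0 otherwise.
   The comparison is case-sensitive, as in C. */
int is_tinyc_keyword(const char *lexeme)
{
    assert(lexeme != NULL);
    size_t n = sizeof tinyc_keywords / sizeof tinyc_keywords[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(lexeme, tinyc_keywords[i]) == 0)
            return 1;
    }
    return 0;
}
```

In a flex specification the same effect is usually obtained by listing the keyword rules before the identifier rule, since flex prefers the earlier rule when two patterns match the same text.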
3. Identifiers
identifier:
    identifier-nondigit
    identifier identifier-nondigit
    identifier digit
identifier-nondigit: one of
    _ a b c d e f g h i j k l m
    n o p q r s t u v w x y z
    A B C D E F G H I J K L M
    N O P Q R S T U V W X Y Z
digit: one of
    0 1 2 3 4 5 6 7 8 9
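The left-recursive identifier rule describes one identifier-nondigit followed by any mix of identifier-nondigits and digits, which corresponds to the pattern `[_a-zA-Z][_a-zA-Z0-9]*`. A minimal C sketch of that recognition follows. It assumes the underscore counts as an identifier-nondigit, as in ISO C99, and the name `match_identifier` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* identifier-nondigit: underscore (assumed, as in C99) or an ASCII letter. */
static int is_identifier_nondigit(char c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* digit: one of 0..9. */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Returns the length of the longest identifier at the start of s,
   or 0 if s does not start with an identifier. */
size_t match_identifier(const char *s)
{
    assert(s != NULL);
    if (!is_identifier_nondigit(s[0]))
        return 0;                      /* must start with a nondigit */
    size_t i = 1;
    while (is_identifier_nondigit(s[i]) || is_digit(s[i]))
        i++;                           /* then nondigits or digits */
    return i;
}
```

For example, `match_identifier("x1 = 2")` returns 2, while `match_identifier("9lives")` returns 0 because an identifier cannot begin with a digit.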
4. Constants
constant:
    integer-constant
    floating-constant
    enumeration-constant
    character-constant
integer-constant:
    nonzero-digit
    integer-constant digit
nonzero-digit: one of
    1 2 3 4 5 6 7 8 9
floating-constant:
    fractional-constant exponent-part_opt
    digit-sequence exponent-part
fractional-constant:
    digit-sequence_opt . digit-sequence
    digit-sequence .
exponent-part:
    e sign_opt digit-sequence
    E sign_opt digit-sequence
sign: one of
    + -
digit-sequence:
    digit
    digit-sequence digit
enumeration-constant:
    identifier
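The numeric rules can be checked by hand with a small recogniser. The sketch below follows the integer-constant and floating-constant productions directly. All function names are hypothetical. Note that the grammar as quoted derives an integer-constant only from a nonzero-digit, so a lone `0` is not matched as an integer constant by this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the (possibly empty) run of digits at the start of s. */
static size_t count_digits(const char *s)
{
    size_t i = 0;
    while (s[i] >= '0' && s[i] <= '9')
        i++;
    return i;
}

/* exponent-part: e or E, an optional sign, then a digit-sequence.
   Returns its length, or 0 if s does not start with one. */
static size_t match_exponent_part(const char *s)
{
    if (s[0] != 'e' && s[0] != 'E')
        return 0;
    size_t i = 1;
    if (s[i] == '+' || s[i] == '-')
        i++;
    size_t d = count_digits(s + i);
    return d ? i + d : 0;
}

/* integer-constant: a nonzero-digit followed by any number of digits. */
size_t match_integer_constant(const char *s)
{
    assert(s != NULL);
    if (s[0] < '1' || s[0] > '9')
        return 0;
    return count_digits(s);
}

/* floating-constant: a fractional-constant with an optional exponent-part,
   or a digit-sequence with a mandatory exponent-part. */
size_t match_floating_constant(const char *s)
{
    assert(s != NULL);
    size_t whole = count_digits(s);
    size_t i = whole;
    if (s[i] == '.') {
        size_t frac = count_digits(s + i + 1);
        if (whole == 0 && frac == 0)
            return 0;                  /* a lone "." is a punctuator */
        i += 1 + frac;
        return i + match_exponent_part(s + i);
    }
    if (whole == 0)
        return 0;
    size_t e = match_exponent_part(s + i);
    return e ? i + e : 0;              /* "42" alone is not floating */
}
```

A lexer that tries both functions at the same position should keep the longer match, so that `1e10` is reported as one floating constant rather than the integer `1` followed by an identifier.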
character-constant:
    ' c-char-sequence '
c-char-sequence:
    c-char
    c-char-sequence c-char
c-char:
    any member of the source character set except the single-quote ', backslash \, or new-line character
    escape-sequence
escape-sequence: one of
    \' \" \? \\
    \a \b \f \n \r \t \v
5. String literals
string-literal:
    " s-char-sequence_opt "
s-char-sequence:
    s-char
    s-char-sequence s-char
s-char:
    any member of the source character set except the double-quote ", backslash \, or new-line character
    escape-sequence
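Character constants and string literals share the same shape: an opening quote, a run of ordinary characters or escape sequences, and a matching closing quote on the same line. The difference is that the string body is optional while the character constant needs at least one c-char. The C sketch below captures both with one helper. The function names are hypothetical, and the escape set assumes the eleven simple escapes of C99 (`\'`, `\"`, `\?`, `\\`, `\a`, `\b`, `\f`, `\n`, `\r`, `\t`, `\v`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* escape-sequence: a backslash followed by one of ' " ? \ a b f n r t v.
   Returns 2 on a match, 0 otherwise. */
static size_t match_escape(const char *s)
{
    if (s[0] != '\\' || s[1] == '\0')
        return 0;
    return strchr("'\"?\\abfnrtv", s[1]) != NULL ? 2 : 0;
}

/* Matches quote, body, quote. The body may not contain the quote itself,
   a bare backslash, or a new-line. Returns the total length or 0. */
static size_t match_quoted(const char *s, char quote, int allow_empty)
{
    if (s[0] != quote)
        return 0;
    size_t i = 1;
    while (s[i] != '\0' && s[i] != '\n' && s[i] != quote) {
        if (s[i] == '\\') {
            size_t e = match_escape(s + i);
            if (e == 0)
                return 0;              /* unknown escape such as \q */
            i += e;
        } else {
            i++;
        }
    }
    if (s[i] != quote)
        return 0;                      /* unterminated on this line */
    if (i == 1 && !allow_empty)
        return 0;                      /* '' has no c-char */
    return i + 1;
}

/* character-constant: ' c-char-sequence ' */
size_t match_character_constant(const char *s)
{
    assert(s != NULL);
    return match_quoted(s, '\'', 0);
}

/* string-literal: " s-char-sequence_opt " */
size_t match_string_literal(const char *s)
{
    assert(s != NULL);
    return match_quoted(s, '"', 1);
}
```

With this sketch the empty string `""` is accepted with length 2, whereas the empty character constant `''` is rejected.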
6. Punctuators
punctuator: one of
    [ ] ( ) { } . ->
    ++ -- & * + - ~ !
    / % << >> < > <= >= == != ^ | && ||
    ? : ; ...
    = *= /= %= += -= <<= >>= &= ^= |=
    , #
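Several punctuators are prefixes of longer ones (`>` of `>>` and `>>=`, `.` of `...`), so a lexer must take the longest one that matches. Flex applies this longest-match rule on its own. The C sketch below makes it explicit by ordering a table from longest to shortest. The names are hypothetical, and the table assumes the list includes `?` and `,` as in the C99 punctuator list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* tinyC punctuators, ordered longest first so that the first hit in a
   linear scan is also the longest match. Names are illustrative only. */
static const char *const punctuators[] = {
    /* three characters */
    "...", "<<=", ">>=",
    /* two characters */
    "->", "++", "--", "<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
    "*=", "/=", "%=", "+=", "-=", "&=", "^=", "|=",
    /* one character */
    "[", "]", "(", ")", "{", "}", ".", "&", "*", "+", "-", "~", "!",
    "/", "%", "<", ">", "^", "|", "?", ":", ";", "=", ",", "#"
};

/* Returns the length of the longest punctuator at the start of s,
   or 0 if s does not start with a punctuator. */
size_t match_punctuator(const char *s)
{
    assert(s != NULL);
    size_t n = sizeof punctuators / sizeof punctuators[0];
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(punctuators[i]);
        if (strncmp(s, punctuators[i], len) == 0)
            return len;
    }
    return 0;
}
```

For example, `match_punctuator(">>= 1")` returns 3, and `match_punctuator(".. ")` returns 1 because two dots do not form a punctuator.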
7. Comments
(a) Multi-line Comment
Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it. Thus, /* ... */ comments do not nest.
(b) Single-line Comment
Except within a character constant, a string literal, or a comment, the characters // introduce a comment that includes all multibyte characters up to, but not including, the next new-line character. The contents of such a comment are examined only to identify multibyte characters and to find the terminating new-line character.
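The two comment rules can be illustrated with a short C sketch. It assumes the caller invokes it only outside character constants and string literals, and it treats an unterminated multi-line comment as no match. The name `match_comment` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of the comment at the start of s, or 0 if s does not
   start with a complete comment. The caller is assumed to be outside any
   character constant or string literal. */
size_t match_comment(const char *s)
{
    assert(s != NULL);
    if (s[0] != '/')
        return 0;
    if (s[1] == '/') {
        /* Single-line: runs up to, but not including, the new-line. */
        size_t i = 2;
        while (s[i] != '\0' && s[i] != '\n')
            i++;
        return i;
    }
    if (s[1] == '*') {
        /* Multi-line: ends at the first closing pair after the opener,
           so an inner opener has no effect and comments do not nest. */
        const char *end = strstr(s + 2, "*/");
        return end ? (size_t)(end - s) + 2 : 0;
    }
    return 0;
}
```

Applied to `/* a /* b */ c */`, the function returns 12: the comment ends at the first `*/`, and the trailing ` c */` is left to be lexed as ordinary tokens.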
• The Assignment
1. Write a flex specification for the language of tinyC using the above lexical grammar. Name your file ass3_roll.l. The file ass3_roll.l should not contain the function main().
2. Write your main() (in a separate file ass3_roll.c) to test your lexer.
3. Prepare a Makefile to compile the specifications and generate the lexer.
4. Prepare a test input file ass3_roll_test.c that will test all the lexical rules that you have coded.
5. Prepare a tar-archive with the name ass3_roll.tar containing all the above files and upload it to Moodle.
• Credits
1. Flex Specifications: 60
2. Main function and Makefile: 20 [15+5]
3. Test file: 20
Sample Input and output
[Figure: ass3_roll.l (flex specification) and ass3_roll.c (main() etc.) are built with the makefile into the lexical analyser (a.out etc.). The lexical analyser reads ass3_roll_test.c (test tinyC program) and outputs a stream of tokens, which may be printed on the screen or written to a file.]