Programming Assignment 04 Solution

----------------------------------------------------------------------

Purpose:




* Demonstrate (elementary) understanding of Regular Expressions

and how to use them in various useful languages.




----------------------------------------------------------------------

Background:




* The input file will have multiple proposed tokens on each

line. There also might be lines with no proposed tokens

and/or just extra whitespace.




* The proposed tokens will be separated by whitespace, which is

to be ignored.




* Your program will consider successive tokens from the input

file and classify them as 'GeePea', 'Shake', 'Orc', or

'does not match'.




* A GeePea is an odd number of 'g' or 'G' letters followed by

one or more exclamation points '!' and question marks '?'

followed by 'PEA' if it was only exclamation points, 'pea' if

it was only question marks, and nothing else if it was a

mixture of both.




(FYI: "an odd number" means 1 or 3 or 5 or 7 or ..., which

can also be stated as 2n+1, where n >= 0.)




* A Shake is an ampersand '&' or a plus '+' or a '/' followed by

an even number of letters 'a' through 'z' followed by an

ampersand '&' or a plus '+' or a '/' BUT IT CANNOT END WITH

THE SAME CHARACTER WITH WHICH IT STARTED. That is, if it

begins with '&', it must end with '+' or '/', and so forth.




(FYI: "an even number" means 0 or 2 or 4 or 6 or ..., which

can also be stated as 2n, where n >= 0.)




* An Orc is a ' '. followed by zero or more letters 'r' through

'w' or 'R' through 'W', followed by an ampersand '&' when the

letters are lowercase, an asterisk '*' when the letters are

uppercase, or an at sign '@' when there are no letters at all.

Mixing lowercase and uppercase letters is not allowed.
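
The whitespace rules at the top of this list can be illustrated with a small C++ sketch. It assumes the input has already been read into a string; splitTokens is a hypothetical helper name and is not part of the provided skeleton, which may already do this work for you.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the skeleton): split text into
// whitespace-separated proposed tokens.
// operator>> skips spaces, tabs, and newlines, so blank lines and
// extra whitespace simply produce no tokens.
std::vector<std::string> splitTokens(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) {
        tokens.push_back(token);
    }
    return tokens;
}
```

With this helper, a line holding only spaces yields an empty vector, and a line with several tokens yields them in order.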




----------------------------------------------------------------------

Examples:




g!PEA -- legal GeePea

gGgGG!!!PEA -- legal GeePea

GGG?pea -- legal GeePea

GGG?!? -- legal GeePea




gG!PEA -- illegal GeePea, not odd number of g/G letters

ggg!Pea -- illegal GeePea, should have been PEA at end

GgGGg?!?pea -- illegal GeePea, should have had no pea at end

gggGGG -- illegal GeePea, need !/? marks (and maybe pea/PEA)
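
One possible way to express the GeePea rule with C++ std::regex is sketched below. The name isGeePea is hypothetical, and the pattern is just one reading of the description: a mixture is taken to mean at least one '!' and at least one '?', in any order. std::regex_match requires the whole token to match, so no anchors are needed.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole token is a GeePea.
//   [gG]([gG]{2})*         odd number of g/G letters (1, 3, 5, ...)
//   !+PEA                  only exclamation points, then PEA
//   \?+pea                 only question marks, then pea
//   [!?]*(!\?|\?!)[!?]*    a mixture: some '!' is adjacent to some '?'
bool isGeePea(const std::string& token) {
    static const std::regex pattern(
        R"([gG]([gG]{2})*(!+PEA|\?+pea|[!?]*(!\?|\?!)[!?]*))");
    return std::regex_match(token, pattern);
}
```

The mixture branch relies on the fact that any run of '!' and '?' containing both characters must have a '!' next to a '?' somewhere in it.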




&abcd/ -- legal Shake

/mnop+ -- legal Shake

+gxhyiz& -- legal Shake




&abcd& -- illegal Shake, begins and ends with same character

/mop+ -- illegal Shake, odd number of letters

+GxHyIz& -- illegal Shake, not all lowercase letters
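
The Shake rule can be sketched in the same style. The name isShake is hypothetical. This version avoids backreferences by writing out one alternative per starting character, each of which allows only the two other characters at the end.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole token is a Shake.
// Each alternative fixes the first character and permits only the
// other two delimiters at the end; ([a-z]{2})* is an even number
// (0, 2, 4, ...) of lowercase letters.
bool isShake(const std::string& token) {
    static const std::regex pattern(
        R"(&([a-z]{2})*[+/]|\+([a-z]{2})*[&/]|/([a-z]{2})*[&+])");
    return std::regex_match(token, pattern);
}
```

Because zero is an even number, a token such as '&/' (no letters at all) also counts as a Shake under this reading.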




rtv& -- legal Orc

WTUSU* -- legal Orc

@ -- legal Orc




rrr& -- illegal Orc, no at front

UVUV@ -- illegal Orc, should be * at end

rStU& -- illegal Orc, can't mix lowercase and uppercase
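
The character that starts an Orc is not legible in this copy of the text, so the sketch below deliberately leaves it out and checks only the part that follows it. The name isOrcTail is hypothetical; a complete Orc pattern would also need the leading character from the original handout placed in front.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: checks only the part of an Orc AFTER its
// leading character (which is not reproduced here).
//   [r-w]+&    one or more lowercase r..w, then '&'
//   [R-W]+\*   one or more uppercase R..W, then '*'
//   @          no letters at all, then '@'
bool isOrcTail(const std::string& rest) {
    static const std::regex pattern(R"([r-w]+&|[R-W]+\*|@)");
    return std::regex_match(rest, pattern);
}
```

Keeping lowercase and uppercase in separate alternatives is what rules out mixed-case tokens.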




----------------------------------------------------------------------

Tasks:




1. Download HMWK_04_dalioba.zip from Blackboard.




2. Unzip the file somewhere convenient.




3. Change 'dalioba' in the name of the directory to your NetID.

(Your NetID is three letters followed by four or five digits.

The directory name will now be something like

'hmwk_04_abc1234', with YOUR NetID instead of 'abc1234'.)




4. Look in that directory.




5. Change the header lines in the skeleton files

hmwk_04.c / .cc :




-- Line 1: Family name first, then a comma, then

personal name.




-- Line 2: Your NetID.




-- Line 3: The date you edited the file.




6. Run the files you just changed with the provided

'inputdata.txt' as the input file.




7. Observe the following output (it will be the same no matter

which language you picked):




processing tokens from inputdata.txt ...

g!PEA< is the proposed token.

gGgGG!!!PEA< is the proposed token.

GGG?pea< is the proposed token.

GGG?!?< is the proposed token.

gG!PEA< is the proposed token.

ggg!Pea< is the proposed token.

GgGGg?!?pea< is the proposed token.

gggGGG< is the proposed token.

&abcd/< is the proposed token.

/mnop+< is the proposed token.

+gxhyiz&< is the proposed token.

&abcd&< is the proposed token.

/mop+< is the proposed token.

+GxHyIz&< is the proposed token.

rtv&< is the proposed token.

WTUSU*< is the proposed token.

@< is the proposed token.

rrr&< is the proposed token.

UVUV@< is the proposed token.

rStU&< is the proposed token.




8. Now, change the contents of the processToken() function in each

of the hmwk_04.c and .cc files to use the regular expression

support of the corresponding language so that the following

output is generated for the 'inputdata.txt' test case file.




processing tokens from inputdata.txt ...

g!PEA< matches GeePea.

gGgGG!!!PEA< matches GeePea.

GGG?pea< matches GeePea.

GGG?!?< matches GeePea.

gG!PEA< does not match.

ggg!Pea< does not match.

GgGGg?!?pea< does not match.

gggGGG< does not match.

&abcd/< matches Shake.

/mnop+< matches Shake.

+gxhyiz&< matches Shake.

&abcd&< does not match.

/mop+< does not match.

+GxHyIz&< does not match.

rtv&< matches Orc.

WTUSU*< matches Orc.

@< matches Orc.

rrr&< does not match.

UVUV@< does not match.

rStU&< does not match.




9. You should get the same output for each of the two languages.

Make your output match this format EXACTLY since when your

solutions are tested, their output will be checked using

diff.
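
As a rough illustration of the output format in step 8, the sketch below builds one result line from a token and a list of named patterns. TokenKind and describeToken are hypothetical names; the real processToken() signature comes from the skeleton files and may differ, and the patterns themselves are supplied by the caller.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pairing of a category name with its pattern.
using TokenKind = std::pair<std::string, std::regex>;

// Return the result line for one token, in the format shown in the
// listing above: "<token>< matches <Name>." or "<token>< does not match."
std::string describeToken(const std::string& token,
                          const std::vector<TokenKind>& kinds) {
    for (const auto& kind : kinds) {
        // regex_match succeeds only if the entire token matches.
        if (std::regex_match(token, kind.second)) {
            return token + "< matches " + kind.first + ".";
        }
    }
    return token + "< does not match.";
}
```

Building the line in one place makes it easier to keep spacing and punctuation identical for every token, which matters when the output is compared with diff.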




----------------------------------------------------------------------

Submission:




Make a zipfile of your 'hmwk_04_abc1234' directory (where

'abc1234' is replaced with YOUR NetID) and submit it on Blackboard

as your results for this assignment. Your submission should be a

zipfile that has exactly one item in it, a directory named

'hmwk_04_abc1234' (where 'abc1234' is YOUR NetID). Inside that

directory should be two source files, hmwk_04.c and hmwk_04.cc.




Your submission will be run on another file of test data.

That file will have 24 possible tokens and your solutions will

score 1/2 point for each token that generates the correct message.




Therefore, the maximum possible score for this homework assignment

is 24 points (12 + 12).




----------------------------------------------------------------------

Hints:

1. Ensure your programs compile and run correctly. Not

compiling or not generating the correct output will cost you

points.




Ensure your output messages match the format shown above when

you change the processToken() function. The output is going

to be checked by a program, so it has to match EXACTLY.




After you write your programs, use diff or fc to compare

your output to the supplied 'outputdata.txt'. It must match

EXACTLY or you will be penalized points.




('EXACTLY' means character-by-character the same. No, e.g.,

differences in spacing, no changes in wording, no changes

in punctuation, no changes in capitalization, and so forth.

Check your output against the 'outputdata.txt' file!)




2. Ensure that you update the three header lines in each of the

source files with YOUR name (family name first, then a comma,

then your personal name), YOUR NetID, and the date you edit

the file.




Not updating the header lines properly will cost you points.




3. DO NOT change anything in the main() routine in the C++ case.

You might want to put some initialization code at the top of

the main() routine in the C case (depending on how you do the

processing) but DO NOT change anything else in that routine.




Your programs will be tested from the command line. If they

do not run correctly when run that way, you will score

ZERO points.




4. You might use some lines of static code aside from changing

the contents of processToken(). (This will depend on how you

decide to do the regular expressions.)




5. Ensure you use the regular expression support of the

language. Programs that do not do all of their matching

using the regular expression support of the corresponding

language will score ZERO points.




6. These programs are not complex. The processToken() routine

in the C reference solution is 12 lines of code. There are

three additional lines of static data and 12 lines of

initialization code at the beginning of the C main function.




For C++, the processToken() routine is 15 lines of code,

including three lines of static declarations.




If you find yourself writing lots more code than this in

either the C or C++ case, you're probably going down the

wrong path.




7. After you write your regular expressions, make up some test

cases of your own to ensure that your REs really match the

descriptions given above. The test cases in 'inputdata.txt'

are useful, but they are NOT comprehensive. Make up some

more of your own.




8. Ensure your submission to Blackboard is packaged EXACTLY as

described above.




-- Your submission should be a ZIP FILE (not a tar, rar, gz,

or any other kind of compressed file).




-- The zip file should be named 'hmwk_04_abc1234.zip' (with

'abc1234' replaced with YOUR NetID).




-- This zip file should have ONE item in it, a directory

named 'hmwk_04_abc1234' (with 'abc1234' replaced with

YOUR NetID).




-- Your source files should be in that directory. The

source files should be named hmwk_04.c / .cc.




Submissions in the wrong format score ZERO points.




9. After you submit your zip file on Blackboard, download it

from Blackboard and check that your submission is in the

proper format, that the programs run and print the correct

output, and that you updated the header lines correctly in

each of the source files.




10. Are you CERTAIN you complied with all of these nit-picking

instructions? Really? Maybe you ought to check just one

more time. :)




----------------------------------------------------------------------
