ASSIGNMENT 1 Solution




 
1 Assignment Overview




This assignment involves writing two command line applications to process streams of text. Both programs will read lines of text from standard input, and print a subset of the input to standard output.




The overall goal of this assignment is to introduce C programming in a unix setting, with particular emphasis on C array and string processing. Your eventual submission will consist of the source files for both programs, accompanied by test cases. Sections 1.2 and 1.3 below contain specifications for each of the two programs and Section 2 describes the testing component. The evaluation scheme is given in Section 3.




Your code is expected to compile without warnings in the course lab (ECS 242) using the -Wall and -std=c99 flags with the gcc compiler. Although you are permitted to use dynamic memory allocation (the malloc and free functions) in your code, you are strongly encouraged to avoid doing so, both because dynamic memory issues can be difficult to debug and because it is useful to learn how to rely exclusively on automatic allocation.







1.1 Background: Stream Searching with grep




The classic unix grep command is a powerful tool for searching streams of text. By default, grep reads lines of text from standard input and outputs only those lines which match some provided pattern (which is usually given as a command line argument). For this assignment, you will implement a stream searching program which duplicates some of the basic functionality of grep (the grep program has a number of other powerful features that we will study later in the course). Consider the text file macbeth.txt below (which contains a soliloquy from Macbeth).




Tomorrow, and tomorrow, and tomorrow,
Creeps in this petty pace from day to day,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
The way to dusty death. Out, out, brief candle!
Life's but a walking shadow, a poor player
That struts and frets his hour upon the stage
And then is heard no more. It is a tale
Told by an idiot, full of sound and fury
Signifying nothing.













Recall that in a unix shell, the < operator is used to redirect a text file into the standard input stream of a program. The command




$ grep day < macbeth.txt




invokes the grep command to search for all occurrences of the word 'day' in the file above. The output will be all lines which contain the search term:




Creeps in this petty pace from day to day,
And all our yesterdays have lighted fools




To search using multiple search terms, a list of words can be stored in a text file (with one word per line). Consider the file macbeth_words.txt below.




tomorrow
yesterday
of




The command




$ grep -F -f macbeth_words.txt < macbeth.txt




will search the input stream for occurrences of any words in macbeth_words.txt and print all lines which contain one or more of the words. For the files above, the output will be




Tomorrow, and tomorrow, and tomorrow,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
Told by an idiot, full of sound and fury




1.2 Stream Searching Program




The first part of the assignment is to write a C program, contained in a source file called stream_search.c, which duplicates the functionality of the grep -F -f invocation described in the previous section.




The program must compile, with no warnings, in ECS 242 using the following compile command.

$ gcc -Wall -std=c99 -o stream_search stream_search.c




After compiling, a correct implementation will take the name of a word list file as a command line argument and read lines of text from standard input, outputting all lines which contain one or more of the words in the word list file. For example, using the files macbeth.txt and macbeth_words.txt in the previous section, the command




$ ./stream_search macbeth_words.txt < macbeth.txt

will produce the output

Tomorrow, and tomorrow, and tomorrow,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
Told by an idiot, full of sound and fury




The program may assume that the input data is subject to the following restrictions.




Each line of the word list file contains exactly one word, which may contain only alphabetical characters (uppercase or lowercase). However, the line may contain spaces before or after the word (in addition to the newline character at the end). These spaces must be ignored.







Each line of the word list file contains at most 100 characters, including the newline character at the end of the line. Note that the last line in the file may not contain a newline character at the end.




The lines of text read from standard input may contain letters, numbers, spaces (including tabs), punctuation or symbols.




Each line of input read from standard input contains at most 1000 characters, including the newline character at the end of the line. Note that the last line of input may not contain a newline character at the end.




Blank lines in the word list file must be completely ignored (that is, the program should not treat a blank line as a zero-length word).




There is no limit to the number of words in the word list.
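One way to respect these restrictions without dynamic allocation is to re-read the word list for every input line instead of storing the words. The sketch below is only an illustration under that assumption: the names trim, line_matches and MAX_WORD_LINE are hypothetical, and the matching is case-sensitive substring matching, as in the grep -F examples of Section 1.1.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD_LINE 100   /* longest word list line, newline included */

/* Trims leading and trailing whitespace from s in place. Returns a pointer
   to the first non-space character (possibly the terminating '\0'). */
char *trim(char *s)
{
    while (isspace((unsigned char)*s)) {
        s++;
    }
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1])) {
        s[--len] = '\0';
    }
    return s;
}

/* Returns 1 if line contains any word from the word list stream, else 0.
   The stream is rewound on every call, so the words are never stored and
   the number of words is unbounded. */
int line_matches(FILE *word_list, const char *line)
{
    char buf[MAX_WORD_LINE + 1];

    rewind(word_list);
    while (fgets(buf, sizeof buf, word_list) != NULL) {
        char *word = trim(buf);
        if (*word == '\0') {
            continue;   /* blank line: not treated as a zero-length word */
        }
        if (strstr(line, word) != NULL) {
            return 1;
        }
    }
    return 0;
}
```

Re-reading the file trades speed for simplicity: every input line costs one pass over the word list, but only fixed-size automatic buffers are needed.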




The program is not expected to detect or handle violations of the conditions above. However, the program must correctly handle the following error conditions.




If the program is invoked with no command line arguments, or with more than one argument, the program will print the text "Usage: ./stream_search <word list file>" to the stderr stream and exit immediately before reading any input.




If the program is unable to open the word list file for any reason, it will print the text "Error: Unable to open word list" to the stderr stream and exit immediately before reading any lines from standard input.
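The two error conditions above could be checked in a small helper before any input is read. This is a sketch, not a required structure: the function name open_word_list is hypothetical, and the closing '>' in the usage text and the trailing newline on each message are assumptions, since the text above does not settle them.

```c
#include <stdio.h>

/* Checks the argument count and opens the word list for reading.
   Returns the open stream, or NULL after printing the relevant message
   to stderr. */
FILE *open_word_list(int argc, char *argv[])
{
    if (argc != 2) {
        /* argv[0] plus exactly one argument is the only valid invocation */
        fprintf(stderr, "Usage: ./stream_search <word list file>\n");
        return NULL;
    }

    FILE *word_list = fopen(argv[1], "r");
    if (word_list == NULL) {
        fprintf(stderr, "Error: Unable to open word list\n");
    }
    return word_list;
}
```

A main function would call this first and return immediately when the result is NULL, so that nothing is read from standard input in either error case. The exit status to use is not specified in this handout.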




You may find the strstr function in the C standard library useful for searching for a given word within a line of text (but you are not required to use strstr in your implementation).
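As a brief illustration of strstr: it returns a pointer to the first occurrence of the second string inside the first, or NULL if there is none, and the comparison is case-sensitive. The helper name first_index below is hypothetical and is only used to make the result easy to inspect.

```c
#include <string.h>

/* Returns the index of the first occurrence of word in line,
   or -1 if word does not occur in line. */
long first_index(const char *line, const char *word)
{
    const char *hit = strstr(line, word);
    return hit == NULL ? -1 : (long)(hit - line);
}
```

For example, first_index("yesterdays", "day") is 6, while first_index("Tomorrow", "tomorrow") is -1 because of the differing first letter.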







1.3 Stream Splicing Program




The second part of the assignment is to write a C program, contained in a file stream_splice.c, which takes a single word as a command line argument and reads lines of text from standard input, then removes all occurrences of the word from each line before printing the modified line to standard output. The program must compile, with no warnings, in ECS 242 using the following compile command.




$ gcc -Wall -std=c99 -o stream_splice stream_splice.c




The program may assume that each line read from standard input will contain at most 1000 characters, including the newline character at the end of the line. Note that the last line of input may not contain a newline character at the end.
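Given the 1000-character limit, each line fits in a fixed-size automatic buffer read with fgets. The sketch below shows one possible way to handle a final line that lacks a newline; the names read_line and MAX_LINE are hypothetical, and stripping the newline before processing is a design choice rather than a requirement.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1000   /* longest input line, newline included */

/* Reads one line from in into buf, which must hold MAX_LINE + 1 bytes.
   Returns 0 at end of input and 1 otherwise. The trailing newline, if
   any, is removed from buf and reported through *had_newline. */
int read_line(FILE *in, char *buf, int *had_newline)
{
    if (fgets(buf, MAX_LINE + 1, in) == NULL) {
        return 0;
    }
    size_t len = strcspn(buf, "\n");   /* index of '\n' or of the final '\0' */
    *had_newline = (buf[len] == '\n');
    buf[len] = '\0';
    return 1;
}
```

A caller would declare char buf[MAX_LINE + 1] on the stack and loop while read_line(stdin, buf, &had_newline) returns 1.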




The program will delete all occurrences of a search word w (given as the first command line argument) from each line read from standard input, then print the result to standard output. If the program is invoked with no command line arguments, or with more than one argument, the program will print the text "Usage: ./stream_splice <search word>" to the stderr stream and exit immediately before reading any input.




Using the text in the macbeth.txt file in Section 1.1 as input, the command




$ ./stream_splice day < macbeth.txt




will produce the following output (in which every occurrence of 'day' has been deleted).













Tomorrow, and tomorrow, and tomorrow,
Creeps in this petty pace from  to ,
To the last syllable of recorded time;
And all our yesters have lighted fools
The way to dusty death. Out, out, brief candle!
Life's but a walking shadow, a poor player
That struts and frets his hour upon the stage
And then is heard no more. It is a tale
Told by an idiot, full of sound and fury
Signifying nothing.




Note that the program is expected to continue removing occurrences of the search word until the output line does not contain the search word. Consequently, if the search word occurs multiple times, or if removing one instance of the search word creates another instance, the program should repeat the removal process until no occurrences of the search word remain.




For example, consider the input file fin.txt, containing the text below.




find
fin
fine
financial
coffining
finite




Running the command




$ ./stream_splice fin < fin.txt




to delete all occurrences of the word 'fin' from each line of the file will produce the following output:




d

e
ancial
cog
ite




A suggested algorithm for this task is given by the pseudocode below.




 
w ← Search word
for each line L of input text do
    while L contains an occurrence of w do
        Delete the first occurrence of w from L
    end while
    Print L to standard output.
end for
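The inner loop of the pseudocode above can be sketched in C with strstr and memmove. This is one possible rendering, not the only acceptable one: the function name splice_word is hypothetical, and treating an empty search word as "remove nothing" is an assumption made to avoid an endless loop.

```c
#include <string.h>

/* Deletes occurrences of word from line in place. The search restarts
   from the beginning of line after every deletion, so occurrences created
   by an earlier deletion are removed as well. */
void splice_word(char *line, const char *word)
{
    size_t wlen = strlen(word);
    if (wlen == 0) {
        return;   /* assumption: an empty search word removes nothing */
    }

    char *hit;
    while ((hit = strstr(line, word)) != NULL) {
        /* Shift the tail, including the terminating '\0', over the match.
           memmove is used because source and destination overlap. */
        memmove(hit, hit + wlen, strlen(hit + wlen) + 1);
    }
}
```

With the search word fin, the line coffining first becomes cofing, which itself contains fin, and then cog, matching the sample output above. The loop always terminates because every deletion shortens the line.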

























 
2 Test Inputs




You should test both of your programs with a variety of test inputs, covering as many different use cases as possible. You should ensure that your programs handle error cases (such as files which do not exist) appropriately and do not produce errors on valid inputs. Since thorough testing is an integral part of the software engineering process, you will be expected to submit one test input for each program with your code. The set of all submitted test inputs will be anonymously published to all students in the course. In addition, the output of every submission on each test input will be published. Please ensure that your test files do not contain any identifying information.




For the stream_search program, submit a file search_wordlist.txt containing a word list and a file search_text.txt containing input text. Your submitted test case is expected to be a valid input, and therefore must obey all of the constraints on input given in Section 1.2. You will not receive any marks for your test case if it violates any of the input constraints.




For the stream_splice program, submit a file splice_searchword.txt containing a single search word (which will be used as a command line argument during testing) and a file splice_text.txt containing input text. Your submitted test case is expected to be a valid input, and therefore must obey all of the constraints on input given in Section 1.3. You will not receive any marks for your test case if it violates any of the input constraints.




Due to file size constraints for electronic submission, your test files may be at most 10 KB in size.










 
3 Evaluation




Submit all of your code electronically through the Assignments tab on conneX. Your submission should consist of six files, which must be named as follows: stream_search.c, stream_splice.c, search_wordlist.txt, search_text.txt, splice_searchword.txt, splice_text.txt.




The assignment will be marked out of 18 marks and is worth 9% of your final grade. To receive full marks on the implementation components, your implementations must function correctly for all valid inputs and produce the specified error messages in all defined error cases. To receive full marks on the test cases, your submitted test cases must be comprehensive. The marks are distributed among the components of the assignment as follows.







Marks   Component

        The stream search program functions correctly on a variety of valid inputs and behaves as specified for all defined error cases.

        The stream splice program functions correctly on a variety of valid inputs and behaves as specified for all defined error cases.

4       Test cases for stream search and stream splice. You will receive 0/4 if your submission does not contain all four of the required files (that is, you must submit test cases for both programs to receive any marks in this section).







Ensure that all code files needed to compile and run your code in ECS 242 are submitted. Only the files that you submit through conneX will be marked. The best way to make sure your submission is correct is to download it from conneX after submitting and test it. You are not permitted to revise your submission after the due date, and late submissions will not be accepted, so you should ensure that you have submitted the correct version of your code before the due date. conneX will allow you to change your submission before the due date if you notice a mistake. After submitting your assignment, conneX will automatically send you a confirmation email. If you do not receive such an email, your submission was not received. If you have problems with the submission process, send an email to the instructor before the due date.

















































































































































































