As soon as we started programming, we found to our surprise that it wasn’t as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from
then on was going to be spent in finding mistakes in my own programs.
• Maurice Wilkes, designer of EDSAC, on programming, 1949
—-/usr/games/fortune
Due by 11:59:59pm, Wednesday November 28th
For this assignment you may work with a partner. Be sure both names appear in the README. You must keep the same partner for both this assignment and Asgn 6.
Program: parseline
Shell command-line parsing:
This assignment is to do the command-line parsing necessary for the Minimally Useful SHell (mush) that will be the subject of Assignment 6.
The Minimally Useful SHell (mush) has nowhere near the functionality of a full-blown shell like /bin/sh or /bin/csh, but it does support both file redirection and pipes. parseline is a subset of the shell that prompts for and reads a single mush command line and parses it into a list of commands showing the inputs, outputs, and arguments of each.
Details
The grammar of mush is fairly simple:
• A command (pipeline stage) consists of a command name followed by its arguments, separated by white space.
• A command’s standard input and standard output can be redirected from or into files via the use of < (standard in) and > (standard out). The filename for the redirection is the single word following the redirection symbol. A missing name will cause an error.
The redirection symbol and filename are not considered part of the command name or argument list and are not included in the count of arguments.
• A pipe (|) connects the standard output of one command to the standard input of the following one. For example, "ls | sort" makes the output of ls the input of sort. A series of commands separated by pipes is a pipeline.
• You can assume that '<', '>', and '|' will appear as words by themselves with space around them. Thus, "ls b<a|more" is not a case one must handle. Additionally, these characters are not valid file names, so "a.out > < a" is an error, not the creation of a file called "<".
Note, however, that redirections will not necessarily appear at the end of the command line. That is "ls > out a b" would be a legitimate command to list the files a and b into the file out.
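The grammar above can be sketched for a single stage as follows. This is only one possible approach, not the required design: struct stage, parse_stage(), is_meta(), and MAX_ARGS are hypothetical names, and the sketch returns a single -1 for every failure, whereas parseline itself has to tell the error cases apart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 10 /* hypothetical limit; the handout asks for at least 10 */

/* Hypothetical record describing one pipeline stage. */
struct stage {
    char *argv[MAX_ARGS + 1]; /* NULL-terminated argument list */
    int argc;                 /* redirections are not counted */
    char *in;                 /* input filename, or NULL if not redirected */
    char *out;                /* output filename, or NULL if not redirected */
};

/* True if the word is one of the three special symbols. */
static int is_meta(const char *w)
{
    return strcmp(w, "<") == 0 || strcmp(w, ">") == 0 || strcmp(w, "|") == 0;
}

/* Split the text of one stage (no '|' in it) into words, in place.
 * Redirections may appear anywhere among the arguments.
 * Returns 0 on success, -1 on a bad redirection or too many arguments. */
int parse_stage(char *text, struct stage *st)
{
    char *w;

    assert(text != NULL && st != NULL);
    st->argc = 0;
    st->in = st->out = NULL;
    for (w = strtok(text, " \t\n"); w != NULL; w = strtok(NULL, " \t\n")) {
        if (strcmp(w, "<") == 0 || strcmp(w, ">") == 0) {
            char **slot = (w[0] == '<') ? &st->in : &st->out;
            char *name = strtok(NULL, " \t\n");
            /* repeated redirect, missing name, or a symbol used as a name */
            if (*slot != NULL || name == NULL || is_meta(name))
                return -1;
            *slot = name;
        } else {
            if (st->argc == MAX_ARGS)
                return -1;
            st->argv[st->argc++] = w;
        }
    }
    st->argv[st->argc] = NULL;
    return 0;
}
```

Because strtok() writes into its argument, the caller would need to keep a separate copy of the stage text if it still has to be printed afterwards.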
In order to make the process easier, you may apply certain limits to the command line structure. These limits must be documented in your README file, and command-lines that are rejected for exceeding limits must be reported as errors.
Command line length: at least 512 bytes
Commands in a pipeline: at least 10
Arguments to a command: at least 10
These limits are NOT mandatory for the project, but if you choose to enforce these limits, then your Assignment 6 must also comply with them.
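If you do enforce a line-length limit, one way to detect an over-long line is to check whether fgets() stopped before reaching a newline. The sketch below assumes a hypothetical helper read_line() and hypothetical return codes (0, -1, -2); it is an illustration, not a required interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512 /* hypothetical limit; the handout asks for at least 512 */

/* Read one line into buf (capacity size) and strip the newline.
 * To accept MAX_LINE bytes, size should be MAX_LINE + 2 (newline + NUL).
 * Returns 0 on success, -1 on end of file, -2 if the line was too long
 * (the rest of the over-long line is discarded). */
int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    assert(fp != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 0;
    }
    /* No newline: either the last line of input or a line that did not fit. */
    c = getc(fp);
    if (c == EOF)
        return 0;
    while (c != '\n' && c != EOF)
        c = getc(fp);
    return -2;
}
```

A caller would typically declare char line[MAX_LINE + 2] and report "command too long" when the result is -2.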
Error Handling
parseline must identify and report malformed commands. This includes:
• malformed redirects. For example, "a.out < " doesn’t specify the name of the file for redirect while "a.out < a < b" has two input redirects.
• ambiguous inputs or outputs. For example, in the pipeline "ls | sort < foo", the input to sort is specified to be two different things.
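The ambiguity rule reduces to a position check: any stage but the first already receives its input from a pipe, and any stage but the last already sends its output to one. A minimal sketch, assuming a hypothetical check_stage() that is given the redirect filenames (or NULL) found for a stage:

```c
#include <assert.h>
#include <stddef.h>

enum ambiguity { AMBIG_NONE, AMBIG_INPUT, AMBIG_OUTPUT };

/* in/out are the redirect filenames found for this stage, or NULL.
 * index is the stage number (first stage is 0); nstages is the total. */
enum ambiguity check_stage(const char *in, const char *out,
                           int index, int nstages)
{
    assert(index >= 0 && index < nstages);
    if (in != NULL && index > 0)
        return AMBIG_INPUT;  /* redirect plus a pipe in */
    if (out != NULL && index < nstages - 1)
        return AMBIG_OUTPUT; /* redirect plus a pipe out */
    return AMBIG_NONE;
}
```

For "ls | sort < foo", stage 1 of 2 has an input redirect, so the check reports ambiguous input.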
Output
The purpose of parsing a command line is to identify the various components of each command so that each can be launched appropriately. On a Unix system, one needs to know where the input will come from, where the output will go, the number of arguments on the command line (not including any redirection commands) and the values of those arguments.
In order for this program to be efficiently graded, it is important to adhere to the output format specified below.
Error Cases When there is an error in one of the commands in the pipeline, parseline prints an error message and exits with a nonzero exit status. If there are multiple errors on a line, it prints the first one it encounters.
Cause                                            Message
command line length limit exceeded               command too long
pipeline has too many elements                   pipeline too deep
an individual command has exceeded the limit     cmd: too many arguments
  on arguments
a pipeline has an empty stage, e.g.,             invalid null command
  “ls | | more”
a command either has more than one input         cmd: bad input redirection
  redirection character ('<'), or the input
  filename is missing
a command either has more than one output        cmd: bad output redirection
  redirection character ('>'), or the output
  filename is missing
a stage has both an input redirect and a         cmd: ambiguous input
  pipe in
a stage has both an output redirect and a        cmd: ambiguous output
  pipe out
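One way to keep the messages consistent is to store them in a single table indexed by an error code. The sketch below is an assumption about structure, not part of the specification: enum parse_err and format_error() are hypothetical names, and the message is written into a buffer so the caller can decide where to print it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical error codes; the first three messages carry no command name. */
enum parse_err {
    ERR_LINE_LONG, ERR_PIPE_DEEP, ERR_NULL_CMD,
    ERR_MANY_ARGS, ERR_BAD_IN, ERR_BAD_OUT, ERR_AMBIG_IN, ERR_AMBIG_OUT
};

/* Message text taken from the table above, in enum order. */
static const char *const messages[] = {
    "command too long", "pipeline too deep", "invalid null command",
    "too many arguments", "bad input redirection", "bad output redirection",
    "ambiguous input", "ambiguous output"
};

/* Format the message for e into buf. cmd is the command name and is used
 * only for the command-specific errors. Returns what snprintf() returns. */
int format_error(char *buf, size_t size, enum parse_err e, const char *cmd)
{
    assert(buf != NULL && size > 0);
    if (e >= ERR_MANY_ARGS) {
        assert(cmd != NULL);
        return snprintf(buf, size, "%s: %s", cmd, messages[e]);
    }
    return snprintf(buf, size, "%s", messages[e]);
}
```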
Valid Cases For syntactically correct pipelines, parseline should print out a description of each stage of the pipeline in the form given below. For the sake of parseline, the first stage of a pipeline will be stage 0. The required form of the output:
1. a header that identifies the stage number and shows the portion of the command line that corresponds to that stage, in quotes. This header should follow a blank line and have the following form:
--------
Stage n: "<command line>"
--------
Example:
--------
Stage 0: "ls a b c"
--------
2. a specification of the input for the stage, which will be one of:
a filename, if redirected
"original standard input"
"pipe from stage n"
3. a specification of the output for the stage, which will be one of:
a filename, if redirected
"original standard output"
"pipe to stage n"
4. the argument count (argc) for this stage. Example:
argc: 4
5. the argument strings for the stage with extra whitespace trimmed off, in quotes, comma-separated.
Example:
argv: "ls", "a", "b", "c"
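The five items above could be emitted by one routine per stage. The following is a sketch under assumptions: print_stage() and ARG_SEP are hypothetical names, the input and output descriptions are passed in as ready-made strings, and the exact spacing should be checked against the reference executable. In particular, the sample runs show a bare comma between arguments while the example in item 5 shows a comma followed by a space, so the separator is kept in one place.

```c
#include <assert.h>
#include <stdio.h>

#define ARG_SEP "," /* assumption: matches the sample runs; verify before use */

/* Print the description of stage n to fp.
 * text is the portion of the command line for this stage;
 * in and out are the already-formatted input/output descriptions. */
void print_stage(FILE *fp, int n, const char *text,
                 const char *in, const char *out,
                 int argc, char *const argv[])
{
    int i;

    assert(fp != NULL && text != NULL && in != NULL && out != NULL);
    fprintf(fp, "\n--------\nStage %d: \"%s\"\n--------\n", n, text);
    fprintf(fp, "input: %s\n", in);
    fprintf(fp, "output: %s\n", out);
    fprintf(fp, "argc: %d\n", argc);
    fprintf(fp, "argv: ");
    for (i = 0; i < argc; i++)
        fprintf(fp, "%s\"%s\"", i > 0 ? ARG_SEP : "", argv[i]);
    fputc('\n', fp);
}
```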
Miscellaneous Thoughts
• You can find out the number of stages in the pipeline by counting the number of times the pipe character (|) appears and adding 1.
• You only need to report the first error you find, so you can abort processing once you find one.
• Be far-sighted about this. Read the specification for the full version of mush before starting so you will know where you want to end up. You want to make design decisions you can live with for the next couple of weeks.
• Remember that although parsing looks easy, it is harder than it looks. There are a lot of edge cases.
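The stage-counting tip in the first bullet amounts to a single pass over the line. A minimal sketch, with count_stages() as a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Number of pipeline stages: one more than the number of '|' characters. */
int count_stages(const char *line)
{
    int n = 1;

    assert(line != NULL);
    for (; *line != '\0'; line++)
        if (*line == '|')
            n++;
    return n;
}
```

Comparing the result against your pipeline limit before doing any further parsing is one way to report "pipeline too deep" early.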
Tricks and Tools
Remember, parsing is always harder than it looks. Be sure to give some serious thought to your approaches and data structures before diving in. There are many library routines and system calls that may help with implementing parseline. Some are listed below.
sscanf(), strchr(), index(),     The string functions, defined in string.h and strings.h, are
strtok(), strpbrk(), etc.        helpful for parsing strings. (sscanf() itself is declared in stdio.h.)
isspace(), etc.                  One of many functions defined in ctype.h for text processing. Very
                                 useful.
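As an illustration of two of these routines, the sketch below uses strchr() to cut a line into stages at each '|' and isspace() to recognize an empty stage. split_stages() and is_blank() are hypothetical names, and the line is modified in place, so work on a copy if the original is still needed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Cut line at each '|' (overwriting it with '\0') and record the start of
 * every stage in stages[]. Surrounding spaces are left in place.
 * Returns the number of stages, or -1 if there are more than max. */
int split_stages(char *line, char *stages[], int max)
{
    int n = 0;
    char *bar;

    assert(line != NULL && stages != NULL && max > 0);
    for (;;) {
        if (n == max)
            return -1;
        stages[n++] = line;
        bar = strchr(line, '|');
        if (bar == NULL)
            return n;
        *bar = '\0';
        line = bar + 1;
    }
}

/* True if s is empty or contains only whitespace (a null command). */
int is_blank(const char *s)
{
    assert(s != NULL);
    while (*s != '\0') {
        /* the cast avoids undefined behavior for negative char values */
        if (!isspace((unsigned char)*s))
            return 0;
        s++;
    }
    return 1;
}
```

Applied to "ls < one | more | sort", this yields the stage texts "ls < one ", " more ", and " sort".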
What to Turn In
Submit via handin to the asgn5 directory of the ngonella account:
• your well-documented source files
• A makefile (called Makefile) that will build your program with the command make parseline
– For this makefile you are NOT required to have a make test, though you will be required to have one for Assignment 6
• A README file that contains
– Your name(s). In addition to your names, please include your Cal Poly login names with it, in parentheses. E.g. (ngonella)
– Any special instructions for running your program.
– Any other thing you want me to know while I am grading it.
The README file should be plain text, and should be named “README”, all capitals with no extension.
Sample Runs
Below are some sample runs of parseline. I will also place an executable version on Unix Servers in ~ngonella/public/csc-357/asgn5/parseline
% parseline
line: ls
--------
Stage 0: "ls"
--------
input: original stdin
output: original stdout
argc: 1
argv: "ls"
% parseline
line: ls < one > two three four
--------
Stage 0: "ls < one > two three four"
--------
input: one
output: two
argc: 3
argv: "ls","three","four"
% parseline
line: ls < one | more | sort
--------
Stage 0: "ls < one "
--------
input: one
output: pipe to stage 1
argc: 1
argv: "ls"
--------
Stage 1: " more "
--------
input: pipe from stage 0
output: pipe to stage 2
argc: 1
argv: "more"
--------
Stage 2: " sort"
--------
input: pipe from stage 1
output: original stdout
argc: 1
argv: "sort"
% parseline
line: ls | | more
invalid null command
% parseline
line: ls < a < b | more
ls: bad input redirection
% parseline
line: ls < a | more < file
more: ambiguous input
% parseline
line: ls < a < b > c > d
ls: bad input redirection
%