1 Introduction
Your task in this project will be to take three C programs we give you and write equivalent assembly programs. Remember that information about MIPS is on the Administrative and resources page on ELMS. We demonstrated running the QtSpim graphical simulator in class. If your account is set up right you should be able to run it without having to do anything special or different, but you have to be using an X server to run the graphical simulator.
Read the requirements in Section 3 carefully before coding, and again after finishing the project. Failure to follow these requirements will result in losing significant credit, so be sure to read them carefully. Ask in advance if you have questions.
There are two assembly homeworks: Homework #8 is on basic assembly and is on ELMS now, and Homework #9 will be on functions in assembly and will be on ELMS on Thursday. If you do these homeworks and understand the answers, you will have already gotten practice with every assembly language feature that this project requires. If you come to the TAs’ office hours to ask any questions about writing assembly, you must have already done these homeworks. The TAs will not answer any questions about the project unless you can show them that you have first tried to do the assembly homeworks. (Of course if you need help with the homeworks you can ask in office hours.)
Due to the size of the course it is not feasible for us to provide project information or help via email/ELMS messages, so we will be unable to answer such questions. However, you are welcome to ask any questions verbally during the TAs’ office hours (or during, before, or after discussion section or lecture if time permits). For this reason, you cannot wait until the last minute to write projects in a course with this many students.
1.1 Extra credit
If you make only one submission that passes all the public tests you will get 3 extra–credit bonus points. And if your single submission is made at least 48 hours before the on–time deadline you will get 3 additional extra credit bonus points, for 6 extra credit points total. (Obviously you can’t get the second 3 extra credit points if you don’t get the first 3, but you can get the first 3 without getting these second 3.) (If for some reason your program passes all the public tests on Grace, but doesn’t work when you submit it, so you have to submit more than once– and this is not due to an error on your part or a bug in your code– talk with me in office hours about receiving extra credit despite having more than one submission.) You will again lose credit for having too many submissions.
2 Project description, and programs to be written
The project tarfile contains three C programs prog1.c, prog2.c, and prog3.c that you must translate to MIPS assembly programs. Extract the files from the tarfile using commands similar to those from before:
cd ~/216
tar -zxvf ~/216public/project10/project10.tgz
For each C program you must write a MIPS assembly language program that works the same way and does the same things, so it will produce the same output as the C program if given the same input (with one exception discussed in Section 2.4). Your MIPS assembly program files must be named prog1.s, prog2.s, and prog3.s.
Below we briefly explain what the three C programs do, so we can discuss some things about them, but the definitive description of what they do is the programs themselves, available in the project tarfile. Your assembly programs should do things exactly the same way as the C programs do.
Your assembly programs should all terminate via the MIPS system call to terminate execution; abnormal termination should never occur (except in the case of an I/O error, which you are not required to handle).
2.1 Rectangle comparison program (prog1.c)
This program reads four integers from the input and stores them into four global variables l1, w1, l2, and w2. The numbers respectively represent the length and width of one rectangle and the length and width of another rectangle. The program just prints −2 if any of the four dimensions were negative. Otherwise the program calculates the areas of both
© 2023 L. Herman; all rights reserved 1
rectangles and prints 0 if the rectangles have equal area, −1 if the first rectangle’s area is less than that of the second rectangle, and 1 if the first rectangle’s area is greater than that of the second rectangle. The program prints a newline after whichever number it prints.
This program will be the easiest of the three to write because it does not use any functions other than main(), and main() uses only global variables.
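As a rough illustration of the decision logic described above, here is a minimal C sketch. It is not the code in prog1.c: the helper name rect_compare and its parameters are hypothetical and exist only so the logic can be checked in isolation, whereas the real program does everything in main() using the global variables l1, w1, l2, and w2. The program in the tarfile remains the definitive description.

```c
/* Hypothetical helper (not part of prog1.c): returns the number that the
   program described above would print for the given dimensions. */
int rect_compare(int l1, int w1, int l2, int w2) {
    int area1, area2, result;

    if (l1 < 0 || w1 < 0 || l2 < 0 || w2 < 0)
        result = -2;                /* any negative dimension is an error */
    else {
        area1 = l1 * w1;            /* area of the first rectangle */
        area2 = l2 * w2;            /* area of the second rectangle */
        if (area1 < area2)
            result = -1;
        else if (area1 > area2)
            result = 1;
        else
            result = 0;             /* equal areas */
    }
    return result;
}
```

For instance, rect_compare(2, 3, 3, 2) yields 0 because both areas are 6, and rect_compare(-1, 2, 3, 4) yields -2.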
2.2 Digit counting program (prog2.c)
This program determines the number of digits in any number in any positive base. It reads one integer into a global variable number and another one into a global variable base. It then passes these two numbers into the parameters n and base of a function num_digits(), which computes and returns how many digits its parameter n would have in the base base, using repeated division. When the function returns, its return value is stored in another global variable result in main, which is then printed, followed by a newline.
Here is how the function handles various special cases:
• A base of zero doesn’t make any mathematical sense, so if its second parameter is zero, the function returns −1 to indicate that this is an error case.
• Mathematically it is actually possible to have a negative base. In fact, a few early mainframe computers stored numbers using a negative base system, because this allows positive and negative numbers to be stored using the same representation (unlike modern computers, which use a different representation, two’s complement, for negative numbers). We do not explain this more fully here because, since modern computers do not use negative bases, we are uninterested in the number of digits of numbers in negative bases; the function therefore also considers this to be an error case and returns −1 in such cases as well.
• Zero has one digit, which is 0. Zero has to be handled as a special case though, because without it the function would just return 0 for zero, meaning no digits, but zero has one digit (except as noted below).
• Negative numbers have a number of digits also. The preceding negative sign is not a digit. For example, −543987 has six digits. The function handles this as a special case using this if statement and the unary prefix negation operator:
if (n < 0)
n= -n;
You may look at the function and think that this case is unnecessary and that the function will work right without it. In fact, if you were to comment it out the function would work correctly– except it wouldn’t return the right result in one specific case. So this if statement is necessary.
• Unary notation has only one digit. It doesn’t matter what is used for that digit– logically, using the symbol 0 for the digit would probably be most consistent with other bases– but conventionally 1 is used. In unary, the magnitude of a number is just the number of 1s in its representation. For example, 3 (in base 10) is 111 in unary, and 12 (in base 10) is 111111111111 in unary. Unary numbers can also be negative, with a preceding negative sign.
The number of digits of a number in unary can’t be calculated using repeated division, because that would mean dividing by 1 at each iteration, so the number would not change. So the number of digits in a unary number has to be handled as a special case; it is just the magnitude (absolute value) of the number itself.
Note that any number has at least one digit in any positive base, except for zero in unary, which as mentioned just has zero digits.
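The special cases listed above can be pulled together in a short sketch. This is only an illustration under stated assumptions, not the code in prog2.c: the name count_digits_iter is hypothetical, and the order of the checks (zero tested before unary) is an assumption chosen to match the behavior Section 2.3 reports for prog2.c on zero in unary. Your assembly must follow the actual C program, not this sketch.

```c
/* Hypothetical sketch; the real function is num_digits() in prog2.c. */
int count_digits_iter(int n, int base) {
    int ans = 0;

    if (base <= 0)
        ans = -1;                   /* zero or negative base: error case */
    else {
        if (n < 0)
            n = -n;                 /* the negative sign is not a digit */
        if (n == 0)
            ans = 1;                /* zero has the single digit 0 */
        else if (base == 1)
            ans = n;                /* unary: magnitude of the number */
        else
            while (n > 0) {         /* repeated division by the base */
                n = n / base;
                ans++;
            }
    }
    return ans;
}
```

With this ordering, count_digits_iter(-543987, 10) is 6 and count_digits_iter(12, 1) is 12.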
This program (and the next one) will be more work than the first one because they have functions other than main, and the functions have parameters, local variables, and return values.
Carefully read and study the PDF lecture examples of functions in assembly. They illustrate everything you have to know and to use. But you have to understand everything that they are doing, before trying to write assembly code yourself for this program. Study them and if there is anything that you don’t understand completely about what they are doing or why, ask in office hours before starting to write this part of the project.
Here are suggestions for developing the program one part at a time, assuming you are up to speed on the assembly function examples. (Even someone who is familiar with things can still make mistakes coding in assembly, because it’s very easy to make errors in assembly but very hard to find them.)
• First just write the main function to read two numbers (your first program also read numbers, so you should understand how to read integers now) and just print them afterwards (you should also know by now how to print an integer). Run the program and make sure this works.
You don’t really have to print the numbers after they are read– you can instead see whether the numbers are being read into registers correctly in the QtSpim graphical simulator– but the main function has to print something at its end anyway, and it’s not that many instructions to print a number.
• Then just write an empty num_digits function. In other words, try writing a num_digits function that has no local variables or parameters, which doesn’t do anything other than to immediately return (after its stack frame is created), and call it from the main function between reading and printing the two numbers. Run the program and make sure this works. If so, you have some indication that you are creating the stack frame for num_digits correctly and removing it later.
• Then just modify the empty num_digits function to return a hardcoded value, like 216, meaning that it won’t actually compute the number of digits in numbers yet, and it won’t even have parameters yet, it will just always return 216, or whatever other number you like. Also modify the main function to print the value returned by the num_digits function after calling it. Run the program and if this works then it seems you are able to return a value from an assembly function.
• Then modify both the main function and the num_digits function to add two parameters to num_digits, and pass the two numbers read in main into it. The num_digits function should just ignore its parameters and still just return a fixed value (like 216), but if you test this and it works, so your program is able to call a function with parameters and print the return value afterwards (in the main program), you have some confidence that the code for your function is properly manipulating the runtime stack when a function is called and when it exits, even if the function has parameters.
• Then modify the num_digits function to just print both of its parameters, before returning the same hardcoded value. Test this and if it works you have some confidence that you are able to correctly access parameters in an assembly function.
• Then have the function just return one of its parameters, meaning one of the two numbers that were read and passed to it, instead of always returning a fixed number (like 216) as above, and still instead of trying to compute the number of digits of numbers yet. Run the program and make sure this works.
• Lastly of course, add instructions to the num_digits function to actually compute and return the number of digits in its first parameter in the indicated base (second parameter).
2.3 Recursive digit counting program (prog3.c)
This program turns the function in prog2.c into a recursive function. The program’s results will be the same in all cases as the previous one; it just uses recursion instead. This program MUST USE A RECURSIVE FUNCTION.
Zero in unary is just no digits at all, meaning just the empty string. Note that the second program, prog2.c, described above, has anomalous behavior and is mathematically not correct for this case, in that it prints 1 if it is asked to determine the number of digits of zero in unary. (This program, prog3.c, is correct in this regard.) Your assembly programs should agree 100% with what the C programs do, even if the C programs differ for this case. (This should be the only case that they differ for.)
Note that you may not have to repeat most of the steps above in this program– if you already wrote the second program then you already have a num_digits function that is being called correctly from the main function and that computes and returns a value. You just have to change it to call itself, which may be a little more tricky to wrap your head around than an iterative function, but of course we have provided examples of recursive assembly functions.
2.4 Reading input
All three programs read input, and they just read integer inputs. You may assume that legitimate integer values will be input to all programs, and that the numbers will be small enough to fit into an int.
Note that you can’t use the mouse to copy and paste input into the input window in the graphical simulator QtSpim; you have to type input manually (pressing return after each number entered if a program reads more than one number).
3 Requirements
In this project you are a compiler. In particular, you are a compiler that does not do any optimization. There are many different ways that the three C programs could have been written differently yet produce the same results, but you are not going to modify the C programs when you convert them to assembly, because that is not what a compiler does. A compiler simply converts C programs, as the programmer wrote them, to assembly.
What this means is that you must follow the programs 100% accurately in translating them to assembly. Your assembly programs must use the exact same algorithms as the corresponding C programs. They must have instructions that implement the statements that are in the C programs, just converted to assembly. The functions must have the same number of returns as the C programs. They must have the exact same local variables and parameters as the functions in the C programs do. Every function, local variable, global variable, if statement, loop, etc., which is present in each C program needs to likewise be exactly present in the assembly program that you write for it. And there should also not be anything in each assembly program that is not in the corresponding C program.
If you don’t follow the programs exactly in translating them, you will lose significant credit. As an extreme example, it would easily be possible to write a program that had the same effect as the second or third programs, without using a separate function at all. However we will detect this in grading and you would lose all credit for that part of the project as a result. (If this caused you to pass fewer than half of the public tests of this project then you would be in danger of not passing the course, regardless of overall grade.) Here are more specifics and requirements:
• For the second and third programs that use functions, each function must use registers beginning with $t0. You cannot try to “reserve” different registers for use in different functions. First, if you do this, we will deduct significant credit. Second, and more important, compilers do not do this because it will not work. Trying to do this for even small programs such as the second and third ones in this project would require more registers than the machine has. So just start using registers with $t0, $t1, etc., in each function.
• A consequence of the above is that any statement in one of the C programs that has a side effect must immediately cause something (the modified variable) to be stored in memory. The semantics of side effects are that they cause memory to be modified, so you cannot just keep variables only in registers (there are not enough registers to be able to do this).
Registers temporarily store operands and results of computations, but operands of computations are first fetched (loaded) from memory, and results of computations are stored back in memory right after they are produced.
For example, the first C program has ten side effects– four scanf()s and six assignment statements. So your assembly program must have ten sw instructions.
The number of sw instructions in the second and third programs will be more than the number of scanf()s and assignments, because passing arguments to functions and creating a function’s stack frame also involve storing values in memory.
• Related to the above: when an assembly function makes any function call– which could either be to another function, or even a recursive call to itself– after the function call it cannot assume that any registers have the same values that they did before the call.
Of course if assembly code puts a value in a register and uses that register later, and it has not changed the value in that register or made any function calls in the meantime, the register will still have the same value. But different functions use the same registers– because there are not enough registers for each function to have its “own” registers– so registers will almost certainly not have the same values after a function call as before. So what assembly code must do is to immediately store the result of any statement that has a side effect into memory– which must be a memory location in the runtime stack if the side effect is changing a local variable or parameter– and after any function call, any values that are needed again must be reloaded into registers from memory (from the runtime stack, if the values needed are local variables or parameters).
• The second and third programs (that have functions) must pass the same arguments as their C main programs pass to their functions, using the runtime stack, as shown in class. You cannot just use global variables or registers to communicate values from one function (including main) to another function. And because the functions in the C programs only use local variables and parameters– and do not use any global variables– your assembly functions (other than main) can only use local variables and parameters and not use any global variables.
If you don’t faithfully follow the program and store all function arguments in the runtime stack you will lose significant credit.
• Your functions in the second and third programs must pass their return values back via a register, as illustrated in class (not for example using a global variable).
• The functions in the second and third assembly programs must have a local variable for every local variable in the corresponding C program. Similar to the previous item, all function local variables must be stored in the runtime stack, not in the data segment with a label (only global and static variables are stored in the data segment).
For example, the function in the third program is the first of the two versions shown below (just formatted slightly differently in one place). The second version would work exactly the same, but your assembly code must implement the first version (storing the results of special cases and the value returned by the recursive call into the local variable ans, then returning the value of ans, rather than just directly returning the special case values or the result produced by the recursive call), because one thing the third C program is testing for is your code being able to store and access local variables in a recursive function.
• As mentioned, the num_digits functions in the second and third C programs only use local variables and parameters; they do not use any global variables. Only the main functions use global variables. Your assembly functions must do the same.
If your assembly programs don’t follow these things you will lose significant credit.
/* Version used in prog3.c: this is the one your assembly must implement. */
static int num_digits(int n, int base) {
  int ans;

  ans= 0;
  if (base <= 0)
    ans= -1;
  else {
    if (n < 0)
      n= -n;
    if (base == 1)
      ans= n;
    else
      if (n < base)
        ans= 1;
      else ans= 1 + num_digits(n / base, base);
  }
  return ans;
}

/* Equivalent version without the local variable ans: do not implement this. */
static int num_digits(int n, int base) {
  if (base <= 0)
    return -1;
  else {
    if (n < 0)
      n= -n;
    if (base == 1)
      return n;
    else
      if (n < base)
        return 1;
      else return 1 + num_digits(n / base, base);
  }
  return 0;
}
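The claim that the two versions of num_digits above behave identically can be checked with a small harness. The sketch below copies both versions under the hypothetical names num_digits_ans and num_digits_direct (renamed, and made non-static, only so that both can live in one file) and adds a hypothetical helper versions_agree that compares them over a range of inputs. It checks a finite range only, so it is evidence rather than proof.

```c
/* Copy of the version that stores results in the local variable ans. */
int num_digits_ans(int n, int base) {
    int ans;

    ans = 0;
    if (base <= 0)
        ans = -1;
    else {
        if (n < 0)
            n = -n;
        if (base == 1)
            ans = n;
        else if (n < base)
            ans = 1;
        else
            ans = 1 + num_digits_ans(n / base, base);
    }
    return ans;
}

/* Copy of the version that returns directly from each case. */
int num_digits_direct(int n, int base) {
    if (base <= 0)
        return -1;
    else {
        if (n < 0)
            n = -n;
        if (base == 1)
            return n;
        else if (n < base)
            return 1;
        else
            return 1 + num_digits_direct(n / base, base);
    }
    return 0;   /* never reached; kept to mirror the original */
}

/* Hypothetical helper: returns 1 if both versions agree for every n in
   [-limit, limit] and every base in [-2, max_base], otherwise 0. */
int versions_agree(int limit, int max_base) {
    int n, base;

    for (n = -limit; n <= limit; n++)
        for (base = -2; base <= max_base; base++)
            if (num_digits_ans(n, base) != num_digits_direct(n, base))
                return 0;
    return 1;
}
```

Note that num_digits_ans(0, 1) returns 0, which is consistent with Section 2.3's statement that the third program treats zero in unary as having no digits.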
Of course there are slightly different ways of translating some statements from C to assembly, which would all be correct (just like there are obviously different ways of writing the C projects in this course that satisfy the requirements and are fine). For example, when writing code for a conditional or loop you can choose to invert the condition or not (of course the code for the subsidiary statements would be in different orders depending upon which way you choose). But the intention is to faithfully translate the programs like a compiler would (albeit an inefficient compiler), using the same algorithms, steps, statements, and variables that the original C programs do.
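To make the point about inverting a condition concrete, the sketch below lowers one simple if/else to goto form in two ways, using C's goto as a stand-in for assembly branches. The function names and the values 1 and 2 are hypothetical and chosen only for illustration; neither function appears in the project programs. Both layouts compute the same result, and only the order of the subsidiary statements differs.

```c
/* Condition kept as written: branch to the "then" part when it holds. */
int one_or_two_direct(int n, int base) {
    int ans;

    if (n < base) goto then_part;
    ans = 2;                    /* else part comes first in this layout */
    goto end_if;
then_part:
    ans = 1;
end_if:
    return ans;
}

/* Condition inverted: branch around the "then" part when it fails. */
int one_or_two_inverted(int n, int base) {
    int ans;

    if (n >= base) goto else_part;
    ans = 1;                    /* then part comes first in this layout */
    goto end_if;
else_part:
    ans = 2;
end_if:
    return ans;
}
```

Either layout is a faithful translation of "if (n < base) ans= 1; else ans= 2;", since the same statements execute under the same conditions.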
A Development procedure
A.1 Running your programs and checking your results
The public tests are just text files that have the numbers to be read by your assembly programs. Each assembly program will be run using the spim simulator, with input redirected from an input data file. (For debugging you will want to use QtSpim, but the tests just use the nongraphical command–line simulator spim.) To simplify things, since different tests run different programs, we wrote short, mostly one–line shell scripts to run each public test. To run the first public test just run the script public01 (as a command), which you can see just executes the command spim -file prog1.s < public01.inputdata. The shell scripts then pipe the output of the simulator into the command tail -n +2, which ignores or removes the very first line of its input (it prints all of its standard input starting with the second line). This is because the spim simulator prints an extra first line when it runs, which is different on the submit server (because the simulator is set up differently there than on Grace). After tail is used to ignore that first line your program’s actual output is all that remains, which can be directly compared with the expected outputs both on Grace and on the submit server.
As before, if no differences exist between your output and the correct output, diff will produce no output, and your code passed the test. Because the tests are all shell scripts the run-tests2 script that you were able to use in Project #9 can again be used in this project, to automate the process of running all of the tests. Once you have written the three assembly programs just use the single command run-tests2 to run all of the public tests, and see which ones passed or failed. Note also that if you name your own test input files of the form studentN.inputdata, and create expected output files with names of the form studentN.output and write scripts to run your test named like studentN (where N is a one–digit or two–digit number), run-tests2 will run your tests along with the public tests. (You must give your scripts executable permission using the chmod command, for example chmod u+x student01 for a script named student01, in order to be able to run them.)
A.2 Submitting your program
As before, the command submit will submit your project. Before you submit, however, you must first check to make sure you have passed all the public tests, by running them yourself.
A.3 Grading criteria
Your grade for this project will be based on:
public tests         45 points
secret tests         40 points
programming style    15 points
A.3.1 Style grading
For this project the style guidelines are different, as you are writing in assembly language, not C. However, assembly language is still a programming language, and good style is crucial to use in any language, so you or others can read and understand your code. Please pay close attention to these guidelines:
• Each of your MIPS program files should begin with a comment containing your name, TerpConnect login ID, University ID number, and your section number.
• Each of your MIPS program files should have a detailed explanatory comment at the top explaining in high–level terms what that program is doing.
• Your code must be thoroughly commented with explanatory comments throughout. Assembly language can easily become unreadable without proper documentation (and it’s not that easy to understand even with good explanation), so it is absolutely necessary that you comment your code very well.
Because assembly is much less readable than C you should write descriptive comments during development, not wait until you’re finished with coding to comment. Following the logic to trace a bug in assembly can be confusing and your comments will help you to remember what you were doing when you wrote code.
• Reasonable and consistent indentation is required, and you should also line up the operands of instructions neatly. As assembly language is straightforward (syntactically) compared to higher–level languages, it should not be too difficult to manually maintain indentation. However, Emacs has its own idea of what assembly indentation should look like, which might not agree with yours, and it can be annoying when it indents things differently from what you are trying to do. To disable Emacs from trying to indent your program automatically, just use the following as the very first line of your assembly program (the # makes it a comment, so spim and QtSpim will ignore it):
# -*- mode: text -*-
Note that all of the lecture example MIPS programs have this line.
• Label names should be descriptive and meaningful.
• Use appropriate vertical whitespace, meaning blank lines between blocks of instructions that are performing different tasks.
• Program lines should be no longer than 80 characters. (Run your Project #2 on your programs to check this!)
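For readers who want to see what an 80-character check amounts to, here is a minimal sketch. It is not the Project #2 program: the function name count_long_lines and its interface (scanning a string that is already in memory) are hypothetical, and it assumes every character, including a tab, counts as one column.

```c
#include <stddef.h>

/* Hypothetical helper: counts the lines in text that are longer than limit
   characters, not counting the terminating newline. */
int count_long_lines(const char *text, size_t limit) {
    size_t len = 0;     /* length of the line currently being scanned */
    int count = 0;      /* number of overlong lines seen so far */

    for (; *text != '\0'; text++) {
        if (*text == '\n') {
            if (len > limit)
                count++;
            len = 0;
        } else
            len++;
    }
    if (len > limit)    /* final line without a trailing newline */
        count++;
    return count;
}
```

A result of 0 for count_long_lines(contents, 80) would mean no line in contents exceeds the limit.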
B Other requirements, notes, and hints
◦ Reminder– there is a short handout on ELMS explaining how to use the graphical simulator QtSpim, including useful information about things like how to rerun a program, how to reload it after changing it, how to enlarge the default font, how to single–step through a program, etc. Be sure to read it to help with the development process.
◦ Write and test small pieces of code at a time. This is even more important with assembly than C code.
◦ Don’t forget to assign the return values of function calls to the local variables ans in the second and third programs, which the C programs do, otherwise your functions will probably be returning the wrong values.
◦ Except as noted in Section 2.4 above, in all cases each of your assembly programs must perform identically and produce the same results as the corresponding C program.
We emphasize again that your programs must do the same things the C programs do, and use the same algorithms. You should strive to implement the programs in assembly as closely to what the C programs are doing as possible. Do not add any extra conditions or checks, or remove any existing ones. And every C statement that has a side effect must cause a memory location to be modified (even if you think the program would work without it).
Remember that assembly instructions are sequential and that when compilers translate code to assembly they do so incrementally, one C statement at a time. So a good approach is to go through a program and think of how each individual line of code in isolation would be translated into one or more assembly instructions, completely ignoring what statements are before it and after it.
◦ Your programs may use any MIPS instructions supported by spim/QtSpim, even if they were not explained in class.
◦ For the programs that have functions you must use the conventions we showed in class for the location and order that parameters and local variables are stored in stack frames, and must use the pattern that we use for functions’ prologue and epilogue code.
◦ You will lose major credit if your second and third programs don’t use functions, or if the num_digits function in the third program isn’t recursive.
◦ If you were to just copy the second program prog2.s to the third one prog3.s it would pass all tests of the third program. But this will be checked for in grading, it would be considered to be hardcoding and handled as such, and you would receive at minimum a severe penalty. Doing so could also potentially be considered to be cheating, so we would have to involve the Office of Student Conduct to determine that.
◦ Recall that the course project grading policies handout on ELMS says that all your projects must work on at least half of the public tests (by the end of the semester) in order for you to be eligible to pass the course. The project grading policy has full details.
◦ If an assembly program uses memory that was never initialized it can produce different results in spim vs. QtSpim, and it can produce different results when just run in QtSpim versus single–stepped through in QtSpim. So even in assembly you must make sure that every variable in memory is given a value somehow before the value in that location is used.
• Do not use an actual MIPS compiler to generate your MIPS assembly code. If you did this, you would not get any credit for the project, so you would not pass the course. And you would also be referred to the Office of Student Conduct. It is easy to identify compiler–generated code. You need to write the assembly programs yourself.
• For this project you will lose one point from your final project score for every submission that you make in excess of six submissions, for any reason.
C Academic integrity
Please carefully read the academic honesty section of the syllabus. Any evidence of impermissible cooperation on projects, use of disallowed materials or resources, publicly providing others access to your project code online, or unauthorized use of computer accounts, will be submitted to the Office of Student Conduct, which could result in an XF for the course, or suspension or expulsion from the University. Be sure you understand what you are and what you are not permitted to do with regard to academic integrity when it comes to projects. These policies apply to all students, and the Student Honor Council does not consider lack of knowledge of the policies to be a defense for violating them. More information is in the course syllabus – please review it now.
The academic integrity requirements also apply to any test data for projects, which must be your own original work.
Exchanging test data or working together to write test cases is also prohibited.