(project concept, portions of the writeup, and your generated executables thanks to Dr. Misurda.)
A great way to learn how something works is to break it. This is especially true for things where you don’t know how they work. Many of the things we know about biology, the universe, physics, etc. come from seeing what happens when things go wrong.
The goal of this project is to recover passwords - or password patterns - for three executable files. You don’t have the source code to these programs, only the compiled executables!
To make it more of a challenge, the programs may have a single, fixed password, or they may use some kind of algorithm to generate/check the password.
Read this section before doing anything else!
Here’s how to set up gdb to use the Intel disassembly syntax. While logged into thoth, do this:
pico ~/.gdbinit
Inside that file, write this exactly:
set disassembly-flavor intel
and save. Now, when you view disassembly in gdb, it will match the slides I gave you and will be way easier to understand overall.
[40 points] mystrings.c
There are many tools that will be helpful for password-cracking. One is a program to extract readable strings from a binary file.
A binary file can contain text, but it will be embedded within a bunch of non-text data. Here’s a snippet of an example output of the hexdump -C command, which shows the bytes in hex on the left, and their ASCII version (or . if it’s not printable ASCII) on the right.
...
000002e0  21 00 00 00 12 00 00 00  00 00 00 00 00 00 00 00  |!...............|
000002f0  00 00 00 00 00 00 00 00  00 5f 5f 67 6d 6f 6e 5f  |.........__gmon_|
00000300  73 74 61 72 74 5f 5f 00  6c 69 62 63 2e 73 6f 2e  |start__.libc.so.|
00000310  36 00 70 72 69 6e 74 66  00 71 73 6f 72 74 00 5f  |6.printf.qsort._|
00000320  5f 6c 69 62 63 5f 73 74  61 72 74 5f 6d 61 69 6e  |_libc_start_main|
00000330  00 47 4c 49 42 43 5f 32  2e 32 2e 35 00 00 00 00  |.GLIBC_2.2.5....|
00000340  02 00 00 00 02 00 02 00  01 00 01 00 10 00 00 00  |................|
00000350  10 00 00 00 00 00 00 00  75 1a 69 09 00 00 02 00  |........u.i.....|
00000360  39 00 00 00 00 00 00 00  e8 09 60 00 00 00 00 00  |9.........`.....|
00000370  06 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000380  08 0a 60 00 00 00 00 00  07 00 00 00 01 00 00 00  |..`.............|
00000390  00 00 00 00 00 00 00 00  10 0a 60 00 00 00 00 00  |..........`.....|
000003a0  07 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
000003b0  18 0a 60 00 00 00 00 00  07 00 00 00 04 00 00 00  |..`.............|
000003c0  00 00 00 00 00 00 00 00  48 83 ec 08 e8 7b 00 00  |........H....{..|
000003d0  00 e8 0a 01 00 00 e8 d5  02 00 00 48 83 c4 08 c3  |...........H....|
000003e0  ff 35 12 06 20 00 ff 25  14 06 20 00 0f 1f 40 00  |.5.. ..%.. ...@.|
000003f0  ff 25 12 06 20 00 68 00  00 00 00 e9 e0 ff ff ff  |.%.. .h.........|
...
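To make that layout concrete, here is a minimal C sketch that formats one 16-byte row in approximately the same style: offset, hex bytes, then an ASCII column with . for anything non-printable. The function name format_row and its parameters are made up for this example; it is not how hexdump itself is implemented.

```c
#include <stdio.h>
#include <stddef.h>

/* Format up to 16 bytes as one row, roughly in the style of `hexdump -C`.
 * `out` must have room for at least 80 characters.
 * Returns the length of the formatted row. */
int format_row(char *out, unsigned long offset,
               const unsigned char *bytes, size_t count)
{
    int pos = sprintf(out, "%08lx ", offset);

    for (size_t i = 0; i < 16; i++) {
        if (i == 8)
            pos += sprintf(out + pos, " ");    /* extra gap between the two halves */
        if (i < count)
            pos += sprintf(out + pos, " %02x", bytes[i]);
        else
            pos += sprintf(out + pos, "   ");  /* pad a short final row */
    }

    pos += sprintf(out + pos, "  |");
    for (size_t i = 0; i < count; i++) {
        unsigned char c = bytes[i];
        /* printable ASCII is shown as-is, everything else as '.' */
        out[pos++] = (c >= 32 && c <= 126) ? (char)c : '.';
    }
    out[pos++] = '|';
    out[pos] = '\0';
    return pos;
}
```

Notice that the right-hand column is where readable text like __gmon_start__ and libc.so.6 jumps out of the surrounding noise.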
There is a UNIX program called strings which lets you search for strings within a binary file like this. You will write a little program called mystrings which will be a very simplified version of strings.
How it should work
Your program will be very simple. Seriously. This can be done in about 30-40 lines of C, not counting braces.
mystrings will work as follows:
    • You will run it like ./mystrings somefile.
    • It will read somefile as a binary file.
        ◦ It should check if the file doesn’t exist, by seeing if fopen returned NULL. If so, print an error and quit.
    • It should look for sequences of 4 or more printable ASCII characters.
        ◦ It should print each of those sequences on its own line.
        ◦ A printable ASCII character is a char whose value is in the range 32 to 126, inclusive. So [32, 126].
That’s it.
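For the file-opening step, a minimal sketch might look like the following. The helper name open_binary is hypothetical, and printing the error with perror is just one reasonable choice; any clear error message would do.

```c
#include <stdio.h>

/* Open `path` for reading in binary mode.
 * Returns the stream, or NULL after printing an error to stderr. */
FILE *open_binary(const char *path)
{
    FILE *f = fopen(path, "rb");   /* "rb" = read, binary */
    if (f == NULL)
        perror(path);              /* reports why the open failed */
    return f;
}
```

If this returns NULL, main would typically just return a nonzero exit code.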
Requirements and Tips
You don’t have to support any of the extra options that the strings program supports.
You don’t have to display tab and newline characters as embedded within strings, like strings does.
The strings are not necessarily 0-terminated! They are terminated by any byte that is not a printable ASCII character. So, you can’t really use printf("%s").
Make sure your program can handle strings that are arbitrarily long. Since you have no idea how long a string is when you start reading it, that means you can’t really allocate space to store the whole string.
Fortunately, you only have to print ASCII characters until you hit the end. How many characters do you really have to store? Do you really need to store the string at all??
Try having a look at the functions in the C standard library, such as in stdio.h and ctype.h. Maybe you’ll find something useful!
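As a small illustration of the kind of building blocks those headers offer, here is a sketch of a printable-ASCII test and a byte-at-a-time read loop. The names is_printable_ascii and count_printable are made up for this example, and counting printable bytes is not the assignment itself, just a way to show the loop shape. In the default "C" locale, isprint from ctype.h should agree with the [32, 126] range check.

```c
#include <stdio.h>

/* Printable ASCII as defined in this writeup: values in [32, 126]. */
int is_printable_ascii(int c)
{
    return c >= 32 && c <= 126;
}

/* Count the printable bytes in a stream, reading one byte at a time.
 * fgetc returns an int so that EOF (a negative value) can be told
 * apart from every valid byte value 0..255. */
long count_printable(FILE *f)
{
    long n = 0;
    int c;                           /* int, not char, so EOF is distinguishable */
    while ((c = fgetc(f)) != EOF) {
        if (is_printable_ascii(c))
            n++;
    }
    return n;
}
```

Storing the result of fgetc in a char instead of an int is a classic bug: depending on whether char is signed, the byte 0xFF can be mistaken for EOF, or EOF may never be detected.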
Testing it
Do not run it on text files. That’s not what it’s for. Run it on binary files! Examples include executables, object files, image files, and more. There are lots of executables in /bin, such as /bin/ls.
You can see what your program is supposed to output for a file by comparing its output to the output of the real strings program. I’ve made a shell script for you:
cp /afs/pitt.edu/home/j/f/jfb42/public/cs449/testmystrings.sh .
Have a look inside it, and you will see that it runs both programs and redirects their output to the strings.out and mystrings.out files. Then it uses git diff to compare them.
The > strings.out is redirection - it sends the output of the program to the file strings.out instead of to the console.
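One detail worth knowing: > only redirects standard output. Anything a program writes to standard error still shows up on the console. The sketch below (the function name report is hypothetical) writes a result line to one stream and a diagnostic to another, so you can see the difference by passing stdout and stderr.

```c
#include <stdio.h>

/* Write one result line to `out` and one diagnostic line to `err`.
 * Returns the number of characters written to `out`. */
int report(FILE *out, FILE *err, const char *text)
{
    int written = fprintf(out, "%s\n", text);               /* normal output */
    fprintf(err, "note: wrote %d characters\n", written);   /* diagnostic */
    return written;
}
```

If a program calls report(stdout, stderr, "hello") and you run it with > some.out, the hello line lands in some.out while the note line still appears on the console. That is why error messages usually go to stderr.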
Now you can run it in the same directory as mystrings like:
./testmystrings.sh /bin/ls
If it prints nothing, then the files are identical!
Green lines mean your program is finding too many strings.
Red lines mean your program is missing strings.
[60 points] Password-cracking
I have generated three executable files for each of you.
Log into thoth. Then cd /u/SysLab/yourusername, like mine is /u/SysLab/jfb42. (The capitalization in SysLab is important.) Your executables are in there.
The /u/SysLab/ directory is physically stored on thoth. You cannot access it from any other computer, and any files you put in it will disappear at the end of the term when you lose access to thoth. It’s a better idea to do all your work in your AFS space.
So, copy the programs to your private directory by doing cp * ~/private. From there, you can do whatever you want with them.
Your goal
When you run these executables, they wait for you to type something and hit enter. You must type the correct password to “unlock” them. The program will tell you whether or not you succeeded.
OH MY GOD HOW DO I EVEN START
Reverse engineering is like a puzzle. You have to start with what you know. Here are some things to know:
    • A program may have a different password every time you run it! Make sure to test it several times, on different dates, from different computers, etc.
    • All the programs are written in C.
        ◦ The assembly you look at was generated by an algorithm, so there is structure to it.
    • Each program will have a different password per student, but how to find it will be the same for everyone.
    • All passwords will be printable ASCII characters and be far less than 100 characters in length.
        ◦ That being said, brute-forcing will probably not be productive.
Here is a page with a lot more info that might be helpful for you.
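To give a feel for what "written in C" can mean in practice, here is a purely hypothetical sketch of the simplest kind of checker: one that compares your input against a fixed string. This is NOT the source of any of your executables, and the names SECRET and check_password are invented for illustration. The point is only that a string literal like this is stored as bytes inside the compiled file, which is exactly the kind of thing a strings-style tool can surface.

```c
#include <string.h>

/* HYPOTHETICAL example only: not taken from any assigned program. */
static const char SECRET[] = "example-password";

/* Returns 1 if `guess` matches the fixed secret, 0 otherwise. */
int check_password(const char *guess)
{
    return strcmp(guess, SECRET) == 0;
}
```

A program that generates or checks its password with an algorithm will not be this easy, which is where gdb and the disassembly come in.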
The writeup
You will be typing up a short document which shows, for each of the three programs:
    1. the password
        ◦ it might be a single fixed string
        ◦ or it might be generated by an algorithm, in which case you must describe the algorithm
    2. an explanation of how you found it
        ◦ you don’t have to describe the entire process, but…
        ◦ briefly describe any failed attempts or ideas
        ◦ and then describe how you succeeded.
        ◦ If you couldn’t find the password, you can still get full credit for this part of the writeup by explaining your attempts and your theories on what the password is.
This is not a writing course. Please don’t write a novel. There is no page count to hit. It only has to be between 1 and 2 pages, line spacing 1.5, 12pt font, 1 inch margins.
Please keep it to-the-point and technical. You don’t need an intro and body and conclusion. Just write clearly and tersely. Technical writing is about clarity. (Humor and wit are still welcome, of course.)
Grading rubric
    • [5] Submission
        ◦ Any deviation from the submission instructions will lose you all 5 points.
    • [35] mystrings.c
        ◦ [5] Compiles and runs according to the specified interface
            ▪ (should work with gcc -Wall -Werror --std=c99)
        ◦ [10] follows the correct definition of a “string”
            ▪ (at least 4 printable ASCII characters)
        ◦ [10] handles strings of arbitrary length
            ▪ (not just “100 characters”)
        ◦ [5] output is displayed properly
            ▪ (one string per line, no blank lines, no debugging info etc.)
        ◦ [5] Style
            ▪ (this is a tiny program but try to make it readable)
    • [60] Passwords
        ◦ [10] Found password 1
            ▪ If a program uses an algorithm for the password, but you only found one valid password, it’s half credit.
        ◦ [10] Found password 2
        ◦ [10] Found password 3
        ◦ [10] Writeup for password 1
            ▪ remember, you can still get full writeup credit if you didn’t find the password, as long as you describe what you tried.
        ◦ [10] Writeup for password 2
        ◦ [10] Writeup for password 3
Submission
Follow this checklist for a free 5 points:
You should submit a gzipped tarball containing exactly three files:
    • mystrings.c
        ◦ Your username and full name should be in the comments at the top.
        ◦ Never turn in a program that doesn’t compile.
    • Your mystrings executable
    • A single Word or PDF file named <username>.docx/pdf (like jfb42.pdf) that contains:
        ◦ Your username and full name at the top
        ◦ The three passwords (if you found them!)
        ◦ The three program writeups
Name your file username_proj3.tar.gz like jfb42_proj3.tar.gz.
Now you can submit as usual.