CSC 252/452 Computer Organization
University of Rochester
Introduction
This assignment helps you develop a detailed understanding of the calling stack organization on an x86 processor. It involves applying a series of buffer overflow attacks on an executable file.
Note: In this lab, you will gain first-hand experience with one of the methods commonly used to exploit security weaknesses in operating systems and network servers. Our purpose is to help you learn about the run-time operation of programs and to understand the nature of this form of security weakness so that you can avoid it when you write system code. We vigorously condemn the use of these or any other form of attack to gain unauthorized access to any system resources. There are criminal statutes governing such activities.
Due Dates
11:59pm, Friday Oct. 25th.
Location of the Bomb
To obtain a bomb, point your browser at http://cycle3.csug.rochester.edu:18213/. (This works only from a browser running on a csug machine.) You will download a tar file which, when unpacked, gives you a set of three programs:
makecookie
- Generates a "cookie" based on your team name.
bufbomb
- The code you will attack.
hex2raw
- A utility to help convert between string formats.
All of these programs are compiled to run on the csug machines.
Team Name and Cookie
You should create a team name for the one or two people in your group of the following form:
- name
- where name is your username, if you are working alone, or
- name1
- where name1 is the username of the first team member.
A cookie is a string of eight hexadecimal digits that is (with high probability) unique to your team. You can generate your cookie with the makecookie program, giving your team name as the argument. For example:
unix> makecookie bob
0x39297273
In three of your four buffer attacks, your objective will be to make your cookie show up in places where it ordinarily would not.
The bufbomb Program
The bufbomb program reads a string from standard input with a function getbuf having the following C code:
int getbuf()
{
    char buf[NORMAL_BUFFER_SIZE];
    Gets(buf);
    return 1;
}
The function Gets is similar to the standard library function gets. It reads a string from standard input (terminated by '\n' or end-of-file) and stores it (along with a null terminator) at the specified destination. In this code, the destination is an array buf having sufficient space for 32 characters.
Neither Gets nor gets has any way to determine whether there is enough space at the destination to store the entire string. Instead, they simply copy the entire string, possibly overrunning the bounds of the storage allocated at the destination.
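For contrast, a bounded copy can be sketched as follows. This is only an illustration and not code from the lab: the name bounded_copy_line and its signature are hypothetical, and it reads from an in-memory string rather than standard input so that the example stays self-contained. Like Gets, it stops at a newline or at the end of the input, but it also takes the capacity of the destination and truncates rather than writing past it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of bufbomb): copy one line from src into dst,
 * storing at most cap - 1 characters followed by a null terminator.
 * Returns the number of characters stored, excluding the terminator. */
size_t bounded_copy_line(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && cap > 0);

    /* Stop at a newline, at the end of the input, or when only the
     * slot reserved for the terminator remains. */
    while (src[n] != '\0' && src[n] != '\n' && n + 1 < cap) {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';
    return n;
}
```

With an 8-byte destination, the input "This string is too long" is cut to its first 7 characters instead of overrunning the array. For real input streams, the standard library function fgets provides the same kind of bound.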
If the string typed by the user to getbuf is no more than 31 characters long, it is clear that getbuf will return 1, as shown by the following execution example:
unix> bufbomb -u bovik
Type string: howdy doody
Dud: getbuf returned 0x1
If we type a longer string, typically an error occurs:
unix> bufbomb -u bovik
Type string: This string is too long
Ouch!: You caused a segmentation fault!
As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed bufbomb so that it does more interesting things. These are called exploit strings.
Note: This version of the lab has been specially modified to defeat the stack randomization techniques used by newer versions of Linux. It works by using mmap() and a bit of in-line assembly code to move the stack pointed at by %esp to an otherwise unused part of your address space. You may need to use gdb to figure out where that is.
Bufbomb takes several command line arguments:
- -u TEAM
- Operate the bomb for the indicated team. You should always provide this argument for several reasons:
  - It is required to log your successful attacks.
  - Bufbomb determines the cookie you will be using based on your team name, just as the program makecookie does.
  - We have built features into bufbomb so that some of the key stack addresses you will need to use depend on your team's cookie.
- -h
- Print list of possible command line arguments
- -n
- Operate in Nitro mode, as is used in Level 4 below.
- -s
- Submit your solution exploit string to the grading server.
Your exploit strings will typically contain byte values that do not correspond to the ASCII values for printing characters. The program hex2raw can help you generate these raw strings. It takes as input a hex-formatted string. In this format, each byte value is represented by two hex digits. For example, the string "012345" could be entered in hex format as "30 31 32 33 34 35", since the ASCII code for decimal digit 0 is 0x30 and so forth.
The hex characters you pass hex2raw should be separated by whitespace (blanks or newlines). I recommend separating different parts of your exploit string with newlines while you're working on it. hex2raw also supports C-style block comments, so you can mark off sections of your exploit string. For example:
bf 66 7b 32 78 /* mov $0x78327b66,%edi */
Be sure to leave space around both the starting and ending comment strings (/*, */) so they will be properly ignored.
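The hex format described above can be sketched in C as follows. This is an illustration under stated assumptions, not the source of hex2raw: the function name hex_to_raw, its signature, and its error handling are hypothetical. It converts whitespace-separated two-digit hex tokens into bytes and skips everything between a standalone /* token and a standalone */ token, which is why the comment markers need surrounding space.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical converter (not the real hex2raw): parse text into out.
 * Returns the number of bytes written, or -1 on a malformed token
 * or when out (capacity cap) is too small. */
long hex_to_raw(const char *text, unsigned char *out, size_t cap)
{
    size_t n = 0;
    int in_comment = 0;
    const char *p = text;

    assert(text != NULL && out != NULL);

    while (*p != '\0') {
        /* Skip whitespace, then delimit the next token. */
        while (*p != '\0' && isspace((unsigned char)*p)) p++;
        if (*p == '\0') break;
        const char *tok = p;
        while (*p != '\0' && !isspace((unsigned char)*p)) p++;
        size_t len = (size_t)(p - tok);

        /* Comment markers are recognized only as standalone tokens. */
        if (len == 2 && tok[0] == '/' && tok[1] == '*') { in_comment = 1; continue; }
        if (len == 2 && tok[0] == '*' && tok[1] == '/') { in_comment = 0; continue; }
        if (in_comment) continue;

        /* Outside comments, every token must be exactly two hex digits. */
        if (len != 2) return -1;
        int hi = hex_val(tok[0]);
        int lo = hex_val(tok[1]);
        if (hi < 0 || lo < 0 || n >= cap) return -1;
        out[n++] = (unsigned char)(hi * 16 + lo);
    }
    return (long)n;
}
```

Applied to the commented example line above, this sketch yields the five bytes bf 66 7b 32 78 and ignores the text of the comment.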
If you place a hex-formatted exploit string in the file exploit.txt, you can apply the raw string to bufbomb in at least two different ways:
- You can set up a series of pipes to pass the string through hex2raw.
unix> cat exploit.txt | ./hex2raw | ./bufbomb -u bovik
- You can store the raw string in a file and use I/O redirection to supply it to bufbomb:
unix> ./hex2raw < exploit.txt > exploit-raw.txt
unix> ./bufbomb -u bovik < exploit-raw.txt
This second approach can also be used when running bufbomb from within gdb:
unix> gdb bufbomb
(gdb) run -u bovik < exploit-raw.txt
One important point: your exploit string must not contain byte value 0x0A at any intermediate position, since this is the ASCII code for newline ('\n'). When Gets encounters this byte, it will assume you intended to terminate the string.
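A quick check for this condition might look like the sketch below. The helper first_newline is hypothetical and not part of the lab tools; it simply scans the raw bytes of an exploit string and reports where the first 0x0A byte occurs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pre-flight check (not a lab tool): return the index of the
 * first 0x0A byte in raw[0..len-1], or -1 if there is none. */
long first_newline(const unsigned char *raw, size_t len)
{
    assert(raw != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (raw[i] == 0x0A)
            return (long)i;
    }
    return -1;
}
```

Any result other than -1 that falls before the last byte means Gets would stop reading at that position and the rest of the exploit string would not be stored.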
When you have correctly solved one of the levels, say level 0:
../hex2raw < smoke-bovik.txt | ../bufbomb -u bovik
Userid: bovik
Cookie: 0x1005b2b7
Type string:Smoke!: You called smoke()
VALID
NICE JOB!
then you can submit your solution to the grading server using the -s option:
./hex2raw < smoke-bovik.txt | ./bufbomb -u bovik -s
Userid: bovik
Cookie: 0x1005b2b7
Type string:Smoke!: You called smoke()
VALID
Sent exploit string to server to be validated.
NICE JOB!
The server will test your exploit string to make sure it really works, and it will update the Buffer Lab scoreboard page to indicate that your userid (listed by your cookie for anonymity) has completed this level.
You can view the scoreboard by pointing your browser at http://cycle3.csug.rochester.edu:18213/scoreboard
Unlike the bomb lab, there is no penalty for making mistakes in this lab. Feel free to fire away at bufbomb with any string you like.
Levels
Levels may be done in any order.
Logistics
Hand-in occurs automatically whenever you correctly solve a level and use the -s option. Upon receiving your solution, the server will validate your string and update the Buffer Lab scoreboard Web page, which you can view by pointing your Web browser at http://cycle3.csug.rochester.edu:18213/scoreboard. You should be sure to check this page after your submission to make sure your string has been validated. (If you really solved the level, your string should be valid.)
Note that each level is graded individually. You do not need to do them in the specified order, but you will get credit only for the levels for which the server receives a valid message.
Have fun!
Generating Byte Codes
Using gcc as an assembler and objdump as a disassembler makes it convenient to generate the byte codes for instruction sequences. For example, suppose we write a file example.s containing the following assembly code:
# Example of hand-generated assembly code
pushl $0x89abcdef    # Push value onto stack
addl $17,%eax        # Add 17 to %eax
.align 4             # Following will be aligned on multiple of 4
.long 0xfedcba98     # A 4-byte constant
.long 0x00000000     # Padding
The code can contain a mixture of instructions and data. Anything to the right of a '#' character is a comment. We have added an extra word of all 0s to work around a shortcoming in objdump to be described shortly.
We can now assemble and disassemble this file:
unix> gcc -c example.s
unix> objdump -d example.o > example.d
The generated file example.d contains the following lines:
0: 68 ef cd ab 89    push $0x89abcdef
5: 83 c0 11          add $0x11,%eax
8: 98                cwtl                 Objdump tries to interpret
9: ba dc fe 00 00    mov $0xfedc,%edx     these as instructions
Each line shows a single instruction. The number on the left indicates the starting address (starting with 0), while the hex digits after the ':' character indicate the byte codes for the instruction. Thus, we can see that the instruction pushl $0x89ABCDEF has hex-formatted byte code 68 ef cd ab 89.
Starting at address 8, the disassembler gets confused. It tries to interpret the bytes in the file example.o as instructions, but these bytes actually correspond to data. Note, however, that if we read off the 4 bytes starting at address 8 we get: 98 ba dc fe. This is a byte-reversed version of the data word 0xFEDCBA98. This byte reversal represents the proper way to supply the bytes as a string, since a little-endian machine stores the least significant byte first. Note also that objdump generated only two of the four bytes of value 00 at the end. Had we not added this padding, objdump would have become even more confused and would not have emitted all of the bytes we want.
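The byte reversal can be reproduced with a few shifts. The following is a minimal sketch; the helper name word_to_le_bytes is hypothetical and is not provided by the lab. It writes a 32-bit word into a byte array in little-endian order, which is the order in which the bytes must appear in an exploit string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a lab tool): store word in out[0..3],
 * least significant byte first, as an x86 machine lays it out in memory. */
void word_to_le_bytes(uint32_t word, unsigned char out[4])
{
    assert(out != NULL);

    out[0] = (unsigned char)(word & 0xff);
    out[1] = (unsigned char)((word >> 8) & 0xff);
    out[2] = (unsigned char)((word >> 16) & 0xff);
    out[3] = (unsigned char)((word >> 24) & 0xff);
}
```

For the data word 0xFEDCBA98 this produces 98 ba dc fe, matching the bytes read off the disassembly. The same ordering accounts for the immediate operand bytes ef cd ab 89 in the encoding of the pushl instruction.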
Finally, we can read off the byte sequence for our code (omitting the final 0s) as:
68 ef cd ab 89 83 c0 11 98 ba dc fe