18-213 Computer Systems: Attack Labs
1 Introduction
This assignment involves generating a total of five attacks on two programs which have different security
vulnerabilities. In this lab, you will:
• Learn different ways that attackers can exploit security vulnerabilities when programs do not safeguard
themselves well enough against buffer overflows.
• Gain a better understanding of how to write programs that are more secure, as well as some of the
features provided by compilers and operating systems to make programs less vulnerable.
• Gain a deeper understanding of the stack and parameter-passing mechanisms of x86-64 machine code.
• Gain a deeper understanding of how x86-64 instructions are encoded.
• Gain more experience with debugging tools such as GDB and OBJDUMP.
Note: In this lab, you will gain firsthand experience with methods used to exploit security weaknesses in
operating systems and network servers. Our purpose is to help you learn about the runtime operation of
programs and to understand the nature of these security weaknesses so that you can avoid them when you
write system code. We do not condone the use of any other form of attack to gain unauthorized access to
any system resources.
You will want to study Sections 3.10.3 and 3.10.4 of the book as reference material for this lab.
2 Logistics
As usual, this is an individual project. You will generate attacks for target programs that are custom generated for you.
2.1 Getting Files
You can obtain your files from the Autolab site
https://ics.autolabproject.com
After logging in to Autolab, select Attacklab -> Download handout. The Autolab server will build your files and return them to your browser in a tar file called targetk.tar, where k is the unique number of your target programs.
Note: It takes about 15 seconds to build and download your target, so please be patient.
Save the targetk.tar file in a (protected) Andrew directory in which you plan to do your work. Then log in to a shark machine and give the command: tar -xvf targetk.tar. This will extract a directory targetk containing the files described below.
You should only download one set of files. If for some reason you download multiple targets, choose one target to work on and delete the rest.
Warning: If you expand your targetk.tar on a PC, by using a utility such as Winzip, or letting your browser do the extraction, you’ll risk resetting permission bits on the executable files.
The files in targetk include:
README.txt: A file describing the contents of the directory
ctarget: An executable program vulnerable to code-injection attacks
rtarget: An executable program vulnerable to return-oriented-programming attacks
cookie.txt: An 8-digit hex code that you will use as a unique identifier in your attacks.
farm.c: The source code of your target’s “gadget farm,” which you will use in generating return-oriented programming attacks.
hex2raw: A utility to generate attack strings.
In the following instructions, we will assume that you have already copied the files to a protected local directory, and that you are executing the programs in that local directory.
2.2 Important Points
Here is a summary of some important rules regarding valid solutions for this lab. These points will not make much sense when you read this document for the first time. They are presented here as a central reference of rules once you get started.
• You must do the assignment on one of the class shark machines.
There are ten machines available to students, a full list of which can be found on the course Web site at:
http://www.cs.cmu.edu/~18213/labmachines.html
• Your solutions may not use attacks to circumvent the validation code in the programs.
• You may only construct gadgets from file rtarget with addresses ranging between those for functions start_farm and end_farm.
• You can use any gadgets you discover in the farm, not just those for which we give the byte codes in Figure 3.
• You are allowed to use the standard tools for this assignment: gdb (or lldb), objdump, and any tool that you create (from scratch) on your own.
• You are not allowed to use tools or gdb plugins that are designed to assist in implementing buffer overflow and ROP attacks.
• It is highly recommended that you use gdb to confirm that your exploit is placed on the stack with the correct byte ordering.
• It is not uncommon for gdb to get lost while executing exploit code. You may be unable to step through your exploit in gdb while seeing all of the usual diagnostic information.
3 Target Programs
Both CTARGET and RTARGET read strings from standard input. They do so with the function read_and_process_line defined below:
This function reads a byte sequence from standard input, terminated by either a newline ('\n') or end of file (EOF). Then it calls another function, process_line, passing it the bytes that were read. (It does not add a NUL terminator to the bytes it reads, so what it passes to process_line is not a “string.” Instead, it passes the number of bytes read as process_line’s second argument.)
In the code sample, you can see that read_and_process_line stores the byte sequence in a local variable buf, an array of BUFFER_SIZE bytes. (BUFFER_SIZE is a compile-time constant, specific to your version of CTARGET and RTARGET.) Notice that the while loop does not stop when BUFFER_SIZE bytes have been read.
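
The unbounded loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's source: the helper names read_unbounded and read_bounded, and the choice to take a FILE * argument, are hypothetical and exist only for this example. The second function shows one way a bounds check could be added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: stores bytes until newline or EOF with NO bounds
 * check. If the line is longer than the array behind buf, the writes run
 * past its end. Returns the number of bytes stored. */
size_t read_unbounded(FILE *in, unsigned char *buf) {
    assert(in != NULL && buf != NULL);
    size_t n = 0;
    int c;
    while ((c = getc(in)) != EOF && c != '\n') {
        buf[n++] = (unsigned char)c;   /* nothing stops n from exceeding the array size */
    }
    return n;
}

/* Hypothetical bounded variant: never stores more than cap bytes.
 * The rest of an overlong line is consumed and discarded. */
size_t read_bounded(FILE *in, unsigned char *buf, size_t cap) {
    assert(in != NULL && buf != NULL);
    size_t n = 0;
    int c;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (n < cap) {
            buf[n++] = (unsigned char)c;
        }
    }
    return n;
}
```

Calling read_unbounded(stdin, buf) with a line longer than buf writes past the end of the array, which is the kind of behavior the exploit strings in this lab rely on.
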
This is the same bug that’s found in the C library function gets: it keeps reading data until end of line or file, possibly overrunning the bounds of the storage allocated for the data.
(Note that the value on the Cookie: line will differ from yours.)
Program RTARGET will have the same behavior. As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed CTARGET and RTARGET so that they do more interesting things. These are called exploit strings.
Both CTARGET and RTARGET take several different command line arguments:
-h: Print list of possible command line arguments
-i FILE: Supply input from a file, rather than from standard input
Your exploit strings will typically contain byte values that do not correspond to the ASCII values for printing characters. The program HEX2RAW will enable you to generate these raw strings. See Appendix A for more information on how to use HEX2RAW.
Important points:
• Your exploit string must not contain byte value 0x0a at any intermediate position, since this is the ASCII code for newline ('\n'). When Gets encounters this byte, it will assume you intended to terminate the string.
• HEX2RAW expects two-digit hex values separated by one or more white spaces. So if you want to create a byte with a hex value of 0, you need to write it as 00. To create the word 0xdeadbeef you should pass “ef be ad de” to HEX2RAW (note the reversal required for little-endian byte ordering).
Unlike the Bomb Lab, there is no penalty for making mistakes in this lab. Feel free to fire away at CTARGET and RTARGET with any strings you like.
Figure 1 summarizes the five phases of the lab. As can be seen, the first three involve code-injection (CI) attacks on CTARGET, while the last two involve return-oriented-programming (ROP) attacks on RTARGET.
4 Part I: Code Injection Attacks
For the first three phases, your exploit strings will attack CTARGET.
This program is set up so that the stack positions will be consistent from one run to the next and so that data on the stack can be treated as executable code. These features make the program vulnerable to attacks where the exploit strings contain the byte encodings of executable code.
4.1 Level 1
For Phase 1, you will not inject new code. Instead, your exploit string will redirect the program to execute an existing procedure.
Function read_and_process_line is called within CTARGET by a function test having the following C code:
When read_and_process_line executes its return statement (line 5 of read_and_process_line), the program ordinarily resumes execution within function test (at line 5 of this function). We want to change this behavior. Within the file ctarget, there is code for a function touch1 having the following C representation:
Your task is to get CTARGET to execute the code for touch1 when read_and_process_line executes its return statement, rather than returning to test. Note that your exploit string may also corrupt parts of the stack not directly related to this stage, but this will not cause a problem, since touch1 causes the program to exit directly.
Some Advice:
• All the information you need to devise your exploit string for this level can be determined by examining a disassembled version of CTARGET. Use objdump -d to get this disassembled version.
• The idea is to position a byte representation of the starting address for touch1 so that the ret instruction at the end of the code for read_and_process_line will transfer control to touch1.
• Be careful about byte ordering.
Remember, arrays (such as strings) are saved in index order, but values like integers are stored in little-endian order.
• You might want to use GDB to step the program through the last few instructions of read_and_process_line to make sure it is doing the right thing.
• The placement of buf within the stack frame for read_and_process_line depends on the value of compile-time constant BUFFER_SIZE, as well as the allocation strategy used by GCC. You will need to examine the disassembled code to determine its position.
4.2 Level 2
Phase 2 involves injecting a small amount of code as part of your exploit string.
Within the file ctarget there is code for a function touch2 having the following C representation:
Your task is to get CTARGET to execute the code for touch2 rather than returning to test. In this case, however, you must make it appear to touch2 as if you have passed your cookie as its argument.
Some Advice:
• You will want to position a byte representation of the address of your injected code in such a way that the ret instruction at the end of the code for read_and_process_line will transfer control to it.
• Recall that the first argument to a function is passed in register %rdi.
• Your injected code should set the register to your cookie, and then use a ret instruction to transfer control to the first instruction in touch2.
• Do not attempt to use jmp or call instructions in your exploit code. The encodings of destination addresses for these instructions are difficult to formulate. Use ret instructions for all transfers of control, even when you are not returning from a call.
• See the discussion in Appendix B on how to use tools to generate the byte-level representations of instruction sequences.
4.3 Level 3
Phase 3 also involves a code injection attack, but passing a string as argument.
Within the file ctarget there is code for functions hexmatch and touch3 having the following C representations:
Your task is to get CTARGET to execute the code for touch3 rather than returning to test.
You must make it appear to touch3 as if you have passed a string representation of your cookie as its argument.
Some Advice:
• You will need to include a string representation of your cookie in your exploit string. The string should consist of the eight hexadecimal digits (ordered from most to least significant) without a leading “0x.”
• Recall that a string is represented in C as a sequence of bytes followed by a byte with value 0. Type “man ascii” on any Linux machine to see the byte representations of the characters you need.
• Your injected code should set register %rdi to the address of this string.
• When functions hexmatch and strncmp are called, they push data onto the stack, overwriting portions of memory that held the buffer used by read_and_process_line. As a result, you will need to be careful where you place the string representation of your cookie.
5 Part II: Return-Oriented Programming
Performing code-injection attacks on program RTARGET is much more difficult than it is for CTARGET, because it uses two techniques to thwart such attacks:
• It uses randomization so that the stack positions differ from one run to another. This makes it impossible to determine where your injected code will be located.
• It marks the section of memory holding the stack as nonexecutable, so even if you could set the program counter to the start of your injected code, the program would fail with a segmentation fault.
Fortunately, clever people have devised strategies for getting useful things done in a program by executing existing code, rather than injecting new code. The most general form of this is referred to as return-oriented programming (ROP) [1, 2]. The strategy with ROP is to identify byte sequences within an existing program that consist of one or more instructions followed by the instruction ret. Such a segment is referred to as a gadget. Figure 2 illustrates how the stack can be set up to execute a sequence of n gadgets.
In this figure, the stack contains a sequence of gadget addresses. Each gadget consists of a series of instruction bytes, with the final one being 0xc3, encoding the ret instruction. When the program executes a ret instruction starting with this configuration, it will initiate a chain of gadget executions, with the ret instruction at the end of each gadget causing the program to jump to the beginning of the next.
A gadget can make use of code corresponding to assembly-language statements generated by the compiler, especially ones at the ends of functions. In practice, there may be some useful gadgets of this form, but not enough to implement many important operations. For example, it is highly unlikely that a compiled function would have popq %rdi as its last instruction before ret. Fortunately, with a byte-oriented instruction set, such as x86-64, a gadget can often be found by extracting patterns from other parts of the instruction byte sequence.
The chances of this function being useful for attacking a system seem pretty slim. But, the disassembled machine code for this function shows an interesting byte sequence:
The byte sequence 48 89 c7 encodes the instruction movq %rax, %rdi. (See Figure 3A for the encodings of useful movq instructions.) This sequence is followed by byte value c3, which encodes the ret instruction. The function starts at address 0x400f15, and the sequence starts on the fourth byte of the function. Thus, this code contains a gadget, having a starting address of 0x400f18, that will copy the 64-bit value in register %rax to register %rdi.
Your code for RTARGET contains a number of functions similar to the setval_210 function shown above in a region we refer to as the gadget farm. Your job will be to identify useful gadgets in the gadget farm and use these to perform attacks similar to those you did in Phases 2 and 3.
Important: The gadget farm is demarcated by functions start_farm and end_farm in your copy of rtarget.
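
The address arithmetic in the setval_210 example (function start plus byte offset of the pattern) can be sketched in C. This is an illustrative helper, not part of the lab's tooling: the function names find_bytes and to_little_endian are hypothetical, and the code assumes you have already copied a function's machine-code bytes into an array, for example by reading them off objdump -d output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first occurrence of the
 * byte pattern pat (plen bytes) inside code (len bytes), or -1 if the
 * pattern does not occur. */
long find_bytes(const unsigned char *code, size_t len,
                const unsigned char *pat, size_t plen) {
    assert(code != NULL && pat != NULL && plen > 0);
    for (size_t i = 0; i + plen <= len; i++) {
        if (memcmp(code + i, pat, plen) == 0) {
            return (long)i;
        }
    }
    return -1;
}

/* Hypothetical helper: split a 64-bit value into 8 bytes, least
 * significant byte first, which is the order in which an address must
 * appear in an exploit string on x86-64. */
void to_little_endian(uint64_t value, unsigned char out[8]) {
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(value >> (8 * i));
    }
}
```

For a function at 0x400f15 whose fourth through seventh bytes are 48 89 c7 c3, find_bytes returns 3, so the gadget address is 0x400f15 + 3 = 0x400f18, and to_little_endian yields the bytes 18 0f 40 00 00 00 00 00.
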
Do not attempt to construct gadgets from other portions of the program code.
5.1 Level 2
For Phase 4, you will repeat the attack of Phase 2, but do so on program RTARGET using gadgets from your gadget farm. You can construct your solution using gadgets consisting of the following instruction types, and using only the first eight x86-64 registers (%rax–%rdi).
movq : The codes for these are shown in Figure 3A.
popq : The codes for these are shown in Figure 3B.
ret : This instruction is encoded by the single byte 0xc3.
nop : This instruction (pronounced “no op,” which is short for “no operation”) is encoded by the single byte 0x90. Its only effect is to cause the program counter to be incremented by 1.
Some Advice:
• All the gadgets you need can be found in the region of the code for rtarget demarcated by the functions start_farm and mid_farm.
• You can do this attack with just two gadgets.
• When a gadget uses a popq instruction, it will pop data from the stack. As a result, your exploit string will contain a combination of gadget addresses and data.
5.2 Level 3
Before you take on Phase 5, pause to consider what you have accomplished so far. In Phases 2 and 3, you caused a program to execute machine code of your own design. If CTARGET had been a network server, you could have injected your own code into a distant machine. In Phase 4, you circumvented two of the main devices modern systems use to thwart buffer overflow attacks. Although you did not inject your own code, you were able to inject a type of program that operates by stitching together sequences of existing code.
You have also gotten 95/100 points for the lab. That’s a good score. If you have other pressing obligations, consider stopping right now.
Phase 5 requires you to do an ROP attack on RTARGET to invoke function touch3 with a pointer to a string representation of your cookie. That may not seem significantly more difficult than using an ROP attack to invoke touch2, except that we have made it so.
Moreover, Phase 5 counts for only 5 points, which is not a true measure of the effort it will require. Think of it as more an extra credit problem for those who want to go beyond the normal expectations for the course.
To solve Phase 5, you can use gadgets in the region of the code in rtarget demarcated by functions start_farm and end_farm. In addition to the gadgets used in Phase 4, this expanded farm includes the encodings of different movl instructions, as shown in Figure 3C. The byte sequences in this part of the farm also contain 2-byte instructions that serve as functional nops, i.e., they do not change any register or memory values. These include instructions, shown in Figure 3D, such as andb %al,%al, that operate on the low-order bytes of some of the registers but do not change their values.
• You’ll want to review the effect a movl instruction has on the upper 4 bytes of a register, as is described on page 183 of the text.
• The official solution requires a sequence of eight gadgets. Depending on the contents of your farm (each target has a different one), you may be able to find a shorter one.
• Remember: Your exploit string must not contain the newline character (byte value 0x0a) at any intermediate position.
6 Logistical Notes
Whenever you correctly solve a phase, your target program will send a message to Autolab. The server will test your exploit string to make sure it really works and record the result on the scoreboard. (From Autolab, follow Attacklab -> View scoreboard.) This Web page is updated every minute or so to show the progress for each target. You should be sure to check this page after your submission to make sure your string has been validated. (If you really solved the phase, your string should be valid. But, an attack that works on your copy of one of the targets may be rejected by the version on the server, if it does not correctly implement the specified functionality.)
Each phase is graded individually.
You do not need to do them in the specified order, but you will get credit only for the phases for which the server receives a valid message.
Good luck and have fun!