【施工完成】CSAPP bomb lab

2020-02-01 · 13 min read · csapp

背景

疫情肆虐,在家百无聊赖,于是开始拆炸弹. 炸弹分为6个阶段,每个阶段必须输入一个特定的字符串,否则炸弹就会爆炸. 提供给我们的是一个.c文件和一个linux可执行文件bomb

  1
  2/***************************************************************************
  3 * Dr. Evil's Insidious Bomb, Version 1.1
  4 * Copyright 2011, Dr. Evil Incorporated. All rights reserved.
  5 *
  6 * LICENSE:
  7 *
  8 * Dr. Evil Incorporated (the PERPETRATOR) hereby grants you (the
  9 * VICTIM) explicit permission to use this bomb (the BOMB).  This is a
 10 * time limited license, which expires on the death of the VICTIM.
 11 * The PERPETRATOR takes no responsibility for damage, frustration,
 12 * insanity, bug-eyes, carpal-tunnel syndrome, loss of sleep, or other
 13 * harm to the VICTIM.  Unless the PERPETRATOR wants to take credit,
 14 * that is.  The VICTIM may not distribute this bomb source code to
 15 * any enemies of the PERPETRATOR.  No VICTIM may debug,
 16 * reverse-engineer, run "strings" on, decompile, decrypt, or use any
 17 * other technique to gain knowledge of and defuse the BOMB.  BOMB
 18 * proof clothing may not be worn when handling this program.  The
 19 * PERPETRATOR will not apologize for the PERPETRATOR's poor sense of
 20 * humor.  This license is null and void where the BOMB is prohibited
 21 * by law.
 22 ***************************************************************************/
 23
 24#include <stdio.h>
 25#include <stdlib.h>
 26#include "support.h"
 27#include "phases.h"
 28
 29/* 
 30 * Note to self: Remember to erase this file so my victims will have no
 31 * idea what is going on, and so they will all blow up in a
 32 * spectaculary fiendish explosion. -- Dr. Evil 
 33 */
 34
 35FILE *infile;
 36
 37int main(int argc, char *argv[])
 38{
 39    char *input;
 40
 41    /* Note to self: remember to port this bomb to Windows and put a 
 42     * fantastic GUI on it. */
 43
 44    /* When run with no arguments, the bomb reads its input lines 
 45     * from standard input. */
 46    if (argc == 1) {  
 47	infile = stdin;
 48    } 
 49
 50    /* When run with one argument <file>, the bomb reads from <file> 
 51     * until EOF, and then switches to standard input. Thus, as you 
 52     * defuse each phase, you can add its defusing string to <file> and
 53     * avoid having to retype it. */
 54    else if (argc == 2) {
 55	if (!(infile = fopen(argv[1], "r"))) {
 56	    printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);
 57	    exit(8);
 58	}
 59    }
 60
 61    /* You can't call the bomb with more than 1 command line argument. */
 62    else {
 63	printf("Usage: %s [<input_file>]\n", argv[0]);
 64	exit(8);
 65    }
 66
 67    /* Do all sorts of secret stuff that makes the bomb harder to defuse. */
 68    initialize_bomb();
 69
 70    printf("Welcome to my fiendish little bomb. You have 6 phases with\n");
 71    printf("which to blow yourself up. Have a nice day!\n");
 72
 73    /* Hmm...  Six phases must be more secure than one phase! */
 74    input = read_line();             /* Get input                   */
 75    phase_1(input);                  /* Run the phase               */
 76    phase_defused();                 /* Drat!  They figured it out!
 77				      * Let me know how they did it. */
 78    printf("Phase 1 defused. How about the next one?\n");
 79
 80    /* The second phase is harder.  No one will ever figure out
 81     * how to defuse this... */
 82    input = read_line();
 83    phase_2(input);
 84    phase_defused();
 85    printf("That's number 2.  Keep going!\n");
 86
 87    /* I guess this is too easy so far.  Some more complex code will
 88     * confuse people. */
 89    input = read_line();
 90    phase_3(input);
 91    phase_defused();
 92    printf("Halfway there!\n");
 93
 94    /* Oh yeah?  Well, how good is your math?  Try on this saucy problem! */
 95    input = read_line();
 96    phase_4(input);
 97    phase_defused();
 98    printf("So you got that one.  Try this one.\n");
 99    
100    /* Round and 'round in memory we go, where we stop, the bomb blows! */
101    input = read_line();
102    phase_5(input);
103    phase_defused();
104    printf("Good work!  On to the next...\n");
105
106    /* This phase will never be used, since no one will get past the
107     * earlier ones.  But just in case, make this one extra hard. */
108    input = read_line();
109    phase_6(input);
110    phase_defused();
111
112    /* Wow, they got it!  But isn't something... missing?  Perhaps
113     * something they overlooked?  Mua ha ha ha ha! */
114    
115    return 0;
116}

好了.开始用gdb拆炸弹.

拆除phase 1

首先看一下phase_1函数的反汇编代码:

 1
 2(gdb) disas phase_1
 3Dump of assembler code for function phase_1:
 4   0x0000000000400ee0 <+0>:     sub    $0x8,%rsp
 5   0x0000000000400ee4 <+4>:     mov    $0x402400,%esi
 6   0x0000000000400ee9 <+9>:     callq  0x401338 <strings_not_equal>
 7   0x0000000000400eee <+14>:    test   %eax,%eax
 8   0x0000000000400ef0 <+16>:    je     0x400ef7 <phase_1+23>
 9   0x0000000000400ef2 <+18>:    callq  0x40143a <explode_bomb>
10   0x0000000000400ef7 <+23>:    add    $0x8,%rsp
11   0x0000000000400efb <+27>:    retq   
12End of assembler dump.

可以看到phase_1函数通过调用strings_not_equal函数判断两个字符串是否相等,如果不相等就跳到explode_bomb炸掉,否则返回到main函数的执行流.

接下来我们看一下strings_not_equal,暂时省略中间无关的部分.

 1
 2(gdb) disas strings_not_equal
 3Dump of assembler code for function strings_not_equal:
 4   0x0000000000401338 <+0>:     push   %r12
 5   0x000000000040133a <+2>:     push   %rbp
 6   0x000000000040133b <+3>:     push   %rbx
 7   0x000000000040133c <+4>:     mov    %rdi,%rbx
 8   0x000000000040133f <+7>:     mov    %rsi,%rbp
 9   0x0000000000401342 <+10>:    callq  0x40131b <string_length>
10   0x0000000000401347 <+15>:    mov    %eax,%r12d
11   0x000000000040134a <+18>:    mov    %rbp,%rdi
12   0x000000000040134d <+21>:    callq  0x40131b <string_length>
13   0x0000000000401352 <+26>:    mov    $0x1,%edx
14   0x0000000000401357 <+31>:    cmp    %eax,%r12d
15   0x000000000040135a <+34>:    jne    0x40139b <strings_not_equal+99>
16    ...
17    ...
18    ...
19   0x000000000040139b <+99>:    mov    %edx,%eax
20   0x000000000040139d <+101>:   pop    %rbx
21   0x000000000040139e <+102>:   pop    %rbp
22   0x000000000040139f <+103>:   pop    %r12
23   0x00000000004013a1 <+105>:   retq

我们可以看到这里似乎是通过string_length函数求了两个字符串的长度,以此来比较两个函数是否相等. 求字符串长度总需要知道字符串是什么吧? 有戏. 我们再看一下string_length 函数

 1
 2(gdb) disas string_length
 3Dump of assembler code for function string_length:
 4   0x000000000040131b <+0>:     cmpb   $0x0,(%rdi)
 5   0x000000000040131e <+3>:     je     0x401332 <string_length+23>
 6   0x0000000000401320 <+5>:     mov    %rdi,%rdx
 7   0x0000000000401323 <+8>:     add    $0x1,%rdx
 8   0x0000000000401327 <+12>:    mov    %edx,%eax
 9   0x0000000000401329 <+14>:    sub    %edi,%eax
10   0x000000000040132b <+16>:    cmpb   $0x0,(%rdx)
11   0x000000000040132e <+19>:    jne    0x401323 <string_length+8>
12   0x0000000000401330 <+21>:    repz retq 
13   0x0000000000401332 <+23>:    mov    $0x0,%eax
14   0x0000000000401337 <+28>:    retq   
15End of assembler dump.

可以看出,该函数先判断字符串是否为空,然后根据c style 字符串末尾的'\0'作为循环的终止条件得到字符串的长度.

显然,字符串内容存储在寄存器 rdi的值所在的地址.

我们在string_length 函数开始的位置设置断点,得到两个字符串.第一个为用户输入的字符串,第二个就是phase_1 过关需要的字符串,问题解决!

 1
 2(gdb) b string_length
 3Breakpoint 1 at 0x40131b
 4(gdb) c
 5The program is not being run.
 6(gdb) r
 7Starting program: /home/coder/workspace/csapp-lab/bomblab/bomb 
 8Welcome to my fiendish little bomb. You have 6 phases with
 9which to blow yourself up. Have a nice day!
10hello from 111qqz 
11
12Breakpoint 1, 0x000000000040131b in string_length ()
13(gdb) print (char *) $rdi
14$1 = 0x603780 <input_strings> "hello from 111qqz "
15(gdb) c
16Continuing.
17
18Breakpoint 1, 0x000000000040131b in string_length ()
19(gdb) print (char *) $rdi
20$2 = 0x402400 "Border relations with Canada have never been better."
21(gdb)

拆除phase 2

接下来开始解决第二阶段.

看一下phase_2 的反汇编代码:

 1
 2disas phase_2
 3Dump of assembler code for function phase_2:
 4   0x0000000000400efc <+0>:     push   %rbp
 5   0x0000000000400efd <+1>:     push   %rbx
 6   0x0000000000400efe <+2>:     sub    $0x28,%rsp
 7   0x0000000000400f02 <+6>:     mov    %rsp,%rsi
 8   0x0000000000400f05 <+9>:     callq  0x40145c <read_six_numbers>
 9   0x0000000000400f0a <+14>:    cmpl   $0x1,(%rsp)
10   0x0000000000400f0e <+18>:    je     0x400f30 <phase_2+52>
11   0x0000000000400f10 <+20>:    callq  0x40143a <explode_bomb>
12   0x0000000000400f15 <+25>:    jmp    0x400f30 <phase_2+52>
13   0x0000000000400f17 <+27>:    mov    -0x4(%rbx),%eax
14   0x0000000000400f1a <+30>:    add    %eax,%eax
15   0x0000000000400f1c <+32>:    cmp    %eax,(%rbx)
16   0x0000000000400f1e <+34>:    je     0x400f25 <phase_2+41>
17   0x0000000000400f20 <+36>:    callq  0x40143a <explode_bomb>
18   0x0000000000400f25 <+41>:    add    $0x4,%rbx
19=> 0x0000000000400f29 <+45>:    cmp    %rbp,%rbx
20   0x0000000000400f2c <+48>:    jne    0x400f17 <phase_2+27>
21   0x0000000000400f2e <+50>:    jmp    0x400f3c <phase_2+64>
22   0x0000000000400f30 <+52>:    lea    0x4(%rsp),%rbx
23   0x0000000000400f35 <+57>:    lea    0x18(%rsp),%rbp
24   0x0000000000400f3a <+62>:    jmp    0x400f17 <phase_2+27>
25   0x0000000000400f3c <+64>:    add    $0x28,%rsp
26   0x0000000000400f40 <+68>:    pop    %rbx
27   0x0000000000400f41 <+69>:    pop    %rbp
28   0x0000000000400f42 <+70>:    retq   
29End of assembler dump.

接下来看一下read_six_numbers函数

 1
 2disas read_six_numbers 
 3Dump of assembler code for function read_six_numbers:
 4   0x000000000040145c <+0>:     sub    $0x18,%rsp
 5   0x0000000000401460 <+4>:     mov    %rsi,%rdx
 6   0x0000000000401463 <+7>:     lea    0x4(%rsi),%rcx
 7   0x0000000000401467 <+11>:    lea    0x14(%rsi),%rax
 8   0x000000000040146b <+15>:    mov    %rax,0x8(%rsp)
 9   0x0000000000401470 <+20>:    lea    0x10(%rsi),%rax
10   0x0000000000401474 <+24>:    mov    %rax,(%rsp)
11   0x0000000000401478 <+28>:    lea    0xc(%rsi),%r9
12   0x000000000040147c <+32>:    lea    0x8(%rsi),%r8
13   0x0000000000401480 <+36>:    mov    $0x4025c3,%esi
14   0x0000000000401485 <+41>:    mov    $0x0,%eax
15   0x000000000040148a <+46>:    callq  0x400bf0 <__isoc99_sscanf@plt>
16   0x000000000040148f <+51>:    cmp    $0x5,%eax
17   0x0000000000401492 <+54>:    jg     0x401499 <read_six_numbers+61>
18   0x0000000000401494 <+56>:    callq  0x40143a <explode_bomb>
19   0x0000000000401499 <+61>:    add    $0x18,%rsp
20   0x000000000040149d <+65>:    retq   
21End of assembler dump.

看起来是从stdin读了六个值,如果输入值的个数不足6个,就会炸掉.

从

1
2 0x0000000000401480 <+36>:    mov    $0x4025c3,%esi
3 (gdb) print (char *) 0x4025c3
4$155 = 0x4025c3 "%d %d %d %d %d %d"

可以得出,读入的六个值的类型都为int.

从read_six_numbers的 [+0, +32] 部分的代码可以得出,读取的六个数的存储位置. 依次为:

 1
 2%rsi     %rdx
 3%rsi+4   %rcx
 4%rsi+8   %r8
 5%rsi+12  %r9
 6%rsi+16  (%rsp) 或者 %rax
 7%rsi+20  (%rsp +8)
 8
 9
10(gdb) print /x $rdx
11$171 = 0x7fffffffdae0
12(gdb) print /x $rcx
13$172 = 0x7fffffffdae4
14(gdb) print /x $r8
15$173 = 0x7fffffffdae8
16(gdb) print /x $r9
17$174 = 0x7fffffffdaec
18(gdb) print /x $rax
19$175 = 0x7fffffffdaf0
20(gdb) print /x *(int *) ($rsp+8)
21$176 = 0xffffdaf4

需要注意的是这些汇编代码中的常数都是十六进制

接下来我们可以从phase_2的 [+14,+20]行中看出,

1
2   0x0000000000400f02 <+6>:     mov    %rsp,%rsi
3   0x0000000000400f05 <+9>:     callq  0x40145c <read_six_numbers>
4   0x0000000000400f0a <+14>:    cmpl   $0x1,(%rsp)
5   0x0000000000400f0e <+18>:    je     0x400f30 <phase_2+52>
6   0x0000000000400f10 <+20>:    callq  0x40143a <explode_bomb>

(%rsp)的值必须为1,否则炸弹就炸了. 而(%rsp)的值,实际上就是read_six_numbers中 %rsi的值.

因此也就是读取的第一个数 %rsi 必须为 1.

第一个数正确后,会跳转到:

1
2
3   0x0000000000400f30 <+52>:    lea    0x4(%rsp),%rbx
4   0x0000000000400f35 <+57>:    lea    0x18(%rsp),%rbp
5   0x0000000000400f3a <+62>:    jmp    0x400f17 <phase_2+27>

可以看到 %rbx的值为读取的第二个数,而%rbp的值为读取的六个数后一个位置,猜想是类似EOF或者EOL之类标示stdin结束的值.

接下来查看 [+27,+48]行代码

1
2   0x0000000000400f17 <+27>:    mov    -0x4(%rbx),%eax
3   0x0000000000400f1a <+30>:    add    %eax,%eax
4   0x0000000000400f1c <+32>:    cmp    %eax,(%rbx)
5   0x0000000000400f1e <+34>:    je     0x400f25 <phase_2+41>
6   0x0000000000400f20 <+36>:    callq  0x40143a <explode_bomb>
7   0x0000000000400f25 <+41>:    add    $0x4,%rbx
8=> 0x0000000000400f29 <+45>:    cmp    %rbp,%rbx
9   0x0000000000400f2c <+48>:    jne    0x400f17 <phase_2+27>

可以看出这段代码是一个循环,循环终止条件为 %rbx等于 %rbp, 证实了刚刚 %rbp是某种终止符的猜想.

看起来是从第二个数,每个数(%rbx)必须等于前一个数 %eax,也就是 (%rbx - 4)的二倍,否则炸断就会炸.

因此推断,这六个数分别为: 1 2 4 8 16 32

问题解决!

拆除phase 3

接下来开始解决第三阶段.

看一下phase_3 的反汇编代码:

 1
 2(gdb) disas phase_3
 3Dump of assembler code for function phase_3:
 4   0x0000000000400f43 <+0>:     sub    $0x18,%rsp
 5   0x0000000000400f47 <+4>:     lea    0xc(%rsp),%rcx
 6   0x0000000000400f4c <+9>:     lea    0x8(%rsp),%rdx
 7   0x0000000000400f51 <+14>:    mov    $0x4025cf,%esi
 8   0x0000000000400f56 <+19>:    mov    $0x0,%eax
 9   0x0000000000400f5b <+24>:    callq  0x400bf0 <__isoc99_sscanf@plt>
10   0x0000000000400f60 <+29>:    cmp    $0x1,%eax
11   0x0000000000400f63 <+32>:    jg     0x400f6a <phase_3+39>
12   0x0000000000400f65 <+34>:    callq  0x40143a <explode_bomb>
13   0x0000000000400f6a <+39>:    cmpl   $0x7,0x8(%rsp)
14   0x0000000000400f6f <+44>:    ja     0x400fad <phase_3+106>
15   0x0000000000400f71 <+46>:    mov    0x8(%rsp),%eax
16=> 0x0000000000400f75 <+50>:    jmpq   *0x402470(,%rax,8)
17   0x0000000000400f7c <+57>:    mov    $0xcf,%eax
18   0x0000000000400f81 <+62>:    jmp    0x400fbe <phase_3+123>
19   0x0000000000400f83 <+64>:    mov    $0x2c3,%eax
20   0x0000000000400f88 <+69>:    jmp    0x400fbe <phase_3+123>
21   0x0000000000400f8a <+71>:    mov    $0x100,%eax
22   0x0000000000400f8f <+76>:    jmp    0x400fbe <phase_3+123>
23   0x0000000000400f91 <+78>:    mov    $0x185,%eax
24   0x0000000000400f96 <+83>:    jmp    0x400fbe <phase_3+123>
25   0x0000000000400f98 <+85>:    mov    $0xce,%eax
26   0x0000000000400f9d <+90>:    jmp    0x400fbe <phase_3+123>
27   0x0000000000400f9f <+92>:    mov    $0x2aa,%eax
28   0x0000000000400fa4 <+97>:    jmp    0x400fbe <phase_3+123>
29   0x0000000000400fa6 <+99>:    mov    $0x147,%eax
30   0x0000000000400fab <+104>:   jmp    0x400fbe <phase_3+123>
31   0x0000000000400fad <+106>:   callq  0x40143a <explode_bomb>
32   0x0000000000400fb2 <+111>:   mov    $0x0,%eax
33   0x0000000000400fb7 <+116>:   jmp    0x400fbe <phase_3+123>
34   0x0000000000400fb9 <+118>:   mov    $0x137,%eax
35   0x0000000000400fbe <+123>:   cmp    0xc(%rsp),%eax
36   0x0000000000400fc2 <+127>:   je     0x400fc9 <phase_3+134>
37   0x0000000000400fc4 <+129>:   callq  0x40143a <explode_bomb>
38   0x0000000000400fc9 <+134>:   add    $0x18,%rsp
39   0x0000000000400fcd <+138>:   retq   
40End of assembler dump.

可以看出这是一段switch 语句从stdin读了两个值,第一个数为case 的label,然后得到的结果与读入的第二个数比较,不相等就会爆炸.

从

1
2(gdb) print (char *) 0x4025cf
3$18 = 0x4025cf "%d %d"

读入的两个值都为int类型.

根据如下代码可以得知,第一个值的范围在[0,7]之间.

1
2   0x0000000000400f6a <+39>:    cmpl   $0x7,0x8(%rsp)
3   0x0000000000400f6f <+44>:    ja     0x400fad <phase_3+106>

[0,7]区间内共有8个整数,于是我们检查从地址 0x402470 开始的8个word,得到:

1
2(gdb) x/16w  0x402470
30x402470:       0x00400f7c      0x00000000      0x00400fb9      0x00000000
40x402480:       0x00400f83      0x00000000      0x00400f8a      0x00000000
50x402490:       0x00400f91      0x00000000      0x00400f98      0x00000000
60x4024a0:       0x00400f9f      0x00000000      0x00400fa6      0x00000000

再对照 phase_3 的汇编代码,得到如下 case label与返回值的对应关系,其中括号内为十进制表示.

1
20: 0xcf(207)
31: 0x137(311)
42: 0x2c3(707)
53: 0x100(256)
64: 0x185(389)
75: 0xce(206)
86: 0x2aa(682)
97: 0x147(327)

因此phase 3的答案有上述8对,输入任意一对即可.

拆除phase 4

接下来开始解决第四阶段.

汇编代码为:

 1
 2disas  phase_4
 3Dump of assembler code for function phase_4:
 4   0x000000000040100c <+0>:     sub    $0x18,%rsp
 5   0x0000000000401010 <+4>:     lea    0xc(%rsp),%rcx
 6   0x0000000000401015 <+9>:     lea    0x8(%rsp),%rdx
 7   0x000000000040101a <+14>:    mov    $0x4025cf,%esi
 8   0x000000000040101f <+19>:    mov    $0x0,%eax
 9   0x0000000000401024 <+24>:    callq  0x400bf0 <__isoc99_sscanf@plt>
10   0x0000000000401029 <+29>:    cmp    $0x2,%eax
11   0x000000000040102c <+32>:    jne    0x401035 <phase_4+41>
12   0x000000000040102e <+34>:    cmpl   $0xe,0x8(%rsp)
13   0x0000000000401033 <+39>:    jbe    0x40103a <phase_4+46>
14   0x0000000000401035 <+41>:    callq  0x40143a <explode_bomb>
15   0x000000000040103a <+46>:    mov    $0xe,%edx
16   0x000000000040103f <+51>:    mov    $0x0,%esi
17   0x0000000000401044 <+56>:    mov    0x8(%rsp),%edi
18   0x0000000000401048 <+60>:    callq  0x400fce <func4>
19   0x000000000040104d <+65>:    test   %eax,%eax
20   0x000000000040104f <+67>:    jne    0x401058 <phase_4+76>
21   0x0000000000401051 <+69>:    cmpl   $0x0,0xc(%rsp)
22   0x0000000000401056 <+74>:    je     0x40105d <phase_4+81>
23   0x0000000000401058 <+76>:    callq  0x40143a <explode_bomb>
24   0x000000000040105d <+81>:    add    $0x18,%rsp
25   0x0000000000401061 <+85>:    retq   
26End of assembler dump.

根据

1
2(gdb) print (char *) 0x4025cf
3$1 = 0x4025cf "%d %d"

可以得知,依然是读取了两个int类型的值. 第一个值存储在 (%rsp+0x8),第二个值存储在(%rsp+0xc)

根据 [+34,+41] 得知, 第一个数的范围是[0,14]

接下来查看func4的汇编代码:

 1
 2(gdb) disas func4
 3Dump of assembler code for function func4:
 4   0x0000000000400fce <+0>:     sub    $0x8,%rsp
 5   0x0000000000400fd2 <+4>:     mov    %edx,%eax
 6   0x0000000000400fd4 <+6>:     sub    %esi,%eax
 7   0x0000000000400fd6 <+8>:     mov    %eax,%ecx
 8   0x0000000000400fd8 <+10>:    shr    $0x1f,%ecx
 9   0x0000000000400fdb <+13>:    add    %ecx,%eax
10   0x0000000000400fdd <+15>:    sar    %eax
11   0x0000000000400fdf <+17>:    lea    (%rax,%rsi,1),%ecx
12   0x0000000000400fe2 <+20>:    cmp    %edi,%ecx
13   0x0000000000400fe4 <+22>:    jle    0x400ff2 <func4+36>
14   0x0000000000400fe6 <+24>:    lea    -0x1(%rcx),%edx
15   0x0000000000400fe9 <+27>:    callq  0x400fce <func4>
16   0x0000000000400fee <+32>:    add    %eax,%eax
17   0x0000000000400ff0 <+34>:    jmp    0x401007 <func4+57>
18   0x0000000000400ff2 <+36>:    mov    $0x0,%eax
19   0x0000000000400ff7 <+41>:    cmp    %edi,%ecx
20   0x0000000000400ff9 <+43>:    jge    0x401007 <func4+57>
21   0x0000000000400ffb <+45>:    lea    0x1(%rcx),%esi
22   0x0000000000400ffe <+48>:    callq  0x400fce <func4>
23   0x0000000000401003 <+53>:    lea    0x1(%rax,%rax,1),%eax
24   0x0000000000401007 <+57>:    add    $0x8,%rsp
25   0x000000000040100b <+61>:    retq   
26End of assembler dump.

这段代码比较复杂,可以使用gdb watchpoints来观察几个寄存器的变化.

调试的时候要注意,需要先disable这些寄存器上的watchpoints,当进入到func4函数中再开启.

最终可以艰难得发现, 读入的第一个值应该为0.

根据phase_4 汇编代码[+69,+76]

1
2   0x0000000000401051 <+69>:    cmpl   $0x0,0xc(%rsp)
3   0x0000000000401056 <+74>:    je     0x40105d <phase_4+81>
4   0x0000000000401058 <+76>:    callq  0x40143a <explode_bomb>

可以得到读入的第二个值应该为0.

此外,我们注意到第一个值也只有15种可能的取值情况. 第二个值已经确定,因此总的取值情况也只有15种.

不妨暴力枚举得到答案.

综上,输入的两个值为 0 0

拆除phase 5

接下来开始解决第五阶段.

看一下反汇编代码:

 1
 2(gdb) disas phase_5
 3Dump of assembler code for function phase_5:
 4   0x0000000000401062 <+0>:     push   %rbx
 5   0x0000000000401063 <+1>:     sub    $0x20,%rsp
 6   0x0000000000401067 <+5>:     mov    %rdi,%rbx
 7   0x000000000040106a <+8>:     mov    %fs:0x28,%rax
 8   0x0000000000401073 <+17>:    mov    %rax,0x18(%rsp)
 9   0x0000000000401078 <+22>:    xor    %eax,%eax
10   0x000000000040107a <+24>:    callq  0x40131b <string_length>
11   0x000000000040107f <+29>:    cmp    $0x6,%eax
12   0x0000000000401082 <+32>:    je     0x4010d2 <phase_5+112>
13   0x0000000000401084 <+34>:    callq  0x40143a <explode_bomb>
14   0x0000000000401089 <+39>:    jmp    0x4010d2 <phase_5+112>
15   0x000000000040108b <+41>:    movzbl (%rbx,%rax,1),%ecx
16   0x000000000040108f <+45>:    mov    %cl,(%rsp)
17   0x0000000000401092 <+48>:    mov    (%rsp),%rdx
18   0x0000000000401096 <+52>:    and    $0xf,%edx
19   0x0000000000401099 <+55>:    movzbl 0x4024b0(%rdx),%edx
20   0x00000000004010a0 <+62>:    mov    %dl,0x10(%rsp,%rax,1)
21   0x00000000004010a4 <+66>:    add    $0x1,%rax
22   0x00000000004010a8 <+70>:    cmp    $0x6,%rax
23   0x00000000004010ac <+74>:    jne    0x40108b <phase_5+41>
24   0x00000000004010ae <+76>:    movb   $0x0,0x16(%rsp)
25   0x00000000004010b3 <+81>:    mov    $0x40245e,%esi
26   0x00000000004010b8 <+86>:    lea    0x10(%rsp),%rdi
27=> 0x00000000004010bd <+91>:    callq  0x401338 <strings_not_equal>
28   0x00000000004010c2 <+96>:    test   %eax,%eax
29   0x00000000004010c4 <+98>:    je     0x4010d9 <phase_5+119>
30   0x00000000004010c6 <+100>:   callq  0x40143a <explode_bomb>
31   0x00000000004010cb <+105>:   nopl   0x0(%rax,%rax,1)
32   0x00000000004010d0 <+110>:   jmp    0x4010d9 <phase_5+119>
33   0x00000000004010d2 <+112>:   mov    $0x0,%eax
34   0x00000000004010d7 <+117>:   jmp    0x40108b <phase_5+41>
35   0x00000000004010d9 <+119>:   mov    0x18(%rsp),%rax
36   0x00000000004010de <+124>:   xor    %fs:0x28,%rax
37   0x00000000004010e7 <+133>:   je     0x4010ee <phase_5+140>
38   0x00000000004010e9 <+135>:   callq  0x400b30 <__stack_chk_fail@plt>
39   0x00000000004010ee <+140>:   add    $0x20,%rsp
40   0x00000000004010f2 <+144>:   pop    %rbx
41   0x00000000004010f3 <+145>:   retq   
42End of assembler dump.

从下面的代码可以得知,读入的字符串的长度为6

1
2   0x0000000000401078 <+22>:    xor    %eax,%eax
3   0x000000000040107a <+24>:    callq  0x40131b <string_length>
4   0x000000000040107f <+29>:    cmp    $0x6,%eax
5    0x0000000000401082 <+32>:    je     0x4010d2 <phase_5+112>
6   0x0000000000401084 <+34>:    callq  0x40143a <explode_bomb>

接下来分析这段代码中比较难懂的部分:

 1
 2   0x000000000040108b <+41>:    movzbl (%rbx,%rax,1),%ecx 取读入字符串的第一个字符
 3   0x000000000040108f <+45>:    mov    %cl,(%rsp)    %cl与%ecx是相同的,原因是ascii的取值范围可以通过一个字节表示.
 4   0x0000000000401092 <+48>:    mov    (%rsp),%rdx
 5   0x0000000000401096 <+52>:    and    $0xf,%edx    将该字符的ascii值与0xf取and,也就是截取该ascii值的后四个bit,得到的值的范围在0..15,该值存储在%edx中
 6   0x0000000000401099 <+55>:    movzbl 0x4024b0(%rdx),%edx  %rdx作为存储在$0x4024b0位置的下标索引
 7   0x00000000004010a0 <+62>:    mov    %dl,0x10(%rsp,%rax,1)  #将得到的新的字符存放在以(%rsp+0x10)开始的位置
 8   0x00000000004010a4 <+66>:    add    $0x1,%rax
 9   0x00000000004010a8 <+70>:    cmp    $0x6,%rax
10   0x00000000004010ac <+74>:    jne    0x40108b <phase_5+41>

大致可以看出这是一个循环,依次处理每个字符.

在进入这个循环之前, %rbx 存储着读入字符串的指针,%rax在[+112] 清空了,是0x0

$0x4024b0 位置开始的长度16的候选字符串为 "maduiersnfotvbyl"

因此上面这段循环做的就是从上述候选字符串中选6个字符,依次存储在从(%rsp+0x10)开始的位置

接下来我们看这段代码:

1
2   0x00000000004010b3 <+81>:    mov    $0x40245e,%esi
3   0x00000000004010b8 <+86>:    lea    0x10(%rsp),%rdi
4=> 0x00000000004010bd <+91>:    callq  0x401338 <strings_not_equal>
5   0x00000000004010c2 <+96>:    test   %eax,%eax
6   0x00000000004010c4 <+98>:    je     0x4010d9 <phase_5+119>
7   0x00000000004010c6 <+100>:   callq  0x40143a <explode_bomb>

发现是要求经过变换后的字符串与某个给定的字符串相等

给定的字符串为"flyers",找出这6个字符再候选字符串中的位置

分别为 9,15,14,5,6,7 写成十六进制就是 0x9,0xf,0xe,0x5,0x6,0x7

查看ascii表,找到6个最后的十六进制值为上述值的ascii 码,就是答案.

显然,答案有多种. 其中一种答案为:

ionefg

问题解决!

拆除phase 6

一口老血...phase 6真是恶心....花的时间比前面五个加起来还要多

接下来开始解决第六阶段.

看一下反汇编代码:

 1
 2(gdb) disas phase_6
 3Dump of assembler code for function phase_6:
 4   0x00000000004010f4 <+0>:     push   %r14
 5   0x00000000004010f6 <+2>:     push   %r13
 6   0x00000000004010f8 <+4>:     push   %r12
 7   0x00000000004010fa <+6>:     push   %rbp
 8   0x00000000004010fb <+7>:     push   %rbx
 9   0x00000000004010fc <+8>:     sub    $0x50,%rsp
10   0x0000000000401100 <+12>:    mov    %rsp,%r13
11   0x0000000000401103 <+15>:    mov    %rsp,%rsi
12   0x0000000000401106 <+18>:    callq  0x40145c <read_six_numbers>
13   0x000000000040110b <+23>:    mov    %rsp,%r14
14   0x000000000040110e <+26>:    mov    $0x0,%r12d
15   0x0000000000401114 <+32>:    mov    %r13,%rbp
16   0x0000000000401117 <+35>:    mov    0x0(%r13),%eax
17   0x000000000040111b <+39>:    sub    $0x1,%eax
18   0x000000000040111e <+42>:    cmp    $0x5,%eax
19   0x0000000000401121 <+45>:    jbe    0x401128 <phase_6+52>
20   0x0000000000401123 <+47>:    callq  0x40143a <explode_bomb>
21   0x0000000000401128 <+52>:    add    $0x1,%r12d
22   0x000000000040112c <+56>:    cmp    $0x6,%r12d
23   0x0000000000401130 <+60>:    je     0x401153 <phase_6+95>
24   0x0000000000401132 <+62>:    mov    %r12d,%ebx
25   0x0000000000401135 <+65>:    movslq %ebx,%rax
26   0x0000000000401138 <+68>:    mov    (%rsp,%rax,4),%eax
27   0x000000000040113b <+71>:    cmp    %eax,0x0(%rbp)
28   0x000000000040113e <+74>:    jne    0x401145 <phase_6+81>
29   0x0000000000401140 <+76>:    callq  0x40143a <explode_bomb>
30   0x0000000000401145 <+81>:    add    $0x1,%ebx
31   0x0000000000401148 <+84>:    cmp    $0x5,%ebx
32   0x000000000040114b <+87>:    jle    0x401135 <phase_6+65>
33   0x000000000040114d <+89>:    add    $0x4,%r13
34   0x0000000000401151 <+93>:    jmp    0x401114 <phase_6+32>
35   0x0000000000401153 <+95>:    lea    0x18(%rsp),%rsi
36   0x0000000000401158 <+100>:   mov    %r14,%rax
37   0x000000000040115b <+103>:   mov    $0x7,%ecx
38   0x0000000000401160 <+108>:   mov    %ecx,%edx
39   0x0000000000401162 <+110>:   sub    (%rax),%edx
40   0x0000000000401164 <+112>:   mov    %edx,(%rax)
41   0x0000000000401166 <+114>:   add    $0x4,%rax
42   0x000000000040116a <+118>:   cmp    %rsi,%rax
43   0x000000000040116d <+121>:   jne    0x401160 <phase_6+108>
44   0x000000000040116f <+123>:   mov    $0x0,%esi
45   0x0000000000401174 <+128>:   jmp    0x401197 <phase_6+163>
46   0x0000000000401176 <+130>:   mov    0x8(%rdx),%rdx
47   0x000000000040117a <+134>:   add    $0x1,%eax
48   0x000000000040117d <+137>:   cmp    %ecx,%eax
49   0x000000000040117f <+139>:   jne    0x401176 <phase_6+130>
50   0x0000000000401181 <+141>:   jmp    0x401188 <phase_6+148>
51   0x0000000000401183 <+143>:   mov    $0x6032d0,%edx
52   0x0000000000401188 <+148>:   mov    %rdx,0x20(%rsp,%rsi,2)
53   0x000000000040118d <+153>:   add    $0x4,%rsi
54   0x0000000000401191 <+157>:   cmp    $0x18,%rsi
55   0x0000000000401195 <+161>:   je     0x4011ab <phase_6+183>
56   0x0000000000401197 <+163>:   mov    (%rsp,%rsi,1),%ecx
57   0x000000000040119a <+166>:   cmp    $0x1,%ecx
58   0x000000000040119d <+169>:   jle    0x401183 <phase_6+143>
59   0x000000000040119f <+171>:   mov    $0x1,%eax
60   0x00000000004011a4 <+176>:   mov    $0x6032d0,%edx
61   0x00000000004011a9 <+181>:   jmp    0x401176 <phase_6+130>
62   0x00000000004011ab <+183>:   mov    0x20(%rsp),%rbx
63   0x00000000004011b0 <+188>:   lea    0x28(%rsp),%rax
64   0x00000000004011b5 <+193>:   lea    0x50(%rsp),%rsi
65   0x00000000004011ba <+198>:   mov    %rbx,%rcx
66   0x00000000004011bd <+201>:   mov    (%rax),%rdx
67   0x00000000004011c0 <+204>:   mov    %rdx,0x8(%rcx)
68   0x00000000004011c4 <+208>:   add    $0x8,%rax
69   0x00000000004011c8 <+212>:   cmp    %rsi,%rax
70   0x00000000004011cb <+215>:   je     0x4011d2 <phase_6+222>
71   0x00000000004011cd <+217>:   mov    %rdx,%rcx
72   0x00000000004011d0 <+220>:   jmp    0x4011bd <phase_6+201>
73   0x00000000004011d2 <+222>:   movq   $0x0,0x8(%rdx)
74   0x00000000004011da <+230>:   mov    $0x5,%ebp
75   0x00000000004011df <+235>:   mov    0x8(%rbx),%rax
76   0x00000000004011e3 <+239>:   mov    (%rax),%eax
77=> 0x00000000004011e5 <+241>:   cmp    %eax,(%rbx)
78   0x00000000004011e7 <+243>:   jge    0x4011ee <phase_6+250>
79   0x00000000004011e9 <+245>:   callq  0x40143a <explode_bomb>
80   0x00000000004011ee <+250>:   mov    0x8(%rbx),%rbx
81   0x00000000004011f2 <+254>:   sub    $0x1,%ebp
82   0x00000000004011f5 <+257>:   jne    0x4011df <phase_6+235>
83   0x00000000004011f7 <+259>:   add    $0x50,%rsp
84   0x00000000004011fb <+263>:   pop    %rbx
85   0x00000000004011fc <+264>:   pop    %rbp
86   0x00000000004011fd <+265>:   pop    %r12
87   0x00000000004011ff <+267>:   pop    %r13
88   0x0000000000401201 <+269>:   pop    %r14
89   0x0000000000401203 <+271>:   retq   
90End of assembler dump.

前半部分代码还算比较好懂:

 1
 2   0x0000000000401106 <+18>:    callq  0x40145c <read_six_numbers>
 3   0x000000000040110b <+23>:    mov    %rsp,%r14
 4   0x000000000040110e <+26>:    mov    $0x0,%r12d
 5   0x0000000000401114 <+32>:    mov    %r13,%rbp
 6   0x0000000000401117 <+35>:    mov    0x0(%r13),%eax
 7   0x000000000040111b <+39>:    sub    $0x1,%eax
 8   0x000000000040111e <+42>:    cmp    $0x5,%eax
 9   0x0000000000401121 <+45>:    jbe    0x401128 <phase_6+52>
10   0x0000000000401123 <+47>:    callq  0x40143a <explode_bomb>
11   0x0000000000401128 <+52>:    add    $0x1,%r12d
12   0x000000000040112c <+56>:    cmp    $0x6,%r12d
13   0x0000000000401130 <+60>:    je     0x401153 <phase_6+95>
14   0x0000000000401132 <+62>:    mov    %r12d,%ebx
15   0x0000000000401135 <+65>:    movslq %ebx,%rax
16   0x0000000000401138 <+68>:    mov    (%rsp,%rax,4),%eax
17   0x000000000040113b <+71>:    cmp    %eax,0x0(%rbp)
18   0x000000000040113e <+74>:    jne    0x401145 <phase_6+81>
19   0x0000000000401140 <+76>:    callq  0x40143a <explode_bomb>
20   0x0000000000401145 <+81>:    add    $0x1,%ebx
21   0x0000000000401148 <+84>:    cmp    $0x5,%ebx
22   0x000000000040114b <+87>:    jle    0x401135 <phase_6+65>
23   0x000000000040114d <+89>:    add    $0x4,%r13
24   0x0000000000401151 <+93>:    jmp    0x401114 <phase_6+32>

这段代码是读取6个int 类型的数,然后每个数必须属于[1,6],并且互不相同. 因此输入的6个数为1..6的6个数字的一个排列.

接下来也比较好懂,就是把每个读入的数x变成了7-x.

为了方便说明,接下来省略提及这个变换,将7-x当成读作的数

 1
 2   0x0000000000401153 <+95>:    lea    0x18(%rsp),%rsi
 3   0x0000000000401158 <+100>:   mov    %r14,%rax
 4   0x000000000040115b <+103>:   mov    $0x7,%ecx
 5   0x0000000000401160 <+108>:   mov    %ecx,%edx
 6   0x0000000000401162 <+110>:   sub    (%rax),%edx
 7   0x0000000000401164 <+112>:   mov    %edx,(%rax)
 8   0x0000000000401166 <+114>:   add    $0x4,%rax
 9   0x000000000040116a <+118>:   cmp    %rsi,%rax
10   0x000000000040116d <+121>:   jne    0x401160 <phase_6+108>

接下来这段简直吐血:

 1
 2
 3   0x000000000040116f <+123>:   mov    $0x0,%esi
 4   0x0000000000401174 <+128>:   jmp    0x401197 <phase_6+163>
 5   0x0000000000401176 <+130>:   mov    0x8(%rdx),%rdx
 6   0x000000000040117a <+134>:   add    $0x1,%eax
 7   0x000000000040117d <+137>:   cmp    %ecx,%eax
 8   0x000000000040117f <+139>:   jne    0x401176 <phase_6+130>
 9   0x0000000000401181 <+141>:   jmp    0x401188 <phase_6+148>
10   0x0000000000401183 <+143>:   mov    $0x6032d0,%edx
11   0x0000000000401188 <+148>:   mov    %rdx,0x20(%rsp,%rsi,2)
12   0x000000000040118d <+153>:   add    $0x4,%rsi
13   0x0000000000401191 <+157>:   cmp    $0x18,%rsi
14   0x0000000000401195 <+161>:   je     0x4011ab <phase_6+183>
15   0x0000000000401197 <+163>:   mov    (%rsp,%rsi,1),%ecx
16   0x000000000040119a <+166>:   cmp    $0x1,%ecx
17   0x000000000040119d <+169>:   jle    0x401183 <phase_6+143>
18   0x000000000040119f <+171>:   mov    $0x1,%eax
19   0x00000000004011a4 <+176>:   mov    $0x6032d0,%edx
20   0x00000000004011a9 <+181>:   jmp    0x401176 <phase_6+130>

在进入上述这段代码之前, %rsp 存储着第一个读入数的内存地址,%ecx是常数7

要注意两个语句,一个是 mov (%rsp,%rsi,1),%ecx 中(%rsp,%rsi,1)表示的是读入的数存放的位置

一个是 mov %rdx,0x20(%rsp,%rsi,2) 中 0x20(%rsp,%rsi,2) 表示的是某个结果存放的位置.

这段代码先跳到 +163, 拿到读入的数,看是否等于1

如果相等: 跳到 +143,把 $0x6032d0 放到 0x20(%rsp,%rsi,2) 的位置
如果不相等: 继续执行,eax为1,edx为$0x6032d0,然后跳到 +130,edx变为下一个位置$0x6032e0.这样一直进行读入的数x-1次. (比如读入的数是4,那么就会进行3次edx变为下一个位置的操作,也就是变为$0x603300)

可以看出这是个链表的数据结构.

因此这段代码做的是, 按照读入的数的从小到大的出现的位置,依次把从 $0x6032e0 到$0x603320 放到
栈上(%rsp+0x20+8i)的位置.

举个例子, 假设读入的数为:

15 4 3 2 1 6

那么从 (%rsp+0x20+8i) 开始的数就是:

1$0x603310  $0x603300  $0x6032f0 $0x6032e0 $0x6032d0  $0x603320

接下来我们看最后一段代码:

 1
 2   0x00000000004011ab <+183>:   mov    0x20(%rsp),%rbx
 3   0x00000000004011b0 <+188>:   lea    0x28(%rsp),%rax
 4   0x00000000004011b5 <+193>:   lea    0x50(%rsp),%rsi
 5   0x00000000004011ba <+198>:   mov    %rbx,%rcx
 6   0x00000000004011bd <+201>:   mov    (%rax),%rdx
 7   0x00000000004011c0 <+204>:   mov    %rdx,0x8(%rcx)
 8   0x00000000004011c4 <+208>:   add    $0x8,%rax
 9   0x00000000004011c8 <+212>:   cmp    %rsi,%rax
10   0x00000000004011cb <+215>:   je     0x4011d2 <phase_6+222>
11   0x00000000004011cd <+217>:   mov    %rdx,%rcx
12   0x00000000004011d0 <+220>:   jmp    0x4011bd <phase_6+201>
13   0x00000000004011d2 <+222>:   movq   $0x0,0x8(%rdx)
14   0x00000000004011da <+230>:   mov    $0x5,%ebp
15   0x00000000004011df <+235>:   mov    0x8(%rbx),%rax
16   0x00000000004011e3 <+239>:   mov    (%rax),%eax
17   0x00000000004011e5 <+241>:   cmp    %eax,(%rbx)
18   0x00000000004011e7 <+243>:   jge    0x4011ee <phase_6+250>
19   0x00000000004011e9 <+245>:   callq  0x40143a <explode_bomb>
20   0x00000000004011ee <+250>:   mov    0x8(%rbx),%rbx
21   0x00000000004011f2 <+254>:   sub    $0x1,%ebp
22   0x00000000004011f5 <+257>:   jne    0x4011df <phase_6+235>

从 [+235,+245] 可以看出,这里要求从 0x20(%rsp) 开始的栈上存储的地址对应位置的值,必须前一个必须大于后一个.

我们可以查看从 $0x6032d0位置开始的6个地址中的值:

 1(gdb) x/3w 0x6032d0
 20x6032d0 <node1>:       0x0000014c      0x00000001      0x00000000
 3(gdb) x/3w 0x6032e0
 40x6032e0 <node2>:       0x000000a8      0x00000002      0x006032d0
 5(gdb) x/3w 0x6032f0
 60x6032f0 <node3>:       0x0000039c      0x00000003      0x006032e0
 7(gdb) x/3w 0x603300
 80x603300 <node4>:       0x000002b3      0x00000004      0x006032f0
 9(gdb) x/3w 0x603310
100x603310 <node5>:       0x000001dd      0x00000005      0x00603300
11(gdb) x/3w 0x603320
120x603320 <node6>:       0x000001bb      0x00000006      0x00603310

按照值的顺序降序node序号,得到 3 4 5 6 1 2

最后不要忘记用7减去每个位置的值,因此最终答案就是: 4 3 2 1 6 5