Software Debugging Study Notes

Posted by API Caller on February 20, 2020, last modified on February 26, 2020

Occasional, unorganized notes; other topics may show up too, e.g. kGDB x Android 🕊

GDB

$ gdb-multiarch --help

    gdb [options] [executable-file [core-file or process-id]]
    gdb [options] --args executable-file [inferior-arguments ...]

Selection of debuggee and its files:

  --args             Arguments after executable-file are passed to inferior
  --core=COREFILE    Analyze the core dump COREFILE.
  --exec=EXECFILE    Use EXECFILE as the executable.
  --pid=PID          Attach to running process PID.
  --directory=DIR    Search for source files in DIR.
  --se=FILE          Use FILE as symbol file and executable file.
  --symbols=SYMFILE  Read symbols from SYMFILE.
  --readnow          Fully read symbol files on first access.
  --readnever        Do not read symbol files.
  --write            Set writing into executable and core files.

Initial commands and command files:

  --command=FILE, -x Execute GDB commands from FILE.
  --init-command=FILE, -ix
                     Like -x but execute commands before loading inferior.
  --eval-command=COMMAND, -ex
                     Execute a single GDB command.
                     May be used multiple times and in conjunction
                     with --command.
  --init-eval-command=COMMAND, -iex
                     Like -ex but before loading inferior.
  --nh               Do not read ~/.gdbinit.
  --nx               Do not read any .gdbinit files in any directory.

Output and user interface control:

  --fullname         Output information used by emacs-GDB interface.
  --interpreter=INTERP
                     Select a specific interpreter / user interface
  --tty=TTY          Use TTY for input/output by the program being debugged.
  -w                 Use the GUI interface.
  --nw               Do not use the GUI interface.
  --tui              Use a terminal user interface.
  --dbx              DBX compatibility mode.

Operating modes:

  --batch            Exit after processing options.
  --batch-silent     Like --batch, but suppress all gdb stdout output.
  --return-child-result
                     GDB exit code will be the child's exit code.
  --configuration    Print details about GDB configuration and then exit.

Remote debugging options:

  -b BAUDRATE        Set serial port baud rate used for remote debugging.
  -l TIMEOUT         Set timeout in seconds for remote debugging.

Other options:

  --cd=DIR           Change current directory to DIR.
  --data-directory=DIR, -D
                     Set GDB's data-directory to DIR.

At startup, GDB reads the following init files and executes their commands:
   * system-wide init file: /etc/gdb/gdbinit

$ gdbserver --help
Usage:  gdbserver [OPTIONS] COMM PROG [ARGS ...]
        gdbserver [OPTIONS] --attach COMM PID
        gdbserver [OPTIONS] --multi COMM

COMM may either be a tty device (for serial debugging),
HOST:PORT to listen for a TCP connection, or '-' or 'stdio' to use
stdin/stdout of gdbserver.
PROG is the executable program.  ARGS are arguments passed to inferior.
PID is the process ID to attach to, when --attach is specified.

Operating modes:

  --attach              Attach to running process PID.
  --multi               Start server without a specific program, and
                        only quit when explicitly commanded.
  --once                Exit after the first connection has closed.

Other options:

  --wrapper WRAPPER --  Run WRAPPER to start new programs.
  --disable-randomization
                        Run PROG with address space randomization disabled.
  --no-disable-randomization
                        Don't disable address space randomization when
                        starting PROG.
  --startup-with-shell
                        Start PROG using a shell.  I.e., execs a shell that
                        then execs PROG.  (default)
  --no-startup-with-shell
                        Exec PROG directly instead of using a shell.
                        Disables argument globbing and variable substitution
                        on UNIX-like systems.

Debug options:

  --debug               Enable general debugging output.
  --debug-format=opt1[,opt2,...]
                        Specify extra content in debugging output.
                          Options:
                            all
                            none
                            timestamp
  --remote-debug        Enable remote protocol debugging output.
  --disable-packet=opt1[,opt2,...]
                        Disable support for RSP packets or features.
                          Options:
                            vCont, Tthread, qC, qfThreadInfo and
                            threads (disable all threading packets).

Two help dumps and another post is padded out. I do what I want.
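
To show how the two tools pair up in practice, here is a rough sketch that wraps one target-side and one host-side invocation in shell functions. The address 192.168.1.9 and port 1234 are borrowed from the remote session later in these notes; the function names, the GDB variable, and the choice of --once are my own assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helpers. PORT and TARGET_IP are assumptions borrowed from
# the remote session shown further down in these notes.
PORT=${PORT:-1234}
TARGET_IP=${TARGET_IP:-192.168.1.9}
GDB=${GDB:-gdb-multiarch}

# Run on the target: serve PROG (plus its arguments) over TCP and exit
# once the first debugging connection has closed (--once).
start_server() {
    gdbserver --once ":$PORT" "$@"
}

# Run on the host: load symbols from a local copy of the binary, then
# connect to the gdbserver started above.
connect_client() {
    "$GDB" -q --nh \
        -ex "file $1" \
        -ex "target remote $TARGET_IP:$PORT"
}
```

Usage would be `start_server ./a.out` on the device and `connect_client a.out` on the workstation.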

As this work has progressed, the line-of-code delta between mainline and AOSP has dropped significantly. In fact at the last audit one of the most significant contributors towards the line count turned out to be a little known tool for Android called the FIQ debugger.

Google’s Android team have already implemented an interactive debugger that can, optionally, take advantage of FIQ interrupts. Fiq_debugger has a long history that dates back several years before kdb was merged into the kernel. Recently it was used in the development of many of Google’s Nexus phones and tablets. On these devices, the UART is connected either to the USB or headphone sockets. These UARTs are disabled during normal use but become active when presence-detect resistors indicate that something is listening to the UART.


FIQ

CONFIG_FIQ_DEBUGGER=y
CONFIG_FIQ_DEBUGGER_CONSOLE=y
CONFIG_FIQ_DEBUGGER_CONSOLE_DEFAULT_ENABLE=y

Hardware Breakpoints

CONFIG_HAVE_HW_BREAKPOINT=y
# CONFIG_HAVE_ARCH_TRACEHOOK=y

For example, hammerhead has 4 usable hardware breakpoints:

gdb-multiarch -q --nh \
    -ex 'set architecture arm' \
    -ex 'set sysroot /usr/arm-linux-gnueabihf' \
    -ex 'file a.out' \
    -ex 'target remote 192.168.1.9:1234' \
    -ex 'break main' \
    -ex continue

The target architecture is assumed to be arm
Reading symbols from a.out...done.
Remote debugging using 192.168.1.9:1234
0x000101e8 in _start ()
Breakpoint 1 at 0x102ee: file test.c, line 6.
Continuing.

Breakpoint 1, main (argc=1, argv=0xbefffdb4) at test.c:6
6               int x = 1;
(gdb) watch x
Hardware watchpoint 2: x
(gdb) awatch y
Hardware access (read/write) watchpoint 3: y
(gdb) c
Continuing.

Hardware watchpoint 2: x

Old value = 455752
New value = 1
main (argc=1, argv=0xbefffdb4) at test.c:7
7               x++;
(gdb) c
Continuing.

Hardware watchpoint 2: x

Old value = 1
New value = 2
main (argc=1, argv=0xbefffdb4) at test.c:8
8               x++;
(gdb) i b
Num     Type           Disp Enb Address    What
1       breakpoint     keep y   0x000102ee in main at test.c:6
        breakpoint already hit 1 time
2       hw watchpoint  keep y              x
        breakpoint already hit 2 times
3       acc watchpoint keep y              y
(gdb)
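
The register count can usually be cross-checked against the kernel log. The sketch below assumes an ARM kernel that reports its debug registers at boot with a `hw-breakpoint:` prefix and a device reachable over adb; the function name is made up, and the line may already have rotated out of the log buffer on a long-running device.

```shell
#!/bin/sh
# hw_bp_info: print the kernel's boot-time report of hardware breakpoint
# and watchpoint registers, if it is still present in the log buffer.
# Assumption: the kernel prefixes that report with "hw-breakpoint:".
hw_bp_info() {
    adb shell dmesg | grep -i 'hw-breakpoint'
}
```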

kGDB

CONFIG_HAVE_ARCH_KGDB=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDB_TESTS=y
CONFIG_KGDB_TESTS_ON_BOOT=n
CONFIG_KGDB_KDB=y
CONFIG_KDB_KEYBOARD=y
CONFIG_CONSOLE_POLL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_MSM_WATCHDOG_V2=n
CONFIG_FORCE_PAGES=y
CONFIG_STRICT_MEMORY_RWX=n

androidboot.console=ttyHSL0 kgdboc=ttyHSL0,115200 kgdbretry=4 msm_watchdog_v2.enable=0
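
With those options and boot arguments in place, a session could be started roughly as sketched below. This is an untested outline: the host-side serial device /dev/ttyUSB0, the vmlinux path, and the function and variable names are all assumptions. The SysRq `g` trigger relies on CONFIG_MAGIC_SYSRQ from the list above, and `-b` is the baud-rate option from the gdb help output.

```shell
#!/bin/sh
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
HOST_TTY=${HOST_TTY:-/dev/ttyUSB0}   # assumption: USB serial adapter on the host
BAUD=${BAUD:-115200}                 # matches kgdboc=ttyHSL0,115200
GDB=${GDB:-gdb-multiarch}

# On the device (as root): SysRq 'g' stops the kernel and hands control
# to the debugger configured through kgdboc.
enter_kgdb() {
    echo g > "$SYSRQ_TRIGGER"
}

# On the host: attach to the waiting kernel over the serial line, using
# an unstripped vmlinux for symbols.
attach_kgdb() {
    "$GDB" -q -b "$BAUD" \
        -ex "target remote $HOST_TTY" \
        "$1"
}
```

Typical use would be `enter_kgdb` in a root shell on the device, followed by `attach_kgdb vmlinux` on the host.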


ftrace / kprobes / kprobe events

build (Android)

# Newer kernel versions also have CONFIG_UPROBES (uprobes)

CONFIG_KPROBES=y
CONFIG_KPROBE_EVENT=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_IRQSOFF_TRACER=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_PREEMPT_TRACER=y

CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=n

CONFIG_STRICT_MEMORY_RWX=n
CONFIG_DEVMEM=y
CONFIG_DEVKMEM=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y

CONFIG_BPF_JIT=n
CONFIG_NET_TCPPROBE=n
CONFIG_I2C_STUB=n
CONFIG_MEDIA_ATTACH=n
CONFIG_RTLLIB=n
CONFIG_VT6656=n
CONFIG_USB_ENESTORAGE=n
CONFIG_KPROBES_SANITY_TEST=n
CONFIG_ARM_KPROBES_TEST=n
CONFIG_DEBUG_SET_MODULE_RONX=n
CONFIG_CRYPTO_TEST=n

kprobe events

.
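
As a placeholder for this section, here is a minimal sketch of defining, enabling, and removing a kprobe event through the tracing directory. The probe name demoprobe matches the one passed to trace-cmd later in these notes; the probed symbol do_sys_open, the TRACEFS variable, and the function names are assumptions chosen for illustration.

```shell
#!/bin/sh
TRACEFS=${TRACEFS:-/sys/kernel/debug/tracing}

# add_kprobe NAME SYMBOL: append a probe at the entry of SYMBOL.
# "p:" marks an entry probe; it lands in the default "kprobes" group.
add_kprobe() {
    echo "p:$1 $2" >> "$TRACEFS/kprobe_events"
}

# enable_kprobe NAME: start emitting events for the probe.
enable_kprobe() {
    echo 1 > "$TRACEFS/events/kprobes/$1/enable"
}

# remove_kprobe NAME: disable the probe, then delete its definition.
remove_kprobe() {
    echo 0 > "$TRACEFS/events/kprobes/$1/enable"
    echo "-:$1" >> "$TRACEFS/kprobe_events"
}
```

For example, `add_kprobe demoprobe do_sys_open` followed by `enable_kprobe demoprobe`, run as root on the device.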

troubleshooting

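  • /sys/kernel/debug/tracing/trace cannot be followed with tail -f. I also tried things like while true; do cat; sleep 1; done < /sys/kernel/debug/tracing/trace, and none of them worked, although other files behave fine. Since I mainly use trace-cmd's remote feature, this does not really need solving.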
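
A possible alternative, which I have not verified on this device, is to read trace_pipe instead of trace. Reads from trace_pipe block until new entries arrive and consume what they return, so a plain cat behaves like a follow. The TRACEFS variable and function name below are assumptions.

```shell
#!/bin/sh
TRACEFS=${TRACEFS:-/sys/kernel/debug/tracing}

# follow_trace: stream trace entries as they are produced. Entries read
# through trace_pipe are consumed and will not be returned again.
follow_trace() {
    cat "$TRACEFS/trace_pipe"
}
```

The trace-cmd help further down lists an equivalent: `trace-cmd show -p` reads the trace_pipe file.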

systemtap (android)

build

Cross-compiling this one is difficult.


trace-cmd & KernelShark

build

# https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git
git clone https://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
cd trace-cmd

make clean
# Android
make LDFLAGS=-static CC=arm-linux-gnueabi-gcc trace-cmd -j7

sudo apt-get install build-essential git cmake libjson-c-dev -y
sudo apt-get install freeglut3-dev libxmu-dev libxi-dev -y
sudo apt-get install qtbase5-dev -y
sudo apt-get install graphviz doxygen-gui -y

make clean
make LDFLAGS=-static gui -j7

# Although it is Qt-based, it does not seem to support other platforms

trace-cmd

usage

trace-cmd version 2.8.dev

usage:
  trace-cmd [COMMAND] ...

  commands:
     record - record a trace into a trace.dat file
     start - start tracing without recording into a file
     extract - extract a trace from the kernel
     stop - stop the kernel from recording trace data
     restart - restart the kernel trace data recording
     show - show the contents of the kernel tracing buffer
     reset - disable all kernel tracing and clear the trace buffers
     clear - clear the trace buffers
     report - read out the trace stored in a trace.dat file
     stream - Start tracing and read the output directly
     profile - Start profiling and read the output directly
     hist - show a historgram of the trace.dat information
     stat - show the status of the running tracing (ftrace) system
     split - parse a trace.dat file into smaller file(s)
     options - list the plugin options available for trace-cmd report
     listen - listen on a network socket for trace clients
     list - list the available events, plugins or options
     restore - restore a crashed record
     snapshot - take snapshot of running trace
     stack - output, enable or disable kernel stack tracing
     check-events - parse trace event formats


 trace-cmd record [-v][-e event [-f filter]][-p plugin][-F][-d][-D][-o file] \
           [-q][-s usecs][-O option ][-l func][-g func][-n func] \
           [-P pid][-N host:port][-t][-r prio][-b size][-B buf][command ...]
           [-m max][-C clock]
          -e run command with event enabled
          -f filter for previous -e event
          -R trigger for previous -e event
          -p run command with plugin enabled
          -F filter only on the given process
          -P trace the given pid like -F for the command
          -c also trace the childen of -F (or -P if kernel supports it)
          -C set the trace clock
          -T do a stacktrace on all events
          -l filter function name
          -g set graph function
          -n do not trace function
          -m max size per CPU in kilobytes
          -M set CPU mask to trace
          -v will negate all -e after it (disable those events)
          -d disable function tracer when running
          -D Full disable of function tracing (for all users)
          -o data output file [default trace.dat]
          -O option to enable (or disable)
          -r real time priority to run the capture threads
          -s sleep interval between recording (in usecs) [default: 1000]
          -S used with --profile, to enable only events in command line
          -N host:port to connect to (see listen)
          -t used with -N, forces use of tcp in live trace
          -b change kernel buffersize (in kilobytes per CPU)
          -B create sub buffer and folling events will be enabled here
          -k do not reset the buffers after tracing.
          -i do not fail if an event is not found
          -q print no output to the screen
          --quiet print no output to the screen
          --module filter module name
          --by-comm used with --profile, merge events for related comms
          --profile enable tracing options needed for report --profile
          --func-stack perform a stack trace for function tracer
             (use with caution)
          --max-graph-depth limit function_graph depth


 trace-cmd show [-p|-s][-c cpu][-B buf][options]
          Basically, this is a cat of the trace file.
          -p read the trace_pipe file instead
          -s read the snapshot file instance
           (Can't have both -p and -s)
          -c just show the file associated with a given CPU
          -B read from a tracing buffer instance.
          -f display the file path that is being dumped
          The following options shows the corresponding file name
           and then exits.
          --tracing_on
          --current_tracer
          --buffer_size (for buffer_size_kb)
          --buffer_total_size (for buffer_total_size_kb)
          --ftrace_filter (for set_ftrace_filter)
          --ftrace_notrace (for set_ftrace_notrace)
          --ftrace_pid (for set_ftrace_pid)
          --graph_function (for set_graph_function)
          --graph_notrace (for set_graph_notrace)
          --cpumask (for tracing_cpumask)

Remote data:

# Receiving side
mkdir tracefiles
trace-cmd listen -p 12345 -d tracefiles

# Sending side
trace-cmd record -e demoprobe -P 2756 -N 192.168.0.212:12345

adb shell issue

When using trace-cmd record, Ctrl+C normally stops the recording. In some adb shell sessions, however, Ctrl+C does not stop trace-cmd record cleanly, leaving only intermediate files such as trace.dat.cpuX, which KernelShark cannot parse.

Workaround 1:

adb shell su -c "pkill -2 trace-cmd"
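
Another option, offered as an untested assumption rather than something I have run on these devices, is to avoid Ctrl+C altogether by handing trace-cmd record a command. The usage text above accepts a trailing `[command ...]`, and the recording covers the run of that command, so trace.dat should be finalized when the command exits. The function name and the TRACE_CMD variable are made up for this sketch.

```shell
#!/bin/sh
TRACE_CMD=${TRACE_CMD:-trace-cmd}

# record_for SECONDS EVENT: record EVENT for roughly SECONDS seconds by
# letting `sleep` act as the traced command, so no signal is needed to
# stop the recording.
record_for() {
    "$TRACE_CMD" record -e "$2" sleep "$1"
}
```

For instance, `record_for 10 demoprobe` would record the demoprobe event for about ten seconds.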

KernelShark Plugin

Documentation is scarce, but the source tree has a minimal demo: kernel-shark/src/plugins/missed_events.c

// KernelShark
#include "plugins/missed_events.h"

static void nop_action(struct kshark_context *kshark_ctx,
				struct tep_record *record,
				struct kshark_entry *entry)
{}

/** Load this plugin. */
int KSHARK_PLUGIN_INITIALIZER(struct kshark_context *kshark_ctx)
{
	kshark_register_event_handler(&kshark_ctx->event_handlers,
				      KS_EVENT_OVERFLOW,
				      nop_action,
				      draw_missed_events);

	return 1;
}

/** Unload this plugin. */
int KSHARK_PLUGIN_DEINITIALIZER(struct kshark_context *kshark_ctx)
{
	kshark_unregister_event_handler(&kshark_ctx->event_handlers,
					KS_EVENT_OVERFLOW,
					nop_action,
					draw_missed_events);

	return 1;
}

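So the good news is that a plugin only needs its compiled plugin-*.so to export the corresponding functions with a matching calling convention; it does not necessarily have to be written in C/C++: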

# plugin-missed_events.so

0000000000001ec0 T kshark_plugin_deinitializer
0000000000001e90 T kshark_plugin_initializer

$ ldd ./kernel-shark/lib/plugin-missed_events.so 
        libkshark.so.1.1.0 => 
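
When building a plugin in another language, a quick check like the one below can confirm that both entry points are exported. It is only a sketch: the function name is hypothetical, and it assumes GNU nm and that the two symbol names listed above are all the loader looks up.

```shell
#!/bin/sh
# has_plugin_symbols FILE: succeed only if FILE exports both KernelShark
# plugin entry points as defined text (T) symbols.
has_plugin_symbols() {
    syms=$(nm -D --defined-only "$1") || return 1
    for s in kshark_plugin_initializer kshark_plugin_deinitializer; do
        printf '%s\n' "$syms" | grep -q " T $s\$" || return 1
    done
}
```

For example: `has_plugin_symbols ./kernel-shark/lib/plugin-missed_events.so && echo ok`.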

Ref