You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

242 lines
9.1 KiB

Demonstrations of reset-trace, for Linux bcc/BPF.
You will probably never need this tool. If you kill -9 a bcc tool (plus other
signals, like SIGTERM), or if a bcc tool crashes, then kernel tracing can be
left in a semi-enabled state. It's not as bad as it sounds: there may just be
overhead for writing to ring buffers that are never read. This tool can be
used to clean up the tracing state, and reset and disable active tracing.
WARNING: Make sure no other tracing sessions are active, as it will likely
stop them from functioning (perhaps ungracefully).
This specifically clears the state in at least the following files in
/sys/kernel/debug/tracing: kprobe_events, uprobe_events, trace_pipe.
Other tracing facilities (ftrace) are checked, and if not in an expected state,
a note is printed. All tracing files can be reset with -F for force, but this
will interfere with any other running tracing sessions (eg, ftrace).
Here's an example:
# ./reset-trace.sh
#
That's it.
You can use -v to see what it does:
# ./reset-trace.sh -v
Reseting tracing state...
Checking /sys/kernel/debug/tracing/kprobe_events
Checking /sys/kernel/debug/tracing/uprobe_events
Checking /sys/kernel/debug/tracing/trace
Checking /sys/kernel/debug/tracing/current_tracer
Checking /sys/kernel/debug/tracing/set_ftrace_filter
Checking /sys/kernel/debug/tracing/set_graph_function
Checking /sys/kernel/debug/tracing/set_ftrace_pid
Checking /sys/kernel/debug/tracing/events/enable
Checking /sys/kernel/debug/tracing/tracing_thresh
Checking /sys/kernel/debug/tracing/tracing_on
Done.
In this example, no resetting was necessary.
Here's an example of actually needing it:
# ./funccount 'bash:r*'
Tracing 317 functions for "bash:r*"... Hit Ctrl-C to end.
^C
FUNC COUNT
rl_free_undo_list 1
rl_deprep_terminal 1
readline_internal_teardown 1
rl_on_new_line 1
rl_crlf 1
rl_clear_signals 1
rl_prep_terminal 1
rl_reset_line_state 1
rl_initialize 1
rl_newline 1
readline_internal_setup 1
rl_set_screen_size 1
readline 1
rl_set_signals 1
rl_expand_prompt 1
replace_history_data 1
rl_set_prompt 1
rl_add_undo 1
rl_insert_text 2
rl_insert 2
rl_redisplay 3
rl_read_key 3
rl_getc 3
readline_internal_char 3
restore_parser_state 6
reap_dead_jobs 6
reset_parser 6
restore_input_line_state 6
realloc 7
read_octal 10
read_tty_modified 13
run_exit_trap 13
redirection_expand 13
restore_pipestatus_array 18
reader_loop 20
run_return_trap 21
remember_args 25
reset_signal_handlers 30
remove_quoted_escapes 60
run_unwind_frame 102
reset_terminating_signals 125
restore_original_signals 139
reset_internal_getopt 405
run_debug_trap 719
read_command 940
remove_quoted_nulls 1830
run_pending_traps 3207
^C
^C
^C
I've traced 317 functions using funccount, and when I hit Ctrl-C, funccount is
not exiting (it can normally take many seconds, but this really looks stuck):
# pidstat 1
Linux 4.9.0-rc1-virtual (bgregg-xenial-bpf-i-xxx) 10/18/2016 _x86_64_ (8 CPU)
10:00:33 PM UID PID %usr %system %guest %CPU CPU Command
10:00:34 PM 60004 3277 0.00 0.98 0.00 0.98 0 redis-server
10:00:34 PM 0 27980 87.25 10.78 0.00 98.04 3 funccount.py
10:00:34 PM 0 29965 0.00 0.98 0.00 0.98 6 pidstat
10:00:34 PM UID PID %usr %system %guest %CPU CPU Command
10:00:35 PM 65534 3276 0.00 1.00 0.00 1.00 2 multilog
10:00:35 PM 0 27980 77.00 23.00 0.00 100.00 3 funccount.py
10:00:35 PM 0 29965 0.00 1.00 0.00 1.00 6 pidstat
10:00:35 PM 60004 29990 0.00 1.00 0.00 1.00 6 catalina.sh
funccount looks a lot like it's in an infinite loop (I can use a stack-sampling
profiler to confirm). This is a known bug (#665) and may be fixed by the time
you read this. But right now it's a good example of needing reset-trace.
I'll send a SIGTERM, before resorting to a SIGKILL:
# kill 27980
Terminated
Ok, so the process is now gone, but it did leave tracing in a semi-enabled
state. Using reset-trace:
# ./reset-trace.sh -v
Reseting tracing state...
Checking /sys/kernel/debug/tracing/kprobe_events
Checking /sys/kernel/debug/tracing/uprobe_events
Needed to reset /sys/kernel/debug/tracing/uprobe_events
uprobe_events, before (line enumerated):
1 p:uprobes/p__bin_bash_0xa2540 /bin/bash:0x00000000000a2540
2 p:uprobes/p__bin_bash_0x21220 /bin/bash:0x0000000000021220
3 p:uprobes/p__bin_bash_0x78530 /bin/bash:0x0000000000078530
4 p:uprobes/p__bin_bash_0xa3840 /bin/bash:0x00000000000a3840
5 p:uprobes/p__bin_bash_0x9c550 /bin/bash:0x000000000009c550
6 p:uprobes/p__bin_bash_0x5e360 /bin/bash:0x000000000005e360
7 p:uprobes/p__bin_bash_0xb2630 /bin/bash:0x00000000000b2630
8 p:uprobes/p__bin_bash_0xb1e70 /bin/bash:0x00000000000b1e70
9 p:uprobes/p__bin_bash_0xb2540 /bin/bash:0x00000000000b2540
10 p:uprobes/p__bin_bash_0xb16e0 /bin/bash:0x00000000000b16e0
[...]
312 p:uprobes/p__bin_bash_0xa80b0 /bin/bash:0x00000000000a80b0
313 p:uprobes/p__bin_bash_0x9e280 /bin/bash:0x000000000009e280
314 p:uprobes/p__bin_bash_0x9e100 /bin/bash:0x000000000009e100
315 p:uprobes/p__bin_bash_0xb2bd0 /bin/bash:0x00000000000b2bd0
316 p:uprobes/p__bin_bash_0x9d9c0 /bin/bash:0x000000000009d9c0
317 p:uprobes/p__bin_bash_0x4a930 /bin/bash:0x000000000004a930
uprobe_events, after (line enumerated):
Checking /sys/kernel/debug/tracing/trace
Checking /sys/kernel/debug/tracing/current_tracer
Checking /sys/kernel/debug/tracing/set_ftrace_filter
Checking /sys/kernel/debug/tracing/set_graph_function
Checking /sys/kernel/debug/tracing/set_ftrace_pid
Checking /sys/kernel/debug/tracing/events/enable
Checking /sys/kernel/debug/tracing/tracing_thresh
Checking /sys/kernel/debug/tracing/tracing_on
Done.
Now looks clean. I did truncate the output here: there were a few hundred lines
from uprobe_events.
Here's the same situation, but without the verbose option:
# ./reset-trace.sh
Needed to reset /sys/kernel/debug/tracing/uprobe_events
#
And again with quiet:
# ./reset-trace.sh -q
#
Here is an example of reset-trace detecting an unrelated tracing session:
# ./reset-trace.sh
Noticed unrelated tracing file /sys/kernel/debug/tracing/set_ftrace_filter isn't set as expected. Not reseting (-F to force, -v for verbose).
And verbose:
# ./reset-trace.sh -v
Reseting tracing state...
Checking /sys/kernel/debug/tracing/kprobe_events
Checking /sys/kernel/debug/tracing/uprobe_events
Checking /sys/kernel/debug/tracing/trace
Checking /sys/kernel/debug/tracing/current_tracer
Checking /sys/kernel/debug/tracing/set_ftrace_filter
Noticed unrelated tracing file /sys/kernel/debug/tracing/set_ftrace_filter isn't set as expected. Not reseting (-F to force, -v for verbose).
Contents of set_ftrace_filter is (line enumerated):
1 tcp_send_mss
2 tcp_sendpage
3 tcp_sendmsg
4 tcp_send_dupack
5 tcp_send_challenge_ack.isra.53
6 tcp_send_rcvq
7 tcp_send_ack
8 tcp_send_loss_probe
9 tcp_send_fin
10 tcp_send_active_reset
11 tcp_send_synack
12 tcp_send_delayed_ack
13 tcp_send_window_probe
14 tcp_send_probe0
Expected "".
Checking /sys/kernel/debug/tracing/set_graph_function
Checking /sys/kernel/debug/tracing/set_ftrace_pid
Checking /sys/kernel/debug/tracing/events/enable
Checking /sys/kernel/debug/tracing/tracing_thresh
Checking /sys/kernel/debug/tracing/tracing_on
Done.
So this file is not currently used by bcc, but it may be useful to know that
it's not in the default state -- something is either using it or has left it
enabled. These files can be reset with -F, but that may break other tools that
are currently using them.
Use -h to print the USAGE message:
# ./reset-trace.sh -h
USAGE: reset-trace [-Fhqv]
-F # force: reset all tracing files
-v # verbose: print details while working
-h # this usage message
-q # quiet: no output
eg,
reset-trace # disable semi-enabled tracing