You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
891 lines
20 KiB
891 lines
20 KiB
7 months ago
|
Demonstrations of stackcount, the Linux eBPF/bcc version.
|
||
|
|
||
|
|
||
|
This program traces functions and frequency counts them with their entire
|
||
|
stack trace, summarized in-kernel for efficiency. For example, counting
|
||
|
stack traces that led to the submit_bio() kernel function, which creates
|
||
|
block device I/O:
|
||
|
|
||
|
# ./stackcount submit_bio
|
||
|
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
submit_bio
|
||
|
ext4_writepages
|
||
|
do_writepages
|
||
|
__filemap_fdatawrite_range
|
||
|
filemap_flush
|
||
|
ext4_alloc_da_blocks
|
||
|
ext4_release_file
|
||
|
__fput
|
||
|
____fput
|
||
|
task_work_run
|
||
|
exit_to_usermode_loop
|
||
|
syscall_return_slowpath
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
tar [15069]
|
||
|
5
|
||
|
|
||
|
submit_bio
|
||
|
ext4_bio_write_page
|
||
|
mpage_submit_page
|
||
|
mpage_map_and_submit_buffers
|
||
|
ext4_writepages
|
||
|
do_writepages
|
||
|
__filemap_fdatawrite_range
|
||
|
filemap_flush
|
||
|
ext4_alloc_da_blocks
|
||
|
ext4_release_file
|
||
|
__fput
|
||
|
____fput
|
||
|
task_work_run
|
||
|
exit_to_usermode_loop
|
||
|
syscall_return_slowpath
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
tar [15069]
|
||
|
15
|
||
|
|
||
|
submit_bio
|
||
|
ext4_readpages
|
||
|
__do_page_cache_readahead
|
||
|
ondemand_readahead
|
||
|
page_cache_async_readahead
|
||
|
generic_file_read_iter
|
||
|
__vfs_read
|
||
|
vfs_read
|
||
|
sys_read
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
[unknown]
|
||
|
tar [15069]
|
||
|
113
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
The output shows unique stack traces, in order from leaf (on-CPU) to root,
|
||
|
followed by their occurrence count. The last stack trace in the above output
|
||
|
shows syscall handling, sys_read(), vfs_read(), and then "readahead" functions:
|
||
|
looks like an application issued file read has triggered read ahead. The
|
||
|
application can be seen after the stack trace, in this case, "tar [15069]"
|
||
|
for the "tar" command, PID 15069.
|
||
|
|
||
|
The order of printed stack traces is from least to most frequent. The most
|
||
|
frequent in this case, the ext4_rename() stack, was taken 113 times during
|
||
|
tracing.
|
||
|
|
||
|
The "[unknown]" frames are from user-level, since this simple workload is
|
||
|
the tar command, which apparently has been compiled without frame pointers.
|
||
|
It's a common compiler optimization, but it breaks frame pointer-based stack
|
||
|
walkers. Similar broken stacks will be seen by other profilers and debuggers
|
||
|
that use frame pointers. Hopefully your application preserves them so that
|
||
|
the user-level stack trace is visible. So how does one get frame pointers, if
|
||
|
your application doesn't have them to start with? For the current bcc (until
|
||
|
it supports other stack walkers), you need to be running a application binaries
|
||
|
that preserves frame pointers, eg, using gcc's -fno-omit-frame-pointer. That's
|
||
|
about all I'll say here: this is a big topic that is not bcc/BPF specific.
|
||
|
|
||
|
It can be useful to trace the path to submit_bio to explain unusual rates of
|
||
|
disk IOPS. These could have in-kernel origins (eg, background scrub).
|
||
|
|
||
|
|
||
|
Now adding the -d option to delimit kernel and user stacks:
|
||
|
|
||
|
# ./stackcount -d submit_bio
|
||
|
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
submit_bio
|
||
|
submit_bh
|
||
|
journal_submit_commit_record
|
||
|
jbd2_journal_commit_transaction
|
||
|
kjournald2
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
jbd2/xvda1-8 [405]
|
||
|
1
|
||
|
|
||
|
submit_bio
|
||
|
submit_bh
|
||
|
jbd2_journal_commit_transaction
|
||
|
kjournald2
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
jbd2/xvda1-8 [405]
|
||
|
2
|
||
|
|
||
|
submit_bio
|
||
|
ext4_writepages
|
||
|
do_writepages
|
||
|
__filemap_fdatawrite_range
|
||
|
filemap_flush
|
||
|
ext4_alloc_da_blocks
|
||
|
ext4_release_file
|
||
|
__fput
|
||
|
____fput
|
||
|
task_work_run
|
||
|
exit_to_usermode_loop
|
||
|
syscall_return_slowpath
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
tar [15187]
|
||
|
5
|
||
|
|
||
|
submit_bio
|
||
|
ext4_bio_write_page
|
||
|
mpage_submit_page
|
||
|
mpage_map_and_submit_buffers
|
||
|
ext4_writepages
|
||
|
do_writepages
|
||
|
__filemap_fdatawrite_range
|
||
|
filemap_flush
|
||
|
ext4_alloc_da_blocks
|
||
|
ext4_release_file
|
||
|
__fput
|
||
|
____fput
|
||
|
task_work_run
|
||
|
exit_to_usermode_loop
|
||
|
syscall_return_slowpath
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
tar [15187]
|
||
|
15
|
||
|
|
||
|
submit_bio
|
||
|
ext4_readpages
|
||
|
__do_page_cache_readahead
|
||
|
ondemand_readahead
|
||
|
page_cache_async_readahead
|
||
|
generic_file_read_iter
|
||
|
__vfs_read
|
||
|
vfs_read
|
||
|
sys_read
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
tar [15187]
|
||
|
171
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
A "--" is printed between the kernel and user stacks.
|
||
|
|
||
|
|
||
|
As a different example, here is the kernel function hrtimer_init_sleeper():
|
||
|
|
||
|
# ./stackcount.py -d hrtimer_init_sleeper
|
||
|
Tracing 1 functions for "hrtimer_init_sleeper"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
containerd [16020]
|
||
|
1
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__pthread_cond_timedwait
|
||
|
Monitor::IWait(Thread*, long)
|
||
|
Monitor::wait(bool, long, bool)
|
||
|
CompileQueue::get()
|
||
|
CompileBroker::compiler_thread_loop()
|
||
|
JavaThread::thread_main_inner()
|
||
|
JavaThread::run()
|
||
|
java_start(Thread*)
|
||
|
start_thread
|
||
|
java [4996]
|
||
|
1
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
containerd [16020]
|
||
|
1
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__pthread_cond_timedwait
|
||
|
VMThread::loop()
|
||
|
VMThread::run()
|
||
|
java_start(Thread*)
|
||
|
start_thread
|
||
|
java [4996]
|
||
|
3
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
dockerd [16008]
|
||
|
4
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
dockerd [16008]
|
||
|
4
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__pthread_cond_timedwait
|
||
|
Lio/netty/util/ThreadDeathWatcher$Watcher;::run
|
||
|
Interpreter
|
||
|
Interpreter
|
||
|
call_stub
|
||
|
JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
|
||
|
JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
|
||
|
JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
|
||
|
thread_entry(JavaThread*, Thread*)
|
||
|
JavaThread::thread_main_inner()
|
||
|
JavaThread::run()
|
||
|
java_start(Thread*)
|
||
|
start_thread
|
||
|
java [4996]
|
||
|
4
|
||
|
|
||
|
hrtimer_init_sleeper
|
||
|
do_futex
|
||
|
SyS_futex
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__pthread_cond_timedwait
|
||
|
clock_gettime
|
||
|
[unknown]
|
||
|
java [4996]
|
||
|
79
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
I was just trying to find a more interesting example. This output includes
|
||
|
some Java stacks where user-level has been walked correctly (even includes a
|
||
|
JIT symbol translation). dockerd and containerd don't have frame pointers
|
||
|
(grumble), but Java does (which is running with -XX:+PreserveFramePointer).
|
||
|
|
||
|
|
||
|
Here's another kernel function, ip_output():
|
||
|
|
||
|
# ./stackcount.py -d ip_output
|
||
|
Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
ip_output
|
||
|
ip_queue_xmit
|
||
|
tcp_transmit_skb
|
||
|
tcp_write_xmit
|
||
|
__tcp_push_pending_frames
|
||
|
tcp_push
|
||
|
tcp_sendmsg
|
||
|
inet_sendmsg
|
||
|
sock_sendmsg
|
||
|
sock_write_iter
|
||
|
__vfs_write
|
||
|
vfs_write
|
||
|
SyS_write
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__write_nocancel
|
||
|
[unknown]
|
||
|
sshd [15015]
|
||
|
5
|
||
|
|
||
|
ip_output
|
||
|
ip_queue_xmit
|
||
|
tcp_transmit_skb
|
||
|
tcp_write_xmit
|
||
|
__tcp_push_pending_frames
|
||
|
tcp_push
|
||
|
tcp_sendmsg
|
||
|
inet_sendmsg
|
||
|
sock_sendmsg
|
||
|
sock_write_iter
|
||
|
__vfs_write
|
||
|
vfs_write
|
||
|
SyS_write
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__write_nocancel
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
sshd [8234]
|
||
|
5
|
||
|
|
||
|
ip_output
|
||
|
ip_queue_xmit
|
||
|
tcp_transmit_skb
|
||
|
tcp_write_xmit
|
||
|
__tcp_push_pending_frames
|
||
|
tcp_push
|
||
|
tcp_sendmsg
|
||
|
inet_sendmsg
|
||
|
sock_sendmsg
|
||
|
sock_write_iter
|
||
|
__vfs_write
|
||
|
vfs_write
|
||
|
SyS_write
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
__write_nocancel
|
||
|
sshd [15015]
|
||
|
7
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
This time just sshd is triggering ip_output() calls.
|
||
|
|
||
|
|
||
|
Watch what happens if I filter on kernel stacks only (-K) for ip_output():
|
||
|
|
||
|
# ./stackcount.py -K ip_output
|
||
|
Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
ip_output
|
||
|
ip_queue_xmit
|
||
|
tcp_transmit_skb
|
||
|
tcp_write_xmit
|
||
|
__tcp_push_pending_frames
|
||
|
tcp_push
|
||
|
tcp_sendmsg
|
||
|
inet_sendmsg
|
||
|
sock_sendmsg
|
||
|
sock_write_iter
|
||
|
__vfs_write
|
||
|
vfs_write
|
||
|
SyS_write
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
13
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
They have grouped together as a single unique stack, since the kernel part
|
||
|
was the same.
|
||
|
|
||
|
|
||
|
Here is just the user stacks, fetched during the kernel function ip_output():
|
||
|
|
||
|
# ./stackcount.py -U ip_output
|
||
|
Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
[unknown]
|
||
|
snmpd [1645]
|
||
|
1
|
||
|
|
||
|
__write_nocancel
|
||
|
[unknown]
|
||
|
[unknown]
|
||
|
sshd [8234]
|
||
|
3
|
||
|
|
||
|
__write_nocancel
|
||
|
sshd [15015]
|
||
|
4
|
||
|
|
||
|
I should really run a custom sshd with frame pointers so we can see its
|
||
|
stack trace...
|
||
|
|
||
|
|
||
|
User-space functions can also be traced if a library name is provided. For
|
||
|
example, to quickly identify code locations that allocate heap memory for
|
||
|
PID 4902 (using -p), by tracing malloc from libc ("c:malloc"):
|
||
|
|
||
|
# ./stackcount -p 4902 c:malloc
|
||
|
Tracing 1 functions for "malloc"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
malloc
|
||
|
rbtree_new
|
||
|
main
|
||
|
[unknown]
|
||
|
12
|
||
|
|
||
|
malloc
|
||
|
_rbtree_node_new_internal
|
||
|
_rbtree_node_insert
|
||
|
rbtree_insert
|
||
|
main
|
||
|
[unknown]
|
||
|
1189
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
Kernel stacks are absent as this didn't enter kernel code.
|
||
|
|
||
|
Note that user-space uses of stackcount can be somewhat more limited because
|
||
|
a lot of user-space libraries and binaries are compiled without frame-pointers
|
||
|
as discussed earlier (the -fomit-frame-pointer compiler default) or are used
|
||
|
without debuginfo.
|
||
|
|
||
|
|
||
|
In addition to kernel and user-space functions, kernel tracepoints and USDT
|
||
|
tracepoints are also supported.
|
||
|
|
||
|
For example, to determine where threads are being created in a particular
|
||
|
process, use the pthread_create USDT tracepoint:
|
||
|
|
||
|
# ./stackcount -p $(pidof parprimes) u:pthread:pthread_create
|
||
|
Tracing 1 functions for "u:pthread:pthread_create"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
|
||
|
parprimes [11923]
|
||
|
pthread_create@@GLIBC_2.2.5
|
||
|
main
|
||
|
__libc_start_main
|
||
|
[unknown]
|
||
|
7
|
||
|
|
||
|
You can use "readelf -n file" to see if it has USDT tracepoints.
|
||
|
|
||
|
|
||
|
Similarly, to determine where context switching is happening in the kernel,
|
||
|
use the sched:sched_switch kernel tracepoint:
|
||
|
|
||
|
# ./stackcount t:sched:sched_switch
|
||
|
__schedule
|
||
|
schedule
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
kworker/0:2 [25482]
|
||
|
1
|
||
|
|
||
|
__schedule
|
||
|
schedule
|
||
|
schedule_hrtimeout_range_clock
|
||
|
schedule_hrtimeout_range
|
||
|
ep_poll
|
||
|
SyS_epoll_wait
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
epoll_wait
|
||
|
Lsun/nio/ch/SelectorImpl;::lockAndDoSelect
|
||
|
Lsun/nio/ch/SelectorImpl;::select
|
||
|
Lio/netty/channel/nio/NioEventLoop;::select
|
||
|
Lio/netty/channel/nio/NioEventLoop;::run
|
||
|
Interpreter
|
||
|
Interpreter
|
||
|
call_stub
|
||
|
JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
|
||
|
JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
|
||
|
JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
|
||
|
thread_entry(JavaThread*, Thread*)
|
||
|
JavaThread::thread_main_inner()
|
||
|
JavaThread::run()
|
||
|
java_start(Thread*)
|
||
|
start_thread
|
||
|
java [4996]
|
||
|
1
|
||
|
|
||
|
... (omitted for brevity)
|
||
|
|
||
|
__schedule
|
||
|
schedule
|
||
|
schedule_preempt_disabled
|
||
|
cpu_startup_entry
|
||
|
xen_play_dead
|
||
|
arch_cpu_idle_dead
|
||
|
cpu_startup_entry
|
||
|
cpu_bringup_and_idle
|
||
|
swapper/1 [0]
|
||
|
289
|
||
|
|
||
|
|
||
|
A -i option can be used to set an output interval, and -T to include a
|
||
|
timestamp. For example:
|
||
|
|
||
|
# ./stackcount.py -Tdi 1 submit_bio
|
||
|
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
|
||
|
|
||
|
06:05:13
|
||
|
|
||
|
06:05:14
|
||
|
submit_bio
|
||
|
xfs_do_writepage
|
||
|
write_cache_pages
|
||
|
xfs_vm_writepages
|
||
|
do_writepages
|
||
|
__writeback_single_inode
|
||
|
writeback_sb_inodes
|
||
|
__writeback_inodes_wb
|
||
|
wb_writeback
|
||
|
wb_workfn
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/u16:1 [15686]
|
||
|
1
|
||
|
|
||
|
submit_bio
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/u16:0 [16007]
|
||
|
1
|
||
|
|
||
|
submit_bio
|
||
|
xfs_buf_submit
|
||
|
xlog_bdstrat
|
||
|
xlog_sync
|
||
|
xlog_state_release_iclog
|
||
|
_xfs_log_force
|
||
|
xfs_log_force
|
||
|
xfs_fs_sync_fs
|
||
|
sync_fs_one_sb
|
||
|
iterate_supers
|
||
|
sys_sync
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
sync [16039]
|
||
|
1
|
||
|
|
||
|
submit_bio
|
||
|
submit_bh
|
||
|
journal_submit_commit_record
|
||
|
jbd2_journal_commit_transaction
|
||
|
kjournald2
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
jbd2/xvda1-8 [405]
|
||
|
1
|
||
|
|
||
|
submit_bio
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/0:2 [25482]
|
||
|
2
|
||
|
|
||
|
submit_bio
|
||
|
ext4_writepages
|
||
|
do_writepages
|
||
|
__writeback_single_inode
|
||
|
writeback_sb_inodes
|
||
|
__writeback_inodes_wb
|
||
|
wb_writeback
|
||
|
wb_workfn
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/u16:0 [16007]
|
||
|
4
|
||
|
|
||
|
submit_bio
|
||
|
xfs_vm_writepages
|
||
|
do_writepages
|
||
|
__writeback_single_inode
|
||
|
writeback_sb_inodes
|
||
|
__writeback_inodes_wb
|
||
|
wb_writeback
|
||
|
wb_workfn
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/u16:1 [15686]
|
||
|
5
|
||
|
|
||
|
submit_bio
|
||
|
__block_write_full_page
|
||
|
block_write_full_page
|
||
|
blkdev_writepage
|
||
|
__writepage
|
||
|
write_cache_pages
|
||
|
generic_writepages
|
||
|
blkdev_writepages
|
||
|
do_writepages
|
||
|
__filemap_fdatawrite_range
|
||
|
filemap_fdatawrite
|
||
|
fdatawrite_one_bdev
|
||
|
iterate_bdevs
|
||
|
sys_sync
|
||
|
entry_SYSCALL_64_fastpath
|
||
|
--
|
||
|
[unknown]
|
||
|
sync [16039]
|
||
|
7
|
||
|
|
||
|
submit_bio
|
||
|
submit_bh
|
||
|
jbd2_journal_commit_transaction
|
||
|
kjournald2
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
jbd2/xvda1-8 [405]
|
||
|
8
|
||
|
|
||
|
submit_bio
|
||
|
ext4_bio_write_page
|
||
|
mpage_submit_page
|
||
|
mpage_map_and_submit_buffers
|
||
|
ext4_writepages
|
||
|
do_writepages
|
||
|
__writeback_single_inode
|
||
|
writeback_sb_inodes
|
||
|
__writeback_inodes_wb
|
||
|
wb_writeback
|
||
|
wb_workfn
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/u16:0 [16007]
|
||
|
8
|
||
|
|
||
|
submit_bio
|
||
|
__block_write_full_page
|
||
|
block_write_full_page
|
||
|
blkdev_writepage
|
||
|
__writepage
|
||
|
write_cache_pages
|
||
|
generic_writepages
|
||
|
blkdev_writepages
|
||
|
do_writepages
|
||
|
__writeback_single_inode
|
||
|
writeback_sb_inodes
|
||
|
__writeback_inodes_wb
|
||
|
wb_writeback
|
||
|
wb_workfn
|
||
|
process_one_work
|
||
|
worker_thread
|
||
|
kthread
|
||
|
ret_from_fork
|
||
|
--
|
||
|
kworker/u16:0 [16007]
|
||
|
60
|
||
|
|
||
|
|
||
|
06:05:15
|
||
|
|
||
|
06:05:16
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
This only included output for the 06:05:14 interval. The other internals
|
||
|
did not span block device I/O.
|
||
|
|
||
|
|
||
|
The -s output prints the return instruction offset for each function (aka
|
||
|
symbol offset). Eg:
|
||
|
|
||
|
# ./stackcount.py -s tcp_sendmsg
|
||
|
Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
tcp_sendmsg+0x1
|
||
|
sock_sendmsg+0x38
|
||
|
sock_write_iter+0x85
|
||
|
__vfs_write+0xe3
|
||
|
vfs_write+0xb8
|
||
|
SyS_write+0x55
|
||
|
entry_SYSCALL_64_fastpath+0x1e
|
||
|
__write_nocancel+0x7
|
||
|
sshd [15015]
|
||
|
3
|
||
|
|
||
|
tcp_sendmsg+0x1
|
||
|
sock_sendmsg+0x38
|
||
|
sock_write_iter+0x85
|
||
|
__vfs_write+0xe3
|
||
|
vfs_write+0xb8
|
||
|
SyS_write+0x55
|
||
|
entry_SYSCALL_64_fastpath+0x1e
|
||
|
__write_nocancel+0x7
|
||
|
sshd [8234]
|
||
|
3
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
If it wasn't clear how one function called another, knowing the instruction
|
||
|
offset can help you locate the lines of code from a disassembly dump.
|
||
|
|
||
|
|
||
|
The -v output is verbose, and shows raw addresses:
|
||
|
|
||
|
./stackcount.py -v tcp_sendmsg
|
||
|
Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
ffffffff817b05c1 tcp_sendmsg
|
||
|
ffffffff8173ea48 sock_sendmsg
|
||
|
ffffffff8173eae5 sock_write_iter
|
||
|
ffffffff81232b33 __vfs_write
|
||
|
ffffffff812331b8 vfs_write
|
||
|
ffffffff81234625 SyS_write
|
||
|
ffffffff818739bb entry_SYSCALL_64_fastpath
|
||
|
7f120511e6e0 __write_nocancel
|
||
|
sshd [8234]
|
||
|
3
|
||
|
|
||
|
ffffffff817b05c1 tcp_sendmsg
|
||
|
ffffffff8173ea48 sock_sendmsg
|
||
|
ffffffff8173eae5 sock_write_iter
|
||
|
ffffffff81232b33 __vfs_write
|
||
|
ffffffff812331b8 vfs_write
|
||
|
ffffffff81234625 SyS_write
|
||
|
ffffffff818739bb entry_SYSCALL_64_fastpath
|
||
|
7f919c5a26e0 __write_nocancel
|
||
|
sshd [15015]
|
||
|
11
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
|
||
|
A wildcard can also be used. Eg, all functions beginning with "tcp_send",
|
||
|
kernel stacks only (-K) with offsets (-s):
|
||
|
|
||
|
# ./stackcount -Ks 'tcp_send*'
|
||
|
Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end.
|
||
|
^C
|
||
|
tcp_send_delayed_ack0x1
|
||
|
tcp_rcv_established0x3b1
|
||
|
tcp_v4_do_rcv0x130
|
||
|
tcp_v4_rcv0x8e0
|
||
|
ip_local_deliver_finish0x9f
|
||
|
ip_local_deliver0x51
|
||
|
ip_rcv_finish0x8a
|
||
|
ip_rcv0x29d
|
||
|
__netif_receive_skb_core0x637
|
||
|
__netif_receive_skb0x18
|
||
|
netif_receive_skb_internal0x23
|
||
|
1
|
||
|
|
||
|
tcp_send_delayed_ack0x1
|
||
|
tcp_rcv_established0x222
|
||
|
tcp_v4_do_rcv0x130
|
||
|
tcp_v4_rcv0x8e0
|
||
|
ip_local_deliver_finish0x9f
|
||
|
ip_local_deliver0x51
|
||
|
ip_rcv_finish0x8a
|
||
|
ip_rcv0x29d
|
||
|
__netif_receive_skb_core0x637
|
||
|
__netif_receive_skb0x18
|
||
|
netif_receive_skb_internal0x23
|
||
|
4
|
||
|
|
||
|
tcp_send_mss0x1
|
||
|
inet_sendmsg0x67
|
||
|
sock_sendmsg0x38
|
||
|
sock_write_iter0x78
|
||
|
__vfs_write0xaa
|
||
|
vfs_write0xa9
|
||
|
sys_write0x46
|
||
|
entry_SYSCALL_64_fastpath0x16
|
||
|
7
|
||
|
|
||
|
tcp_sendmsg0x1
|
||
|
sock_sendmsg0x38
|
||
|
sock_write_iter0x78
|
||
|
__vfs_write0xaa
|
||
|
vfs_write0xa9
|
||
|
sys_write0x46
|
||
|
entry_SYSCALL_64_fastpath0x16
|
||
|
7
|
||
|
|
||
|
Detaching...
|
||
|
|
||
|
Use -r to allow regular expressions.
|
||
|
|
||
|
|
||
|
The -f option will emit folded output, which can be used as input to other
|
||
|
tools including flame graphs. For example, with delimiters as well:
|
||
|
|
||
|
# ./stackcount.py -df t:sched:sched_switch
|
||
|
^Csnmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
|
||
|
kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
|
||
|
watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1
|
||
|
snmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
|
||
|
svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1
|
||
|
python;[unknown];__select_nocancel;-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
|
||
|
kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
|
||
|
[...]
|
||
|
|
||
|
Flame graphs visualize stack traces. For information about them and links
|
||
|
to open source software, see http://www.brendangregg.com/flamegraphs.html .
|
||
|
This folded output can be piped directly into flamegraph.pl (the Perl version).
|
||
|
|
||
|
|
||
|
USAGE message:
|
||
|
|
||
|
# ./stackcount -h
|
||
|
usage: stackcount [-h] [-p PID] [-i INTERVAL] [-D DURATION] [-T] [-r] [-s]
|
||
|
[-P] [-K] [-U] [-v] [-d] [-f] [--debug]
|
||
|
pattern
|
||
|
|
||
|
Count events and their stack traces
|
||
|
|
||
|
positional arguments:
|
||
|
pattern search expression for events
|
||
|
|
||
|
optional arguments:
|
||
|
-h, --help show this help message and exit
|
||
|
-p PID, --pid PID trace this PID only
|
||
|
-i INTERVAL, --interval INTERVAL
|
||
|
summary interval, seconds
|
||
|
-D DURATION, --duration DURATION
|
||
|
total duration of trace, seconds
|
||
|
-T, --timestamp include timestamp on output
|
||
|
-r, --regexp use regular expressions. Default is "*" wildcards
|
||
|
only.
|
||
|
-s, --offset show address offsets
|
||
|
-P, --perpid display stacks separately for each process
|
||
|
-K, --kernel-stacks-only
|
||
|
kernel stack only
|
||
|
-U, --user-stacks-only
|
||
|
user stack only
|
||
|
-v, --verbose show raw addresses
|
||
|
-d, --delimited insert delimiter between kernel/user stacks
|
||
|
-f, --folded output folded format
|
||
|
--debug print BPF program before starting (for debugging
|
||
|
purposes)
|
||
|
|
||
|
examples:
|
||
|
./stackcount submit_bio # count kernel stack traces for submit_bio
|
||
|
./stackcount -d ip_output # include a user/kernel stack delimiter
|
||
|
./stackcount -s ip_output # show symbol offsets
|
||
|
./stackcount -sv ip_output # show offsets and raw addresses (verbose)
|
||
|
./stackcount 'tcp_send*' # count stacks for funcs matching tcp_send*
|
||
|
./stackcount -r '^tcp_send.*' # same as above, using regular expressions
|
||
|
./stackcount -Ti 5 ip_output # output every 5 seconds, with timestamps
|
||
|
./stackcount -p 185 ip_output # count ip_output stacks for PID 185 only
|
||
|
./stackcount -p 185 c:malloc # count stacks for malloc in PID 185
|
||
|
./stackcount t:sched:sched_fork # count stacks for sched_fork tracepoint
|
||
|
./stackcount -p 185 u:node:* # count stacks for all USDT probes in node
|
||
|
./stackcount -K t:sched:sched_switch # kernel stacks only
|
||
|
./stackcount -U t:sched:sched_switch # user stacks only
|