You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
115 lines
3.8 KiB
115 lines
3.8 KiB
4 months ago
|
.TH fileslower 8 "2016-02-07" "USER COMMANDS"
|
||
|
.SH NAME
|
||
|
fileslower \- Trace slow synchronous file reads and writes.
|
||
|
.SH SYNOPSIS
|
||
|
.B fileslower [\-h] [\-p PID] [-a] [min_ms]
|
||
|
.SH DESCRIPTION
|
||
|
This script uses kernel dynamic tracing of synchronous reads and writes
|
||
|
at the VFS interface, to identify slow file reads and writes for any file
|
||
|
system.
|
||
|
|
||
|
This version traces __vfs_read() and __vfs_write() and only showing
|
||
|
synchronous I/O (the path to new_sync_read() and new_sync_write()), and
|
||
|
I/O with filenames. This approach provides a view of just two file
|
||
|
system request types: file reads and writes. There are typically many others:
|
||
|
asynchronous I/O, directory operations, file handle operations, file open()s,
|
||
|
fflush(), etc.
|
||
|
|
||
|
WARNING: See the OVERHEAD section.
|
||
|
|
||
|
By default, a minimum millisecond threshold of 10 is used.
|
||
|
|
||
|
Since this works by tracing various kernel __vfs_*() functions using dynamic
|
||
|
tracing, it will need updating to match any changes to these functions. A
|
||
|
future version should switch to using FS tracepoints instead.
|
||
|
|
||
|
Since this uses BPF, only the root user can use this tool.
|
||
|
.SH REQUIREMENTS
|
||
|
CONFIG_BPF and bcc.
|
||
|
.SH OPTIONS
|
||
|
\-p PID
|
||
|
Trace this PID only.
|
||
|
.TP
|
||
|
\-a
|
||
|
Include non-regular file types in output (sockets, FIFOs, etc).
|
||
|
.TP
|
||
|
min_ms
|
||
|
Minimum I/O latency (duration) to trace, in milliseconds. Default is 10 ms.
|
||
|
.SH EXAMPLES
|
||
|
.TP
|
||
|
Trace synchronous file reads and writes slower than 10 ms:
|
||
|
#
|
||
|
.B fileslower
|
||
|
.TP
|
||
|
Trace slower than 1 ms:
|
||
|
#
|
||
|
.B fileslower 1
|
||
|
.TP
|
||
|
Trace slower than 1 ms, for PID 181 only:
|
||
|
#
|
||
|
.B fileslower \-p 181 1
|
||
|
.SH FIELDS
|
||
|
.TP
|
||
|
TIME(s)
|
||
|
Time of I/O completion since the first I/O seen, in seconds.
|
||
|
.TP
|
||
|
COMM
|
||
|
Process name.
|
||
|
.TP
|
||
|
PID
|
||
|
Process ID.
|
||
|
.TP
|
||
|
D
|
||
|
Direction of I/O. R == read, W == write.
|
||
|
.TP
|
||
|
BYTES
|
||
|
Size of I/O, in bytes.
|
||
|
.TP
|
||
|
LAT(ms)
|
||
|
Latency (duration) of I/O, measured from when the application issued it to VFS
|
||
|
to when it completed. This time is inclusive of block device I/O, file system
|
||
|
CPU cycles, file system locks, run queue latency, etc. It's a more accurate
|
||
|
measure of the latency suffered by applications performing file system I/O,
|
||
|
than to measure this down at the block device interface.
|
||
|
.TP
|
||
|
FILENAME
|
||
|
A cached kernel file name (comes from dentry->d_iname).
|
||
|
.SH OVERHEAD
|
||
|
Depending on the frequency of application reads and writes, overhead can become
|
||
|
severe, in the worst case slowing applications by 2x. In the best case, the
|
||
|
overhead is negligible. Hopefully for real world workloads the overhead is
|
||
|
often at the lower end of the spectrum -- test before use. The reason for
|
||
|
high overhead is that this traces VFS reads and writes, which includes FS
|
||
|
cache reads and writes, and can exceed one million events per second if the
|
||
|
application is I/O heavy. While the instrumentation is extremely lightweight,
|
||
|
and uses in-kernel eBPF maps for efficient timing and filtering, multiply that
|
||
|
cost by one million events per second and that cost becomes a million times
|
||
|
worse. You can get an idea of the possible cost by just counting the
|
||
|
instrumented events using the bcc funccount tool, eg:
|
||
|
.PP
|
||
|
# ./funccount.py -i 1 -r '^__vfs_(read|write)$'
|
||
|
.PP
|
||
|
This also costs overhead, but is somewhat less than fileslower.
|
||
|
.PP
|
||
|
If the overhead is prohibitive for your workload, I'd recommend moving
|
||
|
down-stack a little from VFS into the file system functions (ext4, xfs, etc).
|
||
|
Look for updates to bcc for specific file system tools that do this. The
|
||
|
advantage of a per-file system approach is that we can trace post-cache,
|
||
|
greatly reducing events and overhead. The disadvantage is needing custom
|
||
|
tracing approaches for each different file system (whereas VFS is generic).
|
||
|
.SH SOURCE
|
||
|
This is from bcc.
|
||
|
.IP
|
||
|
https://github.com/iovisor/bcc
|
||
|
.PP
|
||
|
Also look in the bcc distribution for a companion _examples.txt file containing
|
||
|
example usage, output, and commentary for this tool.
|
||
|
.SH OS
|
||
|
Linux
|
||
|
.SH STABILITY
|
||
|
Unstable - in development.
|
||
|
.SH AUTHOR
|
||
|
Brendan Gregg
|
||
|
.SH SEE ALSO
|
||
|
biosnoop(8), funccount(8)
|