You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
107 lines
3.5 KiB
107 lines
3.5 KiB
7 months ago
|
.TH cpuunclaimed 8 "2016-12-21" "USER COMMANDS"
|
||
|
.SH NAME
|
||
|
cpuunclaimed \- Sample CPU run queues and calculate unclaimed idle CPU. Uses Linux eBPF/bcc.
|
||
|
.SH SYNOPSIS
|
||
|
.B cpuunclaimed
|
||
|
[\-T] [\-j] [\-J] [interval [count]]
|
||
|
.SH DESCRIPTION
|
||
|
This tool samples the length of the run queues and determine when there are idle
|
||
|
CPUs, yet queued threads waiting their turn. It reports the amount of idle
|
||
|
(yet unclaimed by waiting threads) CPU as a system-wide percentage.
|
||
|
|
||
|
This situation can happen for a number of reasons:
|
||
|
.IP -
|
||
|
An application has been bound to some, but not all, CPUs, and has runnable
|
||
|
threads that cannot migrate to other CPUs due to this configuration.
|
||
|
.IP -
|
||
|
CPU affinity: an optimization that leaves threads on CPUs where the CPU
|
||
|
caches are warm, even if this means short periods of waiting while other
|
||
|
CPUs are idle. The wait period is tunale (see sysctl, kernel.sched*).
|
||
|
.IP -
|
||
|
Scheduler bugs.
|
||
|
.P
|
||
|
An unclaimed idle of < 1% is likely to be CPU affinity, and not usually a
|
||
|
cause for concern. By leaving the CPU idle, overall throughput of the system
|
||
|
may be improved. This tool is best for identifying larger issues, > 2%, due
|
||
|
to the coarseness of its 99 Hertz samples.
|
||
|
|
||
|
This is an experimental tool that currently works by use of sampling to
|
||
|
keep overheads low. Tool assumptions:
|
||
|
.IP -
|
||
|
CPU samples consistently fire around the same offset. There will sometimes
|
||
|
be a lag as a sample is delayed by higher-priority interrupts, but it is
|
||
|
assumed the subsequent samples will catch up to the expected offsets (as
|
||
|
is seen in practice). You can use -J to inspect sample offsets. Some
|
||
|
systems can power down CPUs when idle, and when they wake up again they
|
||
|
may begin firing at a skewed offset: this tool will detect the skew, print
|
||
|
an error, and exit.
|
||
|
.IP -
|
||
|
All CPUs are online (see ncpu).
|
||
|
.P
|
||
|
If this identifies unclaimed CPU, you can double check it by dumping raw
|
||
|
samples (-j), as well as using other tracing tools to instrument scheduler
|
||
|
events (although this latter approach has much higher overhead).
|
||
|
|
||
|
Since this uses BPF, only the root user can use this tool.
|
||
|
.SH REQUIREMENTS
|
||
|
CONFIG_BPF and bcc.
|
||
|
.SH EXAMPLES
|
||
|
.TP
|
||
|
Sample and calculate unclaimed idle CPUs, output every 1 second (default:
|
||
|
#
|
||
|
.B cpuunclaimed
|
||
|
.TP
|
||
|
Print 5 second summaries, 10 times:
|
||
|
#
|
||
|
.B cpuunclaimed 5 10
|
||
|
.TP
|
||
|
Print 1 second summaries with timestamps:
|
||
|
#
|
||
|
.B cpuunclaimed \-T 1
|
||
|
.TP
|
||
|
Raw dump of all samples (verbose), as comma-separated values:
|
||
|
#
|
||
|
.B cpuunclaimed \-j
|
||
|
.SH FIELDS
|
||
|
.TP
|
||
|
%CPU
|
||
|
CPU utilization as a system-wide percentage.
|
||
|
.TP
|
||
|
unclaimed idle
|
||
|
Percentage of CPU resources that were idle when work was queued on other CPUs,
|
||
|
as a system-wide percentage.
|
||
|
.TP
|
||
|
TIME
|
||
|
Time (HH:MM:SS)
|
||
|
.TP
|
||
|
TIMESTAMP_ns
|
||
|
Timestamp, nanoseconds.
|
||
|
.TP
|
||
|
CPU#
|
||
|
CPU ID.
|
||
|
.TP
|
||
|
OFFSET_ns_CPU#
|
||
|
Time offset that a sample fired within a sample group for this CPU.
|
||
|
.SH OVERHEAD
|
||
|
The overhead is expected to be low/negligible as this tool uses sampling at
|
||
|
99 Hertz (on all CPUs), which has a fixed and low cost, rather than sampling
|
||
|
every scheduler event as many other approaches use (which can involve
|
||
|
instrumenting millions of events per second). Sampled CPUs, run queue lengths,
|
||
|
and timestamps are written to ring buffers that are periodically read by
|
||
|
user space for reporting. Measure overhead in a test environment.
|
||
|
.SH SOURCE
|
||
|
This is from bcc.
|
||
|
.IP
|
||
|
https://github.com/iovisor/bcc
|
||
|
.PP
|
||
|
Also look in the bcc distribution for a companion _examples.txt file containing
|
||
|
example usage, output, and commentary for this tool.
|
||
|
.SH OS
|
||
|
Linux
|
||
|
.SH STABILITY
|
||
|
Unstable - in development.
|
||
|
.SH AUTHOR
|
||
|
Brendan Gregg
|
||
|
.SH SEE ALSO
|
||
|
runqlen(8)
|