You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
326 lines
15 KiB
326 lines
15 KiB
Demonstrations of cpuunclaimed, the Linux eBPF/bcc version.
|
|
|
|
|
|
This tool samples the length of the CPU run queues and determine when there are
|
|
idle CPUs, yet queued threads waiting their turn. It reports the amount of idle
|
|
(yet unclaimed by waiting threads) CPU as a system-wide percentage. For
|
|
example:
|
|
|
|
# ./cpuunclaimed.py
|
|
Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
|
|
%CPU 83.00%, unclaimed idle 0.12%
|
|
%CPU 87.25%, unclaimed idle 0.38%
|
|
%CPU 85.00%, unclaimed idle 0.25%
|
|
%CPU 85.00%, unclaimed idle 0.25%
|
|
%CPU 80.88%, unclaimed idle 0.00%
|
|
%CPU 82.25%, unclaimed idle 0.00%
|
|
%CPU 83.50%, unclaimed idle 0.12%
|
|
%CPU 81.50%, unclaimed idle 0.00%
|
|
%CPU 81.38%, unclaimed idle 0.00%
|
|
[...]
|
|
|
|
This shows a system running at over 80% CPU utilization, and with less than
|
|
0.5% unclaimed idle CPUs.
|
|
|
|
Unclaimed idle CPUs can happen for a number of reasons:
|
|
|
|
- An application has been bound to some, but not all, CPUs, and has runnable
|
|
threads that cannot migrate to other CPUs due to this configuration.
|
|
- CPU affinity: an optimization that leaves threads on CPUs where the CPU
|
|
caches are warm, even if this means short periods of waiting while other
|
|
CPUs are idle. The wait period is tunale (see sysctl, kernel.sched*).
|
|
- Scheduler bugs.
|
|
|
|
An unclaimed idle of < 1% is likely to be CPU affinity, and not usually a
|
|
cause for concern. By leaving the CPU idle, overall throughput of the system
|
|
may be improved. This tool is best for identifying larger issues, > 2%, due
|
|
to the coarseness of its 99 Hertz samples.
|
|
|
|
|
|
This is an 8 CPU system, with an 8 CPU-bound threaded application running that
|
|
has been bound to one CPU (via taskset):
|
|
|
|
# ./cpuunclaimed.py
|
|
Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
|
|
%CPU 12.63%, unclaimed idle 86.36%
|
|
%CPU 12.50%, unclaimed idle 87.50%
|
|
%CPU 12.63%, unclaimed idle 87.37%
|
|
%CPU 12.75%, unclaimed idle 87.25%
|
|
%CPU 12.50%, unclaimed idle 87.50%
|
|
%CPU 12.63%, unclaimed idle 87.37%
|
|
%CPU 12.50%, unclaimed idle 87.50%
|
|
%CPU 12.50%, unclaimed idle 87.50%
|
|
[...]
|
|
|
|
It shows that 7 of the 8 CPUs (87.5%) are idle at the same time there are
|
|
queued threads waiting to run on CPU. This is an artificial situation caused
|
|
by binding threads to the same CPU, to demonstrate how the tool works.
|
|
|
|
|
|
This is an 8 CPU system running a Linux kernel build with "make -j8", and -T
|
|
to print timestamps:
|
|
|
|
# ./cpuunclaimed.py -T
|
|
Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
|
|
22:25:55 %CPU 98.88%, unclaimed idle 0.12%
|
|
22:25:56 %CPU 99.75%, unclaimed idle 0.25%
|
|
22:25:57 %CPU 99.50%, unclaimed idle 0.50%
|
|
22:25:58 %CPU 99.25%, unclaimed idle 0.75%
|
|
22:25:59 %CPU 99.75%, unclaimed idle 0.25%
|
|
22:26:00 %CPU 99.50%, unclaimed idle 0.50%
|
|
22:26:01 %CPU 99.25%, unclaimed idle 0.75%
|
|
22:26:02 %CPU 99.25%, unclaimed idle 0.75%
|
|
22:26:03 %CPU 99.01%, unclaimed idle 0.87%
|
|
22:26:04 %CPU 99.88%, unclaimed idle 0.12%
|
|
22:26:05 %CPU 99.38%, unclaimed idle 0.62%
|
|
|
|
There's now a consistent, yet small, amount of unclaimed idle CPU. This is
|
|
expected to be deliberate: CPU affinity, as mentioned earlier.
|
|
|
|
|
|
The -j option will print raw samples: around one hundred lines of output
|
|
every second. For the same system with a Linux kernel build of "make -j8":
|
|
|
|
# ./cpuunclaimed.py -j
|
|
TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7
|
|
514606928954752,1,1,1,1,1,1,1,1
|
|
514606939054312,1,1,1,1,1,1,1,2
|
|
514606949156518,1,1,1,1,1,1,1,1
|
|
514606959256596,2,2,1,1,1,1,1,1
|
|
514606969357989,1,1,1,1,1,2,1,1
|
|
514606979459700,1,2,1,1,1,2,1,1
|
|
514606989560481,1,1,1,1,1,1,1,1
|
|
514606999661396,1,1,1,1,1,1,2,1
|
|
514607009795601,1,1,1,1,1,1,1,2
|
|
514607019862711,1,1,1,1,1,1,1,1
|
|
514607029963734,1,1,1,1,1,1,1,1
|
|
514607040062372,1,1,1,1,1,1,1,1
|
|
514607050197735,1,1,1,2,1,1,1,1
|
|
514607060266464,1,1,1,1,1,1,1,2
|
|
514607070368025,1,1,1,1,1,2,1,1
|
|
514607080468375,1,1,1,1,1,1,1,2
|
|
514607090570292,3,2,1,1,1,1,1,1
|
|
514607100670725,1,1,1,1,1,2,1,1
|
|
514607110771946,1,2,1,1,1,1,1,1
|
|
514607120873489,1,1,1,1,2,1,2,1
|
|
514607130973857,2,1,1,1,3,1,1,1
|
|
514607141080056,0,1,1,1,1,2,1,3
|
|
514607151176312,1,1,1,2,1,1,1,1
|
|
514607161277753,1,1,1,1,1,1,2,1
|
|
514607171379095,1,1,1,1,1,1,1,1
|
|
514607181479262,1,1,1,1,1,1,1,1
|
|
514607191580794,3,1,1,1,1,1,1,1
|
|
514607201680952,1,1,1,1,1,1,2,1
|
|
514607211783683,1,1,1,1,1,1,1,1
|
|
514607221883274,1,1,1,1,1,1,0,1
|
|
514607231984244,1,1,1,1,1,1,1,1
|
|
514607242085698,1,1,1,1,1,1,1,1
|
|
514607252216898,1,2,1,1,1,1,1,1
|
|
514607262289420,1,1,1,1,1,2,1,1
|
|
514607272389922,1,1,1,1,1,1,1,1
|
|
514607282489413,1,1,1,1,1,1,1,1
|
|
514607292589950,1,3,1,1,1,1,1,1
|
|
514607302693367,1,1,1,1,2,1,1,1
|
|
514607312793792,1,1,1,1,1,1,1,1
|
|
514607322895249,1,1,1,3,1,1,3,1
|
|
514607332994278,1,0,1,1,1,2,1,2
|
|
514607343095836,1,1,1,1,1,2,1,1
|
|
514607353196533,1,1,1,1,2,1,1,1
|
|
514607363297749,1,1,1,1,1,1,1,2
|
|
514607373399011,1,1,1,1,1,1,1,2
|
|
514607383499730,1,1,1,1,1,1,1,2
|
|
514607393601510,1,1,1,1,1,1,1,2
|
|
514607403704117,2,1,1,1,1,1,1,2
|
|
514607413802700,1,1,1,1,2,1,0,1
|
|
514607423904559,1,1,1,1,1,1,1,1
|
|
[...]
|
|
|
|
The output is verbose: printing out a timestamp, and then the length of each
|
|
CPU's run queue. The second last line, of timestamp 514607413802700, is an
|
|
example of what this tool detects: CPU 4 has a run queue length of 4, which
|
|
means one thread running and one thread queued, while CPU 6 has a run queue
|
|
length of 0: idle. The very next sample shows all CPUs busy.
|
|
|
|
|
|
The -J option prints raw samples with time offsets showing when the samples
|
|
were collected on each CPU. It's mostly useful for debugging the tool itself.
|
|
For example, during a Linux kernel build:
|
|
|
|
# ./cpuunclaimed.py -J
|
|
TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7
|
|
514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792
|
|
514722635299034,1,1,1,1,1,2,1,1,0,28809,51999,74183,89552,110011,131995,153519
|
|
514722645400274,1,1,1,1,1,1,1,2,0,28024,51333,73652,88964,110075,131973,153568
|
|
514722655501816,1,2,1,1,1,1,1,1,0,28893,51671,75233,89496,109430,131945,153694
|
|
514722665602594,1,1,2,1,1,2,1,1,0,28623,50988,73866,89383,109186,131786,154555
|
|
514722675703498,1,1,1,1,1,1,1,1,0,27379,51031,73175,89625,110380,131482,104811
|
|
514722685804942,1,1,1,1,1,2,1,1,0,27213,50501,72183,88797,108780,130659,152153
|
|
514722695906294,1,1,1,1,1,1,1,1,0,27036,51182,73420,87861,109585,130364,155089
|
|
514722706005778,1,1,1,1,1,1,1,1,0,28492,51851,74138,89744,110208,132462,154060
|
|
514722716060705,1,1,1,1,1,1,1,1,0,154499,152528,155232,155046,154502,178746,200001
|
|
514722726209615,1,1,1,1,1,1,1,1,0,28170,49580,72605,87741,108144,130723,152138
|
|
514722736309475,1,2,1,1,1,1,1,1,0,27421,51386,73061,89358,109457,131273,153005
|
|
514722746410845,1,2,1,1,1,2,1,1,0,27788,50840,72720,88920,109111,131143,152979
|
|
514722756511363,1,1,1,1,1,1,2,1,0,28280,50977,73559,89848,109659,131579,152693
|
|
514722766613044,1,1,1,1,1,1,1,1,0,28046,50812,72754,89160,110108,130735,152948
|
|
514722776712932,1,1,1,2,1,1,1,1,0,28586,51177,73974,89588,109947,132376,154162
|
|
514722786815477,1,1,1,1,1,1,1,1,0,27973,71104,72539,88302,108896,130414,152236
|
|
514722796914955,1,1,1,1,1,1,1,1,0,29054,52354,74214,89592,110615,132586,153925
|
|
514722807044060,1,1,1,1,1,1,1,1,1587130,0,24079,46633,61787,82325,104706,125278
|
|
514722817117432,2,1,2,1,1,1,1,1,0,27628,51038,75138,89724,109340,132426,155348
|
|
514722827218254,1,1,1,1,1,1,2,1,0,29111,51868,74347,88904,109911,132764,153851
|
|
514722837340158,1,1,1,1,1,1,1,1,0,7366,30760,53528,68622,89317,111095,132319
|
|
514722847421305,1,1,1,1,1,1,1,1,0,28257,51105,73841,89037,110820,131605,153368
|
|
514722857521112,1,1,1,1,1,1,1,1,0,28544,51441,73857,89530,110497,131915,153513
|
|
514722867626129,0,2,1,1,1,1,1,1,0,24621,47917,70568,85391,106670,128081,150329
|
|
514722877727183,2,1,1,1,1,1,1,1,0,24869,47630,71547,84761,106048,128444,149285
|
|
514722887824589,1,1,1,1,1,1,2,1,0,28793,51212,73863,89584,109773,132348,153194
|
|
514722897925481,1,1,1,1,1,1,2,1,0,29278,51163,73961,89774,109592,132029,153715
|
|
514722908026097,1,1,1,1,1,1,1,1,0,30630,35595,36210,188001,190815,190072,190732
|
|
514722918127439,1,1,1,1,1,1,1,1,0,28544,51885,73948,89987,109763,132632,154083
|
|
514722928227399,1,1,1,1,1,1,1,1,0,31882,51574,74769,89939,110578,132951,154356
|
|
514722938329471,1,1,1,1,1,1,1,1,0,28498,51304,74101,89670,110278,132653,153176
|
|
514722948430589,1,1,1,1,1,1,1,1,0,27868,50925,73477,89676,109583,132360,153014
|
|
514722958531802,1,1,1,1,1,1,1,1,0,28505,50886,73729,89919,109618,131988,152896
|
|
514722968632181,1,1,1,1,1,1,1,1,0,28492,51749,73977,90334,109816,132897,152890
|
|
514722978733584,1,1,1,1,1,1,1,1,0,28847,50957,74121,90014,110019,132377,152978
|
|
514722988834321,1,1,1,1,1,1,1,1,0,28601,51437,74021,89968,110252,132233,153623
|
|
514722998937170,1,1,2,1,1,1,1,1,0,27007,50044,73259,87725,108663,132194,152459
|
|
514723009036821,1,2,1,2,1,1,1,1,0,28226,50937,73983,89110,110476,131740,153663
|
|
514723019137577,1,1,1,1,1,1,1,1,0,30261,52357,75657,87803,61823,131850,153585
|
|
514723029238745,1,1,1,1,1,1,1,1,0,28030,50752,74452,89240,110791,132187,153327
|
|
514723039339069,1,1,1,1,1,1,1,1,0,29791,52636,75996,90475,110414,132232,154714
|
|
514723049439822,1,1,1,1,2,1,1,1,0,29133,56662,74153,89520,110683,132740,154708
|
|
514723059541617,1,1,1,1,1,1,1,1,0,27932,51480,74644,89656,109176,131499,153732
|
|
514723069642500,1,1,2,1,1,1,2,1,0,27678,51509,73984,90136,110124,131554,153459
|
|
514723079743525,2,1,1,1,1,1,1,1,0,28029,51424,74394,90056,110087,132383,152963
|
|
514723089844091,2,1,1,2,1,1,1,1,0,28944,51692,74440,90339,110402,132722,154083
|
|
514723099945957,1,1,2,1,1,1,1,1,0,28425,51267,73164,89322,115048,114630,115187
|
|
514723110047020,1,1,2,0,1,1,1,2,0,28192,50811,76814,89835,109370,131265,153511
|
|
514723120216662,1,1,2,1,1,2,1,1,29,34,0,4514,19268,40293,62674,84009
|
|
[...]
|
|
|
|
This is a Xen guest system, and it shows that CPU 0 usually completes first (an
|
|
offset of 0), followed by CPU 1 around 28000 nanoseconds later, and so on.
|
|
The spread of offsets is triggered by the bcc Python library that initializes
|
|
the timers, which steps through the CPUs in sequence, with a small delay
|
|
between them merely from executing its own loop code.
|
|
|
|
Here's more output during a Linux kernel build:
|
|
|
|
# ./cpuunclaimed.py -J
|
|
TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7
|
|
514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792
|
|
515700745758947,2,1,1,1,1,1,1,1,0,19835,34891,49397,59364,71988,87571,102769
|
|
515700755860451,2,1,1,1,1,1,1,2,0,19946,34323,49855,59844,72741,87925,102891
|
|
515700765960560,1,1,1,1,1,1,1,1,0,20805,35339,50436,59677,73557,88661,104796
|
|
515700776061744,1,1,1,1,1,1,1,1,1626,77,0,190,153452,154665,178218,154116
|
|
515700786162017,1,1,1,1,1,1,1,1,0,20497,35361,51552,59787,74451,147789,104545
|
|
515700796262811,1,1,1,1,1,1,1,2,0,20910,35657,50805,60175,73953,88492,103527
|
|
515700806364951,1,1,1,1,1,1,1,1,0,20140,35023,50074,59726,72757,88040,102421
|
|
515700816465253,1,1,1,1,1,1,2,1,0,20952,34899,50262,60048,72890,88067,103545
|
|
515700826566573,1,1,1,1,1,1,1,1,0,20898,35490,50609,59805,74060,88550,103354
|
|
515700836667480,1,1,1,1,1,1,2,1,0,20548,34760,50959,59490,73059,87820,103006
|
|
515700846768182,1,1,1,1,1,1,2,1,0,20571,35113,50777,59962,74139,88543,103192
|
|
515700856869468,1,1,2,1,1,2,2,1,0,20932,35382,50510,60106,73739,91818,103684
|
|
515700866971905,1,1,1,2,1,1,1,1,0,19780,33018,49075,58375,71949,86537,102136
|
|
515700877073459,2,1,1,1,1,1,1,1,0,20065,73966,48989,58832,71408,85714,101067
|
|
515700887172772,1,1,1,1,1,1,1,1,0,20909,34608,51493,59890,73564,88668,103454
|
|
515700897273292,1,2,1,1,1,1,1,1,0,20353,35292,50114,59773,73948,88615,103383
|
|
515700907374341,1,1,2,1,1,1,1,1,0,20816,35206,50915,60062,73878,88857,103794
|
|
515700917475331,1,1,6,1,1,2,1,1,0,20752,34931,50280,59764,73781,88329,103234
|
|
515700927576958,1,1,1,1,1,1,1,1,0,19929,34703,50181,59364,73004,88053,103127
|
|
515700937677298,1,1,2,2,1,1,1,1,0,21178,34724,50740,61193,73452,89030,103390
|
|
515700947778409,2,1,1,1,1,1,1,1,0,21059,35604,50853,60098,73919,88675,103506
|
|
515700957879196,2,1,1,1,1,1,1,1,0,21326,35939,51492,60083,74249,89474,103761
|
|
[...]
|
|
|
|
Notice the tighter range of offsets? I began executing cpuunclaimed when the
|
|
system was idle, and it initialized the CPU timers more quickly, and then I
|
|
began the Linux kernel build.
|
|
|
|
Here's some different output, this time from a physical system with 4 CPUs,
|
|
also doing a kernel build,
|
|
|
|
# ./cpuunclaimed.py -J
|
|
TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3
|
|
4429382557480,1,1,1,1,0,6011,10895,16018
|
|
4429392655042,2,1,1,1,0,8217,13661,19378
|
|
4429402757604,1,1,1,1,0,6879,12433,18000
|
|
4429412857809,1,1,1,1,0,8303,13190,18719
|
|
4429422960709,2,1,1,1,0,6095,11234,17079
|
|
4429433060391,1,1,1,2,0,6747,12480,18070
|
|
4429443161699,1,1,1,1,0,6560,12264,17945
|
|
4429453262002,1,2,1,1,0,6992,12644,18341
|
|
4429463363706,1,2,1,1,0,6211,12071,17853
|
|
4429473465571,1,1,1,1,0,5766,11495,17638
|
|
4429483566920,1,1,1,1,0,5223,11736,16358
|
|
4429493666279,1,1,1,1,0,6964,12653,18410
|
|
4429503769113,1,1,1,1,0,5161,11399,16612
|
|
4429513870744,1,1,1,1,0,5943,10583,15768
|
|
4429523969826,1,1,1,1,0,6533,12336,18189
|
|
4429534070311,1,1,1,1,0,6834,12816,18488
|
|
4429544170456,1,1,1,1,0,7284,13401,19129
|
|
4429554274467,1,2,1,1,0,5941,11160,16594
|
|
4429564372365,1,2,1,1,0,7514,13618,19190
|
|
4429574474406,1,2,1,1,0,6687,12650,18248
|
|
4429584574220,1,2,1,1,0,7912,13705,19136
|
|
[...]
|
|
|
|
If the offset range becomes too great, we can no longer conclude about when
|
|
some CPUs were idle and others had queued work. The tool will detect this,
|
|
and print an error message and exit.
|
|
|
|
|
|
Some systems can power down CPUs when idle, and when they wake up again the
|
|
timed samples may resume from different offsets. If this happens, this tool
|
|
can no longer draw conclusions about when some CPUs were idle and others
|
|
had queued work, so it prints an error, and exits. Eg:
|
|
|
|
# ./cpuunclaimed.py 1
|
|
Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
|
|
%CPU 0.25%, unclaimed idle 0.00%
|
|
%CPU 0.75%, unclaimed idle 0.00%
|
|
%CPU 0.25%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.12%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.25%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.12%, unclaimed idle 0.00%
|
|
%CPU 0.13%, unclaimed idle 0.00%
|
|
%CPU 0.12%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
%CPU 0.00%, unclaimed idle 0.00%
|
|
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 4328176 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting.
|
|
|
|
It's expected that this will only really occur on idle systems.
|
|
|
|
USAGE:
|
|
|
|
# ./cpuunclaimed.py -h
|
|
usage: cpuunclaimed.py [-h] [-j] [-J] [-T] [interval] [count]
|
|
|
|
Sample CPU run queues and calculate unclaimed idle CPU
|
|
|
|
positional arguments:
|
|
interval output interval, in seconds
|
|
count number of outputs
|
|
|
|
optional arguments:
|
|
-h, --help show this help message and exit
|
|
-j, --csv print sample summaries (verbose) as comma-separated values
|
|
-J, --fullcsv print sample summaries with extra fields: CPU sample
|
|
offsets
|
|
-T, --timestamp include timestamp on output
|
|
|
|
examples:
|
|
./cpuunclaimed # sample and calculate unclaimed idle CPUs,
|
|
# output every 1 second (default)
|
|
./cpuunclaimed 5 10 # print 5 second summaries, 10 times
|
|
./cpuunclaimed -T 1 # 1s summaries and timestamps
|
|
./cpuunclaimed -j # raw dump of all samples (verbose), CSV
|