You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
431 lines
15 KiB
431 lines
15 KiB
<?xml version="1.0" encoding="UTF-8" ?>
|
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
|
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
|
|
<link rel="stylesheet" href="resources/doc.css" charset="UTF-8" type="text/css" />
|
|
<link rel="stylesheet" href="../coverage/jacoco-resources/prettify.css" charset="UTF-8" type="text/css" />
|
|
<link rel="shortcut icon" href="resources/report.gif" type="image/gif" />
|
|
<script type="text/javascript" src="../coverage/jacoco-resources/prettify.js"></script>
|
|
<title>JaCoCo - Control Flow Analysis</title>
|
|
</head>
|
|
<body onload="prettyPrint()">
|
|
|
|
<div class="breadcrumb">
|
|
<a href="../index.html" class="el_report">JaCoCo</a> >
|
|
<a href="index.html" class="el_group">Documentation</a> >
|
|
<span class="el_source">Control Flow Analysis</span>
|
|
</div>
|
|
<div id="content">
|
|
|
|
<h1>Control Flow Analysis for Java Methods</h1>
|
|
|
|
<p class="hint">
|
|
Implementing a coverage tool that supports statement (C0) as well as branch
|
|
coverage coverage (C1) requires detailed analysis of the internal control flow
|
|
of Java methods. Due to the architecture of JaCoCo this analysis happens on
|
|
the bytecode of compiled class files. This document describes JaCoCo's
|
|
strategies for inserting probes into the control flow at runtime and analyzing
|
|
the actual code coverage. Marc R. Hoffmann, November 2011
|
|
</p>
|
|
|
|
<h2>Control Flow Graphs for Java Bytecode</h2>
|
|
|
|
<p>
|
|
As an starting point we take the following example method that contains a
|
|
single branching point:
|
|
</p>
|
|
|
|
<pre class="source lang-java linenums">
|
|
public static void example() {
|
|
a();
|
|
if (cond()) {
|
|
b();
|
|
} else {
|
|
c();
|
|
}
|
|
d();
|
|
}
|
|
</pre>
|
|
|
|
<p>
|
|
A Java compiler will create the following bytecode from this example method.
|
|
Java bytecode is a linear sequence of instructions. Control flow is
|
|
implemented with <i>jump</i> instructions like the conditional
|
|
<code>IFEQ</code> or the unconditional <code>GOTO</code> opcode. The jump
|
|
targets are technically relative offsets to the target instruction. For better
|
|
readability we use symbolic labels (<code>L1</code>, <code>L2</code>) instead
|
|
(also the ASM API uses such symbolic labels):
|
|
</p>
|
|
|
|
<pre class="source linenums">
|
|
public static example()V
|
|
INVOKESTATIC a()V
|
|
INVOKESTATIC cond()Z
|
|
IFEQ L1
|
|
INVOKESTATIC b()V
|
|
GOTO L2
|
|
L1: INVOKESTATIC c()V
|
|
L2: INVOKESTATIC d()V
|
|
RETURN
|
|
</pre>
|
|
|
|
<p>
|
|
The possible control flow in the bytecode above can be represented by a graph.
|
|
The nodes are byte code instruction, the edged of the graph represent the
|
|
possible control flow between the instructions. The control flow of the
|
|
example is shown in the left box of this diagram:
|
|
</p>
|
|
|
|
<img src="resources/flow-example.png" alt="Bytecode Control Flow"/>
|
|
|
|
|
|
<h3>Flow Edges</h3>
|
|
|
|
<p>
|
|
The control flow graph of a Java method defined by Java byte code may have
|
|
the following Edges. Each edge connects a source instruction with a target
|
|
instruction. In some cases the source instruction or the target instruction
|
|
does not exist (virtual edges for method entry and exit) or cannot be
|
|
exactly specified (exception handlers).
|
|
</p>
|
|
|
|
<table class="coverage">
|
|
<thead>
|
|
<tr>
|
|
<td>Type</td>
|
|
<td>Source</td>
|
|
<td>Target</td>
|
|
<td>Remarks</td>
|
|
</tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>ENTRY</td>
|
|
<td>-</td>
|
|
<td>First instruction in method</td>
|
|
<td></td>
|
|
</tr>
|
|
<tr>
|
|
<td>SEQUENCE</td>
|
|
<td>Instruction, except <code>GOTO</code>, <code>xRETURN</code>,
|
|
<code>THROW</code>, <code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code></td>
|
|
<td>Subsequent instruction</td>
|
|
<td></td>
|
|
</tr>
|
|
<tr>
|
|
<td>JUMP</td>
|
|
<td><code>GOTO</code>, <code>IFx</code>, <code>TABLESWITCH</code> or
|
|
<code>LOOKUPSWITCH</code> instruction</td>
|
|
<td>Target instruction</td>
|
|
<td><code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code> will define
|
|
multiple edges.</td>
|
|
</tr>
|
|
<tr>
|
|
<td>EXHANDLER</td>
|
|
<td>Any instruction in handler scope</td>
|
|
<td>Target instruction</td>
|
|
<td></td>
|
|
</tr>
|
|
<tr>
|
|
<td>EXIT</td>
|
|
<td><code>xRETURN</code> or <code>THROW</code> instruction</td>
|
|
<td>-</td>
|
|
<td></td>
|
|
</tr>
|
|
<tr>
|
|
<td>EXEXIT</td>
|
|
<td>Any instruction</td>
|
|
<td>-</td>
|
|
<td>Unhandled exception.</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
The current JaCoCo implementation ignores edges caused by implicit exceptions
|
|
and the the method entry. This means we consider SEQUENCE, JUMP, EXIT.
|
|
</p>
|
|
|
|
|
|
<h2>Probe Insertion Strategy</h2>
|
|
|
|
<p>
|
|
Probes are additional instructions that can be inserted between existing
|
|
instructions. They do not change the behavior of the method but record the
|
|
fact that they have been executed. One can think probes are placed on edges of
|
|
the control flow graph. Theoretically we could insert a probe at every edge of
|
|
the control flow graph. As a probe implementation itself requires multiple
|
|
bytecode instructions this would increase the size of the class files several
|
|
times and significantly slow down execution speed of the instrumented classes.
|
|
Fortunately this is not required, in fact we only need a few probes per method
|
|
depending on the control flow of the method. For example a method without any
|
|
branches requires a single probe only. The reason for this is that starting
|
|
from a certain probe we can back-trace the execution path and typically get
|
|
coverage information for multiple instructions.
|
|
</p>
|
|
|
|
<p>
|
|
If a probe has been executed we know that the corresponding edge has been
|
|
visited. From this edge we can conclude to other preceding nodes and edges:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>If a edge has been visited, we know that the source node of the this edge
|
|
has been executed.</li>
|
|
<li>If a node has been executed and the node is the target of only one edge
|
|
we know that this edge has been visited.</li>
|
|
</ul>
|
|
|
|
<p>
|
|
Recursively applying these rules allows to determine the execution status of
|
|
all instructions of a method – given that we have probes at the right
|
|
positions. Therefore JaCoCo inserts probes
|
|
</p>
|
|
|
|
<ul>
|
|
<li>at every method exit (return or throws) and</li>
|
|
<li>at every edge where the target instruction is the target of more than one
|
|
edge.</li>
|
|
</ul>
|
|
|
|
<p>
|
|
We recall that a probe is simply a small sequence of additional instructions
|
|
that needs to be inserted at a control flow edge. The following table
|
|
illustrates how this extra instructions are added in case of different edge
|
|
types.
|
|
</p>
|
|
|
|
<table class="coverage">
|
|
<thead>
|
|
<tr>
|
|
<td>Type</td>
|
|
<td>Before</td>
|
|
<td>After</td>
|
|
<td>Remarks</td>
|
|
</tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>SEQUENCE</td>
|
|
<td><img src="resources/flow-sequence.png" alt="Sequence"/></td>
|
|
<td><img src="resources/flow-sequence-probe.png" alt="Sequence with Probe"/></td>
|
|
<td>
|
|
In case of a simple sequence the probe is simply inserted between the
|
|
two instructions.
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>JUMP (unconditional)</td>
|
|
<td><img src="resources/flow-goto.png" alt="Unconditional Jump"/></td>
|
|
<td><img src="resources/flow-goto-probe.png" alt="Unconditional Jump with Probe"/></td>
|
|
<td>
|
|
As an unconditional jump is executed in any case, we can also insert the
|
|
probe just before the GOTO instruction.
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>JUMP (conditional)</td>
|
|
<td><img src="resources/flow-cond.png" alt="Conditional Jump"/></td>
|
|
<td><img src="resources/flow-cond-probe.png" alt="Conditional Jump with Probe"/></td>
|
|
<td>
|
|
Adding a probe to an conditional jump is little bit more tricky. We
|
|
invert the semantic of the opcode and add the probe right after the
|
|
conditional jump. With a subsequent <code>GOTO</code> instruction we
|
|
jump to the original target. Note that this approach will not introduce
|
|
a backward jump, which would cause trouble with the Java verifier if we
|
|
have an uninitialized object on the stack.
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td>EXIT</td>
|
|
<td><img src="resources/flow-exit.png" alt="Exit"/></td>
|
|
<td><img src="resources/flow-exit-probe.png" alt="Exit with Probe"/></td>
|
|
<td>
|
|
As is is the nature of RETURN and THROW statements to actually leave the
|
|
method we add the probe right before these statements.
|
|
</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
Now let's see how this rules apply to the example snippet above. We see that
|
|
<code>INVOKE d()</code> instruction is the only node with more than one
|
|
incoming edge. So we need to place probes on those edges and another probe on
|
|
the only exit node. The result is shown the the right box of the diagram
|
|
above.
|
|
</p>
|
|
|
|
<h2>Additional Probes Between Lines</h2>
|
|
|
|
<p>
|
|
The probe insertion strategy described so far does not consider implicit
|
|
exceptions thrown for example from invoked methods. If the control flow
|
|
between two probes is interrupted by a exception not explicitly created with
|
|
a <code>throw</code> statement all instruction in between are considered as
|
|
not covered. This leads to unexpected results especially when the the block of
|
|
instructions spans multiple lines of source code.
|
|
</p>
|
|
|
|
<p>
|
|
Therefore JaCoCo adds an additional probe between the instructions of two
|
|
lines whenever the subsequent line contains at least one method invocation.
|
|
This limits the effect of implicit exceptions from method invocations to
|
|
single lines of source. The approach only works for class files compiled with
|
|
debug information (line numbers) and does not consider implicit exceptions
|
|
from other instructions than method invocations (e.g.
|
|
<code>NullPointerException</code> or <code>ArrayIndexOutOfBoundsException</code>).
|
|
</p>
|
|
|
|
<h2>Probe Implementation</h2>
|
|
|
|
<p>
|
|
Code coverage analysis is a runtime metric that provides execution details
|
|
of the software under test. This requires detailed recording about the
|
|
instructions (instruction coverage) that have been executed. For branch
|
|
coverage also the outcome of decisions has to be recorded. In any case
|
|
execution data is collected by so called probes:
|
|
</p>
|
|
|
|
<p class="hint">
|
|
A <b>probe</b> is a sequence of bytecode instructions that can be inserted
|
|
into a Java method. When the probe is executed, this fact is recorded and can
|
|
be reported by the coverage runtime. The probe must not change the behavior
|
|
of the original code.
|
|
</p>
|
|
|
|
<p>
|
|
The only purpose of the probe is to record that it has been executed at least
|
|
once. The probe does not record the number of times it has been called or
|
|
collect any timing information. The latter is out of scope for code coverage
|
|
analysis and more in the objective of a performance analysis tool. Typically
|
|
multiple probes needs to be inserted into each method, therefore probes needs
|
|
to be identified. Also the probe implementation and the storage mechanism it
|
|
depends on needs to be thread safe as multi-threaded execution is a common
|
|
scenario for java applications (albeit not for plain unit tests). Probes must
|
|
not have any side effects on the original code of the method. Also they should
|
|
add minimal overhead.
|
|
</p>
|
|
|
|
<p>
|
|
So to summarize the requirements for execution probes:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>Record execution</li>
|
|
<li>Identification for different probes</li>
|
|
<li>Thread safe</li>
|
|
<li>No side effects on application code</li>
|
|
<li>Minimal runtime overhead</li>
|
|
</ul>
|
|
|
|
<p>
|
|
JaCoCo implements probes with a <code>boolean[]</code> array instance per
|
|
class. Each probe corresponds to a entry in this array. Whenever the probe is
|
|
executed the entry is set to <code>true</code> with the following four
|
|
bytecode instructions:
|
|
</p>
|
|
|
|
<pre class="source">
|
|
ALOAD probearray
|
|
xPUSH probeid
|
|
ICONST_1
|
|
BASTORE
|
|
</pre>
|
|
|
|
<p>
|
|
Note that this probe code is thread safe, does not modify the operand stack
|
|
or modify local variables and is also thread safe. It does also not leave the
|
|
method though an external call. The only prerequisite is that the probe array
|
|
is available as a local variable. For this at the beginning of each method
|
|
additional instrumentation code needs to be added to obtain the array instance
|
|
associated with the belonging class. To avoid code duplication the
|
|
initialization is delegated to a static private method
|
|
<code>$jacocoinit()</code> which is added to every non-interface class.
|
|
</p>
|
|
|
|
<p>
|
|
The size of the probe code above depends on the position of the probe array
|
|
variable and the value of the probe identifier as different opcodes can be
|
|
used. As calculated in the table below the overhead per probe ranges between 4
|
|
and 7 bytes of additional bytecode:
|
|
</p>
|
|
|
|
<table class="coverage">
|
|
<thead>
|
|
<tr>
|
|
<td>Possible Opcodes</td>
|
|
<td>Min. Size [bytes]</td>
|
|
<td>Max. Size [bytes]</td>
|
|
</tr>
|
|
</thead>
|
|
<tfoot>
|
|
<tr>
|
|
<td>Total:</td>
|
|
<td>4</td>
|
|
<td>7</td>
|
|
</tr>
|
|
</tfoot>
|
|
<tbody>
|
|
<tr>
|
|
<td><code>ALOAD_x</code>, <code>ALOAD</code> <sup>1</sup></td>
|
|
<td>1</td>
|
|
<td>2</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>ICONST_x</code>, <code>BIPUSH</code>, <code>SIPUSH</code>, <code>LDC</code>, <code>LDC_W</code> <sup>2</sup></td>
|
|
<td>1</td>
|
|
<td>3</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>ICONST_1</code></td>
|
|
<td>1</td>
|
|
<td>1</td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>BASTORE</code></td>
|
|
<td>1</td>
|
|
<td>1</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
|
|
<p>
|
|
<sup>1</sup> The probe array is the first variable after the arguments.
|
|
If the method arguments do not consume more that 3 slots the 1-byte opcode can
|
|
be used.<br/>
|
|
<sup>2</sup> 1-byte opcodes for ids 0 to 5, 2-byte opcode for ids up to 127,
|
|
3-byte opcode for ids up to 32767. Ids values of 32768 or more require an
|
|
additional constant pool entry. For normal class files it is very unlikely to
|
|
require more than 32,000 probes.
|
|
</p>
|
|
|
|
<h2>Performance</h2>
|
|
|
|
<p>
|
|
The control flow analysis and probe insertion strategy described in this
|
|
document allows to efficiently record instruction and branch coverage. In
|
|
total classes instrumented with JaCoCo increase their size by about 30%. Due
|
|
to the fact that probe execution does not require any method calls, only local
|
|
instructions, the observed execution time overhead for instrumented
|
|
applications typically is less than 10%.
|
|
</p>
|
|
|
|
<h2>References</h2>
|
|
|
|
<ul>
|
|
<li><a href="http://asm.objectweb.org/">ASM byte code library</a> by Eric Bruneton at al.</li>
|
|
<li><a href="http://andrei.gmxhome.de/bytecode/index.html">Bytecode Outline Plug-In</a> by Andrei Loskutov</li>
|
|
<li><a href="http://en.wikipedia.org/wiki/Glossary_of_graph_theory">Wikipedia: Glossary of Graph Theory</a></li>
|
|
</ul>
|
|
|
|
</div>
|
|
<div class="footer">
|
|
<div class="right"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</div>
|
|
<a href="license.html">Copyright</a> © @copyright.years@ Mountainminds GmbH & Co. KG and Contributors
|
|
</div>
|
|
|
|
</body>
|
|
</html>
|