# Regres - SwiftShader automated testing

## Introduction

Regres is a collection of tools to perform [dEQP](https://github.com/KhronosGroup/VK-GL-CTS)
presubmit and continuous integration testing and code coverage evaluation for
SwiftShader.

Regres provides:

* [Presubmit testing](#presubmit-testing) - An automatic OpenGL|ES and Vulkan
  dEQP test run for each Gerrit patchset put up for review.
* [Continuous integration testing](#daily-run-continuous-integration-testing) -
  An OpenGL|ES and Vulkan dEQP test run performed against the `master` branch each night. \
  This nightly run also produces code coverage information which can be viewed at
  [swiftshader-regres.github.io/swiftshader-coverage](https://swiftshader-regres.github.io/swiftshader-coverage/).
* [Local dEQP test runner](#local-deqp-test-runner) - A local tool for
  efficiently running a number of dEQP tests based on wildcard or regex name
  matching.

The Regres source root directory is at [`<swiftshader>/tests/regres/`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/).

## Presubmit testing

Regres monitors changes that have been [put up for review with Gerrit](https://swiftshader-review.googlesource.com/q/status:open).

Once a new [qualifying](#qualifying) patchset has been found, Regres will
check out, build and test the change against the parent changelist. \
Any differences in results are reported as a review comment on the change
[[example]](https://swiftshader-review.googlesource.com/c/SwiftShader/+/46369/5#message-4f09ea3e6d01ed94ae26183c8b6c547c90492c12).

### Qualifying

As Regres may be running externally authored code on Google hardware,
Regres will only test a change if it is authored by or reviewed by a Googler.

Only the most recent patchset of a change will be tested. If a new patchset is
pushed while the previous one is being tested, testing of the previous patchset
will continue to completion and its results will be posted, and the new patchset
will then be queued for testing.

### Prioritization

At the time of writing, a Regres presubmit run takes a little over 20 minutes to
complete, and there is a single Regres machine servicing all changes.
To keep Regres responsive, changes are prioritized based on their 'readiness to
land', which is determined by the change's `Kokoro-Presubmit`, `Code-Review` and
`Presubmit-Ready` Gerrit labels.

### Test filtering

By default, Regres will run all the test lists declared in the
[`<swiftshader>/tests/regres/ci-tests.json`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/ci-tests.json) file.\
As new functionality is being implemented, the test lists in `ci-tests.json` may
reference known-passing test lists updated by the [daily run](#daily-run-continuous-integration-testing),
so that failing tests for incomplete functionality are skipped, but tests that
pass for new functionality *are tested* to ensure they do not regress.

Additional test names found in the files referenced by
[`<swiftshader>/tests/regres/full-tests.json`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/full-tests.json)
can be explicitly included in the change's presubmit run
by including a line in the change description with the signature:

```text
Test: <dEQP-test-pattern>
```

`<dEQP-test-pattern>` can be a single dEQP test name, or you can use wildcards
[as documented here](https://golang.org/pkg/path/filepath/#Match).

You can repeat `Test:` as many times as you like. `Tests:` is also accepted.

[For example](https://swiftshader-review.googlesource.com/c/SwiftShader/+/26574):

```text
Add support for OpLogicalEqual, OpLogicalNotEqual

Test: dEQP-VK.glsl.operator.bool_compare.*
Test: dEQP-VK.glsl.operator.binary_operator.equal.*
Test: dEQP-VK.glsl.operator.binary_operator.not_equal.*
Bug: b/126870789
Change-Id: I9d33444d67792274d8027b7d1632235533cfc079
```

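As a rough illustration of how `Test:` lines could be picked out of a change description and applied with `filepath.Match`, here is a minimal sketch. The helper names `testPatterns` and `matchesAny` and the test name used in `main` are hypothetical, and the parsing rules Regres actually applies may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// testPatterns returns the patterns found on `Test:` or `Tests:` lines of a
// change description. Hypothetical helper, not the Regres implementation.
func testPatterns(description string) []string {
	var patterns []string
	for _, line := range strings.Split(description, "\n") {
		line = strings.TrimSpace(line)
		for _, prefix := range []string{"Test:", "Tests:"} {
			if strings.HasPrefix(line, prefix) {
				p := strings.TrimSpace(strings.TrimPrefix(line, prefix))
				if p != "" {
					patterns = append(patterns, p)
				}
				break
			}
		}
	}
	return patterns
}

// matchesAny reports whether name matches at least one wildcard pattern.
// Malformed patterns are treated as non-matching.
func matchesAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	desc := "Add support for OpLogicalEqual, OpLogicalNotEqual\n\n" +
		"Test: dEQP-VK.glsl.operator.bool_compare.*\n" +
		"Bug: b/126870789\n"
	patterns := testPatterns(desc)
	fmt.Println(patterns)
	// The test name below is made up for illustration.
	fmt.Println(matchesAny(patterns, "dEQP-VK.glsl.operator.bool_compare.example"))
}
```

Because dEQP names are delimited by dots rather than path separators, a single `*` in `filepath.Match` can span several name components.
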
## Daily-run continuous integration testing

Once a day, Regres will also test another set of tests from [`<swiftshader>/tests/regres/full-tests.json`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/full-tests.json),
and post the test result lists as a Gerrit changelist
[[example]](https://swiftshader-review.googlesource.com/c/SwiftShader/+/46448).

The daily run also performs code coverage instrumentation per dEQP test,
automatically uploading the results of all the dEQP tests to the viewer at
[swiftshader-regres.github.io/swiftshader-coverage](https://swiftshader-regres.github.io/swiftshader-coverage/).

## Local dEQP test runner

Regres also provides a multi-threaded, [process sandboxed](#process-sandboxing),
local dEQP test runner with a wild-card / regex based test name matcher.

The local test runner can be run with:

[`<swiftshader>/tests/regres/run_testlist.sh`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/run_testlist.sh) `--deqp-vk=<path to deqp-vk> [--filter=<test name filter>]`

`<test name filter>` can be a single dEQP test name, or you can use wildcards
[as documented here](https://golang.org/pkg/path/filepath/#Match).
Alternatively, start with a `/` to use a regex filter.

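The two filter forms can be thought of as one predicate over test names. The sketch below is only an illustration of that idea: `newFilter` is a hypothetical helper, and it assumes the regex is applied as an unanchored search, which may not be how the runner treats it.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// newFilter builds a name predicate from a filter string. A leading '/'
// selects regex matching (assumed unanchored here); anything else is
// treated as a filepath.Match wildcard pattern.
func newFilter(filter string) (func(name string) bool, error) {
	if strings.HasPrefix(filter, "/") {
		re, err := regexp.Compile(strings.TrimPrefix(filter, "/"))
		if err != nil {
			return nil, err
		}
		return re.MatchString, nil
	}
	// Validate the wildcard pattern once, up front.
	if _, err := filepath.Match(filter, ""); err != nil {
		return nil, err
	}
	return func(name string) bool {
		ok, _ := filepath.Match(filter, name)
		return ok
	}, nil
}

func main() {
	for _, f := range []string{"dEQP-VK.api.*", `/\.(equal|not_equal)\.`} {
		match, err := newFilter(f)
		if err != nil {
			fmt.Println("bad filter:", err)
			continue
		}
		// Made-up test name, for illustration only.
		fmt.Println(f, match("dEQP-VK.glsl.operator.binary_operator.equal.example"))
	}
}
```
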
Other useful flags:

```text
  -limit int
        only run a maximum of this number of tests
  -no-results
        disable generation of results.json file
  -output string
        path to an output JSON results file (default "results.json")
  -shuffle
        shuffle tests
  -test-list string
        path to a test list file (default "vk-master-PASS.txt")
```

Run [`<swiftshader>/tests/regres/run_testlist.sh`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/run_testlist.sh) with `--help` to see all available flags.

## Process sandboxing

Regres will run each dEQP test in a separate process to prevent state
leakage between tests.

Tests are run concurrently, and crashing processes will not take down the test
runner.

Some dEQP tests are known to perform excessive memory allocations (i.e. keep
allocating until no more can be claimed from the OS). \
In order to prevent a single test starving other test processes of memory, each
process is restricted to a fraction of the system's memory using [Linux resource limits](https://man7.org/linux/man-pages/man2/getrlimit.2.html).

Tests may also deadlock, so each test process has a time limit, after which it
is automatically killed.

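The per-process isolation and time limit can be sketched with the Go standard library as follows. This is an illustration under stated assumptions, not the Regres implementation: the `Status` values, `runIsolated`, `runAll` and `demoTimeout` are hypothetical names, the demo relies on the POSIX `true`, `false` and `sleep` commands, and the memory limit described above is not shown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"sync"
	"time"
)

// Status is a hypothetical per-test outcome.
type Status string

const (
	Pass    Status = "PASS"
	Fail    Status = "FAIL"
	Timeout Status = "TIMEOUT"
)

const demoTimeout = 200 * time.Millisecond

// runIsolated runs one command in its own process and kills it once the
// time limit has passed.
func runIsolated(timeout time.Duration, name string, args ...string) Status {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	err := exec.CommandContext(ctx, name, args...).Run()
	switch {
	case errors.Is(ctx.Err(), context.DeadlineExceeded):
		return Timeout
	case err != nil:
		return Fail // non-zero exit, crash, or failure to start
	default:
		return Pass
	}
}

// runAll runs the commands concurrently, at most `workers` at a time.
// A crashing or hanging child only affects its own result.
func runAll(workers int, timeout time.Duration, cmds [][]string) []Status {
	results := make([]Status, len(cmds))
	sem := make(chan struct{}, workers)
	var wg sync.WaitGroup
	for i, c := range cmds {
		wg.Add(1)
		sem <- struct{}{}
		go func(i int, c []string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = runIsolated(timeout, c[0], c[1:]...)
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	cmds := [][]string{{"true"}, {"false"}, {"sleep", "5"}}
	for i, s := range runAll(2, demoTimeout, cmds) {
		fmt.Println(cmds[i], s)
	}
}
```
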
## Implementation details

### Presubmit & daily run process

Regres runs until stopped, and will:

* Download a known compatible version of Clang to a cache directory. This will
  be used for all compilation stages below.
* Periodically poll Gerrit for recently opened changes.
* Periodically query Gerrit for details about each tracked change, determining
  [whether it should be tested](#qualifying) and its current
  [priority](#prioritization).
* Pick the qualifying change with the highest priority, and perform the
  following for the change:
  1. The change is `git fetch`ed into a temporary directory.
  2. If not already cached, the dEQP version described in the
     change's [`<swiftshader>/tests/regres/deqp.json`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/deqp.json) file is downloaded and built into a cached directory.
  3. The source for the change is built into a temporary build directory.
  4. The built dEQP binaries are used to test the change. The full test results
     are stored in a cached directory.
  5. If the parent change's test results aren't already cached, then steps 3 and
     4 are repeated for the parent change.
  6. The results of the two changes are diffed, and the results of the diff are
     posted to the change as a Gerrit review comment.
* Repeat the above until it is time to perform a daily run, at which point:
  1. The `HEAD` change of `master` is fetched into a temporary directory.
  2. If not already cached, the dEQP version described in the
     change's [`<swiftshader>/tests/regres/deqp.json`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/deqp.json) file is downloaded and built into a cached directory.
  3. The `HEAD` change is built into a temporary directory, optionally with code
     coverage instrumentation.
  4. The built dEQP binaries are used to test the change. The full test results
     are stored in a cached directory, and each test is binned by status and
     written to the [`<swiftshader>/tests/regres/testlists`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/testlists) directory.
  5. A new Gerrit change is created containing the updated test lists and put up
     for review, along with a summary of test result changes [[example]](https://swiftshader-review.googlesource.com/c/SwiftShader/+/46448).
     If there's an existing daily test change up for review then this is reused
     instead of creating another.
  6. If the build included code coverage instrumentation, then the coverage
     results are collated from all test runs, processed and compressed, and
     uploaded to [github.com/swiftshader-regres/swiftshader-coverage](https://github.com/swiftshader-regres/swiftshader-coverage)
     which is immediately reflected at [swiftshader-regres.github.io/swiftshader-coverage](https://swiftshader-regres.github.io/swiftshader-coverage).
     This process is [described in more detail below](#code-coverage).
  7. Stages 3 - 5 are repeated for both the LLVM and Subzero backends.

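The diffing step of a presubmit run (step 6) could look roughly like the sketch below. The `Results` and `Change` types and the status strings are assumptions made for illustration; the real result format is richer.

```go
package main

import (
	"fmt"
	"sort"
)

// Results maps a fully qualified test name to a status string.
// Hypothetical, simplified result type.
type Results map[string]string

// Change describes a test whose status differs between the parent and the
// change under test. An empty status means the test was absent on that side.
type Change struct {
	Test     string
	Old, New string
}

// diff returns all status differences, sorted by test name.
func diff(parent, change Results) []Change {
	var out []Change
	for name, newStatus := range change {
		if old := parent[name]; old != newStatus {
			out = append(out, Change{Test: name, Old: old, New: newStatus})
		}
	}
	for name, old := range parent {
		if _, ok := change[name]; !ok {
			out = append(out, Change{Test: name, Old: old, New: ""})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Test < out[j].Test })
	return out
}

func main() {
	parent := Results{"a.x": "PASS", "a.y": "PASS", "a.z": "FAIL"}
	change := Results{"a.x": "PASS", "a.y": "FAIL", "a.w": "PASS"}
	for _, c := range diff(parent, change) {
		fmt.Printf("%s: %q -> %q\n", c.Test, c.Old, c.New)
	}
}
```

Only the differences are reported, which keeps the review comment short when a change does not affect most tests.
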
### Caching

The cache directory is heavily used to avoid duplicated work. For example, it
is common for patchsets to be repeatedly pushed with the same parent change, so
the test results of the parent can be calculated once and stored. A tested
patchset that is merged into master would also be cached when used as a parent
of another change.

The cache needs to consider more than just the change identifier as the
cache-key for storing and retrieving data. Both the test lists and version of
dEQP used are dictated by the change being tested, and so both are used as part
of the cache key.

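One way to build such a composite key is to hash every input that influences the results. The sketch below is an assumption about how this could be done, not a description of the Regres cache layout; `cacheKey` and its parameters are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a key from the change identifier, the dEQP version and
// the test lists, so that a difference in any of them yields a new entry.
func cacheKey(changeID, deqpVersion string, testLists []string) string {
	h := sha256.New()
	for _, s := range append([]string{changeID, deqpVersion}, testLists...) {
		// The length prefix keeps ("ab", "c") distinct from ("a", "bc").
		fmt.Fprintf(h, "%d:%s;", len(s), s)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Placeholder values, for illustration only.
	lists := []string{"vk-master-PASS.txt"}
	fmt.Println(cacheKey("change-1", "deqp-a", lists))
	fmt.Println(cacheKey("change-1", "deqp-b", lists))
}
```
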
### Code coverage

The [daily run](#daily-run-continuous-integration-testing) produces code
coverage information that can be examined for each individual dEQP test at
[swiftshader-regres.github.io/swiftshader-coverage](https://swiftshader-regres.github.io/swiftshader-coverage/).

The process for generating this information is complex, and is described in
detail below:

#### Per-test generation

Code coverage instrumentation is generated with
[clang's source-based code coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html)
functionality. This is enabled by using SwiftShader's
`SWIFTSHADER_EMIT_COVERAGE` CMake flag.

Each dEQP test process is run with a unique `LLVM_PROFILE_FILE` environment
variable value which dictates where the process writes its raw coverage profile
file. Each process gets a different path so that we can emit coverage from
multiple, concurrent dEQP test processes.

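Giving each test process its own profile path amounts to setting one environment variable per command. A minimal sketch, assuming a hypothetical `coverageCommand` helper and an illustrative file naming scheme (`test-<index>.profraw`):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// coverageCommand prepares a command whose raw coverage profile is written
// to a path unique to index. The command is not started here.
func coverageCommand(exe, profileDir string, index int, args ...string) (*exec.Cmd, string) {
	profile := filepath.Join(profileDir, fmt.Sprintf("test-%d.profraw", index))
	cmd := exec.Command(exe, args...)
	// Appended last, so it takes precedence over any inherited value.
	cmd.Env = append(os.Environ(), "LLVM_PROFILE_FILE="+profile)
	return cmd, profile
}

func main() {
	// Executable name and directory are placeholders.
	for i := 0; i < 2; i++ {
		_, profile := coverageCommand("deqp-vk", os.TempDir(), i)
		fmt.Println(profile)
	}
}
```
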
#### Parsing

[Clang provides two tools](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#creating-coverage-reports) for processing coverage data:

* `llvm-profdata` indexes the raw `.profraw` coverage profile file and emits a
  `.profdata` file.
* `llvm-cov` further processes the `.profdata` file into something human
  readable or machine parsable.

`llvm-cov` provides many options, including emitting a pretty HTML file, but is
remarkably slow at producing easily machine-parsable data. Fortunately the core
of `llvm-cov` is [a few hundred lines of code](https://github.com/llvm/llvm-project/tree/master/llvm/tools/llvm-cov), as it relies on LLVM libraries to do the heavy lifting. Regres
replaces `llvm-cov` with ["`turbo-cov`"](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/cov/turbo-cov/), which efficiently converts a `.profdata` file into a simple binary stream that can
be consumed by Regres.

#### Processing

At the time of writing there are over 560,000 individual dEQP tests, and around
176,000 lines of C++ code in [`<swiftshader>/src`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:src/).
If you used 1 bit for each source line, per-line source coverage for all dEQP
tests would require over 11 GiB of storage. That's just for one snapshot.

The processing and compression schemes described below reduce this down to
around 10 MiB (~1100x reduction in size), and support sub-line coverage scopes.

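The storage estimate follows from multiplying the two counts and dividing by 8 bits per byte. The short calculation below uses the rounded figures quoted above, so its result is only approximate.

```go
package main

import "fmt"

const (
	numTests = 560_000 // approximate dEQP test count quoted above
	numLines = 176_000 // approximate C++ line count quoted above
)

// rawCoverageBytes returns the storage needed for one bit per
// (test, source line) pair, rounded up to whole bytes.
func rawCoverageBytes(tests, lines int64) int64 {
	return (tests*lines + 7) / 8
}

func main() {
	raw := rawCoverageBytes(numTests, numLines)
	gib := float64(raw) / (1 << 30)
	ratio := float64(raw) / (10 << 20) // compared with roughly 10 MiB
	fmt.Printf("raw: %.2f GiB, reduction: ~%.0fx\n", gib, ratio)
}
```

With these inputs the raw size comes to about 11.5 GiB, which is in line with the "over 11 GiB" and roughly thousand-fold reduction figures.
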
##### Spans

Code coverage information is described in spans.

A span is described as an interval of source locations, where a location is a
line-column pair:

```go
type Location struct {
	Line, Column int
}

type Span struct {
	Start, End Location
}
```

##### Test tree construction

Each dEQP test is uniquely identified by a fully qualified name.
Each test belongs to a group, and that group may be nested within any number of
parent groups. The groups are described in the test name, using dots (`.`) to
delimit the groups and leaf test name.

For example, the fully qualified test name:

`dEQP-VK.fragment_shader_interlock.basic.discard.ssbo.sample_unordered.4xaa.sample_shading.16x16`

can be broken down into the following groups and test name:

```text
dEQP-VK                                      <-- root group name
 ╰ fragment_shader_interlock
    ╰ basic.discard
       ╰ ssbo
          ╰ sample_unordered
             ╰ 4xaa
                ╰ sample_shading
                   ╰ 16x16                   <-- leaf test name
```

Breaking down fully qualified test names into groups provides a natural way to
structure coverage data, as tests of the same group are likely to have similar
coverage spans.

So, for each source file in the codebase, we create a tree with test groups as
non-leaf nodes, and tests as leaf nodes.

For example, given the following test list:

```text
a.b.d.h
a.b.d.i.n
a.b.d.i.o
a.b.e.j
a.b.e.k.p
a.b.e.k.q
a.c.f
a.c.g.l.r
a.c.g.m
```

We would construct the following tree:

```text
               a
        ╭──────┴──────╮
        b             c
    ╭───┴───╮     ╭───┴───╮
    d       e     f       g
  ╭─┴─╮   ╭─┴─╮         ╭─┴─╮
  h   i   j   k         l   m
     ╭┴╮     ╭┴╮        │
     n o     p q        r
```

Each leaf node in this tree (`h`, `n`, `o`, `j`, `p`, `q`, `f`, `r`, `m`)
represents a test, and the non-leaf nodes (`a`, `b`, `c`, `d`, `e`, `g`, `i`,
`k`, `l`) are groups.

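Building such a tree from dotted names is a matter of splitting each name and walking or creating one node per component. A minimal sketch, using a hypothetical `Node` type that is simpler than the one Regres uses:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Node is a test group (has children) or a test (no children).
type Node struct {
	Name     string
	Children map[string]*Node
}

func newNode(name string) *Node {
	return &Node{Name: name, Children: map[string]*Node{}}
}

// buildTree inserts each dotted test name under a synthetic, unnamed root.
func buildTree(tests []string) *Node {
	root := newNode("")
	for _, test := range tests {
		n := root
		for _, part := range strings.Split(test, ".") {
			child, ok := n.Children[part]
			if !ok {
				child = newNode(part)
				n.Children[part] = child
			}
			n = child
		}
	}
	return root
}

// classify returns the sorted names of group nodes and of leaf (test) nodes.
func classify(root *Node) (groups, leaves []string) {
	var walk func(n *Node)
	walk = func(n *Node) {
		for _, c := range n.Children {
			if len(c.Children) > 0 {
				groups = append(groups, c.Name)
			} else {
				leaves = append(leaves, c.Name)
			}
			walk(c)
		}
	}
	walk(root)
	sort.Strings(groups)
	sort.Strings(leaves)
	return groups, leaves
}

func main() {
	tests := []string{
		"a.b.d.h", "a.b.d.i.n", "a.b.d.i.o", "a.b.e.j", "a.b.e.k.p",
		"a.b.e.k.q", "a.c.f", "a.c.g.l.r", "a.c.g.m",
	}
	groups, leaves := classify(buildTree(tests))
	fmt.Println("groups:", groups) // groups: [a b c d e g i k l]
	fmt.Println("tests: ", leaves) // tests:  [f h j m n o p q r]
}
```
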
To begin, we create a test tree structure, and associate the full list of test
coverage spans with every leaf node (test) in this tree.

This data structure hasn't given us any compression benefits yet, but we can
now do a few tricks to dramatically reduce the number of spans needed to
describe the tree:

##### Optimization 1: Common span promotion

The first compression scheme is to promote common spans up the tree when they
are common for all children. This will reduce the number of spans needed to be
encoded in the final file.

For example, if the test group `a` has 4 children that all share the same span
`X`:

```text
          a
    ╭───┬─┴─┬───╮
    b   c   d   e
 [X,Y] [X] [X] [X,Z]
```

Then span `X` can be promoted up to `a`:

```text
         [X]
          a
    ╭───┬─┴─┬───╮
    b   c   d   e
   [Y] []  []  [Z]
```

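The promotion can be sketched as a bottom-up pass that moves any span held by every child onto the parent. For brevity this illustration identifies spans by a string label rather than by a `Span` value, and `Node`, `SpanSet` and `promoteCommon` are hypothetical names.

```go
package main

import "fmt"

// SpanSet is a set of span labels; real data would use Span values.
type SpanSet map[string]bool

// Node is a test group or test with the spans stored on it.
type Node struct {
	Spans    SpanSet
	Children []*Node
}

// promoteCommon moves spans shared by all children of n up to n,
// processing the deepest groups first.
func promoteCommon(n *Node) {
	if len(n.Children) == 0 {
		return
	}
	if n.Spans == nil {
		n.Spans = SpanSet{}
	}
	for _, c := range n.Children {
		promoteCommon(c)
	}
	// A span common to all children must be present on the first child.
	for span := range n.Children[0].Spans {
		common := true
		for _, c := range n.Children[1:] {
			if !c.Spans[span] {
				common = false
				break
			}
		}
		if common {
			n.Spans[span] = true
			for _, c := range n.Children {
				delete(c.Spans, span)
			}
		}
	}
}

func main() {
	a := &Node{Children: []*Node{
		{Spans: SpanSet{"X": true, "Y": true}},
		{Spans: SpanSet{"X": true}},
		{Spans: SpanSet{"X": true}},
		{Spans: SpanSet{"X": true, "Z": true}},
	}}
	promoteCommon(a)
	fmt.Println("a:", a.Spans)
	for i, c := range a.Children {
		fmt.Println("child", i, c.Spans)
	}
}
```

After this pass, the spans covered by a test are the union of the spans stored on the nodes along the path from the root to that test.
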
##### Optimization 2: Span XOR promotion

This idea can be extended further, by not requiring all the children to share
the same span before promotion. If **most** child nodes share the same span, we
can still promote the span, but this time we **remove** the span from the
children **if they had it**, and **add** the span to children **if they didn't
have it**.

For example, if the test group `a` has 4 children with 3 that share the span
`X`:

```text
          a
    ╭───┬─┴─┬───╮
    b   c   d   e
 [X,Y] [X] []  [X,Z]
```

Then span `X` can be promoted up to `a` by flipping the presence of `X` on the
child nodes:

```text
         [X]
          a
    ╭───┬─┴─┬───╮
    b   c   d   e
   [Y] []  [X] [Z]
```

This process repeats up the tree.

With this optimization applied, we now need to traverse the tree from root to
leaf in order to know whether a given span is in use for the leaf node (test):

* If the span is encountered an **odd** number of times during traversal, then
  the span is **covered**.
* If the span is encountered an **even** number of times during traversal, then
  the span is **not covered**.

See [`tests/regres/cov/coverage_test.go`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/cov/coverage_test.go) for more examples of this optimization.

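The following sketch shows one possible way to express XOR promotion and the parity lookup. It is an illustration rather than the code in `tests/regres/cov`: spans are identified by string labels, "most" is taken to mean strictly more than half of the children, and all type and function names are hypothetical.

```go
package main

import "fmt"

// SpanSet is a set of span labels; real data would use Span values.
type SpanSet map[string]bool

// Node is a test group or test with the spans stored on it.
type Node struct {
	Spans    SpanSet
	Children []*Node
}

// toggle flips the presence of span on n.
func (n *Node) toggle(span string) {
	if n.Spans == nil {
		n.Spans = SpanSet{}
	}
	if n.Spans[span] {
		delete(n.Spans, span)
	} else {
		n.Spans[span] = true
	}
}

// promoteXOR promotes a span to n when more than half of n's children hold
// it, flipping its presence on every child. Deeper groups are handled first.
func promoteXOR(n *Node) {
	counts := map[string]int{}
	for _, c := range n.Children {
		promoteXOR(c)
		for span := range c.Spans {
			counts[span]++
		}
	}
	for span, count := range counts {
		if 2*count <= len(n.Children) {
			continue
		}
		n.toggle(span)
		for _, c := range n.Children {
			c.toggle(span)
		}
	}
}

// covered walks from the root along path (child indices) and reports whether
// span was encountered an odd number of times.
func covered(root *Node, path []int, span string) bool {
	n, count := root, 0
	if n.Spans[span] {
		count++
	}
	for _, i := range path {
		n = n.Children[i]
		if n.Spans[span] {
			count++
		}
	}
	return count%2 == 1
}

func main() {
	a := &Node{Children: []*Node{
		{Spans: SpanSet{"X": true, "Y": true}}, // b
		{Spans: SpanSet{"X": true}},            // c
		{},                                     // d
		{Spans: SpanSet{"X": true, "Z": true}}, // e
	}}
	promoteXOR(a)
	for i := range a.Children {
		fmt.Println("child", i, "covers X:", covered(a, []int{i}, "X"))
	}
}
```

Flipping a span on every child when it is promoted keeps the parity along each root-to-leaf path unchanged, so the coverage reported for each test is the same before and after the pass.
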
##### Optimization 3: Common span grouping

With real world data, we encounter groups of spans that are commonly found
together. To further reduce coverage data, the whole graph is scanned for common
span patterns, and these patterns are indexed by each tree node.
The XOR'ing of spans as described above is performed as if the spans were not
grouped.

##### Optimization 4: Lookup tables

All spans, span-groups and strings are stored in de-duplicated tables, and are
indexed wherever possible.

The final serialization is performed by [`tests/regres/cov/serialization.go`](https://cs.opensource.google/swiftshader/SwiftShader/+/master:tests/regres/cov/serialization.go).

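A de-duplicated table can be as simple as a slice of unique values plus a map from value to index. The sketch below shows the idea for spans; `SpanTable` is a hypothetical type, and the layout used by `serialization.go` may differ.

```go
package main

import "fmt"

type Location struct {
	Line, Column int
}

type Span struct {
	Start, End Location
}

// SpanTable stores each distinct span once and hands out stable indices,
// so tree nodes can refer to spans by small integers.
type SpanTable struct {
	spans []Span
	index map[Span]int
}

// ID returns the index of s, adding it to the table on first use.
func (t *SpanTable) ID(s Span) int {
	if id, ok := t.index[s]; ok {
		return id
	}
	if t.index == nil {
		t.index = map[Span]int{}
	}
	id := len(t.spans)
	t.spans = append(t.spans, s)
	t.index[s] = id
	return id
}

func main() {
	var table SpanTable
	x := Span{Start: Location{10, 1}, End: Location{12, 5}}
	y := Span{Start: Location{20, 3}, End: Location{20, 9}}
	fmt.Println(table.ID(x), table.ID(y), table.ID(x)) // 0 1 0
}
```
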
##### Optimization 5: zlib compression

The coverage data is encoded into JSON for parsing by the web page.

Before writing the JSON file, the text data is zlib compressed.

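A minimal round trip through JSON and zlib using the Go standard library might look like this. The payload shape in `main` is made up for illustration and does not reflect the actual coverage file schema.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/json"
	"fmt"
	"io"
)

// compressJSON marshals v to JSON and zlib-compresses the result.
func compressJSON(v any) ([]byte, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	// Close flushes any buffered data and writes the zlib trailer.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressJSON reverses compressJSON, decoding into out.
func decompressJSON(compressed []byte, out any) error {
	r, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return err
	}
	defer r.Close()
	data, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, out)
}

func main() {
	in := map[string][]int{"spans": {1, 2, 3}}
	compressed, err := compressJSON(in)
	if err != nil {
		panic(err)
	}
	var out map[string][]int
	if err := decompressJSON(compressed, &out); err != nil {
		panic(err)
	}
	fmt.Println(len(compressed), out)
}
```
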
#### Presentation

The zlib-compressed JSON coverage data is decompressed using
[`pako`](https://github.com/nodeca/pako), and consumed by some
[vanilla JavaScript](https://github.com/swiftshader-regres/swiftshader-coverage/blob/gh-pages/index.html).

[`codemirror`](https://codemirror.net/) is used to perform coverage span and C++
syntax highlighting.