You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
674 lines
7.8 KiB
674 lines
7.8 KiB
# Benchmarks
|
|
|
|
The results of these benchmarks suggest that building this `bc` with
|
|
optimization at `-O3` with link-time optimization (`-flto`) will result in the
|
|
best performance. However, using `-march=native` can result in **WORSE**
|
|
performance.
|
|
|
|
*Note*: all benchmarks were run four times, and the fastest run is the one
|
|
shown. Also, `[bc]` means whichever `bc` was being run, and the assumed working
|
|
directory is the root directory of this repository. Also, this `bc` was at
|
|
version `3.0.0` while GNU `bc` was at version `1.07.1`, and all tests were
|
|
conducted on an `x86_64` machine running Gentoo Linux with `clang` `9.0.1` as
|
|
the compiler.
|
|
|
|
## Typical Optimization Level
|
|
|
|
These benchmarks were run with both `bc`'s compiled with the typical `-O2`
|
|
optimizations and no link-time optimization.
|
|
|
|
### Addition
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc add.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 2.54
|
|
user 1.21
|
|
sys 1.32
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.88
|
|
user 0.85
|
|
sys 0.02
|
|
```
|
|
|
|
### Subtraction
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc subtract.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 2.51
|
|
user 1.05
|
|
sys 1.45
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.91
|
|
user 0.85
|
|
sys 0.05
|
|
```
|
|
|
|
### Multiplication
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc multiply.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 7.15
|
|
user 4.69
|
|
sys 2.46
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 2.20
|
|
user 2.10
|
|
sys 0.09
|
|
```
|
|
|
|
### Division
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc divide.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 3.36
|
|
user 1.87
|
|
sys 1.48
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 1.61
|
|
user 1.57
|
|
sys 0.03
|
|
```
|
|
|
|
### Power
|
|
|
|
The command used was:
|
|
|
|
```
|
|
printf '1234567890^100000; halt\n' | time -p [bc] -q > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 11.30
|
|
user 11.30
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.73
|
|
user 0.72
|
|
sys 0.00
|
|
```
|
|
|
|
### Scripts
|
|
|
|
[This file][1] was downloaded, saved at `../timeconst.bc` and the following
|
|
patch was applied:
|
|
|
|
```
|
|
--- ../timeconst.bc 2018-09-28 11:32:22.808669000 -0600
|
|
+++ ../timeconst.bc 2019-06-07 07:26:36.359913078 -0600
|
|
@@ -110,8 +110,10 @@
|
|
|
|
print "#endif /* KERNEL_TIMECONST_H */\n"
|
|
}
|
|
- halt
|
|
}
|
|
|
|
-hz = read();
|
|
-timeconst(hz)
|
|
+for (i = 0; i <= 50000; ++i) {
|
|
+ timeconst(i)
|
|
+}
|
|
+
|
|
+halt
|
|
```
|
|
|
|
The command used was:
|
|
|
|
```
|
|
time -p [bc] ../timeconst.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 16.71
|
|
user 16.06
|
|
sys 0.65
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 13.16
|
|
user 13.15
|
|
sys 0.00
|
|
```
|
|
|
|
Because this `bc` is faster when doing math, it might be a better comparison to
|
|
run a script that is not running any math. As such, I put the following into
|
|
`../test.bc`:
|
|
|
|
```
|
|
for (i = 0; i < 100000000; ++i) {
|
|
y = i
|
|
}
|
|
|
|
i
|
|
y
|
|
|
|
halt
|
|
```
|
|
|
|
The command used was:
|
|
|
|
```
|
|
time -p [bc] ../test.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 16.60
|
|
user 16.59
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 22.76
|
|
user 22.75
|
|
sys 0.00
|
|
```
|
|
|
|
I also put the following into `../test2.bc`:
|
|
|
|
```
|
|
i = 0
|
|
|
|
while (i < 100000000) {
|
|
i += 1
|
|
}
|
|
|
|
i
|
|
|
|
halt
|
|
```
|
|
|
|
The command used was:
|
|
|
|
```
|
|
time -p [bc] ../test2.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 17.32
|
|
user 17.30
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 16.98
|
|
user 16.96
|
|
sys 0.01
|
|
```
|
|
|
|
It seems that the improvements to the interpreter helped a lot in certain cases.
|
|
|
|
Also, I have no idea why GNU `bc` did worse when it is technically doing less
|
|
work.
|
|
|
|
## Recommended Optimizations from `2.7.0`
|
|
|
|
Note that, when running the benchmarks, the optimizations used are not the ones
|
|
I recommended for version `2.7.0`, which are `-O3 -flto -march=native`.
|
|
|
|
This `bc` separates its code into modules that, when optimized at link time,
|
|
removes a lot of the inefficiency that comes from function overhead. This is
|
|
most keenly felt with one function: `bc_vec_item()`, which should turn into just
|
|
one instruction (on `x86_64`) when optimized at link time and inlined. There are
|
|
other functions that matter as well.
|
|
|
|
I also recommended `-march=native` on the grounds that newer instructions would
|
|
increase performance on math-heavy code. We will see if that assumption was
|
|
correct. (Spoiler: **NO**.)
|
|
|
|
When compiling both `bc`'s with the optimizations I recommended for this `bc`
|
|
for version `2.7.0`, the results are as follows.
|
|
|
|
### Addition
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc add.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 2.44
|
|
user 1.11
|
|
sys 1.32
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.59
|
|
user 0.54
|
|
sys 0.05
|
|
```
|
|
|
|
### Subtraction
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc subtract.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 2.42
|
|
user 1.02
|
|
sys 1.40
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.64
|
|
user 0.57
|
|
sys 0.06
|
|
```
|
|
|
|
### Multiplication
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc multiply.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 7.01
|
|
user 4.50
|
|
sys 2.50
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 1.59
|
|
user 1.53
|
|
sys 0.05
|
|
```
|
|
|
|
### Division
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc divide.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 3.26
|
|
user 1.82
|
|
sys 1.44
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 1.24
|
|
user 1.20
|
|
sys 0.03
|
|
```
|
|
|
|
### Power
|
|
|
|
The command used was:
|
|
|
|
```
|
|
printf '1234567890^100000; halt\n' | time -p [bc] -q > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 11.08
|
|
user 11.07
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.71
|
|
user 0.70
|
|
sys 0.00
|
|
```
|
|
|
|
### Scripts
|
|
|
|
The command for the `../timeconst.bc` script was:
|
|
|
|
```
|
|
time -p [bc] ../timeconst.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 15.62
|
|
user 15.08
|
|
sys 0.53
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 10.09
|
|
user 10.08
|
|
sys 0.01
|
|
```
|
|
|
|
The command for the next script, the `for` loop script, was:
|
|
|
|
```
|
|
time -p [bc] ../test.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 14.76
|
|
user 14.75
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 17.95
|
|
user 17.94
|
|
sys 0.00
|
|
```
|
|
|
|
The command for the next script, the `while` loop script, was:
|
|
|
|
```
|
|
time -p [bc] ../test2.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 14.84
|
|
user 14.83
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 13.53
|
|
user 13.52
|
|
sys 0.00
|
|
```
|
|
|
|
## Link-Time Optimization Only
|
|
|
|
Just for kicks, let's see if `-march=native` is even useful.
|
|
|
|
The optimizations I used for both `bc`'s were `-O3 -flto`.
|
|
|
|
### Addition
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc add.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 2.41
|
|
user 1.05
|
|
sys 1.35
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.58
|
|
user 0.52
|
|
sys 0.05
|
|
```
|
|
|
|
### Subtraction
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc subtract.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 2.39
|
|
user 1.10
|
|
sys 1.28
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.65
|
|
user 0.57
|
|
sys 0.07
|
|
```
|
|
|
|
### Multiplication
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc multiply.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 6.82
|
|
user 4.30
|
|
sys 2.51
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 1.57
|
|
user 1.49
|
|
sys 0.08
|
|
```
|
|
|
|
### Division
|
|
|
|
The command used was:
|
|
|
|
```
|
|
tests/script.sh bc divide.bc 1 0 1 1 [bc]
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 3.25
|
|
user 1.81
|
|
sys 1.43
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 1.27
|
|
user 1.23
|
|
sys 0.04
|
|
```
|
|
|
|
### Power
|
|
|
|
The command used was:
|
|
|
|
```
|
|
printf '1234567890^100000; halt\n' | time -p [bc] -q > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 10.50
|
|
user 10.49
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 0.72
|
|
user 0.71
|
|
sys 0.00
|
|
```
|
|
|
|
### Scripts
|
|
|
|
The command for the `../timeconst.bc` script was:
|
|
|
|
```
|
|
time -p [bc] ../timeconst.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 15.50
|
|
user 14.81
|
|
sys 0.68
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 10.17
|
|
user 10.15
|
|
sys 0.01
|
|
```
|
|
|
|
The command for the next script, the `for` loop script, was:
|
|
|
|
```
|
|
time -p [bc] ../test.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 14.99
|
|
user 14.99
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 16.85
|
|
user 16.84
|
|
sys 0.00
|
|
```
|
|
|
|
The command for the next script, the `while` loop script, was:
|
|
|
|
```
|
|
time -p [bc] ../test2.bc > /dev/null
|
|
```
|
|
|
|
For GNU `bc`:
|
|
|
|
```
|
|
real 14.92
|
|
user 14.91
|
|
sys 0.00
|
|
```
|
|
|
|
For this `bc`:
|
|
|
|
```
|
|
real 12.75
|
|
user 12.75
|
|
sys 0.00
|
|
```
|
|
|
|
It turns out that `-march=native` can be a problem. As such, I have removed the
|
|
recommendation to build with `-march=native`.
|
|
|
|
## Recommended Compiler
|
|
|
|
When I ran these benchmarks with my `bc` compiled under `clang` vs. `gcc`, it
|
|
performed much better under `clang`. I recommend compiling this `bc` with
|
|
`clang`.
|
|
|
|
[1]: https://github.com/torvalds/linux/blob/master/kernel/time/timeconst.bc
|