Rating CEDR=3708
Timestamp: 1676722998
Fix overflow in add_dpbusd_epi32x2
This patch fixes 16bit overflow in *_add_dpbusd_epi32x2 functions,
that can be triggered in rare cases depending on the NNUE weights.
While the code leads to some slowdown on affected architectures
(most notably avx2), the fix is simpler than some of the other
options discussed in
https://github.com/official-stockfish/Stockfish/pull/4394
Code suggested by Sopel97.
Result of "bench 4096 1 30 default depth nnue":
| Architecture | master | patch (gcc) | patch (clang) |
|---------------------|-----------|-------------|---------------|
| x86-64-vnni512 | 762122798 | 762122798 | 762122798 |
| x86-64-avx512 | 769723503 | 762122798 | 762122798 |
| x86-64-bmi2 | 769723503 | 762122798 | 762122798 |
| x86-64-ssse3 | 769723503 | 762122798 | 762122798 |
| x86-64 | 762122798 | 762122798 | 762122798 |
Following architectures will experience ~4% slowdown due to an
additional instruction in the middle of hot path:
* x86-64-avx512
* x86-64-bmi2
* x86-64-avx2
* x86-64-sse41-popcnt (x86-64-modern)
* x86-64-ssse3
* x86-32-sse41-popcnt
This patch clearly loses Elo against master with both STC and LTC.
Failed non-regression STC (256bit fix only):
LLR: -2.95 (-2.94,2.94) <-1.75,0.25>
Total: 33528 W: 8769 L: 9049 D: 15710 Elo -2.90
Ptnml(0-2): 96, 3616, 9600, 3376, 76
https://tests.stockfishchess.org/tests/view/63e6a5b44299542b1e26a485
60+0.6 @ 30000 games:
Elo: -1.67 +-1.7 (95%) LOS: 2.8%
Total: 30000 W: 7848 L: 7992 D: 14160 Elo -1.67
Ptnml(0-2): 12, 2847, 9436, 2683, 22
nElo: -3.84 +-3.9 (95%) PairsRatio: 0.95
https://tests.stockfishchess.org/tests/view/63e7ac716d0e1db55f35a660
However, a test against nn-a3dc078bafc7.nnue, which is the latest "safe"
network not causing the bug, passed with regular bounds.
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 160456 W: 42658 L: 42175 D: 75623 Elo +1.05
Ptnml(0-2): 487, 17638, 43469, 18173, 461
https://tests.stockfishchess.org/tests/view/63e89836d62a5d02b0fa82c8
closes https://github.com/official-stockfish/Stockfish/pull/4391
closes https://github.com/official-stockfish/Stockfish/pull/4394
No functional change
Download:
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Comments
Post a Comment