Maelstrom 3.0.0 - what's new?
NNUE Evaluation + Search Improvements/Fixes
Estimated Elo: ~2800 LTC, ~2700 STC
Testing below was done at the following time controls:
STC: 8s+0.08
LTC: 40s+0.4
vs. Maelstrom v2.1.0
STC Elo: 343.3 +/- 65.5
Score of Maelstrom v3.0.0 vs Maelstrom v2.1.0: 195 - 21 - 14 [0.878] 230
... Maelstrom v3.0.0 playing White: 96 - 11 - 7 [0.873] 114
... Maelstrom v3.0.0 playing Black: 99 - 10 - 7 [0.884] 116
... White vs Black: 106 - 110 - 14 [0.491] 230
Elo difference: 343.3 +/- 65.5, LOS: 100.0 %, DrawRatio: 6.1 %
SPRT: llr 2.91 (100.5%), lbound -2.25, ubound 2.89 - H1 was accepted
LTC Elo: 515.5 +/- 119.7
Score of Maelstrom v3.0.0 vs Maelstrom v2.1.0: 172 - 6 - 6 [0.951] 184
... Maelstrom v3.0.0 playing White: 87 - 2 - 3 [0.962] 92
... Maelstrom v3.0.0 playing Black: 85 - 4 - 3 [0.940] 92
... White vs Black: 91 - 87 - 6 [0.511] 184
Elo difference: 515.5 +/- 119.7, LOS: 100.0 %, DrawRatio: 3.3 %
SPRT: llr 2.89 (100.1%), lbound -2.25, ubound 2.89 - H1 was accepted
vs. Stash 21.2
STC Elo: 26.6 +/- 13.9
Score of Maelstrom v3.0.0 vs Stash 21: 903 - 750 - 347 [0.538] 2000
... Maelstrom v3.0.0 playing White: 483 - 368 - 149 [0.557] 1000
... Maelstrom v3.0.0 playing Black: 420 - 382 - 198 [0.519] 1000
... White vs Black: 865 - 788 - 347 [0.519] 2000
Elo difference: 26.6 +/- 13.9, LOS: 100.0 %, DrawRatio: 17.3 %
SPRT: llr 2.42 (83.6%), lbound -2.25, ubound 2.89
LTC Elo: 96.0 +/- 25.5
Score of Maelstrom v3.0.0 vs Stash 21: 338 - 171 - 111 [0.635] 620
... Maelstrom v3.0.0 playing White: 181 - 79 - 50 [0.665] 310
... Maelstrom v3.0.0 playing Black: 157 - 92 - 61 [0.605] 310
... White vs Black: 273 - 236 - 111 [0.530] 620
Elo difference: 96.0 +/- 25.5, LOS: 100.0 %, DrawRatio: 17.9 %
SPRT: llr 2.9 (100.4%), lbound -2.25, ubound 2.89 - H1 was accepted
Changelist:
Add NNUE evaluation; the current architecture is (768 -> 256) x2 -> 1 with an SCReLU activation function
Train network on SF/Lc0 data (acquired from T60T70wIsRightFarseer.binpack & linrock datasets)
Optimize SCReLU using SIMD instructions written in C and compiled via cgo, based on Lizard's SCReLU function as documented on CPW
Replace IID with IIR
Remove buggy delta pruning and depth limits from quiescence search
Update aspiration window to use 25 centipawns as the base margin and add exponentially widening bounds
Update parameters/conditions for RFP/FP/Razoring
Update history heuristic to use history gravity formula and proper maluses
Embed network weights into the built binary, so no external files are required to run the engine
Clean up search/eval code to remove HCE components, and remove the custom opening book and Lichess TB integration
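To make the (768 -> 256) x2 -> 1 SCReLU architecture from the changelist concrete, here is a minimal scalar sketch of the forward pass in Go. It is not Maelstrom's actual code: the type and function names are hypothetical, the 768 inputs are assumed to be the usual 2 colors x 6 piece types x 64 squares, and the quantization constants (255, 64) and the centipawn scale (400) are assumed values that are common in similar engines. The engine itself is described as doing the SCReLU step with SIMD through cgo, and this scalar version only shows the arithmetic.

```go
package main

const (
	inputSize  = 768 // assumed layout: 2 colors x 6 piece types x 64 squares
	hiddenSize = 256 // width of each perspective's accumulator
	qa         = 255 // assumed quantization of hidden activations
	qb         = 64  // assumed quantization of output weights
	evalScale  = 400 // assumed scale from network output to centipawns
)

// Network holds quantized weights. Field names are hypothetical.
type Network struct {
	FeatureWeights [inputSize][hiddenSize]int16
	FeatureBias    [hiddenSize]int16
	OutputWeights  [2 * hiddenSize]int16 // side to move first, then opponent
	OutputBias     int16
}

// Accumulator is the hidden layer of one perspective before activation.
type Accumulator [hiddenSize]int16

// Refresh builds an accumulator from scratch: the bias plus the weight
// column of every active input feature.
func (n *Network) Refresh(active []int) Accumulator {
	acc := Accumulator(n.FeatureBias)
	for _, f := range active {
		for i := range acc {
			acc[i] += n.FeatureWeights[f][i]
		}
	}
	return acc
}

// screlu is the squared clipped ReLU: clamp to [0, qa], then square.
func screlu(x int16) int64 {
	v := int64(x)
	if v < 0 {
		v = 0
	} else if v > qa {
		v = qa
	}
	return v * v
}

// Evaluate concatenates both perspectives (the "x2"), applies SCReLU,
// and reduces to a single output scaled to centipawns.
func (n *Network) Evaluate(us, them Accumulator) int {
	var sum int64
	for i := 0; i < hiddenSize; i++ {
		sum += screlu(us[i]) * int64(n.OutputWeights[i])
		sum += screlu(them[i]) * int64(n.OutputWeights[hiddenSize+i])
	}
	// sum is in units of qa*qa*qb; dividing by qa once brings it to the
	// qa*qb units assumed for the output bias.
	out := sum/qa + int64(n.OutputBias)
	return int(out * evalScale / (qa * qb))
}
```

A real engine would update the accumulators incrementally as pieces move rather than calling Refresh on every node; that is the "efficiently updatable" part of NNUE and is left out here for brevity.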
Goals for next release(s):
Figure out SPSA/CTT to optimize existing search parameters
Once things feel reasonably optimized, continue with improvements such as NNUE output buckets and search features such as SEE
Maelstrom 2.1.0 vs other engines:
Engine | Score | Wins - Losses | Games |
Libra-chess 1.0.1 | 2/2 | +2 | 2 Games |
Emerald 0.3.0 | 2/2 | +2 | 2 Games |
Ironfang 1.0 | 2/2 | +2 | 2 Games |
Freda 1.1 | 1/2 | +0 | 2 Games |
Celeris 1.0 | 0.5/2 | -1 | 2 Games |
Fridolin 4.00 JA | 0.5/2 | -1 | 2 Games |
Pzchessbot 2.0 | 0.5/2 | -1 | 2 Games |
Fire 10 | 0/2 | -2 | 2 Games |
Kraken 2025-06-16 | 0/2 | -2 | 2 Games |
Zangdar 4.04 | 0/2 | -2 | 2 Games |
Lunar 0.2.1 | 0/2 | -2 | 2 Games |
Zangdar 4.04 JA | 0/2 | -2 | 2 Games |
Fire 9.3 JA | 0/2 | -2 | 2 Games |
Stormphrax 7.0.0 | 0/2 | -2 | 2 Games |
PlentyChess 6.0.5 JA | 0/2 | -2 | 2 Games |
Halogen 13 | 0/2 | -2 | 2 Games |
Aku 2025.6.12 JA | 0/2 | -2 | 2 Games |