


Official Stockfish pre-release: Stockfish dev-20251102 (Windows and MacOS)



Chess Engines Diary rating: CEDR = 3761.3

Great news for chess enthusiasts! 
A pre-release build of Stockfish, widely recognized as the world's top chess engine, is now available: Stockfish dev-20251102. This brings us one step closer to the next official release. Our team is excited to begin testing it today, and we'll share the results on our website. Stay tuned!

What has changed in Stockfish dev-20251102?

Use shared memory for network weights

This enables different Stockfish processes that use the same weights to share the same memory. The approach establishes equivalence by memory content and is compatible with NUMA replication. The benefits of sharing are reduced memory usage and a speedup thanks to improved (inter-process) caching of the network in the CPU's cache, and thus reduced bandwidth usage to main memory. Although this change doesn't benefit a user running a single process, it helps on fishtest or, for example, on Lichess, where multiple games run concurrently or multiple positions are analyzed in parallel.
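To make the idea of "equivalence by memory content" more concrete, here is a minimal Python sketch. It is not Stockfish's code (which is C++ and far more involved); the function names, the `nnue_` prefix and the use of a SHA-256 hash to name the segment are all assumptions made for illustration. The first process to ask for a given set of weights creates the segment, and later processes attach to it and check that the bytes really match.

```python
import hashlib
import os
from multiprocessing import shared_memory


def segment_name(weights: bytes) -> str:
    # Hypothetical naming scheme: identical weights always map to the same name.
    return "nnue_" + hashlib.sha256(weights).hexdigest()[:16]


def attach_or_create(weights: bytes):
    """Return (segment, created). Only the first caller copies the weights."""
    name = segment_name(weights)
    try:
        shm = shared_memory.SharedMemory(name=name, create=True, size=len(weights))
    except FileExistsError:
        # Another process already published these weights: attach and verify.
        shm = shared_memory.SharedMemory(name=name)
        if shm.size < len(weights) or bytes(shm.buf[:len(weights)]) != weights:
            shm.close()
            raise ValueError("segment exists but its content differs")
        return shm, False
    # Note: a real implementation must synchronize here, so that other
    # processes never attach while the copy is still in progress.
    shm.buf[:len(weights)] = weights
    return shm, True


if __name__ == "__main__":
    fake_weights = os.urandom(4096)  # stand-in for a network file
    first, created_first = attach_or_create(fake_weights)
    second, created_second = attach_or_create(fake_weights)
    print(created_first, created_second)
    second.close()
    first.close()
    first.unlink()  # remove the segment once nobody needs it
```

In this sketch both handles live in one process for simplicity; the memory saving only appears when the second call happens in a different process.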

This concept was probably first introduced in the Monty engine
(https://github.com/official-monty/Monty/pull/62), after a discussion in
https://github.com/official-stockfish/fishtest/issues/2077 on the issue of
memory pressure. Measurements based on Torch
(https://github.com/user-attachments/files/21386224/verbatim.pdf) further
suggested that large gains were possible. Multiple other engines have
adopted this 'verbatim' format as well.

The implementation here adds the flexibility needed for SF: for example, it retains the ability to bundle compressed networks with the binary, to load nets via UCI option, and to distribute the shared nets to the proper NUMA region. This flexibility comes with a fair amount of complexity in the implementation, such as OS-specific code and fallback code.

For most users this should be transparent. However, those running Docker containers, for example, should ensure the `--ipc` flag is set correctly and `--shm-size` is sufficiently large.
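As a rough way to check the second point from inside a container, the sketch below asks how much free space the shared-memory filesystem has. It rests on several assumptions: that shared memory is backed by `/dev/shm` (the usual case on Linux, and what Docker's `--shm-size` controls), and that about 150 MB is needed, taken from the net size quoted in this post. The helper name `shm_has_room` is ours, not part of Stockfish or Docker.

```python
import shutil

# Rough net size quoted in the post; an estimate, not an exact requirement.
NET_SIZE_BYTES = 150 * 1024 * 1024


def shm_has_room(needed_bytes: int = NET_SIZE_BYTES, path: str = "/dev/shm") -> bool:
    """Return True if the filesystem at `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    try:
        enough = shm_has_room()
    except FileNotFoundError:
        print("/dev/shm not found on this system")
    else:
        print("enough shared memory" if enough else "consider raising --shm-size")
```

Docker's default `/dev/shm` size is 64 MB, which is below the net size mentioned here, so a container started without `--shm-size` would likely fail this check.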

The benefits of this patch depend significantly on hardware; systems with many cores and a large L3 cache (O(150MB), the net size) typically benefit most. On such systems, SF speedups (as measured via nps when playing games with large concurrency but just 1 thread) can reach 38%, which translates into a gain of about 25 Elo for the patch over master.

```
   # PLAYER             :  RATING  ERROR   POINTS  PLAYED   (%)
   1 shared_memoryPR    :    24.8    1.9  39432.0   73728    53
   2 master             :     0.0   ----  34296.0   73728    47
```

In a multithreaded setup, where weights are already shared, the benefit is smaller,
for example on the same hardware as above, but with 8 threads for each side:
```
   # PLAYER             :  RATING  ERROR  POINTS  PLAYED   (%)
   1 shared_memoryPR    :     5.2    3.5  9351.0   18432    51
   2 master             :     0.0   ----  9081.0   18432    49
```

On fishtest with a typical hardware mix of our contributors, the following was measured:

STC, 60k games
https://tests.stockfishchess.org/tests/view/69074a49ea4b268f1fac236c
Elo: 4.69 ± 1.4 (95%) LOS: 100.0%
Total: 60000 W: 16085 L: 15275 D: 28640
Ptnml(0-2): 154, 6440, 16053, 7148, 205
nElo: 9.38 ± 2.8 (95%) PairsRatio: 1.12
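
For readers who want to see where these numbers come from, the sketch below recomputes them from the Ptnml line. It assumes the usual conventions: the five Ptnml buckets count game pairs scoring 0, 0.5, 1, 1.5 and 2 points, Elo uses the standard logistic formula, nElo divides the score advantage by the per-game standard deviation, and PairsRatio is won pairs over lost pairs. The function name is ours, and the error margins and LOS are not computed.

```python
import math


def pentanomial_stats(ptnml):
    """Return (elo, nelo, pairs_ratio) from five pentanomial pair counts."""
    pairs = sum(ptnml)
    # Pair outcomes rescaled to the 0..1 range (0, 0.5, 1, 1.5, 2 points out of 2).
    values = [0.0, 0.25, 0.5, 0.75, 1.0]
    mean = sum(v * n for v, n in zip(values, ptnml)) / pairs
    var = sum(n * (v - mean) ** 2 for v, n in zip(values, ptnml)) / pairs
    # Logistic Elo from the average score.
    elo = 400.0 * math.log10(mean / (1.0 - mean))
    # Normalized Elo: sqrt(2 * var) converts per-pair variance to per-game.
    nelo = (mean - 0.5) / math.sqrt(2.0 * var) * 800.0 / math.log(10.0)
    # Pairs won (last two buckets) over pairs lost (first two buckets).
    pairs_ratio = (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1])
    return elo, nelo, pairs_ratio


if __name__ == "__main__":
    elo, nelo, ratio = pentanomial_stats([154, 6440, 16053, 7148, 205])
    print(f"Elo: {elo:.2f}  nElo: {nelo:.2f}  PairsRatio: {ratio:.2f}")
    # prints "Elo: 4.69  nElo: 9.38  PairsRatio: 1.12"
```

Under these assumptions the output agrees with the Elo, nElo and PairsRatio values reported above.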

To verify correctness with a single process on a NUMA architecture,
speedtest was used, confirming near equivalence:
```
master:        Average (over 10):  296236186
shared_memory: Average (over 10):  295769332
```
Currently, using large pages for the shared network weights is not always possible,
which can lead to a small slowdown (1-2%) when a single process is run.

closes https://github.com/official-stockfish/Stockfish/pull/6173
No functional change

Stockfish dev-20251102 download (all files: stockfish-windows-armv8.exe, stockfish-windows-armv8-dotprod.exe, stockfish-windows-x86-64.exe, stockfish-windows-x86-64-avx2.exe, stockfish-windows-x86-64-avx512.exe, stockfish-windows-x86-64-avx512icl.exe, stockfish-windows-x86-64-avxvnni.exe, stockfish-windows-x86-64-bmi2.exe, stockfish-windows-x86-64-sse41-popcnt.exe, stockfish-windows-x86-64-vnni256.exe, stockfish-windows-x86-64-vnni512.exe) mirror
