Can a phone beat Magnus Carlsen at chess?

 


"Can a phone beat Magnus Carlsen at chess?" is a question that I am sometimes asked by my non-chess friends or my non-technologically inclined chess friends. At one time this was an interesting question, but it is getting difficult to convey just how silly it has become in recent years. Engines are so strong and phones are so fast that there really isn’t much of a qualitative difference between a phone and a supercomputer when it comes to playing chess against people. They are both so far beyond human ability that the result of a match would be the same: the human loses every game.

But the essence of the question is still interesting. There must exist hardware slow enough that it would be an even match against top humans. What would that look like? I’ve conducted some experiments to try to figure that out.

I started by finding the slowest hardware I own that can run the latest version of Stockfish. This is a Raspberry Pi Zero W, a small single-board computer powered by what is essentially a fifteen-year-old budget cell phone processor. It runs Stockfish 17.1 at a paltry 2,200 nodes per second. To simulate top human play, I got out my trusty old copy of Fritz Bahrain, which in 2002 drew a match with Kramnik. Using a single core on an i7-6700K, Fritz Bahrain searches about 3.5 million nodes per second, which is pretty close to the reported figures for the machine that Kramnik played. I figured I would have it serve as a reference point for 2800-level play and thought that these machines might have an interesting match.

However, even at only 2,200 nodes per second Stockfish was way too strong. In classical-length games it achieved search depths of 20-25. This is comparable to the eval bar we are familiar with in broadcasts and game analyses, which we know is fallible but still comfortably superhuman. It mercilessly crushed Fritz in a short set of classical-length test games that I played.

Stockfish had to be further handicapped to get a close match. I was able to underclock the Raspberry Pi to 600 MHz, resulting in about 1,600 nodes per second, but that didn’t make a huge difference. I knew I would have to give the programs unequal time as well. Unfortunately, time handicaps are not supported by the old Chessbase interfaces required to run Fritz Bahrain. Thus I needed to find an alternative engine to be my human surrogate, ideally one that is similar in strength to Fritz but is UCI-compliant and bug-free. After a few test matches, Stockfish 1.0 emerged as the best candidate. It performed about +50 Elo in a 100-game blitz match against Fritz Bahrain, so I had it serve as a reference point for 2850-level play.

Stockfish 1.0 (32-bit) used a single core of an i7-6700K and a time control of 90+60 (it searched ~1.8 million nodes per second). Stockfish 17.1 started at 3+2 on the Raspberry Pi. Since it was searching about 1,600 nodes per second and had a 30:1 time deficit, this simulated Stockfish 17.1 playing classical chess on hardware that gets roughly 50 nodes per second. And finally I found something that is no longer superhuman. In a 100-game match, Stockfish 17.1 scored 36 points (+22 =28 -50). Stockfish 17.1’s positional play was far superior to Stockfish 1.0’s, and it usually achieved good positions but was often not able to convert them. When low on time it frequently blundered 2-4 move tactics. Its final performance was about -100 Elo relative to Stockfish 1.0, or a ~2750 performance. Doubling the time to 6+4 (simulating hardware getting roughly 100 nodes per second) resulted in a performance of about +70 against Stockfish 1.0 (+43 =33 -24), or ~2900.
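For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the numbers above can be reproduced. It assumes the standard logistic Elo model (a 400-point difference corresponds to 10:1 odds) and treats a time handicap as a straight division of nodes per second. The function names elo_diff and effective_nps are made up for this illustration and are not taken from any tool used in the experiment.

```python
import math


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model.

    Assumes the score is strictly between 0% and 100%; a clean sweep
    would make the logarithm undefined.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


def effective_nps(nps: float, time_ratio: float) -> float:
    """Nodes per second after spreading a time handicap over the opponent's clock.

    Assumption: having 1/time_ratio of the time is treated as equivalent to
    running on hardware that is time_ratio times slower.
    """
    return nps / time_ratio


# Match at 3+2 versus 90+60 (30:1 time deficit): +22 =28 -50
print(round(elo_diff(22, 28, 50)))      # Elo difference, first match
print(round(effective_nps(1600, 30)))   # simulated nodes per second

# Match at 6+4 versus 90+60 (15:1 time deficit): +43 =33 -24
print(round(elo_diff(43, 33, 24)))      # Elo difference, second match
print(round(effective_nps(1600, 15)))   # simulated nodes per second
```

With only 100 games per match the error bars on these Elo figures are wide, so the rounded values in the text are about as precise as the data supports.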

So somewhere around 100 nodes per second is likely where performance becomes superhuman. What kind of hardware would that be? It’s hard to say, since modern versions of Stockfish would take a lot of work to get running on truly old hardware, if it is possible at all. But ignoring that, one user reported getting Stockfish 6 running on a 386 at about 1,000 nodes per second. On my machines SF 17.1 gets about 35% as many nodes per second as SF 6, so let’s say a 386 would run it at 350 nodes per second. That would still result in 3000+ play. Perhaps a 286 would run Stockfish 17.1 in the 100 nps range. Of course, with a 16-bit architecture and nowhere near enough RAM to fit the neural net, this would be pretty much impossible, but this experiment suggests that it really is ancient hardware like this we would need to reference if we want modern Stockfish to sink to the level of top humans.

By: EvilNalu
