Stockfish Development Versions are build automatically if there are changes on the master branch in the git repository (https://github.com/official-stockfish/Stockfish). Use it at your own risk.
They are compiled with gcc/mingw 7.3 on Ubuntu 18.04.
Read the FAQ or leave a comment.

Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Stefan Geschwentner
Date: Mon Sep 6 14:19:47 2021 +0200
Timestamp: 1630930787

Improve history updates

If a search failed low at an expected PV or CUT node do greater history updates.

STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 95112 W: 24293 L: 23982 D: 46837 Elo +1.14
Ptnml(0-2): 285, 10893, 24906, 11170, 302
https://tests.stockfishchess.org/tests/view/6132aa1a2ffb3c36aceb926f

LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 116352 W: 29450 L: 28975 D: 57927 Elo +1.42
Ptnml(0-2): 93, 12263, 32984, 12748, 88
https://tests.stockfishchess.org/tests/view/613394d12ffb3c36aceb92f4

closes https://github.com/official-stockfish/Stockfish/pull/3693

Bench: 6130736
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: SFisGOD
Date: Mon Sep 6 14:08:22 2021 +0200
Timestamp: 1630930102

Update default net to nn-6762d36ad265.nnue

SPSA 1: https://tests.stockfishchess.org/tests/view/612cdb1fbb4956d8b78eb5ab
Parameters: A total of 256 net biases were tuned (hidden layer 2)
Base net: nn-fe433fd8c7f6.nnue
New net: nn-5f134823db04.nnue

SPSA 2: https://tests.stockfishchess.org/tests/view/612fcde645091e810014af19
Parameters: A total of 64 net biases were tuned (hidden layer 1)
Base net: nn-5f134823db04.nnue
New net: nn-8eca5dd4e3f7.nnue

SPSA 3: https://tests.stockfishchess.org/tests/view/6130822345091e810014af61
Parameters: 256 net weights and 8 net biases (output layer)
Base net: nn-8eca5dd4e3f7.nnue
New net: nn-4556108e4f00.nnue

SPSA 4: https://tests.stockfishchess.org/tests/view/613287652ffb3c36aceb923c
Parameters: A total of 256 net biases were tuned (hidden layer 2)
Base net: nn-4556108e4f00.nnue
New net: nn-6762d36ad265.nnue

STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 162776 W: 41220 L: 40807 D: 80749 Elo +0.88
Ptnml(0-2): 517, 18800, 42359, 19177, 535
https://tests.stockfishchess.org/tests/view/6134107125b9b35584838559

LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 41056 W: 10428 L: 10156 D: 20472 Elo +2.30
Ptnml(0-2): 30, 4288, 11618, 4564, 28
https://tests.stockfishchess.org/tests/view/6134ad6525b9b3558483857a

closes https://github.com/official-stockfish/Stockfish/pull/3691

Bench: 5812158
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Michael Chaly
Date: Mon Sep 6 13:59:17 2021 +0200
Timestamp: 1630929557

Extend captures and promotions

This patch introduces extension for captures and promotions. Every capture or
promotion that is not the first move in the list gets extended at PvNodes and
cutNodes. Special thanks to @locutus2 - all my previous attepmts that failed
on this idea were done only for PvNodes - idea to include also cutNodes was
based on his latest passed patch.

STC
https://tests.stockfishchess.org/tests/view/6134abf325b9b35584838574
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 188920 W: 47754 L: 47304 D: 93862 Elo +0.83
Ptnml(0-2): 595, 21754, 49344, 22140, 627

LTC
https://tests.stockfishchess.org/tests/view/613521de25b9b355848385d7
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 8768 W: 2283 L: 2098 D: 4387 Elo +7.33
Ptnml(0-2): 7, 866, 2452, 1053, 6

closes https://github.com/official-stockfish/Stockfish/pull/3692

bench: 5564555
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: SFisGOD
Date: Tue Aug 31 12:56:19 2021 +0200
Timestamp: 1630407379

Update default net to nn-735bba95dec0.nnue

SPSA 1: https://tests.stockfishchess.org/tests/view/61286d8b62d20cf82b5ad1bd
Parameters: A total of 256 net biases were tuned (hidden layer 2)
Base net: nn-33495fe25081.nnue
New net: nn-83e3cf2af92b.nnue

SPSA 2: https://tests.stockfishchess.org/tests/view/6129cf2162d20cf82b5ad25f
Parameters: A total of 64 net biases were tuned (hidden layer 1)
Base net: nn-83e3cf2af92b.nnue
New net: nn-69a528eaef35.nnue

SPSA 3: https://tests.stockfishchess.org/tests/view/612a0dcb62d20cf82b5ad2a0
Parameters: 256 net weights and 8 net biases (output layer)
Base net: nn-69a528eaef35.nnue
New net: nn-735bba95dec0.nnue

STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 95144 W: 24310 L: 23999 D: 46835 Elo +1.14
Ptnml(0-2): 232, 11059, 24748, 11232, 301
https://tests.stockfishchess.org/tests/view/612bb3be0fdf40644b4b9996

LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 33632 W: 8522 L: 8271 D: 16839 Elo +2.59
Ptnml(0-2): 18, 3511, 9516, 3744, 27
https://tests.stockfishchess.org/tests/view/612ce5b9bb4956d8b78eb5b3

Closes https://github.com/official-stockfish/Stockfish/pull/3685

Bench: 5600615
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: VoyagerOne
Date: Fri Aug 27 21:41:32 2021 +0200
Timestamp: 1630093292

CMH Pruning Tweak

Tweak pruning formula by adding up CMH values.

STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 14608 W: 3837 L: 3641 D: 7130 Elo +4.66
Ptnml(0-2): 27, 1681, 3723, 1815, 58
https://tests.stockfishchess.org/tests/view/612792f362d20cf82b5ad156

LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 53520 W: 13580 L: 13276 D: 26664 Elo +1.97
Ptnml(0-2): 28, 5610, 15183, 5908, 31
https://tests.stockfishchess.org/tests/view/6127d27062d20cf82b5ad191

closes https://github.com/official-stockfish/Stockfish/pull/3682

Bench: 5186641
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: SFisGOD
Date: Fri Aug 27 07:51:26 2021 +0200
Timestamp: 1630043486

Update default net to nn-33495fe25081.nnue

STC:
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 37368 W: 9621 L: 9391 D: 18356 Elo +2.14
Ptnml(0-2): 117, 4287, 9664, 4481, 135
https://tests.stockfishchess.org/tests/view/612768165318138ee1204977

LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 13328 W: 3446 L: 3246 D: 6636 Elo +5.21
Ptnml(0-2): 11, 1383, 3682, 1571, 17
https://tests.stockfishchess.org/tests/view/6127dc8d62d20cf82b5ad196

Closes https://github.com/official-stockfish/Stockfish/pull/3679

Bench: 5179347
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: ppigazzini
Date: Fri Aug 27 07:49:26 2021 +0200
Timestamp: 1630043366

Use "pedantic" flag also for mingw

This will avoid to run in fishtest a test where the linux machines exit from
the building process and only the windows machines run the test.

See:
https://tests.stockfishchess.org/tests/view/61122d732a8a49ac5be79996
https://github.com/SFisGOD/Stockfish/commit/4e422577d6ebd1f6ecf606189190b8f6fb03f6c9#comments

closes https://github.com/official-stockfish/Stockfish/pull/3671

No functional change.
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Joost VandeVondele
Date: Fri Aug 27 07:48:18 2021 +0200
Timestamp: 1630043298

Fix empty EvalFile option

some GUIs send an empty string for EvalFile, in that case explicitly try the default name

fixes https://github.com/official-stockfish/Stockfish/issues/3675

closes https://github.com/official-stockfish/Stockfish/pull/3678

No functional change.
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: bmc4
Date: Sun Aug 22 09:15:19 2021 +0200
Timestamp: 1629616519

Simplify Declaration on Pawn Move Generation

Removes possible micro-optimization in favor of readability.

STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 75432 W: 5824 L: 5777 D: 63831 Elo +0.22
Ptnml(0-2): 178, 4648, 28036, 4657, 197
https://tests.stockfishchess.org/tests/view/611fa7f84977aa1525c9cb75

LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 41200 W: 1156 L: 1106 D: 38938 Elo +0.42
Ptnml(0-2): 13, 981, 18562, 1031, 13
https://tests.stockfishchess.org/tests/view/611fcc694977aa1525c9cb9b

Closes https://github.com/official-stockfish/Stockfish/pull/3669

No functional change
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: SFisGOD
Date: Sun Aug 22 09:09:58 2021 +0200
Timestamp: 1629616198

Update default net to nn-517c4f68b5df.nnue

SPSA: https://tests.stockfishchess.org/tests/view/611cf0da4977aa1525c9ca03
Parameters: 256 net weights and 8 net biases (output layer)
Base net: nn-ac5605a608d6.nnue
New net: nn-517c4f68b5df.nnue

STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 11600 W: 998 L: 851 D: 9751 Elo +4.40
Ptnml(0-2): 30, 705, 4186, 846, 33
https://tests.stockfishchess.org/tests/view/611f84524977aa1525c9cb5b

LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 9360 W: 338 L: 243 D: 8779 Elo +3.53
Ptnml(0-2): 0, 220, 4151, 303, 6
https://tests.stockfishchess.org/tests/view/611f8c5b4977aa1525c9cb64

closes https://github.com/official-stockfish/Stockfish/pull/3667

Bench: 4844618
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: candirufish
Date: Sun Aug 22 09:05:53 2021 +0200
Timestamp: 1629615953

do more LMR extensions for PV nodes

LMR Pv and depth 6 Extension tweak:

LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 52488 W: 1542 L: 1394 D: 49552 Elo +0.98
Ptnml(0-2): 18, 1253, 23552, 1405, 16
https://tests.stockfishchess.org/tests/view/611e49c34977aa1525c9caa7

STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 76216 W: 6000 L: 5784 D: 64432 Elo +0.98
Ptnml(0-2): 204, 4745, 28006, 4937, 216
https://tests.stockfishchess.org/tests/view/611e0e254977aa1525c9ca89

closes https://github.com/official-stockfish/Stockfish/pull/3666

Bench: 5046381
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: bmc4
Date: Sun Aug 22 09:00:15 2021 +0200
Timestamp: 1629615615

Simplify Null Move Search Reduction

slightly simpler formula for reduction computation.

first round of tests:
STC:
LLR: 2.97 (-2.94,2.94) <-2.50,0.50>
Total: 15632 W: 1319 L: 1204 D: 13109 Elo +2.56
Ptnml(0-2): 33, 956, 5733, 1051, 43
https://tests.stockfishchess.org/tests/view/60bd03c7457376eb8bcaa600

LTC:
LLR: 3.37 (-2.94,2.94) <-2.50,0.50>
Total: 86296 W: 2814 L: 2779 D: 80703 Elo +0.14
Ptnml(0-2): 33, 2500, 38039, 2551, 25
https://tests.stockfishchess.org/tests/view/60bd1ff0457376eb8bcaa653

recent tests:
STC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 23936 W: 1895 L: 1793 D: 20248 Elo +1.48
Ptnml(0-2): 40, 1470, 8869, 1526, 63
https://tests.stockfishchess.org/tests/view/611f9b7d4977aa1525c9cb6b

LTC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 62568 W: 1750 L: 1713 D: 59105 Elo +0.21
Ptnml(0-2): 19, 1560, 28085, 1605, 15
https://tests.stockfishchess.org/tests/view/611fa4814977aa1525c9cb71

functional on high depth

closes https://github.com/official-stockfish/Stockfish/pull/3535

Bench: 5375286
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Tomasz Sobczyk
Date: Fri Aug 20 08:50:25 2021 +0200
Timestamp: 1629442225

Optimize and tidy up affine transform code.

The new network caused some issues initially due to the very narrow neuron set between the first two FC layers. Necessary changes were hacked together to make it work. This patch is a mature approach to make the affine transform code faster, more readable, and easier to maintain should the layer sizes change again.

The following changes were made:

* ClippedReLU always produces a multiple of 32 outputs. This is about as good of a solution for AffineTransform's SIMD requirements as it can get without a bigger rewrite.

* All self-contained simd helpers are moved to a separate file (simd.h). Inline asm is utilized to work around GCC's issues with code generation and register assignment. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101693, https://godbolt.org/z/da76fY1n7

* AffineTransform has 2 specializations. While it's more lines of code due to the boilerplate, the logic in both is significantly reduced, as these two are impossible to nicely combine into one.
1) The first specialization is for cases when there's >=128 inputs. It uses a different approach to perform the affine transform and can make full use of AVX512 without any edge cases. Furthermore, it has higher theoretical throughput because less loads are needed in the hot path, requiring only a fixed amount of instructions for horizontal additions at the end, which are amortized by the large number of inputs.
2) The second specialization is made to handle smaller layers where performance is still necessary but edge cases need to be handled. AVX512 implementation for this was ommited by mistake, a remnant from the temporary implementation for the new... This could be easily reintroduced if needed. A slightly more detailed description of both implementations is in the code.

Overall it should be a minor speedup, as shown on fishtest:

passed STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 51520 W: 4074 L: 3888 D: 43558 Elo +1.25
Ptnml(0-2): 111, 3136, 19097, 3288, 128

and various tests shown in the pull request

closes https://github.com/official-stockfish/Stockfish/pull/3663

No functional change
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Tomasz Sobczyk
Date: Fri Aug 20 07:57:09 2021 +0200
Timestamp: 1629439029

Improve handling of the debug log file.

Fix handling of empty strings in uci options and reassigning of the log file

Fixes https://github.com/official-stockfish/Stockfish/issues/3650

Closes https://github.com/official-stockfish/Stockfish/pull/3655

No functional change
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Torsten Hellwig
Date: Wed Aug 18 09:17:22 2021 +0200
Timestamp: 1629271042

Update default net to nn-ac5605a608d6.nnue

This net was created with the nnue-pytorch trainer, it used the previous master net as a starting point.

The training data includes all T60 data (https://drive.google.com/drive/folders/1rzZkgIgw7G5vQMLr2hZNiUXOp7z80613), all T74 data (https://drive.google.com/drive/folders/1aFUv3Ih3-A8Vxw9064Kw_FU4sNhMHZU-) and the wrongNNUE_02_d9.binpack (https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq). The Leela data were randomly named and then concatenated. All data was merged into one binpack using interleave_binpacks.py.

python3 train.py \
../data/t60_t74_wrong.binpack \
../data/t60_t74_wrong.binpack \
--resume-from-model ../data/nn-e8321e467bf6.pt \
--gpus 1 \
--threads 4 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 300 \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--max_epochs=600 \
--seed $RANDOM \
--default_root_dir ../output/exp_24

STC:
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 15320 W: 1415 L: 1257 D: 12648 Elo +3.58
Ptnml(0-2): 50, 1002, 5402, 1152, 54
https://tests.stockfishchess.org/tests/view/611c404a4977aa1525c9c97f

LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 9440 W: 345 L: 248 D: 8847 Elo +3.57
Ptnml(0-2): 3, 222, 4175, 315, 5
https://tests.stockfishchess.org/tests/view/611c6c7d4977aa1525c9c996

LTC with UHO_XXL_+0.90_+1.19.epd:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 6232 W: 1638 L: 1459 D: 3135 Elo +9.98
Ptnml(0-2): 5, 592, 1744, 769, 6
https://tests.stockfishchess.org/tests/view/611c9b214977aa1525c9c9cb

closes https://github.com/official-stockfish/Stockfish/pull/3664

Bench: 5375286
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Joost VandeVondele
Date: Tue Aug 17 21:08:34 2021 +0200
Timestamp: 1629227314

Regenerate dependencies on code change

fixes https://github.com/official-stockfish/Stockfish/issues/3658

dependencies are now regenerated for each code change, this adds some 1s overhead in compile time, but avoids potential miscompilations or build problems.

closes https://github.com/official-stockfish/Stockfish/pull/3659

No functional change
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Tomasz Sobczyk
Date: Sun Aug 15 12:05:43 2021 +0200
Timestamp: 1629021943

New NNUE architecture and net

Introduces a new NNUE network architecture and associated network parameters

The summary of the changes:

* Position for each perspective mirrored such that the king is on e..h files. Cuts the feature transformer size in half, while preserving enough knowledge to be good. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.b40q4rb1w7on.
* The number of neurons after the feature transformer increased two-fold, to 1024x2. This is possibly mostly due to the now very optimized feature transformer update code.
* The number of neurons after the second layer is reduced from 16 to 8, to reduce the speed impact. This, perhaps surprisingly, doesn't harm the strength much. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.6qkocr97fezq

The AffineTransform code did not work out-of-the box with the smaller number of neurons after the second layer, so some temporary changes have been made to add a special case for InputDimensions == 8. Also additional 0 padding is added to the output for some archs that cannot process inputs by <=8 (SSE2, NEON). VNNI uses an implementation that can keep all outputs in the registers while reducing the number of loads by 3 for each 16 inputs, thanks to the reduced number of output neurons. However GCC is particularily bad at optimization here (and perhaps why the current way the affine transform is done even passed sprt) (see https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit# for details) and more work will be done on this in the following days. I expect the current VNNI implementation to be improved and extended to other architectures.

The network was trained with a slightly modified version of the pytorch trainer (https://github.com/glinscott/nnue-pytorch); the changes are in https://github.com/glinscott/nnue-pytorch/pull/143

The training utilized 2 datasets.

dataset A - https://drive.google.com/file/d/1VlhnHL8f-20AXhGkILujnNXHwy9T-MQw/view?usp=sharing
dataset B - as described in https://github.com/official-stockfish/Stockfish/commit/ba01f4b95448bcb324755f4dd2a632a57c6e67bc

The training process was as following:

train on dataset A for 350 epochs, take the best net in terms of elo at 20k nodes per move (it's fine to take anything from later stages of training).
convert the .ckpt to .pt
--resume-from-model from the .pt file, train on dataset B for <600 epochs, take the best net. Lambda=0.8, applied before the loss function.

The first training command:

python3 train.py \
../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
--gpus "$3," \
--threads 1 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--smart-fen-skipping \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--max_epochs=600 \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2

The second training command:

python3 serialize.py \
--features=HalfKAv2_hm^ \
../nnue-pytorch-training/experiment_131/run_6/default/version_0/checkpoints/epoch-499.ckpt \
../nnue-pytorch-training/experiment_$1/base/base.pt

python3 train.py \
../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
--gpus "$3," \
--threads 1 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--smart-fen-skipping \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=0.8 \
--max_epochs=600 \
--resume-from-model ../nnue-pytorch-training/experiment_$1/base/base.pt \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2

STC: https://tests.stockfishchess.org/tests/view/611120b32a8a49ac5be798c4

LLR: 2.97 (-2.94,2.94) <-0.50,2.50>
Total: 22480 W: 2434 L: 2251 D: 17795 Elo +2.83
Ptnml(0-2): 101, 1736, 7410, 1865, 128

LTC: https://tests.stockfishchess.org/tests/view/611152b32a8a49ac5be798ea

LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 9776 W: 442 L: 333 D: 9001 Elo +3.87
Ptnml(0-2): 5, 295, 4180, 402, 6

closes https://github.com/official-stockfish/Stockfish/pull/3646

bench: 5189338
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Joost VandeVondele
Date: Thu Aug 5 16:41:07 2021 +0200
Timestamp: 1628174467

Revert futility pruning patches

reverts 09b6d28391cf582d99897360b225bcbbe38dd1c6 and
dbd7f602d3c7622df294f87d7239b5aaf31f695f that significantly impact mate
finding capabilities. For example on ChestUCI_23102018.epd, at 1M nodes,
the number of mates found is nearly reduced 2x without these depth conditions:

sf6 2091
sf7 2093
sf8 2107
sf9 2062
sf10 2208
sf11 2552
sf12 2563
sf13 2509
sf14 2427
master 1246
patched 2467

(script for testing at https://github.com/official-stockfish/Stockfish/files/6936412/matecheck.zip)

closes https://github.com/official-stockfish/Stockfish/pull/3641

fixes https://github.com/official-stockfish/Stockfish/issues/3627

Bench: 5467570
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: VoyagerOne
Date: Thu Aug 5 16:32:07 2021 +0200
Timestamp: 1628173927

SEE simplification

Simplified SEE formula by removing std::min. Should also be easier to tune.

STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 22656 W: 1836 L: 1729 D: 19091 Elo +1.64
Ptnml(0-2): 54, 1426, 8267, 1521, 60
https://tests.stockfishchess.org/tests/view/610ae62f2a8a49ac5be79449

LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 26248 W: 806 L: 744 D: 24698 Elo +0.82
Ptnml(0-2): 6, 668, 11715, 728, 7
https://tests.stockfishchess.org/tests/view/610b17ad2a8a49ac5be79466

closes https://github.com/official-stockfish/Stockfish/pull/3643

bench: 4915145
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: SFisGOD
Date: Thu Aug 5 08:52:07 2021 +0200
Timestamp: 1628146327

Update default net to nn-46832cfbead3.nnue

SPSA 1: https://tests.stockfishchess.org/tests/view/6100e7f096b86d98abf6a832
Parameters: A total of 256 net weights and 8 net biases were tuned (output layer)
Base net: nn-56a5f1c4173a.nnue
New net: nn-ec3c8e029926.nnue

SPSA 2: https://tests.stockfishchess.org/tests/view/610733caafad2da4f4ae3da7
Parameters: A total of 256 net biases were tuned (hidden layer 2)
Base net: nn-ec3c8e029926.nnue
New net: nn-46832cfbead3.nnue

STC:
LLR: 2.98 (-2.94,2.94) <-0.50,2.50>
Total: 50520 W: 3953 L: 3765 D: 42802 Elo +1.29
Ptnml(0-2): 138, 3063, 18678, 3235, 146
https://tests.stockfishchess.org/tests/view/610a79692a8a49ac5be793f4

LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 57256 W: 1723 L: 1566 D: 53967 Elo +0.95
Ptnml(0-2): 12, 1442, 25568, 1589, 17
https://tests.stockfishchess.org/tests/view/610ac5bb2a8a49ac5be79434

Closes https://github.com/official-stockfish/Stockfish/pull/3642

Bench: 5359314
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Stefan Geschwentner
Date: Thu Aug 5 08:47:33 2021 +0200
Timestamp: 1628146053

Simplify new cmh pruning thresholds by using directly a quadratic formula.

This decouples also the stat bonus updates from the threshold which creates less dependencies for tuning of stat bonus parameters.
Perhaps a further fine tuning of the now separated coefficients for constHist[0] and constHist[1] could give further gains.

STC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 78384 W: 6134 L: 6090 D: 66160 Elo +0.20
Ptnml(0-2): 207, 5013, 28705, 5063, 204
https://tests.stockfishchess.org/tests/view/6106d235afad2da4f4ae3d4b

LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 38176 W: 1149 L: 1095 D: 35932 Elo +0.49
Ptnml(0-2): 6, 1000, 17030, 1038, 14
https://tests.stockfishchess.org/tests/view/6107a080afad2da4f4ae3def

closes https://github.com/official-stockfish/Stockfish/pull/3639

Bench: 5098146
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: VoyagerOne
Date: Thu Aug 5 08:44:38 2021 +0200
Timestamp: 1628145878

Futile pruning simplification

Remove CMH conditions in futile pruning.

STC:
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 93520 W: 7165 L: 7138 D: 79217 Elo +0.10
Ptnml(0-2): 222, 5923, 34427, 5982, 206
https://tests.stockfishchess.org/tests/view/61083104e50a153c346ef8df

LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 59072 W: 1746 L: 1706 D: 55620 Elo +0.24
Ptnml(0-2): 13, 1562, 26353, 1588, 20
https://tests.stockfishchess.org/tests/view/610894f2e50a153c346ef913

closes https://github.com/official-stockfish/Stockfish/pull/3638

Bench: 5229673
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: VoyagerOne
Date: Sat Jul 31 15:29:19 2021 +0200
Timestamp: 1627738159

CMH Pruning Tweak

replace CounterMovePruneThreshold by a depth dependent threshold

STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 35512 W: 2718 L: 2552 D: 30242 Elo +1.62
Ptnml(0-2): 66, 2138, 13194, 2280, 78
https://tests.stockfishchess.org/tests/view/6104442fafad2da4f4ae3b94

LTC:
LLR: 2.96 (-2.94,2.94) <0.50,3.50>
Total: 36536 W: 1150 L: 1019 D: 34367 Elo +1.25
Ptnml(0-2): 10, 920, 16278, 1049, 11
https://tests.stockfishchess.org/tests/view/6104b033afad2da4f4ae3bbc

closes https://github.com/official-stockfish/Stockfish/pull/3636

Bench: 5848718
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: Tomasz Sobczyk
Date: Fri Jul 30 17:15:52 2021 +0200
Timestamp: 1627658152

Avoid unnecessary stores in the affine transform

This patch improves the codegen in the AffineTransform::forward function for architectures >=SSSE3. Current code works directly on memory and the compiler cannot see that the stores through outptr do not alias the loads through weights and input32. The solution implemented is to perform the affine transform with local variables as accumulators and only store the result to memory at the end. The number of accumulators required is OutputDimensions / OutputSimdWidth, which means that for the 1024->16 affine transform it requires 4 registers with SSSE3, 2 with AVX2, 1 with AVX512. It also cuts the number of stores required by NumRegs * 256 for each node evaluated. The local accumulators are expected to be assigned to registers, but even if this cannot be done in some case due to register pressure it will help the compiler to see that there is no aliasing between the loads and stores and may still result in better codegen.

See https://godbolt.org/z/59aTKbbYc for codegen comparison.

passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 140328 W: 10635 L: 10358 D: 119335 Elo +0.69
Ptnml(0-2): 302, 8339, 52636, 8554, 333

closes https://github.com/official-stockfish/Stockfish/pull/3634

No functional change
see source
Windows x64 for Haswell CPUs
Windows x64 for modern computers + AVX2
Windows x64 for modern computers
Windows x64 + SSSE3
Windows x64
Windows 32
Linux x64 for Haswell CPUs
Linux x64 for modern computers + AVX2
Linux x64 for modern computers
Linux x64 + SSSE3
Linux x64
Author: SFisGOD
Date: Thu Jul 29 07:35:13 2021 +0200
Timestamp: 1627536913

Update default net to nn-56a5f1c4173a.nnue

SPSA 1: https://tests.stockfishchess.org/tests/view/60fd24efd8a6b65b2f3a796e
Parameters: A total of 256 net biases were tuned (hidden layer 2)
New best values: Half of the changes from the tuning run
New net: nn-5992d3ba79f3.nnue

SPSA 2: https://tests.stockfishchess.org/tests/view/60fec7d6d8a6b65b2f3a7aa2
Parameters: A total of 128 net biases were tuned (hidden layer 1)
New best values: Half of the changes from the tuning run
New net: nn-56a5f1c4173a.nnue

STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 140392 W: 10863 L: 10578 D: 118951 Elo +0.71
Ptnml(0-2): 347, 8754, 51718, 9021, 356
https://tests.stockfishchess.org/tests/view/610037e396b86d98abf6a79e

LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 14216 W: 454 L: 355 D: 13407 Elo +2.42
Ptnml(0-2): 4, 323, 6356, 420, 5
https://tests.stockfishchess.org/tests/view/61019995afad2da4f4ae3a3c

Closes #3633

Bench: 4801359
see source

< prev page next page >