| Windows x64 for Haswell CPUs Windows x64 for modern computers + AVX2 Windows x64 for modern computers Windows x64 + SSSE3 Windows x64 Windows 32 Linux x64 for Haswell CPUs Linux x64 for modern computers + AVX2 Linux x64 for modern computers Linux x64 + SSSE3 Linux x64 | Author: SFisGOD
Date: Fri Aug 27 07:51:26 2021 +0200 Timestamp: 1630043486 Update default net to nn-33495fe25081.nnue STC: LLR: 2.95 (-2.94,2.94) <-0.50,2.50> Total: 37368 W: 9621 L: 9391 D: 18356 Elo +2.14 Ptnml(0-2): 117, 4287, 9664, 4481, 135 https://tests.stockfishchess.org/tests/view/612768165318138ee1204977 LTC: LLR: 2.94 (-2.94,2.94) <0.50,3.50> Total: 13328 W: 3446 L: 3246 D: 6636 Elo +5.21 Ptnml(0-2): 11, 1383, 3682, 1571, 17 https://tests.stockfishchess.org/tests/view/6127dc8d62d20cf82b5ad196 Closes https://github.com/official-stockfish/Stockfish/pull/3679 Bench: 5179347 see source |
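The Elo figures quoted in these test summaries can be approximately reproduced from the W/L/D counts. The following is a minimal sketch, assuming the standard logistic Elo model; the function names are hypothetical and this is not fishtest's own code. It also shows that the pentanomial (Ptnml) game-pair counts give the same average score.

```python
import math

def score_from_wld(wins, losses, draws):
    """Average score per game: win = 1, draw = 0.5, loss = 0."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def score_from_ptnml(ptnml):
    """Average score per game from pentanomial game-pair counts.

    ptnml[i] is the number of game pairs that scored i/2 points (0..2).
    """
    pairs = sum(ptnml)
    points = sum(0.5 * i * n for i, n in enumerate(ptnml))
    return points / (2 * pairs)

def logistic_elo(score):
    """Elo difference implied by an average score (logistic model, assumed)."""
    return 400.0 * math.log10(score / (1.0 - score))

# Counts taken from the STC and LTC summaries of the entry above.
stc_elo = logistic_elo(score_from_wld(9621, 9391, 18356))
ltc_elo = logistic_elo(score_from_wld(3446, 3246, 6636))
print(f"STC Elo {stc_elo:+.2f}, LTC Elo {ltc_elo:+.2f}")
```

Each Ptnml list sums to half the game total because games are played in pairs with reversed colors.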
| Author: ppigazzini
Date: Fri Aug 27 07:49:26 2021 +0200 Timestamp: 1630043366 Use "pedantic" flag also for mingw This avoids running a test in fishtest where the Linux machines exit from the build process and only the Windows machines run the test. See: https://tests.stockfishchess.org/tests/view/61122d732a8a49ac5be79996 https://github.com/SFisGOD/Stockfish/commit/4e422577d6ebd1f6ecf606189190b8f6fb03f6c9#comments closes https://github.com/official-stockfish/Stockfish/pull/3671 No functional change. see source |
| Author: Joost VandeVondele
Date: Fri Aug 27 07:48:18 2021 +0200 Timestamp: 1630043298 Fix empty EvalFile option. Some GUIs send an empty string for EvalFile; in that case, explicitly try the default name. Fixes https://github.com/official-stockfish/Stockfish/issues/3675 closes https://github.com/official-stockfish/Stockfish/pull/3678 No functional change. see source |
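The fallback described in this entry can be sketched as follows. This is only an illustration of the idea, not the engine's actual option handling; `resolve_eval_file` and the placeholder default name are hypothetical.

```python
def resolve_eval_file(option_value, default_name):
    """Return the network file name to try for a given EvalFile option value.

    An empty string (as sent by some GUIs) is treated as "use the default".
    """
    if option_value == "":
        return default_name
    return option_value

# "nn-default.nnue" is a placeholder, not a real network name.
print(resolve_eval_file("", "nn-default.nnue"))
print(resolve_eval_file("my-net.nnue", "nn-default.nnue"))
```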
| Author: bmc4
Date: Sun Aug 22 09:15:19 2021 +0200 Timestamp: 1629616519 Simplify Declaration on Pawn Move Generation Removes possible micro-optimization in favor of readability. STC: LLR: 2.95 (-2.94,2.94) <-2.50,0.50> Total: 75432 W: 5824 L: 5777 D: 63831 Elo +0.22 Ptnml(0-2): 178, 4648, 28036, 4657, 197 https://tests.stockfishchess.org/tests/view/611fa7f84977aa1525c9cb75 LTC: LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 41200 W: 1156 L: 1106 D: 38938 Elo +0.42 Ptnml(0-2): 13, 981, 18562, 1031, 13 https://tests.stockfishchess.org/tests/view/611fcc694977aa1525c9cb9b Closes https://github.com/official-stockfish/Stockfish/pull/3669 No functional change see source |
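The LLR bounds (-2.94, 2.94) that appear in every test summary are consistent with a sequential probability ratio test using 5% error rates on both sides. The sketch below assumes alpha = beta = 0.05, which this log does not state explicitly, and uses the classic Wald bounds; `sprt_bounds` is a hypothetical helper name.

```python
import math

def sprt_bounds(alpha, beta):
    """Wald SPRT stopping bounds for the log-likelihood ratio.

    alpha: accepted false-positive rate, beta: accepted false-negative rate.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

# Assumed error rates; they reproduce the bounds quoted in the summaries.
lower, upper = sprt_bounds(0.05, 0.05)
print(f"({lower:.2f}, {upper:.2f})")
```

A test stops and passes once the LLR crosses the upper bound, and stops and fails once it crosses the lower one.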
| Author: SFisGOD
Date: Sun Aug 22 09:09:58 2021 +0200 Timestamp: 1629616198 Update default net to nn-517c4f68b5df.nnue SPSA: https://tests.stockfishchess.org/tests/view/611cf0da4977aa1525c9ca03 Parameters: 256 net weights and 8 net biases (output layer) Base net: nn-ac5605a608d6.nnue New net: nn-517c4f68b5df.nnue STC: LLR: 2.93 (-2.94,2.94) <-0.50,2.50> Total: 11600 W: 998 L: 851 D: 9751 Elo +4.40 Ptnml(0-2): 30, 705, 4186, 846, 33 https://tests.stockfishchess.org/tests/view/611f84524977aa1525c9cb5b LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.50> Total: 9360 W: 338 L: 243 D: 8779 Elo +3.53 Ptnml(0-2): 0, 220, 4151, 303, 6 https://tests.stockfishchess.org/tests/view/611f8c5b4977aa1525c9cb64 closes https://github.com/official-stockfish/Stockfish/pull/3667 Bench: 4844618 see source |
| Author: candirufish
Date: Sun Aug 22 09:05:53 2021 +0200 Timestamp: 1629615953 do more LMR extensions for PV nodes LMR Pv and depth 6 Extension tweak: LTC: LLR: 2.93 (-2.94,2.94) <0.50,3.50> Total: 52488 W: 1542 L: 1394 D: 49552 Elo +0.98 Ptnml(0-2): 18, 1253, 23552, 1405, 16 https://tests.stockfishchess.org/tests/view/611e49c34977aa1525c9caa7 STC: LLR: 2.94 (-2.94,2.94) <-0.50,2.50> Total: 76216 W: 6000 L: 5784 D: 64432 Elo +0.98 Ptnml(0-2): 204, 4745, 28006, 4937, 216 https://tests.stockfishchess.org/tests/view/611e0e254977aa1525c9ca89 closes https://github.com/official-stockfish/Stockfish/pull/3666 Bench: 5046381 see source |
| Author: bmc4
Date: Sun Aug 22 09:00:15 2021 +0200 Timestamp: 1629615615 Simplify Null Move Search Reduction slightly simpler formula for reduction computation. first round of tests: STC: LLR: 2.97 (-2.94,2.94) <-2.50,0.50> Total: 15632 W: 1319 L: 1204 D: 13109 Elo +2.56 Ptnml(0-2): 33, 956, 5733, 1051, 43 https://tests.stockfishchess.org/tests/view/60bd03c7457376eb8bcaa600 LTC: LLR: 3.37 (-2.94,2.94) <-2.50,0.50> Total: 86296 W: 2814 L: 2779 D: 80703 Elo +0.14 Ptnml(0-2): 33, 2500, 38039, 2551, 25 https://tests.stockfishchess.org/tests/view/60bd1ff0457376eb8bcaa653 recent tests: STC: LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 23936 W: 1895 L: 1793 D: 20248 Elo +1.48 Ptnml(0-2): 40, 1470, 8869, 1526, 63 https://tests.stockfishchess.org/tests/view/611f9b7d4977aa1525c9cb6b LTC: LLR: 2.95 (-2.94,2.94) <-2.50,0.50> Total: 62568 W: 1750 L: 1713 D: 59105 Elo +0.21 Ptnml(0-2): 19, 1560, 28085, 1605, 15 https://tests.stockfishchess.org/tests/view/611fa4814977aa1525c9cb71 functional on high depth closes https://github.com/official-stockfish/Stockfish/pull/3535 Bench: 5375286 see source |
| Author: Tomasz Sobczyk
Date: Fri Aug 20 08:50:25 2021 +0200 Timestamp: 1629442225 Optimize and tidy up affine transform code. The new network caused some issues initially due to the very narrow neuron set between the first two FC layers. Necessary changes were hacked together to make it work. This patch is a mature approach to make the affine transform code faster, more readable, and easier to maintain should the layer sizes change again. The following changes were made:
* ClippedReLU always produces a multiple of 32 outputs. This is about as good a solution for AffineTransform's SIMD requirements as it can get without a bigger rewrite.
* All self-contained SIMD helpers are moved to a separate file (simd.h). Inline asm is utilized to work around GCC's issues with code generation and register assignment. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101693, https://godbolt.org/z/da76fY1n7
* AffineTransform has 2 specializations. While it's more lines of code due to the boilerplate, the logic in both is significantly reduced, as these two are impossible to nicely combine into one.
1) The first specialization is for cases when there are >=128 inputs. It uses a different approach to perform the affine transform and can make full use of AVX512 without any edge cases. Furthermore, it has higher theoretical throughput because fewer loads are needed in the hot path, requiring only a fixed number of instructions for horizontal additions at the end, which are amortized by the large number of inputs.
2) The second specialization is made to handle smaller layers where performance is still necessary but edge cases need to be handled. The AVX512 implementation for this was omitted by mistake, a remnant from the temporary implementation for the new... This could be easily reintroduced if needed.
A slightly more detailed description of both implementations is in the code.
Overall it should be a minor speedup, as shown on fishtest: passed STC: LLR: 2.96 (-2.94,2.94) <-0.50,2.50> Total: 51520 W: 4074 L: 3888 D: 43558 Elo +1.25 Ptnml(0-2): 111, 3136, 19097, 3288, 128 and various tests shown in the pull request closes https://github.com/official-stockfish/Stockfish/pull/3663 No functional change see source |
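Two of the rules in this entry are simple arithmetic: ClippedReLU output counts are rounded up to a multiple of 32, and the choice between the two AffineTransform specializations depends on whether a layer has at least 128 inputs. The sketch below only illustrates those two rules; the function names and the "large"/"small" labels are hypothetical and do not come from the engine's source.

```python
def ceil_to_multiple(n, base):
    """Round n up to the nearest multiple of base."""
    return (n + base - 1) // base * base

def padded_clipped_relu_outputs(output_dims):
    """Number of outputs produced once padded to a multiple of 32."""
    return ceil_to_multiple(output_dims, 32)

def pick_specialization(input_dims):
    """Select the affine transform variant by input count (threshold from the text)."""
    return "large" if input_dims >= 128 else "small"

print(padded_clipped_relu_outputs(8))   # a narrow layer is padded up
print(pick_specialization(1024))
print(pick_specialization(32))
```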
| Author: Tomasz Sobczyk
Date: Fri Aug 20 07:57:09 2021 +0200 Timestamp: 1629439029 Improve handling of the debug log file. Fix handling of empty strings in uci options and reassigning of the log file Fixes https://github.com/official-stockfish/Stockfish/issues/3650 Closes https://github.com/official-stockfish/Stockfish/pull/3655 No functional change see source |
| Author: Torsten Hellwig
Date: Wed Aug 18 09:17:22 2021 +0200 Timestamp: 1629271042 Update default net to nn-ac5605a608d6.nnue This net was created with the nnue-pytorch trainer, it used the previous master net as a starting point. The training data includes all T60 data (https://drive.google.com/drive/folders/1rzZkgIgw7G5vQMLr2hZNiUXOp7z80613), all T74 data (https://drive.google.com/drive/folders/1aFUv3Ih3-A8Vxw9064Kw_FU4sNhMHZU-) and the wrongNNUE_02_d9.binpack (https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq). The Leela data were randomly named and then concatenated. All data was merged into one binpack using interleave_binpacks.py. python3 train.py \ ../data/t60_t74_wrong.binpack \ ../data/t60_t74_wrong.binpack \ --resume-from-model ../data/nn-e8321e467bf6.pt \ --gpus 1 \ --threads 4 \ --num-workers 1 \ --batch-size 16384 \ --progress_bar_refresh_rate 300 \ --random-fen-skipping 3 \ --features=HalfKAv2_hm^ \ --lambda=1.0 \ --max_epochs=600 \ --seed $RANDOM \ --default_root_dir ../output/exp_24 STC: LLR: 2.95 (-2.94,2.94) <-0.50,2.50> Total: 15320 W: 1415 L: 1257 D: 12648 Elo +3.58 Ptnml(0-2): 50, 1002, 5402, 1152, 54 https://tests.stockfishchess.org/tests/view/611c404a4977aa1525c9c97f LTC: LLR: 2.94 (-2.94,2.94) <0.50,3.50> Total: 9440 W: 345 L: 248 D: 8847 Elo +3.57 Ptnml(0-2): 3, 222, 4175, 315, 5 https://tests.stockfishchess.org/tests/view/611c6c7d4977aa1525c9c996 LTC with UHO_XXL_+0.90_+1.19.epd: LLR: 2.94 (-2.94,2.94) <0.50,3.50> Total: 6232 W: 1638 L: 1459 D: 3135 Elo +9.98 Ptnml(0-2): 5, 592, 1744, 769, 6 https://tests.stockfishchess.org/tests/view/611c9b214977aa1525c9c9cb closes https://github.com/official-stockfish/Stockfish/pull/3664 Bench: 5375286 see source |
| Author: Joost VandeVondele
Date: Tue Aug 17 21:08:34 2021 +0200 Timestamp: 1629227314 Regenerate dependencies on code change fixes https://github.com/official-stockfish/Stockfish/issues/3658 dependencies are now regenerated for each code change, this adds some 1s overhead in compile time, but avoids potential miscompilations or build problems. closes https://github.com/official-stockfish/Stockfish/pull/3659 No functional change see source |
| Author: Tomasz Sobczyk
Date: Sun Aug 15 12:05:43 2021 +0200 Timestamp: 1629021943 New NNUE architecture and net Introduces a new NNUE network architecture and associated network parameters. The summary of the changes:
* Position for each perspective mirrored such that the king is on e..h files. Cuts the feature transformer size in half, while preserving enough knowledge to be good. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.b40q4rb1w7on.
* The number of neurons after the feature transformer increased two-fold, to 1024x2. This is possibly mostly due to the now very optimized feature transformer update code.
* The number of neurons after the second layer is reduced from 16 to 8, to reduce the speed impact. This, perhaps surprisingly, doesn't harm the strength much. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.6qkocr97fezq
The AffineTransform code did not work out of the box with the smaller number of neurons after the second layer, so some temporary changes have been made to add a special case for InputDimensions == 8. Also additional 0 padding is added to the output for some archs that cannot process inputs by <=8 (SSE2, NEON). VNNI uses an implementation that can keep all outputs in the registers while reducing the number of loads by 3 for each 16 inputs, thanks to the reduced number of output neurons. However GCC is particularly bad at optimization here (and perhaps why the current way the affine transform is done even passed sprt) (see https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit# for details) and more work will be done on this in the following days. I expect the current VNNI implementation to be improved and extended to other architectures. 
The network was trained with a slightly modified version of the pytorch trainer (https://github.com/glinscott/nnue-pytorch); the changes are in https://github.com/glinscott/nnue-pytorch/pull/143 The training utilized 2 datasets.
dataset A - https://drive.google.com/file/d/1VlhnHL8f-20AXhGkILujnNXHwy9T-MQw/view?usp=sharing
dataset B - as described in https://github.com/official-stockfish/Stockfish/commit/ba01f4b95448bcb324755f4dd2a632a57c6e67bc
The training process was as follows:
* train on dataset A for 350 epochs, take the best net in terms of elo at 20k nodes per move (it's fine to take anything from later stages of training).
* convert the .ckpt to .pt
* --resume-from-model from the .pt file, train on dataset B for <600 epochs, take the best net. Lambda=0.8, applied before the loss function.
The first training command:
python3 train.py \
  ../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
  ../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
  --gpus "$3," \
  --threads 1 \
  --num-workers 1 \
  --batch-size 16384 \
  --progress_bar_refresh_rate 20 \
  --smart-fen-skipping \
  --random-fen-skipping 3 \
  --features=HalfKAv2_hm^ \
  --lambda=1.0 \
  --max_epochs=600 \
  --default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
The second training command:
python3 serialize.py \
  --features=HalfKAv2_hm^ \
  ../nnue-pytorch-training/experiment_131/run_6/default/version_0/checkpoints/epoch-499.ckpt \
  ../nnue-pytorch-training/experiment_$1/base/base.pt
python3 train.py \
  ../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
  ../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
  --gpus "$3," \
  --threads 1 \
  --num-workers 1 \
  --batch-size 16384 \
  --progress_bar_refresh_rate 20 \
  --smart-fen-skipping \
  --random-fen-skipping 3 \
  --features=HalfKAv2_hm^ \
  --lambda=0.8 \
  --max_epochs=600 \
  --resume-from-model ../nnue-pytorch-training/experiment_$1/base/base.pt \
  --default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
STC: 
https://tests.stockfishchess.org/tests/view/611120b32a8a49ac5be798c4 LLR: 2.97 (-2.94,2.94) <-0.50,2.50> Total: 22480 W: 2434 L: 2251 D: 17795 Elo +2.83 Ptnml(0-2): 101, 1736, 7410, 1865, 128 LTC: https://tests.stockfishchess.org/tests/view/611152b32a8a49ac5be798ea LLR: 2.93 (-2.94,2.94) <0.50,3.50> Total: 9776 W: 442 L: 333 D: 9001 Elo +3.87 Ptnml(0-2): 5, 295, 4180, 402, 6 closes https://github.com/official-stockfish/Stockfish/pull/3646 bench: 5189338 see source |
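The mirroring idea from this entry (each perspective is flipped horizontally so that its king sits on files e..h) can be illustrated with a few lines of square arithmetic. This is a sketch under assumptions, not the engine's implementation: squares are numbered a1 = 0, b1 = 1, ..., h8 = 63, and the function names are hypothetical.

```python
def file_of(sq):
    """File index of a square: 0 = a-file ... 7 = h-file."""
    return sq & 7

def mirror_files(sq):
    """Flip a square horizontally (a-file <-> h-file, b <-> g, ...)."""
    return sq ^ 7

def orient(king_sq, sq):
    """Map a square into the frame where the king is on files e..h."""
    return mirror_files(sq) if file_of(king_sq) < 4 else sq

# After orientation only half of the king squares remain possible,
# which is why the king-indexed feature set can be halved.
oriented_king_squares = {orient(k, k) for k in range(64)}
print(len(oriented_king_squares))
```

With the king on a1, for example, every square is flipped, so the king itself lands on h1.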
| Author: Joost VandeVondele
Date: Thu Aug 5 16:41:07 2021 +0200 Timestamp: 1628174467 Revert futility pruning patches. Reverts 09b6d28391cf582d99897360b225bcbbe38dd1c6 and dbd7f602d3c7622df294f87d7239b5aaf31f695f, which significantly impact mate finding capabilities. For example on ChestUCI_23102018.epd, at 1M nodes, the number of mates found is reduced nearly 2x without these depth conditions: sf6 2091, sf7 2093, sf8 2107, sf9 2062, sf10 2208, sf11 2552, sf12 2563, sf13 2509, sf14 2427, master 1246, patched 2467 (script for testing at https://github.com/official-stockfish/Stockfish/files/6936412/matecheck.zip) closes https://github.com/official-stockfish/Stockfish/pull/3641 fixes https://github.com/official-stockfish/Stockfish/issues/3627 Bench: 5467570 see source |
| Author: VoyagerOne
Date: Thu Aug 5 16:32:07 2021 +0200 Timestamp: 1628173927 SEE simplification Simplified SEE formula by removing std::min. Should also be easier to tune. STC: LLR: 2.95 (-2.94,2.94) <-2.50,0.50> Total: 22656 W: 1836 L: 1729 D: 19091 Elo +1.64 Ptnml(0-2): 54, 1426, 8267, 1521, 60 https://tests.stockfishchess.org/tests/view/610ae62f2a8a49ac5be79449 LTC: LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 26248 W: 806 L: 744 D: 24698 Elo +0.82 Ptnml(0-2): 6, 668, 11715, 728, 7 https://tests.stockfishchess.org/tests/view/610b17ad2a8a49ac5be79466 closes https://github.com/official-stockfish/Stockfish/pull/3643 bench: 4915145 see source |
| Author: SFisGOD
Date: Thu Aug 5 08:52:07 2021 +0200 Timestamp: 1628146327 Update default net to nn-46832cfbead3.nnue SPSA 1: https://tests.stockfishchess.org/tests/view/6100e7f096b86d98abf6a832 Parameters: A total of 256 net weights and 8 net biases were tuned (output layer) Base net: nn-56a5f1c4173a.nnue New net: nn-ec3c8e029926.nnue SPSA 2: https://tests.stockfishchess.org/tests/view/610733caafad2da4f4ae3da7 Parameters: A total of 256 net biases were tuned (hidden layer 2) Base net: nn-ec3c8e029926.nnue New net: nn-46832cfbead3.nnue STC: LLR: 2.98 (-2.94,2.94) <-0.50,2.50> Total: 50520 W: 3953 L: 3765 D: 42802 Elo +1.29 Ptnml(0-2): 138, 3063, 18678, 3235, 146 https://tests.stockfishchess.org/tests/view/610a79692a8a49ac5be793f4 LTC: LLR: 2.94 (-2.94,2.94) <0.50,3.50> Total: 57256 W: 1723 L: 1566 D: 53967 Elo +0.95 Ptnml(0-2): 12, 1442, 25568, 1589, 17 https://tests.stockfishchess.org/tests/view/610ac5bb2a8a49ac5be79434 Closes https://github.com/official-stockfish/Stockfish/pull/3642 Bench: 5359314 see source |
| Author: Stefan Geschwentner
Date: Thu Aug 5 08:47:33 2021 +0200 Timestamp: 1628146053 Simplify new cmh pruning thresholds by directly using a quadratic formula. This also decouples the stat bonus updates from the threshold, which creates fewer dependencies for tuning of stat bonus parameters. Perhaps further fine tuning of the now separated coefficients for constHist[0] and constHist[1] could give further gains. STC: LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 78384 W: 6134 L: 6090 D: 66160 Elo +0.20 Ptnml(0-2): 207, 5013, 28705, 5063, 204 https://tests.stockfishchess.org/tests/view/6106d235afad2da4f4ae3d4b LTC: LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 38176 W: 1149 L: 1095 D: 35932 Elo +0.49 Ptnml(0-2): 6, 1000, 17030, 1038, 14 https://tests.stockfishchess.org/tests/view/6107a080afad2da4f4ae3def closes https://github.com/official-stockfish/Stockfish/pull/3639 Bench: 5098146 see source |
| Author: VoyagerOne
Date: Thu Aug 5 08:44:38 2021 +0200 Timestamp: 1628145878 Futile pruning simplification Remove CMH conditions in futile pruning. STC: LLR: 2.94 (-2.94,2.94) <-2.50,0.50> Total: 93520 W: 7165 L: 7138 D: 79217 Elo +0.10 Ptnml(0-2): 222, 5923, 34427, 5982, 206 https://tests.stockfishchess.org/tests/view/61083104e50a153c346ef8df LTC: LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 59072 W: 1746 L: 1706 D: 55620 Elo +0.24 Ptnml(0-2): 13, 1562, 26353, 1588, 20 https://tests.stockfishchess.org/tests/view/610894f2e50a153c346ef913 closes https://github.com/official-stockfish/Stockfish/pull/3638 Bench: 5229673 see source |
| Author: VoyagerOne
Date: Sat Jul 31 15:29:19 2021 +0200 Timestamp: 1627738159 CMH Pruning Tweak replace CounterMovePruneThreshold by a depth dependent threshold STC: LLR: 2.94 (-2.94,2.94) <-0.50,2.50> Total: 35512 W: 2718 L: 2552 D: 30242 Elo +1.62 Ptnml(0-2): 66, 2138, 13194, 2280, 78 https://tests.stockfishchess.org/tests/view/6104442fafad2da4f4ae3b94 LTC: LLR: 2.96 (-2.94,2.94) <0.50,3.50> Total: 36536 W: 1150 L: 1019 D: 34367 Elo +1.25 Ptnml(0-2): 10, 920, 16278, 1049, 11 https://tests.stockfishchess.org/tests/view/6104b033afad2da4f4ae3bbc closes https://github.com/official-stockfish/Stockfish/pull/3636 Bench: 5848718 see source |
| Author: Tomasz Sobczyk
Date: Fri Jul 30 17:15:52 2021 +0200 Timestamp: 1627658152 Avoid unnecessary stores in the affine transform This patch improves the codegen in the AffineTransform::forward function for architectures >=SSSE3. Current code works directly on memory and the compiler cannot see that the stores through outptr do not alias the loads through weights and input32. The solution implemented is to perform the affine transform with local variables as accumulators and only store the result to memory at the end. The number of accumulators required is OutputDimensions / OutputSimdWidth, which means that for the 1024->16 affine transform it requires 4 registers with SSSE3, 2 with AVX2, 1 with AVX512. It also cuts the number of stores required by NumRegs * 256 for each node evaluated. The local accumulators are expected to be assigned to registers, but even if this cannot be done in some cases due to register pressure, it will help the compiler to see that there is no aliasing between the loads and stores and may still result in better codegen. See https://godbolt.org/z/59aTKbbYc for codegen comparison. passed STC: LLR: 2.94 (-2.94,2.94) <-0.50,2.50> Total: 140328 W: 10635 L: 10358 D: 119335 Elo +0.69 Ptnml(0-2): 302, 8339, 52636, 8554, 333 closes https://github.com/official-stockfish/Stockfish/pull/3634 No functional change see source |
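The register counts quoted in this entry follow from OutputDimensions / OutputSimdWidth. The sketch below reproduces that arithmetic under the assumption that each accumulator lane is 32 bits wide; the function name and the `lane_bits` parameter are hypothetical.

```python
def num_accumulators(output_dims, simd_bits, lane_bits=32):
    """Registers needed to hold all outputs as local accumulators.

    lane_bits=32 is an assumption (one 32-bit accumulator per output).
    """
    lanes_per_register = simd_bits // lane_bits
    return output_dims // lanes_per_register

# The 1024->16 affine transform mentioned above has 16 outputs.
for name, bits in (("SSSE3", 128), ("AVX2", 256), ("AVX512", 512)):
    print(name, num_accumulators(16, bits))
```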
| Author: SFisGOD
Date: Thu Jul 29 07:35:13 2021 +0200 Timestamp: 1627536913 Update default net to nn-56a5f1c4173a.nnue SPSA 1: https://tests.stockfishchess.org/tests/view/60fd24efd8a6b65b2f3a796e Parameters: A total of 256 net biases were tuned (hidden layer 2) New best values: Half of the changes from the tuning run New net: nn-5992d3ba79f3.nnue SPSA 2: https://tests.stockfishchess.org/tests/view/60fec7d6d8a6b65b2f3a7aa2 Parameters: A total of 128 net biases were tuned (hidden layer 1) New best values: Half of the changes from the tuning run New net: nn-56a5f1c4173a.nnue STC: LLR: 2.94 (-2.94,2.94) <-0.50,2.50> Total: 140392 W: 10863 L: 10578 D: 118951 Elo +0.71 Ptnml(0-2): 347, 8754, 51718, 9021, 356 https://tests.stockfishchess.org/tests/view/610037e396b86d98abf6a79e LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.50> Total: 14216 W: 454 L: 355 D: 13407 Elo +2.42 Ptnml(0-2): 4, 323, 6356, 420, 5 https://tests.stockfishchess.org/tests/view/61019995afad2da4f4ae3a3c Closes #3633 Bench: 4801359 see source |
| Author: SFisGOD
Date: Mon Jul 26 07:52:59 2021 +0200 Timestamp: 1627278779 Update default net to nn-26abeed38351.nnue SPSA: https://tests.stockfishchess.org/tests/view/60fba335d8a6b65b2f3a7891 New best values: Half of the changes from the tuning run. Setting: nodestime=300 with 10+0.1 (approximate real TC is 2.5 seconds) The rest is the same as described in #3593 The change from nodestime=600 to 300 was suggested by gekkehenker to prevent time losses for some slow workers SFisGOD@94cd757#commitcomment-53324840 STC: LLR: 2.96 (-2.94,2.94) <-0.50,2.50> Total: 67448 W: 5241 L: 5036 D: 57171 Elo +1.06 Ptnml(0-2): 151, 4198, 24827, 4391, 157 https://tests.stockfishchess.org/tests/view/60fd50f2d8a6b65b2f3a798e LTC: LLR: 2.93 (-2.94,2.94) <0.50,3.50> Total: 48752 W: 1504 L: 1358 D: 45890 Elo +1.04 Ptnml(0-2): 13, 1226, 21754, 1368, 15 https://tests.stockfishchess.org/tests/view/60fd7bb2d8a6b65b2f3a79a9 Closes https://github.com/official-stockfish/Stockfish/pull/3630 Bench: 5124774 see source |
| Author: Giacomo Lorenzetti
Date: Mon Jul 26 07:48:58 2021 +0200 Timestamp: 1627278538 Simplification in LMR This commit removes the `!captureOrPromotion` condition from ttCapture reduction and from good/bad history reduction (similar to #3619). passed STC: https://tests.stockfishchess.org/tests/view/60fc734ad8a6b65b2f3a7922 LLR: 2.97 (-2.94,2.94) <-2.50,0.50> Total: 48680 W: 3855 L: 3776 D: 41049 Elo +0.56 Ptnml(0-2): 118, 3145, 17744, 3206, 127 passed LTC: https://tests.stockfishchess.org/tests/view/60fce7d5d8a6b65b2f3a794c LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 86528 W: 2471 L: 2450 D: 81607 Elo +0.08 Ptnml(0-2): 28, 2203, 38777, 2232, 24 closes https://github.com/official-stockfish/Stockfish/pull/3629 Bench: 4951406 see source |
| Author: MichaelB7
Date: Sat Jul 24 18:04:59 2021 +0200 Timestamp: 1627142699 Update the default net to nn-76a8a7ffb820.nnue. Combined work by Sergio Vieri, Michael Byrne, and Jonathan D (aka SFisGod) based on top of previous developments, by restarts from good nets.
Sergio generated the net https://tests.stockfishchess.org/api/nn/nn-d8609abe8caf.nnue: The initial net nn-d8609abe8caf.nnue is trained by generating around 16B of training data from the last master net nn-9e3c6298299a.nnue, then trained, continuing from the master net, with lambda=0.2 and sampling ratio of 1. Starting with LR=2e-3, dropping LR with a factor of 0.5 until it reaches LR=5e-4. in_scaling is set to 361. No other significant changes made to the pytorch trainer.
Training data gen command (generates in chunks of 200k positions): generate_training_data min_depth 9 max_depth 11 count 200000 random_move_count 10 random_move_max_ply 80 random_multi_pv 12 random_multi_pv_diff 100 random_multi_pv_depth 8 write_min_ply 10 eval_limit 1500 book noob_3moves.epd output_file_name gendata/$(date +"%Y%m%d-%H%M")_${HOSTNAME}.binpack
PyTorch trainer command (Note that this only trains for 20 epochs, repeatedly train until convergence): python train.py --features "HalfKAv2^" --max_epochs 20 --smart-fen-skipping --random-fen-skipping 500 --batch-size 8192 --default_root_dir $dir --seed $RANDOM --threads 4 --num-workers 32 --gpus $gpuids --track_grad_norm 2 --gradient_clip_val 0.05 --lambda 0.2 --log_every_n_steps 50 $resumeopt $data $val
See https://github.com/sergiovieri/Stockfish/tree/tools_mod/rl for the scripts used to generate data. 
Based on that Michael generated nn-76a8a7ffb820.nnue in the following way: The net being submitted was trained with the pytorch trainer: https://github.com/glinscott/nnue-pytorch
python train.py i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 30 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --auto_lr_find True --lambda=1.0 --max_epochs=240 --seed %random%%random% --default_root_dir exp/run_109 --resume-from-model ./pt/nn-d8609abe8caf.pt
This run is thus started from Sergio Vieri's net nn-d8609abe8caf.nnue. all.binpack equaled 4 parts Wrong_NNUE_2.binpack https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq/view?usp=sharing plus two parts of Training_Data.binpack https://drive.google.com/file/d/1RFkQES3DpsiJqsOtUshENtzPfFgUmEff/view?usp=sharing Each set was concatenated together - making one large Wrong_NNUE 2 binpack and one large Training so they were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE.binpack closer to equal weighting with the Training_Data binpack.
model.py modifications: loss = torch.pow(torch.abs(p - q), 2.6).mean() LR = 8.0e-5 calculated as follows: 1.5e-3*(.992^360) - the idea here was to take a highly trained net and just use all.binpack as a finishing micro refinement touch for the last 2 Elo or so. This net was discovered on the 59th epoch. optimizer = ranger.Ranger(train_params, betas=(.90, 0.999), eps=1.0e-7, gc_loc=False, use_gc=False) scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.992)
For this micro optimization, I had set the period to "5" in train.py. This changes the checkpoint output so that every 5th checkpoint file is created. The final touches were to adjust the NNUE scale, as was done by Jonathan in tests running at the same time. 
passed LTC https://tests.stockfishchess.org/tests/view/60fa45aed8a6b65b2f3a77a4 LLR: 2.94 (-2.94,2.94) <0.50,3.50> Total: 53040 W: 1732 L: 1575 D: 49733 Elo +1.03 Ptnml(0-2): 14, 1432, 23474, 1583, 17 passed STC https://tests.stockfishchess.org/tests/view/60f9fee2d8a6b65b2f3a7775 LLR: 2.94 (-2.94,2.94) <-0.50,2.50> Total: 37928 W: 3178 L: 3001 D: 31749 Elo +1.62 Ptnml(0-2): 100, 2446, 13695, 2623, 100. closes https://github.com/official-stockfish/Stockfish/pull/3626 Bench: 5169957 see source |
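The learning rate quoted in this entry, 1.5e-3*(.992^360), is what a per-epoch step decay with gamma=0.992 would reach after 360 epochs. The sketch below evaluates that expression in plain Python (no PyTorch needed); `decayed_lr` is a hypothetical helper. The result comes out at roughly 8.3e-5, which the commit message rounds to 8.0e-5.

```python
def decayed_lr(initial_lr, gamma, epochs):
    """Learning rate after `epochs` multiplicative decay steps of factor gamma."""
    return initial_lr * gamma ** epochs

# Values taken from the commit message above.
lr = decayed_lr(1.5e-3, 0.992, 360)
print(f"{lr:.2e}")
```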
| Author: Giacomo Lorenzetti
Date: Fri Jul 23 19:02:58 2021 +0200 Timestamp: 1627059778 Apply good/bad history reduction also when inCheck The main idea is that, in some cases, 'in check' situations are not so different from 'not in check' ones. Attempts to use piece count to select only a few 'in check' situations have failed LTC testing. It could be interesting to apply one of those ideas in other parts of the search function. passed STC: https://tests.stockfishchess.org/tests/view/60f1b68dd1189bed71812d40 LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 53472 W: 4078 L: 4008 D: 45386 Elo +0.45 Ptnml(0-2): 127, 3297, 19795, 3413, 104 passed LTC: https://tests.stockfishchess.org/tests/view/60f291e6d1189bed71812de3 LLR: 2.92 (-2.94,2.94) <-2.50,0.50> Total: 89712 W: 2651 L: 2632 D: 84429 Elo +0.07 Ptnml(0-2): 60, 2261, 40188, 2294, 53 closes https://github.com/official-stockfish/Stockfish/pull/3619 Bench: 5185789 see source |
| Author: pb00067
Date: Fri Jul 23 18:53:03 2021 +0200 Timestamp: 1627059183 Simplify lowply-history scoring logic STC: https://tests.stockfishchess.org/tests/view/60eee559d1189bed71812b16 LLR: 2.97 (-2.94,2.94) <-2.50,0.50> Total: 33976 W: 2523 L: 2431 D: 29022 Elo +0.94 Ptnml(0-2): 66, 2030, 12730, 2070, 92 LTC: https://tests.stockfishchess.org/tests/view/60eefa12d1189bed71812b24 LLR: 2.93 (-2.94,2.94) <-2.50,0.50> Total: 107240 W: 3053 L: 3046 D: 101141 Elo +0.02 Ptnml(0-2): 56, 2668, 48154, 2697, 45 closes https://github.com/official-stockfish/Stockfish/pull/3616 bench: 5199177 see source |