|Stockfish is brilliant.|
|I would like to see Carlsen try this.|
|I have two engines: the Positional and scientific Professor Stockfish and Leela: she is the most beautiful but impossible.
The rest does not exist.
|I believe that Stockfish running on a laptop with a CPU speed of 3000 kN/s at the first move will crush Carlsen in 100 games.
Stockfish Time Control: Bullet 1m + 1s
Carlsen Time Control: Standard/Classic Time Control
It happens. Stockfish doesn't win all the games, nor does it avoid losing every time.
But it's still much better (unless something changed and I didn't notice).
|>Once you find that Hash X is better than Hash Y, then this will apply to all engines, so the test only needs to be done once, and henceforth there is no need to test again.
I don't know if this is true.
Maybe it is true for Stockfish versions close to the one you tested, but not for other engines or for much older or newer versions. Besides, not all engines have the same kN/s.
>That is why every engine listed on CCRL, SPCC, CEGT & FastGMs, their ELO can be different, it's because the number of games played is also different, and you can't say that CCRL is correct and SPCC is wrong, both are true based on their own data.
Randomness and the fact that they are using different conditions is the reason why they may differ a bit.
>You can't use an average value (386),
I never said you could. If the value of the formula is 386 then either 256 or 512 should be fine.
Maybe one is slightly better but I would think more like 1-3 elo.
Those 7 elo you mentioned don't seem to be for this case.
It seems like 256 (or less?) or 512 at most was the best value, and the test was for 1024 and 2048, which are clearly wrong values.
> Because if you want to get the most accurate ELO estimate, then you have to play at least 60000 games, because only with that many games the Error can be 0 or 1 (based on Bayeselo).
The error is never zero, it just gets smaller and smaller with more and more games.
If the 2 engines are close, then that may really be how many games are needed before any reliable conclusion can be reached. So none of those sites would really come to a conclusion that can be trusted; the conclusion would be something like this: there's a 60% chance that engine X is better than engine Y.
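To put a number on a statement like "there's a 60% chance that engine X is better", one common approach is the likelihood of superiority (LOS). Here is a minimal sketch, assuming the usual normal approximation in which only wins and losses matter and draws drop out. The function name and the example counts are made up for illustration, they are not from any of the tests in this thread.

```python
import math


def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Approximate probability that the first engine is the stronger one.

    Normal approximation: draws carry no information about who is better,
    so only decisive games are counted.
    """
    decisive = wins + losses
    if decisive == 0:
        # No decisive games, so there is no evidence either way.
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


# Hypothetical match result, purely for illustration.
print(likelihood_of_superiority(wins=120, losses=111))
```

A value near 0.5 means the match has not separated the two settings; values close to 1.0 are what you want before calling one of them better.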
>Of course, with more games, ELO will change, but at least I'm talking with Data, that the ELO Difference 7, it is based on 440 games and not assumptions. If you want to do a test with more games, it's up to you.
If the elo difference is +7 with +-30 elo error bars, it could just as well be that after 60K games you would find that 512 is +2 with +-1 elo error bars. So you can't really claim 7 elo based on those results alone; 7 elo is the difference only if the results hold up.
>Which one is true?
The second one may be +5 elo with +-5 elo error bars, so it would tell you that it is up to 10 elo better. Your test would tell us that it is +7 elo with +-40 elo error bars, so it wouldn't be helpful.
So, what are your error bars?
>Both are true, because they are based on Data.
Both are false, because the truth is that there are error bars that were not reported.
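For anyone who wants to report error bars with a result, here is a rough sketch of how they can be estimated from a win/draw/loss count. It assumes a plain normal approximation on the score (this is not what Bayeselo does internally), and the 120/209/111 split below is a made-up 440-game result chosen only because it lands near +7 elo.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction (strictly between 0 and 1) to an elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_error_bars(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) with an approximate 95% interval for z = 1.96.

    Assumes the score is not 0% or 100%, otherwise the conversion is undefined.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)
    low = elo_from_score(score - z * stderr)
    high = elo_from_score(score + z * stderr)
    return elo_from_score(score), low, high


# Hypothetical 440-game match, not real data.
elo, low, high = elo_with_error_bars(wins=120, draws=209, losses=111)
print(f"{elo:+.1f} elo, interval {low:+.1f} to {high:+.1f}")
```

With this made-up result the interval includes zero, so the match alone could not tell which side is better.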
|I saw Komodo crushing SF on CCC! How come?|
You can't use an average value (386), because the best Hash is a power of 2 (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, ...)
Actually, there is no specific reference to how many games should be played to get the most accurate ELO estimate. Because if you want to get the most accurate ELO estimate, then you have to play at least 60000 games, because only with that many games the Error can be 0 or 1 (based on Bayeselo).
Okay, suppose the reference must be 60000 games. Does that mean the ELO displayed on CCRL, SPCC, CEGT & FastGMs is ALL incorrect? The number of games for every engine on these sites is far below 60000.
Of course, with more games, ELO will change, but at least I'm talking with Data, that the ELO Difference 7, it is based on 440 games and not assumptions. If you want to do a test with more games, it's up to you.
I test with 440 games and get an ELO Diff 7
You test with 1000 games and get an ELO Diff 5
Which one is true?
Both are true, because they are based on Data.
Once you find that Hash X is better than Hash Y, then this will apply to all engines, so the test only needs to be done once, and henceforth there is no need to test again.
That is why every engine listed on CCRL, SPCC, CEGT & FastGMs, their ELO can be different, it's because the number of games played is also different, and you can't say that CCRL is correct and SPCC is wrong, both are true based on their own data.
|That program is so good.|
|> but in some of the tests I've done, the best Hash can be a Hash that is far below it or far above it.
Did you take into account the error bars, or did you just go with 100 games and assume the difference was because of the hash?
Indeed the best way to know is to do a test.
Let's see, how many games do you need to spot a 10 elo difference and know that hash X (which is 10 elo better) is better than hash Y?
You don't need to know that it is by 10 elo, but let's say that X is better than Y by 10 elo.
How many games need to be played to know that X is better than Y?
It seems impractical to do this especially if this is needed for every new version...
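A back-of-the-envelope answer to the "how many games" question can be sketched like this. It assumes a normal approximation, a score close to 50%, and a draw ratio that you have to guess (0.5 here is only a placeholder). The function names are made up for illustration.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side for a given elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff: float, draw_ratio: float = 0.5, z: float = 1.96) -> int:
    """Rough number of games before z standard errors fit inside the score edge.

    elo_diff must be non-zero. Near a 50% score the per-game variance is
    about (1 - draw_ratio) / 4, so more draws mean fewer games are needed.
    """
    edge = expected_score(elo_diff) - 0.5
    per_game_variance = (1.0 - draw_ratio) / 4.0
    return math.ceil((z / edge) ** 2 * per_game_variance)


for diff in (10, 5, 2):
    print(diff, "elo ->", games_needed(diff), "games")
```

Under these assumptions a 10 elo gap already needs well over 2000 games, and halving the gap roughly quadruples the number of games.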
>Take a look here
What would the optimal amount of hash be according to the formula?
If it is 512 MB then the test confirms this, although 440 games do not seem like that many between such evenly matched opponents.
The 7 elo is if you assume that this result wouldn't change if more games were played.
440 games are not enough to measure how much the difference is...
Could this be because of luck?
>improper use of Hash can cause ELO Difference 7 to 10
Maybe completely improper use of hash.
If it is between 256 or 512 (for example the formula says you should use 386)
then there won't be such a high difference.
You do as you wish
|This fish is so lovely.|
Take a look here => https://sites.google.com/site/computerschess/scct-4men-vs-5men, improper use of Hash can cause ELO Difference 7 to 10 ... that's about the same as 2 months of Stockfish updates.
500 is indeed close to 512, but in some of the tests I've done, the best Hash can be a Hash that is far below it or far above it.
So, the only way to find out is to do a test, and not just assume
|That program is gorgeous.|
|"I don’t have much time to waste on an engine version"
Almost the same here, ... I don’t have much time to waste on ...
|This fish is sweet.|
|>To find out which Hash is the best (256 or 512), do a test by playing 100 games using the same engine & the same opening, but with a different Hash.
I don't think this would work to determine which is better.
There is no significant difference and 100 games won't show the very slight difference.
Just use 512 which is very close to 500. Should be the best assuming the formula in 29763 is correct.
|That Stockfish is delicious.|
First, find out your CPU speed, use Fritz Benchmark (Download here => http://www.jens-hartmann.at/Fritzmarks/)
Second, find out what is the average Time per Move (in seconds) for Time Control you use (in this case: 5m+2s).
Then, the Hash value is determined by this formula:
Hash (MB) = (CPU Speed in kN/s x Time per Move in seconds) / 100
For example:
CPU Speed = 5000 kN/s
Time / Move = 10 seconds
Hash = (5000 * 10) / 100 = 500
Then, the Hash value you can use is 256 MB or 512 MB.
Less than 256 or more than 512 will reduce engine performance.
To find out which Hash is the best (256 or 512), do a test by playing 100 games using the same engine & the same opening, but with a different Hash.
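The rule of thumb above can be sketched in a few lines. This is only an illustration of the formula as posted: the function names are made up, and it simply brackets the result between the two nearest powers of two.

```python
def suggested_hash_mb(knps: float, seconds_per_move: float) -> float:
    """Rule of thumb from the post: (kN/s x seconds per move) / 100, in MB."""
    return knps * seconds_per_move / 100.0


def neighbouring_powers_of_two(value: float) -> tuple:
    """Return the powers of two just below and just above the value.

    If the value is itself a power of two, both entries are that value.
    """
    if value < 1:
        return (1, 1)
    lower = 1 << (int(value).bit_length() - 1)
    upper = lower if lower == value else lower * 2
    return (lower, upper)


raw = suggested_hash_mb(knps=5000, seconds_per_move=10)
print(raw, neighbouring_powers_of_two(raw))
```

For the example numbers this gives 500, bracketed by 256 and 512, which are the two candidates to test against each other.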
|@29757 Don't say that so absolutely. You never know his hardware. 😄|
|It is invincible.|
One would think that development has come a long way by now, but apparently not. The more recent one seems to play well, but at times it forces its way all the way down to a 5-piece endgame.
I don’t have much time to waste on an engine version. One loss or draw against an app running on an iPhone is all it takes for me to lose interest in a version of the engine.
Ponder is on for both sides.
It is not as invincible as I thought. The newer neural network engines might catch up soon, because even a lowly program on an iPhone can match its performance on a PC.
What size should I use then?
The website that I used was this one because my RAM is 8 GB
|This fish is cute.|