Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Sun Dec 23, 2012 9:00 pm
by mwyoung
CCRL 40/4 ratings for Houdini 3 and Houdini 3 Tactical. At this fast time control Houdini 3 is clearly better than Houdini 3 Tactical.

40/4 Rating
Houdini 3 64-bit 3244
Houdini 3 Tactical 64-bit 3210

But in the 40/40 list, Houdini 3 Tactical goes from 34 points below Houdini 3 (at 40/4) to a 31-point advantage over Houdini 3.

40/40 Rating
Houdini 3 Tactical 64-bit 3252
Houdini 3 64-bit 3221

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 5:03 am
by lucasart
mwyoung wrote: CCRL 40/4 ratings for Houdini 3 and Houdini 3 Tactical. At this fast time control Houdini 3 is clearly better than Houdini 3 Tactical.

40/4 Rating
Houdini 3 64-bit 3244
Houdini 3 Tactical 64-bit 3210

But in the 40/40 list, Houdini 3 Tactical goes from 34 points below Houdini 3 (at 40/4) to a 31-point advantage over Houdini 3.

40/40 Rating
Houdini 3 Tactical 64-bit 3252
Houdini 3 64-bit 3221
In general this could make sense. Drastic pruning and reductions in the search often scale badly: they help at fast time controls but help little (or even hurt) at long time controls. I noticed this in DiscoCheck when I experimented with futility pruning and move count pruning, for example.
Perhaps the benefit of extra depth shrinks in long searches, so the reward gets smaller and smaller while the penalty stays roughly constant. That is my educated guess, but it remains unproven speculation.

But, I think you are jumping to conclusions too early:
- there is absolutely no reason why the CCRL 40/40 and CCRL 40/4 Elo scales should be comparable, especially "near the tails"
- error bars... error bars... and don't forget to combine them in Pythagorean fashion (square root of the sum of squares)
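
As a minimal sketch of the second point, the error bar on a rating difference can be combined in quadrature. The function name `combined_error` is hypothetical, the figures are the CCRL 40/40 ratings and upper error bars from the table quoted in this thread, and the calculation assumes the two errors are independent, which may not hold for ratings taken from one shared pool.

```python
import math

def combined_error(err_a: float, err_b: float) -> float:
    """Error bar on a rating difference, assuming independent errors."""
    return math.sqrt(err_a ** 2 + err_b ** 2)

# CCRL 40/40 figures quoted in this thread (rating, upper error bar).
tactical_rating, tactical_err = 3252, 70
standard_rating, standard_err = 3221, 20

diff = tactical_rating - standard_rating
diff_err = combined_error(tactical_err, standard_err)

print(f"difference: {diff:+d} +/- {diff_err:.1f} Elo")
```

With these inputs the combined error bar is more than twice the 31-point difference.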

The person best placed to know the relative strength of H3 vs H3 Tactical is Robert Houdart. He's a serious engineer, and he surely did his homework and performed extensive testing before releasing H3.

Robert, do you have your test results handy?

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 10:11 am
by mwyoung
What conclusions have I jumped to? I only pointed out the data... The only ones saying Houdini Tactical is the strongest engine right now are the boys at CCRL. I have not concluded anything yet.

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 10:47 am
by lucasart
mwyoung wrote: What conclusions have I jumped to? I only pointed out the data... The only ones saying Houdini Tactical is the strongest engine right now are the boys at CCRL. I have not concluded anything yet.
The "CCRL boys" have never drawn such a conclusion. They have never said that CCRL 40/40 and CCRL 40/4 elo scales are comparable. As a matter of fact they are not.

I'm probably wasting my time answering you... Anyway, this is my last post on the subject. Feel free to spam and troll this thread if it makes you happy.

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 11:04 am
by mwyoung
lucasart wrote:
mwyoung wrote: What conclusions have I jumped to? I only pointed out the data... The only ones saying Houdini Tactical is the strongest engine right now are the boys at CCRL. I have not concluded anything yet.
The "CCRL boys" have never drawn such a conclusion. They have never said that CCRL 40/40 and CCRL 40/4 elo scales are comparable. As a matter of fact they are not.

I'm probably wasting my time answering you... Anyway, this is my last post on the subject. Feel free to spam and troll this thread if it makes you happy.
Whether the numbers in the two rating pools are comparable is not relevant; no one is comparing absolute values between the two pools. Do you know anything about Elo ratings?

What you can do is compare the relative standing of the two programs within each of the two rating pools, which use different time controls.

So yes, the CCRL data shows that at the longer time control of 40/40 Houdini 3 Tactical is stronger than Houdini 3, but not at the shorter time control of 40/4. Again, the absolute rating numbers in the two pools are not relevant. Both Houdini 3 and Houdini 3 Tactical are rated in both pools, 40/4 and 40/40. We are comparing apples to apples.

The only one whose time is being wasted is me, by having to answer your wrong and misrepresentative statements... so feel free to not respond.

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 11:32 am
by Odeus37
The CCRL 40/40 rating you are referring to is irrelevant: only 53 games so far for Houdini 3 Tactical...

http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Name                         Rating   Elo+    Elo-   Score   Average Opponent   Draws   Games	
Houdini 3 Tactical 64-bit    3252     +70     −68    65.1%   −98.0              50.9%   53
Houdini 3 64-bit             3221     +20     −19    64.4%   −106.7             41.3%   750

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 12:08 pm
by mwyoung
Odeus37 wrote: The CCRL 40/40 rating you are referring to is irrelevant: only 53 games so far for Houdini 3 Tactical...

http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Name                         Rating   Elo+    Elo-   Score   Average Opponent   Draws   Games	
Houdini 3 Tactical 64-bit    3252     +70     −68    65.1%   −98.0              50.9%   53
Houdini 3 64-bit             3221     +20     −19    64.4%   −106.7             41.3%   750

It is not irrelevant to CCRL; they publish the data and the rating.

Nor are the results irrelevant to the question of the thread: Is Houdini 3 Tactical the strongest chess engine? You do not need hundreds of games to answer this question. You only need hundreds or thousands of games if the Elo ratings are very close, or if you are trying to get very exact Elo ratings.

What we are asking is whether Houdini 3 Tactical is the strongest chess engine. If we don't care about the exact margin, the data is very relevant.

Even with only 50 games, since the Elo difference is not very small, you can say that it is much more likely that Houdini 3 Tactical is stronger at 40/40 than Houdini 3. We just can't have high confidence in the exact value after 50 games.

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 7:39 pm
by ernest
mwyoung wrote:Even with only 50 games
Go back to school...

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Mon Dec 24, 2012 10:37 pm
by mwyoung
CCRL 40/40 Match Results.

Houdini 3 Tactical 64-bit (3252 +70 −68)

Opponent                 Elo    Elo+   Elo-   Diff     Result                Score   Points
Houdini 3 64-bit         3221   +20    −19    (−31)    5.5 − 5.5 (+1−1=9)    50.0%   5.5 / 11
Komodo 5 64-bit          3157   +11    −11    (−95)    8.5 − 1.5 (+7−0=3)    85.0%   8.5 / 10
Critter 1.6a 64-bit      3139   +21    −21    (−113)   7.5 − 3.5 (+5−1=5)    68.2%   7.5 / 11
Rybka 4.1 64-bit         3132   +12    −12    (−120)   7.5 − 3.5 (+6−2=3)    68.2%   7.5 / 11
Stockfish 2.2.2 64-bit   3118   +19    −18    (−134)   5.5 − 4.5 (+2−1=7)    55.0%   5.5 / 10
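
As a cross-check, the per-opponent results above can be totalled and turned into a rough performance estimate. This is only a sketch: `elo_diff` is a hypothetical helper that applies the plain logistic Elo formula to the overall score, which is not necessarily how CCRL computes its published ratings, so the two need not agree exactly.

```python
import math

# (opponent, points scored by Houdini 3 Tactical, games played),
# copied from the match results above.
results = [
    ("Houdini 3 64-bit", 5.5, 11),
    ("Komodo 5 64-bit", 8.5, 10),
    ("Critter 1.6a 64-bit", 7.5, 11),
    ("Rybka 4.1 64-bit", 7.5, 11),
    ("Stockfish 2.2.2 64-bit", 5.5, 10),
]

points = sum(p for _, p, _ in results)
games = sum(g for _, _, g in results)
score = points / games

def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    return 400 * math.log10(score / (1 - score))

print(f"{points} / {games} = {score:.1%}")
print(f"implied edge over the average opponent: {elo_diff(score):+.0f} Elo")
```

The totals reproduce the 53 games and 65.1% score shown in the CCRL rating table; the implied edge is a crude figure because it pools opponents of different strength.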

Re: Is Houdini 3 Tactical the Strongest Chess Engine?

Posted: Tue Dec 25, 2012 3:11 am
by Adam Hair
mwyoung wrote:
Odeus37 wrote: The CCRL 40/40 rating you are referring to is irrelevant: only 53 games so far for Houdini 3 Tactical...

http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Name                         Rating   Elo+    Elo-   Score   Average Opponent   Draws   Games	
Houdini 3 Tactical 64-bit    3252     +70     −68    65.1%   −98.0              50.9%   53
Houdini 3 64-bit             3221     +20     −19    64.4%   −106.7             41.3%   750

It is not irrelevant to CCRL; they publish the data and the rating.

Nor are the results irrelevant to the question of the thread: Is Houdini 3 Tactical the strongest chess engine? You do not need hundreds of games to answer this question. You only need hundreds or thousands of games if the Elo ratings are very close, or if you are trying to get very exact Elo ratings.

What we are asking is whether Houdini 3 Tactical is the strongest chess engine. If we don't care about the exact margin, the data is very relevant.

Even with only 50 games, since the Elo difference is not very small, you can say that it is much more likely that Houdini 3 Tactical is stronger at 40/40 than Houdini 3. We just can't have high confidence in the exact value after 50 games.
The last sentence of the quote is the key. We cannot have high confidence in the exact value after 53 games. There is a 95% chance that, if the test were repeated, the measured Elo of Houdini 3 Tactical 64-bit would fall between 3184 and 3322 (unless there is a more appropriate Bayesian interpretation of these error bars). The 95% interval for Houdini 3 64-bit (3202 to 3241) is contained completely inside that interval. It is very hard to say which version is stronger at this point. Furthermore, the games come from different contributors, which makes the real error larger than the reported error for several reasons.

To be honest, though I have not done the math to determine the necessary value, 31 Elo is not a big enough difference after 53 games to say with high confidence that Houdini 3 Tactical is stronger.
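
A rough version of that math can be sketched under strong assumptions: the published error bars are treated as 95% intervals of independent, normally distributed estimates, and the asymmetric bars are averaged. None of this is confirmed by CCRL, and the names below (`interval`, `Z95`, `p_tactical_stronger`) are made up for the illustration.

```python
import math

def interval(rating: int, plus: int, minus: int) -> tuple[int, int]:
    """Return the (low, high) range implied by asymmetric error bars."""
    return rating - minus, rating + plus

tactical = interval(3252, 70, 68)   # Houdini 3 Tactical 64-bit, CCRL 40/40
standard = interval(3221, 20, 19)   # Houdini 3 64-bit, CCRL 40/40

# Is the standard version's interval entirely inside the Tactical one?
contained = tactical[0] <= standard[0] and standard[1] <= tactical[1]

# Assumption: each error bar is about 1.96 standard deviations wide.
Z95 = 1.96
sigma_tactical = (70 + 68) / 2 / Z95
sigma_standard = (20 + 19) / 2 / Z95
sigma_diff = math.hypot(sigma_tactical, sigma_standard)

# Normal approximation to the chance that the true difference is positive.
z = (3252 - 3221) / sigma_diff
p_tactical_stronger = 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(tactical, standard, contained)
print(f"z = {z:.2f}, P(Tactical stronger) ~ {p_tactical_stronger:.2f}")
```

Under these assumptions the probability comes out at roughly 0.8: suggestive, but well short of the usual 95% threshold, and the mixed-contributor effect mentioned above would only widen the uncertainty.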