Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sat Jun 08, 2013 5:13 pm
by mwyoung
I have been testing a development version of Stockfish for over a week now at slower time controls. It is becoming clear that Stockfish is making rapid progress against Houdini 3.

In a 12-game match at 40 moves in 2 hours, Stockfish drew Houdini 3, 6-6.

While away on a business trip for a week, I set up a 100-game match at 40 moves in 20 minutes. That match is still ongoing, but the results after 95 games are:

Stockfish 270513 +19 =56 -20 TP = -4 elo

Houdini 3 Pro 64 +20 =56 -19 TP = +4 elo
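
For anyone who wants to check the -4/+4 figures, here is a minimal sketch of the usual logistic Elo conversion from a win/draw/loss record. It assumes the "TP" numbers above are simply the Elo difference implied by the match score; the function name `elo_diff` is made up for this illustration and is not taken from any testing tool.

```python
import math

def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Results after 95 games, as posted above.
stockfish = elo_diff(19, 56, 20)
houdini = elo_diff(20, 56, 19)
print(f"Stockfish: {stockfish:+.1f} Elo, Houdini: {houdini:+.1f} Elo")
```

Under that assumption the two values are mirror images of each other and round to -4 and +4.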

There are later development versions of Stockfish, as new ones are made available almost every day, it seems. So there could already be a stronger version of Stockfish, but testing takes time. What is becoming clear, Stockfish 270513 is on par with Houdini 3 in head to head play.

I don't know yet how Stockfish 270513 plays against other engines. I will answer that question when I have a Stockfish version that shows itself to be stronger than Houdini 3 in match play, as this testing takes a lot of time and new versions show up very quickly.

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sat Jun 08, 2013 11:05 pm
by zullil
mwyoung wrote: What is becoming clear, Stockfish 270513 is on par with Houdini 3 in head to head play.
Seems like you are drawing a big conclusion from a small sample of games. But I'm not a statistician. I am a big fan of Stockfish, so it would be nice if your conclusion were correct.

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sat Jun 08, 2013 11:48 pm
by mwyoung
zullil wrote:
mwyoung wrote: What is becoming clear, Stockfish 270513 is on par with Houdini 3 in head to head play.
Seems like you are drawing a big conclusion from a small sample of games. But I'm not a statistician. I am a big fan of Stockfish, so it would be nice if your conclusion were correct.
You can draw your own conclusions; you see the same data I am seeing. The conclusion I have come to is that Stockfish is on the rise, and it is becoming clearer as I play more games, now well over 100, that Stockfish is on *par* with Houdini in *head to head matches at longer time controls*. I will let the results speak for themselves. There is a chance that Stockfish was lucky, and there is a chance that Houdini was lucky, in the opening positions they were given. But you are not going to get 1000 games at long time controls; that takes too much time for a development version of Stockfish. You will have to be satisfied with a lower confidence level. When an official new version of Stockfish is released, you will have many more testers and games to draw conclusions from, and a higher confidence level. But if you are a tester, or know Houdini very well, you know that it does not happen that another program stays with Houdini 3 after 20 games, let alone 100. Houdini 3 is that far above the rest of the computer chess field.

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sun Jun 09, 2013 12:51 am
by mwyoung
zullil wrote:
mwyoung wrote: What is becoming clear, Stockfish 270513 is on par with Houdini 3 in head to head play.
Seems like you are drawing a big conclusion from a small sample of games. But I'm not a statistician. I am a big fan of Stockfish, so it would be nice if your conclusion were correct.
I ran the match through the Bayesian Elo rating calculator so you can see the error bars for the match.

After 106 games Stockfish has an error bar of +/- 25 Elo points, with an Elo rating equal to Houdini 3.

Seems highly likely that stockfish 270513 is on par with Houdini 3.

Hope this helps.

Rank Name                        Elo    +    - games score oppo. draws
   1 Houdini 3 Pro x64             0   25   25   106   50%     0   63%
   2 Stockfish 270513 64 SSE4.2    0   25   25   106   50%     0   63%
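
On the sample-size question, here is a rough sketch of how wide the uncertainty on the Elo difference is for a match of this length. It uses a plain normal approximation on the per-game score, applied to the 95-game record posted earlier. This is not the method the Bayesian Elo calculator uses, so the numbers are not directly comparable with the +/- 25 in the table; the function names are hypothetical.

```python
import math

def score_to_elo(score: float) -> float:
    """Logistic Elo difference for a score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Approximate interval (low, estimate, high) for the Elo difference.

    Normal approximation on the mean game score; z=1.96 gives roughly 95%.
    Assumes the interval bounds on the score stay strictly inside (0, 1).
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0 points) around the mean.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)
    low, high = score - z * stderr, score + z * stderr
    return score_to_elo(low), score_to_elo(score), score_to_elo(high)

# Stockfish's record after 95 games, as posted earlier in the thread.
low, mid, high = elo_interval(19, 56, 20)
print(f"Elo difference: {mid:+.0f} (approx. 95% interval {low:+.0f} to {high:+.0f})")
```

With this approximation the interval on the difference spans roughly 40 to 50 Elo on either side of zero, so the data are consistent with the engines being equal but do not rule out a moderate gap in either direction.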

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sun Jun 09, 2013 2:22 pm
by zullil
mwyoung wrote:
I ran the match through the Bayesian Elo rating calculator so you can see the error bars for the match.

After 106 games Stockfish has an error bar of +/- 25 Elo points, with an Elo rating equal to Houdini 3.

Seems highly likely that stockfish 270513 is on par with Houdini 3.

Hope this helps.

Rank Name                        Elo    +    - games score oppo. draws
   1 Houdini 3 Pro x64             0   25   25   106   50%     0   63%
   2 Stockfish 270513 64 SSE4.2    0   25   25   106   50%     0   63%
Thanks. Very interesting result. From where are you obtaining the Stockfish builds? Are these being built directly from the source code available at https://github.com/mcostalba/Stockfish ? Who is compiling the source code?

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sun Jun 09, 2013 3:05 pm
by mwyoung
zullil wrote:
Thanks. Very interesting result. From where are you obtaining the Stockfish builds? Are these being built directly from the source code available at https://github.com/mcostalba/Stockfish ? Who is compiling the source code?
You can find all the info and the Stockfish builds here. Enjoy!

http://abrok.eu/stockfish/

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sun Jun 09, 2013 6:13 pm
by zullil
Don't use Windows or Linux, so I need to compile my own for Mac OS X. Easy enough to do.

Any idea who is in charge of the abrok.eu site? I'm curious what optimization is being done for these builds.

Re: Stockfish on the Rise, Houdini SUPREMACY threatened

Posted: Sun Jun 09, 2013 6:32 pm
by Jeremy Bernstein
zullil wrote:Don't use Windows or Linux, so I need to compile my own for Mac OS X. Easy enough to do.

Any idea who is in charge of the abrok.eu site? I'm curious what optimization is being done for these builds.
They are just doing a normal "profile-build" with mingw.

jb