Match request (2x self-play)

Posted: Fri Apr 29, 2011 9:11 am
by BB+
I can do this myself, but figured I'd ask if anyone has done it already.

I want to measure some Elo differences between an engine and itself when 2x the amount of time is given. About the only condition is that the opening book used should not be too drawish. At this point, I really don't care whether Rybka, Houdini, IvanHoe, Stockfish, ..., is used. I would prefer to have numbers for (say) 4m+2s versus 8m+4s, preferably on both single and multiple cpus (I'm more interested in the latter -- in particular, how much does a 6cpu engine gain from doubling the time at blitz time controls?). Does anyone have info or a desire to carry this out? I expect that about 1000 games should suffice for the level of precision I want. At 20min/game, this would take about 2 weeks.

Re: Match request (2x self-play)

Posted: Fri Apr 29, 2011 7:16 pm
by ernest
BB+ wrote: I want to measure some Elo differences between an engine and itself when 2x the amount of time is given.
I did that some time ago, with Rybka 3 (also did R3_2cpu vs R3_1cpu and R3_64bit vs R3_32bit)
For Rybka3 2xtime, I used 4'+2" vs 2'+1", no ponder, 2cpu (Core2 Duo @3GHz), 64bit, book_5moves(M. Scheidl)
400 games
result was +162 -25 =213, 268.5 - 131.5, that is 67.1% (+126 Elo)
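
For anyone wanting to redo the arithmetic, here is a minimal Python sketch that turns a +W -L =D result into a score, an Elo difference, and a rough 95% interval. It assumes the standard logistic Elo formula and a normal approximation for the error; the function names are made up for illustration. With the numbers above the logistic formula gives roughly +124 rather than the +126 quoted, presumably because a different conversion table was used, and the interval comes out at a little over 20 Elo either side.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_summary(wins, losses, draws, z=1.96):
    """Return (score, elo, elo_low, elo_high) for a W/L/D result.

    z=1.96 gives an approximate 95% interval (normal approximation,
    an assumption of this sketch, not something stated in the thread).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game is worth 1, 0.5 or 0 points.
    variance = (wins * 1.0 + draws * 0.25) / n - score ** 2
    stderr = math.sqrt(variance / n)
    low, high = score - z * stderr, score + z * stderr
    return score, elo_from_score(score), elo_from_score(low), elo_from_score(high)

# The Rybka 3 2x-time match reported above: +162 -25 =213
score, elo, elo_low, elo_high = match_summary(162, 25, 213)
print(f"score {score:.4f}, Elo {elo:+.0f} (approx. 95%: {elo_low:+.0f} to {elo_high:+.0f})")
```

The high draw rate helps here: draws lower the per-game variance, so the interval is tighter than a win/loss-only match of the same length would give.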

Re: Match request (2x self-play)

Posted: Sat Apr 30, 2011 11:05 am
by BB+
ernest wrote: For Rybka3 2xtime, I used 4'+2" vs 2'+1", no ponder, 2cpu (Core2 Duo @3GHz), 64bit, book_5moves(M. Scheidl)
400 games result was +162 -25 =213, 268.5 - 131.5, that is 67.1% (+126 Elo)
Thanks, I had expected about 100 Elo, so this is in the right ballpark. I don't know where the lesser 50 or 70 Elo numbers really come from. Maybe it's a self-play phenomenon, or longer time controls are needed, or maybe the fact that R3 has built-in contempt plays a role. My (current) interest in this comes from the data that Rybka X on 40 cores (with contempt) was beating Rybka 4 on a quad-core by ~220 Elo (at 1m+1s with the DF11 book). There was an 8.7x hardware edge (the quad was 3.5GHz, the 40 cores averaged 3.04GHz), and with a guess of 35% efficiency for the additional parallelisation [is this too high? -- maybe I rely on internal Rybka info/hype too much, or perhaps alternatively the 1m+1s time control has non-negligible overhead for the 40-core machine], this is already something like 200 Elo. With error margins and contempt to boot, it is hard to be sure of any conclusion.
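
The hardware estimate above can be sketched in a few lines of Python. The 8.7x edge follows directly from the core counts and clock speeds given. How the 35% efficiency is applied is not spelled out in the post, so the formula below (efficiency applied only to the hardware beyond the baseline) is one possible reading and purely an assumption, as are the function names. Elo per doubling is left as a parameter, defaulting to the ~100 expected above.

```python
import math

def hardware_edge(cores_a, ghz_a, cores_b, ghz_b):
    """Raw ratio of (cores x clock) between two machines."""
    return (cores_a * ghz_a) / (cores_b * ghz_b)

def estimated_elo_gain(edge, efficiency=0.35, elo_per_doubling=100.0):
    """Rough Elo gain from a raw hardware edge.

    ASSUMPTION: only the hardware beyond the baseline is discounted by
    `efficiency`, i.e. effective speedup = 1 + efficiency * (edge - 1).
    Each doubling of effective speed is taken to be worth `elo_per_doubling`.
    """
    effective_speedup = 1.0 + efficiency * (edge - 1.0)
    return elo_per_doubling * math.log2(effective_speedup)

# 40 cores averaging 3.04GHz versus a 3.5GHz quad-core
edge = hardware_edge(40, 3.04, 4, 3.5)
gain = estimated_elo_gain(edge)
print(f"hardware edge {edge:.1f}x, estimated gain about {gain:.0f} Elo")
```

Under these assumptions the estimate comes out a little under 200 Elo. It moves a fair amount with the efficiency guess and with the Elo-per-doubling figure (50 to 70 would roughly halve it), so it should be treated as a ballpark only.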

Re: Match request (2x self-play)

Posted: Sat Apr 30, 2011 6:14 pm
by ernest
BB+ wrote: I don't know where the lesser 50 or 70 Elo numbers really come from. Maybe it's a self-play phenomenon
Indeed, the 50 to 70 Elo figures for doubling speed have been verified in tournaments/gauntlets against different opponents
(also see Bob Hyatt in http://www.talkchess.com/forum/viewtopi ... 776#404776)

But it's also "well known" that self-play exaggerates the level differences.

Re: Match request (2x self-play)

Posted: Sat Apr 30, 2011 6:36 pm
by ernest
Added note:
my Rybka3_2cpu vs Rybka3_1cpu (400 games) ended up 60.75% (+78 Elo)
my Rybka3_64bit vs Rybka3_32bit (400 games) ended up 62.88% (+94 Elo)