More on similarity testing

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: More on similarity testing

Post by BB+ » Thu Dec 30, 2010 3:44 am

The tester seems to very clearly identify strong correlations between the playing styles of programs and it does this better than I had hoped.
I quite agree. I discussed this with Larry (in PMs at the Rybka forum) back when you were first tossing this idea around. I had thought I had a few ideas for how to tweak the search, but the robustness that comes from the eval stays. Actually, now that I think of it, the later IvanHoes have some sort of "randomiser", which merely seems to perturb the eval by some amount (I'd have to check the details). Maybe I can test eval versus perturbed-eval to see how much noise one needs to create to get an effect. I also think that taking (at least the open-source) engines and cross-comparing the correlations from evaluate() with those from go movetime 100 is a useful experiment.
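To make the eval versus perturbed-eval idea concrete, here is a minimal sketch of the kind of experiment I have in mind. Everything in it is an assumption for illustration: the "positions" are synthetic lists of per-move scores rather than real engine evals, the noise is uniform in centipawns, and the names `match_rate` and `synthetic_positions` are hypothetical. It is not how the IvanHoe randomiser works.

```python
import random

def best_move(evals, noise, rng):
    """Index of the top-scoring move after adding uniform noise (centipawns) to each score."""
    return max(range(len(evals)), key=lambda i: evals[i] + rng.uniform(-noise, noise))

def match_rate(positions, noise, seed=0):
    """Fraction of positions where the perturbed eval picks the same move as the clean eval."""
    rng = random.Random(seed)
    same = 0
    for evals in positions:
        clean = max(range(len(evals)), key=lambda i: evals[i])
        if best_move(evals, noise, rng) == clean:
            same += 1
    return same / len(positions)

def synthetic_positions(n, moves=30, spread=50.0, seed=1):
    """Hypothetical stand-in for real data: n positions, each with `moves` Gaussian scores."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, spread) for _ in range(moves)] for _ in range(n)]

positions = synthetic_positions(2000)
for noise in (0, 5, 20, 80):
    # Print the noise amplitude and the percentage of unchanged move choices.
    print(noise, round(100 * match_rate(positions, noise), 2))
```

With real engines one would replace `synthetic_positions` with scores taken from the engine itself; the point of the sketch is only that the match rate as a function of noise amplitude is the quantity to plot.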

One thing I like about fixed depth is that there's no dispute about what the "default" level of matching is (at least w/o SMP). I'm not sure this outweighs any negatives. Given that the time allotted appears to be a secondary factor, I would opt for whichever is easier. One issue with using "stop" (which does improve on "go movetime", I agree) is how the OS does time slicing with a "waiting" process (I think the slices are typically 1/100 of a second in Linux). As noted in the Stockfish discussion, you can still hit a "polling" discretisation behaviour when I/O is only checked every 30K nodes and the whole search takes maybe 5 times that many nodes. If nothing else, as with any experiment, there needs to be some quality control.
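A rough sketch of the polling effect, under stated assumptions: the engine only notices "stop" at the next multiple of its polling interval, and the helper names `nodes_at_stop` and `distinct_stop_points` are made up for illustration. The 30K interval and the roughly 150K-node search are the figures mentioned above; the jitter value is arbitrary.

```python
import math

def nodes_at_stop(requested_nodes, poll_interval=30_000):
    """Node count at which 'stop' is actually noticed if I/O is polled every poll_interval nodes."""
    return math.ceil(requested_nodes / poll_interval) * poll_interval

def distinct_stop_points(target_nodes, jitter_nodes, poll_interval=30_000, step=1_000):
    """Distinct actual stopping points when the 'stop' arrival jitters around target_nodes."""
    lo = max(1, target_nodes - jitter_nodes)
    hi = target_nodes + jitter_nodes
    return sorted({nodes_at_stop(n, poll_interval) for n in range(lo, hi + 1, step)})

# A search of about 5 polling intervals with +/-10% jitter in when 'stop' arrives.
print(distinct_stop_points(150_000, 15_000))
```

So the search can only end at a handful of node counts, and a small timing wobble either changes nothing or jumps the search by a full polling interval (20% of the total here). The same arithmetic applies to the scheduler: a 1/100 second slice is 10% of a 100 ms search.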

One question I have about all of this: can this detect specific overlap in evaluation features, or is it more about evaluation numerology?

Sentinel
Posts: 122
Joined: Thu Jun 10, 2010 12:49 am
Real Name: Milos Stanisavljevic


Post by Sentinel » Thu Dec 30, 2010 3:49 am

BB+ wrote: One question I have about all of this: can this detect specific overlap in evaluation features, or is it more about evaluation numerology?
As I said in my previous post, it catches only material + PST.
Try the test with Ippo using only lazy eval and you'll see.

kingliveson
Posts: 1388
Joined: Thu Jun 10, 2010 1:22 am
Real Name: Franklin Titus
Location: 28°32'1"N 81°22'33"W


Post by kingliveson » Thu Dec 30, 2010 6:50 am

BB+ wrote: Actually, now that I think of it, the later IvanHoes have some sort of "randomiser", which merely seems to perturb the eval by some amount (I'd have to check the details). Maybe I can test eval versus perturbed-eval to see how much noise one needs to create to get an effect.
I have a little data on that. IvanHoe 0A.0C.1A (from the beta 999949j source), posted on the engine's sub-forum, actually uses the randomizer combined with slightly tweaked piece weights. This does make it play slightly differently, but the difference is not significant as far as similarity of playing style is concerned:
X:\chess\similar>similar -r 19
------ IvanHoe 0A.0C.1A x64 (time: 100 ms) ------
 74.30  IvanhoeB49jAx64p (time: 100 ms)
 73.95  IvanHoe 9.49b x64 (time: 100 ms)
 73.55  RobboLito 0.09 x64 (time: 100 ms)
 73.50  FireBird 1.01 x64 (time: 100 ms)
 72.70  IvanHoe 9.70b x64 (time: 100 ms)
 72.15  Houdini 1.01 x64 4_CPU (time: 100 ms)
 67.35  Houdini 1.5 x64 (time: 100 ms)
 66.25  Rybka 3  (time: 100 ms)
X:\chess\similar>similar -r 12
------ IvanHoe 9.49b x64 (time: 100 ms) ------
 74.80  IvanhoeB49jAx64p (time: 100 ms)
 74.45  FireBird 1.01 x64 (time: 100 ms)
 74.30  IvanHoe 9.70b x64 (time: 100 ms)
 73.95  IvanHoe 0A.0C.1A x64 (time: 100 ms)
 73.70  RobboLito 0.09 x64 (time: 100 ms)
 73.05  Houdini 1.01 x64 4_CPU (time: 100 ms)
 68.95  Houdini 1.5 x64 (time: 100 ms)
 67.25  Rybka 3  (time: 100 ms)
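For anyone who wants to reproduce this kind of listing from their own data, here is a minimal sketch. It assumes the score is simply the percentage of test positions on which two engines pick the same move, which may not be exactly what the tool does internally. The function names `similarity` and `report`, the engine labels, and the move lists are all hypothetical.

```python
def similarity(moves_a, moves_b):
    """Percentage of positions on which two engines chose the same move."""
    if not moves_a or len(moves_a) != len(moves_b):
        raise ValueError("move lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * matches / len(moves_a)

def report(reference, engines):
    """Rank every other engine by similarity to `reference`, highest first."""
    rows = [(similarity(engines[reference], moves), name)
            for name, moves in engines.items() if name != reference]
    return sorted(rows, reverse=True)

# Hypothetical toy data: one chosen move per test position for each engine.
engines = {
    "engine_a": ["e2e4", "d2d4", "g1f3", "c2c4"],
    "engine_b": ["e2e4", "d2d4", "g1f3", "b1c3"],
    "engine_c": ["e2e4", "c2c4", "b1c3", "f2f4"],
}

print("------ engine_a ------")
for score, name in report("engine_a", engines):
    print(f"{score:6.2f}  {name}")
```

With a real position set, each list would hold the move an engine returned after the fixed time (100 ms in the runs above).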

Hood
Posts: 200
Joined: Thu Jun 10, 2010 2:36 pm
Real Name: Krzych C.


Post by Hood » Sat Jan 01, 2011 2:31 pm

Hi,

How would you answer the following question: what if programs with different evals and different searches choose the same move?

That is possible: because their searches differ, they are evaluating different future positions.
