Contempt

lucasart · Post by **lucasart** » Tue May 14, 2013 7:30 am

I recently implemented a contempt feature in DiscoCheck, as follows: the value of a draw by chess rules is -Contempt for the engine (root color) and +Contempt for the opponent.

I suppose this is the most basic and standard approach to contempt. The good thing about it is that it doesn't require any modification in the eval that would make it asymmetric (symmetry is important for me as I use the post null move trick to skip lots of eval calls).
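
The rule above can be sketched in a few lines. This is only an illustration in Python, not DiscoCheck's actual code (which is not shown in this thread); the names `draw_score`, `WHITE`, `BLACK` and `CONTEMPT` are hypothetical, and scores are assumed to follow the usual negamax convention (always from the side to move's point of view).

```python
WHITE, BLACK = 0, 1
CONTEMPT = 25  # centipawns, the value tested in this post

def draw_score(side_to_move: int, root_color: int, contempt: int = CONTEMPT) -> int:
    """Value of a draw by rule, seen from the side to move (negamax convention).

    The engine (root color) scores a draw as -contempt, the opponent as
    +contempt, so the two views are always exact negations of each other.
    """
    return -contempt if side_to_move == root_color else contempt
```

Because the two return values are negations of each other, the search stays consistent without touching the evaluation function itself.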

I experimented with Contempt = 25 cp, and got the following results (no early stopping, tc=10+0.1, hash=16, opening played symmetrically and sequentially from same EPD file, so all is perfectly fair and equal)

=> self-play: equal

ResultSet-EloRating>ratings
Rank Name          Elo    +    - games score oppo. draws 
   1 Contempt=0      1    4    3 12000   50%    -1   46% 
   2 contempt=25    -1    3    4 12000   50%     1   46% 

             Co co
Contempt=0      72
contempt=25  27

(BayesELO ratings and LOS matrix)

The difference is well within the error bar, as shown by the LOS matrix.

The draw rate is significantly lower than without contempt: self-play usually gives me a draw rate of around 55% in these conditions. So it behaves as expected, and stops the engines from accepting 3-repetition draws in +/- equal positions (the same goes for 50-move, insufficient material and stalemate, but 3-repetition is by far the biggest plague).

=> foreign opponents: equal

Each version (Contempt=0 and Contempt=25) plays the same gauntlet in the exact same conditions (5000 games vs. Fruit and 5000 vs. Gaviota)

Rank Name             Elo    +    - games score oppo. draws 
   1 Gaviota v0.86     22    6    5 10000   53%     0   22% 
   2 Contempt=25        1    6    6 10000   50%     0   25% 
   3 Contempt=0        -1    6    5 10000   50%     0   28% 
   4 Fruit 05/11/03   -22    5    5 10000   47%     0   30% 

                  Ga  Co  Co  Fr
Gaviota v0.86         99  99 100
Contempt=25        0      70  99
Contempt=0         0  29      99
Fruit 05/11/03     0   0   0

The difference is well within the error bar, as shown by the LOS matrix.
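
As a rough cross-check of the table, the score percentages can be converted to Elo differences with the plain logistic Elo formula. This is only an approximation: BayesELO fits its own model (with a draw parameter and a prior), so its numbers will not match exactly. The function name `elo_from_score` is made up for this sketch.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Gaviota scored 53% and Fruit 47% in the gauntlet above.
gaviota_elo = elo_from_score(0.53)
fruit_elo = elo_from_score(0.47)
```

A 53% score comes out at roughly +21 Elo and 47% at roughly -21, in line with the +22 and -22 that BayesELO reports, while a 50% score maps to exactly 0.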

Draw rate is also decreased against foreign opponents (starting from a much lower figure of 28%, while self-play w/o contempt was around 55%).

So I will leave the default value at 25 cp, as it makes the playing style more entertaining. I much prefer it when the engines fight to the death, rather than agree on a 3-move repetition a few moves out of the opening in a +/- equal position.

It's interesting to note that there is no ELO cost (nor gain obviously). I don't have the time and CPU resources to experiment with higher values, but it's possible that this value can be increased even more without hurting ELO.

Another thing we would expect is that Contempt=25 performs better, relative to Contempt=0, against Fruit (the weaker opponent) and worse against Gaviota (the stronger opponent). This is neither confirmed nor refuted by the empirical evidence, perhaps because the ELO difference between these engines is too small to measure it reliably. Anyway, if I isolate the 2 gauntlets, here are the BayesELO rating tables:

Rank Name             Elo    +    - games score oppo. draws 
   1 Gaviota v0.86     22    6    7  5000   53%    -1   23% 
   2 Contempt=0        -1    5    5 10000   50%     0   28% 
   3 Fruit 05/11/03   -21    6    6  5000   47%    -1   32% 

Rank Name             Elo    +    - games score oppo. draws 
   1 Gaviota v0.86     21    6    7  5000   53%     1   21% 
   2 Contempt=25        1    5    5 10000   50%     0   25% 
   3 Fruit 05/11/03   -22    6    6  5000   47%     1   29%

hyatt · Post by **hyatt** » Tue May 14, 2013 7:08 pm

From a LOT of testing, here is what you can expect:

As you increase the contempt, the draws will go down and the wins/losses will go up. And yes, you will lose a few that you should have drawn, and you will win a few that you might have drawn. The question is, which side gets the balance.

A "static contempt" is a HORRIBLE idea. Set it to zero. Play against an opponent that is 400 Elo weaker than you are. Too many draws that you could have won by avoiding the draws and outplaying a much weaker opponent. Play against a stronger opponent. Too many losses that you could have drawn had the contempt been set higher.

In Crafty, I have a dynamic contempt (draw score) that is set according to the difference in rating between Crafty and its opponent for this game. This can be automatically obtained when playing on a server (xboard protocol provides both ratings using the 'rating' command) or can be set by hand by manually entering a "rating" command. Then you can try to avoid draws against weaker opponents, and play toward them against stronger opponents.
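
A minimal sketch of the idea, not Crafty's actual code: the linear mapping, the `cp_per_100_elo` slope and the `cap` are invented for illustration, and the parser assumes the xboard convention that the `rating` command carries the engine's own rating followed by the opponent's, with 0 meaning unrated.

```python
def parse_rating_command(line: str):
    """Parse an xboard-style 'rating <own> <opponent>' line.

    Returns (own, opponent) as ints, or None if the line is not a rating command.
    """
    parts = line.split()
    if len(parts) != 3 or parts[0] != "rating":
        return None
    try:
        return int(parts[1]), int(parts[2])
    except ValueError:
        return None

def dynamic_contempt(own: int, opp: int, cp_per_100_elo: int = 10, cap: int = 50) -> int:
    """Hypothetical mapping from rating difference to contempt in centipawns.

    Positive result: avoid draws (weaker opponent).
    Negative result: steer toward draws (stronger opponent).
    """
    if own == 0 or opp == 0:  # assumed: 0 means unrated, so stay neutral
        return 0
    raw = int((own - opp) * cp_per_100_elo / 100)  # truncates toward zero
    return max(-cap, min(cap, raw))
```

With these made-up parameters, `rating 2600 2200` gives a contempt of +40 cp and `rating 2200 2600` gives -40 cp; the cap keeps extreme rating gaps from dominating the evaluation.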

lucasart · Post by **lucasart** » Wed May 15, 2013 12:27 am

hyatt wrote: A "static contempt" is a HORRIBLE idea. Set it to zero. Play against an opponent that is 400 Elo weaker than you are. Too many draws that you could have won by avoiding the draws and outplaying a much weaker opponent. Play against a stronger opponent. Too many losses that you could have drawn had the contempt been set higher.

In Crafty, I have a dynamic contempt (draw score) that is set according to the difference in rating between Crafty and its opponent for this game. This can be automatically obtained when playing on a server (xboard protocol provides both ratings using the 'rating' command) or can be set by hand by manually entering a "rating" command. Then you can try to avoid draws against weaker opponents, and play toward them against stronger opponents.

Static contempt is the only thing I can do. DiscoCheck has no way of knowing how much weaker/stronger the opponent is. And I don't like the idea.

The point is that my static contempt does not cost any ELO, and significantly reduces the draw rate. That's really the only thing I care about. Perhaps it makes differences bigger and DiscoCheck will destroy more weak engines and get destroyed more by strong ones, but the only thing I care about is the average here.

As there's no cost in ELO it's clear to me that contempt is good. There's nothing more annoying than seeing engines agree to a draw by 3-repetition a few moves out of the opening (just because both think they have a very slightly negative eval, even though in reality it's equal and nothing has really happened). I would rather see DiscoCheck fight to the death than accept such cowardly draws.

hyatt · Post by **hyatt** » Wed May 15, 2013 6:40 pm

So you can't enter a command that says "opponent rating is xxxx?" I've played in many USCF-type events, both as a human, and using Crafty, and I ALWAYS knew the rating of my opponent before the round started.

You are simply overlooking a basic feature of the game...

I would rather see Crafty accept such a draw against a stronger opponent, and fight on against a weaker opponent, waiting for him to make the inevitable mistake...

BB+ · Post by **BB+** » Mon May 27, 2013 3:45 pm

Contempt in some engines (Rybka) also used a concept of pawn spread. For instance, efgh vs fgh is drawish, while abgh vs afg is not so much so. LK indicated this at one point, and indeed unbalanced pawn structures are something that dynamic GM play often favours. IvanHoe dumps something vaguely related into the "drawish" adjustment, but as it lacks contempt, it is (vacuously) not used there.
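
One possible way to put a number on "pawn spread" is the width of the band of files that contains pawns of either side. This is purely a guess for illustration: the thread does not say how Rybka computes it, and `pawn_spread` is a made-up name.

```python
def pawn_spread(white_files: str, black_files: str) -> int:
    """Number of files spanned by all pawns, e.g. 'efgh' vs 'fgh' spans e..h.

    Each argument is a string of file letters, one per pawn.
    """
    files = [ord(f) - ord("a") for f in white_files + black_files]
    if not files:
        return 0
    return max(files) - min(files) + 1
```

Under this measure efgh vs fgh has a spread of 4 files, while abgh vs afg spans all 8, which matches the intuition that the second structure is the less drawish one.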

lucasart · Post by **lucasart** » Mon Jun 03, 2013 3:50 am

I know, avoiding draws is not just about avoiding those you can see in the search, but also avoiding positions that are more drawish by eval consideration. For example, with a high contempt, the engine should
- try to retain the maximum number of pieces on the board (avoid exchanges)
- prefer asymmetric pawn structures to symmetric ones
- avoid blocked pawn structures like the plague

The problem with this approach is that it makes the eval asymmetric (i.e. side-to-move dependent). This is fundamentally incompatible with the design of DiscoCheck, and doing this would cost me quite a lot of speed in the general case. After a null move I never calculate the eval, but invert the one of the parent instead (thanks to eval symmetry).
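
For readers unfamiliar with the post null move trick, here is a toy sketch of it. It is not DiscoCheck's code: `static_eval`, `toy_evaluate` and the dict-based position are invented for the example, and it assumes the eval has no side-to-move term (such as a tempo bonus).

```python
from typing import Callable, Optional

def static_eval(position: dict,
                parent_eval: Optional[int],
                after_null_move: bool,
                evaluate: Callable[[dict], int]) -> int:
    """Static eval from the side to move's point of view (negamax convention)."""
    if after_null_move and parent_eval is not None:
        # A null move only flips the side to move, so a symmetric eval
        # is exactly the parent's eval with the sign inverted.
        return -parent_eval
    return evaluate(position)

# Toy stand-in for a real evaluation: material balance seen by the side to move.
stats = {"calls": 0}

def toy_evaluate(position: dict) -> int:
    stats["calls"] += 1
    sign = 1 if position["stm"] == "w" else -1
    return sign * position["white_minus_black"]

parent = {"stm": "w", "white_minus_black": 30}
child = {"stm": "b", "white_minus_black": 30}  # same position after a null move

parent_score = static_eval(parent, None, False, toy_evaluate)
child_score = static_eval(child, parent_score, True, toy_evaluate)
```

Only one real evaluation is performed for the two nodes. As soon as the eval depends on which side is the engine, the shortcut `-parent_eval` is no longer guaranteed to equal a fresh evaluation, which is the speed cost described above.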

I suppose I could introduce two different features:
- drawscore (from the point of view of the engine). This is what is explained in my first post.
- contempt: when this is zero, then use the post null move optimization, otherwise don't. Only interesting when playing against a human, and tries to do the things listed above.

I'll have a look, and if it doesn't add too much cruft to the code, why not? But given the lack of interest of human players, I'm not sure it's worth it. No one cares about engines that are not in the Top 10 (if not Top 3), and those that do only play automated computer vs computer matches for rating lists. I doubt I would get GMs to play against my engine.

hyatt · Post by **hyatt** » Mon Jun 03, 2013 9:48 pm

lucasart wrote:I know, avoiding draws is not just about avoiding those you can see in the search, but also avoiding positions that are more drawish by eval consideration. For example, with a high contempt, the engine should
- try to retain the maximum number of pieces on the board (avoid exchanges)
- prefer asymmetric pawn structures to symmetric ones
- avoid blocked pawn structures like the plague

The problem with this approach is that it makes the eval asymmetric (i.e. side-to-move dependent). This is fundamentally incompatible with the design of DiscoCheck, and doing this would cost me quite a lot of speed in the general case. After a null move I never calculate the eval, but invert the one of the parent instead (thanks to eval symmetry).

I suppose I could introduce two different features:
- drawscore (from the point of view of the engine). This is what is explained in my first post.
- contempt: when this is zero, then use the post null move optimization, otherwise don't. Only interesting when playing against a human, and tries to do the things listed above.

I'll have a look, and if it doesn't add too much cruft to the code, why not? But given the lack of interest of human players, I'm not sure it's worth it. No one cares about engines that are not in the Top 10 (if not Top 3), and those that do only play automated computer vs computer matches for rating lists. I doubt I would get GMs to play against my engine.

It doesn't have to be asymmetric. For example, how do you evaluate a position where white is a piece down? I don't call that asymmetric. By the same token, if the position has drawish characteristics, you can simply pull the score toward the contempt score as opposed to zero, and it works for either side equally, so that the side that is ahead will see a decrease, the side that is behind sees things get better if it is drawish...

This is simply a symmetric adjustment to the evaluation score based on the drawishness of the position...
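
A sketch of what such an adjustment could look like, under the assumption that the eval, the draw (contempt) score and the result are all expressed from the side to move's point of view. The function name `pull_toward_draw` and the `drawishness` scale from 0 to 1 are invented here; how Crafty measures drawishness is not described in this thread.

```python
def pull_toward_draw(score: int, draw: int, drawishness: float) -> int:
    """Interpolate an eval toward the draw score.

    drawishness = 0.0 leaves the score untouched,
    drawishness = 1.0 returns the draw score itself.
    """
    d = min(1.0, max(0.0, drawishness))  # clamp to [0, 1]
    return round(draw + (score - draw) * (1.0 - d))
```

Negating both `score` and `draw` (that is, looking at the same position from the other side) negates the result, so the adjustment does not make the evaluation depend on the side to move.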

lucasart · Post by **lucasart** » Tue Jun 04, 2013 4:09 pm

hyatt wrote:It doesn't have to be asymmetric. For example, how do you evaluate a position where white is a piece down? I don't call that asymmetric. By the same token, if the position has drawish characteristics, you can simply pull the score toward the contempt score as opposed to zero, and it works for either side equally, so that the side that is ahead will see a decrease, the side that is behind sees things get better if it is drawish...

This is simply a symmetric adjustment to the evaluation score based on the drawishness of the position...

That's interesting. I'll experiment with it, although I suspect it will have too small an impact to measure.

hyatt · Post by **hyatt** » Wed Jun 05, 2013 12:46 am

lucasart wrote:
hyatt wrote:It doesn't have to be asymmetric. For example, how do you evaluate a position where white is a piece down? I don't call that asymmetric. By the same token, if the position has drawish characteristics, you can simply pull the score toward the contempt score as opposed to zero, and it works for either side equally, so that the side that is ahead will see a decrease, the side that is behind sees things get better if it is drawish...

This is simply a symmetric adjustment to the evaluation score based on the drawishness of the position...
That's interesting. I'll experiment with it, although I suspect it will have too small an impact to measure.

You might be surprised.

--from experience and lots of testing, btw.

sje · Post by **sje** » Wed Jun 12, 2013 3:42 am

I will not dispute the observation that careful use of an adjustable contempt offset (not "factor") will likely increase a program's score over time.

However, I've never used it in my programs because it violates the spirit of the adage "Play the board, not the man". A contempt offset will taint the move selection process, thus making any post mortem analysis less transparent.
