To kick off some technical discussions

Code, algorithms, languages, construction...
mcostalba
Posts: 91
Joined: Thu Jun 10, 2010 11:45 pm
Real Name: Marco Costalba

Re: To kick off some technical discussions

Post by mcostalba » Sun Jun 13, 2010 12:24 pm

hyatt wrote: My starting set of positions were chosen by using a high-quality PGN game collection, going thru each game one at a time and writing out the FEN when it is white's turn to move, move number 12, one position per game. These were then sorted by popularity to get rid of dups, and then the first 5,000 or so were kept. We are currently using 3,000 positions, where each game alternates colors so two games per position, and we use opponents including Stockfish, fruit, toga, etc...
Is it possible to have the 3000 positions that you are using ?

We currently use a varied book truncated at 7 moves; we have no experience with PGN positions. Why do you prefer positions to books? Have you tested both and found that positions are better, or is it just an impression on your side?

kingliveson
Posts: 1388
Joined: Thu Jun 10, 2010 1:22 am
Real Name: Franklin Titus
Location: 28°32'1"N 81°22'33"W

Re: To kick off some technical discussions

Post by kingliveson » Sun Jun 13, 2010 2:17 pm

mcostalba wrote: .
.
.
We currently use a varied book truncated at 7 moves; we have no experience with PGN positions. Why do you prefer positions to books? Have you tested both and found that positions are better, or is it just an impression on your side?
Here is a tournament that someone ran using a book. Look at the ECO distribution. This is why, in my opinion, using a book is bad for this kind of test.
White Wins 13 (21.7%)
Black Wins 10 (16.7%), 9 playing the Sicilian
Draws      37 (61.7%)
Total      60

ECO A   7 Games (11.7%)
ECO B  42 Games (70.0%)
ECO C   6 Games (10.0%)
ECO D   2 Games ( 3.3%)
ECO E   3 Games ( 5.0%)
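A breakdown like the one above can be produced from the ECO codes of the games with a few lines of Python. This is only a sketch: the function name `eco_distribution` and the placeholder codes in `sample` are assumptions made for illustration, and only the per-letter totals come from the tournament quoted here.

```python
from collections import Counter

def eco_distribution(eco_codes):
    """Return {ECO letter: (games, percent)} for a list of ECO codes such as 'B90'."""
    # Only the leading letter (A-E) matters for this coarse breakdown.
    counts = Counter(code[0].upper() for code in eco_codes if code)
    total = sum(counts.values())
    return {letter: (n, round(100.0 * n / total, 1))
            for letter, n in sorted(counts.items())}

# Placeholder codes that reproduce the per-letter totals listed above.
sample = ["A00"] * 7 + ["B00"] * 42 + ["C00"] * 6 + ["D00"] * 2 + ["E00"] * 3

for letter, (games, pct) in eco_distribution(sample).items():
    print(f"ECO {letter} {games:3d} games ({pct:4.1f}%)")
```

Running the same function over the games of a position-based test would show directly whether the openings are spread more evenly.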
PAWN : Knight >> Bishop >> Rook >>Queen

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: To kick off some technical discussions

Post by hyatt » Sun Jun 13, 2010 6:20 pm

mcostalba wrote:
hyatt wrote: My starting set of positions were chosen by using a high-quality PGN game collection, going thru each game one at a time and writing out the FEN when it is white's turn to move, move number 12, one position per game. These were then sorted by popularity to get rid of dups, and then the first 5,000 or so were kept. We are currently using 3,000 positions, where each game alternates colors so two games per position, and we use opponents including Stockfish, fruit, toga, etc...
Is it possible to have the 3000 positions that you are using ?
Yes. They are on my ftp box in /pub/hyatt/tests/openings.epd. There are actually 4,000 positions, I use the first 3,000 in my testing.


We currently use a varied book truncated at 7 moves; we have no experience with PGN positions. Why do you prefer positions to books? Have you tested both and found that positions are better, or is it just an impression on your side?
How do you make sure that (a) every book line gets played twice; (b) no book line gets repeated more than that; and (c) all book lines are played without skipping any?

I am shooting for the same set of positions each time, without having to worry with the built-in randomness in my opening book, nor trying to figure out how to create a similar book for the opponents. I just stuff the positions into the programs and let the game start immediately.
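As an illustration of the "same positions, two games each" idea, here is a minimal sketch of a schedule generator. It is not the actual test harness; the name `paired_schedule`, the placeholder position strings and the engine names passed in are assumptions for the example.

```python
def paired_schedule(positions, engine, opponents):
    """Yield (fen, white, black) so that every position is played twice
    against each opponent, once with each color."""
    for opponent in opponents:
        for fen in positions:
            yield fen, engine, opponent   # engine under test has white
            yield fen, opponent, engine   # same position, colors swapped

# Placeholders standing in for lines read from a position file.
positions = ["fen-1", "fen-2", "fen-3"]
games = list(paired_schedule(positions, "Crafty", ["Stockfish", "Fruit"]))

for fen, white, black in games[:4]:
    print(fen, white, "vs", black)
```

Because the schedule is a pure function of the position list, every run plays exactly the same openings, each of them with both colors, and none is skipped or repeated.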

mcostalba
Posts: 91
Joined: Thu Jun 10, 2010 11:45 pm
Real Name: Marco Costalba

Re: To kick off some technical discussions

Post by mcostalba » Sun Jun 13, 2010 6:46 pm

hyatt wrote: How do you make sure that (a) every book line gets played twice; (b) no book line gets repeated more than that; and (c) all book lines are played without skipping any?

I am shooting for the same set of positions each time, without having to worry with the built-in randomness in my opening book, nor trying to figure out how to create a similar book for the opponents. I just stuff the positions into the programs and let the game start immediately.
Yes, I have heard this argument before, but I am not sure it still holds when we talk about 20000 or more games. In that case you need to think in terms of statistics, and the worry about having each opening played alternately by both players matters much less than it does in, for instance, a match of 50 or 100 games. What counts is the probability that the openings are equally distributed, and we can safely assume that this converges over such a large number of games.
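The convergence argument can be made concrete with a small calculation. Assuming each game's opening is drawn independently and a given opening comes up with probability p, the observed share of that opening over n games has standard error sqrt(p(1-p)/n). The sketch below uses an assumed p of 0.10 purely as an example; it says nothing about whether both colors get each opening equally often.

```python
import math

def share_std_error(p, n_games):
    """Standard error of the observed share of an opening that is picked
    independently with probability p in each of n_games games."""
    return math.sqrt(p * (1.0 - p) / n_games)

# Compare a short match with a long one (p = 0.10 is an arbitrary example).
for n in (100, 20000):
    print(f"{n:6d} games: +/- {100 * share_std_error(0.10, n):.2f} percentage points")
```

The error shrinks with the square root of the number of games, so going from 100 to 20000 games reduces it by a factor of about 14.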

Sentinel
Posts: 122
Joined: Thu Jun 10, 2010 12:49 am
Real Name: Milos Stanisavljevic

Re: To kick off some technical discussions

Post by Sentinel » Sun Jun 13, 2010 8:26 pm

hyatt wrote:Yes. They are on my ftp box in /pub/hyatt/tests/openings.epd. There are actually 4,000 positions, I use the first 3,000 in my testing.
Thx Bob, this is really useful.
I've just realized that out of 500 positions that I use, 487 are already in your 4000 :).
Are they sorted in descending order, from most to least frequent?

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: To kick off some technical discussions

Post by hyatt » Sun Jun 13, 2010 10:25 pm

mcostalba wrote:
hyatt wrote: How do you make sure that (a) every book line gets played twice; (b) no book line gets repeated more than that; and (c) all book lines are played without skipping any?

I am shooting for the same set of positions each time, without having to worry with the built-in randomness in my opening book, nor trying to figure out how to create a similar book for the opponents. I just stuff the positions into the programs and let the game start immediately.
Yes, I have heard this argument before, but I am not sure it still holds when we talk about 20000 or more games. In that case you need to think in terms of statistics, and the worry about having each opening played alternately by both players matters much less than it does in, for instance, a match of 50 or 100 games. What counts is the probability that the openings are equally distributed, and we can safely assume that this converges over such a large number of games.

It just removes one more cause of uncertainty. And without books, you don't have to worry about hidden book learning, which has happened in the past. Nor do you have to figure out how to disable normal book learning, which some progs do not allow anyway.

This way I know I start with the same 3,000 positions for every run, and that I am not going to play a bunch of KID games this match and way more Sicilians the next time. There is enough randomness caused by timing issues; the opening book was one more source of it that I simply wanted to eliminate up front. Trying to come up with one book and create a binary version for different programs is another pain I wanted to avoid. And I certainly did not want to let them use their native books, because I didn't want to play entire games in book, as has happened in the past.

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: To kick off some technical discussions

Post by hyatt » Sun Jun 13, 2010 10:27 pm

Sentinel wrote:
hyatt wrote:Yes. They are on my ftp box in /pub/hyatt/tests/openings.epd. There are actually 4,000 positions, I use the first 3,000 in my testing.
Thx Bob, this is really useful.
I've just realized that out of 500 positions that I use, 487 are already in your 4000 :).
Are they sorted in descending order, from most to least frequent?
Yes. The first one is the most popular, the last one is the least. I do not remember exact duplicate counts when creating the file, but I do not believe any were played less than 5 times in the "strong GM" pgn I used.
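The "sort by popularity, drop duplicates, keep the top N" step can be sketched as follows. This is not the script that produced the file; it assumes the positions have already been extracted from the PGN as one string per game, and the names `most_popular_positions`, `keep` and `min_count` are made up for the example.

```python
from collections import Counter

def most_popular_positions(fens, keep=4000, min_count=1):
    """Collapse duplicate positions and return the unique ones,
    most frequently played first, limited to `keep` entries."""
    counts = Counter(fens)
    # most_common() orders by count, highest first; ties keep first-seen order.
    ranked = [fen for fen, n in counts.most_common() if n >= min_count]
    return ranked[:keep]

# Tiny stand-in for one extracted position per game.
extracted = ["pos-a", "pos-b", "pos-a", "pos-c", "pos-b", "pos-a"]
print(most_popular_positions(extracted, keep=2))
```

Setting `min_count=5` would mirror the recollection above that no position in the file was played fewer than 5 times.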

mcostalba
Posts: 91
Joined: Thu Jun 10, 2010 11:45 pm
Real Name: Marco Costalba

Re: To kick off some technical discussions

Post by mcostalba » Sun Jun 13, 2010 11:01 pm

hyatt wrote: It just removes one more cause of uncertainty. And without books, you don't have to worry about hidden book learning, which has happened in the past. Nor do you have to figure out how to disable normal book learning, which some progs do not allow anyway.
I have just fixed a very nasty bug that I had introduced some weeks ago.

The bug is impossible to spot with node-count fingerprints, and it would also be impossible to spot using fixed positions, whether for benchmarking or for testing in real games.

The bug was due to an erroneous zeroing of the game ply when the engine starts thinking. It interfered with rule50, so the net effect was that 50-move draw detection was not working properly. I found it incidentally, probably because I use books, so that at each new move the GUI sends (UCI protocol) the whole move list from the starting position up to the current one. Perhaps with fixed positions you lose the rule50 information for previous moves, and this bug could remain hidden or be more difficult to spot.

The bottom line, the experience I took away from this bug-hunting session, is that the more you diverge from a "normal" chess game, the more prone you are to subtle and nasty bugs that are likely to remain untriggered in testing but fire during normal games (as played for the public rating lists).
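For reference, a full six-field FEN string carries both counters involved here: the fifth field is the halfmove clock used for the 50-move rule and the sixth is the fullmove number, from which the game ply follows. The sketch below only illustrates that relationship; it is not the engine's code, and the function name `fen_counters` is an assumption.

```python
def fen_counters(fen):
    """Return (halfmove_clock, game_ply) from a full six-field FEN string."""
    fields = fen.split()
    side_to_move = fields[1]          # "w" or "b"
    halfmove_clock = int(fields[4])   # plies since the last capture or pawn move
    fullmove_number = int(fields[5])  # starts at 1, incremented after Black moves
    game_ply = 2 * (fullmove_number - 1) + (1 if side_to_move == "b" else 0)
    return halfmove_clock, game_ply

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(fen_counters(START))
print(fen_counters("8/8/4k3/8/8/4K3/4R3/8 b - - 37 60"))
```

The two values are independent pieces of state, so resetting the game ply at the start of a search should leave the halfmove clock untouched.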

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: To kick off some technical discussions

Post by hyatt » Mon Jun 14, 2010 2:05 am

mcostalba wrote:
hyatt wrote: It just removes one more cause of uncertainty. And without books, you don't have to worry about hidden book learning, which has happened in the past. Nor do you have to figure out how to disable normal book learning, which some progs do not allow anyway.
I have just fixed a very nasty bug that I had introduced some weeks ago.

The bug is impossible to spot with node-count fingerprints, and it would also be impossible to spot using fixed positions, whether for benchmarking or for testing in real games.

The bug was due to an erroneous zeroing of the game ply when the engine starts thinking. It interfered with rule50, so the net effect was that 50-move draw detection was not working properly. I found it incidentally, probably because I use books, so that at each new move the GUI sends (UCI protocol) the whole move list from the starting position up to the current one. Perhaps with fixed positions you lose the rule50 information for previous moves, and this bug could remain hidden or be more difficult to spot.

The bottom line, the experience I took away from this bug-hunting session, is that the more you diverge from a "normal" chess game, the more prone you are to subtle and nasty bugs that are likely to remain untriggered in testing but fire during normal games (as played for the public rating lists).

How would that fail to be caught with my form of testing? I play _entire_ games; I just start at the same position for each pair of games in a 30K game match. The next 30K games will all start at the same starting positions again. But repetitions, 50-move draws and all that happen normally, and I have actually found bugs in the 50-move code myself. One was the infamous case: if both sides make 50 moves with no pawn moves and no captures, the game can be declared a draw by the side on move, UNLESS the side on move is checkmated. Then the game is over and the other side wins. Most miss that mate on ply 100 is a win. I modified the code to do this correctly, and broke it for a bit. It showed up like a sore thumb on cluster testing, however.
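The ordering of those two checks is the whole point, so here is a minimal sketch of it. This is not Crafty's code; the function `game_state` and its three inputs are hypothetical stand-ins for whatever the engine's move generator and position state provide.

```python
def game_state(in_check, has_legal_move, halfmove_clock):
    """Classify a position, testing for mate before the 50-move rule.

    halfmove_clock counts plies since the last capture or pawn move,
    so 100 plies means 50 moves by each side.
    """
    if not has_legal_move:
        # No legal moves: the game is already over, whatever the counter says.
        return "checkmate" if in_check else "stalemate"
    if halfmove_clock >= 100:
        return "draw-50"
    return "ongoing"

# Mate delivered on ply 100 is still a win, not a draw.
print(game_state(in_check=True, has_legal_move=False, halfmove_clock=100))
```

Swapping the two tests would turn that mate into a draw, which is exactly the kind of error described above.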

I don't see how one would consider these to be "not normal chess games." With a book, you don't start thinking until some random point where you drop out of book. I do exactly the same here, I just choose those random points myself rather than letting book authors do so.

The only place I would agree with you on this is that you do avoid testing your book code. And I do very rarely run "book on" tests. I have an unusual mode in Crafty, used in tournaments, where when it is in book and it is time to ponder, it first looks at the opponent's possible moves and eliminates any that are in book, since we have a reply for those. From the remaining moves we do a search to find the best, and then use that to ponder. The idea came from Murray Campbell many years ago and is based on the idea that if your opponent runs out of book, he may well not make a book move, because the book could miss a threat or have a bug. Since he is out of book, you can sit and do nothing, or you can take a stab at the best non-book reply and then ponder that, so at least you are doing something. And even though you might ponder the wrong move, you do load the hash table with useful information that will help on a ponder miss. I have another mode for lines that are not played very often: Crafty will do a search on the set of known book moves to do something more than just randomly choosing between a group of moves that are rarely played. And for either of those, I run normal book-on tests. But excluding testing the book code, I don't see what would be missed.
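The in-book ponder selection described above can be sketched like this. It is a simplified illustration, not Crafty's implementation: `pick_ponder_move` is a made-up name, and `evaluate` is a hypothetical callback standing in for a real search that scores each move from the opponent's point of view.

```python
def pick_ponder_move(opponent_moves, book_moves, evaluate):
    """Choose a move to ponder while we are still in book.

    Moves that are in the book are skipped, because a book reply is already
    available for them; the best of the remaining moves is pondered.
    """
    candidates = [m for m in opponent_moves if m not in book_moves]
    if not candidates:
        return None  # every opponent move is covered by the book
    return max(candidates, key=evaluate)

# Toy scores from the opponent's point of view (higher is better for him).
scores = {"e7e5": 10, "c7c5": 30, "h7h5": -40, "a7a6": 5}
book = {"e7e5", "c7c5"}
print(pick_ponder_move(list(scores), book, scores.get))
```

Even when the guess turns out wrong, a real search run this way would still have filled the hash table with entries that help after a ponder miss.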

Fernando
Posts: 38
Joined: Thu Jun 10, 2010 1:34 am

Re: To kick off some technical discussions

Post by Fernando » Mon Jun 14, 2010 11:11 pm

What a pleasure to see you here, Bob! Regarding what Chris W told you and what you answered, I cannot add anything but the banal assertion that when you have a massive tool to do something, it is better to drop the old stone hammer.
What I do fully take from Chris's thinking, expressed not here but elsewhere and a long time ago, is his idea that there is something OK in the engine pushing the adversary into uncharted waters, that is, sometimes choosing lines that are not the best in terms of conventional scores but are the best in terms of the risky sublines coming from them. Against humans this is usually lethal. To that you can add a superb improvement in the number of interesting games.
I wonder if you could program a Crafty tailored specifically to take on humans.
My best
Fern
