Some tablebase formats

General discussion about computer chess...
syzygy
Posts: 148
Joined: Sun Oct 16, 2011 4:21 pm

Re: Some tablebase formats

Post by syzygy » Sat Mar 03, 2012 3:56 am

BB+ wrote:I'm still not quite sure what Shredder is doing for the extra compression in the 157MB version. One guess is simply a Huffman layer on top of what I presume is run-length encoding (in the 441MB version), though I can't quite get that much space-reduction in my own experiments. I don't know if the slowdown from such uncompression would be ~10x as with Shredder either.
I don't think the difference can be explained by better compression.

The one-sided trick seems to fit the numbers very well. I wouldn't exclude this possibility too easily.

If it is not the one-sided trick, it must be some kind of extended capture resolution. Either searching a few plies for a winning or drawing conversion, or searching for a winning or drawing pawn move.

With capture resolution added to the compression scheme I outlined here I currently get 461MB for two-sided 3-4-5 piece tables (storing somewhat more information than the Shredder bitbases, since I differentiate between wins/losses affected by the 50-move rule and those not affected).

The subset of pawnful tables add up to about 387MB, so adding "pawn-move resolution" should save quite a bit of space, but it would also seem to create the possibility of a search-space explosion during probing...

My DTZ (or rather DTZ50+) tables are now just over 1950MB (they store signless values, so require the WDL tables). I think I can get them a bit smaller still. Applying the one-sided trick would again more than half them in size. Since they are only probed at the root and probing is quite fast, I don't think it makes sense not to apply the trick. Also DTZ tables for positions with an overwhelming material advantage for one side should probably just be completely dispensed with (especially with WDL available).

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Some tablebase formats

Post by BB+ » Sat Mar 03, 2012 6:55 am

The one-sided trick seems to fit the numbers very well. I wouldn't exclude this possibility too easily.
I asked SMK in Tilburg last November. He confirmed the 157MB 5-man number was 1-sided. [He also said the 6-man Shredderbase build had a bug(s?), and they were hoping to fix it, but it seems not to be a priority -- he could have used 6-man in the Junior/Shredder game, but still escaped with a draw]. I didn't ask the details of compression (and SMK might not know off-hand). The RobboBases appear to use some sort of RLE encoding, as Schaeffer had done with checkers [there's also junk in the code for Huffmann encoding, but it doesn't seem currently to be used]. My recollection is that they divvy things up into 64-byte containers (thus 4/64 of the memory usage is indices).
Also DTZ tables for positions with an overwhelming material advantage for one side should probably just be completely dispensed with (especially with WDL available).
One idea (I think I am recollecting it from checkers, maybe Trice) is to count anything with a small enough DTZ as just an adorned "win", on the theory that search can easily find it.

syzygy
Posts: 148
Joined: Sun Oct 16, 2011 4:21 pm

Re: Some tablebase formats

Post by syzygy » Sat Mar 03, 2012 2:31 pm

BB+ wrote:
The one-sided trick seems to fit the numbers very well. I wouldn't exclude this possibility too easily.
I asked SMK in Tilburg last November. He confirmed the 157MB 5-man number was 1-sided.
So that mystery has been solved :)
The RobboBases appear to use some sort of RLE encoding, as Schaeffer had done with checkers [there's also junk in the code for Huffmann encoding, but it doesn't seem currently to be used]. My recollection is that they divvy things up into 64-byte containers (thus 4/64 of the memory usage is indices).
Yes, I read that too. I'm doing the same (well, not the RLE), but with 2 bytes per container to keep track of the number of positions stored in it (1-65536), and 6 bytes per about 16 containers to store a 4-byte container index and a 2-byte position index within that container. Container size for WDL is 64 bytes unless it compresses so well that 32 bytes is better (given the max of 65536 positions per container). For DTZ I use 1024 bytes or less.
Also DTZ tables for positions with an overwhelming material advantage for one side should probably just be completely dispensed with (especially with WDL available).
One idea (I think I am recollecting it from checkers, maybe Trice) is to count anything with a small enough DTZ as just an adorned "win", on the theory that search can easily find it.
I was thinking of that as well. With WDL available it should be fine, except for 50-move critical tables where every ply might count (and which I store with a ply count instead of a move count). Another idea is storing move-count / 4 or so, but that would require a lot of DTZ -probing which I would like to avoid (the idea being that WDL is in RAM or SSD and DTZ is on a slower medium).

Post Reply