syzygy wrote: I think I read somewhere that in-check positions are no longer compressed out.
From what I have read, the main difference between the "Old" and "New" triplebases was/is that in-check positions (ROBBO_SCACCO) are treated as don't-cares. The latter seem not to be supported currently (to the chagrin of various users).
BB+ wrote: something like a Huffmann coding prior to the run-length encoding
I think I found what I meant. In the RobboBuild stuff, there is some RobboTripleHuffman.c code, and the Internal_RLE() function seems to be able to use this. From what I can tell, this post-processes the RLE tokens. However, there is "#if 0" code (with the comment "unhappy") that ensures that it does not do so in practise. Looking back, in the 999950 tarball, RobboTripleValue.c has a GetDataHuffmann() function that might be able to read the resulting data. That function seems to have been dropped in the current RobboBaseLib scheme. [Probably just applying a Huffman encoding (rather than bzip2) to the RobboTripleBase files will suffice to show that they shrink by some percentage].
syzygy wrote: Speed would be the reason, but maybe the Triplebases can indeed be probed sufficiently fast. I have not tried it yet with my WDL tables. Run-length encoding could give Robbobases an edge here.
I don't know this for a fact, but I would think that serialising the position and computing the indices would take time comparable to decoding/stepping through 64-byte segments (or half that on average).
BB+ wrote: For that matter, I'm not really sure that the data locality for 6 piece endgames is good enough for their "weak loading" mechanism to be that great [...]
I had thought that one of the Chinook papers discussed this (along with other I/O issues) in the checkers context, but if so I can't find it. My recollection was that they had 99% cache hits with 100MB of buffers (in 1K chunks), but of course 99% means that 1% of the time you are accessing disk, and that will dominate.
EDIT: Regarding Chinook: "The databases have been organized to increase data locality ... The result is that the program, using 200 MB of page buffers ends up doing one disk I/O for an average of 500 data position value requests." They then note that the early 90s and 10 years later had different I/O and cpu splits.
RobboWarrior wrote: Again RobboBases shine superior in every all ways. [...]
I'm not sure whether this is a technical report or a promotional brochure... It does seem correct about the Robbo stuff (or at least I've seen what it says echoed elsewhere), though it doesn't address the speed of probing (other than to talk a bit about indexing).