Rebel wrote: Alright then, let's keep it simple. Same question for you as to Zach in Rybka forum.
1. Do you totally reject Miguel's PST experiments? Is it all bull, or is there some truth in it?
To the best of my understanding [and I am not following Rybka Forum], I think he is simply measuring something else, and I don't find his to be the most trenchant consideration (or "finest microscope") for the R/F situation. To be more precise: if all one had were the Fruit 2.1 arrays to go along with the Rybka 1.0 Beta arrays, then his reconstruction would have more relevance, in that there would be no "natural supposition" as to the methodology behind the creation of the PSTs.
However, with the Fruit 2.1 code sitting in front of us, it seems rather unnatural to speculate about alternative methods that could re-create R/F-style arrays -- I claim that a more direct method is to determine how little the Fruit 2.1 code needs to be modified to give the Rybka 1.0 Beta numbers, and whether this "code differential" is notably smaller than with other engines. [This is a simplification of any "comparison" process, as many other engines don't even have 12 PSTs, and so one has to juggle the methodology before even starting to compare].
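To make the "code differential" idea concrete, here is a minimal toy sketch. Everything in it is an assumption for illustration only: the generator `make_pst`, the shape arrays, and the parameter names `file_w` and `rank_w` are hypothetical and are not taken from Fruit 2.1 or Rybka. The point is merely the measurement: given a parameterised table generator, count how few parameters must be changed for its output to match a target table.

```python
from itertools import combinations, product

# Hypothetical shape arrays: NOT Fruit's actual constants.
FILE_SHAPE = [-3, -1, 0, 1, 1, 0, -1, -3]
RANK_SHAPE = [-3, -1, 0, 1, 1, 0, -1, -3]

def make_pst(params):
    """Build a 64-entry table from two made-up scalar weights."""
    return [
        params["file_w"] * FILE_SHAPE[sq % 8]
        + params["rank_w"] * RANK_SHAPE[sq // 8]
        for sq in range(64)
    ]

def code_differential(base, target, candidates):
    """Return the smallest number of parameters of `base` that must be
    changed (to values drawn from `candidates`) so that make_pst
    reproduces `target` exactly; return None if no combination works."""
    names = sorted(base)
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            for values in product(candidates, repeat=k):
                trial = dict(base, **dict(zip(subset, values)))
                if make_pst(trial) == target:
                    return k
    return None

base = {"file_w": 5, "rank_w": 5}
target = make_pst({"file_w": 10, "rank_w": 5})
print(code_differential(base, target, range(0, 21)))
```

In a real comparison the same count would be taken against several engines' generators, and one would ask whether the R/F figure is notably smaller than the others; for engines without a comparable set of tables the generator itself would first have to be adapted, which is the "juggling" mentioned above.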
Rebel wrote: 2. Do you still stand by your document that on 2 places states that the PST's are a MAJOR issue?
I still think they are "major", by my understanding of ICGA Rule #2. I do not know about other Panel members. Whether or not PST by itself would be enough for "strong sanctions" from Rule #2 (rather than just a wrist-slap) is yet another question.
Rebel wrote: 3. What's your personal opinion, in the light of all that has been said about the PST's here and at Rybka forum, should the panel had to pay more attention to the PST issue?
My personal opinion is that the Panel proceeded in a reasonable manner. One of the first orders of business was to determine the relative import of the various aspects, which of them should be discussed more, what would be convincing evidence that Rule #2 had indeed been broken, etc. At this stage, some things like 10-30-60-100 scaling (section 6.2) were completely rejected, while others (like UCI parsing) were said to be only marginally of chess-related interest. This (essentially) left the "big three", namely the three so-called major pieces of evidence. [Other things were also raised during this period, such as which Rybka versions should be investigated for ICGA purposes, what could be considered "black-box" code, and more].
It was then generally agreed [via the survey mentioned below] that "Commonality of evaluation features" would suffice, if indeed the evidence there was as strong as claimed (and that it also applied to R232a, not just R1). I don't know whether this was because I put it first in RYBKA_FRUIT [as Alan Sassler said, you usually lead with the best stuff], or whether the Panel sensed that EVAL was the most chess-relevant of these three [EVAL, root search ordering, PST]. There was then a discussion of how EVAL "similarity" could be measured. In particular, my RYBKA_FRUIT wording that R/F had "the use of exactly the same evaluation features" was scrutinised [it seems that MarkL quoted this in the final Report as my initial impression, tempering it with: "This has been expanded with more statistical rigour in a separate 50+ page paper"], as at the most abstract level of general chess knowledge, this could be said about many engine pairs. Even if one passed down to finer detail, there were specific issues brought up with other engines (e.g., they used contact squares for king safety), and thus there was a desire for some sort of "statistical" analysis of the R/F "evaluation overlap". This led to EVAL_COMP, which was first discussed in preliminary form, and then "instrumented" so as to get a final result. However, although EVAL_COMP took centre stage for some purposes, it should be stressed that it was still only a part of the evidence.
By this point, Mark Lefler had surveyed the Panel. In the early part of this survey, some people had expressed various doubts. In particular, some of those who were undecided noted that the evaluation comparison of R/F was the main issue, with additional caveats about whether EVAL similarities could be explained by other "public" influences (either on FL in making Fruit, or on VR with Rybka). After EVAL_COMP was finished and discussed, MarkL then made the decision to pass to the "voting" stage, and proceeded to put together the Report. Referring to my RYBKA_FRUIT work, he included the line: "Disassembly of the root search analysis indicates nearly identical code and variables, even including the ordering of the variables" (the 2nd "major" issue), but did not directly mention PST. For Zach's it says: "Identical formulas for calculating piece-square tables for: pawns, knights, bishops, rooks, queens. Highly similar formulas for piece square tables for kings." One could argue that this is more of a statement of Zach's evidence, rather than bringing forth any conclusion one might draw from it.