Back to R3/IPPOLIT ?
Posted: Mon Feb 07, 2011 2:22 pm
M ANSARI: I think way too much is made out of this "Vas lost R3 code". There were tons of different R3 iterations when it was in beta testing. Things were added and compiled out almost daily. He probably just didn't keep track of the pre compiled source of the released version. That doesn't mean he lost all the code, just that he probably lost track of which code was the release version.
To beg the question then (and I expect you are fully correct): why doesn't VR just state this [particularly to a forum with many programmers] in the manner that you did, rather than give the mealy-mouthed answers that many have come to expect? If the code similarities to IPPOLIT are as great as claimed, I'd fully expect that any version from within a week -- maybe even a month -- before or after the R3 release should suffice for most purposes [and any missing things could be filled in by RE if necessary]. The main point here is then not that VR lacks the precise R3 code, but rather that he mentions it almost as an excuse (of his own making, no less), when as you say it is not all that relevant.
See also his correspondence with Schüle, who directly mentioned this in his codicil after requesting a code-snippet so as to clear up the IPPOLIT situation (either publicly, or to a trusted person like Corbit):
Schüle: (I assume you don't have the exact R3 source version anymore but "close" could be good enough)
This met with the response:
VR: Re. Rybka 3 source code: Unfortunately, I don't have it. [...]
I wouldn't quite call this "non-responsive", but the manner in which the question was framed seems to have taken most of the legs out of the answer, unless he really does have nothing from a rather long time span.
Milos to LK: Your values have nothing to do with actual values in Rybka 3 evaluation which were heavily automatically tuned. Anyone that can use debugger and can read your posts here (in which you brag about your evaluation values) knows that.
Having had some discussion with LK, I think he knows quite well what is in the R3 evaluation function.
Houdart: Please provide us with some real data: evaluation terms of Rybka 3, compared to evaluation terms of Ippolit.
The Appendices of my R3/IPPOLIT comparison give a sampling of this. I'd have to futz around, but if LK is willing for it to be published, a complete comparison should be possible (alternatively, LK could himself publish these, though VR might have some rights that need to be considered). I'm not sure that revealing everything in R3 would imply much more than can already be seen from those Appendices.
LK: One way to demonstrate this [the similarity in evaluation] is to construct an IDeA tree in the Aquarium software for all major openings, letting the program analyze each position for a minute or so. Whether you use Rybka 3, Robbo, Fire, Houdini or any other Ippo-related engine, the final values of the opening moves look remarkably similar (scaled down a bit in Houdini), and quite different from the tree resulting from using other engines like Stockfish, Shredder, Hiarcs, Naum, or Komodo.
This could be an interesting test, but again I might ask for actual data (e.g., how is "remarkably similar" quantified?), and I'd still say the actual evaluation function is a better bet than derived IDeA trees. I would be willing to do a comparison of (say) a million positions of Rybka/Houdini/Komodo/IPPOLIT/Stockfish (only the first 2 or 3 should require much work -- maybe Critter too, as it has an undocumented feature for this) if it would be of use. [I did this a while back for IPP/R3/Fruit/Stockfish I think, but the data are at the Rybka forum, and it's probably better just to start anew].
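As a rough sketch of how "remarkably similar" could be quantified once static evaluations have been collected for the same positions from two engines, one could compute the Pearson correlation together with a least-squares slope. The correlation is unaffected by a uniform rescaling (such as the "scaled down a bit" remark about Houdini), while the slope estimates that scale factor. The function name and the input format (two equal-length lists of centipawn values) are my own assumptions here, and nothing below talks to an actual engine.

```python
from math import sqrt

def compare_evals(xs, ys):
    """Return (pearson_r, slope) for paired static evals in centipawns.

    xs[i] and ys[i] are assumed to be two engines' evaluations of the
    same position i. slope is the least-squares factor in ys ~ slope * xs
    (after centering), i.e. an estimate of relative scaling.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of at least 2 positions")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / sqrt(sxx * syy), sxy / sxx
```

A high correlation alone would not settle anything, since any two sane engines agree on material; the interesting comparison is how the figure for one pair of engines stands relative to the same figure for other pairs over the same position suite.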
LK: But until the Ippo family starts using a substantially different evaluation function I consider myself to be the inadvertent co-author of the evaluation functions in all of these programs ...
Although I tend to agree with this, it is not clear to me exactly what the thrust here is. Does "substantially different" mean the values, or what I consider the major R3 concept, that is, pieces attacking others (either unguarded pieces of lesser value, or "good SEE" attacks, or via x-rays, maybe guarding one's own King too)? I guess one could argue that the latter is just an idea, while the former values might be "close" via tuning [many IPPOLIT values are "rounded to the nearest five" or close to that -- could this conceivably come from tuning with a granularity of 5?]. I think there needs to be more guidance on what is meant here by "substantially different". As stated above, the best bet might be just to take a suite of a million positions, call eval() directly on each, and do a rigorous statistical comparison.
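The "granularity of 5" question could be checked crudely by counting how many evaluation constants sit exactly on a multiple of 5. If the values were unrelated to any such grid, one would expect roughly one in five to land there by chance. This is only a sketch under that assumption; the function name is hypothetical, and the list of constants would have to be extracted from the engine separately.

```python
def fraction_on_grid(values, step=5):
    """Fraction of nonzero integer constants that are exact multiples of step.

    Zeros are skipped because they are multiples of everything and say
    nothing about tuning granularity.
    """
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        raise ValueError("no nonzero values to test")
    hits = sum(1 for v in nonzero if v % step == 0)
    return hits / len(nonzero)
```

For instance, with toy numbers (not actual engine values), fraction_on_grid([5, 10, 12, 0, -15]) gives 0.75. A fraction far above 0.2 over a few hundred real constants would at least be consistent with coarse-grained tuning, though it could equally reflect hand-chosen round numbers.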
Somewhat predictably, I think Fabien Letouzey could say something similar (though again I'd prod him to be more specific) about Rybka 1.0 through Rybka 2.3.2a. Admittedly the Fruit 2.1 values were not that well tuned, so that there is a bit more numerical difference (if I recall my data) between Rybka 1.0 and Fruit 2.1 than between IPPOLIT and Rybka 3 -- on the other hand, the "feature set" differences between Rybka 1.0 and Fruit 2.1 are almost non-existent, while they amount to maybe ~15% of the evaluation components for R3/IPPOLIT (I'd have to check the exact number, and some of the 85% matches were already in Fruit, like king safety computations via attacking a square near the opposing King).