More on Rybka/IPPOLIT (with occasional reference to Fruit)
Posted: Wed Dec 01, 2010 1:37 am
As my R3/IPPOLIT report seems to be being used in the kangaroo courts of TalkChess, perhaps I should comment:
M ANSARI: You ask "where is the proof" that they are clones ... I think the best proof is the BB report.
That's a fairly jaundiced view of the report. Maybe if "clones" were put in inverted commas I could agree. I interpret the word "clone" rather strictly, and by that measure, R3 and IPPOLIT don't come remotely close to such a descriptor. The word "derivative" has a technical quasi-legal meaning that I prefer to avoid (similarly with "code") -- by the traditional standards of computer chess, I would say that R3/IPPOLIT and Fruit/R1 are essentially on the same footing [qualitatively, and as I say, quantitatively it can depend on your metric], in that both R1 and IPPOLIT re-use a substantial quantity of specifics of their respective precursors. [The fact that Fruit was "free and open source" and R3 a "commercial product" is not relevant to me -- there are a number of dissenters in the intellectual property world, but the more common opinion is that once software is obtained legally, an end-user can use it for the purposes of discovery unless there is an agreement to the contrary].
M ANSARI: Even with Rybka 2.3.2a, at LTC Zappa Mexico II was scoring very close or even equal to Rybka 2.3.2a. The big breakthrough in chess engine strength came with R3 ... and R3 is the original creation of Vas
I quite appreciate the reminder in the first part of this, as it is easy to forget that ZMII beat R2.3.2a(+) in the Mexico match (after Deep Junior accepted, but Convekta balked about conditions with remote machines, the prize fund was cut by 90% with Zappa as the replacement, and then remote machines were used for the first games in any case). However, I might dispute the second part, as my perception is that at least half of the R3 gain (which was ~85 Elo on CCRL) was from LK rewriting the Fruit-like evaluation function that was used in previous Rybka versions [there were initial elements in some 2.3.1 version, but not until R3 was the work complete]. To me, the big gain for Rybka qua VR was the 150 Elo from R1 to R2.3.2a -- I've never been able to sort out VR's claims about the performance of his SMP implementation, but this (at the least) is another facet entirely due to him [unlike the IvanHoe SMP code, which seems to borrow from Crafty or maybe Glaurung/Stockfish].
Dann Corbit: Clearly, what Vas has done is also very innovative. You do not pull hundreds of Elo out of thin air.
As I've said before, I think the use of modern testing methods (at ultra-fast time controls) is one of his major innovations. I don't know what he used for eval tuning, but perhaps that too [though others have been advancing here just as much, I think]. His original stubborn championing (in the face of various naysayers) of material imbalances [based on work of LK] is another one. The idea of what I call "statistical pruning" (though Ed seemed to favour something more akin to "LMR on steroids") also seems to be largely from VR, though it has some connections to the above aspects (for instance, once one has modern testing methods and eval tuning, doing "search" tuning is not that great a jump). In some sense, the peculiarity is that so many of these ideas actually worked (I mean, anyone can generate ideas that end up in the dustbin, or are debatable as to their merits).
If I were forced to make a tabulation of how R3 was 3153 on CCRL 64-bit 1cpu while R1 was 2920, I would say: 30-40 from "bug fixes" of R1 [as ZW has noted, there were some fairly ugly things, like lazy eval being over-used due to confusion of scalings], 20-40 from further improving material imbalances and other pre-LK eval tweaks, 60-70 from LK's rewriting of the eval (out-sourcing is another VR innovation), and 100 from search improvements (the substantial majority from pruning, and already ~75 of this gained by the time of R2.3.2a). But certainly don't take this as more than a wild guess.
For the 100 or so points from Fruit to R1 (after making the 32-bit to 64-bit adjustment), I think 15 is from superior engineering (bitboards are about 15% faster even for 32-bit), 40-50 from the material imbalance table, 20-30 from eval tuning, 20-30 from (gasp!) going beyond strict AEL pruning (a primordial phase for pruning in some sense, but perhaps psychologically significant), and minus 25 (or more) from introduced bugs.
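As a rough sanity check on the two tabulations above, here is a minimal sketch that simply adds up the guessed ranges and compares the R1-to-R3 total with the quoted CCRL difference. The dictionary keys, variable names, and the helper `total_range` are purely illustrative labels of mine; single figures such as "100" or "15" are treated as degenerate ranges, and "minus 25 (or more)" is entered as a flat -25, which is an assumption that makes the Fruit-to-R1 total slightly optimistic.

```python
# Elo contributions as (low, high) ranges, copied from the rough tabulation above.
# All labels are illustrative; the numbers are guesses, not measurements.
R1_TO_R3 = {
    "bug fixes of R1": (30, 40),
    "material imbalances and other pre-LK eval tweaks": (20, 40),
    "LK's rewrite of the eval": (60, 70),
    "search improvements": (100, 100),
}

FRUIT_TO_R1 = {
    "superior engineering (bitboards)": (15, 15),
    "material imbalance table": (40, 50),
    "eval tuning": (20, 30),
    "going beyond strict AEL pruning": (20, 30),
    "introduced bugs": (-25, -25),  # "minus 25 (or more)": assumed flat -25 here
}


def total_range(components):
    """Sum the low ends and the high ends of a dict of (low, high) ranges."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


if __name__ == "__main__":
    observed_r1_to_r3 = 3153 - 2920  # CCRL 64-bit 1cpu ratings quoted above
    lo, hi = total_range(R1_TO_R3)
    print(f"R1 -> R3: guessed {lo}-{hi}, observed {observed_r1_to_r3}")

    lo, hi = total_range(FRUIT_TO_R1)
    print(f"Fruit -> R1: guessed {lo}-{hi}, versus '100 or so'")
```

The R1-to-R3 guesses sum to 210-250, which brackets the 233-point CCRL gap; the Fruit-to-R1 guesses sum to 70-100, so the "100 or so" figure sits at the top of that range.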
Contrasted to this, with IPPOLIT there are various (perhaps many) technical improvements, and a few minor additions, but nothing that I would rate as being as substantial as the above [my impression is that it is about 15% faster on a pure engineering basis, and has improved the pruning (almost an extra ply deeper than R3 after converting the depths of the latter), while eval is a toss-up -- IPPOLIT threw away more than it added to R3's eval, and perhaps there is some additional speed from this]. I'd say the RobboBase code is a more notable achievement than IPPOLIT/IvanHoe (as I mentioned in the other thread, it seems much superior to the other formats for most purposes, the main wart being that DTM is not used), though as I mentioned before, gaining even 50 Elo over R3 is no small feat either [this is contrary to the claim of Don Dailey that "but the cloners so far have not been able to increase it even by 1 ELO" (I think I am applying the context correctly), though he is correct that "Every chess programmer knows that the last 100 ELO are by far the hardest"]. Using LMR at PV nodes (in later IvanHoes) is a fair-sized gain I think, but I wouldn't say it is "due" to them.