We went to depth 15 or 16 and found _no_ statistical significance to depth. That is, we (Crafty) were changing our mind at a pretty consistent rate of 15% for each additional ply, whether it was going from depth 5 to 6 or from depth 14 to 15. That was in fact the point of the paper (Monty Newborn's idea; he did most of the writing, I did most of the data testing). 5% is not much when the median is 15%.
Sentinel wrote:
If you used Rybka you would get not more than 10%. If you used SF, you would get more than 20%. How reliable are these numbers?
hyatt wrote:
Ponder hits go up as two programs are closer in overall design. In "Crafty goes deep" (and in other papers that tested the same ideas) we found that if you go one ply deeper, a program will change its mind about 15% of the time, or it will stay with the best move from the previous search 85% of the time.
What is the depth where you tried "Crafty goes deep"? My feeling is that the percentage will vary with the depth and not necessarily be lower at higher depth. Also, I would expect the standard deviation of the given percentage to be more than 5% across various depths.
Ponder hit rate of the "Litos" to Rybka 3/4
-
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: Ponder hit rate of the "Litos" to Rybka 3/4
This is quite useful information; it means that the search tree is very stable. I would have expected more variance (I actually meant a 33% sigma, I just didn't put it in the right words).
hyatt wrote:
We went to depth 15 or 16 and found _no_ statistical significance to depth. That is, we (Crafty) were changing our mind at a pretty consistent rate of 15% for each additional ply, whether it was going from depth 5 to 6 or from depth 14 to 15. That was in fact the point of the paper (Monty Newborn's idea; he did most of the writing, I did most of the data testing). 5% is not much when the median is 15%.
However, it would be interesting to try this today with top engines, which use much more aggressive pruning and null-move reductions. Would the stability between depths still hold, and would it hold for engines like Stockfish that change their mind more often?
If it holds, it basically means that the gain from each additional depth iteration is constant. It might also be used as a measure of pruning aggressiveness: if the change rate starts dropping with increased depth, we are simply pruning too much (assuming, of course, that the score at higher depth is more accurate than at lower depth).
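A minimal sketch of how such a per-depth change rate could be tabulated, assuming you have already logged the best move an engine reported at each completed iteration for a set of test positions. The function name, the input layout, and the sample moves are all hypothetical and made up for illustration; this is not the method used in the paper.

```python
from collections import defaultdict

def change_rate_by_depth(best_moves_per_position):
    """Return {depth: fraction of positions whose best move changed
    when going from depth-1 to depth}.

    best_moves_per_position is a list of lists; each inner list holds the
    best move reported at depth 1, 2, 3, ... for one position (assumed layout).
    """
    changes = defaultdict(int)
    totals = defaultdict(int)
    for moves in best_moves_per_position:
        for i in range(1, len(moves)):
            depth = i + 1  # moves[i] is the best move at depth i + 1
            totals[depth] += 1
            if moves[i] != moves[i - 1]:
                changes[depth] += 1
    return {d: changes[d] / totals[d] for d in sorted(totals)}

# Made-up sample data: three positions searched to depth 4.
sample = [
    ["e2e4", "e2e4", "d2d4", "d2d4"],
    ["g1f3", "g1f3", "g1f3", "c2c4"],
    ["e2e4", "d2d4", "d2d4", "d2d4"],
]
rates = change_rate_by_depth(sample)
for depth, rate in rates.items():
    print(f"depth {depth - 1} -> {depth}: changed {rate:.0%} of the time")
```

With real data, a flat curve across depths would match the stability described above, while a curve that falls off with depth is the pattern suggested here as a possible sign of over-pruning.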
Re: Ponder hit rate of the "Litos" to Rybka 3/4
It is not exactly correct.
hyatt wrote:
If you play A vs A (same program) then a ponder hit rate of 85% would be pretty common. Program A1 searches to depth D to choose its move, and therefore the ponder move is based on a search of depth D-1. When program A2 searches, it will go one ply deeper and change its mind 15% of the time, staying with the D-1 move 85% of the time and giving A1 85% ponder hits.
The problem is that program A is not always going to play the same move at a specific depth. The move may depend on the content of the hash table from previous searches (and, when more than one processor is used, also on luck, because the program is not deterministic).
The only way to find the A vs A ponder hit rate is by testing, and I suspect that it may depend on the program and not only on the probability that the program changes its mind from depth D-1 to depth D.
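As a rough illustration of measuring this by testing, here is a small sketch that computes a ponder hit rate from logged games. It assumes a hypothetical log of (ponder move, move actually played) pairs; the function name and the sample records are made up and not taken from any real engine output.

```python
def ponder_hit_rate(records):
    """Fraction of moves where the predicted (ponder) move matched
    the move the opponent actually played.

    records: iterable of (ponder_move, move_played) pairs (assumed format).
    """
    records = list(records)
    if not records:
        raise ValueError("no moves recorded")
    hits = sum(1 for predicted, played in records if predicted == played)
    return hits / len(records)

# Made-up sample: four predictions, three of them correct.
log = [
    ("e7e5", "e7e5"),
    ("g8f6", "g8f6"),
    ("d7d5", "c7c5"),
    ("f8e7", "f8e7"),
]
rate = ponder_hit_rate(log)
print(f"ponder hit rate: {rate:.0%}")
```

Running this over many A vs A games would give an empirical figure to compare with the 85% predicted from the depth D-1 to depth D change rate.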
Re: Ponder hit rate of the "Litos" to Rybka 3/4
While I don't disagree in general, here, specifically, I suspect the 85% will hold overall. In 50 moves, you might play 1 different move because of hash, or pruning due to history counters, or killer move ordering, etc. But once you get past that one move, you are back to the 85% most of the time.
Uri_Blass wrote:
It is not exactly correct.
hyatt wrote:
If you play A vs A (same program) then a ponder hit rate of 85% would be pretty common. Program A1 searches to depth D to choose its move, and therefore the ponder move is based on a search of depth D-1. When program A2 searches, it will go one ply deeper and change its mind 15% of the time, staying with the D-1 move 85% of the time and giving A1 85% ponder hits.
The problem is that program A is not always going to play the same move at a specific depth. The move may depend on the content of the hash table from previous searches (and, when more than one processor is used, also on luck, because the program is not deterministic).
The only way to find the A vs A ponder hit rate is by testing, and I suspect that it may depend on the program and not only on the probability that the program changes its mind from depth D-1 to depth D.
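The "1 different move in 50" estimate in the reply above can be turned into rough bounds. This is only back-of-the-envelope arithmetic under two assumed extremes (the one perturbed move is always a ponder miss, or always a ponder hit), with the other 49 moves following the 85% rate; the variable names are hypothetical.

```python
base_hit_rate = 0.85   # hit rate implied by a 15% change rate per ply
moves = 50             # moves considered in the estimate
perturbed = 1          # moves that differ due to hash, history, killers, etc.

unperturbed = moves - perturbed

# Extreme 1: the perturbed move is always a ponder miss.
worst_case = unperturbed * base_hit_rate / moves

# Extreme 2: the perturbed move is always a ponder hit.
best_case = (unperturbed * base_hit_rate + perturbed) / moves

print(f"overall hit rate between {worst_case:.3f} and {best_case:.3f}")
```

Under these assumptions the overall rate moves by less than two percentage points either way, which is consistent with the claim that 85% holds approximately.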