
Strange Stockfish behavior?

Posted: Sun Mar 13, 2011 10:33 am
by Uly
I've noticed something weird at fixed depth, for instance:
.17/29	 0:00 	-2.63 	35...Bd3 36.e6 Bxf5 37.e7 Rfb8 38.Rb4 Bd7 39.Rf4 Kh7 
 18/10	 0:01 	-2.52--	35...Bd3 36.e6 Bxf5 37.e7 Rfb8 38.Bc4 Bd7 39.Bb5 
 18/10	 0:01 	-2.41--	35...Bd3 36.e6 fxe6 37.fxe6 Rf1+ 38.Kh2 Kf8 39.Rxb7 
 18/10	 0:01 	-2.19--	35...Bd3 36.e6 fxe6 37.fxe6 Rf1+ 38.Kh2 Kf8 39.Rxb7 
 18/10	 0:02 	-1.74--	35...Bd3 36.e6 fxe6 37.fxe6 Rf1+ 38.Kh2 Kf8 39.Rxb7 
 18/16	 0:03 	-0.85--	35...Bd3 36.e6 fxe6 37.fxe6 Rf1+ 38.Kh2 Kf8 39.Rxb7 
 18/21	 0:04 	-2.40 	35...Rfb8 36.Rb4 Bd3 37.e6 Bxf5 38.exf7+ Kf8 39.Rxh4 
 19/12	 0:05 	-2.10--	35...Rfb8 36.e6 Kf8 37.Bd5 g5 38.Rxb7 Rxb7 39.Bxb7 
 19/12	 0:05 	-1.79--	35...Rfb8 36.e6 Kf8 37.Bd5 g5 38.Rxb7 Rxb7 39.Bxb7 
 19/25	 0:07 	-1.40 	35...Rfb8 36.e6 Kf8 37.Bd5 Ba6 38.Kf2 Rc8 39.Bb3   
 20/32	 0:14 	-0.97 	35...Rfb8 36.e6 Kf8 37.Bd5 fxe6 38.fxe6 Bg4 39.Rb4   
 20/26	 0:15 	-1.27 	35...Ra6 36.Rxb7 Bg4 37.e6 fxe6 38.fxe6 Bxe6 
 21/26	 0:21 	-1.14 	35...Ra6 36.Rxb7 Bg4 37.e6 fxe6 38.fxe6 Bxe6
 22/27	 0:26 	-1.09 	35...Ra6 36.Rxb7 Bg4 37.e6 fxe6 38.fxe6 Bxe6  
 22/15	 0:28 	-1.37++	35...Rfb8 36.e6 Kf8 37.Bd5 fxe6 38.fxe6 Bg4 39.Kf2 g5
 best move: Ra8-a6 time: 0:28.860 min  n/s: 3.459.849  nodes: 99.847.806
Here I'm using fixed depth 22. Stockfish is expected either to resolve that fail high or to play Rfb8 on the board. Instead, it aborts and plays Ra6.

I recall seeing other instances in games where Stockfish has a fail low and finds an alternative that fails high, but plays the move that failed low anyway.
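
For anyone who wants to spot this in their own logs, here is a minimal sketch in Python. It assumes the GUI writes lines in the same tab-separated layout as the output above (depth/seldepth, time, score with an optional ++ or -- suffix, then the PV). The names parse_line and unresolved_fail_high are made up for this illustration and are not part of Stockfish or of any GUI.

```python
import re

# Assumed layout: " 22/15 <tab> 0:28 <tab> -1.37++ <tab> 35...Rfb8 36.e6 ..."
# A leading "." (as on the first line of the log above) is tolerated.
LINE_RE = re.compile(
    r"^\s*\.?(\d+)/(\d+)\s+(\d+:\d+)\s+([-+]?\d+\.\d+)(\+\+|--)?\s+(\S.*)$"
)


def parse_line(line):
    """Parse one analysis line; return None for anything else (e.g. 'best move:')."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    depth, seldepth, clock, score, bound, pv = m.groups()
    # Strip the move number ("35..." or "36.") from the first PV move.
    first_move = re.sub(r"^\d+\.+", "", pv.split()[0])
    return {
        "depth": int(depth),
        "score": float(score),
        "bound": bound or "",  # "++" = fail high, "--" = fail low, "" = exact
        "move": first_move,
    }


def unresolved_fail_high(log_lines):
    """Return the first PV move of the last analysis line if it ended on '++'."""
    parsed = [p for p in map(parse_line, log_lines) if p is not None]
    if parsed and parsed[-1]["bound"] == "++":
        return parsed[-1]["move"]
    return None


log = [
    " 22/27\t 0:26 \t-1.09 \t35...Ra6 36.Rxb7 Bg4 37.e6 fxe6 38.fxe6 Bxe6",
    " 22/15\t 0:28 \t-1.37++\t35...Rfb8 36.e6 Kf8 37.Bd5 fxe6 38.fxe6 Bg4",
    " best move: Ra8-a6 time: 0:28.860 min  n/s: 3.459.849  nodes: 99.847.806",
]
pending = unresolved_fail_high(log)
```

If the returned move differs from the one on the "best move:" line, as Rfb8 and Ra8-a6 do here, the search ended on a fail high that was not played.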

Re: Designing an analysis friendly Stockfish?

Posted: Sun Mar 13, 2011 4:18 pm
by keoki010
Give me a FEN of the position and I'll try it on my machine, Uly.

Re: Designing an analysis friendly Stockfish?

Posted: Mon Mar 14, 2011 4:15 am
by Uly
It's not something that can be reproduced (because of multiprocessor search and hash contents). It just happens randomly, and rarely(?), on a fail high at fixed depth, or on a fail low followed by a fail high at fixed time per move.

I will try to find a position that reproduces this with 1 CPU.

Re: Designing an analysis friendly Stockfish?

Posted: Mon Mar 14, 2011 6:25 pm
by Ancalagon
Uly wrote: It's not something that can be reproduced (because of multiprocessor search and hash contents). It just happens randomly, and rarely(?), on a fail high at fixed depth, or on a fail low followed by a fail high at fixed time per move.

I will try to find a position that reproduces this with 1 CPU.
I think this is probably an instance of a little design flaw that Uri also mentioned a while back. If Stockfish enters the fail-high loop, it widens its search window upwards. But if on the next pass through the loop it finds that the move is actually worse again than the previous best move, it just aborts the fail-high loop without updating the PV. It costs no Elo because, AFAIK, when it is time to move Stockfish will just play the move it thinks is best. But it is a bit confusing for the user: if you had let Stockfish calculate a 23rd ply, you would see that it had gone back to the previous best move. That is my explanation of what you see...
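
A rough sketch of that loop, in Python rather than Stockfish's own C++, and not taken from the Stockfish source: search_root, toy_search and the centipawn numbers are all invented for illustration. Scores are from the side to move's point of view. The toy search is deliberately unstable: it returns a high score for Rfb8 in the narrow window and a lower one once the window is widened, which is the assumption under which the behavior appears.

```python
INF = float("inf")


def search_root(moves, search, width=25, max_widenings=3):
    """Search root moves; `search(move, alpha, beta)` is a hypothetical callable.

    The first move (the previous best) gets a full window. Every other move
    is first tried in a narrow window just above the current best score.
    """
    best_move = moves[0]
    best_score = search(best_move, -INF, INF)
    events = [("pv", best_move, best_score)]

    for move in moves[1:]:
        alpha, w = best_score, width
        beta = alpha + w
        score = search(move, alpha, beta)
        widenings = 0
        # Fail-high loop: report the fail high, widen upwards, search again.
        while score >= beta:
            events.append(("fail high", move, score))
            widenings += 1
            w *= 2
            beta = alpha + w if widenings < max_widenings else INF
            score = search(move, alpha, beta)
        if score > alpha:
            # The re-search confirmed the move: new best move, new PV.
            best_move, best_score = move, score
            events.append(("pv", move, score))
        # Otherwise the loop just ends: the fail high was already reported,
        # but no PV is sent and the old best move is kept.

    return best_move, best_score, events


def toy_search(move, alpha, beta):
    """Made-up, unstable search result (centipawns)."""
    if move == "Ra6":
        return 109
    narrow = (beta - alpha) <= 25
    return 137 if narrow else 100


best_move, best_score, events = search_root(["Ra6", "Rfb8"], toy_search)
```

With these made-up numbers the last thing reported is the fail high on Rfb8, yet the move returned is Ra6, which is the pattern in the log at the top of the thread.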

Ancalagon

Re: Designing an analysis friendly Stockfish?

Posted: Tue Mar 15, 2011 1:15 am
by Uly
Ancalagon wrote: But for the user it is a bit confusing, if you had let Stockfish calculate a 23rd ply you would see he had gone back to the previous best move.
In all the instances that I've seen, the fail high of the move that wasn't played stays (but I haven't seen many instances, so it could have been luck).

In this specific example, I restarted analysis at fixed depth 23 (I'm using fixed depth to analyze my games), and 35...Rfb8 was indeed better. It should have been played (or Stockfish should have kept analyzing it, or something; I don't know what the behavior should be).

Anyway, this is probably unrelated to the changes made in the "Designing an analysis friendly Stockfish?" thread.

Re: Strange Stockfish behavior?

Posted: Sun Mar 20, 2011 4:09 am
by Uly
Okay, confirmed: what Ancalagon says is right. Stockfish goes back to the old move in the new iteration even in infinite analysis, so its behavior is correct.

What is weird is that in all the cases that I've seen, the move that is failing high and is discarded is better than the old move once you force it or make Stockfish look at it at depth +1, so it's still strange (Stockfish seems wrong in abandoning the new move).

Re: Strange Stockfish behavior?

Posted: Sun Mar 20, 2011 7:42 pm
by hyatt
The behavior is _not_ correct. Perhaps you meant that what he observed actually happens? How can it be correct to get a fail high and then not play that move??? If you fail high and it is not a better move, that's a bug. If you fail high but play the original move because you didn't have time to resolve the fail high, that's also a bug. I see no reason to throw away information you worked hard to discover (the original fail high)...

It is quite easy to have bugs there, from experience...

If you do a parallel split at the root, it can be even trickier...
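
As a sketch of the alternative policy (never throw away a fail high), here is a small Python illustration. It assumes the search records its root events as (kind, move, score) tuples in chronological order. The function name and the event format are hypothetical and are not taken from any engine's code.

```python
def choose_move(events):
    """Pick the move to play from chronological root-search events.

    Each event is (kind, move, score) with kind in {"pv", "fail high",
    "fail low"}. A fail high that was never followed by a new PV counts as
    unresolved, and its move is preferred over the last fully searched one.
    """
    last_pv_move = None
    pending_fail_high = None
    for kind, move, _score in events:
        if kind == "pv":
            # A completed search supersedes any earlier fail high.
            last_pv_move = move
            pending_fail_high = None
        elif kind == "fail high":
            pending_fail_high = move
        # "fail low" events do not change which move is preferred here.
    return pending_fail_high if pending_fail_high is not None else last_pv_move


# The situation from the first post: PV on Ra6, then an unresolved fail high.
played = choose_move([("pv", "Ra6", -1.09), ("fail high", "Rfb8", -1.37)])
```

Under this policy the example from the first post ends with Rfb8 on the board rather than Ra6. It only covers the case where the search stops before the fail high is resolved. A re-search that comes back below the old score is a separate inconsistency that this sketch does not try to handle.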

Re: Strange Stockfish behavior?

Posted: Sun Mar 20, 2011 8:09 pm
by Ancalagon
Uly wrote: Okay, confirmed: what Ancalagon says is right. Stockfish goes back to the old move in the new iteration even in infinite analysis, so its behavior is correct.

What is weird is that in all the cases that I've seen, the move that is failing high and is discarded is better than the old move once you force it or make Stockfish look at it at depth +1, so it's still strange (Stockfish seems wrong in abandoning the new move).
Yes, I think so, Uly. I never tried this with fixed depth, but Stockfish should not be doing the fail-high loops any differently with fixed depth, so that can't make any difference. It is just a bit complicated to go back to the old PV and send it to the GUI when there is bound to be a new one from the next iteration anyway, assuming no other move fails high. The trouble is that with fixed depth there is no new iteration, so that is confusing if you rely on this. (I am not sure, but I thought Rybka does the same, and Vas stated that he does not want to output the PV of a move when it no longer fails high, I think because it is no longer the move you would play if you had to move right away. So it is probably Elo related? If the first move fails, i.e. your main variation drops in value, at that iteration you don't yet have anything searched to the same depth, so that is different: there is no alternative yet that you can fall back on.)

Maybe what you saw was a coincidence. There are bound to be false fail highs that actually come from inferior moves, but maybe these are fewer. That would only mean Stockfish has a good "intuition". But in the PV search it may find some resources for the opponent that were pruned in the null-window search, the search that went over the top of the PV search window and caused the fail high. It is not always straightforward, even if you have the first moves right.

Regards, Eelco

Re: Strange Stockfish behavior?

Posted: Mon Mar 21, 2011 11:23 am
by Uly
hyatt wrote: If you fail high and it is not a better move, that's a bug.
This is what seems to be happening, except that when the user forces the move, it does turn out to be better at higher depth. I'd suggest users be aware of these "false" fail highs that Stockfish does not play but that are better.

Zappa Mexico does this too (the move that seems to fail high but turns out to be worse), with the difference that the moves ARE worse.

Shredder does this silently: a move fails high but it doesn't show anything, unless the user stops the analysis or tells Shredder to make a move (then Shredder makes the move that fails high, not the one that it was showing).
Ancalagon wrote:It is just a bit complicated going back to the old PV and send it to the GUI when there is bound to be a new one from the next iteration anyway, assuming no other move fails high, the trouble is with fixed depth there is no new iteration... So that is confusing if you rely on this.
It also happens at fixed time per move.
Ancalagon wrote: (I am not sure but I thought Rybka does the same
No, on fail highs Rybka always plays the move that fails high, and if a move fails high, it's always the better move (unless later on Rybka finds an even better one, but you never see Rybka returning to old moves).
Ancalagon wrote:Maybe what you saw was a coincidence.
I've seen about 10 cases where the same happens: Stockfish has a fail high but plays another move. If one checks the move that failed high, it's better than what Stockfish played. I've never seen cases where the played move was better.

Unfortunately those cases have been in corr games, and I'd find it bad etiquette to publish positions from games that are still going on (besides, if I showed the last case, it could be an advantage for my opponent; I'm a bit paranoid).
Ancalagon wrote: There are bound to be false fail highs that are actually from inferior moves, but maybe these are fewer
Stockfish and Zappa are the only engines that behave like this. All other engines that I know only let moves fail high when they are sure they are better. Zappa always plays the better move, while Stockfish seems to abandon it. That is what I find weird in this kind of behavior (which is rare in itself).

Re: Strange Stockfish behavior?

Posted: Mon Mar 21, 2011 11:55 am
by Jeremy Bernstein
Uly, send me some positions and whatever additional info you can give me via PM. I can try to isolate the issue (and I won't put your Corr games in jeopardy, promise).