Strange Stockfish behavior?

Uly · Post by **Uly** » Mon Mar 21, 2011 12:27 pm

Okay, will send them next time I see them (I should have kept a record!)

Ancalagon · Post by **Ancalagon** » Mon Mar 21, 2011 7:25 pm

hyatt wrote:The behavior is _not_ correct. Perhaps you meant that what he observed actually happens? How can it be correct to get a fail high and then not play that move??? If you fail high and it is not a better move, that's a bug. If you fail high but play the original move because you didn't have time to resolve the fail high, that's also a bug. I see no reason to throw away information you worked hard to discover (the original fail high)...

It is quite easy to have bugs there, from experience...

If you do a parallel split at the root, it can be even trickier...

The difference with the process as you describe it is, as I tried to describe above that Stockfish after a Fail High will enter a Fail High loop staying at the same depth/iteration. Alpha stays the same but beta is increased. Now if on one of these passes the move fails low against alpha, because of search instability for instance, you must conclude it was a false fail high. At that moment the original best move with which you started the iteration, assuming it hasn't been replaced yet by something better, is your best choice again. The only thing is that Stockfish makes no mention of this. The only reason I can think of is that sending Fail Low PV is both inaccurate (it is a nullwindow search), costs a little time and a GUI might be in doubt which is the best move, the move that should be played if the time is up. I'm not sure how the UCI protocol resolves this exactly.

If there are many iterations with short intervals between them it is not a big deal and indeed the time saved can be of some theoretical influence in very short timecontrols. In correspondence chess it would be better to just send the Fail Low as that is readily available. The user can then conclude there was a move with a higher score at the same depth.

Maybe I'm missing something but Bob you should be able to find the relevant code in Stockfish quickly and see if you see something wrong. If Stockfish consistently or in more than 50% of the cases rejects better moves, maybe only when the transposition table is saturated, then at the least the Fail High loop could be improved upon...

Regards, Eelco

mcostalba · Post by **mcostalba** » Thu Mar 24, 2011 11:25 am

User is confused by a fail high (sent to GUI) followed by a fail low (not sent)

We will probably change SF behavior to the standard one followed by other engines, namely to send info to GUI only after any fail high / low is resolved.

Ancalagon · Post by **Ancalagon** » Thu Mar 24, 2011 3:25 pm

mcostalba wrote:User is confused by a fail high (sent to GUI) followed by a fail low (not sent)

We will probably change SF behavior to the standard one followed by other engines, namely to send info to GUI only after any fail high / low is resolved.

But Marco, would that not be throwing out the baby with the bathwater? In a typical analysis run, at high depths the times between updates of the PV can still easily be the better part of an hour (if not more), even with Fail High output as it is now, and the users would, with the proposed change, be completely in the dark what Stockfish is thinking about in the entire Fail High loop. I think it is well enough thought out as it is, but maybe improving search could mean there is less chance of a false Fail low in the Fail High loop, which is what you really want I think. This should really not be very hard (to improve), but it depends on what Uly is seeing is really true or if it was just bad luck in his fixed depth analysis. He better come up with a convincing example with a Fen to match, if that means he has to give up a running game or three, it is but for the greater good

Eelco

hyatt · Post by **hyatt** » Thu Mar 24, 2011 5:33 pm

Ancalagon wrote:
hyatt wrote:The behavior is _not_ correct. Perhaps you meant that what he observed actually happens? How can it be correct to get a fail high and then not play that move??? If you fail high and it is not a better move, that's a bug. If you fail high but play the original move because you didn't have time to resolve the fail high, that's also a bug. I see no reason to throw away information you worked hard to discover (the original fail high)...

It is quite easy to have bugs there, from experience...

If you do a parallel split at the root, it can be even trickier...
The difference with the process as you describe it is, as I tried to describe above that Stockfish after a Fail High will enter a Fail High loop staying at the same depth/iteration. Alpha stays the same but beta is increased. Now if on one of these passes the move fails low against alpha, because of search instability for instance, you must conclude it was a false fail high. At that moment the original best move with which you started the iteration, assuming it hasn't been replaced yet by something better, is your best choice again. The only thing is that Stockfish makes no mention of this. The only reason I can think of is that sending Fail Low PV is both inaccurate (it is a nullwindow search), costs a little time and a GUI might be in doubt which is the best move, the move that should be played if the time is up. I'm not sure how the UCI protocol resolves this exactly.

I do not believe in the concept "false fail high". That's a convenient excuse for a bug. How can a position fail high on several attempts to raise beta, and then fail low? Only one reason. A TT entry has a fail-high result stored, with sufficient draft to be used in this search. And it will keep telling you "score is >= this bound" which will make you fail high until beta is bumped about the TT bound. Then you might not be able to search deeply enough to see the actual best score due to time constraints or TT overwrites. But that fail high is _not_ false. It is dead correct and you should play that move instead of the one already found best.

I continue to be amazed at how many ugly solutions there are to problems that don't exist if care is taken. "Don't store actual mate scores". "Don't store Draw scores." "Don't store null-move search scores". Etc. If my search failed high, and then low, and the move was actually bad, I would be looking to see what I am doing wrong in the search. How could one justify the concept that "a fail high is wrong if you get a fail low, at the root" where in the normal search, we take a fail-high every time it happens. That's ridiculous.

If there are many iterations with short intervals between them it is not a big deal and indeed the time saved can be of some theoretical influence in very short timecontrols. In correspondence chess it would be better to just send the Fail Low as that is readily available. The user can then conclude there was a move with a higher score at the same depth.

Maybe I'm missing something but Bob you should be able to find the relevant code in Stockfish quickly and see if you see something wrong. If Stockfish consistently or in more than 50% of the cases rejects better moves, maybe only when the transposition table is saturated, then at the least the Fail High loop could be improved upon...

Regards, Eelco

It is not necessarily a function of the fail high code. But it does suggest that something _ugly_ is going on somewhere in the search, if a fail high is not a fail high. You can certainly get a fail high and not be able to resolve it. One example:

You ponder for a very long time, and at a key position, you discover score >= 1.6... You store that with a draft of 25. You ponder the wrong move so you start over with the real move. And you quickly find this same position at the root or somewhere below the current branch. You fail high because the draft is way bigger than what you need. But when you relax beta, you have no chance to actually resolve the big score as your depth is not deep enough. So you fail low. But this move is _still_ the correct one to play. If you can get to the same depth as before, you will finally resolve the fail high and see the real score now that you have enough depth. But if time runs out, you either have to trust the TT entry, or throw the entire TT out. I'm not throwing the TT out. And I am not giving up on a "better move" just because I can't resolve the score due to time constraints.

hyatt · Post by **hyatt** » Thu Mar 24, 2011 5:40 pm

mcostalba wrote:User is confused by a fail high (sent to GUI) followed by a fail low (not sent)

We will probably change SF behavior to the standard one followed by other engines, namely to send info to GUI only after any fail high / low is resolved.

I don't think this is standard. I display a fail high always, because it is a better move. I do, on occasion, get a 1-move PV when the search fails high and then fails low due to the previously explained TT anomaly.

Uly · Post by **Uly** » Thu Mar 24, 2011 11:36 pm

mcostalba wrote:We will probably change SF behavior to the standard one followed by other engines, namely to send info to GUI only after any fail high / low is resolved.

Please consider making this behavior optional, as I've said, the move that fails high and then low tends to be the best move most of the time.

Ancalagon · Post by **Ancalagon** » Sun Mar 27, 2011 5:06 pm

Jeremy Bernstein wrote:Uly, send me some positions and whatever additional info you can give me via PM. I can try to isolate the issue (and I won't put your Corr games in jeopardy, promise).

I can say what I do in Rainbow Serpent that may help a little. I just let alpha rise a little, including oldalpha, when beta is raised too, so that when there is a Fail Low against alpha, this is the result of a nullwindow search out of the searchwindow at the bottom but because it is an upper limit, it may still be a little bit better than the best score i.e. the original alpha at the start of the Fail High loop. (Because you are inside the fail high loop you can choose to do different things there that will not have much impact on overall performance, you you could think of a variety of things, fractional plies added to the search etc. But adding depth to the PV search is inadvisable because then at the next iteration the PV search will just return a hash result that is deep enough, and this disturbs iterative deepening.)

The Aspiration Deltas are calculated a bit differently. This only indirectly would have some effect on Fail Highs, but you never know if it helps. Instead of

Code: Select all

        // Calculate dynamic aspiration window based on previous iterations
        if (MultiPV == 1 && Iteration >= 6 && abs(ValueByIteration[Iteration - 1]) < VALUE_KNOWN_WIN)
        {
            int prevDelta1 = ValueByIteration[Iteration - 1] - ValueByIteration[Iteration - 2];
            int prevDelta2 = ValueByIteration[Iteration - 2] - ValueByIteration[Iteration - 3];

            AspirationDelta = Max(abs(prevDelta1) + abs(prevDelta2) / 2, 16);
            AspirationDelta = (AspirationDelta + 7) / 8 * 8; // Round to match grainSize

            alpha = Max(ValueByIteration[Iteration - 1] - AspirationDelta, -VALUE_INFINITE);
            beta  = Min(ValueByIteration[Iteration - 1] + AspirationDelta,  VALUE_INFINITE);
        }

I have

Code: Select all

        // Calculate dynamic aspiration window based on previous iterations
        if (MultiPV == 1 && Iteration >= 6 && abs(ValueByIteration[Iteration - 1]) < VALUE_KNOWN_WIN)
        {
            int prevDelta1 = ValueByIteration[Iteration - 1] - ValueByIteration[Iteration - 2];
            int prevDelta2 = ValueByIteration[Iteration - 2] - ValueByIteration[Iteration - 3];
            int prevDelta3 = ValueByIteration[Iteration - 3] - ValueByIteration[Iteration - 4];


            AspirationDelta = Max(abs(prevDelta1) + abs(prevDelta2) / 2 + abs(prevDelta3) / 4, Iteration);
            AspirationDelta = (AspirationDelta + 7) / 8 * 8; // Round to match grainSize (Still necessary?)

            alpha = Max(ValueByIteration[Iteration - 1] - AspirationDelta, -VALUE_INFINITE);
            beta  = Min(ValueByIteration[Iteration - 1] + AspirationDelta,  VALUE_INFINITE);
        }

And in the Fail High loop the original code was

Code: Select all

                // Prepare for a research after a fail high, each time with a wider window
                beta = Min(beta + AspirationDelta * (1 << researchCountFH), VALUE_INFINITE);
                researchCountFH++;

            } // End of fail high loop

with changes which at moment lets alpha rise a bit in Rainbow Serpent by

Code: Select all

                
                // Prepare for a research after a fail high, each time with a wider window
                oldBeta = beta;
                beta = Min(Max(beta + AspirationDelta * (1 << researchCountFH), value), VALUE_INFINITE);
                Value newAlpha = alpha + (beta - oldBeta)/(2 + researchCountFH);
                if (newAlpha > VALUE_DRAW && alpha <= VALUE_DRAW) newAlpha = VALUE_DRAW;
                oldAlpha = alpha = newAlpha;

                researchCountFH++;

            } // End of fail high loop

The difficulty is testing whether this has the desired effect. If you could tackle the problem at the transposition table level, if that is indeed the cause, I have trouble visualizing how that could be but also I would not know how to test that. So this is just one example of what you could try, and repeated Fail Highs are rare enough that you could do some fancy things without making the search immediately slower.

Eelco

Uly · Post by **Uly** » Mon Mar 28, 2011 8:53 am

Uly wrote:
Ancalagon wrote:It is just a bit complicated going back to the old PV and send it to the GUI when there is bound to be a new one from the next iteration anyway, assuming no other move fails high, the trouble is with fixed depth there is no new iteration... So that is confusing if you rely on this.
It also happens at fixed time per move.

Now confirmed to happen at infinite analysis, this is unrelated to play mode. Stockfish has a move failing high, abandons it, and it turns to be best (or, better than what Stockfish chose). I have sent a position to Jeremy.

mcostalba's solution is not good, it would hide the best move from analysis! At least now the user can see the false fail high and force it (a false "false fail high"?)

hyatt · Post by **hyatt** » Mon Mar 28, 2011 3:57 pm

This is a particular form of bug fixing called "treat the symptom rather than treating the bug".

false fail-highs should not occur...

OpenChess

OpenChess

Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?

Re: Strange Stockfish behavior?