Shogi Bonanza

Code, algorithms, languages, construction...
BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am


Post by BB+ » Tue Jan 31, 2012 7:06 am

As some people know, Bonanza is a (very strong) open-source Shogi program. Here are some relevant comments about its use/adaptation in other Shogi programs.

First, from 2009, by Reijer Grimbergen (Spear author): http://www.teu.ac.jp/gamelab/SHOGI/CSA2009/19csa.html
For a long time, I had been planning to skip this year's CSA tournament. I was really fed up with spending a huge amount of time tuning the evaluation function of Spear in the months running up to the tournament. Bonanza had opened my eyes (and those of many others) to a new method for automatic optimization of the evaluation function, which would basically end these fruitless hours of adding a few points here and subtracting a few points there. As I saw it, adding learning to Spear would be the only way to improve the program enough to warrant an entry.

The problem was that everybody seemed to know how to implement Hoki's learning method except me. Of course I had read the papers, but I still didn't have a full enough grasp of the method to program it. The initial plan was to use the summer to try and implement Hoki's learning method, but at that time it became clear that I would start a new job from April 1st 2009, and work just seemed to snowball from there. During the summer, I generated a new opening book and set up a new evaluation function for Spear using hashed patterns, but there was no time to try and implement Bonanza's learning method.
[...]
Then early February, just a few days before my 42nd birthday, I got an early present: the Bonanza source code was released! This was really a gift, because what I already suspected turned out to be true: the inner workings of Spear and Bonanza are very similar, in general based on the Crafty chess program. Therefore, for me the Bonanza source code was very easy to understand. Because of this, it took me less than two weeks (not full time, just a couple of hours in the evening) to get the learning code to work. As a result, I had more than two months to optimize the new evaluation function I had made. This got me excited, because I was now in full preparation of moving and the idea of Spear getting stronger while I was packing was almost too good to be true.

Unfortunately, things didn't go as well as hoped. The learning program did its job, but the evaluation function I had made turned out to be unable to cover all the important aspects of shogi positions. The Bonanza method is based on the assumption that evaluation function features can be updated in such a way that the program will learn to select the moves that professionals played, but even after long learning sessions there were just too many positions where the professional move was not selected by Spear. I had a version of Spear that just used the evaluation features of Bonanza, and this version kept beating the other versions into the ground. This was the situation just a couple of days before moving from Yamagata to Tokyo (end of March). What to do?
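The supervised-learning idea Grimbergen describes, tuning weights so the professional's move outranks its siblings, can be sketched roughly as below. This is a minimal illustration of comparison training with made-up names and a plain linear evaluation; Hoki's actual method scores moves via shallow searches and uses a more refined objective, so treat this purely as an assumption-laden sketch of the core gradient step.

```python
import math

def sigmoid(x, k=0.01):
    """Squashing function; k controls how sharply score gaps are penalized."""
    return 1.0 / (1.0 + math.exp(-k * x))

def evaluate(weights, features):
    """Linear evaluation: dot product of feature counts and their weights."""
    return sum(weights[i] * v for i, v in features.items())

def training_step(weights, position, learning_rate=1.0, k=0.01):
    """One gradient step on one position (illustrative, not Hoki's code).

    `position` is (pro_features, [sibling_features, ...]): feature vectors
    after the professional's move and after each alternative legal move.
    Loss = sum over siblings of sigmoid(s_sibling - s_pro), which shrinks
    as the professional's move comes to outscore the alternatives.
    """
    pro_features, sibling_feature_list = position
    s_pro = evaluate(weights, pro_features)
    for sib_features in sibling_feature_list:
        s_sib = evaluate(weights, sib_features)
        g = sigmoid(s_sib - s_pro, k)
        dloss = k * g * (1.0 - g)  # derivative of sigmoid(s_sib - s_pro)
        # Push sibling features down, professional-move features up.
        for i, v in sib_features.items():
            weights[i] -= learning_rate * dloss * v
        for i, v in pro_features.items():
            weights[i] += learning_rate * dloss * v
    return weights
```

Iterated over a large database of professional games, steps like this gradually reshape the evaluation so that search prefers the moves professionals played, which is what made hand-tweaking individual feature weights unnecessary.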

In the end, I decided to further tune the original evaluation features of Bonanza, tweaking some of the learning parameters. After a few weeks of learning, this improved the program slightly over the original Bonanza evaluation features. I also added some of the hashed patterns for king defence to the evaluation and this was the version I entered in the tournament.
Conclusion
Three years after its shocking victory in the CSA tournament, Hoki's Bonanza program is still having a huge impact on computer shogi. The "Bonanza Method", a supervised learning method for tuning evaluation function features, is now employed by all the top programs in this year's tournament. GPS Shogi, Otsuki Shogi, Monju, KCC Shogi and Bonanza (of course) have all benefited from the Bonanza Method.
Another interesting aside was that only a few programs had actually used the code that Bonanza made public in February. Only Monju was using the full code, and most programmers seemed to take some pride in not even reading the Bonanza code. I read the code, and I can assure everyone that there are quite a few gems in there that I didn't have time to add to Spear but that I will use for next year's version. As said, Spear's code is very similar to Bonanza's, so this is not a difficult task.

I am sure that next year a lot of other programs will be using at least part of the Bonanza code. Bonanza will become the standard, and everybody can build their own ideas on top of it. This opens the door for different programs, and it will be interesting to see what impact this new wave of "Bonanza Children" will have on next year's tournament. I truly hope that Spear will be one of the programs benefiting from Hoki's generosity.
Monju, as can be read in the link, was a 6-way majority-randomised Bonanza, with the blessing of the author of the latter (and the tournament committee).
After the tournament...

It was only after the tournament finished that news reached me about the release of a new version of Bonanza. This version was released on May 1st, one day after my university closed for the holidays. Because I had no Internet at home yet, I had no access to this news. Therefore, Spear played the CSA tournament with the previous Bonanza feature file instead of the latest one. This made a big difference. Playing Spear on the floodgate server after the tournament with the new Bonanza evaluation features improved the program dramatically (an increase of more than 100 ELO points).
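For a rough sense of what a 100+ Elo gain means in match terms, the standard logistic Elo model maps an expected score p to a rating gap of 400 · log10(p / (1 − p)). The tiny sketch below (illustrative names, not anything from Spear or Bonanza) shows that 100 Elo corresponds to the new version scoring roughly 64% against the old one.

```python
import math

def elo_gap(score):
    """Rating difference implied by an expected score, 0 < score < 1,
    under the standard logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

print(round(elo_gap(0.64)))  # a ~64% score is about a 100 Elo gap
```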

It is unclear how much impact this would have had in the actual tournament. On the floodgate server, most programs play on hardware that is significantly slower than the hardware used at the CSA tournament. Furthermore, because the new Bonanza version was released so close to the CSA tournament, there would have been no time to try any improvements, in the learning or otherwise. Spear would have basically been a Bonanza clone, and playing it would have been quite strange considering that last year I vowed to avoid imitating Bonanza at all costs.
As can be seen here, the 2011 competition had 5 of about 35 entries using the Bonanza library. Reading the descriptions (the links in the far right column), one finds that many use a so-called Bonanza-like method for learning.
