SedatChess

As in chess tournaments and matches...
Sedat Canbaz
Posts: 1690
Joined: Wed Jun 21, 2023 6:29 am

Re: SedatChess

Post by Sedat Canbaz » Sun Sep 29, 2024 6:59 pm

UPDATE 2

Really sad... for example:
After checking the latest Error margin test more closely,
I found several games lost on time by SF-POLY 210924a.

Note also, about the former Champion SF-Poly 220723:
I have never noticed it losing any game on time...
What does this mean? Why have some of the latest engines
become worse (less stable)?
Dual nets cannot be the reason, because some
SF-based engines (with dual nets) never lose on time...
I really wonder. Any opinions on these issues?

By the way, the good news is that
so far Mr. Eduard's engines seem to be very stable.. great!
E.g. so far not a single game has been recorded as lost on time!!

OK... that's all for now... besides: all engines were played with Move Overhead: 400
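
For anyone who wants to count time losses in their own PGN files, here is a minimal sketch in Python. It assumes the GUI marks such games with a `[Termination "time forfeit"]` tag (CuteChess output appears to do this, but please verify against your own files) and that every game starts with an `[Event ...]` tag. The function names and the sample games are made up for illustration only.

```python
import re
from collections import Counter

# One PGN tag pair per line, e.g. [White "Engine A"].
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')


def loser_on_time(tags):
    """Return the name of the side that lost on time, or None.

    Assumption: the GUI marks such games with [Termination "time forfeit"].
    """
    if tags.get("Termination", "").lower() != "time forfeit":
        return None
    if tags.get("Result") == "1-0":
        return tags.get("Black")
    if tags.get("Result") == "0-1":
        return tags.get("White")
    return None


def count_time_forfeits(pgn_text):
    """Count games lost on time per engine in a multi-game PGN string."""
    losses = Counter()
    games = []
    tags = {}
    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if match is None:
            continue  # move text, comments, blank lines
        name, value = match.groups()
        # Assumption: every game starts with an [Event ...] tag.
        if name == "Event" and tags:
            games.append(tags)
            tags = {}
        tags[name] = value
    if tags:
        games.append(tags)
    for game_tags in games:
        loser = loser_on_time(game_tags)
        if loser is not None:
            losses[loser] += 1
    return losses


# Invented sample: two games, one of them lost on time by White.
SAMPLE_PGN = """
[Event "Test"]
[White "Engine A"]
[Black "Engine B"]
[Result "0-1"]
[Termination "time forfeit"]

1. e4 e5 0-1

[Event "Test"]
[White "Engine B"]
[Black "Engine A"]
[Result "1/2-1/2"]

1. d4 d5 1/2-1/2
"""

print(count_time_forfeits(SAMPLE_PGN))
```

To use it on a real file, read the file into a string first and pass that string to `count_time_forfeits`.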

Greetings

Post by Sedat Canbaz » Sun Sep 29, 2024 7:33 pm

Meanwhile,
I decided to quote one of my old postings/rankings...
viewtopic.php?f=4&p=34207#p34207

Who knows? Older stats may help here, because at that time
0 (zero) games were lost on time (based on 27750 games)
Sedat Canbaz wrote:
Thu Mar 07, 2024 8:27 am

Rank Name              Elo    +    - games score oppo. draws 
   1 Brainlearn 27    3780    2    2  2800   51%  3776   95% 
   2 CoolIris 11.80   3779    2    3  2700   50%  3776   96% 
   3 RapTora 2.3      3779    3    3  2400   50%  3776   96% 
   4 SF-PB 080124     3778    2    2  2800   50%  3776   96% 
   5 Raid v3.4        3778    2    2  2800   50%  3777   95% 
   6 Brainlearn 26.5  3778    3    3  2200   50%  3776   96% 
   7 Polyfish 140124  3778    2    3  2700   50%  3776   97% 
   8 CoolIris 11.90   3778    3    3  2100   50%  3777   96% 
   9 SF-PB 051123     3778    3    3  2200   50%  3776   96% 
  10 Patzer AI X256   3778    3    3  2100   50%  3776   96% 
  11 Eman 9.90        3778    2    2  2800   50%  3777   96% 
  12 DarkSisTer 8.50  3777    3    3  2000   50%  3776   97% 
  13 SF POLY 261123   3777    2    2  2700   50%  3776   96% 
  14 Killfish 231123  3776    3    3  2000   50%  3776   97% 
  15 Incognito 5 Pro  3776    2    2  2700   50%  3776   95% 
  16 Hazard 3.78      3775    3    3  2000   50%  3776   96% 
  17 Tactical 281023  3775    3    3  2000   50%  3776   97% 
  18 SF POLY 220723   3775    3    2  2700   50%  3777   95% 
  19 SunLight 3       3773    3    3  2100   50%  3776   95% 
  20 XTD 010723       3773    3    3  2100   50%  3776   96% 
  21 SpecTral 5.50    3773    3    3  2100   49%  3776   95% 
  22 AWOL Z11         3773    3    3  2100   49%  3776   95% 
  23 ShashChess 34.6  3771    3    3  1900   49%  3778   94% 
  24 Sawfish 2TC      3768    3    3  1500   49%  3776   94% 
Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, 30s+0.6s, Balsa, 64 Hash, 4-MEN
Note: The test started at 30s+0.5s but was later switched to 30s+0.6s
In other words, most of the current games were played at TC: 30sec + 0.6sec

GAMES:
https://mega.nz/file/D5ginASb#r0qRGjiBY ... mnuNINhpmk
Btw, if you need more data (as proof that all were stable...), just let me know, please...

For these reasons, I wish to say again and again:
Newer is not always better.. and not everything is as it seems!

Best,
Sedat

Homayoun
Posts: 1443
Joined: Tue Mar 21, 2023 4:57 pm
Real Name: Homayoun

Post by Homayoun » Sun Sep 29, 2024 9:02 pm

Sedat Canbaz wrote:
Fri Sep 27, 2024 9:51 pm
One more thing, once more:
Thanks to all engine authors as well, otherwise I would not run...
Special thanks to the authors of Cfish, Berserk, Spectral, Shashchess!

One more note:
Unfortunately, all the Eman series engines I tested crashed
and were buggy on my tournament machine when using the Eman exp file (1.5+ GB).
It seems the Eman engine needs serious optimization for modern machines!
At least when using HUGE exp files, e.g. in Gauntlet mode etc.
These crashes appear at the beginning of the test: Cutechess
terminates directly... really sad.. we are in 2024.. but you see...

But the good news is that
Spectral is Spectral... this great engine is stable with the Eman exp file as well!!
It does not matter whether in Gauntlet, Round-Robin etc. Well done to Mr. Anton

And keep up the great work!

Greetings )
Agree completely. ;)

Post by Homayoun » Sun Sep 29, 2024 9:08 pm

Sedat Canbaz wrote:
Sun Sep 29, 2024 12:55 pm
Homayoun wrote:
Sat Sep 28, 2024 7:04 pm
Thank you very much, Sedat. A very useful test for chess fans who want to use these new engines. Besides strength, though, an engine's playing style is also a very important factor when a chess player chooses an engine for daily practice and analysis.
All the best
Hello Homayoun,

Not at all.. it's my pleasure )

A good point.. as you said,
besides analysis, authors may also use Top engines
for newer and stronger Book / Exp move ideas!

Another important issue is
to check/test which engines are stronger as well..
You know.. without tests and facts it is pointless..
I mean relying only on comments.. or on a small number
of games.. For these reasons many games are required..
And that is not all.. against various opponents as well... but
unfortunately not many follow what I mention here )

Btw, as soon as possible I will share new results, where
the picture of the ERROR margin will be clearer,
both when running a small number of games (per player)
and when we increase the number of games...

So please stay tuned.. )

Greetings
I agree completely. (My previous post was misplaced; I wanted to agree with this content.) ;)

Post by Sedat Canbaz » Sun Sep 29, 2024 11:52 pm

Homayoun wrote:
Sun Sep 29, 2024 9:08 pm
I agree completely. (My previous post was misplaced; I wanted to agree with this content.) ;)
Thank you too

Post by Sedat Canbaz » Fri Oct 04, 2024 10:37 am

Hello Chess Friends,

As usual, I'm very pleased to announce that
I managed to organize another new championship!
And what's new: each book played 5600 games!!
I think that they deserve more... but
that's the best I can do.. at least for now!

Some notes about the participating Top books:
they are the winners of the SIZE tours: Small / Medium / Large / Giant.
I know it is not entirely fair... but anyhow, I think
it's not such a bad idea to let them fight each other, right? ) If nothing
else, mainly for fun.. What more can I add? A lot of things, but there is no
free time for everything. Besides, Messi's old one (by Mr. Angel) proves
again to all of us to be the strongest under these conditions!!
Of course I'm very impressed by the rest of the Top books too, for example:
a super strong performance by SENTINEL 2409 despite its very
small size, plus its DrawRatio is the lowest, just 89%.
Geralt is the only public one, plus small + old.. so it is nothing
strange that it ranked last... but in the 7th tour (via Cfish..)
Geralt is Geralt: it managed to take 3rd place... really good!

Another very important point is that
I decided to run many separate tours, played by various engines!
With this testing method the influences, e.g. of the
Error margin, are now much clearer, and that is not all: we can compare Eng/Book Draw
records as well... For more notes I suggest reading 'More Details'

XXXVI's GRAND Champion: Chucaro - Congrats to Angel Morano!!
My congratulations to the authors of all the other former Champions as well!

For More Details, Full Standings etc:
https://sites.google.com/site/computers ... k-nn-cs-36

GAMES:
https://mega.nz/file/OppjxDTI#ZusW5Fi7K ... 8T3g-ho9io

That's all for now...thanks for your interest...

Best Regards,
Sedat Canbaz

Post by Sedat Canbaz » Fri Oct 04, 2024 2:50 pm

UPDATE

A new STAR is born, and it belongs among the brightest ones!
The name of this great star is SF-PB 220324 SC:
a super strong engine, plus so far less drawish
than all other tested engines that are close to 3800+.
That really means a lot.. especially for book tours!
One thing is missing in SF-PB: it cannot use CTG books..
But no work is perfect, and we have to be satisfied..

Meanwhile, just to be clear:
SF-PB 220324 SC = SF-PB with nn-b1a57edbea57.nnue

And here are the latest NN strength results:

SF-PB 220324 SC vs SF-CTG 150724: 9 Elo difference
More games are needed here for accurate metrics..
                     
1   SF-PB 220324 SC  +31/-16/=553 51.25%  307.5/600
2   SF-CTG 150724    +16/-31/=553 48.75%  292.5/600

DrawRatio is normal: 92%, since the games were played from strong lines
-------------------------------------------------------

Default (nn-1ceb1ade0001.nnue) Vs SC (nn-b1a57edbea57.nnue)

SF-PB 220324 SC Vs SF-PB 220324 Def: 0 Elo difference
In short: just great, as we see they are identical in strength
                   
           
1   SF-PB 220324 Def  +22/-21/=879 50.05%  461.5/922
2   SF-PB 220324 SC   +21/-22/=879 49.95%  460.5/922


DrawRatio is high: 95%, but here it seems nn-1ceb1ade0001
played a serious role in producing more draws.. because
in SC's test against itself the draws were 92%

And here is the mentioned SF-PB 220324 SC Draw Test:
Note: played against itself, i.e. with two copies of the same SF-PB engine
                     
1   SF-PB 220324 SC  +23/-20/=557 50.25%  301.5/600
2   SF-PB SC (Copy)  +20/-23/=557 49.75%  298.5/600

----------------------------------------------------

Last test: vs Brainlearn 28.1, which has the CTG feature!
Their strength is almost the same.. not so bad!
That means that, if needed, CTG books will be played under
fairer conditions.. since strength matters a lot!
                     
1   SF-PB 220324 SC  +19/-17/=756 50.13%  397.0/792
2   Brainlearn 28.1  +17/-19/=756 49.87%  395.0/792

Btw, here the draw ratio is high: 95%, but sometimes
not everything is in my hands.. I will see what I can do
to get lower draw percentages, but
when running SF-PB 220324 SC (for all books), the
recent XXXVI CS already proved it to be less drawish than
all Top engines that are close to 3800 Elo points!!

Conditions:
2x Epyc 7B12, CuteChess, 1 Core, Ponder OFF, 30s+0.6s, Balsa/Unique, 64 Hash, 4-MEN
GAMES:
https://mega.nz/file/OlB0HDxS#_PWsDKR9Z ... uXCyorOcdw
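
The Elo differences and draw ratios quoted in these tests can be reproduced from the +W/-L/=D figures. A minimal sketch in Python, assuming the standard logistic Elo model (expected score = 1 / (1 + 10^(-diff/400))); the function name `match_stats` is only illustrative.

```python
import math


def match_stats(wins, losses, draws):
    """Return (score, draw_ratio, elo_diff) from one engine's point of view."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    draw_ratio = draws / games
    # Inverse of the logistic Elo model; undefined for a 0% or 100% score.
    elo_diff = -400.0 * math.log10(1.0 / score - 1.0)
    return score, draw_ratio, elo_diff


# First test above: SF-PB 220324 SC +31/-16/=553 against SF-CTG 150724.
score, draw_ratio, elo = match_stats(31, 16, 553)
print(f"score {score:.2%}, draws {draw_ratio:.0%}, Elo diff {elo:+.1f}")
```

For the first test this gives a score of 51.25%, a draw ratio of about 92% and an Elo difference of roughly 8.7, which rounds to the 9 Elo quoted.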

Meanwhile, I'd be happy if the programmers also did their
best to produce fewer draws.. Reminder:
I am just a simple Tester/TD here... no more, no less... !)

And as a last note,
I've tested many more engines.. but they were left out.
The reason: they are more drawish.. and that is not all..
Some are not so stable.. e.g. time forfeits (rarely, but..)
or crashes in Gauntlet.. and it seems they need some
optimization for fast + modern hardware, if nothing
else for 2x EPYC 7B12 (with 256 Threads / 128 Cores).
But the good news is that the currently tested Top engines
are stable.. at least so far!! You know, it's not easy.. e.g.
playing at Bullet (30s+0.6s) + High Concurrency (64)!
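
For readers who want to try similar conditions from the command line, here is a rough sketch in Python that only builds a cutechess-cli argument list (it does not start anything). The engine names and paths, the opening file name, its format and the number of rounds are placeholders, not the files used in these tests. It also assumes that every engine accepts the UCI options `Hash`, `Threads` and `Move Overhead`. Pondering is left at the cutechess-cli default, and tablebase settings are not covered.

```python
def build_cutechess_args(engines, openings, pgn_out,
                         rounds=500, concurrency=64, opening_format="pgn"):
    """Build a cutechess-cli argument list (as used with subprocess).

    `engines` is a list of (name, path) pairs; all values are placeholders.
    """
    args = ["cutechess-cli"]
    for name, path in engines:
        args += ["-engine", f"name={name}", f"cmd={path}"]
    args += [
        # Settings applied to every engine: 30s + 0.6s, 64 MB hash, 1 thread.
        "-each", "proto=uci", "tc=30+0.6",
        "option.Hash=64", "option.Threads=1",
        "option.Move Overhead=400",
        # Each opening is played twice with colours reversed.
        "-openings", f"file={openings}", f"format={opening_format}",
        "order=random",
        "-games", "2", "-rounds", str(rounds), "-repeat",
        "-concurrency", str(concurrency),
        "-pgnout", pgn_out,
    ]
    return args


# Hypothetical example values.
example = build_cutechess_args(
    [("Engine A", "./engine_a"), ("Engine B", "./engine_b")],
    openings="openings.pgn",
    pgn_out="games.pgn",
)
print(" ".join(example))
```

Passing the list to `subprocess.run` avoids shell quoting problems with option names that contain a space, such as `Move Overhead`.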

Thanks for reading and have a nice weekend )

Greetings

Post by Sedat Canbaz » Fri Oct 04, 2024 3:40 pm

UPDATE 2

Just one more test...

SF-PB 220324 SC vs Rems EXP 160824: +6 Elo (in favor of SC)

1   SF-PB 220324 SC  +50/-34/=916 50.80%  508.0/1000
2   RemsEXP 160824   +34/-50/=916 49.20%  492.0/1000
On the other hand, I am slightly surprised here.. normally newer
should be better (I mean the newer ones ought to be stronger...)

Btw, as you can see, this time
both Top engines produced the lowest draw ratio: 91%, great!!

Note also that
Rems EXP played without Eng Learning (like all other engines),
plus the same conditions were used for all (such as 30s+0.6s etc.)
Be aware that all played games are included in the previous post..
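
To get a feeling for the error margin of a single 1000-game match like this one, here is a minimal sketch in Python. It assumes independent games and a normal approximation for the mean score, which is a simplification (paired openings, for example, are ignored); the function names are only illustrative.

```python
import math


def elo_from_score(score):
    """Logistic Elo model: Elo difference for an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, losses, draws, z=1.96):
    """Return (low, mid, high) Elo difference; z=1.96 gives about 95%."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    margin = z * math.sqrt(variance / games)
    return (
        elo_from_score(score - margin),
        elo_from_score(score),
        elo_from_score(score + margin),
    )


# SF-PB 220324 SC vs RemsEXP 160824: +50/-34/=916
low, mid, high = elo_interval(50, 34, 916)
print(f"Elo diff {mid:+.1f}, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

Under these assumptions the interval for +50/-34/=916 reaches from slightly below zero to about +12 Elo, so a 6 Elo gap after 1000 games is still close to the noise level.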

Best,
Sedat

Post by Sedat Canbaz » Sat Oct 05, 2024 12:26 pm

Hello there,

First of all, I'd just like to inform you that
Cutechess 1.2.0 is one of the best/most stable GUIs!
At least it's more stable than Cutechess 1.3.1

Why do I say that...? According to my tests, of course!
At least, all the latest results indicate this!
Depending on the Chess GUI, we may see instability
issues... Of course I am not going to re-test all GUIs
or list them all.. but at least I wish to say that
Cutechess 1.3.1 is quite sensitive, not so stable!
At the same time, I am not going to share all the previous
SCCT GUI tests.. but I have some experience here too,
not much.. but some... and we have also noticed that
not all tested engines or tested GUIs are very stable!
Of course it all depends... and the GUI is not always the reason!
Sometimes the engines can be the reason too... you know,
there is no fixed formula for these stability issues!

And to be clearer (about all the recent time forfeits),
especially with High Concurrency games + Bullet TC:
I started to worry a lot that they would appear again..)
The latest SCCT tours were played under Cutechess 1.3.1,
but before (e.g. several months ago..) I used
mainly the Cutechess 1.2.0 GUI, and in those times the games
were much more stable; at least I can't remember
frequent time losses by the latest Top engines... In the coming days,
weeks, months... the picture will become clearer... in short,
time will tell...

Btw, after re-testing under the Cutechess 1.2.0 GUI some Top engines
that had lost on time under Cutechess 1.3.1,
all of these time forfeits disappeared... really good news!
So much trouble.. sorry.. Trying a newer GUI version sounds
good.. but it isn't always.. at least not with the newer Cutechess series!
And now I also wonder: who will pay my electricity bills? )
Just joking...))

As final words,
not all, but most engines are stable under 1.3.1 too!
But it is also true that some of the Top chess engines
suffer under Cutechess 1.3.1, at least on my tournament
machine.. because all these time forfeits cannot be
explained by engine bugs alone... In other words,
what about GUI bugs? Are all GUIs that stable?
I highly doubt it.. because from my many GUI tests,
from past to present, I am perfectly aware that
some Chess GUIs can play a seriously bad role as well..

Note also that
I tried some of the latest Cutechess pre-releases too..
but they are not so stable either.. I mean, a similar story (
Rarely, of course.. but the same engines produce time losses etc.,
whereas under Cutechess 1.2.0 everything seems stable so far!!
If nothing else.. with the Top engines I have tested so far...

And please stay tuned... as soon as possible I hope to
share new tests, this time played under Cutechess 1.2.0

Greetings

Post by Sedat Canbaz » Sat Oct 05, 2024 2:50 pm

UPDATE

First of all,
not a single game was lost on time.. in short, just great!
The reason is very likely that Cutechess 1.2.0 had a BIG
influence... I see no other explanation for that...

And for anyone who missed it, once more:
all previous tests and tours were played under Cutechess 1.3.1,
but now under Cutechess 1.2.0.. so the latest tests were done
only for comparison.. you know.. I'm an eng/GUI doctor too )

On the other hand,
sorry to say.. but it's a pity that, besides Engines and Books,
I have now started testing GUI stability and influence too..
Actually nothing new... sometimes I run GUI tests..
and in this way we can compare the influence of GUIs too!

Yes, we are in 2024, but unfortunately there are still many bugs!
I don't know whether all the recent ones are Engine or GUI bugs..
But one thing is true: under these new test conditions
all worked flawlessly... so it seems Chess GUIs can
play a serious and important role in the results!

1st test, this time played under CuteChess 1.2.0:

Rems EXP vs SF-PB SC: 1 Elo difference (almost identical)

1   SF-PB 220324 SC  +42/-40/=918 50.10%  501.0/1000
2   RemsEXP 160824   +40/-42/=918 49.90%  499.0/1000
Draw Ratio is the lowest again, just 91% (as in the previous test)
But it is also true that this test is more reliable..!!
I mean regarding their strength, performance etc.
Btw, very likely Cutechess 1.3.1 did not like Rems much ))
As you may know, in the previous test there was a 6+ Elo diff.
---------------------------------------------------------
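
One way to check whether the change from about 6 Elo (under Cutechess 1.3.1) to about 1 Elo (under 1.2.0) is larger than ordinary sampling noise is a simple two-sample z-test on the match scores. This is only a sketch in Python under the assumption of independent games and a normal approximation; it says nothing about time forfeits themselves, and the function names are only illustrative.

```python
import math


def score_and_se(wins, losses, draws):
    """Return the mean score and its standard error for one match."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * score ** 2
    ) / games
    return score, math.sqrt(variance / games)


def z_between_runs(run_a, run_b):
    """z-statistic for the score difference between two independent runs."""
    score_a, se_a = score_and_se(*run_a)
    score_b, se_b = score_and_se(*run_b)
    return (score_a - score_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# SF-PB 220324 SC vs RemsEXP 160824, given as (wins, losses, draws) for SC:
run_131 = (50, 34, 916)  # earlier test, played under Cutechess 1.3.1
run_120 = (42, 40, 918)  # this test, played under Cutechess 1.2.0
z = z_between_runs(run_131, run_120)
print(f"z = {z:.2f}")
```

A value of |z| below roughly 2 means the two runs are compatible with the same underlying strength at the usual 95% level, so with 1000 games per run a shift of a few Elo can also come from chance alone.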

2nd test: also played under CuteChess 1.2.0

TR vs SC: 5 Elo difference (in favor of SC)

1   SF-PB 220324 SC  +43/-29/=928 50.70%  507.0/1000
2   Artemis TR       +29/-43/=928 49.30%  493.0/1000
Draw Ratio is a little high: 93%, but acceptable..
since there are engines that produce many more draws!
But again a surprise.. the older one beat the newer.. strange!

----------------------------------------------------------

3rd test: also played under CuteChess 1.2.0
Actually, since today everything is played under v1.2.0

SF-POLY vs SF-PB SC: 3 Elo diff (in favor of SF-POLY)
Oh... finally I feel much better... otherwise I'd have switched
to yet another older GUI )), because until this test SC
managed to beat almost all newer releases...!)

1   SF-POLY 210924a  +37/-29/=934 50.40%  504.0/1000
2   SF-PB 220324 SC  +29/-37/=934 49.60%  496.0/1000
------------------------------------------------------

SF-PB SC DRAW Test (also under CuteChess 1.2.0)

High draw values this time... as we see, a 95% draw ratio

1   SF-PB SC (Copy)  +16/-14/=570 50.17%  301.0/600
2   SF-PB 220324 SC  +14/-16/=570 49.83%  299.0/600
GAMES:
https://mega.nz/file/uhhkkBLZ#NEB14lQYg ... FDhp6IhStA

And as last notes,
I've run many more tests, but they don't need to be
shared... at least now we know that all is stable so far!

On the other hand, as I stated earlier:
especially since the NN era, we see many more bugs..
Of course I admit that we see more engine strength too.. and
who knows about tomorrow? What is waiting for us?
But after all,
let's hope only for good news.. otherwise:
be ready and face your future with no fear!
And just do not make the same mistake twice...

That's why, once more,
I have a very small request to all ENG/GUI programmers:
please be sure to run serious beta tests (before final releases)!
Thanks in advance.. if not, why do I have to check each time? )
It is your turn... and I wish you good luck...

Best,
Sedat
