You can find more info here:
http://www.inwoba.de
Have fun
Ingo
Komodo 5 running for the IPON
Ponder ON rating list: http://www.inwoba.de
Re: Komodo 5 running for the IPON
Sorry for the delay, but the result for Komodo 5 is finally online.
http://www.inwoba.de
The IPON-RRRL will be updated soon.
Bye
Ingo
Re: Komodo 5 running for the IPON
Just to note my agreement with MB's Ordo:
I think the only difference is that I estimate the White advantage (42.17), while I guess Ordo does not. As colours are equalised, this has little overall effect -- I get a rating higher by 1.1 Elo at the extreme. Note that I agree with all the "close" orderings, including DR4.1/DR4/Critter12 [with the 0.6 and 0.1 margins], Critter1.01 over SF 2.01 by 0.4, SF 1.91 over Rybka 3 mp by 0.6 or so, Naum4.2 squeaking over DF13, the latter 0.4 above Chiron 1.1a, DJ13 better than DJ13.3 by 0.1 Elo, etc. So I guess we both implemented the same algorithm.
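To illustrate why the White advantage matters so little when colours are equalised, here is a minimal sketch. It assumes the usual logistic Elo curve with the advantage added to White's rating; that model and the function names are my assumptions for illustration, not code taken from Ordo or from BB+'s program.

```python
def expected_score(diff, white_adv=0.0):
    """Logistic expected score for a side that is `diff` Elo stronger.

    `white_adv` is added when that side has White and subtracted
    (pass a negative value) when it has Black. Assumed model.
    """
    return 1.0 / (1.0 + 10.0 ** (-(diff + white_adv) / 400.0))


def colour_balanced_score(diff, white_adv):
    """Average expected score over one game as White and one as Black."""
    as_white = expected_score(diff, white_adv)
    as_black = expected_score(diff, -white_adv)
    return 0.5 * (as_white + as_black)


if __name__ == "__main__":
    for diff in (0, 100, 250):
        plain = expected_score(diff)
        balanced = colour_balanced_score(diff, 42.17)
        print(f"diff={diff:3d}  no advantage: {plain:.4f}  "
              f"colour-balanced: {balanced:.4f}")
```

Under this assumption the two columns agree exactly for equal engines and drift apart only slightly as the rating gap grows, which is consistent with the small shift seen at the extremes of the list.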
One puzzlement to me is how Ordo estimates error, as H2 has (significantly) more games than H1.5, and about the same draw rate and win%, but less error in Ordo's output.
Code: Select all
STEP is 100.000000 Termination Point is 0.000100
Observed White advantage is 42.17
Normalising to "Deep Shredder 12" as 2800.00
RATING NAME RESULTS GAMES WIN% DRAW% ERROR
3042.92 Houdini 2.0 STD +3529 =1385 -486 5400 78.2% 25.6% 4.5
3034.78 Houdini 1.5a +2644 =1037 -319 4000 79.1% 25.9% 5.3
3026.81 Komodo 5 +1670 =936 -244 2850 75.0% 32.8% 5.7
3003.35 Critter 1.4a +2700 =1439 -311 4450 76.8% 32.3% 4.6
2999.25 Komodo 4 +2915 =1476 -459 4850 75.3% 30.4% 4.5
2990.56 Critter 1.6a +1587 =1259 -304 3150 70.4% 40.0% 4.9
2987.59 Komodo 3 +1646 =859 -295 2800 74.1% 30.7% 6.0
2979.01 Stockfish 2.2.2 JA +2667 =1695 -438 4800 73.2% 35.3% 4.3
2977.97 Deep Rybka 4.1 +3263 =2298 -639 6200 71.2% 37.1% 3.7
2977.38 Deep Rybka 4 +2810 =1634 -456 4900 74.0% 33.3% 4.2
2977.26 Critter 1.2 +1666 =1132 -302 3100 72.0% 36.5% 5.2
2975.52 Houdini 1.03a +2048 =944 -208 3200 78.8% 29.5% 5.7
2970.21 Komodo 2.03 DC +1583 =805 -312 2700 73.5% 29.8% 6.0
2961.35 Stockfish 2.1.1 JA +1797 =1259 -444 3500 69.3% 36.0% 4.9
2941.69 Critter 1.01 +1465 =1010 -325 2800 70.4% 36.1% 5.5
2941.34 Stockfish 2.01 JA +1707 =1078 -315 3100 72.5% 34.8% 5.3
2919.48 Stockfish 1.9.1 JA +1598 =1066 -336 3000 71.0% 35.5% 5.4
2918.86 Rybka 3 mp +2580 =1296 -324 4200 76.9% 30.9% 4.8
2911.48 Critter 0.90 +1712 =1231 -457 3400 68.5% 36.2% 4.9
2903.72 Stockfish 1.7.1 JA +1654 =954 -292 2900 73.5% 32.9% 5.7
2859.65 Rybka 3 32b +894 =595 -211 1700 70.1% 35.0% 7.0
2843.82 Stockfish 1.6.x JA +1308 =969 -323 2600 68.9% 37.3% 5.6
2839.30 Komodo 1.3 JA +1335 =1222 -743 3300 59.0% 37.0% 4.9
2833.35 Naum 4.2 +3397 =3742 -2161 9300 56.6% 40.2% 2.8
2833.29 Deep Fritz 13 32b +954 =1335 -1011 3300 49.1% 40.5% 4.6
2832.88 Chiron 1.1a +1684 =1961 -1305 4950 53.8% 39.6% 4.0
2824.98 Critter 0.80 +1294 =1003 -503 2800 64.1% 35.8% 5.4
2819.40 Fritz 13 32b +1445 =1726 -1129 4300 53.7% 40.1% 4.2
2809.52 Komodo 1.2 JA +1444 =1462 -794 3700 58.8% 39.5% 4.6
2805.01 Rybka 2.3.2a mp +1466 =1413 -621 3500 62.1% 40.4% 4.6
2800.00 Deep Shredder 12 +3596 =3944 -2860 10400 53.5% 37.9% 2.8
2797.31 Hannibal 1.2 +921 =1423 -1256 3600 45.3% 39.5% 4.6
2795.13 Gull 1.2 +1926 =2293 -2081 6300 48.8% 36.4% 3.5
2791.59 Critter 0.70 +764 =686 -450 1900 58.3% 36.1% 6.5
2791.28 Gull 1.1 +1089 =1173 -838 3100 54.0% 37.8% 5.0
2789.61 Naum 4.1 +1009 =912 -379 2300 63.7% 39.7% 5.7
2788.90 Deep Sjeng c't 2010 32b +2121 =2832 -2347 7300 48.5% 38.8% 3.3
2784.95 Komodo 1.0 JA +1154 =1205 -541 2900 60.6% 41.6% 5.0
2782.31 Spike 1.4 32b +1785 =2467 -2148 6400 47.2% 38.5% 3.5
2778.05 Deep Fritz 12 32b +2069 =2399 -1832 6300 51.9% 38.1% 3.5
2775.78 Naum 4 +1089 =1079 -532 2700 60.3% 40.0% 5.2
2775.18 Rybka 2.2n2 mp +895 =833 -372 2100 62.5% 39.7% 6.0
2766.73 Gull 1.0a +811 =886 -603 2300 54.5% 38.5% 5.8
2761.98 Stockfish 1.5.1 JA +763 =731 -406 1900 59.4% 38.5% 6.4
2761.07 Rybka 1.2f +1147 =863 -390 2400 65.8% 36.0% 5.9
2756.84 Protector 1.4.0 +1684 =2406 -2410 6500 44.4% 37.0% 3.5
2756.62 spark-1.0 +1749 =2685 -2566 7000 44.2% 38.4% 3.4
2747.57 Hannibal 1.1 +1183 =1918 -1799 4900 43.7% 39.1% 4.0
2745.81 HIARCS 13.2 MP 32b +1691 =2463 -2646 6800 43.0% 36.2% 3.5
2745.05 Deep Junior 13 +796 =1313 -1491 3600 40.3% 36.5% 4.8
2744.96 Deep Junior 13.3 +634 =994 -1372 3000 37.7% 33.1% 5.4
2740.13 Fritz 12 32b +687 =808 -505 2000 54.5% 40.4% 6.2
2734.62 Quazar 0.4 +761 =1478 -1661 3900 38.5% 37.9% 4.5
2727.51 HIARCS 13.1 MP 32b +1068 =1333 -1199 3600 48.2% 37.0% 4.8
2726.31 Deep Junior 12.5 +1161 =1604 -2085 4850 40.5% 33.1% 4.3
2720.28 Deep Fritz 11 32b +494 =501 -305 1300 57.3% 38.5% 7.8
2710.05 Doch64 1.2 JA +491 =659 -450 1600 51.3% 41.2% 6.7
2708.64 spark-0.4 +849 =1218 -1033 3100 47.0% 39.3% 4.9
2707.78 Stockfish 1.4 JA +523 =652 -525 1700 49.9% 38.4% 6.7
2705.76 Zappa Mexico II +2945 =4250 -4505 11700 43.3% 36.3% 2.6
2704.65 Shredder Bonn 32b +726 =786 -688 2200 50.9% 35.7% 6.0
2693.29 Critter 0.60 +666 =812 -722 2200 48.7% 36.9% 6.0
2693.20 Protector 1.3.2 JA +1364 =1995 -1941 5300 44.6% 37.6% 3.8
2689.68 MinkoChess 1.3 +473 =1323 -1804 3600 31.5% 36.8% 4.8
2684.34 Deep Shredder 11 +922 =980 -798 2700 52.3% 36.3% 5.4
2681.51 Doch64 09.980 JA +424 =572 -504 1500 47.3% 38.1% 7.0
2674.05 Deep Junior 12 +809 =1094 -1697 3600 37.7% 30.4% 5.1
2673.75 Onno-1-1-1 +1064 =1718 -1518 4300 44.7% 40.0% 4.1
2673.47 Hannibal 1.0a +897 =1406 -1897 4200 38.1% 33.5% 4.5
2672.39 Deep Onno 1-2-70 +1421 =2771 -3508 7700 36.4% 36.0% 3.3
2672.02 Naum 3.1 +927 =1175 -898 3000 50.5% 39.2% 5.0
2671.59 Zappa Mexico I +774 =894 -532 2200 55.5% 40.6% 5.8
2670.57 Rybka 1.0 Beta +617 =813 -870 2300 44.5% 35.3% 5.9
2667.40 Spark-0.3 VC(a) +903 =1444 -1253 3600 45.1% 40.1% 4.5
2664.65 Onno-1-0-0 +347 =495 -358 1200 49.5% 41.2% 7.7
2662.34 Deep Sjeng WC2008 +1407 =2055 -2138 5600 43.5% 36.7% 3.8
2658.65 Toga II 1.4 beta5c BB +1707 =3097 -3496 8300 39.2% 37.3% 3.1
2657.30 Deep Junior 11.2 +725 =902 -1273 2900 40.6% 31.1% 5.7
2653.12 Strelka 2.0 B +866 =1825 -2809 5500 32.3% 33.2% 4.0
2649.36 Hiarcs 12.1 MP 32b +1366 =2123 -2111 5600 43.3% 37.9% 3.7
2647.32 Tornado 4.88 +408 =790 -1202 2400 33.5% 32.9% 6.2
2646.75 Deep Sjeng 3.0 +362 =479 -559 1400 43.0% 34.2% 7.7
2646.64 Umko 1.2 +458 =1117 -1725 3300 30.8% 33.8% 5.2
2635.64 Critter 0.52b +594 =1006 -1000 2600 42.2% 38.7% 5.4
2635.48 Shredder Classic 4 32b +581 =683 -536 1800 51.2% 37.9% 6.4
2625.45 Deep Junior 11.1a +673 =960 -1167 2800 41.2% 34.3% 5.6
2623.84 Naum 2.2 32b +323 =582 -395 1300 47.2% 44.8% 7.3
2618.83 Nemo 1.0.1 +299 =818 -1583 2700 26.2% 30.3% 6.0
2618.77 Umko 1.1 +510 =1272 -2118 3900 29.4% 32.6% 4.9
2616.51 Deep Junior 2010 +735 =950 -1415 3100 39.0% 30.6% 5.4
2615.60 Glaurung 2.2 JA +538 =979 -1083 2600 39.5% 37.7% 5.5
2615.51 Rybka 1.0 Beta 32b +301 =410 -389 1100 46.0% 37.3% 8.3
2610.54 HIARCS 11.2 32b +468 =718 -714 1900 43.5% 37.8% 6.3
2607.92 Fruit 05/11/03 32b +882 =1784 -1734 4400 40.3% 40.5% 4.1
2600.95 Loop 2007 +1139 =2634 -4127 7900 31.1% 33.3% 3.4
2597.68 Toga II 1.2.1a +388 =657 -555 1600 44.8% 41.1% 6.7
2597.12 Jonny 4.00 32b +673 =1433 -3094 5200 26.7% 27.6% 4.5
2593.30 ListMP 11 +506 =963 -1131 2600 38.0% 37.0% 5.6
2591.16 LoopMP 12 32b +351 =568 -581 1500 42.3% 37.9% 7.2
2589.40 Tornado 4.80 +328 =707 -1665 2700 25.2% 26.2% 6.4
2587.43 Deep Shredder 10 +1031 =1446 -1923 4400 39.9% 32.9% 4.4
2583.02 Twisted Logic 20100131x +607 =1066 -1827 3500 32.6% 30.5% 5.2
2578.38 Crafty 23.3 JA +592 =1397 -3211 5200 24.8% 26.9% 4.5
2560.51 Spike 1.2 Turin 32b +1087 =2525 -4088 7700 30.5% 32.8% 3.4
2536.25 Deep Sjeng 2.7 32b +217 =497 -686 1400 33.2% 35.5% 7.8
2524.90 Crafty 23.1 JA +470 =1064 -2266 3800 26.4% 28.0% 5.2
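For anyone checking the columns: the WIN% figure in the table is the score percentage (wins plus half the draws), not the share of won games. A small sketch that recomputes it for two rows copied from the table; the helper name is made up for this example.

```python
def score_and_draw_pct(wins, draws, losses):
    """Return (games, points, score %, draw %) for one engine."""
    games = wins + draws + losses
    points = wins + 0.5 * draws
    return games, points, 100.0 * points / games, 100.0 * draws / games


# Rows copied from the table above: name -> (wins, draws, losses)
ROWS = {
    "Houdini 2.0 STD": (3529, 1385, 486),
    "Komodo 5": (1670, 936, 244),
}

if __name__ == "__main__":
    for name, wdl in ROWS.items():
        games, points, score, draw = score_and_draw_pct(*wdl)
        print(f"{name:16s} {games:5d} games  {points:7.1f} points  "
              f"score {score:.1f}%  draws {draw:.1f}%")
```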
- Real Name: Miguel A. Ballicora
Re: Komodo 5 running for the IPON
BB+ wrote:
Just to note my agreement with MB's Ordo: I think the only difference is that I estimate the White advantage (42.17), while I guess Ordo does not. [...]

Ordo calculates the White advantage, and it was 41.1, but I have not yet released the version that does it. This is a reminder that I need to do a release.

BB+ wrote:
One puzzlement to me is how Ordo estimates error, as H2 has (significantly) more games than H1.5, and about the same draw rate and win%, but less error in Ordo's output.
Once the ratings are calculated, you assume they are the "truth", and the results of the games are randomly "simulated" from those ratings and the win probabilities they imply for each game. You do this n times, so you obtain n different ratings for a given engine, and the SD is calculated from these. This way you can estimate the error of any type of difference. The issue is that the errors obtained, which take the average of the pool as the reference, are generally not what one is interested in. What one wants most of the time is the error of the difference between two given engines, and this can vary tremendously. Ordo provides a matrix file with the errors for engineA-engineB, engineA-engineC, etc. The advantage becomes evident when there are, say, two pools of engines that played lots of intra-pool games but fewer inter-pool games. The error between two engines from different pools will be much bigger than the error between two engines from the same pool, as it should be.
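The procedure described above can be sketched roughly as follows. This is a toy illustration and not Ordo's code: it assumes a plain logistic model with no draws and no White advantage, fits ratings with a simple damped iteration I chose for brevity, and every name in it is hypothetical. The example pool has two pairs of engines with 1000 games inside each pair and only 50 games linking the pairs.

```python
import math
import random
import statistics

SCALE = 400.0 / math.log(10.0)  # Elo logistic scale (assumed model)


def expected(ra, rb):
    """Expected score of an engine rated ra against one rated rb."""
    return 1.0 / (1.0 + math.exp(-(ra - rb) / SCALE))


def fit(points, games, n, iters=500):
    """Find ratings (pool average 0) whose expected points match `points`."""
    r = [0.0] * n
    for _ in range(iters):
        exp_pts = [0.0] * n
        slope = [0.0] * n
        for (i, j), g in games.items():
            p = expected(r[i], r[j])
            exp_pts[i] += g * p
            exp_pts[j] += g * (1.0 - p)
            w = g * p * (1.0 - p) / SCALE
            slope[i] += w
            slope[j] += w
        # Damped Newton-like step per engine, then re-centre on the pool average.
        r = [ri + 0.5 * (pt - e) / s
             for ri, pt, e, s in zip(r, points, exp_pts, slope)]
        mean = sum(r) / n
        r = [ri - mean for ri in r]
    return r


def simulate(r, games, rng):
    """Replay every pairing once, treating the ratings r as the 'truth'."""
    pts = [0.0] * len(r)
    for (i, j), g in games.items():
        p = expected(r[i], r[j])
        wins = sum(rng.random() < p for _ in range(g))  # no draws in this toy
        pts[i] += wins
        pts[j] += g - wins
    return pts


def simulated_errors(r, games, n_sims, seed=1):
    """SD of each rating and of each pairwise difference over n_sims replays."""
    rng = random.Random(seed)
    n = len(r)
    runs = [fit(simulate(r, games, rng), games, n) for _ in range(n_sims)]
    rating_sd = [statistics.stdev([run[i] for run in runs]) for i in range(n)]
    diff_sd = {(i, j): statistics.stdev([run[i] - run[j] for run in runs])
               for i in range(n) for j in range(i + 1, n)}
    return rating_sd, diff_sd


# Engines 0,1 form one pool and 2,3 another; only 1 and 2 meet across pools.
TRUE = [40.0, 0.0, 0.0, -40.0]
GAMES = {(0, 1): 1000, (1, 2): 50, (2, 3): 1000}

rating_sd, diff_sd = simulated_errors(TRUE, GAMES, n_sims=200)
print("error vs pool average:", [round(x, 1) for x in rating_sd])
print("error 0-1 (same pool):", round(diff_sd[(0, 1)], 1))
print("error 0-3 (across pools):", round(diff_sd[(0, 3)], 1))
```

In this setup the cross-pool difference should come out with a much larger error than the same-pool difference, because it hinges on the 50 linking games.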
The discrepancy you noted in the errors arises because there were only 100 simulations, so the error estimates were not very precise. I ran 5000 simulations and got:
Code: Select all
# ENGINE : RATING ERROR POINTS PLAYED (%)
1 Houdini 2.0 STD : 2590.3 6.3 4221.5 5400 78.2%
2 Houdini 1.5a : 2582.2 7.1 3162.5 4000 79.1%
3 Komodo 5 : 2574.3 7.9 2138.0 2850 75.0%
4 Critter 1.4a : 2550.9 6.7 3419.5 4450 76.8%
5 Komodo 4 : 2546.9 6.3 3653.0 4850 75.3%
6 Critter 1.6a : 2538.2 7.2 2216.5 3150 70.4%
7 Komodo 3 : 2535.3 8.1 2075.5 2800 74.1%
8 Stockfish 2.2.2 JA : 2526.7 6.1 3514.5 4800 73.2%
9 Deep Rybka 4.1 : 2525.7 5.4 4412.0 6200 71.2%
10 Deep Rybka 4 : 2525.1 6.0 3627.0 4900 74.0%
11 Critter 1.2 : 2525.0 7.5 2232.0 3100 72.0%
12 Houdini 1.03a : 2523.2 7.9 2520.0 3200 78.8%
13 Komodo 2.03 DC : 2518.0 8.1 1985.5 2700 73.5%
14 Stockfish 2.1.1 JA : 2509.1 7.0 2426.5 3500 69.3%
15 Critter 1.01 : 2489.6 7.6 1970.0 2800 70.4%
16 Stockfish 2.01 JA : 2489.2 7.4 2246.0 3100 72.5%
17 Stockfish 1.9.1 JA : 2467.5 7.4 2131.0 3000 71.0%
18 Rybka 3 mp : 2466.8 6.5 3228.0 4200 76.9%
19 Critter 0.90 : 2459.5 6.9 2327.5 3400 68.5%
20 Stockfish 1.7.1 JA : 2451.8 7.7 2131.0 2900 73.5%
21 Rybka 3 32b : 2407.9 9.4 1191.5 1700 70.1%
22 Stockfish 1.6.x JA : 2392.1 7.8 1792.5 2600 68.9%
23 Komodo 1.3 JA : 2387.6 6.5 1946.0 3300 59.0%
24 Naum 4.2 : 2381.7 3.9 5268.0 9300 56.6%
25 Deep Fritz 13 32b : 2381.7 6.5 1621.5 3300 49.1%
26 Chiron 1.1a : 2381.3 5.4 2664.5 4950 53.8%
27 Critter 0.80 : 2373.4 7.3 1795.5 2800 64.1%
28 Fritz 13 32b : 2367.8 5.9 2308.0 4300 53.7%
29 Komodo 1.2 JA : 2358.0 6.2 2175.0 3700 58.8%
30 Rybka 2.3.2a mp : 2353.5 6.6 2172.5 3500 62.1%
31 Deep Shredder 12 : 2348.5 3.7 5568.0 10400 53.5%
32 Hannibal 1.2 : 2345.9 6.2 1632.5 3600 45.3%
33 Gull 1.2 : 2343.7 4.9 3072.5 6300 48.8%
34 Critter 0.70 : 2340.2 8.5 1107.0 1900 58.3%
35 Gull 1.1 : 2339.9 6.7 1675.5 3100 54.0%
36 Naum 4.1 : 2338.2 7.9 1465.0 2300 63.7%
37 Deep Sjeng c't 2010 32b : 2337.5 4.5 3537.0 7300 48.5%
38 Komodo 1.0 JA : 2333.6 7.1 1756.5 2900 60.6%
39 Spike 1.4 32b : 2330.9 4.8 3018.5 6400 47.2%
40 Deep Fritz 12 32b : 2326.7 4.7 3268.5 6300 51.9%
41 Naum 4 : 2324.4 7.2 1628.5 2700 60.3%
42 Rybka 2.2n2 mp : 2323.8 8.0 1311.5 2100 62.5%
43 Gull 1.0a : 2315.4 7.8 1254.0 2300 54.5%
44 Stockfish 1.5.1 JA : 2310.7 8.4 1128.5 1900 59.4%
45 Rybka 1.2f : 2309.8 8.0 1578.5 2400 65.8%
46 Protector 1.4.0 : 2305.6 4.8 2887.0 6500 44.4%
47 spark-1.0 : 2305.3 4.6 3091.5 7000 44.2%
48 Hannibal 1.1 : 2296.3 5.5 2142.0 4900 43.7%
49 HIARCS 13.2 MP 32b : 2294.6 4.7 2922.5 6800 43.0%
50 Deep Junior 13 : 2293.8 6.4 1452.5 3600 40.3%
51 Deep Junior 13.3 : 2293.7 7.0 1131.0 3000 37.7%
52 Fritz 12 32b : 2288.9 8.2 1091.0 2000 54.5%
53 Quazar 0.4 : 2283.5 6.2 1500.0 3900 38.5%
54 HIARCS 13.1 MP 32b : 2276.4 6.3 1734.5 3600 48.2%
55 Deep Junior 12.5 : 2275.2 5.6 1963.0 4850 40.5%
56 Deep Fritz 11 32b : 2269.2 10.2 744.5 1300 57.3%
57 Doch64 1.2 JA : 2259.0 9.0 820.5 1600 51.3%
58 spark-0.4 : 2257.6 6.6 1458.0 3100 47.0%
59 Stockfish 1.4 JA : 2256.7 8.8 849.0 1700 49.9%
60 Zappa Mexico II : 2254.7 3.4 5070.0 11700 43.3%
61 Shredder Bonn 32b : 2253.6 7.7 1119.0 2200 50.9%
62 Critter 0.60 : 2242.3 7.8 1072.0 2200 48.7%
63 Protector 1.3.2 JA : 2242.2 5.2 2361.5 5300 44.6%
64 MinkoChess 1.3 : 2238.7 6.6 1134.5 3600 31.5%
65 Deep Shredder 11 : 2233.4 7.2 1412.0 2700 52.3%
66 Doch64 09.980 JA : 2230.6 9.4 710.0 1500 47.3%
67 Deep Junior 12 : 2223.2 6.4 1356.0 3600 37.7%
68 Onno-1-1-1 : 2222.9 5.6 1923.0 4300 44.7%
69 Hannibal 1.0a : 2222.6 5.9 1600.0 4200 38.1%
70 Deep Onno 1-2-70 : 2221.5 4.4 2806.5 7700 36.4%
71 Naum 3.1 : 2221.1 6.8 1514.5 3000 50.5%
72 Zappa Mexico I : 2220.7 7.9 1221.0 2200 55.5%
73 Rybka 1.0 Beta : 2219.7 7.8 1023.5 2300 44.5%
74 Spark-0.3 VC(a) : 2216.5 6.1 1625.0 3600 45.1%
75 Onno-1-0-0 : 2213.8 10.4 594.5 1200 49.5%
76 Deep Sjeng WC2008 : 2211.5 5.0 2434.5 5600 43.5%
77 Toga II 1.4 beta5c BB : 2207.8 4.0 3255.5 8300 39.2%
78 Deep Junior 11.2 : 2206.5 7.1 1176.0 2900 40.6%
79 Strelka 2.0 B : 2202.3 5.5 1778.5 5500 32.3%
80 Hiarcs 12.1 MP 32b : 2198.6 5.0 2427.5 5600 43.3%
81 Tornado 4.88 : 2196.6 8.0 803.0 2400 33.5%
82 Deep Sjeng 3.0 : 2196.0 9.7 601.5 1400 43.0%
83 Umko 1.2 : 2195.9 7.0 1016.5 3300 30.8%
84 Critter 0.52b : 2184.9 7.2 1097.0 2600 42.2%
85 Shredder Classic 4 32b : 2184.8 8.4 922.5 1800 51.2%
86 Deep Junior 11.1a : 2174.8 6.9 1153.0 2800 41.2%
87 Naum 2.2 32b : 2173.2 10.1 614.0 1300 47.2%
88 Nemo 1.0.1 : 2168.2 7.8 708.0 2700 26.2%
89 Umko 1.1 : 2168.1 6.4 1146.0 3900 29.4%
90 Deep Junior 2010 : 2165.9 6.7 1210.0 3100 39.0%
91 Glaurung 2.2 JA : 2165.0 7.3 1027.5 2600 39.5%
92 Rybka 1.0 Beta 32b : 2164.9 10.7 506.0 1100 46.0%
93 HIARCS 11.2 32b : 2159.9 8.6 827.0 1900 43.5%
94 Fruit 05/11/03 32b : 2157.3 5.7 1774.0 4400 40.3%
95 Loop 2007 : 2150.4 4.4 2456.0 7900 31.1%
96 Toga II 1.2.1a : 2147.1 9.0 716.5 1600 44.8%
97 Jonny 4.00 32b : 2146.6 5.7 1389.5 5200 26.7%
98 ListMP 11 : 2142.8 7.3 987.5 2600 38.0%
99 LoopMP 12 32b : 2140.7 9.4 635.0 1500 42.3%
100 Tornado 4.80 : 2138.9 8.2 681.5 2700 25.2%
101 Deep Shredder 10 : 2136.9 5.8 1754.0 4400 39.9%
102 Twisted Logic 20100131x : 2132.6 6.6 1140.0 3500 32.6%
103 Crafty 23.3 JA : 2127.9 5.9 1290.5 5200 24.8%
104 Spike 1.2 Turin 32b : 2110.2 4.5 2349.5 7700 30.5%
105 Deep Sjeng 2.7 32b : 2086.0 10.4 465.5 1400 33.2%
106 Crafty 23.1 JA : 2074.7 6.7 1002.0 3800 26.4%
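On the point about 100 versus 5000 simulations: a rough way to see how noisy an SD estimated from n simulated ratings is. The sketch uses the standard large-sample approximation for roughly normal data, where the relative standard error of a sample SD is about 1/sqrt(2(n-1)). That the simulated ratings are close to normal is an assumption here, and the function name is made up.

```python
import math


def relative_se_of_sd(n_sims):
    """Approximate relative standard error of an SD estimated from n_sims samples."""
    return 1.0 / math.sqrt(2.0 * (n_sims - 1))


if __name__ == "__main__":
    for n in (100, 5000):
        pct = 100.0 * relative_se_of_sd(n)
        print(f"{n:5d} simulations: SD estimate good to about {pct:.1f}%")
```

Under that approximation, 100 simulations leave each error figure uncertain by several percent, while 5000 bring that down to about one percent.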
The error matrix (only for the top 11), in .csv format:
"N","NAME",0,1,2,3,4,5,6,7,8,9,10,11,
0," Houdini 2.0 STD"
1," Houdini 1.5a",9.4
2," Komodo 5",9.5,10.6
3," Critter 1.4a",8.8,9.5,10.1
4," Komodo 4",8.4,9.4,9.9,8.7
5," Critter 1.6a",9.1,10.1,10.2,9.6,9.2
6," Komodo 3",9.9,10.5,11.2,10.5,10.2,10.9
7," Stockfish 2.2.2 JA",8.3,9.2,9.6,8.7,8.4,9.2,10.1
8," Deep Rybka 4.1",7.8,8.8,9.1,8.2,7.9,8.4,9.6,7.7
9," Deep Rybka 4",8.5,9.0,10.0,8.9,8.8,9.3,9.8,8.5,7.9
10," Critter 1.2",9.5,10.0,10.8,9.8,9.5,10.2,10.8,9.5,8.9,9.4
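A minimal sketch of how the lower-triangular matrix above could be read and set against the rating differences. The CSV text is an excerpt of the first four rows (with the header shortened to match), the ratings are copied from the 5000-simulation table, and the helper names are hypothetical.

```python
import csv
import io

# Excerpt of the matrix above (first four engines only).
MATRIX_CSV = '''"N","NAME",0,1,2,3,
0," Houdini 2.0 STD"
1," Houdini 1.5a",9.4
2," Komodo 5",9.5,10.6
3," Critter 1.4a",8.8,9.5,10.1
'''

# Ratings copied from the 5000-simulation table.
RATINGS = {
    "Houdini 2.0 STD": 2590.3,
    "Houdini 1.5a": 2582.2,
    "Komodo 5": 2574.3,
    "Critter 1.4a": 2550.9,
}


def load_matrix(text):
    """Return (name -> index, (row, col) -> error) from the triangular CSV."""
    index, errors = {}, {}
    rows = csv.reader(io.StringIO(text))
    next(rows)  # skip the header line
    for row in rows:
        i = int(row[0])
        index[row[1].strip()] = i
        for j, cell in enumerate(row[2:]):
            if cell != "":
                errors[(i, j)] = float(cell)
    return index, errors


INDEX, ERRORS = load_matrix(MATRIX_CSV)


def pair_error(a, b):
    """Error of the rating difference between engines a and b."""
    i, j = INDEX[a], INDEX[b]
    return ERRORS[(max(i, j), min(i, j))]


if __name__ == "__main__":
    for a, b in [("Houdini 2.0 STD", "Houdini 1.5a"),
                 ("Houdini 1.5a", "Komodo 5")]:
        diff = RATINGS[a] - RATINGS[b]
        print(f"{a} vs {b}: difference {diff:.1f}, error {pair_error(a, b):.1f}")
```

For both pairs the rating difference is smaller than the listed error of that difference.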
Miguel