Critter 1.0 SSE42 running for the IPON
Ponder ON rating list: http://www.inwoba.de
- Swaminathan
- Posts: 375
- Joined: Wed Jun 09, 2010 12:14 pm
Re: Critter 1.0 SSE42 running for the IPON
So far, the results Richard Vida reported prior to the release appear consistent with this tournament result, i.e. slightly better than Stockfish but somewhat weaker than Houdini.
Around 20-30 Elo gain.
Logo made by Ulysses P (Vytron)
Co-Authored with Dann Corbit: Strategic Test Suite
Re: Critter 1.0 SSE42 running for the IPON
Critter 1.0 included into the IPON list
http://www.inwoba.de
500 more games are running and will be included tomorrow evening.
Bye
Ingo
Ponder ON rating list: http://www.inwoba.de
- kingliveson
- Posts: 1388
- Joined: Thu Jun 10, 2010 1:22 am
- Real Name: Franklin Titus
- Location: 28°32'1"N 81°22'33"W
Re: Critter 1.0 SSE42 running for the IPON
IWB wrote:
  Critter 1.0 included into the IPON list
  http://www.inwoba.de
  500 more games are running and will be included tomorrow evening.
  Bye
  Ingo

Do we really know these games are being played?
PAWN : Knight >> Bishop >> Rook >>Queen
Re: Critter 1.0 SSE42 running for the IPON
kingliveson wrote:
  IWB wrote:
    Critter 1.0 included into the IPON list
    http://www.inwoba.de
    500 more games are running and will be included tomorrow evening.
    Bye
    Ingo

  Do we really know these games are being played?

No, you can't be sure; unfortunately, you can't even be sure that you really read this sentence. The only thing you can be sure of is that you exist and that you will look at a monitor again to see the outcome of the games!
Bye
Ingo
Ponder ON rating list: http://www.inwoba.de
- UncombedCoconut
- Posts: 44
- Joined: Thu Jun 10, 2010 1:43 am
- Real Name: Justin Blanchard
- Location: United States
Re: Critter 1.0 SSE42 running for the IPON
kingliveson wrote:
  IWB wrote:
    Critter 1.0 included into the IPON list
    http://www.inwoba.de
    500 more games are running and will be included tomorrow evening.
    Bye
    Ingo

  Do we really know these games are being played?

I don't understand why people keep making this point. Even if a tester publishes the games with the results, it is extremely difficult to prove whether he's including all games the programs played, or just sampling selectively. In other words, Ingo would not become more trustworthy if he started publishing games. Instead, if we want to use his test results, we have to trust his integrity -- which, so far, I've seen no reason to doubt.
If you want to ask him, "will you publish the games so others can use them for elaborate data mining", that's pretty reasonable. But "you look like you're making the results up" or (a complaint I've seen from others) "your numbers are useless without the games" are unreasonable.
- kingliveson
- Posts: 1388
- Joined: Thu Jun 10, 2010 1:22 am
- Real Name: Franklin Titus
- Location: 28°32'1"N 81°22'33"W
Re: Critter 1.0 SSE42 running for the IPON
UncombedCoconut wrote:
  kingliveson wrote:
    IWB wrote:
      Critter 1.0 included into the IPON list
      http://www.inwoba.de
      500 more games are running and will be included tomorrow evening.
      Bye
      Ingo

    Do we really know these games are being played?

  I don't understand why people keep making this point. Even if a tester publishes the games with the results, it is extremely difficult to prove whether he's including all games the programs played, or just sampling selectively. In other words, Ingo would not become more trustworthy if he started publishing games. Instead, if we want to use his test results, we have to trust his integrity -- which, so far, I've seen no reason to doubt.
  If you want to ask him, "will you publish the games so others can use them for elaborate data mining", that's pretty reasonable. But "you look like you're making the results up" or (a complaint I've seen from others) "your numbers are useless without the games" are unreasonable.

I brought this topic up with Ingo close to 2 years ago (Rybka forum?), and it's not because there is reason to believe he's making these numbers up.
It is rather peculiar that he refuses to provide the games. What is the point of publishing the results of an experiment but refusing to publish the data alongside them? Sedat made a valid point (CCC 381538) regarding testers and trust.
P.S. And please don't tell me not to look at the rating list, because by posting results on chess forums (CCC, OpenChess, Rybka, etc.), you are telling me to look at it.
PAWN : Knight >> Bishop >> Rook >>Queen
Re: Critter 1.0 SSE42 running for the IPON
No games = IPON = No trust = No Believe
bye
Re: Critter 1.0 SSE42 running for the IPON
Personally, I don't care about downloading the games. I have never done it once from any chess rating site.
What I care about is:
- IPON has about the same relative ratings as other sites, so I fail to see why I shouldn't trust IPON...
- IPON tests new engines way earlier than other rating sites.
Re: Critter 1.0 SSE42 running for the IPON
Hi
kingliveson wrote:
  ...
  It is rather peculiar that he refuses to provide the games. What is the point of publicly publishing results of an experiment but refusing to publish along-side the data?...

Some reasoning in the order of its importance:
1. Actually, a while back I would have agreed with your first point. Seeing what has happened recently with some derivatives (and therefore a lot of lost credit), I think it is quite a good decision not to publish the games when using a fixed test set!
2. Publishing my games would ONLY be good for cross-checking whether I "cheat", as a 5+3 game set with a fixed opening set would be pretty useless for any data mining. (Or do you have any other use for 130000 games with 'only' 50 openings?)
3. If the opening set were open, I would have to discuss these openings again and again and again. As I know that the set works (compare with other lists), I am perfectly happy with the current situation.
4. I started to publish the list because I was asked to (for several reasons). Looking at the hit rate of my page, it seems that not a lot of people share your concerns.
In short: I doubt that many people would really "check" the games. As I regularly cross-check with another rating list, I know that all engines (except one) are within its error margins (if the engines are listed there). The single exception is Zappa Mexico I (one!); there I doubt the testing was sufficient, and it is impossible to decide which list is right or wrong (actually, I could remove it from my list, as no one is interested in that engine anymore). This checking can be done by anyone else as well ... and I am sure that if I pushed or pulled any engine, it would be revealed very fast.
In the end it comes down to this: the list is a byproduct of my testing, and I provide it (at the moment) voluntarily. If you do not trust it, it is up to you to be consistent.
Bye
Ingo
Ponder ON rating list: http://www.inwoba.de