Calibrate ELO: need user feedback
Posted: Tue May 28, 2013 9:07 am
I implemented the UCI_LimitStrength and UCI_Elo features in my engine, which let you limit the level of play to a given ELO. But it is quite difficult to calibrate by statistical methods, as those inevitably rely on computer vs. computer matches, which are not very meaningful for calibrating the low levels IMO.
So I would like to get some user feedback to calibrate it better. If you are a decent chess player (1400 ELO minimum, but more is better) could you please:
1/ Download the program (the SSE 4.2 build is for recent CPUs; if it doesn't work on yours, use the SSE 2 one)
Linux
https://github.com/lucasart/chess/raw/m ... 4.2.1_sse2
https://github.com/lucasart/chess/raw/m ... 2.1_sse4.2
Windows
https://github.com/lucasart/chess/raw/m ... 1_sse2.exe
https://github.com/lucasart/chess/raw/m ... sse4.2.exe
2/ Play a few games against the engine, setting it at your ELO level (enable the limit strength feature and select the appropriate ELO). Please don't cheat: the engine must play with an opening book (at least a few moves), and you should not look at the engine PV, replay the same opening several times in order to win by trial and error, etc.
3/ And answer this question: Was your opponent under-rated, over-rated, or correctly rated? Any other qualitative feedback is welcome.
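If it helps to put a number on question 3, here is a rough sketch using the standard Elo expected-score formula. The function names and the sample ratings are only illustrative, not something the engine provides. Keep in mind that with only a few games the result is very noisy, so treat it as a hint next to your qualitative impression, not as a measurement.

```python
def expected_score(player_elo: float, opponent_elo: float) -> float:
    """Standard Elo expectation: average score per game for the player (0..1)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_elo - player_elo) / 400.0))


def actual_score(wins: int, draws: int, losses: int) -> float:
    """Average score per game, counting a draw as half a point."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("need at least one game")
    return (wins + 0.5 * draws) / games


# Hypothetical example: a 1600 player against the engine set to 1600.
expected = expected_score(1600, 1600)   # equal ratings, so 0.5
scored = actual_score(wins=2, draws=2, losses=0)
print(f"expected {expected:.2f}, scored {scored:.2f}")
```

Scoring well above the expectation suggests the engine plays weaker than the ELO it was set to, and scoring well below suggests it plays stronger.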
Thank you very much!
PS: For engine testers, DiscoCheck 4.2 and 4.2.1 are the same in ELO terms. Version 4.2.1 just adds this limit strength feature, that's all.
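For anyone talking to the engine directly instead of through a GUI, the limit strength feature corresponds to the two UCI options named above. A minimal sketch of the commands a GUI would send, assuming the usual UCI setoption syntax; the helper name and the value 1600 are only an example, and the ELO range the engine accepts is whatever it reports in reply to "uci":

```python
def limit_strength_commands(elo: int) -> list[str]:
    """Build the UCI commands that switch on strength limiting at a given ELO."""
    return [
        "setoption name UCI_LimitStrength value true",
        f"setoption name UCI_Elo value {elo}",
    ]


# Example: commands to send on the engine's stdin before starting a game.
for command in limit_strength_commands(1600):
    print(command)
```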