CCRL 404FRC
Downloads and Statistics
August 18, 2008
Testing summary:
Total: 32'500 games
played by 41 programs
270 CPU days (X2 4600+)

White wins: 13'360 (41.1%)
Black wins: 12'021 (37.0%)
Draws: 7'119 (21.9%)
White score: 52.1%

Pure list

"Pure" list removes rating distortion

The "Pure" list is computed to remove a distortion that may affect the main rating list. The distortion appears when several versions or settings of the same engine are included together in the testing study. Suppose you have engine A and several versions of engine B: B1, B2, B3. Suppose also that A is particularly strong against every version of B, which often happens in real testing because of particular characteristics of those engines. In that case A will get a higher rating than it would in a study where only one version of B is present. The same thing may happen in reverse when A is weak against B, giving A a lower rating.

To remove that distortion, a separate game database is constructed from games played only by the best version in each engine "family". To save space and time, the pure database has all moves stripped out; it contains only the PGN headers and results. The "Pure list" is then computed from that "pure" database using Bayeselo.

Pure lists for all classes of engines

All engines   (32-bit)

1-2-CPU engines   (32-bit)

Single-CPU engines   (32-bit)

Free engines   (32-bit)

Free 1-2-CPU engines   (32-bit)

Free single-CPU engines   (32-bit)

Open source engines   (32-bit)

Open source 1-2-CPU engines   (32-bit)

Open source single-CPU engines   (32-bit)

Pure lists for complete database

Pure database download

To save space, the pure database has all moves stripped out; it contains only the PGN headers and results. It is therefore useful only for rating calculation or similar analysis: it has the results but not the actual games.
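
As an illustration of what "moves stripped out" means, here is a minimal sketch in Python. It is not the tool used by CCRL; the function names `strip_moves` and `_finish` are hypothetical. It assumes every game consists of tag lines in square brackets followed by at least one line of movetext, and that the result is available in the `Result` tag.

```python
import re

# Matches a PGN tag line such as: [Result "1-0"]
RESULT_TAG = re.compile(r'^\[Result\s+"([^"]*)"\]$')


def _finish(header):
    """Render one stripped game: its tag lines, a blank line, the result token."""
    result = "*"  # PGN token for an unknown result
    for tag in header:
        match = RESULT_TAG.match(tag)
        if match:
            result = match.group(1)
    return "\n".join(header) + "\n\n" + result


def strip_moves(pgn_text):
    """Return the PGN text with all movetext replaced by the bare result."""
    games, header, in_moves = [], [], False
    for raw in pgn_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            if in_moves:  # a tag line after movetext starts the next game
                games.append(_finish(header))
                header, in_moves = [], False
            header.append(line)
        elif line:
            in_moves = True  # movetext line: dropped
    if header:
        games.append(_finish(header))
    return "\n\n".join(games) + "\n"
```

The output keeps one header block and one result token per game, which is all a rating program needs.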

Download pure database, 7'900 games:

CCRL 404FRC Rating List — Pure all engines

Shredder UCI GUI, Ponder off, 3-4-5 piece EGTB, 128MB hash, random openings with switched sides
Time control: Equivalent to 40 moves in 4 minutes on Athlon 64 X2 4600+ (2.4 GHz)
Computed on August 18, 2008 with Bayeselo based on 7'900 games
Rank  Name                        ELO          +    −    Score  Average Opponent  Draws  Games     LOS
   1  Rybka 3 64-bit              3188 (+13)   +28  −27  82.8%            −264.7  18.3%    600  100.0%
   2  Shredder 11                 2974 (+4)    +20  −20  59.7%             −75.9  21.8%    900   74.6%
   3  Naum 3.1 64-bit             2965 (+8)    +19  −19  61.5%             −91.3  23.8%   1000   82.9%
   4  Deep Sjeng 3.0 64-bit 1CPU  2952 (+3)    +19  −19  59.8%             −77.0  24.4%   1000   97.4%
   5  Hiarcs Paderborn 2007       2926 (+1)    +19  −18  56.5%             −48.4  24.9%   1000  100.0%
   6  Fruit 051103                2863 (0)     +17  −17  56.1%             −49.1  22.5%   1300   61.9%
   7  Loop 10.32f                 2860 (0)     +17  −17  55.2%             −45.8  22.6%   1300   97.9%
   8  Glaurung 2.1 64-bit         2835 (0)     +17  −17  55.8%             −49.9  19.1%   1200   97.6%
   9  Spike 1.2 Turin             2810 (−9)    +17  −17  52.6%             −22.8  23.2%   1200  100.0%
  10  Movei 00.8.438              2684 (+9)    +18  −18  35.9%            +113.7  18.7%   1200   99.9%
  11  Pharaon 3.5.1               2645 (−5)    +18  −18  37.0%            +104.9  18.3%   1200  100.0%
  12  Ufim 8.02                   2592 (+2)    +20  −20  42.1%             +61.3  16.7%   1000   69.5%
  13  Hamsters 0.6                2585 (−10)   +20  −20  41.1%             +69.0  19.0%   1000   91.7%
  14  Hermann 2.3 64-bit          2565 (0)     +20  −20  39.1%             +91.0  16.4%   1000  100.0%
  15  Aice 0.99.2                 2362 (0)     +28  −29  28.1%            +180.2  11.8%    500   96.5%
  16  Ayito 0.2.994               2324 (−22)   +32  −33  25.4%            +202.0  13.8%    400

Explanation of the columns

"Rank" - 1 is best, 2 is second best, and so on.
"Name" - Name and version of the engine.
"ELO" - Engine rating computed with Bayeselo. The number in brackets shows the difference between the "pure" rating and the rating computed for the complete database. For example, "2850 (+10)" means that the engine's "pure" rating is 2850, which is 10 points higher than its rating in the complete list.
"+" and "−" - 95% confidence interval. For example, if an engine's rating is 2850, "+" is +20 and "−" is −15, then there is only a 5% estimated probability that the engine's "true" rating lies outside the range [2850−15 .. 2850+20].
"Score" - Number of points scored by the engine divided by its number of games. A win is 1 point, a draw is half a point, and a loss is 0. Note that this is computed for the "pure" database, so the numbers differ from the main list.
"Average Opponent" - Average rating of the engine's opponents over all games it played, minus the engine's own rating (only games in the "pure" database are counted). A positive number means that the engine played, on average, against stronger opponents; a negative number means weaker opponents.
"Draws" - Percentage of the engine's games that ended in a draw (only games in the "pure" database are counted).
"Games" - Total number of games played by the engine (only games in the "pure" database are counted).

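As a worked example of the "Score", "Draws", "Games" and "Average Opponent" columns, the sketch below recomputes the Rybka 3 64-bit row from its six pairings in the crosstable further down the page. The function name `summarize` and the tuple layout are illustrative choices, not part of any CCRL tool; the ratings and win/loss/draw counts are copied from this page.

```python
RYBKA_RATING = 3188

# (opponent rating, wins, losses, draws) from Rybka's row of the crosstable
RYBKA_PAIRINGS = [
    (2974, 61, 17, 22),  # Shredder 11
    (2965, 66, 7, 27),   # Naum 3.1 64-bit
    (2952, 72, 7, 21),   # Deep Sjeng 3.0 64-bit 1CPU
    (2926, 76, 8, 16),   # Hiarcs Paderborn 2007
    (2863, 84, 6, 10),   # Fruit 051103
    (2860, 83, 3, 14),   # Loop 10.32f
]


def summarize(own_rating, pairings):
    """Recompute the per-engine summary columns from pairing results."""
    games = sum(w + l + d for _, w, l, d in pairings)
    wins = sum(w for _, w, _, _ in pairings)
    draws = sum(d for _, _, _, d in pairings)
    # Opponent ratings are averaged per game, so each pairing is
    # weighted by the number of games played in it.
    avg_opp = sum(r * (w + l + d) for r, w, l, d in pairings) / games
    return {
        "games": games,
        "score_pct": 100.0 * (wins + 0.5 * draws) / games,
        "draws_pct": 100.0 * draws / games,
        "avg_opponent": avg_opp - own_rating,
    }


summary = summarize(RYBKA_RATING, RYBKA_PAIRINGS)
print({k: round(v, 1) for k, v in summary.items()})
# {'games': 600, 'score_pct': 82.8, 'draws_pct': 18.3, 'avg_opponent': -264.7}
```

These values match the first row of the rating table: 82.8% score, −264.7 average opponent, 18.3% draws, 600 games.
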
A detailed explanation of how we construct the "pure" list:

1. We have to find the best version in each engine family. We can't use the "Best versions" list for that, because it may be affected by the very distortion we are trying to remove. To find the true best version in a family of engines, we create a separate game database containing only games played by engines from that family. Then we compute the ratings for that small database and take the highest-rated engine as the best, to represent that family in the "pure" list. In addition, every engine in the "pure" list must have played at least 150 games against other "pure" engines, and it must be a public release, not a beta or private version.

2. After finding the set of "pure" best versions, we extract all games in which both engines are from that set; those games form the "pure" database. The pure list is simply a rating list computed for that database using Bayeselo.
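
The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: games are represented as hypothetical `(white, black, result)` tuples, the per-family ratings are assumed to have been computed already (by Bayeselo, in the procedure described above), and the names `pick_best` and `build_pure_database` are invented for this example. How engines that fall short of the 150-game requirement are handled is not described on this page, so the sketch only reports them.

```python
from collections import Counter

MIN_GAMES = 150  # minimum games against other "pure" engines, per step 1


def pick_best(family_ratings):
    """Step 1: return the highest-rated version within one engine family.

    family_ratings maps version name -> rating computed from the
    family-only database (the rating computation itself is not shown).
    """
    return max(family_ratings, key=family_ratings.get)


def build_pure_database(games, best_versions, min_games=MIN_GAMES):
    """Step 2: keep only games in which both engines are "pure" best versions.

    Returns the filtered games and the set of engines that have fewer
    than min_games games inside the filtered database.
    """
    pure = [g for g in games if g[0] in best_versions and g[1] in best_versions]
    counts = Counter()
    for white, black, _result in pure:
        counts[white] += 1
        counts[black] += 1
    too_few = {name for name in best_versions if counts[name] < min_games}
    return pure, too_few
```

The filtered list is what a rating program would then be run on to produce the pure list.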

Features of the pure list

The first thing to realize about the "pure" list is that it is not necessarily more relevant than the big list of all versions. The "pure" list removes one kind of distortion: the distortion that may arise from multiple versions of the same engine. But the price is high: the "pure" database is several times smaller than the complete database (7'900 games versus 32'500 here). This results in a much larger statistical error, as you can see in the + / − columns. The "pure" list can also still have other types of distortion, such as those resulting from too few (including zero) or too many games between particular pairs.

So don't take this list as certainly superior to the "Best versions" list. It does not replace the "Best versions" list, but simply provides a different view for those who are concerned about distortions. It is possible, though, that in time this list will become clearly superior, once the "pure" database is large enough.

Please also note that an engine version being listed in the "Best versions" list does not guarantee that the same version will be listed in the "Pure" list. Most often it will be, but in theory a different version may turn out to be the best in the "pure" context.
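
To give a feeling for why a smaller database means a larger statistical error, here is a rough Python sketch. It is not how Bayeselo computes its intervals; it is a simple normal approximation, assuming independent games and the standard logistic Elo curve, and the function name `elo_margin_95` is invented for this example. The point it illustrates is that the margin shrinks only with the square root of the number of games, so a database about four times smaller gives margins about twice as wide.

```python
import math


def elo_margin_95(wins, losses, draws):
    """Approximate 95% half-width, in Elo, of a rating estimated from results.

    Assumption: normal approximation of the mean score, mapped to Elo
    through the logistic curve elo = -400 * log10(1/p - 1).
    Requires a score strictly between 0% and 100%.
    """
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0)
    var = (wins * (1 - p) ** 2 + draws * (0.5 - p) ** 2 + losses * p ** 2) / n
    std_err = math.sqrt(var / n)
    # Slope of the Elo curve at score p converts score error to Elo error
    slope = 400 / (math.log(10) * p * (1 - p))
    return 1.96 * std_err * slope


# Rybka 3 64-bit totals in the pure database: +442 −48 =110 (600 games)
small = elo_margin_95(442, 48, 110)
# The same result ratio over four times as many games
large = elo_margin_95(4 * 442, 4 * 48, 4 * 110)
print(round(small / large, 2))  # 2.0
```

For the 600-game Rybka sample this approximation gives a margin of roughly 30 Elo, in the same range as the +28 / −27 listed in the table.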


Crosstable for "pure" database

Results matrix

Pure all engines
 #  Name  ELO
      vs  #   Score        +W−L=D     Perf.

 1  Rybka 3 64-bit  3188
      vs  2   72 − 28      +61−17=22  −52
      vs  3   79.5 − 20.5  +66−7=27   −10
      vs  4   82.5 − 17.5  +72−7=21   +15
      vs  5   84 − 16      +76−8=16   +17
      vs  6   89 − 11      +84−6=10   +31
      vs  7   90 − 10      +83−3=14   +23

 2  Shredder 11  2974
      vs  1   28 − 72      +17−61=22  +52
      vs  3   51 − 49      +39−37=24   −1
      vs  4   51.5 − 48.5  +36−33=31  −10
      vs  5   52.5 − 47.5  +38−33=29  −31
      vs  6   57.5 − 42.5  +45−30=25  −57
      vs  7   67 − 33      +58−24=18  +15
      vs  8   69 − 31      +61−23=16   +8
      vs  9   73.5 − 26.5  +62−15=23   +9
      vs 10   87 − 13      +83−9=8    +48

 3  Naum 3.1 64-bit  2965
      vs  1   20.5 − 79.5  +7−66=27   +10
      vs  2   49 − 51      +37−39=24   +1
      vs  4   51.5 − 48.5  +35−32=33   −2
      vs  5   55 − 45      +44−34=22   −1
      vs  6   67.5 − 32.5  +53−18=29  +20
      vs  7   67.5 − 32.5  +53−18=29  +15
      vs  8   63 − 37      +55−29=16  −30
      vs  9   70 − 30      +57−17=26  −13
      vs 10   84 − 16      +75−7=18    −8
      vs 11   87 − 13      +80−6=14    −3

 4  Deep Sjeng 3.0 64-bit 1CPU  2952
      vs  1   17.5 − 82.5  +7−72=21   −15
      vs  2   48.5 − 51.5  +33−36=31  +10
      vs  3   48.5 − 51.5  +32−35=33   +2
      vs  5   57.5 − 42.5  +39−24=37  +22
      vs  6   58 − 42      +51−35=14  −28
      vs  7   68 − 32      +53−17=30  +31
      vs  8   68 − 32      +59−23=18  +18
      vs  9   68 − 32      +55−19=26  −12
      vs 10   79 − 21      +68−10=22  −48
      vs 11   85 − 15      +79−9=12    −5

 5  Hiarcs Paderborn 2007  2926
      vs  1   16 − 84      +8−76=16   −17
      vs  2   47.5 − 52.5  +33−38=29  +31
      vs  3   45 − 55      +34−44=22   +1
      vs  4   42.5 − 57.5  +24−39=37  −22
      vs  6   62 − 38      +46−22=32  +17
      vs  7   56.5 − 43.5  +43−30=27  −19
      vs  8   65.5 − 34.5  +52−21=27  +17
      vs  9   66.5 − 33.5  +54−21=25   +1
      vs 10   83.5 − 16.5  +78−11=11  +47
      vs 11   79.5 − 20.5  +68−9=23   −60

 6  Fruit 051103  2863
      vs  1   11 − 89      +6−84=10   −31
      vs  2   42.5 − 57.5  +30−45=25  +57
      vs  3   32.5 − 67.5  +18−53=29  −20
      vs  4   42 − 58      +35−51=14  +28
      vs  5   38 − 62      +22−46=32  −17
      vs  7   51.5 − 48.5  +26−23=51   +5
      vs  8   51.5 − 48.5  +39−36=25  −18
      vs  9   56 − 44      +46−34=20   −9
      vs 10   78.5 − 21.5  +69−12=19  +43
      vs 11   71.5 − 28.5  +58−15=27  −64
      vs 12   82 − 18      +76−12=12   −1
      vs 13   85 − 15      +76−6=18    +4
      vs 14   87.5 − 12.5  +82−7=11   +35

 7  Loop 10.32f  2860
      vs  1   10 − 90      +3−83=14   −23
      vs  2   33 − 67      +24−58=18  −15
      vs  3   32.5 − 67.5  +18−53=29  −15
      vs  4   32 − 68      +17−53=30  −31
      vs  5   43.5 − 56.5  +30−43=27  +19
      vs  6   48.5 − 51.5  +23−26=51   −5
      vs  8   52 − 48      +39−35=26  −10
      vs  9   57 − 43      +43−29=28   −1
      vs 10   74.5 − 25.5  +65−16=19  +12
      vs 11   76 − 24      +66−14=20  −16
      vs 12   84.5 − 15.5  +78−9=13   +24
      vs 13   88 − 12      +83−7=10   +69
      vs 14   86.5 − 13.5  +82−9=9    +34

 8  Glaurung 2.1 64-bit  2835
      vs  2   31 − 69      +23−61=16   −8
      vs  3   37 − 63      +29−55=16  +30
      vs  4   32 − 68      +23−59=18  −18
      vs  5   34.5 − 65.5  +21−52=27  −17
      vs  6   48.5 − 51.5  +36−39=25  +18
      vs  7   48 − 52      +35−39=26  +10
      vs  9   51.5 − 48.5  +41−38=21  −14
      vs 10   70.5 − 29.5  +61−20=19   +4
      vs 11   76.5 − 23.5  +69−16=15  +21
      vs 12   82 − 18      +74−10=16  +16
      vs 13   76.5 − 23.5  +67−14=19  −45
      vs 14   81.5 − 18.5  +76−13=11   −1

 9  Spike 1.2 Turin  2810
      vs  2   26.5 − 73.5  +15−62=23   −9
      vs  3   30 − 70      +17−57=26  +13
      vs  4   32 − 68      +19−55=26  +12
      vs  5   33.5 − 66.5  +21−54=25   −1
      vs  6   44 − 56      +34−46=20   +9
      vs  7   43 − 57      +29−43=28   +1
      vs  8   48.5 − 51.5  +38−41=21  +14
      vs 10   66 − 34      +51−19=30  −16
      vs 11   77 − 23      +68−14=18  +44
      vs 12   75.5 − 24.5  +67−16=17  −17
      vs 13   78 − 22      +67−11=22  −14
      vs 14   77 − 23      +66−12=22  −43

10  Movei 00.8.438  2684
      vs  2   13 − 87      +9−83=8    −48
      vs  3   16 − 84      +7−75=18    +8
      vs  4   21 − 79      +10−68=22  +48
      vs  5   16.5 − 83.5  +11−78=11  −47
      vs  6   21.5 − 78.5  +12−69=19  −43
      vs  7   25.5 − 74.5  +16−65=19  −12
      vs  8   29.5 − 70.5  +20−61=19   −4
      vs  9   34 − 66      +19−51=30  +16
      vs 11   58.5 − 41.5  +50−33=17  +26
      vs 12   64 − 36      +53−25=22  +10
      vs 13   65 − 35      +55−25=20  +11
      vs 14   66.5 − 33.5  +57−24=19   +5

11  Pharaon 3.5.1  2645
      vs  3   13 − 87      +6−80=14    +3
      vs  4   15 − 85      +9−79=12    +5
      vs  5   20.5 − 79.5  +9−68=23   +60
      vs  6   28.5 − 71.5  +15−58=27  +64
      vs  7   24 − 76      +14−66=20  +16
      vs  8   23.5 − 76.5  +16−69=15  −21
      vs  9   23 − 77      +14−68=18  −44
      vs 10   41.5 − 58.5  +33−50=17  −26
      vs 12   53 − 47      +42−36=22  −32
      vs 13   58.5 − 41.5  +46−29=25    0
      vs 14   57 − 43      +50−36=14  −26
      vs 15   86.5 − 13.5  +80−7=13   +29

12  Ufim 8.02  2592
      vs  6   18 − 82      +12−76=12   +1
      vs  7   15.5 − 84.5  +9−78=13   −24
      vs  8   18 − 82      +10−74=16  −16
      vs  9   24.5 − 75.5  +16−67=17  +17
      vs 10   36 − 64      +25−53=22  −10
      vs 11   47 − 53      +36−42=22  +32
      vs 13   51 − 49      +40−38=22    0
      vs 14   51 − 49      +41−39=20  −20
      vs 15   80 − 20      +74−14=12  +21
      vs 16   80.5 − 19.5  +75−14=11   −9

13  Hamsters 0.6  2585
      vs  6   15 − 85      +6−76=18    −4
      vs  7   12 − 88      +7−83=10   −69
      vs  8   23.5 − 76.5  +14−67=19  +45
      vs  9   22 − 78      +11−67=22  +14
      vs 10   35 − 65      +25−55=20  −11
      vs 11   41.5 − 58.5  +29−46=25    0
      vs 12   49 − 51      +38−40=22    0
      vs 14   51.5 − 48.5  +38−35=27   −8
      vs 15   76 − 24      +70−18=12  −11
      vs 16   85.5 − 14.5  +78−7=15   +34

14  Hermann 2.3 64-bit  2565
      vs  6   12.5 − 87.5  +7−82=11   −35
      vs  7   13.5 − 86.5  +9−82=9    −34
      vs  8   18.5 − 81.5  +13−76=11   +1
      vs  9   23 − 77      +12−66=22  +43
      vs 10   33.5 − 66.5  +24−57=19   −5
      vs 11   43 − 57      +36−50=14  +26
      vs 12   49 − 51      +39−41=20  +20
      vs 13   48.5 − 51.5  +35−38=27   +8
      vs 15   75 − 25      +69−19=12    0
      vs 16   74.5 − 25.5  +65−16=19  −53

15  Aice 0.99.2  2362
      vs 11   13.5 − 86.5  +7−80=13   −29
      vs 12   20 − 80      +14−74=12  −21
      vs 13   24 − 76      +18−70=12  +11
      vs 14   25 − 75      +19−69=12    0
      vs 16   58 − 42      +53−37=10  +27

16  Ayito 0.2.994  2324
      vs 12   19.5 − 80.5  +14−75=11   +9
      vs 13   14.5 − 85.5  +7−78=15   −34
      vs 14   25.5 − 74.5  +16−65=19  +53
      vs 15   42 − 58      +37−53=10  −27
 
Performance color legend (only pairs with at least 30 games): scale from -120 to 120 in steps of 20.

History of "pure" testing

History of "pure" testing for all engines


Likelihood of superiority for "pure" database

LOS matrix

Pure all engines
 #  Name                         ELO     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
 1  Rybka 3 64-bit              3188       100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 2  Shredder 11                 2974   0.0        74.6  94.5 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 3  Naum 3.1 64-bit             2965   0.0  25.4        82.9  99.8 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 4  Deep Sjeng 3.0 64-bit 1CPU  2952   0.0   5.5  17.1        97.4 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 5  Hiarcs Paderborn 2007       2926   0.0   0.0   0.2   2.6       100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 6  Fruit 051103                2863   0.0   0.0   0.0   0.0   0.0        61.9  99.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 7  Loop 10.32f                 2860   0.0   0.0   0.0   0.0   0.0  38.1        97.9 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 8  Glaurung 2.1 64-bit         2835   0.0   0.0   0.0   0.0   0.0   1.0   2.1        97.6 100.0 100.0 100.0 100.0 100.0 100.0 100.0
 9  Spike 1.2 Turin             2810   0.0   0.0   0.0   0.0   0.0   0.0   0.0   2.4       100.0 100.0 100.0 100.0 100.0 100.0 100.0
10  Movei 00.8.438              2684   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0        99.9 100.0 100.0 100.0 100.0 100.0
11  Pharaon 3.5.1               2645   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.1       100.0 100.0 100.0 100.0 100.0
12  Ufim 8.02                   2592   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0        69.5  97.0 100.0 100.0
13  Hamsters 0.6                2585   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0  30.5        91.7 100.0 100.0
14  Hermann 2.3 64-bit          2565   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   3.0   8.3       100.0 100.0
15  Aice 0.99.2                 2362   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0        96.5
16  Ayito 0.2.994               2324   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   3.5
LOS color legend: scale from 0 to 100 in steps of 10.

Created in 2005-2008 by CCRL team
Last games added on August 18, 2008