FFT Benchmark Results - Roy Longbottom's PC benchmark Collection

FFT Benchmark Results On PCs

Version 1 Single Precision Results Version 1 Double Precision Results
Version 2 Single Precision Results Version 2 Double Precision Results
Version 3 Single Precision Results Version 3 Double Precision Results
64 Bit vs 32 Bit Results Cache and RAM Code Key

Description

The FFTGraf benchmark runs code for single and double precision Fast Fourier Transforms (FFTs) of size 1024 to 1048576 (1K to 1024K), producing a graph of results. As the time for a single calculation can vary, the tests are run a number of times (default 5). Results given here are minimum times in milliseconds. Three versions are available:

Version 1 is all C code using optimised procedures produced by Scott Taylor. For much of the time, the FFT data is not loaded from, or stored to, sequential memory addresses. As the hardware loads this data in 32 or 64 byte bursts (or more), much of each burst is redundant, resulting in slow performance.

Version 2 of the program produces significant speed improvements, mainly by making more efficient use of caches and using all burst data. Scott’s new code divides the data from RAM into L2 cache sized segments, leading to a 2 x improvement on large FFTs. Roy’s new code unrolls the critical loop to make effective use of burst data read into caches, leading to a further improvement of up to 2 x, or more on PCs that use 64 byte bursts. Roy has also supplied assembly code for the main calculations, which helps a little.

Version 3 includes additional code from Roy to use SSE or 3DNow assembly instructions for single precision calculations and SSE2 instructions for double precision, when these are provided by the CPU.
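The effect of burst reads described for Versions 1 and 2 can be illustrated with some simple arithmetic. The following is a minimal sketch, not code from the benchmark: the function name and the fixed-stride model (every burst touched is fetched whole and nothing is reused from cache) are assumptions made for illustration only.

```python
def burst_utilisation(element_bytes, stride_bytes, burst_bytes):
    """Fraction of each loaded burst that is actually used.

    Assumed model: elements are read at a fixed stride, each burst that is
    touched is fetched in full, and no data is reused from cache.
    """
    if stride_bytes >= burst_bytes:
        used_per_burst = element_bytes  # only one element per burst
    else:
        used_per_burst = element_bytes * (burst_bytes // stride_bytes)
    return min(1.0, used_per_burst / burst_bytes)


for burst in (32, 64):
    sequential = burst_utilisation(4, 4, burst)    # 4 byte SP values, adjacent
    scattered = burst_utilisation(4, 4096, burst)  # hypothetical 4 KB stride
    print(f"{burst} byte burst: sequential {sequential:.0%}, scattered {scattered:.2%}")
```

Under this model, scattered single precision accesses use one eighth of a 32 byte burst and one sixteenth of a 64 byte burst, which is consistent with the larger gains reported on PCs that use 64 byte bursts.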

NEW - Versions of the benchmarks with text displays, instead of graphs, are now available to run via Windows and Linux commands and as Android app downloads. Details, results and download links are in FFTBenchmarks.htm.

Pre-compiled versions of the benchmarks can be found in FFTGraf.zip, which also contains the source code and more detailed explanations. The three versions have also been compiled to run at 64 bits under 64 bit Windows. Version 1 is the same code but is compiled to use SSE/SSE2 instructions. Version 2 uses C for the main calculations, as i386 floating point instructions are not available under Win64. Version 3 is the same as before, except that it has no i386 floating point or 3DNow facilities. The 64 bit versions are in More64Bit.zip. Other PC benchmarks and results can be found via My Main Page.

Version 1 memory demands in bytes are up to 16 times the FFT size on single precision (SP) and 28 times on double precision (DP). These also apply to Versions 2 and 3 up to size 8K. Above this, they can be up to 28 times on SP and 52 times on DP, giving a maximum of 52 MB at 1024K. The data spanned in the critical timing loop is 8 and 16 times the FFT size (SP and DP) at 8K and below, then 16 and 32 times above 8K.
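The memory figures above can be checked with a few lines of arithmetic. This is a hedged sketch using only the multipliers quoted in the paragraph; the function names are hypothetical, and treating the data spanned figures as independent of the version is an assumption, since the text does not say which versions they apply to.

```python
K = 1024


def max_memory_bytes(version, precision, fft_size):
    """Upper bound on memory demand, using the multipliers quoted above."""
    small = {"sp": 16, "dp": 28}   # Version 1, and Versions 2 and 3 up to 8K
    large = {"sp": 28, "dp": 52}   # Versions 2 and 3 above 8K
    if version == 1 or fft_size <= 8 * K:
        return small[precision] * fft_size
    return large[precision] * fft_size


def data_spanned_bytes(precision, fft_size):
    """Data spanned in the critical timing loop (assumed version independent)."""
    if fft_size <= 8 * K:
        factor = {"sp": 8, "dp": 16}[precision]
    else:
        factor = {"sp": 16, "dp": 32}[precision]
    return factor * fft_size


largest = 1024 * K
print(max_memory_bytes(3, "dp", largest) // (K * K), "MB maximum, Version 3 DP at 1024K")
print(data_spanned_bytes("sp", 8 * K) // K, "KB spanned, SP at 8K")
```

Comparing the data spanned with the L1 and L2 cache sizes of a given CPU is one way to estimate at which FFT sizes the timings are likely to depend on cache or RAM speed.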

The following is an example of Version 3 output for a 3.0 GHz Pentium 4E. The output shows Scott's MagSq'd[n/16], Peak Noise and Average Noise accuracy checks for FFTs (at 1024K). Note that these might vary slightly with different compilers, particularly on single precision. The Version 1 and Version 2 checks listed after the output show that the SSE instructions in Version 3 produce slightly different single precision checksums.


 Size  Single Precision FFTs using SSE  
    K  Millisecond each pass            
    1  0.057  0.052  0.052  0.052  0.054
    2  0.108  0.117  0.108  0.107  0.109
    4  0.235  0.234  0.236  0.235  0.235
    8  0.545  0.491  0.489  0.490  0.490
   16   1.31   1.23   1.24   1.23   1.24
   32   3.31   2.83   3.18   2.93   2.86
   64   6.38   6.89   6.81   6.25   6.76
  128   17.3   16.9   18.2   17.9   17.4
  256   40.2   39.3   39.2   40.0   39.8
  512   84.8   84.6   83.5   82.9   83.3
 1024    184    182    182    183    182

 Size  Double Precision FFTs using SSE2 
    K  Millisecond each pass            
    1  0.059  0.058  0.058  0.058  0.058
    2  0.134  0.133  0.133  0.133  0.133
    4  0.287  0.282  0.281  0.281  0.281
    8  0.602  0.619  0.602  0.606  0.604
   16   1.56   1.57   1.54   1.53   1.55
   32   4.08   3.82   3.73   4.23   3.81
   64   10.6   10.6   10.5   10.5   11.1
  128   25.2   24.6   24.7   24.7   24.1
  256   52.1   52.8   52.2   51.4   52.2
  512    114    115    116    114    115
 1024    262    255    257    255    260

 Checks SP  9.999890e-001  3.338029e-006  1.043487e-011
 Checks DP  1.000000e+000  1.133294e-023  1.428096e-028

 Version 1                                             
 Checks SP  9.999891e-001  3.338028e-006  1.043382e-011
 Checks DP  1.000000e+000  1.133294e-023  1.428096e-028

 Version 2                                             
 Checks SP  9.999891e-001  3.338028e-006  1.043382e-011
 Checks DP  1.000000e+000  1.133294e-023  1.428096e-028
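The example output above can be post-processed in two simple ways. This is a sketch only, with the figures copied from the output: it takes the minimum of the five passes for two of the single precision rows, as is done for the results tables, and measures how far the Version 3 SSE single precision checks differ from those of Version 1. The variable names are illustrative.

```python
# Single precision passes copied from the example output (milliseconds).
sp_passes = {
    1: [0.057, 0.052, 0.052, 0.052, 0.054],
    1024: [184, 182, 182, 183, 182],
}
minimum_ms = {size_k: min(times) for size_k, times in sp_passes.items()}
for size_k, best in minimum_ms.items():
    print(f"{size_k:5d}K minimum {best} ms")

# Single precision checks: MagSq'd[n/16], Peak Noise, Average Noise.
v3_sse_checks = (9.999890e-001, 3.338029e-006, 1.043487e-011)
v1_checks = (9.999891e-001, 3.338028e-006, 1.043382e-011)

relative_diff = [abs(a - b) / abs(b) for a, b in zip(v3_sse_checks, v1_checks)]
for name, diff in zip(("MagSq'd", "Peak Noise", "Average Noise"), relative_diff):
    print(f"{name:14s} relative difference {diff:.2e}")
```

The minimum values agree with the Pentium 4E entries at 1K and 1024K in the Version 3 single precision table. The double precision checks are identical across all three versions, so their differences are zero.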


Version 1 Single Precision Milliseconds

                     Cache       FFT Size K --->                                              
Processor       MHz  & RAM       1     2     4     8    16    32    64   128   256   512  1024

80486            66 15B  33     17    39    85   196   509  1240  2752  5864 12427            
Pentium         100 16B  50    3.0   9.7    22    54   127   307   801  1790  3844            
Pentium MMX     200 27B  66    1.2   3.1    11    24    52   119   277   807  1806  3844      
Pentium Pro     200 16F  66    1.1   2.9   6.4    14    37   101   358   797  1740  3717      
Celeron A       400 25F  66   0.36  0.85   2.6   7.5    36   106   254   569  1188  2543  5356
Pentium II      450 27H 100   0.32  0.86   4.1   9.2    20    47   132   395   985  2257  4627
Pentium IIIE    550 26F 100   0.26  0.60   1.6   3.5    12    34   134   309   684  1461  3313
Pentium IIIEB   733 26F 133   0.19  0.46   1.2   2.6   6.2    27   128   291   626  1377  2876
Pentium IIIEB  1000 26F 133   0.14  0.33  0.82   1.8   4.6    33   122   300   657  1414  3029
Pentium IIIEB  1000 26F RD1   0.14  0.33  0.82   1.8   4.2    16    91   216   478  1029  2126
Pentium 4      1500 16F RD2   0.14  0.33  0.77   1.7   4.3    17    93   235   565  1296  2809
Pentium 4      1800 16F DD1   0.11  0.28  0.63   1.4   3.7    16   116   279   645  1440  3053
Pentium 4      1900 16F 133   0.11  0.27  0.60   1.4   3.4    18   172   402   907  1985  4214
P4 Xeon        2200 17F RD2  0.093  0.23  0.53   1.2   3.0   7.4    31   194   480  1121  2435
Celeron M      1295 38F      0.089  0.20  0.49   1.4   3.0   6.6    15    75   584  1379  3121
Pentium 4E     3000 28F DC3  0.072  0.15  0.38  0.83   1.8   4.2    10    40   226   494  1043
Pentium 4N     3066 17F DD1  0.067  0.17  0.37  0.84   2.1   5.3    32   268   617  1368  2877
Pentium M2     1862 39F DC1  0.063  0.14  0.34  0.94   2.1   4.5    10    24    78   452  1266
Atom M         1600 H7F SCC   0.53  0.57   1.3   3.0   6.5    15    51   228   506  1095  2241
Core 2 Duo M   1830 39F DC5  0.078  0.19  0.34  0.94   2.2   4.8    11    24    80   318   814
Celeron C2 M   2000 38F DC4  0.053  0.13  0.31  0.86   2.0   4.7    10    54   264   571  1211
Core2 Duo A1CP 2400 3AF DC4  0.043  0.11  0.26  0.72   1.7   3.7   8.2    18    42   134  1404
Core2 Duo B1CP 2400 3AF DC4  0.043  0.11  0.26  0.72   1.7   3.7   8.2    18    42   108   565
Core2 Duo B1CP 2400 3AF DC6  0.043  0.11  0.26  0.71   1.6   3.7   8.2    18    42   104   439
Core i5 2467M  @@@@ 3WF DC8  0.036 0.080  0.18  0.48   1.1   2.5   6.7    16    34   111   258
Core i7 930    **** 3XF DC7  0.033 0.076  0.18  0.46   1.0   2.4   6.5    14    31    75   168
Core i7 860    #### 3XF DC8  0.033 0.076  0.18  0.46   1.0   2.4   6.3    14    30    72   171
Core i7 4820K  $$$$ 3VF QC9  0.021 0.047  0.11  0.28   0.6   1.5   4.1     9    19    48   113

     $$$$ 3.7 GHz i7 4820K, running at up to 3.90 GHz using Turbo Boost              
     #### 2.8 GHz i7 860,   running at up to 3.46 GHz using Turbo Boost (but detuned)
     **** 2.8 GHz i7 930,   running at up to 3.06 GHz using Turbo Boost              
     @@@@ 1.6 GHz i5 2467M, running at up to 2.30 GHz using Turbo Boost              

AMD K62         350 37B 100    1.1   2.4   6.2    27    65   167   375   903  2012  4336  9219
Duron           700 44F 133   0.17  0.37  0.82   2.4    14    74   170   399  1065  2423  5361
Athlon Tbird   1200 46F 133   0.10  0.21  0.46   1.3   6.1    20   167   401   934  2056  4605
Athlon Tbird   1330 46F 133  0.090  0.20  0.43   1.2   5.5    19   122   291   751  1712  3623
Athlon 4       1725 46F DD1  0.066  0.15  0.32  0.91   4.3    11    82   193   462  1035  2160
Athlon 4 Bart  1800 47F#DD1  0.064  0.14  0.32  0.88   4.1    10    28   361   819  1800  3716
Turion 64 M    1900 47F DC4  0.072  0.16  0.34  0.89   4.0   9.3    23    99   233   556  1226
Athlon XP      2080 46F DD2  0.056  0.12  0.27  0.76   3.5   9.2    74   176   428   967  2014
Athlon 64aa    2210 47F DC3  0.051  0.11  0.25  0.73   3.0   7.4    17   101   227   514  1139
Phenom         3000 4ZF DC8  0.037 0.082  0.19  0.50   1.8   4.4    11    30    66   192   598

Core 2 Duo A nForce 570 chipset, Core 2 Duo B Intel 965 chipset
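An FFT of N points needs on the order of N log N operations, so doubling the size should slightly more than double the time while the data stays at the same level of the memory hierarchy. Larger jumps suggest a move from cache to slower memory. The sketch below applies this to the Pentium 4E row of the table above; the threshold of 3.0 is an arbitrary assumption chosen for illustration.

```python
# Pentium 4E 3000 MHz, Version 1 single precision, from the table above (ms).
sizes_k = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
times_ms = [0.072, 0.15, 0.38, 0.83, 1.8, 4.2, 10, 40, 226, 494, 1043]

# Assumed threshold: a step is called "steep" if the time more than triples.
THRESHOLD = 3.0


def growth_ratios(times):
    """Ratio of each time to the time for the previous (half) size."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


ratios = growth_ratios(times_ms)
steep_sizes = [size for size, ratio in zip(sizes_k[1:], ratios) if ratio > THRESHOLD]

for size, ratio in zip(sizes_k[1:], ratios):
    flag = "  <-- steep" if ratio > THRESHOLD else ""
    print(f"{size:5d}K  x{ratio:.2f}{flag}")
print("steep steps at sizes (K):", steep_sizes)
```

For this row, the steps to 128K and 256K stand out. The same calculation can be applied to any other row to see where its timings start to depend on slower memory.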


Version 1 Double Precision Milliseconds

                     Cache       FFT Size K --->                                              
Processor       MHz  & RAM       1     2     4     8    16    32    64   128   256   512  1024

80486            66 15B  33     21    46    99   262   677  1493  3131  6595 13489            
Pentium         100 16B  50    4.7    11    29    65   159   415   947  2010  4256            
Pentium MMX     200 27B  66    1.6   5.5    13    27    60   176   393   903  1911  4051      
Pentium Pro     200 16F  66    1.4   3.0   6.7    19    65   190   430   925  1980  4222      
Celeron A       400 25F  66   0.49   1.2   4.6    18    56   134   295   635  1385  2916  6081
Pentium II      450 27H 100   0.45   2.1   4.7    10    23    65   233   528  1161  2381  4935
Pentium IIIE    550 26F 100   0.30  0.79   1.7   3.9    17    82   193   416   848  1819  3742
Pentium IIIEB   733 26F 133   0.23  0.60   1.3   3.0    16    69   160   349   750  1600  3295
Pentium 4      1500 16F RD2   0.19  0.43  0.96   2.5   9.8    48   119   284   650  1392  3080
Pentium IIIEB  1000 26F 133   0.17  0.41   1.0   2.6    16    68   166   360   772  1645  3493
Pentium IIIEB  1000 26F RD1   0.17  0.41  0.91   2.1   8.7    47   110   240   512  1142  2400
Pentium 4      1800 16F DD1   0.16  0.35  0.79   2.0    12    59   139   318   701  1478  3271
Pentium 4      1900 16F 133   0.15  0.33  0.75   1.9    13    92   213   463  1006  2095  4463
P4 Xeon        2200 17F RD2   0.13  0.29  0.65   1.7   4.1    19   101   247   574  1217  2684
Celeron M      1295 38F       0.11  0.25  0.67   1.5   3.2   7.2    39   296   712  1518  3127
Pentium 4N     3066 17F DD1  0.084  0.19  0.43   1.1   2.8    19   138   314   696  1421  3070
Pentium 4E     3000 28F DC3  0.076  0.20  0.42  0.93   2.1   5.0    22   114   251   524  1144
Pentium M2     1862 39F DC1  0.074  0.17  0.47   1.0   2.2   4.9    12    45   260   625  1361
Atom M         1600 H7F SCC   0.26  0.64   1.4   3.1   6.8    26   118   262   567  1156  2439
Core 2 Duo M   1830 39F DC5  0.069  0.17  0.45   1.0   2.3   5.0    12    41   200   428   871
Celeron C2 M   2000 38F DC4  0.064  0.15  0.42  0.94   2.1   4.6    26   139   301   605  1231
Core2 Duo A1CP 2400 3AF DC4  0.052  0.13  0.35  0.79   1.8   3.9   8.5    20    85   781  1824
Core2 Duo B1CP 2400 3AF DC4  0.052  0.13  0.35  0.79   1.8   3.9   8.5    20    54   293   704
Core2 Duo B1CP 2400 3AF DC6  0.052  0.13  0.35  0.78   1.7   3.8   8.5    20    52   226   543
Core i5 2467M  @@@@ 3WF DC8  0.041 0.094  0.24  0.54   1.2   3.3   7.3    17    55   128   281
Core i7 930    **** 3XF DC7  0.040 0.091  0.23  0.52   1.2   3.2   7.1    15    37    86   284
Core i7 860    #### 3XF DC8  0.040 0.092  0.23  0.52   1.2   3.1   6.8    15    36    87   259
Core i7 4820K  $$$$ 3VF QC9  0.023 0.054  0.14  0.31   0.7   2.0   4.4     9    23    57   192

     $$$$ 3.7 GHz i7 4820K, running at up to 3.90 GHz using Turbo Boost              
     #### 2.8 GHz i7 860,   running at up to 3.46 GHz using Turbo Boost (but detuned)
     **** 2.8 GHz i7 930,   running at up to 3.06 GHz using Turbo Boost              
     @@@@ 1.6 GHz i5 2467M, running at up to 2.30 GHz using Turbo Boost              

AMD K62         350 37B 100    1.1   3.0    12    24    66   172   501  1141  2448  5082 10275
Duron           700 44F 133   0.20  0.43   1.3   7.6    39    90   205   547  1248  2756  5972
Athlon Tbird   1200 46F 133   0.11  0.23  0.66   3.0    11    89   209   529  1188  2605  5629
Athlon Tbird   1330 46F 133   0.11  0.23  0.63   2.6   8.7    75   170   410   912  1952  4242
Athlon 4       1725 46F DD1  0.074  0.16  0.47   2.1   5.8    47   107   248   545  1146  2464
Athlon 4 Bart  1800 47F#DD1  0.075  0.16  0.46   1.9   4.6    16   186   422   926  1918  4065
Turion 64 M    1900 47F DC4  0.069  0.15  0.44   1.9   4.4    11    50   118   277   614  1366
Athlon XP      2080 46F DD2  0.065  0.14  0.40   1.7   4.7    34    83   211   479  1009  2196
Athlon 64aa    2210 47F DC3  0.058  0.13  0.36   1.4   3.4   8.9    51   119   258   559  1219
Phenom         3000 4ZF DC8  0.042  0.10  0.25  0.90   2.2   5.3    15    33    94   303   740

Core 2 Duo A nForce 570 chipset, Core 2 Duo B Intel 965 chipset


Version 2 Single Precision Milliseconds

                     Cache       FFT Size K --->                                              
Processor       MHz  & RAM       1     2     4     8    16    32    64   128   256   512  1024

80486            66 15B  33     16    35    82   186   403   858  1870  3948  8451            
Pentium         100 16B  50    3.1   7.3    16    36    86   195   431   924  1952            
Pentium MMX     200 27B  66    1.4   3.3   8.0    17    38    87   194   423   899  1894      
Pentium Pro     200 16F  66   0.67   1.5   3.2   7.0    23    54   119   250   526  1115      
Celeron A       400 25F  66   0.30  0.68   1.7   6.4    16    38    84   189   401   850  1789
Pentium IIIE    550 26F 100   0.21  0.47   1.1   2.3   7.1    19    44    95   201   429   940
Pentium IIIEB   660 26F 133   0.17  0.39   0.9   1.9   6.2    17    38    85   188   410   872
Celeron 2       900 25F 100   0.13  0.30  0.82   2.7    12    33    73   166   344   736  1568
PIII Tualatin  1266 27F 133  0.088  0.20  0.45   1.0   2.3   6.1    19    50   117   264   569
Pentium 4      1900 16F 133  0.075  0.18  0.46   1.1   3.3   9.4    27    69   160   353   768
Celeron M      1295 38F      0.073  0.16  0.36  0.84   1.9   4.3    11    34    91   211   484
Pentium 4E     3000 28F DC3  0.061  0.13  0.40   1.0   2.2   4.7    10    25    61   128   297
Pentium 4N     2400 17F RD2  0.060  0.14  0.35  0.78   1.9   5.1    16    48   118   259   575
Pentium 4N     2400 17F 133  0.060  0.14  0.35  0.78   2.0   5.9    20    58   128   283   648
Pentium M2     1862 39F DC1  0.052  0.11  0.25  0.59   1.4   2.9   6.3    16    41   107   245
Pentium 4N     3066 17F DD1  0.045  0.12  0.28  0.62   1.5   4.3    15    46   111   235   524
Atom M         1600 H7F SCC   0.46  0.49   1.1   2.3   5.3    12    28    68   147   324   700
Core 2 Duo M   1830 39F DC5  0.048  0.11  0.25  0.58   1.4   2.9   6.4    15    37    90   198
Celeron C2 M   2000 38F DC4  0.044  0.10  0.23  0.53   1.2   2.7   6.1    17    42    96   216
Core2 Duo A1CP 2400 3AF DC4  0.035 0.080  0.18  0.44   1.0   2.2   4.7    10    27    83   246
Core2 Duo B1CP 2400 3AF DC4  0.036 0.080  0.18  0.44   1.0   2.2   4.8    10    24    60   151
Core2 Duo B1CP 2400 3AF DC6  0.054  0.12  0.19  0.44   1.0   2.2   4.7    11    24    58   140
Core i5 2467M  @@@@ 3WF DC8  0.030 0.061  0.13  0.30  0.70   1.5   3.3   7.3    16    39    84
Core i7 930    **** 3XF DC7  0.026 0.054  0.12  0.27  0.64   1.4   3.0   6.5    14    32    78
Core i7 860    #### 3XF DC8  0.026 0.055  0.12  0.28  0.63   1.4   3.0   6.4    14    32    74
Core i7 4820K  $$$$ 3VF QC9  0.016 0.034  0.07  0.17  0.40   0.9   1.9   4.1     9    20    49

     $$$$ 3.7 GHz i7 4820K, running at up to 3.90 GHz using Turbo Boost              
     #### 2.8 GHz i7 860,   running at up to 3.46 GHz using Turbo Boost (but detuned)
     **** 2.8 GHz i7 930,   running at up to 3.06 GHz using Turbo Boost              
     @@@@ 1.6 GHz i5 2467M, running at up to 2.30 GHz using Turbo Boost              

Duron           700 44F 133   0.13  0.26  0.55   1.7   6.4    17    42    96   229   524  1199
Athlon Tbird   1200 46F 133  0.075  0.16  0.33  0.89   3.5    12    36    82   199   465  1089
Athlon 4       1410 46F DD1  0.062  0.13  0.27  0.76   2.2   6.8    18    41   100   217   497
Athlon 4       1794 46F DD3  0.049  0.11  0.22  0.60   1.8   5.3    14    31    75   163   364
Athlon 4 Bart  1800 47F#DD1  0.049  0.10  0.22  0.61   1.6   4.8    22    52   126   277   620
Turion 64 M    1900 47F DC4  0.047  0.10  0.20  0.55   1.5   3.7    11    26    59   132   301
Athlon XP      2080 46F DD2  0.043 0.089  0.19  0.52   1.6   4.9    13    29    71   171   380
Athlon 64aa    2210 47F DC3  0.040 0.086  0.18  0.47   1.2   3.0   9.2    21    47   106   247
Phenom         3000 4ZF DC8  0.026 0.056  0.12  0.30  0.75   1.8   4.5    11    24    57   162

Core 2 Duo A nForce 570 chipset, Core 2 Duo B Intel 965 chipset


Version 2 Double Precision Milliseconds

                     Cache       FFT Size K --->                                              
Processor       MHz  & RAM       1     2     4     8    16    32    64   128   256   512  1024

80486            66 15B  33     20    50   113   251   536  1258  2660  5654 11698            
Pentium         100 16B  50    4.0   8.6    20    50   121   268   582  1224  2614            
Pentium MMX     200 27B  66    1.7   4.5    10    21    49   111   244   560  1148  2417      
Pentium Pro     200 16F  66   0.91   2.0   4.2    14    35    81   172   374   817  1779      
Celeron A       400 25F  66   0.34  0.88   4.2    13    38    78   172   365   782  1645  3486
Pentium IIIE    550 26F 100   0.23  0.55   1.2   3.0    11    26    58   127   278   618  1374
Pentium IIIEB   660 26F 133   0.19  0.46   1.0   2.6    10    23    54   121   262   577  1276
Celeron 2       900 25F 100   0.15  0.36   1.6    12    33    77   170   333   683  1438  3073
PIII Tualatin  1266 27F 133   0.10  0.23  0.49   1.1   4.3    11    31    78   184   412   917
Pentium 4      1900 16F 133   0.10  0.23  0.51   1.4   5.9    16    36    85   185   406   907
Celeron M      1295 38F      0.082  0.19  0.44  0.94   2.1   6.2    21    56   130   295   668
Pentium 4N     2400 17F 133  0.075  0.18  0.39   1.0   3.7    12    33    75   169   372   819
Pentium 4N     2400 17F RD2  0.074  0.18  0.39   1.0   3.0   9.0    23    57   128   285   651
Pentium 4E     3000 28F DC3  0.062  0.18  0.49  0.97   2.8   5.9    15    34    73   167   390
Pentium 4N     3066 17F DD1  0.058  0.14  0.30  0.80   2.6   8.6    24    56   124   273   620
Pentium M2     1862 39F DC1  0.058  0.13  0.31  0.65   1.5   3.2   8.2    23    63   146   334
Atom M         1600 H7F SCC   0.23  0.51   1.1   2.4   5.6    14    32    71   156   337   739
Core 2 Duo M   1830 39F DC5  0.055  0.12  0.29  0.63   1.4   3.1   7.2    20    48   105   233
Celeron C2 M   2000 38F DC4  0.051  0.12  0.27  0.58   1.3   3.3   9.9    24    53   118   269
Core2 Duo A1CP 2400 3AF DC4  0.041 0.094  0.22  0.48   1.1   2.3   5.0    16    59   164   418
Core2 Duo B1CP 2400 3AF DC4  0.041 0.094  0.22  0.48   1.1   2.3   5.0    12    32    83   191
Core2 Duo B1CP 2400 3AF DC6  0.042  0.10  0.22  0.48   1.1   2.4   5.2    12    31    75   167
Core i5 2467M  @@@@ 3WF DC8  0.032 0.069  0.15  0.33  0.90   1.7   3.7   8.6    20    44    97
Core i7 930    **** 3XF DC7  0.028 0.062  0.14  0.30  0.73   1.6   3.4   7.2    17    41    95
Core i7 860    #### 3XF DC8  0.028 0.062  0.14  0.30  0.71   1.5   3.3   7.0    16    39    88
Core i7 4820K  $$$$ 3VF QC9  0.017 0.038  0.09  0.19  0.47   1.0   2.2   4.6    10    26    62

     $$$$ 3.7 GHz i7 4820K, running at up to 3.90 GHz using Turbo Boost              
     #### 2.8 GHz i7 860,   running at up to 3.46 GHz using Turbo Boost (but detuned)
     **** 2.8 GHz i7 930,   running at up to 3.06 GHz using Turbo Boost              
     @@@@ 1.6 GHz i5 2467M, running at up to 2.30 GHz using Turbo Boost              

Duron           700 44F 133   0.14  0.28  0.88   5.0    15    34    76   172   379   836  1870
Athlon Tbird   1200 46F 133  0.081  0.17  0.45   1.6   7.9    22    53   123   282   645  1485
Athlon 4       1410 46F DD1  0.066  0.14  0.38   1.3   4.7    12    26    61   137   306   695
Athlon 4       1794 46F DD3  0.057  0.11  0.31   1.1   3.8   9.5    21    47   105   227   517
Athlon 4 Bart  1800 47F#DD1  0.053  0.11  0.30   1.0   4.2    15    35    81   183   409   925
Turion 64 M    1900 47F DC4  0.049  0.10  0.28   1.0   2.6   7.1    16    34    78   177   396
Athlon XP      2080 46F DD2  0.046  0.10  0.26  0.89   3.6   8.8    19    44   102   229   527
Athlon 64aa    2210 47F DC3  0.041 0.086  0.22  0.77   2.1   5.8    13    29    66   147   326
Phenom         3000 4ZF DC8  0.028 0.059  0.15  0.45   1.0   2.5   5.6    13    32    88   216

Core 2 Duo A nForce 570 chipset, Core 2 Duo B Intel 965 chipset


Version 3 Single Precision Milliseconds

                     Cache       FFT Size K --->                                                
Processor       MHz  & RAM       1     2     4     8    16    32    64   128   256   512  1024  

Pentium         200 16B  66    1.5   3.9   8.4    19    43    97   220   484  1048  2218  4611  
Pentium MMX     200 27B  66    1.4   3.2   7.8    17    38    86   192   417   882  1869        
Pentium Pro     200 16F  66   0.80   1.8   3.8   8.3    22    55   121   263   557  1220        
Pentium II      400 27H 100   0.30  0.77   2.5   5.4    12    31    81   189   409   876  1897  
Celeron A       450 25F 100   0.27  0.60   1.4   3.3    10    24    51   109   237   502  1092  
Pentium IIIE    550 26F 100   0.18  0.40  0.90   1.9   6.2    18    40    87   185   394   838 S
Pentium 4      1900 16F 133  0.074  0.16  0.35  0.71   2.3   7.6    22    50   107   236   571 S
Celeron M      1295 38F      0.071  0.15  0.33  0.83   1.9   4.3    11    33    86   194   436 S
Pentium 4N     2400 17F 133  0.058  0.13  0.27  0.57   1.4   4.3    16    49   104   224   521 S
Pentium 4N     2400 17F RD2  0.057  0.12  0.27  0.56   1.4   3.6    12    32    70   156   364 S
Pentium 4N     2533 17F DD1  0.055  0.12  0.25  0.52   1.3   3.6    12    37    78   169   393 S
Pentium 4N     2533 17F RD3  0.055  0.12  0.26  0.53   1.3   3.3    10    26    57   124   289 S
Pentium 4N     2533 17F DC1  0.054  0.12  0.25  0.52   1.3   3.4    11    30    65   144   338 S
Pentium 4E     3000 28F DC3  0.052  0.11  0.23  0.49   1.2   2.8   6.3    17    39    83   182 S
Pentium M2     1862 39F DC1  0.050  0.10  0.23  0.58   1.3   2.9   6.3    15    39    95   213 S
Pentium 4N     3066 17F DD1  0.044  0.10  0.21  0.44   1.1   3.1    11    33    71   154   359 S
Pentium 4N     3678 17F DC3  0.038 0.086  0.18  0.37  0.91   2.3   6.9    19    42    92   231 S
Atom M         1600 H7F SCC   0.22  0.23  0.58   1.2   2.9   6.6    17    42    92   200   437 S
Core 2 Duo M   1830 39F DC5  0.033  0.07  0.16  0.38  0.89   2.0   5.0    10    27    65   136 S
Celeron C2 M   2000 38F DC4  0.032  0.07  0.15  0.35  0.82   1.8   4.4    14    34    73   159 S
Core2 Duo A1CP 2400 3AF DC4  0.024 0.053  0.12  0.29  0.67   1.5   3.2   6.8    19    66   213 S
Core2 Duo B1CP 2400 3AF DC4  0.025 0.053  0.12  0.29  0.67   1.5   3.2   6.8    16    42   108 S
Core2 Duo B1CP 2400 3AF DC6  0.025 0.053  0.12  0.29  0.70   1.5   3.2   7.0    16    40   100 S
Core i5 2467M  @@@@ 3WF DC8  0.019 0.044  0.09  0.21  0.50   1.1   2.5   5.4    12    31    68 S
Core i7 860    #### 3XF DC8  0.023 0.048  0.10  0.23  0.57   1.3   2.7   5.7    12    27    65 S
Core i7 930    **** 3XF DC7  0.017 0.035  0.08  0.18  0.45   1.0   2.1   4.6    10    23    58 S
Core i7 4820K  $$$$ 3VF QC9  0.010 0.022  0.05  0.12  0.29   0.7   1.4   3.0     7    15    37 S

     $$$$ 3.7 GHz i7 4820K, running at up to 3.90 GHz using Turbo Boost              
     #### 2.8 GHz i7 860,   running at up to 3.46 GHz using Turbo Boost (but detuned)
     **** 2.8 GHz i7 930,   running at up to 3.06 GHz using Turbo Boost              
     @@@@ 1.6 GHz i5 2467M, running at up to 2.30 GHz using Turbo Boost              

Duron           750 44F 133   0.11  0.23  0.48   1.4   5.9    15    36    81   201   475  1112  
Athlon Tbird   1200 46F 133  0.072  0.15  0.30  0.77   3.3    11    35    77   176   402   932  
Athlon 4       1794 46F DD3  0.050  0.10  0.21  0.56   2.0   6.7    19    41    90   203   478 S
Athlon 4 Bart  1800 47F#DD1  0.050  0.10  0.22  0.56   1.5   4.8    22    54   119   265   602 S
Turion 64 M    1900 47F DC4  0.052  0.10  0.21  0.50   1.4   3.4    10    23    52   112   252 S
Athlon XP      2080 46F DD2  0.043 0.089  0.19  0.48   1.6   5.0    13    29    65   150   364 S
Athlon 64a     2000 48F DD3  0.041 0.083  0.17  0.46   1.3   3.1   7.2    20    49   112   262 S
Opteron        2000 48F DD3  0.040 0.082  0.17  0.46   1.3   3.0   7.4    21    50   121   289 S
Athlon 64aa    2210 47F DC3  0.036 0.074  0.15  0.40   1.1   2.9   8.4    18    43    95   214 S
Phenom         3000 4ZF DC8  0.020 0.041 0.085  0.22  0.60   1.5   3.8   8.5    19    46   133 S

Core 2 Duo A nForce 570 chipset, Core 2 Duo B Intel 965 chipset
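The gains between versions can be read off by dividing table entries. As a small worked example, the sketch below uses the 3000 MHz Pentium 4E single precision times at 1024K from the Version 1, Version 2 and Version 3 tables; the variable and function names are illustrative only.

```python
# Pentium 4E 3000 MHz, single precision, 1024K FFT (milliseconds).
p4e_sp_1024k_ms = {"Version 1": 1043, "Version 2": 297, "Version 3": 182}


def speedup(slower_ms, faster_ms):
    """How many times faster the second timing is than the first."""
    return slower_ms / faster_ms


v1_to_v2 = speedup(p4e_sp_1024k_ms["Version 1"], p4e_sp_1024k_ms["Version 2"])
v2_to_v3 = speedup(p4e_sp_1024k_ms["Version 2"], p4e_sp_1024k_ms["Version 3"])
v1_to_v3 = speedup(p4e_sp_1024k_ms["Version 1"], p4e_sp_1024k_ms["Version 3"])

print(f"Version 1 to 2: {v1_to_v2:.2f} x")
print(f"Version 2 to 3: {v2_to_v3:.2f} x")
print(f"Version 1 to 3: {v1_to_v3:.2f} x")
```

On this PC most of the gain at 1024K comes from the cache and burst handling changes in Version 2, with the SSE code in Version 3 adding a smaller further improvement.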


Version 3 Double Precision Milliseconds

                     Cache       FFT Size K --->                                                
Processor       MHz  & RAM       1     2     4     8    16    32    64   128   256   512  1024  

Pentium         200 16B  66    2.2   4.8    11    23    58   131   291   701  1509  3031  6274  
Pentium MMX     200 27B  66    1.6   4.4   9.4    20    49   111   244   532  1133  2392        
Pentium Pro     200 16F  66    1.0   2.2   4.9    17    37    88   193   410   881  1932        
Pentium II      400 27H 100   0.41   1.6   3.5   7.7    21    54   128   285   620  1339  2914  
Celeron A       450 25F 100   0.29  0.73   1.7   8.4    21    44    95   205   445   949  2037  
Pentium IIIE    550 26F 100   0.23  0.53   1.1   2.8    11    25    57   127   276   604  1319  
Celeron M      1295 38F       0.10  0.23  0.54   1.2   2.7   7.3    23    60   138   308   691 S
Pentium 4      1900 16F 133  0.091  0.19  0.40   1.0   4.9    14    32    69   147   322   755 S
Pentium 4N     2400 17F 133  0.071  0.15  0.31  0.70   3.0    11    31    68   147   315   712 S
Pentium M2     1862 39F DC1  0.070  0.16  0.38  0.81   1.9   4.0    10    26    66   151   331 S
Pentium 4N     2400 17F RD2  0.069  0.14  0.31  0.68   2.3   7.1    19    42    92   197   440 S
Pentium 4N     2533 17F RD3  0.067  0.14  0.30  0.66   2.0   5.8    15    34    74   158   354 S
Pentium 4N     2533 17F DC1  0.065  0.14  0.30  0.64   2.2   6.8    19    41    89   190   428 S
Pentium 4N     2533 17F DD1  0.065  0.14  0.29  0.64   2.4   7.9    23    49   107   230   519 S
Pentium 4E     3000 28F DC3  0.058  0.13  0.28  0.60   1.5   3.7    11    24    51   114   255 S
Pentium 4N     3066 17F DD1  0.054  0.11  0.24  0.54   2.1   7.3    21    45    99   212   475 S
Pentium 4N     3678 17F DC3  0.046  0.10  0.21  0.44   1.4   4.2    11    25    54   115   256 S
Atom M         1600 H7F SCC   0.19  0.46  0.99   2.1   5.2    13    30    70   155   339   746 S
Core 2 Duo M   1830 39F DC5  0.042 0.094  0.23  0.50   1.1   2.5   6.3    17    42    89   190 S
Celeron C2 M   2000 38F DC4  0.040 0.087  0.21  0.46   1.1   2.8   9.1    22    49   106   232 S
Core2 Duo A1CP 2400 3AF DC4  0.031 0.070  0.17  0.38  0.87   1.9   4.1    13    50   154   362 S
Core2 Duo B1CP 2400 3AF DC4  0.031 0.071  0.18  0.38  0.87   1.9   4.1    10    28    70   158 S
Core2 Duo B1CP 2400 3AF DC6  0.031 0.071  0.18  0.38  0.90   1.9   4.1    10    25    63   139 S
Core i5 2467M  @@@@ 3WF DC8  0.024 0.059  0.14  0.27  0.66   1.5   3.2   7.2    17    38    84 S
Core i7 860    #### 3XF DC8  0.028 0.062  0.14  0.30  0.76   1.7   3.6   7.6    17    40    88 S
Core i7 930    **** 3XF DC7  0.023 0.051  0.12  0.26  0.65   1.4   3.1   6.5    15    37    86 S
Core i7 4820K  $$$$ 3VF QC9  0.013 0.030  0.07  0.15  0.40   0.9   1.9   4.0     9    22    55 S

     $$$$ 3.7 GHz i7 4820K, running at up to 3.90 GHz using Turbo Boost              
     #### 2.8 GHz i7 860,   running at up to 3.46 GHz using Turbo Boost (but detuned)
     **** 2.8 GHz i7 930,   running at up to 3.06 GHz using Turbo Boost              
     @@@@ 1.6 GHz i5 2467M, running at up to 2.30 GHz using Turbo Boost              

Duron           750 44F 133   0.12  0.25  0.85   5.1    15    33    73   163   365   821  1872  
Athlon Tbird   1200 46F 133  0.080  0.16  0.46   1.6   8.1    23    52   120   274   636  1505  
Athlon 64a     2000 48F DD3  0.063  0.13  0.34   1.1   2.5   5.9    16    35    77   174   392 S
Opteron        2000 48F DD3  0.062  0.13  0.34   1.1   2.5   6.0    15    34    80   187   433 S
Athlon 64aa    2210 47F DC3  0.056  0.12  0.29  0.90   2.4   6.3    14    30    65   145   315 S
Athlon 4       1794 46F DD3  0.049  0.10  0.32   1.2   4.6    12    25    53   117   265   636  
Athlon 4 Bart  1800 47F#DD1  0.049  0.10  0.31   1.2   4.2    15    36    79   172   373   872  
Turion 64 M    1900 47F DC4  0.068  0.14  0.36   1.1   2.9   7.4    16    35    76   167   367 S
Athlon XP      2080 46F DD2  0.043 0.092  0.27  0.99   3.6   9.1    20    42    90   205   482  
Phenom         3000 4ZF DC8  0.028 0.058  0.15  0.53   1.3   2.9   6.3    14    32    82   186 S

Core 2 Duo A nForce 570 chipset, Core 2 Duo B Intel 965 chipset


AMD Athlon 64, Phenom II, Intel Core 2 Duo, Core i7 - 32/64 Bit Milliseconds

                                 FFT Size K --->                                                  
                                 1     2     4     8    16    32    64   128   256   512  1024    
Single Precision                                                                                

Athlon 64    2210 MHz  Windows XP Pro x64                                                       

V1 MS compiler 64b All C/C++ 0.043  0.15  0.43   1.9   6.0    16    50   160   357   753  1618 S
V2 MS compiler 64b All C/C++  0.10  0.22  0.48   1.1   2.4   5.6    14    34    73   161   356 S
V3 MS compiler 64b SSE code  0.035 0.072  0.15  0.40   1.1   2.9   8.8    20    43    96   211 S

V1 Original    32b All C/C++ 0.051  0.11  0.25  0.73   3.0   7.4    17   101   227   514  1139  
V2 Original    32b i386 code 0.040 0.086  0.18  0.47   1.2   3.0   9.2    21    47   106   247  
V3 Original    32b SSE Code  0.036 0.074  0.15  0.40   1.1   2.9   8.4    18    43    95   214 S


Phenom II 3000 MHz Windows 7 64 bit                                                             

V1 MS compiler 64b All C/C++ 0.026  0.07  0.16   1.4     6    14    33    79   193   453  1022 S
V2 MS compiler 64b All C/C++ 0.068  0.15  0.32  0.76   1.7   4.0   9.3    22    47   113   287 S
V3 MS compiler 64b SSE code  0.020 0.040  0.08  0.22  0.64   1.6   3.9   8.7    19    46   133 S

V1 Original    32b All C/C++ 0.037  0.08  0.19  0.50   1.8   4.4    11    30    66   192   598  
V2 Original    32b i386 code 0.026 0.056  0.12  0.30   0.8   1.8   4.5    11    24    57   162  
V3 Original    32b SSE Code  0.020 0.041  0.09  0.22  0.60   1.5   3.8   8.5    19    46   133 S


Core 2 Duo   2400 MHz Windows Vista 64 bit                                                      

V1 MS compiler 64b All C/C++ 0.039  0.10  0.28  0.69   1.6   3.7   8.4    21    50   121   394 S
V2 MS compiler 64b All C/C++ 0.047  0.11  0.26  0.41   0.9   2.0   4.3    10    22    52   127 S
V3 MS compiler 64b SSE code  0.034  0.07  0.11  0.28   0.6   1.4   3.0   6.7    16    37    90 S

V1 Original    32b All C/C++ 0.043  0.11  0.26  0.71   1.6   3.7   8.2    18    42   104   439  
V2 Original    32b i386 code 0.054  0.12  0.19  0.44   1.0   2.2   4.7    11    24    58   140  
V3 Original    32b SSE Code  0.025  0.05  0.12  0.29   0.7   1.5   3.2   7.0    16    40   100 S


Core i7 2800 MHz Windows 7 64 bit (up to 3.06 GHz using Turbo Boost)                            

V1 MS compiler 64b All C/C++ 0.023  0.05  0.15  0.38  0.89   2.1   5.5    13    34    80   178 S
V2 MS compiler 64b All C/C++ 0.021  0.05  0.10  0.24  0.54   1.2   2.5   5.5    12    30    76 S
V3 MS compiler 64b SSE code  0.015  0.03  0.07  0.17  0.42  0.94   2.0   4.3   9.3    22    54 S

V1 Original    32b All C/C++ 0.033  0.08  0.18  0.46   1.0   2.4   6.5    14    31    75   168  
V2 Original    32b i386 code 0.026  0.05  0.12  0.27  0.64   1.4   3.0   6.5    14    32    78  
V3 Original    32b SSE Code  0.017  0.04  0.08  0.18  0.45   1.0   2.1   4.6    10    23    58 S


Core i7 4820K 3900 MHz Windows 8.1 64 bit (Using Turbo Boost, from 3700 MHz)

V1 MS compiler 64b All C/C++ 0.013  0.03  0.09  0.26  0.63   1.6   4.0    10    27    69   171 S
V2 MS compiler 64b All C/C++ 0.022  0.05  0.11  0.24  0.54   1.2   2.5   5.1    11    24    53 S
V3 MS compiler 64b SSE code  0.009  0.02 0.043  0.11  0.28  0.62   1.4   2.9   6.3    15    35 S

V1 Original    32b All C/C++ 0.021 0.047  0.11  0.28  0.60   1.5   4.1     9    19    48   113  
V2 Original    32b i386 code 0.016 0.034  0.07  0.17  0.40   0.9   1.9   4.1     9    20    49  
V3 Original    32b SSE Code  0.010 0.022  0.05  0.12  0.29   0.7   1.4   3.0     7    15    37 S


Double Precision                                                                                

Athlon 64    2210 MHz  Windows XP Pro x64                                                       

V1 MS compiler 64b All C/C++ 0.069  0.22  0.87   3.0   7.6    24    67   193   427   907  1923 S
V2 MS compiler 64b All C/C++  0.11  0.24  0.54   1.4   3.3   8.5    19    42    90   192   438 S
V3 MS compiler 64b SSE2 code 0.056  0.12  0.29  0.87   2.3   5.9    13    29    63   138   306 S

V1 Original    32b All C/C++ 0.058  0.13  0.36   1.4   3.4   8.9    51   119   258   559  1219  
V2 Original    32b i386 code 0.041 0.086  0.22  0.77   2.1   5.8    13    29    66   147   326  
V3 Original    32b SSE2 Code 0.056  0.12  0.29  0.90   2.4   6.3    14    30    65   145   315 S


Phenom II 3000 MHz Windows 7 64 bit                                                             

V1 MS compiler 64b All C/C++ 0.042  0.10  0.71   2.8   6.9    17    40    94   242   575  1305 S
V2 MS compiler 64b All C/C++ 0.086  0.19  0.44   1.3   2.8   6.1    14    29    65   153   351 S
V3 MS compiler 64b SSE2 code 0.028 0.059  0.15  0.54   1.3   2.9   6.3    14    32    81   180 S

V1 Original    32b All C/C++ 0.042  0.10  0.25   0.9   2.2   5.3    15    33    94   303   740  
V2 Original    32b i386 code 0.028 0.059  0.15  0.45   1.0   2.5   5.6    13    32    88   216  
V3 Original    32b SSE2 Code 0.028 0.058  0.15  0.53   1.3   2.9   6.3    14    32    82   186 S


Core 2 Duo   2400 MHz Windows Vista 64 bit                                                      

V1 MS compiler 64b All C/C++ 0.063  0.15  0.37  0.83   1.9   4.2  10.3    25    61   213   577 S
V2 MS compiler 64b All C/C++ 0.057  0.13  0.29  0.63   1.4   3.0   6.6    15    36    82   186 S
V3 MS compiler 64b SSE2 code 0.030  0.07  0.17  0.37   0.8   1.8   4.2    10    24    59   131 S

V1 Original    32b All C/C++ 0.052  0.13  0.35  0.78   1.7   3.8   8.5    20    52   226   543  
V2 Original    32b i386 code 0.042  0.10  0.22  0.48   1.1   2.4   5.2    12    31    75   167  
V3 Original    32b SSE2 Code 0.031  0.07  0.18  0.38   0.9   1.9   4.1    10    25    63   139 S


Core i7 2800 MHz Windows 7 64 bit (up to 3.06 GHz using Turbo Boost)                            

V1 MS compiler 64b All C/C++ 0.035  0.09  0.21  0.49   1.2   3.0   6.8    18    41    93   295 S
V2 MS compiler 64b All C/C++ 0.032  0.07  0.15  0.33  0.76   1.6   3.5   7.6    19    51   128 S
V3 MS compiler 64b SSE2 code 0.023  0.05  0.12  0.26  0.62   1.4   2.9   6.2    15    39    93 S

V1 Original    32b All C/C++ 0.040  0.09  0.23  0.52   1.2   3.2   7.1    15    37    86   284  
V2 Original    32b i386 code 0.028  0.06  0.14  0.30  0.73   1.6   3.4   7.2    17    41    95  
V3 Original    32b SSE2 Code 0.023  0.05  0.12  0.26  0.65   1.4   3.1   6.5    15    37    86 S


Core i7 4820K 3900 MHz Windows 8.1 64 bit (Using Turbo Boost, from 3700 MHz)                            

V1 MS compiler 64b All C/C++ 0.023  0.06  0.14  0.35  0.84   2.0   4.6    12    31    77   247 S
V2 MS compiler 64b All C/C++ 0.030  0.07  0.14  0.31  0.67   1.4   3.0   6.4    15    36    89 S
V3 MS compiler 64b SSE2 code 0.012  0.03  0.07  0.15  0.37  0.80   1.7   3.7   8.5    22    58 S

V1 Original    32b All C/C++ 0.023 0.054  0.14  0.31   0.7   2.0   4.4     9    23    57   192  
V2 Original    32b i386 code 0.017 0.038  0.09  0.19  0.47   1.0   2.2   4.6    10    26    62  
V3 Original    32b SSE2 Code 0.013 0.030  0.07  0.15  0.40   0.9   1.9   4.0     9    22    55 S
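
The 64 bit and 32 bit rows above can be compared by dividing one set of times by the other. The following is only a minimal sketch of that arithmetic, using the Core i7 4820K double precision V1 rows copied from the table. The names SIZES_K, v1_64b, v1_32b and time_ratios are hypothetical and are not part of the benchmark programs.

```python
# FFT sizes in K, matching the column headings of the tables above.
SIZES_K = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

# Core i7 4820K, double precision, V1 rows copied from the table (milliseconds).
v1_64b = [0.023, 0.06, 0.14, 0.35, 0.84, 2.0, 4.6, 12, 31, 77, 247]
v1_32b = [0.023, 0.054, 0.14, 0.31, 0.7, 2.0, 4.4, 9, 23, 57, 192]


def time_ratios(times_a, times_b):
    """Return times_a / times_b for each FFT size, rounded to 2 decimals."""
    return [round(a / b, 2) for a, b in zip(times_a, times_b)]


# A ratio above 1.0 means the 64 bit build took longer than the 32 bit build.
ratios = time_ratios(v1_64b, v1_32b)
for size_k, ratio in zip(SIZES_K, ratios):
    print(f"{size_k:5d}K  64b/32b time ratio {ratio:.2f}")
```

The same helper can be applied to any pair of rows, for example V1 against V3 on one PC, provided both lists are in the column order of the table.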


Cache & RAM Key


 L1 and L2 cache size e.g. 16 = 8 KB L1 and 256 KB L2

 1 =   8 KB  2 =  16 KB   3 = 32 KB   4 = 64 KB  5 = 128 KB
 6 = 256 KB  7 = 512 KB   8 =  1 MB   9 =  2 MB  A =   4 MB
 H =  24 KB  

 Z = 512 KB + 6 MB    X = 256 KB + 8 MB    W = 256 KB + 3 MB    V = 256 KB + 10 MB

 B = L2 on memory bus   F = At CPU MHz  H = Half CPU MHz   
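
As an illustration of how the size codes above might be read, here is a small sketch that decodes a two character L1/L2 pair such as the "16" example in the key. The helper decode_cache_pair and the table name CACHE_SIZES are hypothetical. The sketch assumes the first character is the L1 code and the second is the L2 code, and it only handles size codes, so H is read as 24 KB and never as the "Half CPU MHz" speed flag.

```python
# Size codes copied from the key above; values are kept as readable strings.
CACHE_SIZES = {
    "1": "8 KB", "2": "16 KB", "3": "32 KB", "4": "64 KB", "5": "128 KB",
    "6": "256 KB", "7": "512 KB", "8": "1 MB", "9": "2 MB", "A": "4 MB",
    "H": "24 KB",
    "Z": "512 KB + 6 MB", "X": "256 KB + 8 MB",
    "W": "256 KB + 3 MB", "V": "256 KB + 10 MB",
}


def decode_cache_pair(code):
    """Decode a two character L1/L2 size code, e.g. "16" (assumed layout)."""
    if len(code) != 2:
        raise ValueError("expected exactly two characters, e.g. '16'")
    l1_code, l2_code = code[0], code[1]
    for ch in (l1_code, l2_code):
        if ch not in CACHE_SIZES:
            raise ValueError(f"unknown cache size code: {ch!r}")
    return CACHE_SIZES[l1_code], CACHE_SIZES[l2_code]


l1, l2 = decode_cache_pair("16")
print(f"L1 {l1}, L2 {l2}")
```

Speed flags (B, F, H) and the bus/memory codes would need their own lookup, since the same letter can mean different things in different positions.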


Bus/Memory Speed

 Numbers 33, 50, 66, 100, 133 = MHz   

 DD1 = DDR at 133 MHz    DC1 = Dual Channel DDR at 133 MHz
 DD2 = DDR at 166 MHz    DC2 = Dual Channel DDR at 166 MHz
 DD3 = DDR at 200 MHz    DC3 = Dual Channel DDR at 200 MHz
 RD2 = RDRAM  400 MHz    RD1 = One  Channel RDRAM  400 MHz
 RD3 = RDRAM  533 MHz    DC4 = DDR2  533 MHz 
 DC5 = DDR2   666 MHz    DC6 = DDR2  800 MHz
 DC7 = DDR3  1066 MHz    DC8 = DDR3 1333 MHz    
 QC9 = DDR3 1600 MHz 4 channel
 SCC = DDR2 533 MHz single channel
 # = Particularly slow memory

 S - last column - uses SSE or SSE2 instructions


Roy Longbottom September 2015

The new Internet Home for my PC Benchmarks is via the link
Roy Longbottom's PC Benchmark Collection