Classic Computer Benchmarks
Roy Longbottom
Note - Here, most of the links to benchmark downloads and results are from 2013 versions of the benchmarks.
Introduction
The Classic Benchmarks are the first programs that set
standards of performance for computers. They were the initial benchmarks in
the Roy Longbottom PC Benchmark Collection, produced via C/C++ in 1996,
following the same execution format:
1: Running time to be calibrated to run for a noticeable time
2: Results to be displayed as programs are running
3: Results or summary to be saved in text log files
4: Facilities to type details of the system used for inclusion in the log (later versions collected these automatically)
5: Two versions to be provided, optimised and non-optimised
6: Installation not to be required, just click and go
7: Execution and source code to be provided in zip files
The initial series of Classic Benchmarks,
to run via Windows, are available to download from
BenchNT.zip and later 32 bit and 64 bit versions in
Win64.zip.
Then there are precompiled versions for DOS and OS/2 in
DosTests.zip and
OS2Tests.zip.
Also there are 32 bit and 64 bit versions to run on PCs using Linux in
classic_benchmarks.tar.gz
See lists of files below.
Original ARMv7 Android apps can be downloaded from
Android Benchmarks.htm
that also contains results, with project files in
Android Benchmarks.zip.
Then newer ones, that automatically select benchmark code for ARM, Intel or MIPS processors at run time, for 32 bit architecture or 64 bit operation, can be obtained via
the web site index
along with 32 bit and 64 bit compilations for the ARM CPU based Raspberry Pi systems.
These Classic Benchmarks were first made available in 1996. With the tools then available, performance was measured using low resolution timers, so a minimum running time of 5 seconds was chosen for individually timed functions to give more accurate results. This led to extended running times for the Whetstone Benchmark and, particularly, the Livermore Loops program. With later high resolution timers, minimum test times were reduced to 1 second.
Early versions of the benchmarks produced text based logs showing the same details as the running displays. Later, the text files provided a summary of the important performance measurements. Details below include copies of the running displays. Of particular significance are the running times during calibration, which determine the number of repeat passes needed for the required running time. The code would be suspect if pass count and time did not increase linearly.
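The calibration step described above can be sketched roughly as follows. This is a minimal Python illustration, not the benchmarks' own C code: the names `calibrate`, `workload` and `looks_linear`, the growth factor of 5, the injectable `timer` and the thresholds are all assumptions chosen for the example.

```python
import time

def calibrate(workload, target_secs=1.0, min_measurable=0.1,
              growth=5, timer=time.perf_counter):
    """Grow the pass count until one run is long enough to time reliably,
    then scale that count to the target running time.
    Returns (passes_to_use, history); history holds (passes, seconds)."""
    passes = 1
    history = []
    while True:
        start = timer()
        workload(passes)              # hypothetical callable running `passes` passes
        elapsed = timer() - start
        history.append((passes, elapsed))
        if elapsed >= min_measurable:
            break
        passes *= growth
    # Scale the last measured pass count up (or down) to the target time.
    passes_to_use = max(1, int(passes * target_secs / elapsed))
    return passes_to_use, history

def looks_linear(history, tolerance=0.25, min_secs=0.05):
    """Check that seconds per pass stays roughly constant over the
    calibration steps that ran long enough to be measured."""
    rates = [secs / passes for passes, secs in history if secs >= min_secs]
    if len(rates) < 2:
        return True                   # too few measurable points to judge
    return max(rates) <= min(rates) * (1 + tolerance)
```

Feeding `looks_linear` the pass counts and times shown in a calibration display gives a quick way to flag a run where the work has been optimised away or the timer is misbehaving.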
Where appropriate, the program source codes include tables of numeric results of floating point calculations that can depend on different compilers and CPU technology. These are checked, producing “Was” and “Should be” numbers, when inconsistent. The former can be included in a table for a new compilation.
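A results check of this kind might look like the following sketch. The function names `check_results` and `report`, the tolerance and the output layout are illustrative assumptions, not taken from the benchmark sources.

```python
import math

def check_results(computed, expected, rel_tol=1e-12):
    """Compare computed numeric results with a stored table.
    Returns a list of (index, was, should_be) for entries that disagree."""
    mismatches = []
    for i, (was, should_be) in enumerate(zip(computed, expected)):
        if not math.isclose(was, should_be, rel_tol=rel_tol, abs_tol=0.0):
            mismatches.append((i, was, should_be))
    return mismatches

def report(mismatches):
    """Format mismatches in a 'Was' / 'Should be' style (layout is hypothetical)."""
    return [f"Test {i + 1}: Was {was:.15e}  Should be {should_be:.15e}"
            for i, was, should_be in mismatches]
```

When a new compiler legitimately produces slightly different values, the "Was" numbers from such a report are the ones that would be copied into a new table.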
Older Files
The following files have been uploaded, containing 71 benchmark execution files and source code to use via DOS, OS/2, Windows or Linux.
dostests.zip 32 bit
Dhry1ND.exe - Dhrystone 1 Benchmark Non-Optimised
Dhry1OD.exe - Dhrystone 1 Benchmark Optimised
Dhry2ND.exe - Dhrystone 2 Benchmark Non-Optimised
Dhry2OD.exe - Dhrystone 2 Benchmark Optimised
LinpCND.exe - Linpack Benchmark Non-Optimised
LinpCOD.exe - Linpack Benchmark Optimised
LiveCND.exe - Livermore Loops Benchmark Non-Optimised
LiveCOD.exe - Livermore Loops Benchmark Optimised
WhetCND.exe - Whetstone Benchmark Non-Optimised Single Precision
WhetCOD.exe - Whetstone Benchmark Optimised Single Precision
WhetDCD.exe - Whetstone Benchmark Optimised Double Precision
cb16bit.zip - DOS 16 bit
DHRY1C16.EXE - Dhrystone 1 compiled with Watcom 10.5 C/C++
WHETC16.EXE - Whetstone compiled with Watcom 10.5 C/C++
WHETBAS.EXE - Whetstone BASIC compiled with BASCOM
WHETPROF.EXE - Whetstone FORTRAN compiled with PROFORT
WHETMS5F.EXE - Whetstone FORTRAN compiled with MS 5.1
os2tests.zip
Dhry1NO2.exe - Dhrystone 1 Benchmark Non-Optimised
Dhry1OO2.exe - Dhrystone 1 Benchmark Optimised
Dhry2NO2.exe - Dhrystone 2 Benchmark Non-Optimised
Dhry2OO2.exe - Dhrystone 2 Benchmark Optimised
LinpCNO2.exe - Linpack Benchmark Non-Optimised
LinpCOO2.exe - Linpack Benchmark Optimised
LiveCNO2.exe - Livermore Loops Benchmark Non-Optimised
LiveCOO2.exe - Livermore Loops Benchmark Optimised
WhetCNO2.exe - Whetstone Benchmark Non-Optimised Single Precision
WhetCOO2.exe - Whetstone Benchmark Optimised Single Precision
WhetDCO2.exe - Whetstone Benchmark Optimised Double Precision
benchnt.zip, for Windows, includes source codes
Dhry1NNT.exe - Dhrystone 1 Benchmark Non-Optimised
Dhry1ONT.exe - Dhrystone 1 Benchmark Optimised
Dhry2NNT.exe - Dhrystone 2 Benchmark Non-Optimised
Dhry2ONT.exe - Dhrystone 2 Benchmark Optimised
LinpCNNT.exe - Linpack Benchmark Non-Optimised
LinpCONT.exe - Linpack Benchmark Optimised
LiveCNNT.exe - Livermore Loops Benchmark Non-Optimised
LiveCONT.exe - Livermore Loops Benchmark Optimised
WhetCNNT.exe - Whetstone Benchmark Non-Optimised Single Precision
WhetCONT.exe - Whetstone Benchmark Optimised Single Precision
WhetDCNT.exe - Whetstone Benchmark Optimised Double Precision
WhetsMS6.exe - Whetstone Benchmark Optimised Single Precision via MS6 C
whetvb4.exe - Whetstone Benchmark compiled by Visual Basic 4
Win64.zip, later 32 bit and 64 bit source codes in newsource.zip
32 bit            64 bit            64 bit, large integers
dhry132.exe       dhry164int32.exe  dhry164int64.exe
dhry232.exe       dhry264int32.exe  dhry264int64.exe
linp32SSE2.exe    linpack64.exe
lloops32SSE2.exe  lloops64.exe
whets32SSE.exe
classic_benchmarks.tar.gz, for Linux, includes source codes
32 bit            64 bit
dhrystone1_NoOpt  dhrystone1_64_NoOpt
dhrystone1        dhrystone1_64
dhrystone2_NoOpt  dhrystone2_64_NoOpt
dhrystone2        dhrystone2_64
linpack_NoOpt     linpack_64_NoOpt
linpack           linpack_64
lloops_NoOpt      lloops_64_NoOpt
lloops            lloops_64
whetstone_NoOpt   whetstone_64_NoOpt
whetstone         whetstone_64
Livermore Kernels (Livermore Loops)
This supercomputer benchmark was first introduced in 1970,
initially comprising 14 kernels of numerical application, written
in Fortran. The number of kernels was increased to 24 in the
1980's. Performance measurements are in terms of Millions of
Floating Point Operations Per Second or MFLOPS. The program also
checks the results for computational accuracy. One main aim was
to avoid producing single number performance comparisons, the 24
kernels being executed three times at different Do-loop spans to
produce short, medium and long vector performance measurements.
If overall averages are quoted, the benchmark reference below
indicates that the geometric mean may be interpreted as a
characteristic rate of computation but it would be more realistic
to retain the range of statistics in terms of geometric, harmonic
and arithmetic means, minimum and maximum. See details in Livermore Loops Results.htm.
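The range of summary statistics mentioned above can be computed as in this small sketch. The function name `summarise` and the sample rates are hypothetical; the three means come from Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import fmean, geometric_mean, harmonic_mean

def summarise(mflops):
    """Return the range of summary statistics for a list of MFLOPS rates."""
    return {
        "maximum": max(mflops),
        "average": fmean(mflops),             # arithmetic mean
        "geometric": geometric_mean(mflops),
        "harmonic": harmonic_mean(mflops),
        "minimum": min(mflops),
    }

# Hypothetical rates for three kernels, chosen for easy arithmetic.
stats = summarise([100.0, 400.0, 1600.0])
```

For positive rates the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so a wide gap between the three is itself a sign of a wide performance range across kernels.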
The original
Fortran Version
is available from Netlib, plus the
Kernels in C.
I used the latter, with the data generation code converted from Fortran to C, to produce a program more suitable to run on PCs.
The benchmark ran successfully until gcc 6 was used for Raspberry Pi 3 at 64
bits, when a few minor changes were required for that platform.
On screen Log
L.L.N.L. 'C' KERNELS: MFLOPS P.C. VERSION 4.0
Optimisation Optimised
Calculating outer loop overhead
1000 times 0.00 seconds
10000 times 0.00 seconds
100000 times 0.00 seconds
1000000 times 0.01 seconds
10000000 times 0.05 seconds
20000000 times 0.07 seconds
40000000 times 0.15 seconds
80000000 times 0.29 seconds
Overhead for each loop 3.6829e-009 seconds
Calibrating part 2 of 3 (There are three displays similar to the following)
Loop count 8 0.01 seconds
Loop count 32 0.04 seconds
Loop count 128 0.16 seconds
Loops 200 x 2 x Passes
Kernel Floating Pt ops
No Passes E No Total Secs. MFLOPS Span Checksums OK
------------ -- ------------- ----- ------- ---- ---------------------- --
1 40 x 534 5 4.314720e+009 1.00 4320.00 101 5.253344778937972e+002 16
2 40 x 553 4 3.433024e+009 1.00 3435.39 101 1.539721811668384e+003 15
3 53 x 171 2 7.322904e+008 1.02 716.27 101 1.009741436578952e+000 16
4 70 x 475 2 1.596000e+009 1.00 1593.73 101 5.999250595473891e-001 16
5 55 x 128 2 5.632000e+008 1.00 565.94 101 4.589031939600982e+001 16
6 7 x 246 2 6.612480e+008 0.95 695.38 32 8.631675645333210e+001 16
7 22 x 361 16 5.133709e+009 0.95 5407.69 101 6.345586315784055e+002 16
8 6 x 287 36 4.909766e+009 0.95 5195.20 100 1.501268005625795e+005 15
9 21 x 355 17 5.120094e+009 0.93 5494.16 101 1.189443609974981e+005 16
10 19 x 50 9 3.454200e+008 0.95 362.60 101 7.310369784325296e+004 16
11 64 x 168 1 4.300800e+008 0.95 453.49 101 3.433560407475758e+004 16
12 68 x 554 1 1.506880e+009 0.95 1592.06 100 7.127569130821465e-006 16
13 41 x 23 7 8.449280e+007 0.96 87.93 32 9.816387810944356e+010 15
14 10 x 49 11 2.177560e+008 0.95 229.34 101 3.039983465145392e+007 15
15 1 x 118 33 7.788000e+008 0.95 820.44 101 3.943816690352044e+004 15
16 27 x 458 10 1.384992e+009 0.95 1458.55 40 6.480410000000000e+005 16
17 20 x 82 9 5.963040e+008 0.96 624.40 101 1.114641772902486e+003 16
18 1 x 371 44 3.232152e+009 0.95 3403.80 100 1.015727037502299e+005 15
19 23 x 112 6 6.244224e+008 0.95 654.34 101 5.421816960147207e+002 16
20 8 x 72 26 5.990400e+008 0.96 626.43 100 3.126205178815432e+004 16
21 1 x 72 2 1.800000e+009 0.95 1891.86 50 7.824524877232093e+007 16
22 7 x 106 17 5.096056e+008 0.96 533.42 101 2.938604376566698e+002 15
23 5 x 134 11 1.459260e+009 0.95 1536.42 100 3.549900501563624e+004 15
24 31 x 661 1 8.196400e+008 0.91 901.82 101 5.000000000000000e+001 16
Overall Maximum Rate 5670.13
Average Rate 1761.87
Geometric Mean 1080.28
Harmonic Mean 608.53
Minimum Rate 81.48
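The Total and MFLOPS columns in the log above can be related with a little arithmetic. The sketch below is an interpretation of the displayed figures, not code from the benchmark: the function names are hypothetical, and the product formula is only claimed to agree with the figures shown for kernels 1 and 3, where the iteration count equals the span. Other kernels iterate over a count that differs from the span.

```python
def total_flops(passes_a, passes_b, flops_per_iter, iterations, loops=200 * 2):
    """Total floating point operations, assuming
    passes x E x iterations x (200 x 2 loops), as suggested by the log header."""
    return passes_a * passes_b * flops_per_iter * iterations * loops

def mflops(total_ops, seconds):
    """Millions of floating point operations per second."""
    return total_ops / seconds / 1e6

kernel1_ops = total_flops(40, 534, 5, 101)   # matches the 4.314720e+009 shown
kernel3_ops = total_flops(53, 171, 2, 101)   # matches the 7.322904e+008 shown
kernel1_rate = mflops(kernel1_ops, 1.00)     # seconds as displayed (rounded)
```

The rate computed from the displayed time is about 4314.72 MFLOPS, against 4320.00 in the log; the difference is what would be expected from the seconds column being rounded to two decimal places.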
Livermore Reference
F.H. McMahon, The Livermore Fortran Kernels: A Computer Test
Of The Numerical Performance Range, Lawrence Livermore National
Laboratory, Livermore, California, UCRL-53745, December 1986.
Whetstone Benchmark
The Whetstone benchmark was written by Harold Curnow of CCTA,
the British government computer procurement agency, based on work
by Brian Wichmann of the National Physical Laboratory. An Algol
version of the benchmark was released in November 1972 and
Fortran single and double precision varieties in April 1973. The
Fortran codes became the first general purpose benchmarks that
set industry standards of performance.
The benchmark produced speed ratings in terms of Thousands of
Whetstone Instructions Per Second (KWIPS). In 1978, self timing
versions (by Roy Longbottom also of CCTA) produced speed ratings
in MOPS (Millions of Operations Per Second) and MFLOPS (Floating
Point) and an overall rating in MWIPS (Millions of Whetstone Instructions Per Second).
Whetstone benchmark source code can also be downloaded in
programming languages Fortran, C, Basic, Java, Visual Basic,
Excel Spreadsheet Visual Basic. These are all of the same sort of
format with self timing for PCs. Performance from these is
included in the benchmark results file. Please excuse all code
looking like Fortran.
For results on PCs see the following
Whetstone Benchmark History and Results
and
Whetstone Benchmark Detailed Results On PCs,
which includes performance ratings of computers produced in the 1960's to today's PCs.
At the end of the latter there are tables demonstrating the efficiency of the different PC interpreters and compilers, from up to 12 benchmarks, starting with an 8088/8087 CPU, up to a multithreaded Core i7. The first table provides overall MWIPS ratings and the second shows efficiency, based on %MWIPS per MHz, the latter varying between 0.4 and 1000 for a Core i7 based PC.
With the original benchmark producing a single performance measure (and using manual timing), results were subject to manipulation. As is usual, compilers were arranged to produce the most efficient code, but some computer manufacturers arranged for unrealistic over optimisation. Loop N6 (see below) used to be the dominant procedure, by far, and was most affected. It should never produce faster speeds than N1 or N2. Hence, the timing modifications were introduced to identify irregular behaviour. N5 and N8 function tests now tend to consume the most time. Particularly with the availability of more registers with 64 bit working, N3, N4 and N7 can be subject to over optimisation, but this does not affect the overall performance rating much.
On screen Log
Single Precision C/C++ Whetstone Benchmark Optimised
Calibrate
0.01 Seconds 1 Passes (x 100)
0.03 Seconds 5 Passes (x 100)
0.08 Seconds 25 Passes (x 100)
0.41 Seconds 125 Passes (x 100)
Use 3048 passes (x 100)
Single Precision C/C++ Whetstone Benchmark Optimised
Loop content Result MFLOPS MOPS Seconds
N1 floating point -1.12475025653839100 872.850 0.067
N2 floating point -1.12274754047393800 813.164 0.504
N3 if then else 1.00000000000000000 697.500 0.452
N4 fixed point 12.00000000000000000 3934.509 0.244
N5 sin,cos etc. 0.49904659390449520 81.161 3.125
N6 floating point 0.99999988079071040 517.339 3.178
N7 assignments 3.00000000000000000 898.894 0.627
N8 exp,sqrt etc. 0.75110864639282230 52.437 2.162
MWIPS 2942.474 10.359
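The figures in the log above can be cross-checked with the sketch below. It rests on two assumptions that are not stated in the text: that each of the 3048 x 100 pass units represents 0.1 million Whetstone instructions, and that calibration scales the last measured pass count to a total of about 10 seconds. Both are consistent with the displayed numbers, allowing for rounded times. The function names are hypothetical.

```python
def mwips(passes, total_secs, x100=100, million_instr_per_unit=0.1):
    """Overall rating in Millions of Whetstone Instructions Per Second.
    million_instr_per_unit is an assumption that fits the log figures."""
    return passes * x100 * million_instr_per_unit / total_secs

def scale_passes(cal_passes, cal_secs, target_secs):
    """Scale a calibration pass count to a target running time."""
    return int(cal_passes * target_secs / cal_secs)

# Seconds column for N1 to N8, copied from the log.
section_secs = [0.067, 0.504, 0.452, 0.244, 3.125, 3.178, 0.627, 2.162]
total_secs = round(sum(section_secs), 3)      # total time shown on the MWIPS line
passes = scale_passes(125, 0.41, 10.0)        # last calibration step in the log
rating = mwips(passes, total_secs)
```

With the displayed values this gives roughly 2942.4 MWIPS, close to the 2942.474 reported; the small gap is what would be expected from the program using unrounded times.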
Whetstone Reference
H J Curnow and B A Wichmann, "A Synthetic
Benchmark", Computer Journal Vol 19, No 1 1976
Linpack Benchmark
This benchmark was produced by Jack Dongarra from the
"LINPACK" package of linear algebra routines. Following initial release in 1979, it became
the primary benchmark for scientific applications from the mid
1980's with a slant towards supercomputer performance.
The original version was produced in Fortran but a
"C" version appeared later. The standard "C"
version operates on 100x100 matrices in double precision with
rolled/unrolled and single/double precision options. The
pre-compiled versions are double precision, rolled. Other
versions are available with different sizes of matrices.
Performance rating is in terms of MFLOPS. For original results on PCs see
Linpack Results.htm
The program's arrangements for measuring and excluding overheads, and for running with varied array leading dimensions, were in the original Linpack Benchmark. My variation for PCs, linpack-pc.c, was accepted by Netlib and can be downloaded from
here.
Note that clicking on the link might produce details without appropriate line feeds. Using Windows, you might have to download and open with WordPad or suitable language editor.
On screen Log
Unrolled Double Precision Linpack Benchmark - PC Version in 'C/C++'
Optimisation Optimised
norm resid resid machep x[0]-1 x[n-1]-1
0.4 7.41628980e-014 1.00000000e-015 -1.49880108e-014 -1.89848137e-014
Times are reported for matrices of order 100
1 pass times for array with leading dimension of 201
dgefa dgesl total Mflops unit ratio
0.00047 0.00002 0.00049 1402.78 0.0014 0.0087
Calculating matgen overhead
10 times 0.00 seconds
100 times 0.02 seconds
1000 times 0.10 seconds
10000 times 0.95 seconds
20000 times 1.87 seconds
Overhead for 1 matgen 0.00009 seconds
Calculating matgen/dgefa passes for 1 seconds
10 times 0.00 seconds
100 times 0.04 seconds
1000 times 0.35 seconds
2000 times 0.71 seconds
4000 times 1.42 seconds
Passes used 2823
Times for array with leading dimension of 201
dgefa dgesl total Mflops unit ratio
0.00026 0.00001 0.00027 2545.49 0.0008 0.0048
0.00026 0.00001 0.00027 2545.55 0.0008 0.0048
0.00026 0.00001 0.00027 2547.66 0.0008 0.0048
0.00027 0.00001 0.00028 2457.21 0.0008 0.0050
0.00026 0.00001 0.00027 2545.73 0.0008 0.0048
Average 2528.33
Calculating matgen2 overhead
Overhead for 1 matgen 0.00009 seconds
Times for array with leading dimension of 200
dgefa dgesl total Mflops unit ratio
0.00026 0.00001 0.00027 2541.88 0.0008 0.0048
0.00026 0.00001 0.00027 2543.99 0.0008 0.0048
0.00026 0.00001 0.00027 2540.51 0.0008 0.0048
0.00026 0.00001 0.00027 2539.97 0.0008 0.0048
0.00026 0.00001 0.00027 2541.71 0.0008 0.0048
Average 2541.61
Unrolled Double Precision 2528.33 Mflops
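The Mflops column in the log above follows from the conventional Linpack operation count of 2n^3/3 + 2n^2 floating point operations for a matrix of order n. A minimal sketch, with hypothetical function names and using the rounded total times as displayed:

```python
def linpack_ops(n):
    """Conventional Linpack operation count for solving an n x n system."""
    return 2.0 * n ** 3 / 3.0 + 2.0 * n ** 2

def linpack_mflops(n, total_secs):
    """MFLOPS from the combined dgefa + dgesl time for one solution."""
    return linpack_ops(n) / total_secs / 1e6

first_pass = linpack_mflops(100, 0.00049)   # single pass line in the log
timed_pass = linpack_mflops(100, 0.00027)   # typical line from the timed runs
```

These evaluate to about 1401 and 2543 MFLOPS, against 1402.78 and 2545.49 in the log; the differences are within what rounding the total time to five decimal places would produce.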
Linpack Reference
Jack Dongarra, Performance of Various Computers Using Standard
Linear Algebra Software in a Fortran Environment (includes numerous results) and can be downloaded from
Here.
Dhrystone Benchmarks
The Dhrystone "C" benchmark, a sort of Whetstone
without floating point, became the key standard benchmark, from
1984, with the growth of Unix systems. The first version was
produced by Reinhold P. Weicker in Ada and translated to
"C" by Rick Richardson.
Two versions are available: Dhrystone versions 1.1 and 2.1. The
second version was produced to avoid over-optimization problems
encountered with version 1. Although it is recommended that
advanced optimization levels should be avoided with the latter,
it is clear from published results that the recommendation is
usually ignored. The default option in the Watcom compiler
produces high levels of optimization and omits some constant
calculations from the timing loop. Version 2 is compiled from
three source files.
Original versions of the benchmark gave performance ratings in
terms of Dhrystones per second. This was later changed to VAX
MIPS by dividing Dhrystones per second by 1757, the DEC VAX
11/780 result.
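The conversion described above is a single division. The sketch below applies it to the Dhrystones per second figure from the sample log in this section; the function and constant names are hypothetical.

```python
VAX_11_780_DHRYSTONES_PER_SEC = 1757   # DEC VAX 11/780 result quoted in the text

def vax_mips(dhrystones_per_sec):
    """Convert Dhrystones per second to a VAX MIPS rating."""
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC

def microseconds_per_run(dhrystones_per_sec):
    """Time for one run through Dhrystone, in microseconds."""
    return 1e6 / dhrystones_per_sec

rating = round(vax_mips(19695907), 2)
run_time = round(microseconds_per_run(19695907), 2)
```

Both values agree with the last three lines of the sample log.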
For early results see
Dhrystone Results.htm.
My versions are identical to the original, except for the calibration, timing and results checking modifications.
On screen Log
Dhrystone Benchmark, Version 2.1 (Language: C or C++)
Optimisation Optimised
Register option not selected
10000 runs 0.00 seconds
100000 runs 0.01 seconds
1000000 runs 0.05 seconds
2000000 runs 0.10 seconds
4000000 runs 0.20 seconds
8000000 runs 0.40 seconds
16000000 runs 0.80 seconds
32000000 runs 1.59 seconds
64000000 runs 3.25 seconds
Final values (* implementation-dependent):
Int_Glob: O.K. 5 Bool_Glob: O.K. 1
Ch_1_Glob: O.K. A Ch_2_Glob: O.K. B
Arr_1_Glob[8]: O.K. 7 Arr_2_Glob8/7: O.K. 64000010
Ptr_Glob-> Ptr_Comp: * 4320480
Discr: O.K. 0 Enum_Comp: O.K. 2
Int_Comp: O.K. 17 Str_Comp: O.K. DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob-> Ptr_Comp: * 4320480 same as above
Discr: O.K. 0 Enum_Comp: O.K. 1
Int_Comp: O.K. 18 Str_Comp: O.K. DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: O.K. 5 Int_2_Loc: O.K. 13
Int_3_Loc: O.K. 7 Enum_Loc: O.K. 1
Str_1_Loc: O.K. DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: O.K. DHRYSTONE PROGRAM, 2'ND STRING
Microseconds for one run through Dhrystone: 0.05
Dhrystones per Second: 19695907
VAX MIPS rating = 11209.96
Dhrystone Reference
Reinhold P. Weicker, CACM Vol 27, No 10, 10/84,pg.1013, plus results
Dhrystone 1 Dhrystones per second
and
Dhrystone 1 and 2 VAX MIPS.