Description
This supercomputer benchmark was first introduced in 1970, initially comprising 14 kernels of numerical applications, written
in Fortran. The number of kernels was increased to 24 in the 1980s. Performance is measured in Millions of
Floating Point Operations Per Second (MFLOPS). The program also checks the results for computational accuracy. One main aim was
to avoid producing single number performance comparisons, so the 24 kernels are executed three times at different Do-loop spans to
produce short, medium and long vector performance measurements. If overall averages are quoted, the benchmark reference below
indicates that the geometric mean may be interpreted as a characteristic rate of computation, but it would be more realistic
to retain the range of statistics in terms of geometric, harmonic and arithmetic means, minimum and maximum.
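As a rough illustration of how a MFLOPS figure arises, the following C sketch times one kernel at a chosen loop span and divides the operation count by the elapsed time. It is not taken from the benchmark source. The kernel body is the hydro fragment in the form commonly quoted for kernel 1, and the names kernel1, mflops_rate and measure_kernel1, the MAX_SPAN limit and the data values are all assumptions made for this example.

```c
#include <stddef.h>
#include <time.h>

#define MAX_SPAN 1001   /* illustrative upper limit on the loop span */

static double x[MAX_SPAN], y[MAX_SPAN], z[MAX_SPAN + 11];

/* Hydro fragment in the form commonly quoted for kernel 1.
   Each iteration performs 5 floating point operations. */
void kernel1(size_t n, double q, double r, double t)
{
    for (size_t k = 0; k < n; k++)
        x[k] = q + y[k] * (r * z[k + 10] + t * z[k + 11]);
}

/* MFLOPS = operations / (seconds * 1,000,000).
   Returns 0 if the timer did not advance. */
double mflops_rate(double operations, double seconds)
{
    return seconds > 0.0 ? operations / (seconds * 1.0e6) : 0.0;
}

/* Run the kernel `passes` times at the given span (span <= MAX_SPAN)
   and return the measured MFLOPS. The data values are arbitrary. */
double measure_kernel1(size_t span, long passes)
{
    for (size_t k = 0; k < MAX_SPAN; k++) {
        x[k] = 0.0;
        y[k] = 1.0;
    }
    for (size_t k = 0; k < MAX_SPAN + 11; k++)
        z[k] = 1.0;

    clock_t start = clock();
    for (long p = 0; p < passes; p++)
        kernel1(span, 0.5, 2.0, 3.0);
    double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

    return mflops_rate(5.0 * (double)span * (double)passes, seconds);
}
```

Calling measure_kernel1 with a small, a medium and a large span (for example 27, 101 and 1001, values chosen here for illustration) gives three separate rates for the same kernel rather than a single number. An optimising compiler may remove repeated passes whose results are never used, so a real measurement also needs to consume or check the results, as the benchmark does with its accuracy checks.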
The results quoted here are for pre-compiled C/C++ versions available in BenchNT.zip, which also
contains the source code with further explanatory comments. DOS versions are available in
DosTests.zip and OS/2 versions in OS2Tests.zip.
See My Main Page for other PC benchmarks and results.
The benchmark has also been compiled with Microsoft 32 bit and 64 bit compilers that generate SSE and SSE2 instructions for floating point.
The original 2006 64 bit version indicated poor performance on Core 2 Duo CPUs but this was corrected using a later compiler in 2009.
Compiled codes (2006 and 2009 versions) are in
Win64.zip with source code in NewSource.zip. See also Win64.htm.
Livermore Loops Reference - F.H. McMahon, The Livermore Fortran Kernels: A Computer Test
Of The Numerical Performance Range, Lawrence Livermore National Laboratory, Livermore, California, UCRL-53745, December 1986.
Results
The following is a sample of results. Performance tends to be proportional to CPU MHz for a
given type of processor but is also affected by cache size and speed. There can also be variations,
probably depending on where the data happens to be stored in cache.
Details of cache sizes and range of CPU MHz can be found in CPUSpeed.htm.
The benchmarks were compiled with and without optimisation options, providing two sets of results.
For each, a summary is provided, comprising geometric, harmonic and arithmetic means, minimum and maximum.
This is followed by MFLOPS for each of the 24 loops. All are sorted by maximum speed values.
Results include those from DOS and Windows compilations, which produce very similar speed measurements. Some SSE2 and OS/2 results are included at the bottom of the tables.
The latter are slow as data was misaligned on 4 byte boundaries. Note that the 64 bit results on Core 2 Duo are disappointing (see Vista64.htm).
For reference, the speeds obtained on the Cray 1 supercomputer are also shown.