Roy Longbottom's Raspberry Pi 2 and 3 Stress Tests


General
Temperature & MHz Recorder
Livermore Loops Stress Test
OpenGL Stress Test
# Loops + OpenGL Scripts
# Loops + OpenGL Results
Maximum MFLOPS
Maximum MFLOPS Paging
# Maximum MFLOPS + OpenGL
# Just Maximum MFLOPS
Fixed Point Test
# Fixed Point Tests + OpenGL
# Fixed Point Tests No OpenGL
Drive, USB and LAN Test
Drive Test, Multiple Devices
System Stress Test

# Multitasking with multiple CPU tests

General

The programs used for stress testing are based on those described in Raspberry Pi Benchmarks.htm and Raspberry Pi Stress Tests.htm. They are generally those compiled to make use of Raspberry Pi 2 ARMv7 CPU features, and are run from script files for multitasking purposes. The benchmarks, test programs and sample script files are in a single folder, available in Raspberry_Pi_2_Stress_Tests.zip. A separate folder contains the source code, with the compile commands used given in the header area.

For stress testing purposes, these programs have command parameters that determine running time and, sometimes, which particular hardware to use. Most also include performance measurements, reported at regular intervals, to identify speed reductions due to causes such as overheating or system interference. The test programs can be run via command lines in shell scripts, which allows multiple programs to be run at the same time, each in its own Terminal window. The commands sometimes include an option for different log file names, for cleaner results when more than one copy of the same program is run. Temperature and CPU MHz recording applications can be included in the mix. All test programs check numeric answers or data transfers for correct or consistent values and report any that are incorrect in the log files.

2016 - Results of stress tests are included below, using a new OpenGL GLUT Benchmark and Maximum MFLOPS, also with Livermore Loops. Later additions are results from a Raspberry Pi 3.

On conducting Raspberry Pi 3 multiprocessor stress tests, starting below, CPU speed was seen to be slowing down, or throttling, as temperature increased. Running the same tests on a tablet, with apparently the same Cortex-A53 processor, did not show this effect. Googling indicated that the Raspberry Pi's Broadcom BCM2837 version is manufactured using the 40 nm process, the tablet having a cooler Snapdragon implementation with 28 nm lithography.

Raspberry Pi 3 stress tests were carried out using various heatsinks and with the case cover off, but throttling still occurred. Based on advice from the Raspberry Pi Forum, an aluminium FLIRC Case was purchased (relatively expensive). The RPi board is screwed to the lid, which has a cuboid protrusion that acts as a heatsink when clamped to the CPU via a thermal pad. This proved to be effective in reducing temperatures, with no throttling over the usual testing times. Results are included below.



Temperature and CPU MHz Recorder - RPiHeatMHz

This is designed to run in its own window, concurrently with other test programs. It identifies boot time speed settings, then CPU MHz and temperature, as defined by a command. The following command (optionally) opens a new terminal window with parameters for the number of samples and interval between reports. As with other commands, upper or lower case can be used and only the first character is needed.

                  lxterminal -e ./RPiHeatMHz passes 60, seconds 15

Following is an example, as displayed and saved in RPiHeatMHz.txt log file, using default settings with 10 samples at 1 second intervals.

2016 - The original MHz was measured using scaling_cur_freq. It is now apparent that this does not show dynamic variations, so the latest version of RPiHeatMHz also includes results from the measure_clock arm command. Below are example reports from the revised program running on a Raspberry Pi 3, showing that the two measures can be the same when nearly idling, whereas the measured ARM MHz can fall below the scaling MHz as temperature rises under load.


 Temperature and CPU MHz Measurement

 Start at Sun Mar  1 03:10:15 2015

 Using 10 samples at 1 second intervals

 Boot Settings

 arm_freq=900
 hdmi_force_hotplug=1
 config_hdmi_boost=4
 overscan_left=24
 overscan_right=24
 overscan_top=16
 overscan_bottom=16
 disable_overscan=0
 core_freq=250
 sdram_freq=450
 over_voltage=0

 Seconds
    0.0   900 MHz  temp=47.6C
    1.0   600 MHz  temp=47.1C
    2.1   600 MHz  temp=46.5C
    3.2   600 MHz  temp=47.1C
    4.2   600 MHz  temp=46.5C
    5.3   600 MHz  temp=46.5C
    6.4   600 MHz  temp=46.5C
    7.5   600 MHz  temp=46.5C
    8.5   600 MHz  temp=46.5C
    9.6   600 MHz  temp=46.5C
   10.7   600 MHz  temp=46.5C

 End at   Sun Mar  1 03:10:25 2015


 #################### New RPiHeatMHz ####################

 Boot Settings

 dtparam=audio=on
 dtoverlay=vc4-kms-v3d

 Seconds
    0.0     1200 scaling MHz,   1200 ARM MHz, temp=58.0C
   15.0     1200 scaling MHz,   1200 ARM MHz, temp=67.1C
   30.1     1200 scaling MHz,   1200 ARM MHz, temp=70.9C
   45.1     1200 scaling MHz,   1200 ARM MHz, temp=73.6C
   60.2     1200 scaling MHz,   1200 ARM MHz, temp=75.8C
   75.3     1200 scaling MHz,   1200 ARM MHz, temp=78.4C
   90.5     1200 scaling MHz,   1200 ARM MHz, temp=79.5C
  105.6     1200 scaling MHz,   1160 ARM MHz, temp=80.6C
  120.7     1200 scaling MHz,   1075 ARM MHz, temp=81.1C
  135.8     1200 scaling MHz,   1051 ARM MHz, temp=81.7C
  150.9     1200 scaling MHz,   1023 ARM MHz, temp=81.7C
  166.0     1200 scaling MHz,   1020 ARM MHz, temp=82.2C
  181.1     1200 scaling MHz,   1006 ARM MHz, temp=82.2C

 Seconds
    0.0      600 scaling MHz,    600 ARM MHz, temp=55.8C
    1.0     1200 scaling MHz,   1200 ARM MHz, temp=56.4C
    2.0     1200 scaling MHz,   1200 ARM MHz, temp=56.4C
    3.1     1200 scaling MHz,   1200 ARM MHz, temp=56.9C
    4.1     1200 scaling MHz,   1200 ARM MHz, temp=56.9C
    5.2      600 scaling MHz,    600 ARM MHz, temp=56.4C
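
The newer report format makes throttling easy to spot, as the measured ARM MHz drops below the scaling MHz. The following is a minimal Python sketch, not part of RPiHeatMHz, that assumes log lines laid out exactly as in the example above. The function names parse_samples and first_throttled are hypothetical.

```python
import re

# Pattern for the newer RPiHeatMHz report lines, for example:
# "  105.6     1200 scaling MHz,   1160 ARM MHz, temp=80.6C"
LINE = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+) scaling MHz,\s+(\d+) ARM MHz, temp=(\d+\.\d+)C"
)

def parse_samples(text):
    """Return (seconds, scaling_mhz, arm_mhz, temp_c) tuples found in a log."""
    samples = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            samples.append((float(m.group(1)), int(m.group(2)),
                            int(m.group(3)), float(m.group(4))))
    return samples

def first_throttled(samples):
    """Return the first sample whose measured ARM MHz is below the scaling MHz."""
    for sample in samples:
        if sample[2] < sample[1]:
            return sample
    return None

# Three lines copied from the Raspberry Pi 3 report above
LOG = """\
   90.5     1200 scaling MHz,   1200 ARM MHz, temp=79.5C
  105.6     1200 scaling MHz,   1160 ARM MHz, temp=80.6C
  120.7     1200 scaling MHz,   1075 ARM MHz, temp=81.1C
"""

samples = parse_samples(LOG)
hit = first_throttled(samples)
print(hit)
```

With these three lines, the first throttled sample is the one at 105.6 seconds and 80.6C.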
  




Livermore Loops Stress Test - liverloopsPiA7R

The Livermore Loops benchmark was converted to act as a stress test after wrong numeric results were produced on an overclocked PC using a Pentium Pro CPU. The Loops comprise 24 double precision floating point kernels, with performance measured in Millions of Floating Point Operations Per Second (MFLOPS). The kernel tests are repeated three times, with different data sizes. Adding a running time parameter for each loop converts the benchmark into a stress test, whereby numeric results of calculations are checked for correctness after each of the numerous passes, with errors being logged, along with performance details. Detailed results are displayed continuously as the tests are running. There is too much detail for logging so, as shown below, only the start times of each section are reported.

Certain changes were made to liverloopsPiA7, the RPi gcc 4.8 benchmark, firstly to include the expected results of computation. Then, code for the rather convoluted system configuration details was removed and an option included to use different log files, for when multiple copies are run at the same time. Following is an example command to open a new terminal window, run each test for approximately 12 seconds and save results in LoopsLog1.txt. Total time will be around 24 x 3 x 12 = 864 seconds.

                  lxterminal -e ./liverloopsPiA7R Seconds 12 Log 1


 Livermore Loops Benchmark vfpv4 32 Bit via C/C++ Tue Mar 24 12:24:22 2015

 Reliability test  12 seconds each loop x 24 x 3

 Part 1 of 3 start at Tue Mar 24 12:24:23 2015

 Part 2 of 3 start at Tue Mar 24 12:29:10 2015

 Part 3 of 3 start at Tue Mar 24 12:33:57 2015

 Numeric results were as expected

 MFLOPS for 24 loops
  130.3  161.0  222.0  220.1   85.3  122.5  220.0  210.2  190.5  138.0   99.1   56.0
   67.7   82.6  124.2  132.9  201.7  180.1  160.0  124.2   95.0   42.6  184.2  127.2

 Overall Ratings
 Maximum Average Geomean Harmean Minimum
   222.0   137.1   125.8   113.2    42.4

                      End of test Tue Mar 24 12:38:47 2015
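
The 24 x 3 x 12 = 864 second estimate can be checked against the timestamps in this log. The sketch below is illustrative only. It assumes the date format shown above and an English locale for the day and month names, and elapsed_seconds is a hypothetical helper.

```python
from datetime import datetime

# Timestamp layout used in the log, e.g. "Tue Mar 24 12:24:23 2015"
FMT = "%a %b %d %H:%M:%S %Y"

def elapsed_seconds(start, end):
    """Whole seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())

LOOPS, PARTS, SECONDS_PER_LOOP = 24, 3, 12
expected = LOOPS * PARTS * SECONDS_PER_LOOP

# Part 1 start and end of test, copied from the log above
measured = elapsed_seconds("Tue Mar 24 12:24:23 2015",
                           "Tue Mar 24 12:38:47 2015")
# Part 1 start to Part 2 start
part1 = elapsed_seconds("Tue Mar 24 12:24:23 2015",
                        "Tue Mar 24 12:29:10 2015")

print(expected, measured, part1)
```

Here the measured total matches the estimate, and one part takes close to 24 x 12 = 288 seconds.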
  




OpenGL Stress Test - OpenGL1PiR.bin

This uses the OpenGL ES Benchmark, which has command parameters for window width and height, plus running time in minutes. With the latter specified, the most demanding textured test is run. Following is a typical command.


         lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15

The test is run fifteen times for 4 seconds per specified minute. Actual running times and speed in Frames Per Second (FPS) are displayed and saved in log file OpenGLPi.txt.

NOTE - The original OpenGL benchmark was found to be producing FPS speeds twice as high as they should be. Existing relative performance comparisons of results on Raspberry Pi systems are still valid. The benchmark has been modified. So do not compare old scores with new ones.


 Raspberry Pi OpenGL ES Benchmark 1.1, Wed Mar 25 02:41:29 2015

 Reliability Mode 15 Tests of 60 Seconds

 Test  1   60.24 seconds,   5.61 FPS
 Test  2   60.07 seconds,   6.23 FPS
 Test  3   60.03 seconds,   6.23 FPS
 Test  4   60.15 seconds,   6.18 FPS
 Test  5   60.10 seconds,   6.29 FPS
 Test  6   60.02 seconds,   6.26 FPS
 Test  7   60.02 seconds,   6.30 FPS
 Test  8   60.19 seconds,   6.31 FPS
 Test  9   60.07 seconds,   6.36 FPS
 Test 10   60.22 seconds,   6.34 FPS
 Test 11   60.01 seconds,   6.67 FPS
 Test 12   60.17 seconds,   6.58 FPS
 Test 13   60.00 seconds,   6.57 FPS
 Test 14   60.03 seconds,   6.56 FPS
 Test 15   60.19 seconds,   6.45 FPS

      Screen Pixels 1920 Wide 1080 High
      End Time Wed Mar 25 02:56:31 2015
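
As the purpose of the per-minute FPS figures is to reveal slowdowns, a log like this can be scanned for tests that fall well below the typical speed. The following Python sketch assumes the line layout shown above. The 5% tolerance and the name slow_tests are arbitrary choices for illustration, not features of the benchmark.

```python
import re
from statistics import median

# Matches lines such as " Test  1   60.24 seconds,   5.61 FPS"
ROW = re.compile(r"Test\s+(\d+)\s+([\d.]+) seconds,\s+([\d.]+) FPS")

def slow_tests(log_text, tolerance=0.05):
    """Return test numbers whose FPS is more than `tolerance` below the median."""
    rows = [(int(n), float(fps)) for n, _secs, fps in ROW.findall(log_text)]
    mid = median(fps for _n, fps in rows)
    return [n for n, fps in rows if fps < mid * (1 - tolerance)]

# FPS lines copied from the log above
OPENGL_LOG = """\
 Test  1   60.24 seconds,   5.61 FPS
 Test  2   60.07 seconds,   6.23 FPS
 Test  3   60.03 seconds,   6.23 FPS
 Test  4   60.15 seconds,   6.18 FPS
 Test  5   60.10 seconds,   6.29 FPS
 Test  6   60.02 seconds,   6.26 FPS
 Test  7   60.02 seconds,   6.30 FPS
 Test  8   60.19 seconds,   6.31 FPS
 Test  9   60.07 seconds,   6.36 FPS
 Test 10   60.22 seconds,   6.34 FPS
 Test 11   60.01 seconds,   6.67 FPS
 Test 12   60.17 seconds,   6.58 FPS
 Test 13   60.00 seconds,   6.57 FPS
 Test 14   60.03 seconds,   6.56 FPS
 Test 15   60.19 seconds,   6.45 FPS
"""

slow = slow_tests(OPENGL_LOG)
print(slow)
```

For this run only the first test is flagged, with later tests being slightly faster rather than slower.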
   




Multiple Livermore Loops Plus OpenGL Scripts

Following are the scripts used for the first set of stress tests, each with the temperature and MHz recording program being run at the same time. The tests were: one OpenGL; one Livermore Loops; then one OpenGL with three Livermore Loops. The tests were run at the normal CPU speed of 900 MHz and overclocked at 1000 MHz.

Unfortunately, the lxterminal command does not appear to have a screen position option. However, with appropriate size parameters and an extended testing time, the windows can be moved around with the mouse. The Multiple Windows example below, with MHz/heat, four Livermore Loops and one vmstat performance monitor, is appropriate for displaying six windows on a 1920 x 1080 TV, with some space at the side for a terminal for further commands.


 Test 1 - Script 1opengl.sh
 lxterminal -e ./RPiHeatMHz passes 60, seconds 15
 lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15

 Test 2 - Script 1loop.sh
 lxterminal -e ./RPiHeatMHz passes 60, seconds 15
 lxterminal -e ./liverloopsPiA7R Seconds 12

 Test 3 - Script 1ogl3loops.sh
 lxterminal -e ./RPiHeatMHz passes 60, seconds 15
 lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15
 lxterminal -e ./liverloopsPiA7R Seconds 12 Log 1
 lxterminal -e ./liverloopsPiA7R Seconds 12 Log 2
 lxterminal -e ./liverloopsPiA7R Seconds 12 Log 3

 Multiple Windows
 lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, seconds 15
 lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 1
 lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 2
 lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 3
 lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 4
 lxterminal --geometry=80x15 -e vmstat 5
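
Scripts like these follow a regular pattern, with one monitor line plus one line per test copy, each copy having its own log number. As an illustration only, the Python sketch below generates such lines. The function stress_script is hypothetical, and the command text is copied from the scripts above rather than from any program documentation.

```python
def stress_script(loop_copies, seconds=12, geometry=None):
    """Build lxterminal command lines: one RPiHeatMHz monitor plus N loop tests."""
    prefix = "lxterminal "
    if geometry:
        # Window size in characters, as used in the Multiple Windows example
        prefix += f"--geometry={geometry} "
    lines = [prefix + "-e ./RPiHeatMHz passes 60, seconds 15"]
    for n in range(1, loop_copies + 1):
        # A separate log number keeps results from each copy apart
        lines.append(prefix + f"-e ./liverloopsPiA7R Seconds {seconds} Log {n}")
    return lines

script = stress_script(4, geometry="80x15")
print("\n".join(script))
```

The output can be saved as a .sh file or pasted into a terminal as a list of commands.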
   




Multiple Livermore Loops Plus OpenGL Results

As confirmed by vmstat, the OpenGL benchmark used very little CPU time and, when run by itself, the CPU did not switch to the full speed at 900 MHz and rarely did so at 1 GHz. On the other hand, the Livermore Loops benchmark ran at full speed continuously.

OpenGL Frames Per Second was slightly faster, at normal CPU speed, when run at the same time as the other programs, probably because the CPU was then running at 900 MHz. Other than this, FPS speeds were consistent over all tests. Livermore Loops summary MFLOPS indicated no degradation when running three copies at the lower MHz, but a little at 1 GHz, both demonstrating efficient multitasking.

The OpenGL benchmark, whose speed is limited by the graphics processor, generated higher temperatures than one Livermore Loops program. This was confirmed by running the above script for four copies of the latter, at 900 MHz, where maximum temperature was 70C, compared with 78C with three CPU tests and OpenGL. The highest temperatures were during the latter, at 1 GHz, where 84.7C was reached.

Below are later results using a new OpenGL GLUT Benchmark.


  900 MHz ----------------------------  1000 MHz ---------------------------
  Test 1 -----  Test 2 -  Test 3 -----  Test 1 -----  Test 2 -  Test 3 -----

  MHz  C  FPS  MHz  C   MHz  C  FPS  MHz  C  FPS  MHz  C   MHz  C  FPS

  600 48.2      900 49.2  900 47.6     1000 47.6      600 49.2  600 48.7
  600 50.8      900 51.9  900 59.5      600 51.9     1000 54.6 1000 65.4
  600 54.1      900 53.0  900 64.3      600 53.0     1000 56.2 1000 71.3
  600 54.1 5.7  900 53.0  900 68.1 5.9  600 54.1 5.9 1000 57.3 1000 74.5 5.9
  600 55.1      900 53.0  900 68.1      600 54.1     1000 55.7 1000 74.5
  600 55.1      900 52.5  900 68.6      600 56.2     1000 57.3 1000 75.6
  600 55.7      900 53.0  900 68.1      600 55.1     1000 56.8 1000 75.6
  600 55.7 6.3  900 53.0  900 68.6 6.6  600 55.7 6.4 1000 56.2 1000 76.1 6.4
  600 56.2      900 53.0  900 70.2      600 57.3     1000 57.3 1000 77.7
  600 56.2      900 53.0  900 71.3      600 56.8     1000 57.3 1000 79.4
  600 56.8      900 53.0  900 72.4      600 57.3     1000 57.3 1000 79.9
  600 56.8 6.3  900 53.5  900 71.3 6.6 1000 59.5 6.5 1000 58.4 1000 79.4 6.5
  600 57.8      900 53.0  900 70.8      600 57.3     1000 56.2 1000 78.3
  600 58.4      900 53.0  900 70.8      600 57.8     1000 57.3 1000 79.4
  600 57.3      900 53.0  900 71.8      600 57.3     1000 56.8 1000 79.9
  600 57.8 6.3  900 53.0  900 72.4 6.5  600 57.3 6.2 1000 57.3 1000 80.4 6.2
  600 57.8      900 52.5  900 72.9      600 57.8     1000 56.8 1000 81.0
  600 58.4      900 53.0  900 72.4      600 57.3     1000 56.8 1000 79.9
  600 57.8      900 53.0  900 73.4      600 57.8     1000 56.8 1000 82.0
  600 57.8 6.4  900 53.5  900 74.5 6.6  600 57.3 6.4 1000 57.8 1000 82.6 6.4
  600 59.5      900 53.5  900 75.6      600 57.8     1000 57.3 1000 83.7
  600 57.8      900 53.5  900 75.6      600 57.3     1000 57.8 1000 84.7
  600 59.5      900 54.1  900 76.1      600 57.8     1000 57.8 1000 83.7
  600 58.4 6.4  900 54.1  900 75.1 6.7  600 57.8 6.3 1000 57.8 1000 83.7 6.3
  600 58.4      900 54.1  900 74.5      600 60.5     1000 57.8 1000 83.1
  600 58.4      900 53.0  900 75.1      600 57.8     1000 57.3 1000 83.1
  600 60.0      900 53.5  900 75.1      600 57.8     1000 57.3 1000 84.2
  600 58.9 6.3  900 53.0  900 76.1 6.7  600 58.9 6.4 1000 57.8 1000 83.1 6.4
  600 59.5      900 53.5  900 76.7      600 58.4     1000 57.8 1000 84.2
  600 58.9      900 53.0  900 75.6      600 58.4     1000 57.3 1000 81.5
  600 60.0      900 54.1  900 75.6      600 61.6     1000 58.4 1000 83.7
  600 58.4 6.3  900 53.0  900 74.5 6.7  600 59.5 6.5 1000 57.3 1000 83.1 6.5
  600 58.4      900 53.0  900 76.1      600 58.9     1000 57.3 1000 84.2
  600 58.9      900 53.0  900 75.6      600 58.4     1000 57.3 1000 83.7
  600 60.5      900 53.0  900 75.1      600 61.6     1000 57.8 1000 84.2
  600 60.0 6.3  900 53.0  900 74.5 6.7  600 60.5 6.5 1000 57.3 1000 83.1 6.5
  600 58.4      900 53.0  900 75.6      600 58.4     1000 56.8 1000 83.1
  600 59.5      900 53.0  900 76.7     1000 63.8     1000 57.3 1000 84.2
  600 59.5      900 53.5  900 76.7      600 62.1     1000 57.3 1000 82.0
  600 59.5 6.3  900 53.5  900 78.3 6.7  600 58.4 6.7 1000 57.8 1000 84.2 6.7
  600 58.4      900 53.5  900 78.3      600 59.5     1000 57.8 1000 84.2
  600 58.4      900 54.1  900 77.7     1000 63.8     1000 58.4 1000 82.6
  600 59.5      900 54.1  900 77.7      600 60.5     1000 57.8 1000 83.7
  600 59.5 6.5  900 53.5  900 77.2 6.8  600 58.9 6.5 1000 57.8 1000 83.1 6.5
  600 59.5      900 53.5  900 77.2     1000 61.6     1000 57.3 1000 83.1
  600 59.5      900 53.5  900 77.2     1000 64.8     1000 57.3 1000 84.2
  600 59.5      900 54.1  900 77.7      600 60.0     1000 57.3 1000 84.2
  600 59.5 6.5  900 53.0  900 78.3 6.9  600 60.0 6.5 1000 57.8 1000 84.2 6.5
  600 59.5      900 53.5  900 76.7      600 59.5     1000 57.3 1000 81.5
  600 59.5      900 53.5  900 76.7      600 60.0     1000 57.3 1000 83.7
  600 59.5      900 53.0  900 76.1      600 59.5     1000 56.8 1000 83.1
  600 60.0 6.5  900 53.0  900 77.2 6.9  600 60.0 6.3 1000 56.8 1000 83.1 6.3
  600 59.5      900 53.0  900 76.1      600 59.5     1000 56.2 1000 82.0
  600 59.5      900 53.0  900 76.7      600 59.5     1000 57.3 1000 81.5
  600 58.4      900 53.0  900 76.7      600 59.5     1000 56.2 1000 82.6
  600 59.5 6.6  900 53.0  900 75.6 6.9  600 59.5 6.3 1000 56.2 1000 81.0 6.3
  600 59.5      900 53.0  900 74.5      600 61.1     1000 56.2 1000 81.5
  600 59.5      900 53.5  900 72.4      600 60.0     1000 56.8 1000 81.0
  600 59.5      600 50.8  600 69.1      600 59.5      600 53.5 1000 81.5
  600 60.0 6.6  600 49.8  600 67.0 6.8 1000 61.6 6.4  600 50.8  600 73.4 6.4
  600 59.5      600 49.8  600 65.9      600 60.0      600 50.3  600 71.8

 Max  60.5          54.1      78.3          64.8          58.4      84.7

                                Livermore Loops MFLOPS
  
                Test 2 -  Test 3 -----                Test 2 -  Test 3 -----

      Maximum     222     224  224 220                   247    249  249 247
      Average     137     137  137 135                   151    149  148 147
      Geomean     126     125  126 124                   139    136  136 134
      Harmean     113     113  113 112                   125    122  122 121
      Minimum      42      43   43  42                    47     47   47  44
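
A quick check on these summary figures is whether MFLOPS scaled with the clock when overclocking from 900 to 1000 MHz. The Python sketch below uses the Test 2 (single copy) columns from the table above. The 0.02 tolerance is an arbitrary allowance for the rounding of the published values.

```python
CLOCK_RATIO = 1000 / 900  # expected speed-up from overclocking, about 1.11

# Test 2 results: name -> (MFLOPS at 900 MHz, MFLOPS at 1000 MHz)
test2 = {
    "Maximum": (222, 247),
    "Average": (137, 151),
    "Geomean": (126, 139),
    "Harmean": (113, 125),
    "Minimum": (42, 47),
}

# Measured speed-up for each summary rating
gains = {name: fast / slow for name, (slow, fast) in test2.items()}

for name, gain in gains.items():
    print(f"{name:8s} x{gain:.3f} (clock x{CLOCK_RATIO:.3f})")

scales_with_clock = all(abs(g - CLOCK_RATIO) < 0.02 for g in gains.values())
```

All five ratings improve by close to the clock ratio, which suggests that the single program is limited by CPU speed.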

  

2016 - With New OpenGL GLUT Benchmark - videogl32 (Failed)

This benchmark is my main Linux OpenGL program. For further details see: Raspberry Pi Benchmarks.htm. The later Operating System would not execute the script with lxterminal commands. However, the details could be copied and pasted as a single list of commands. The extra export command turned off Wait For Vertical Blank (VSYNC) to demonstrate maximum speeds.

Only measurements with the CPU at 1000 MHz are provided, showing much faster OpenGL Frames Per Second than the original benchmark and, maybe, a slightly higher maximum temperature. Running at the higher temperatures, the OpenGL display disappeared occasionally, displaying the multi-coloured square seen on booting. The display had to be restored by moving the mouse.

The test was rerun on a hotter day, with room temperature around 25C. The second set of results is included below, where CPU MHz was the same. The test produced many more display failures, with the CPU temperature being higher. Continuously moving the mouse, to avoid failures, probably led to the lower FPS and reduced temperatures (CPU part of chip, not graphics?). Later in the tests, the coloured square was broken up.

In the Raspberry Pi Forum, it was suggested that the failures were due to an inadequate power supply, the coloured square indicating an under-voltage warning. The one used was an official 5 volt, 2 amp unit, and an available one rated at 2.5 amps produced the same failures. In order to investigate, a DROK Digital Ampere Voltage Multimeter was obtained. This has USB 2 ports in and out and could be used immediately with the 2.5 amp power supply and a further available one, rated at 1.5 amps. When a suitable micro USB to USB 2 converter is obtained, the 2 amp unit can be tested.

Following the temperature and performance details are power consumption measurements from 15 minute tests, with the CPU overclocked at 1000 MHz, using both 2.5A and 1.5A power supplies, plus with the latter at 900 MHz. The voltage and current recordings provide no indication of potential power supply issues.

My conclusion is that the cause of the problem appears to be temperature related.

Note - For maximum videogl32 speed, a new driver had to be installed. Attempting to run the original OpenGL1PiR.bin program, via this, produced the same display failure with the coloured square.

Raspberry Pi 3 - The measurements on the RPi 2 were via the 2016-03-18-raspbian-jessie Operating System, released a month after the RPi 3 launch, and needed for OpenGL GLUT support. Running the tests on an RPi 3, where overclocking is not available, produced that rainbow coloured square after a few minutes of testing. Results of a short run are provided below, showing the high temperatures that led to the failures. Although the normal monitor displays disappeared, it should be noted that all programs continued running.

The newer 2016-05-27-raspbian-jessie was burnt onto a different micro SD card and the tests repeated. As shown below, these tests ran successfully at temperatures similar to those on the RPi 2. The Livermore Loops Benchmark produces results at the end. In this case, RPi 3 soak tests led to a considerable reduction in measured MFLOPS, which were no better than on the RPi 2, where performance remained fairly constant.

Using the older Operating System, display failures were also found to occur running four processor tests, without the OpenGL benchmark. See 2016 Maximum MFLOPS Results that also shows how performance is degraded with time, due to CPU MHz being throttled.

The tests were repeated with the RPi 3 board installed in the FLIRC Case, where the whole aluminium enclosure becomes the heatsink. Then, as seen below, the maximum temperature reached was far lower than before and Loop MFLOPS measurements indicated no degradation due to throttling. CPU MHz measurements were not recorded properly for the earlier tests (see Temperature & MHz Recorder) but, using the later app, a constant 1200 MHz was recorded for the new measurements.


 Test 3
 export vblank_mode=0
 lxterminal -e ./RPiHeatMHz passes 31, seconds 30
 lxterminal -e ./videogl32 Wide 1920, High 1080, Minutes 15
 lxterminal -e ./liverloopsPiA7R Seconds 12 Log 1
 lxterminal -e ./liverloopsPiA7R Seconds 12 Log 2
 lxterminal -e ./liverloopsPiA7R Seconds 12 Log 3

  Running OpenGL Benchmark and Three Floating Point Programs

                   Hotter  Old OS  New OS   FLIRC
                    Room                     Case
            Rpi 2   Rpi 2   Rpi 3   Rpi 3   Rpi 3
            Run 1   Run 2
       MHz   1000    1000    1200    1200    1200
       FPS     17      17      17      17      18

   Minutes     C      C      C      C      C

         0   51.9    53.5    49.4    52.1    52.1
       0.5   71.3    72.4    66.6    69.3    61.2
       1.0   74.5    75.6    73.1    75.8    62.8
       1.5   75.6    76.7    76.8    78.4    63.9
       2.0   77.2    78.8    80.1    82.2    64.5
       2.5   79.4    81.0    81.7    82.7    65.5
       3.0   77.7    79.4    81.7    83.8    64.5
       3.5   79.9    81.0    82.2    82.7    66.1
       4.0   79.9    81.0    82.7    82.7    66.1
       4.5   81.0    82.0    80.1    83.8    66.1
       5.0   83.1    85.3    80.6    83.3    66.6
       5.5   84.2    84.2    Stop    83.8    67.7
       6.0   83.1    84.2            84.4    69.3
       6.5   83.7    84.2            85.4    68.8
       7.0   85.3    84.2            84.4    68.8
       7.5   83.1    84.2            84.4    69.8
       8.0   83.1    83.1            84.9    68.8
       8.5   83.7    83.1            84.9    68.8
       9.0   83.1    83.7            84.9    69.3
       9.5   82.6    84.7            84.4    70.4
      10.0   83.7    83.1            83.8    71.4
      10.5   82.6    85.3            84.9    71.4
      11.0   85.3    84.2            82.7    70.9
      11.5   85.3    83.7            84.9    70.9
      12.0   84.2    83.1            83.3    70.9
      12.5   84.2    83.1            83.3    70.4
      13.0   84.2    84.7            83.8    70.9
      13.5   84.2    83.1            84.9    70.9
      14.0   83.1    84.2            84.9    68.2
      14.5   78.8    84.7            84.9    65.0
      15.0   75.6    82.0            82.7    62.3
      15.5   63.8    69.1

   Maximum   85.3    85.3    82.7    85.4    71.4

           MFLOPS  MFLOPS  MFLOPS  MFLOPS  MFLOPS
  Per Core
  Maximum   242.9   240.0    N/A    388.5   387.8
  Average   147.1   141.8           140.7   206.7
  Geomean   134.0   129.1           121.3   182.3
  Harmean   118.9   114.2           102.8   156.9
  Minimum    38.9    33.2            33.4    54.7

  Stand Alone
  Maximum   244.9   244.9           398.4
  Average   150.7   150.7           210.7
  Geomean   138.2   138.2           186.0
  Harmean    46.7    46.7            56.6
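
The degradation can be expressed as the fraction of stand-alone speed retained per core during the soak test. The Python sketch below uses the Average MFLOPS rows from the table above. As no stand-alone figure is listed for the FLIRC Case column, it assumes the same RPi 3 stand-alone value applies there. The name retained is hypothetical.

```python
def retained(per_core, stand_alone):
    """Fraction of stand-alone MFLOPS kept by each copy during the soak test."""
    return per_core / stand_alone

# Average MFLOPS: label -> (per core during test, stand alone)
averages = {
    "RPi 2 run 1": (147.1, 150.7),
    "RPi 2 run 2": (141.8, 150.7),
    "RPi 3 new OS": (140.7, 210.7),
    # Assumption: stand-alone speed in the FLIRC Case equals the RPi 3 figure above
    "RPi 3 FLIRC Case": (206.7, 210.7),
}

fractions = {label: retained(*pair) for label, pair in averages.items()}

for label, frac in fractions.items():
    print(f"{label:18s} {frac:.1%}")
```

On these figures the uncooled RPi 3 keeps about two thirds of its stand-alone average, against 94% or more for the RPi 2 runs and the FLIRC Case run.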


  ########################### Volts and Amps Measurements #########################

                      2.5A PS 1000 MHz     1.5A PS 900 MHz       1.5A PS 1000 MHz
                      Volts      Amps      Volts      Amps       Volts     Amps

 Power on              5.14          0.30   5.04          0.29   5.04          0.30
 VideoGL32 only        5.14  0.47 to 0.56   5.05  0.44 to 0.48   5.06  0.47 to 0.53
 3 x Livermore Loops   5.13  0.50 to 0.58
 VideoGL32 and Loops   5.12  0.71 to 0.75   5.05  0.59 to 0.65   5.06  0.66 to 0.75
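
For reference, the readings can be converted to power drawn and compared with the supply rating. This is a simple sketch using watts = volts x amps on the peak figures from the table above. The names watts and load_fraction are illustrative.

```python
def watts(volts, amps):
    """Electrical power from a voltage and current reading."""
    return volts * amps

def load_fraction(amps, rated_amps):
    """Share of the power supply's rated current actually drawn."""
    return amps / rated_amps

# Peak readings with VideoGL32 and Loops, CPU at 1000 MHz
peak_25 = watts(5.12, 0.75)   # 2.5A power supply
peak_15 = watts(5.06, 0.75)   # 1.5A power supply

print(f"2.5A supply: {peak_25:.2f} W, {load_fraction(0.75, 2.5):.0%} of rating")
print(f"1.5A supply: {peak_15:.2f} W, {load_fraction(0.75, 1.5):.0%} of rating")
```

Even the 1.5 amp unit is loaded to only half its rated current at the highest reading.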
   




Maximum MFLOPS - burninfpuPiA7, burninfpuPi2

This uses the same program test code as MP-MFLOPS, but just for a single CPU. The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f with 2, 8 or 32 operations per input data word. The same variables are used for each word and final results are checked for consistency, any errors being reported. The benchmark has input parameters for KWords, Section 1, 2 or 3 (for 2, 8 or 32 operations per word) and log number (0 to 99).
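
The expression shown has 8 floating point operations per word, comprising three additions inside the brackets, three multiplications, one subtraction and one addition. The Python sketch below illustrates that form and the consistency check. It is not the benchmark code, which is compiled C using 4 byte words. The constant values and the names triad8 and all_same are assumptions chosen for illustration.

```python
# Hypothetical constants, chosen so that repeated passes stay bounded
A, B, C, D, E, F = 0.2, 0.9, 0.1, 0.5, 0.3, 0.4

OPS_PER_WORD = 8  # 3 adds + 3 multiplies + 1 subtract + 1 add

def triad8(x):
    """One pass of x[i] = (x[i]+a)*b - (x[i]+c)*d + (x[i]+e)*f over all words."""
    for i in range(len(x)):
        x[i] = (x[i] + A) * B - (x[i] + C) * D + (x[i] + E) * F
    return x

def all_same(x):
    """Consistency check: every word started equal, so every result must match."""
    return all(v == x[0] for v in x)

words = [0.5] * 1000
for _ in range(10):
    triad8(words)

print(words[0], all_same(words))
```

As every word starts with the same value and receives identical arithmetic, any word that differs from the first indicates a calculation error.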

Below is an example log file, followed by sample results using L1 cache, L2 cache and RAM. Normally, 32 operations per word would produce the fastest speed, but it seems that this NEON based compilation runs out of registers, leading to best performance being at 8 operations per word.

Raspberry Pi 3 - Results are included below. Compared with RPi 2 and relative to clock MHz, RPi 3 was slightly faster using L1 cache data, twice as fast from L2 and 2.5 times via RAM. Then see 2016 Maximum MFLOPS, where RPi 3 performance was severely degraded using multiple copies of the program.

A new version, burninfpuPiA7, was produced, following experiments, introducing more calculations in the 2 and 8 operations per word test functions. Results are below, where single core maximum Raspberry Pi 3 speeds are around 3 GFLOPS. This new version is included in Raspberry_Pi_2_Stress_Tests.zip.


    Command:- ./burninfpuPiA7  KWords 4 Section 1 Log 1
 
   Burn-In-FPU Linux/ARM A7 v1.0 Sun Mar 29 12:22:02 2015

 Using 16 KBytes, 2 Operations Per Word, For Approximately 1 Minutes

   Pass    4 Byte  Ops/   Repeat    Seconds   MFLOPS          First   All
            Words  Word   Passes                            Results  Same

      1      4000     2   888000      15.01      473    0.400158763   Yes
      2      4000     2   888000      14.97      475    0.400158763   Yes
      3      4000     2   888000      14.97      475    0.400158763   Yes
      4      4000     2   888000      14.97      475    0.400158763   Yes

                   End at Sun Mar 29 12:23:02 2015

############################################################################

                         Raspberry Pi 2, 900 MHz
 16 KB L1 Cache
      1      4000     2   888000      15.01      473    0.400158763   Yes
      1      4000     8   376000      15.14      795    0.540749788   Yes
      1      4000    32    84000      15.58      690    0.353297979   Yes
 96 KB L2 Cache
      1     24000     2   141192      15.05      450    0.400418609   Yes
      1     24000     8    61272      15.14      777    0.580618143   Yes
      1     24000    32    13986      15.69      685    0.580735326   Yes
 200 MB RAM
      1  50000000     2       51      15.25      335    0.998471677   Yes
      1  50000000     8       23      15.06      611    0.999586105   Yes
      1  50000000    32        7      17.46      641    0.999661446   Yes


 ############################################################################

                         Raspberry Pi 3, 1200 MHz

 16 KB L1 Cache
      1      4000     2  1380000      15.04      734    0.400158763   Yes
      1      4000     8   780000      15.02     1661    0.540749788   Yes
      1      4000    32   204000      15.28     1709    0.353159577   Yes
 96 KB L2 Cache
      1     24000     2   225774      15.04      721    0.400158763   Yes
      1     24000     8   128538      15.01     1644    0.541149199   Yes
      1     24000    32    33300      15.02     1703    0.406550229   Yes
 200 MB RAM
      1  50000000     2       67      15.00      447    0.997993290   Yes
      1  50000000     8       53      15.13     1402    0.999047875   Yes
      1  50000000    32       16      15.13     1692    0.999228001   Yes
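
The MFLOPS column appears to be consistent with words x operations per word x repeat passes, divided by seconds. The following Python sketch checks that relationship against three Raspberry Pi 2 L1 cache rows from the table above. The formula is inferred from the log columns rather than taken from the program source.

```python
def mflops(words, ops_per_word, passes, seconds):
    """Millions of floating point operations per second for one timed pass."""
    return words * ops_per_word * passes / seconds / 1e6

# (words, ops per word, repeat passes, seconds, reported MFLOPS)
rows = [
    (4000, 2, 888000, 15.01, 473),
    (4000, 8, 376000, 15.14, 795),
    (4000, 32, 84000, 15.58, 690),
]

# Difference between the recomputed and reported values for each row
diffs = [mflops(w, o, p, s) - reported for w, o, p, s, reported in rows]

for (w, o, p, s, reported) in rows:
    print(f"{o:2d} ops/word: computed {mflops(w, o, p, s):7.1f}, reported {reported}")
```

The computed values agree with the log to within rounding of the reported seconds and MFLOPS.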
  




Maximum MFLOPS Paging

Selecting an input parameter of KWords 200000, for 800 MB, generated an error message from the program, as it could not allocate that amount of memory. Reducing this allowed the program to run, but painfully slowly, as the data was being swapped out to, and in from, the SD card. In my case, selecting 180000 KW, for 720 MB, provided a demonstration of swapping, whilst checking that no data is corrupted. Results from the 720 MB test are below. The program calibrates the number of passes for all phases during the first one. As in this case, other phases can run for a different length of time.

The program starts with 683 MB free, soon falling to 12 MB, plus taking over 25 MB of cache space (683 - 12 + 25 = 696 MB in 1024 based units, and 696 x 1.049 = 730 million bytes). Noting that 25% CPU utilisation (us + sy) implies 100% of one core, after an initial burst of swapping activity (allocating memory space and generating data, outside timed tests), the benchmark executes at full speed. Next there is 30 seconds of swapping out and in, followed by phase 2, 3 and 4 calculations, at some speed, with some swapping in.
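
The memory arithmetic can be set out as a short Python sketch. It assumes that the KWords parameter counts thousands of 4 byte words, which matches the 16 KBytes reported for KWords 4, and that the vmstat figures are in 1024 byte units. The function names are hypothetical.

```python
BYTES_PER_WORD = 4

def kwords_to_mb(kwords):
    """Decimal megabytes requested by a KWords parameter."""
    return kwords * 1000 * BYTES_PER_WORD / 1e6

def binary_mb_to_decimal_mb(mb_1024):
    """Convert MB in 1024 based units to millions of bytes."""
    return mb_1024 * 1024 * 1024 / 1e6

requested_fail = kwords_to_mb(200000)   # could not be allocated
requested_ok = kwords_to_mb(180000)     # ran, with swapping

# Free memory fell from 683 MB to 12 MB, and about 25 MB of cache was taken over
taken_1024 = 683 - 12 + 25
taken_decimal = binary_mb_to_decimal_mb(taken_1024)

print(requested_fail, requested_ok, taken_1024, round(taken_decimal))
```

The memory taken works out at about 730 million bytes, a little more than the 720 MB data array.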


    Command:- ./burninfpuPiA7  KW 180000 Section 1 Log 1

    Burn-In-FPU Linux/ARM A7 v1.0 Sun Mar 29 14:33:48 2015

 Using 720000 KBytes, 2 Operations Per Word, For Approximately 1 Minutes

   Pass     4 Byte  Ops/   Repeat    Seconds   MFLOPS          First   All
             Words  Word   Passes                            Results  Same

      1  180000000     2       14      15.99      315    0.999579191   Yes
      2  180000000     2       14      46.24      109    0.999579191   Yes
      3  180000000     2       14      15.40      327    0.999579191   Yes
      4  180000000     2       14      15.20      332    0.999579191   Yes

                   End at Sun Mar 29 14:35:46 2015

############################################################################

 Command for 5 second samples:- vmstat 5

number ------------KB-----------   --K/sec--   -KB/sec- -num/sec- -----%-----

procs  ----------memory----------  --swaps--   ---io--- --system- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa

 1  0  27864 683936    484  28368    0    0     0     3 1211  135  1  0 99  0
 0  0  27864 683804    492  28368    0    0     0     5 1210  150  0  0 99  0
 0  0  27864 683812    500  28368    0    0     0    14 1218  159  0  0 99  0
 1  0  27840  73568    508  28636   12    0    63     4 1237  217  4  5 91  0
 2  3  47176  11956    228   2988   46 3957  2788  3976 1474  301  2 20  8 70
 1  0  44164  11776    200   7416  393    0  5644     6 2025  834 25  5 53 17
 1  5  39408  11624    232   6240 1288   31  9816    46 3004 1597 28  8 38 26
 0  5  37104  11636    220   6712  275    0 15533    14 3330 1604 27  9  3 60
 0  8  60764  17852    200   5316  618 5285 10427  5303 4006 1382  7  6 23 64
 0  4  39352  11036    188   3472 2915   87 18823   110 3776 2748  0  5 38 56
 0  8  41456  10664    188   5344 1080 1684 12798  1684 2991 2018  0  3 59 37
 0  5  44024  11860    188   7016  918 1380  9091  1380 2461 1383  2  4 52 42
 0  7  62476  18144    188   4448 2076 5368  5546  5370 2190  936  0  2 31 67
 1  2  33552  11296    188   3228 3615    0  9820     5 2609 1362  9  5 42 44
 1  3  33564  12136    192   2788   10   11 17031    30 2826 1242 25  7 19 49
 1  3  33548  12064    196   2796   14    2 17375     6 2945 1310 25  7 17 50
 0  4  33544  11748    188   3324    2    0 16335     2 2663 1161 23  6 24 46
 1  0  32996  12464    172   3076   29    1 11823    10 2174  895 17 13 47 22
 2  0  35800  11876    172   6272   30  574  8144   586 2099  791 25  5 44 26
 1  0  36596  12008    176   7236   11  166  1976   179 1425  245 25  1 67  6
 1  0  36228 426196    184   7076  225  121  1078   126 1444  286 22  4 72  2
 2  0  36032  10900    188   7988  126    0  3898    17 1701  492 22  6 67  5
 2  0  36180  12388    188   6044    5   34  2226    44 1486  297 25  1 70  4
 2  0  36516  12024    196   6632    0   67  2090    70 1418  241 25  1 70  4
 0  0  35800 707804    320  14848  227   23  2976    26 1714  501 15  2 78  5
 0  0  34952 704208    328  17128  222    0   680     6 1355  384  1  0 97  1
 0  0  34332 702976    336  17692  103    0   225    10 1302  462  1  2 97  0

  r = waiting to run,    b = sleeping, i = in, o = out, in = interrupts,
 cs = context switches, us = user, sy = system, id = idle, wa = wait for I/O
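
The reading of the vmstat log above (25% us + sy meaning one fully loaded core out of four) can be sketched as below. This is a minimal illustration, not part of the original test suite; the function names and the four-core default are assumptions made for the example, and the sample line is copied from the log.

```python
# Column names as printed in the vmstat header line of the log above.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()

def parse_vmstat_line(line):
    """Turn one vmstat data line into a dict of integers keyed by column name."""
    values = [int(field) for field in line.split()]
    return dict(zip(VMSTAT_COLUMNS, values))

def busy_cores(sample, cores=4):
    """Estimate how many cores are busy from the us + sy percentages.

    Assumes vmstat percentages are averaged over all cores, so 25% on a
    four-core Raspberry Pi 2 corresponds to one core running flat out.
    """
    return (sample["us"] + sample["sy"]) * cores / 100.0

# One sample taken while the benchmark was running without heavy swapping.
sample = parse_vmstat_line(
    " 1  0  36596  12008    176   7236   11  166  1976   179 1425  245 25  1 67  6"
)
print("cores busy:", busy_cores(sample))
print("swap in/out KB/sec:", sample["si"], sample["so"])
```

Applying the same function to the samples with high wa values shows well under one core in use, which is the signature of the swapping phases.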
  




Multiple Maximum MFLOPS Tests and OpenGL

OpenGL1PiR was run concurrently with four copies of burninfpuPiA7 and the temperature/MHz monitoring program, with data sizes for L1 cache, L2 cache and RAM, at both 900 MHz and 1000 MHz. The scripts and results are shown below.

CPU MHz - This was at maximum frequency during all tests.

CPU performance - With the OpenGL program not using much CPU time, throughput of the four core CPU test was 3.7 to 3.8 times that of one core, via L1 cache, and somewhat less using the shared L2 cache. With RAM based data, the CPU tests ran for much longer than the 15 minutes specified (a calibration complication) but, at least, throughput was greater than using one core.

Temperature - With the CPU temperatures being mainly dependent on the OpenGL test, there were no surprises compared with the Livermore Loops tests. Even so, the programs can provide a thorough test of a selected memory area.

OpenGL FPS - As with the CPU tests, performance, using shared RAM, was much slower than cache based speeds.

Below are later results using a new OpenGL GLUT Benchmark.


 4fpul1.sh # Kwds = 4, 4fpul2.sh # Kwds = 10, 4fpuram.sh # Kwds = 40000
             4 x 16 KB             4 x 40 KB                 4 x 160 MB

 lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, seconds 15
 lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 11
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 12
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 13
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 14

 900 MHz
            L1                    L2                  RAM
  Minute     C MFLOPS    FPS     C MFLOPS    FPS     C MFLOPS    FPS
       0   47.6                 45.5                 47.6
       1   64.3   3037    5.6   63.2   2793    4.9   60.5    668    3.4
       2   67.0   3028    6.2   66.4   2760    5.4   61.6           3.3
       3   69.7   3018    6.2   69.1   2771    5.4   63.8    665    4.1
       4   70.8   3016    6.2   70.8   2771    5.3   64.3           3.3
       5   71.8   3046    6.3   71.3   2773    5.2   65.4    650    3.9
       6   72.9   3003    6.2   72.9   2724    5.4   65.9           3.9
       7   74.0   3074    6.2   72.9   2756    5.3   66.4    666    3.2
       8   74.5   2961    6.3   73.4   2750    5.4   66.4           4.1
       9   74.0   3018    6.3   74.5   2750    5.5   67.0    693    3.7
      10   74.0   3026    6.3   74.5   2727    5.4   68.1           3.4
      11   75.1   3015    6.6   74.5   2810    5.8   68.1    741    4.3
      12   75.6   3085    6.6   75.1   2798    5.8   68.1           3.7
      13   76.1   3056    6.6   75.6   2868    5.8   68.1    704    3.8
      14   75.6   3069    6.5   75.6   2825    5.7   66.4           4.3
      15   68.1   3039    6.5   68.6   2937    5.8   63.8    963    3.9

 Average          3033    6.3          2787    5.5           719    3.8
 Maximum   76.1                 75.6                 68.1
 1 Core            795                  777                  611

 1000 MHz
       0   50.3                 47.6                 49.2
       1   72.4   3382    6.7   70.8   3244    6.3   65.9    891    4.4
       2   75.6   3381    7.2   75.6   3242    6.7   69.7           4.0
       3   78.8   3393    7.5   78.8   3241    6.7   70.8    880    4.4
       4   79.9   3395    7.4   81.0   3236    6.9   72.4           4.7
       5   82.0   3380    7.2   81.0   3228    7.1   74.0    887    4.5
       6   83.1   3367    7.2   82.6   3244    6.8   74.5           4.0
       7   84.2   3375    7.5   83.7   3246    6.6   75.1    896    4.5
       8   83.1   3337    7.4   84.2   3179    6.7   75.6           4.7
       9   84.2   3266    7.1   83.7   3111    6.6   76.1    872    4.6
      10   83.1   3235    7.0   83.1   3077    6.7   76.7           4.1
      11   81.5   3092    7.5   84.2   3025    7.1   77.7    932    4.6
      12   83.1   3194    7.7   84.7   2998    7.1   77.2           4.9
      13   83.7   3207    7.3   82.0   2997    6.9   77.7    919    4.9
      14   83.1   3096    7.2   84.2   3170    6.8   77.7           4.8
      15   78.3   3315    7.5   78.3   3352    7.0   76.7    934    4.3

 Average          3294    7.3          3173    6.8           901    4.5
 Maximum   84.2                 84.7                 77.7
 1 Core            887                  866                  702
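
The scaling quoted in the CPU performance note can be recomputed from the Average and 1 Core rows of the two tables above. The sketch below is only an illustration; the dictionary layout and function name are assumptions, and the figures are copied from the tables.

```python
# Average four-copy MFLOPS and single-copy MFLOPS, copied from the tables above.
RESULTS = {
    "900 MHz L1": (3033, 795),
    "900 MHz L2": (2787, 777),
    "900 MHz RAM": (719, 611),
    "1000 MHz L1": (3294, 887),
    "1000 MHz L2": (3173, 866),
    "1000 MHz RAM": (901, 702),
}

def speedup(four_copies, one_copy):
    """Ratio of combined four-copy throughput to a single stand-alone copy."""
    return four_copies / one_copy

for label, (four, one) in RESULTS.items():
    print(f"{label:13s} {speedup(four, one):.2f} x one core")
```

The L1 cases give the 3.7 to 3.8 times figure, the L2 cases a little less, and the RAM cases only slightly more than one core.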
  

2016 - With New OpenGL GLUT Benchmark - videogl32

This benchmark is my main Linux OpenGL program. For further details see: Raspberry Pi Benchmarks.htm. The later Operating System would not execute the script with lxterminal commands. However, the details could be copied and pasted as a single list of commands. The extra export command turned off Wait For Vertical Blank (VSYNC) to demonstrate maximum speeds.

Using this benchmark appears to produce lower maximum temperatures than the original OpenGL tests. It also seems to use more CPU time, reducing overall throughput of the floating point tests.

The tests were rerun, using three copies of burninfpuPiA7, to make more CPU time available for videogl32. See first column All 1000 MHz below, where all programs obtained a fair share of CPU time, at almost full speed. However, this did not lead to the high temperatures obtained using the Livermore Loops, that produced a display failure.

For the second results columns, 64 KB was used for each of the burninfpuPiA7 programs, jointly using 192 KB of the 512 KB shared L2 cache. The slow speeds suggest that L2 cache space used by videogl32 leads to too high a demand. The third results columns are for 3 times 48 KB for the MFLOPS tests, using 32 operations per word, with all speeds running at a high efficiency, and the high temperature that caused the temporary display failure. Similar burninfpuPiA7 speeds were obtained at 48 KB with 8 operations per word.

Raspberry Pi 3 - These tests were run for 5 minutes, via the newer Operating System, again using three copies of the floating point stress test, but with no display failures. As shown in the results below, maximum temperatures could be produced after two minutes. Instantaneous CPU MHz is sometimes reduced by a half, with MFLOPS speeds, averaged over 15 seconds or more, down by more than 40% and not much faster than the overclocked Raspberry Pi 2.

The 16 KB tests were repeated with the RPi 3 system in the FLIRC Case, with a running time of 15 minutes. There was no degradation in performance, up to a maximum of 69.3'C, with each core constantly executing around 1.7 GFLOPS and OpenGL between 17 and 18 FPS.

Next are 2016 Maximum MFLOPS results, using four processor tests, without the OpenGL benchmark. In this case, the test failed using the old OS (2016-03-18-raspbian-jessie).

 ######################################################################

 export vblank_mode=0
 lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 63, seconds 15
 lxterminal -e ./videogl32 Wide 1920, High 1080, Minutes 15
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 11
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 12
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 13
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 14

              900 MHz                  1000 MHz                  
   
  Seconds      C  MFLOPS    FPS       C  MFLOPS    FPS       C
                                                           Increase
     Min     49.8                    54.6                     4.8
     Max     72.4                    79.4                     7.0

      150    65.4    2619     11     74.0    2977     14      8.6
      165    65.9    2671     12     75.1    2916     13      9.2
      180    66.4    2734     11     75.1    2977     13      8.7
      195    66.4    2631     11     75.1    2945     14      8.7
      210    67.0    2661     11     75.6    3030     14      8.6
      225    67.0    2611     11     76.1    2884     14      9.1
      240    67.5    2715     11     76.1    2920     14      8.6
      255    68.1    2598     11     76.7    3022     14      8.6
      270    68.1    2559     12     76.7    2950     13      8.6
      285    68.6    2835     12     77.2    2981     14      8.6

 Average             2663   11.3             2960   13.7
 1 Program            795   15.9              887   17.2
 4 Programs          3180                    3548
 Efficiency %        83.8   71.1             83.4   79.7

 ###############################################################################

                    All 1000 MHz - 3 burninfpuPiA7 programs

            16 KB 8 Ops Per Word    64 KB 8 Ops Per Word   48 KB 32 Ops Per Word

  Seconds      C  MFLOPS    FPS       C  MFLOPS    FPS       C  MFLOPS    FPS

 Min         50.8                    55.7                    56.2
 Max         81.0                    83.1                    85.3

      150    71.8    2620     17     76.7    1920     13     78.8    2202     16
      165    72.4    2603     17     77.2    1892     13     79.9    2215     16
      180    72.4    2630     17     76.7    1922     13     79.9    2192     16
      195    73.4    2616     17     77.7    1967     14     81.0    2233     16
      210    73.4    2633     17     77.7    1930     14     81.0    2238     16
      225    74.0    2631     17     78.3    1937     13     81.0    2222     16
      240    74.5    2608     17     78.8    1898     13     81.5    2218     16
      255    74.5    2601     17     78.8    1925     13     81.5    2228     16
      270    74.5    2608     17     78.8    1955     13     82.0    2185     16
      285    75.1    2628     17     78.8    1941     13     82.0    2237     16

 Average             2618   17.0             1929   13.2             2217   16.0
 1 Program            882   17.3              871   17.3              771   17.3
 3 Programs          2646                    2613                    2313
 Efficiency %        98.9   98.3             73.8   76.3             95.8   92.5
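
The Efficiency % rows in the table above appear to be the measured average divided by the ideal rate, where the ideal MFLOPS is three times the stand-alone figure and the ideal FPS is the stand-alone figure. The sketch below recomputes them on that assumption; the function and variable names are invented for the example.

```python
def efficiency_percent(measured, single, copies=1):
    """Measured rate as a percentage of the ideal rate (copies x single)."""
    return 100.0 * measured / (copies * single)

# (average MFLOPS, stand-alone MFLOPS, average FPS, stand-alone FPS)
# copied from the All 1000 MHz table above.
COLUMNS = {
    "16 KB  8 ops": (2618, 882, 17.0, 17.3),
    "64 KB  8 ops": (1929, 871, 13.2, 17.3),
    "48 KB 32 ops": (2217, 771, 16.0, 17.3),
}

for label, (mflops, one_mflops, fps, one_fps) in COLUMNS.items():
    cpu = efficiency_percent(mflops, one_mflops, copies=3)
    gpu = efficiency_percent(fps, one_fps)
    print(f"{label}: MFLOPS {cpu:.1f}%  FPS {gpu:.1f}%")
```

Small differences from the table in the last digit are to be expected, since the printed averages are already rounded.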


 ###############################################################################

              Raspberry Pi 3, 1200 MHz - 3 burninfpuPiA7 programs

         16 KB 8 Ops Per Word     64 KB 8 Ops Per Word     48 KB 32 Ops Per Word

 Seconds   C   MHz MFLOPS FPS      C   MHz MFLOPS FPS      C   MHz MFLOPS FPS

      0  73.6  1200               69.8  1200               74.1  1200
     30  81.1  1130   4921  17    79.5  1200   4821  17    82.7   994   5019  17
     60  82.7   922   4351  17    82.7   947   4646  17    84.4   776   3636  17
     90  83.8   816   3847  17    83.8   854   3624  17    82.7   600   3189  17
    120  84.4   756   3287  17    84.9   766   3337  17    84.4   829   2935  17
    150  84.4   735   3152  17    84.4   750   2999  17    83.3   600   2921  17
    180  84.4   723   2983  17    84.9   784   2864  17    85.4   708   2834  17
    210  84.9   715   2908  17    82.7   897   2838  17    82.7   600   2805  17
    240  84.9   753   2895  17    84.4   808   2847  16    85.4   710   2787  16
    270  84.9   714   2886  16    83.8   600   2838  16    83.8   600   2741  16
    300  82.7   600   2860  16    84.9   707   2819  16    85.4   600   2821  16 
 




Raspberry Pi 3 Multiple Maximum MFLOPS Tests without OpenGL

Following are results from running four copies of the burninfpuPiA7 test program, along with the new Temperature & MHz Recorder. The first results are for the initial 13 passes, via the older Operating System (2016-03-18-raspbian-jessie), with the OpenGL GLUT driver enabled, showing how CPU MHz and all stress tests become slower as temperature rises. Note that run time parameters are determined at full speed, leading to increased time at lower speeds. In this case, the display was replaced by the rainbow coloured square when the temperature reached 80C but the programs continued running.

Results from other tests, that all ran successfully for more than the 15 minutes heat/speed measurements, are shown below. Two used the older OS, one with no GLUT driver and one with the driver installed, but not enabled. Yet, the latter produced higher temperatures and slower speeds. Finally, details of another run using the newer OS (2016-05-27-raspbian-jessie) are provided, with the driver installed and enabled but with even higher temperatures and slower speeds. Note that, particularly for the latter, higher temperatures and slower speeds, than noted during the failing tests, were recorded.

The third table of Raspberry Pi 3 results provides New OS, Driver Enabled test comparisons using different CPU heatsinks. Old is the original one supplied with the initial kit, Black is the latest from Pi Hut in September 2016, and Copper is the rather swish Enzotech BMR-C1, kindly supplied by Doc Watson in September 2016.

For comparison purposes, the MFLOPS speed is the most reliable, representing the average over at least 15 seconds, at the minute intervals. CPU MHz (I believe) is an instantaneous measurement that can go up and down during the test period. All tests suffered from CPU throttling of more than 24%. Black was the least sufferer, but it did start with the CPU a little cooler.

The fourth table is for the revised program burninfpuPi2, where single core speeds approach 3 GFLOPS, this time for the black and copper heatsinks, plus the latter with the case cover removed. There was not much difference on tests with the two heatsinks, but throttling started earlier than when using the previous stress test version. There was still some throttling with the covers removed.

Again, tests on the RPi 3, in the FLIRC Case, constantly ran at the maximum CPU speed of 1200 MHz, with throughput of around 11.7 GFLOPS from the four cores, and maximum CPU temperature of 69.8'C.

The tests for table 3 were repeated later, where, over the whole testing time, each core ran at around 1730 MFLOPS and 1200 MHz, with maximum CPU temperature of 67.1'C (with a warmer start).

Room temperature was around 22C for all measurements.


 ######################################################################

  4 CPU Tests Old OS, Driver Enabled - Failed with temperature > 80C.
 
    Burn-In-FPU Linux/ARM A7 v1.0 Sat Jun 25 22:26:57 2016

     Using 16 KBytes, 8 Operations Per Word, 15 Seconds Per Pass

                                                    Prog2 Prog3 Prog4
 Pass  4 B Ops Repeat   Secs MFLOPS    First    All ----- MFLOPS ----    C    MHz
       Wds /Wd Passes                 Results  Same                     
    0                                                                   60.1  1200
    1 4000  8  740000  15.08  1571 0.540749788  Yes  1583  1623  1569   69.8  1200
    2 4000  8  740000  14.56  1626 0.540749788  Yes  1648  1635  1621   74.7  1200
    3 4000  8  740000  14.55  1627 0.540749788  Yes  1648  1654  1598   76.8  1200
    4 4000  8  740000  14.57  1625 0.540749788  Yes  1644  1648  1643   79.5  1200
    5 4000  8  740000  14.44  1640 0.540749788  Yes  1635  1629  1613   81.1  1200
    6 4000  8  740000  15.52  1525 0.540749788  Yes  1515  1501  1487   81.1  1128
    7 4000  8  740000  16.60  1427 0.540749788  Yes  1417  1411  1395   81.7  1052
    8 4000  8  740000  17.29  1369 0.540749788  Yes  1357  1348  1329   82.2  1020
    9 4000  8  740000  17.93  1320 0.540749788  Yes  1311  1298  1292   82.7   971
   10 4000  8  740000  18.19  1302 0.540749788  Yes  1292  1274  1294   82.7   953
   11 4000  8  740000  18.41  1287 0.540749788  Yes  1278  1258  1278   83.3   920
   12 4000  8  740000  18.69  1267 0.540749788  Yes  1254  1227  1254   83.3   903
   13 4000  8  740000  19.15  1237 0.540749788  Yes  1225  1200  1224   82.7   881
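
The log above shows why pass times stretch when the CPU is throttled: the work per pass is fixed (words x operations per word x repeat passes), so MFLOPS and seconds are inversely related. The sketch below assumes that relationship; the function names are invented for the example and the figures are taken from the log.

```python
def mflops(words, ops_per_word, repeat_passes, seconds):
    """Floating point operations per pass divided by elapsed time, in millions."""
    return words * ops_per_word * repeat_passes / seconds / 1e6

def seconds_for_pass(words, ops_per_word, repeat_passes, rate_mflops):
    """Elapsed time needed to complete one fixed-size pass at a given rate."""
    return words * ops_per_word * repeat_passes / (rate_mflops * 1e6)

# Pass 2 (full speed) and pass 13 (throttled), 4000 words, 8 ops, 740000 repeats.
print(mflops(4000, 8, 740000, 14.56))        # log reports 1626 MFLOPS
print(mflops(4000, 8, 740000, 19.15))        # log reports 1237 MFLOPS
print(seconds_for_pass(4000, 8, 740000, 1237))
```

The repeat count of 740000 was calibrated for about 15 seconds at full speed, so a drop to around 1237 MFLOPS pushes each pass to roughly 19 seconds.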

 ######################################################################

  4 CPU Tests - All Successful - Above Failed Old OS with Driver Enabled
 
           New OS                Old OS                Old OS
           Driver Enabled        No Driver             Driver Installed

                       1 Test                1 Test                1 Test
  Minute     C    MHz MFLOPS      C    MHz MFLOPS      C    MHz MFLOPS

       0   56.4   1200           42.9   1200           53.7   1200
       1   75.8   1200   1661    63.9   1200   1640    73.1   1200   1657
       2   81.1   1125   1597    69.8   1200   1655    79.5   1200   1657
       3   82.2   1015   1407    73.6   1200   1657    81.1   1088   1517
       4   82.7    962   1333    76.3   1200   1653    81.7   1035   1443
       5   82.7    938   1272    77.9   1200   1649    82.2   1007   1391
       6   82.7    919   1251    80.1   1200   1665    82.2    972   1340
       7   82.7    905   1236    80.6   1146   1652    82.2    975   1340
       8   83.8    882   1211    81.7   1099   1568    82.2    945   1303
       9   83.8    886   1201    81.7   1075   1536    82.2    939   1298
      10   83.8    858   1190    81.7   1056   1500    82.7    934   1287
      11   83.3    856   1182    81.7   1035   1446    82.7    933   1278
      12   83.3    887   1295    81.7   1039   1452    82.7    933   1272
      13   82.2    942   1296    81.7   1015   1438    82.7    912   1262
      14   82.2    924   1296    82.2   1002   1426    82.7    913   1259
      15   82.7    936   1270    82.2    991   1414    82.7    906   1253
      16   82.7    927   1149    82.2   1001   1400    83.3    916   1239

    Min            856   1149            991   1400            906   1239
    Max    83.8                  82.2                  83.3

 ######################################################################

   Repeat 4 CPU Tests New OS Driver Enabled Different Heatsinks

          Old Heatsink          Black Heatsink        Copper Heatsink       FLIRC
                                                                             Case
                       1 Core                1 Core                1 Core   
  Minute     C    MHz MFLOPS      C    MHz MFLOPS      C    MHz MFLOPS      C

       0   47.2   1200           44.0   1200           46.2   1200           47.8
       1   67.1   1199   1729    65.0   1200   1731    65.0   1200   1726    60.1
       2   74.7   1200   1725    71.4   1200   1729    72.5   1199   1727    61.8
       3   78.4   1200   1725    76.3   1200   1731    77.4   1200   1732    62.3
       4   81.1   1133   1658    79.5   1200   1734    80.6   1157   1725    63.4
       5   81.1   1051   1530    81.7   1132   1637    81.7   1044   1532    63.4
       6   81.7   1011   1471    81.7   1058   1538    81.7   1005   1459    64.5
       7   82.2   1004   1430    80.6   1042   1489    82.2    976   1408    64.5
       8   82.2    978   1392    82.2   1011   1467    82.2    966   1374    65.0
       9   82.7    942   1360    82.2    976   1429    82.7    935   1344    65.5
      10   82.7    939   1341    82.2    985   1407    82.7    939   1331    65.5
      11   82.7    930   1327    82.7    963   1381    82.7    922   1320    66.1
      12   83.3    904   1309    82.2    939   1353    83.3    892   1302    66.6
      13   82.7    892   1290    82.7    936   1339    83.3    895   1292    66.6
      14   83.3    877   1269    82.7    915   1315    83.3    878   1267    67.1
      15   83.8    872   1253    82.7    910   1321    83.3    873   1267     Fin

    min    67.1    872   1253    65.0    910   1315    65.0    873   1267    60.1
    max    83.8   1200   1729    82.7   1200   1734    83.3   1200   1732    67.1
   Loss
    %             27.3   27.5           24.2   24.2           27.3   26.8

 ######################################################################

    Revised Benchmark Max MFLOPS > 2900 Per Core - New OS Driver Enabled

                  FLIRC Case constant 1200 MHz and 4 x 2925 MFLOPS 

          Black Heatsink        Copper Heatsink       Copper No Cover       FLIRC
                                                                             Case
                       4 Core                4 Core                4 Core
  Minute     C    MHz MFLOPS      C    MHz MFLOPS      C    MHz MFLOPS      C

       0   49.9   1200           41.9   1200           46.2   1200           41.9
       1   73.6   1200  11699    65.0   1200  11706    67.1   1200  11720    56.9    
       2   81.7   1124  11282    73.6   1200  11709    74.1   1200  11709    59.6
       3   82.7    977   9489    79.0   1200  11726    79.0   1200  11682    61.2
       4   82.7    917   8954    81.7   1038  10322    80.6   1118  11059    62.3
       5   83.8    867   8545    82.2    963   9629    81.7   1048  10296    63.4
       6   83.8    846   8252    82.7    932   9165    81.7   1015  10073    65.0
       7   83.8    830   8085    83.8    876   8832    81.7    991   9812    65.5
       8   83.8    809   7991    83.3    867   8558    81.7    991   9684    66.3
       9   83.8    816   7860    83.8    842   8318    82.2    963   9556    67.1
      10   83.8    795   7738    83.8    824   8146    82.7    965   9369    67.1
      11   84.4    782   7663    83.8    821   8051    82.7    968   9342    68.2
      12   84.4    787   7625    83.8    813   7966    82.7    953   9241    69.3
      13   83.8    844   8212    83.8    812   7879    82.2    956   9203    69.3
      14   83.8    827   8177    84.4    796   7780    82.7    948   9194    69.8
      15   84.4    830   8133    84.4    794   7710    82.7    949   9109     Fin 

    min    73.6    782   7625    65.0    794   7710    67.1    948   9109    56.9
    max    84.4   1200  11699    84.4   1200  11726    82.7   1200  11720    69.8
   Loss
    %             34.8   34.8           33.8   34.2           21.0   22.3
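
The Loss % rows in the two heatsink tables are consistent with the drop from the maximum to the minimum value, expressed as a percentage of the maximum. The sketch below recomputes a few of them on that assumption; the function name is invented for the example.

```python
def loss_percent(maximum, minimum):
    """Reduction from the maximum to the minimum, as a percentage of the maximum."""
    return 100.0 * (maximum - minimum) / maximum

# (max, min) pairs copied from the min and max rows of the tables above.
CASES = {
    "Old heatsink MHz": (1200, 872),
    "Old heatsink MFLOPS": (1729, 1253),
    "Black heatsink MHz, revised benchmark": (1200, 782),
    "Copper no cover MFLOPS, revised benchmark": (11720, 9109),
}

for label, (maximum, minimum) in CASES.items():
    print(f"{label}: {loss_percent(maximum, minimum):.1f}% loss")
```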
 




Fixed Point Test - stressIntPiA7

This has six tests that alternately write and read data and six tests using write once and read many times, each test using two data patterns out of 24 variations. Some are shown in the results. The read phase comprises an equal number of additions and subtractions, with the data being unchanged afterwards. This is checked for correctness, at the end of each test, and any errors reported. Run time parameters are provided for KBytes memory used, seconds for each of the twelve tests and log number for use in multitasking. Default parameters are shown below.

The assembly instruction count for the reading test loop, covering 32 words or 128 Bytes, is 18 adds, 16 subtracts, 33 loads, a compare and a branch, or a total of 69. This equates to 0.539 instructions per byte read from memory. With a maximum of 1659 MB/second, Millions of Instructions Per Second (MIPS) executed is 894 (with CPU at 900 MHz).
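
The arithmetic in the paragraph above can be checked with a few lines. This is just a worked calculation using the figures quoted in the text; the variable names are invented for the example.

```python
# Instruction mix of the read loop, as listed in the text.
LOOP_INSTRUCTIONS = {"add": 18, "subtract": 16, "load": 33, "compare": 1, "branch": 1}
BYTES_PER_LOOP = 32 * 4  # 32 words of 4 bytes

total_instructions = sum(LOOP_INSTRUCTIONS.values())
instructions_per_byte = total_instructions / BYTES_PER_LOOP

def mips(mb_per_second):
    """Millions of instructions per second for a given read rate in MB/second."""
    return mb_per_second * instructions_per_byte

print(total_instructions, round(instructions_per_byte, 3), round(mips(1659)))
```

At 900 MHz, 894 MIPS is close to one instruction per clock cycle.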


             Command:-   ./stressIntPiA7 KB 8 Seconds 1 Log 0

  Integer Stress Test Linux/ARM A7 v1.0 Tue Apr  7 11:58:17 2015

   8 KBytes Cache or RAM Space, 1 Seconds Per Test, 12 Tests

 Write/Read
  1    1462 MB/sec  Pattern 00000000 Result OK      89262 passes
  2    1513 MB/sec  Pattern FFFFFFFF Result OK      92357 passes
  3    1518 MB/sec  Pattern A5A5A5A5 Result OK      92642 passes
  4    1514 MB/sec  Pattern 55555555 Result OK      92391 passes
  5    1508 MB/sec  Pattern 33333333 Result OK      92051 passes
  6    1497 MB/sec  Pattern F0F0F0F0 Result OK      91374 passes
 Read
  1    1659 MB/sec  Pattern 00000000 Result OK     202600 passes
  2    1659 MB/sec  Pattern FFFFFFFF Result OK     202600 passes
  3    1659 MB/sec  Pattern A5A5A5A5 Result OK     202600 passes
  4    1659 MB/sec  Pattern 55555555 Result OK     202600 passes
  5    1659 MB/sec  Pattern 33333333 Result OK     202600 passes
  6    1659 MB/sec  Pattern F0F0F0F0 Result OK     202600 passes

                   End at Tue Apr  7 11:58:29 2015

############################################################################

 16 KB L1 Cache
 W/Rd  1573 MB/sec  Pattern F0F0F0F0 Result OK      47991 passes
 Read  1650 MB/sec  Pattern F0F0F0F0 Result OK     100800 passes

 96 KB L2 Cache
 W/Rd  1275 MB/sec  Pattern F0F0F0F0 Result OK       6486 passes
 Read  1500 MB/sec  Pattern F0F0F0F0 Result OK      15300 passes

 50 MB RAM
 W/Rd   932 MB/sec  Pattern F0F0F0F0 Result OK         10 passes
 Read  1244 MB/sec  Pattern F0F0F0F0 Result OK         26 passes

 4 x 160 MB  Pattern F0F0F0F0
 W/Rd   MB/sec   382 + 373 + 372 + 373 = 1500
 Read   MB/sec   366 + 350 + 335 + 350 = 1401
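
The MB/sec figures in the logs above can be related to the pass counts. The sketch below assumes that a KByte is 1024 bytes, that MB means millions of bytes, and that the Write/Read tests count the data twice (once written, once read). These assumptions are inferred from the printed numbers, not taken from the program source.

```python
def mb_per_second(kbytes, passes, seconds, write_and_read=False):
    """Data rate in millions of bytes per second for a given pass count."""
    transfers = 2 if write_and_read else 1  # assumed: write + read counted separately
    return kbytes * 1024 * passes * transfers / seconds / 1e6

# 8 KB default run, nominally 1 second per test.
print(mb_per_second(8, 202600, 1.0))                      # log reports 1659 (Read)
print(mb_per_second(8, 89262, 1.0, write_and_read=True))  # log reports 1462 (Write/Read)
```

The actual elapsed time is slightly over the nominal second, so the logged values can be a fraction lower than this estimate.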
  




Fixed Point Tests and OpenGL

Following are results from running four copies of stressIntPiA7 and OpenGL1PiR, whilst monitoring CPU MHz and temperature. The tests were run at both 900 and 1000 MHz, with data using the four L1 caches, the shared L2 cache and RAM. During all tests, CPU MHz recordings were constant at the maximum speed. Unlike the MFLOPS tests, performance at 1000 MHz was sometimes slower than at 900 MHz, and CPU temperatures were not necessarily higher.

As indicated above, running four RAM tests without OpenGL produced a total throughput of at least 1400 MB/second, with 932/1244 using a single core. This time, reading speeds were much slower than the latter, and the tests continued for a minute longer than the specified duration, due to OpenGL influence.

Maximum temperatures at 900 MHz were much higher than with the MFLOPS tests, L1 cache based calculation speeds were 3.6/3.8 times faster than single core tests, and OpenGL FPS scores were similar to those in the MFLOPS tests. Then, at 1000 MHz, FPS measurements were again similar but, inexplicably, fixed point calculation speeds were much slower than at 900 MHz, with little change in maximum temperature.


   900 MHz   L1 cache  4 x 16 KB   L2 cache  4 x 40 KB    RAM    4 x 160 MB

                 4 Tests               4 Tests               4 Tests
   Seconds    C  MB/sec    FPS     C  MB/sec    FPS     C  MB/sec    FPS

         0   45.5                  47.1                  47.6
        80   66.4   5683    5.4    63.8   2118    4.7    61.1    713    4.3
       160   71.3   5649    6.0    67.0   1985    5.2    64.3    675    4.4
       240   73.4   5639    6.0    69.1   2016    5.2    65.9    661    4.2
       320   74.5   5651    6.0    70.8   1974    5.0    67.0    669    4.6
       400   76.1   5649    6.1    71.3   1996    5.0    67.0    680    4.7
       480   77.2   5656    6.1    74.5   2019    5.1    67.5    674    4.8

       560   77.7   6310    6.2    78.8   5637    6.8    68.1    664    4.9
       640   79.9   6298    6.4    81.5   5618    6.8    68.6    668    5.0
       720   81.5   6301    6.5    83.1   5617    6.8    69.7    671    5.1
       800   82.0   6305    6.7    83.7   5615    6.8    70.8    662    5.2
       880   82.0   6296    6.7    84.2   5488    7.0    70.8    670    5.2
       960   83.1   6292    6.6    83.1   5408    6.9    71.3    711    5.1
    later                                                       1505

   1000 MHz  L1 cache  4 x 16 KB   L2 cache  4 x 40 KB    RAM    4 x 160 MB

                 4 Tests               4 Tests               4 Tests
   Seconds    C  MB/sec    FPS     C  MB/sec    FPS     C  MB/sec    FPS

         0   46.0                  51.9                  43.3
        80   74.5   3027    7.1    74.0   2948    5.5    60.5    889    4.8
       160   78.8   2943    7.5    78.3   2859    5.7    64.8    840    5.0
       240   83.1   2871    7.1    79.4   2786    5.7    67.0    855    4.8
       320   84.2   2831    7.2    81.5   2744    5.6    70.2    846    4.7
       400   84.2   2940    7.4    83.1   2851    5.8    70.8    853    4.7
       480   83.7   2983    7.2    82.0   2895    5.7    73.4    849    5.0

       560   83.1   5253    7.0    84.7   5161    7.1    73.4    867    5.0
       640   83.7   5002    7.3    82.0   4913    7.1    74.5    902    5.5
       720   82.0   4905    7.0    84.2   4814    6.8    75.6    849    5.8
       800   83.1   4857    7.1    83.1   4767    6.9    76.7    867    5.7
       880   84.2   4817    7.3    83.1   4727    7.3    77.7    865    5.7
       960   85.3   4774    7.3    85.3   4682    7.0    77.7    853    5.5
    later                                                       1912
   




Fixed Point Tests Without OpenGL

The fixed point test stressIntPiA7 was also found to lead to performance being degraded due to Raspberry Pi 3 CPU MHz being throttled. So, tests were run using four copies of this program without running OpenGL benchmarks, each using 40 KB L2 cache space.

The first results below are from running the tests on a Raspberry Pi 2, without the experimental OpenGL GLUT driver being installed. Initial tests showed that CPU temperature did not increase much, and throughput remained at around four times that of a standalone run. A further test, with results below, was run. These are for one stressIntPiA7 log file, the others being virtually identical. A lamp was used to increase temperatures after two minutes. This eventually led to throttling, from the overclocked 1000 MHz, to 600 MHz, probably on and off, as MB/second speeds were not reduced in the same proportion.

The second set of results are from using a Raspberry Pi 3 with the OpenGL GLUT driver installed. These indicate CPU throttling, some within the first 80 second pass and later variability, often down from 1200 to 600 MHz, and MB/second approaching half speed. Again, all four programs essentially produced the same degraded performance.

The third table is for three runs on the Raspberry Pi 3, with the copper heatsink attached to the CPU (see above). With such variance, it is hardly possible to compare results with those for the first RPi3 above.

The fourth table is one of the virtually identical logs from repeating the tests on the RPi 3 in the FLIRC Case, plus measured CPU temperature and CPU MHz. The latter was constant and there was no degradation in measured MB/second. Unlike the floating point tests, temperature just about reached the point where throttling would start.


 ##############################################################################

   Raspberry Pi 2 Overclocked, Heated with lamp after two minutes, 1 log of 4

  Integer Stress Test Linux/ARM A7 v1.0 Tue Jul 12 08:19:32 2016

   40 KBytes Cache or RAM Space, 80 Seconds Per Test, 12 Tests
                                                                    MHz    C

 Write/Read                                                        1000   43.3          
  1    1356 MB/sec  Pattern 00000000 Result OK    1323953 passes   1000   63.8
  2    1360 MB/sec  Pattern FFFFFFFF Result OK    1327729 passes   1000   67.0 
  3    1360 MB/sec  Pattern A5A5A5A5 Result OK    1327936 passes   1000   70.2
  4    1361 MB/sec  Pattern 55555555 Result OK    1328912 passes   1000   72.4
  5    1361 MB/sec  Pattern 33333333 Result OK    1328838 passes   1000   75.1
  6    1360 MB/sec  Pattern F0F0F0F0 Result OK    1328438 passes   1000   77.7
 Read
  1    1646 MB/sec  Pattern 00000000 Result OK    3214400 passes   1000   81.5
  2    1634 MB/sec  Pattern FFFFFFFF Result OK    3191700 passes   1000   83.1
  3    1479 MB/sec  Pattern A5A5A5A5 Result OK    2888900 passes    600   82.6   
  4    1372 MB/sec  Pattern 55555555 Result OK    2679800 passes    600   84.7
  5    1315 MB/sec  Pattern 33333333 Result OK    2569200 passes    600   85.3
  6    1257 MB/sec  Pattern F0F0F0F0 Result OK    2456400 passes    600   85.3


                   End at Tue Jul 12 08:35:32 2016

             One Program Stand Alone, 1 Second Per Test
 Write/Read
  1    1377 MB/sec  Pattern 00000000 Result OK    1345001 passes
 Read
  1    1676 MB/sec  Pattern 00000000 Result OK    3274700 passes

 ##############################################################################

                        Raspberry Pi 3, 1 log of 4

  Integer Stress Test Linux/ARM A7 v1.0 Mon Jul 11 19:58:19 2016

   40 KBytes Cache or RAM Space, 80 Seconds Per Test, 12 Tests
                                                                    MHz    C

 Write/Read                                                        1200   62.8
  1    2472 MB/sec  Pattern 00000000 Result OK    2413986 passes    735   84.9
  2    1853 MB/sec  Pattern FFFFFFFF Result OK    1809189 passes    600   83.3
  3    1792 MB/sec  Pattern A5A5A5A5 Result OK    1749631 passes    600   83.3
  4    1770 MB/sec  Pattern 55555555 Result OK    1728301 passes    600   83.3
  5    1733 MB/sec  Pattern 33333333 Result OK    1692058 passes    891   83.8
  6    1745 MB/sec  Pattern F0F0F0F0 Result OK    1704346 passes    864   84.4
 Read
  1    1788 MB/sec  Pattern 00000000 Result OK    3491600 passes    714   85.4
  2    1723 MB/sec  Pattern FFFFFFFF Result OK    3366300 passes    600   84.9
  3    1670 MB/sec  Pattern A5A5A5A5 Result OK    3261500 passes    600   83.3
  4    1661 MB/sec  Pattern 55555555 Result OK    3244700 passes    600   83.8
  5    1662 MB/sec  Pattern 33333333 Result OK    3246100 passes    744   85.4
  6    1647 MB/sec  Pattern F0F0F0F0 Result OK    3217300 passes    600   83.8


                   End at Mon Jul 11 20:14:20 2016

             One Program Stand Alone, 1 Second Per Test
 Write/Read
  1    3099 MB/sec  Pattern 00000000 Result OK      37825 passes
 Read
  1    3220 MB/sec  Pattern 00000000 Result OK      78700 passes

 ##############################################################################

               Raspberry Pi 3, 3 Runs, Copper Heatsink 

           ----- MB/sec -----    ------- MHz ------   -------- 'C -------
     Run      1      2      3       1      2      3       1      2      3

   Write/Read                    1200   1200   1200    68.8   52.6   50.5
       1   2889   3145   3171    1000   1200   1200    82.2   73.6   77.9
       2   2384   3058   2776     822   1037    916    83.8   81.7   82.7
       3   2108   2509   2279     759    910    826    83.8   83.3   82.7
       4   1993   2261   2133     742    829    783    84.4   83.8   83.8
       5   1924   2137   2038     710    813    762    84.9   83.8   84.9
       6   1896   2091   1970     739    805    745    84.4   83.8   83.8
   Read
       1   2015   2099   2047     729    791    763    84.9   84.9   84.4
       2   1934   2016   1983     600    748    720    83.8   84.9   84.9
       3   1409   2031   1986     763    767    738    84.4   84.4   84.9
       4   1699   1993   1981     751    760    732    85.4   84.9   84.9
       5   1543   1857   1966     738    600    731    84.4   82.7   84.9
       6   1870   1849   1950     600    600    931    75.8   76.8   81.7

    min    1409   1849   1950
    max                                                85.4   84.9   84.9

##############################################################################

 Raspberry Pi 3 in FLIRC Case, 1 of 4 logs with virtually the same results

   Integer Stress Test Linux/ARM A7 v1.0 Mon Oct  3 22:15:28 2016

   40 KBytes Cache or RAM Space, 80 Seconds Per Test, 12 Tests
                                                                    MHz    C

 Write/Read                                                        1200  46.2
  1    3084 MB/sec  Pattern 00000000 Result OK    3011338 passes   1200  65.5 
  2    3128 MB/sec  Pattern FFFFFFFF Result OK    3054608 passes   1200  68.8
  3    3137 MB/sec  Pattern A5A5A5A5 Result OK    3063003 passes   1200  70.9
  4    3146 MB/sec  Pattern 55555555 Result OK    3072197 passes   1200  72.0
  5    3151 MB/sec  Pattern 33333333 Result OK    3077292 passes   1200  73.1
  6    3148 MB/sec  Pattern F0F0F0F0 Result OK    3074154 passes   1200  74.1
 Read
  1    3179 MB/sec  Pattern 00000000 Result OK    6209000 passes   1200  74.1
  2    3175 MB/sec  Pattern FFFFFFFF Result OK    6202000 passes   1200  74.1
  3    3177 MB/sec  Pattern A5A5A5A5 Result OK    6204800 passes   1200  75.8
  4    3178 MB/sec  Pattern 55555555 Result OK    6207300 passes   1200  78.4
  5    3185 MB/sec  Pattern 33333333 Result OK    6220600 passes   1200  79.5
  6    3189 MB/sec  Pattern F0F0F0F0 Result OK    6229600 passes   1200  80.1

                   End at Mon Oct  3 22:31:28 2016
  




Drive, USB and LAN Test - burnindrive2

This is essentially the same as my program used during hundreds of UK Government and University computer acceptance trials during the 1970s and 1980s, with some significant achievements. Burnindrive writes four files, each using 164 blocks of 64 KB, repeated 16 times (164.0 MB), with each block containing a unique data pattern. The files are then read for two minutes, in a quasi-random sequence, with data and file ID checked for correct values. Then each block (unique pattern) is read numerous times, over one second, again with checking for correct values. Total time is normally about 5 minutes for all tests, with default parameters. For further information, including data patterns and an example of the reading sequence, see the original burnindrive report. This new version is the same as the older one, except that configuration details that are not required are no longer produced. Details of input parameters and an example results log are below.

 
      Run Time Parameters - Upper or Lower Case
                                                                              Default
   R or Repeats             Data size, multiplier of 10.25 MB, more or less     16
   P or Patterns            Number of patterns for smaller files < 164         164
   M or Minutes             Large file reading time                              2
   L or Log                 Log file name extension 0 to 99                      0
   S or Seconds             Time to read each block, last section                1
   F or FilePath            For other than SD card or SD card directory - see examples
   C or CacheData           Omit O_DIRECT on opening files to allow caching      No  
   O or OutputPatterns      Log patterns and file sequences used as above        No
   D or DontRunReadTests    Or only run write tests                              No   

  Format ./burnindrive2 Repeats 16, Minutes 2, Log 0, Seconds 1 
     or  ./burnindrive2 R 16, M 2, L 0, S 1
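The relationship between the Repeats and Patterns parameters and the file size can be sketched as follows. The function names are hypothetical, and it is an assumption that Patterns simply sets the number of 64 KB blocks per repeat and that 1 MB is taken as 1024 KB. With those assumptions the defaults reproduce the 10.25 MB multiplier and the 164.00 MB file size quoted.

```python
BLOCK_KB = 64  # block size stated in the program description

def file_size_mb(repeats=16, patterns=164):
    """Size of one test file in MB, taking 1 MB as 1024 KB (assumption)."""
    return patterns * BLOCK_KB * repeats / 1024

def total_written_mb(repeats=16, patterns=164, files=4):
    """Total data written across all test files, in MB."""
    return files * file_size_mb(repeats, patterns)

print(file_size_mb(repeats=1))   # size contributed by a single repeat
print(file_size_mb())            # one file with default parameters
print(total_written_mb())        # four files with default parameters
```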


     Example Log System SD Card - IOlog0.txt

 Command ./burnindrive2
 ###############################################################

 Current Directory Path: 
 /home/pi/benchmarks/reliability/burnindrive/new
 Total MB    6266, Free MB    3428, Used MB    2838

 Linux Storage Stress Test for ARM v2.0, Sat Apr 11 12:55:38 2015

 File size  164.00 MB x 4 files, minimum reading time  2.0 minutes

 File 1  164.00 MB written in   17.57 seconds 
 File 2  164.00 MB written in   18.22 seconds 
 File 3  164.00 MB written in   18.06 seconds 
 File 4  164.00 MB written in   17.98 seconds 


         Total   71.83 seconds, Elapsed   71.83 seconds

              Start Reading Sat Apr 11 12:56:50 2015

 Read passes     1 x 4 Files x  164.00 MB in     0.68 minutes
 Read passes     2 x 4 Files x  164.00 MB in     1.36 minutes
 Read passes     3 x 4 Files x  164.00 MB in     2.04 minutes

            Start Repeat Read Sat Apr 11 12:58:52 2015

 Passes in 1 second(s) for each of 164 blocks of 64KB:

    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280    280
    260    280    280    280    280    280    280    280    280    280    280
    280    280    280    280    280    280    280    280    280    280

 45900 read passes of 64KB blocks in     2.86 minutes

  No errors found during reading tests

              End of test Sat Apr 11 13:01:44 2015
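The totals in the example log can be cross-checked and converted to data rates. This is a sketch only, using figures copied from the log and assuming that 1 MB is 1024 KB; the variable names are for illustration.

```python
BLOCK_KB = 64      # block size
FILE_MB = 164.0    # size of each file
FILES = 4          # number of files written

# Repeat-read section: 164 pass counts, all 280 except one entry of 260
# (first column of the fourteenth row in the log).
passes = [280] * 164
passes[143] = 260
total_passes = sum(passes)

# Data rates derived from the times reported in the log.
write_mb_s = FILES * FILE_MB / 71.83                    # four files written
read_mb_s = 3 * FILES * FILE_MB / (2.04 * 60)           # three read passes
repeat_mb_s = total_passes * BLOCK_KB / 1024 / (2.86 * 60)

print(f"Total repeat-read passes {total_passes}")
print(f"Write {write_mb_s:.1f}, Read {read_mb_s:.1f}, "
      f"Repeat read {repeat_mb_s:.1f} MB/second")
```

The pass counts sum to the 45900 reported in the log.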
  




Drive Test, Multiple Devices

Below is the information needed to set up system tests using drives and the LAN. For the latter, external drives have to be mounted, as shown below [1]. USB drives are mounted when plugged in. File paths can then be identified via a df command [2], for use in run commands [3]. To plan running time when using multiple devices, it is useful to run each test separately first, as slow writing times might need to be taken into account [4]. Default reading time is shown as between 2 and 3 minutes, with repeated reads taking not much longer than 164 x 1 second. However, running times using multiple drives can be unpredictable.

Monitoring - The RPiHeatMHz program [5] shows that drive tests have little impact on CPU temperature. When testing USB drives, the CPU remains at 600 MHz, with CPU utilisation, as monitored by vmstat [6], being negligible. Running the LAN test, the clock switches to 900 MHz, with CPU utilisation up to 60% of one CPU core. As shown, vmstat provides no details of network traffic volumes, but these can be obtained via the sar command. The latter needs installation of the sysstat package [7].
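For readers without RPiHeatMHz, similar readings can be sampled from the standard Linux sysfs files. The sketch below is not how RPiHeatMHz works (its method is not described here). It assumes the usual thermal_zone0 and cpufreq paths are present, and the function names are hypothetical.

```python
from pathlib import Path

# Standard Linux sysfs locations; assumed to exist on the system under test.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")
FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")

def parse_temp_c(raw):
    """Convert a sysfs temperature string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0

def parse_mhz(raw):
    """Convert a sysfs frequency string (kHz) to MHz."""
    return int(raw.strip()) / 1000.0

def read_sample():
    """Return (MHz, degrees C), or None if the sysfs files are unavailable."""
    try:
        return parse_mhz(FREQ_PATH.read_text()), parse_temp_c(TEMP_PATH.read_text())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    sample = read_sample()
    if sample is None:
        print("sysfs temperature/frequency files not available")
    else:
        print(f"{sample[0]:.0f} MHz  {sample[1]:.1f} C")
```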


  [1] Identify IP Address and Mount Drive

  Windows Command Prompt ipconfig command = 192.168.0.2
  Windows share drive (partition) d 

  sudo mount -t cifs -o dir_mode=0777,file_mode=0777 //192.168.0.2/d /media/public
  Password: raspberry - in this case unchanged default password
   
  Linux Terminal command ifconfig eth0 (or eth1) = 192.168.0.3
  Linux share directory all

  sudo mount -t cifs -o user=UU,password=PP //192.168.0.3/all /media/public
  UU and PP are IDs for Linux system, -o dir_mode=0777,file_mode=0777 not needed

  NOTE: If wrong IDs are used, a locked file will be generated and this leads to a
  failure to open a new file when correct IDs are used. The file must be deleted.

  ################################################################################

  [2] Identify Drive Path on RPi

  Command df

  Filesystem      1K-blocks      Used Available Use% Mounted on
  rootfs            6416312   2905604   3161732  48% /
  /dev/root         6416312   2905604   3161732  48% /
  devtmpfs           376896         0    376896   0% /dev
  tmpfs               76240      1432     74808   2% /run
  tmpfs                5120         0      5120   0% /run/lock
  tmpfs              152460         0    152460   0% /run/shm
  /dev/mmcblk0p5      60479     14527     45952  25% /boot
  /dev/sdd1         7811072     23156   7787916   1% /media/8GB
  /dev/mmcblk0p3      27633       435     24905   2% /media/SETTINGS
  /dev/sda1         7873384    822208   7051176  11% /media/SIGMA
  //192.168.0.2/d 235519996 160311812  75208184  69% /media/public

  SIGMA and 8GB are USB Flash Drives

  ################################################################################

  [3] Commands to Run Benchmark With Logs On RPi 

  Main SD Card  - ./burnindrive2 Log 71
  USB Drive     - ./burnindrive2 FilePath /media/SIGMA Log 72
  USB Drive     - ./burnindrive2 FilePath /media/8GB Log 73 
  Windows Drive - ./burnindrive2 FilePath /media/public/Temp Log 74

  ################################################################################

  [4] Running Times, Data Volumes and Speed

          Local SD            SIGMA               8GB                LAN
           secs    MB  MB/s   secs    MB  MB/s   secs    MB  MB/s   secs    MB  MB/s

  Write    70.4   656   9.3  183.3   656   3.6  253.9   656   2.6   69.0   656   9.5
  Read    124.2  1968  15.8  183.6  1312   7.1  147.6  1968  13.3  140.4  1312   9.3
  Repeats 171.6  2858  16.7  166.2  2259  13.6  171.6  2388  13.9  172.8  1640   9.5

  ################################################################################

  [5] Maximum Temperatures and CPU MHz

          Local               SIGMA               8GB                LAN

          600 MHz  51.4C     600 MHz  49.8C     600 MHz  49.8C    900 MHz  51.9C

  ################################################################################

  [6] vmstat and sar -n Performance Monitor Details

             ----KB/sec---    -Number/sec-      -----CPU Utilisation %----
                bi      bo      in      cs      us      sy      id      wa
  Main SD
  Write          1    8953    2181    1527       1       2      76      21
  Read       16720       6    2768    1653       1       2      74      23
  Repeats    16599       7    2570    1491       0       2      75      23

  USB
  Write          0    3714    2352    1030       0       3      74      23
  Read        7311       5    3468    1297       0       3      73      24
  Repeats    14000       7    5767    2346       0       5      72      23

  LAN
  Write          0      12   10692     974       1      14      85       0
  Read           0      11    3242    2304       1       9      90       0
  Repeats        0      12    3346    2356       0       9      90       0


  Command for 10 second samples of network statistics:-  sar -n DEV 10
 
              ---KB/sec---    Packets/sec      --B/packet--
                rx      tx     rx      tx        rx      tx

  Write        162   10280    3031    7003       53    1468
  Read        9943     123    6886    1362     1444      90
  Repeats    10060     126    6973    1390     1443      91

  To install sar [7]:- sudo apt-get install sysstat
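Two of the monitoring figures above can be reproduced by simple arithmetic. The sketch below assumes that sar reports KB as 1000 bytes for this purpose and that vmstat percentages are averaged over the four cores; both are assumptions, although they do reproduce the published bytes per packet and the quoted 60% of one core.

```python
CORES = 4  # vmstat CPU percentages are taken as an average over four cores

# (KB/sec, packets/sec, reported B/packet) for the dominant direction of
# each LAN test phase, copied from the sar table.
sar_rows = {
    "Write tx":   (10280, 7003, 1468),
    "Read rx":    (9943, 6886, 1444),
    "Repeats rx": (10060, 6973, 1443),
}

derived = {}
for name, (kb_s, pkts_s, reported) in sar_rows.items():
    derived[name] = round(kb_s * 1000 / pkts_s)
    print(f"{name}: {derived[name]} B/packet derived, {reported} reported")

# LAN write phase from the vmstat table: us = 1%, sy = 14% of all cores.
lan_write_busy_pct = 1 + 14
one_core_pct = lan_write_busy_pct * CORES
print(f"LAN write CPU load is about {one_core_pct}% of one core")
```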
  




System Stress Test

A sixteen minute system stress test was run, comprising burninfpuPiA7, stressIntPiA7 and burnindrive2 running on the main SD card, SIGMA USB drive and 8GB USB drive. Results for stand alone tests on the latter three are above. The script file [7], below, included RPiHeatMHz, the CPU MHz/temperature measuring program, and the vmstat performance monitor.

With the CPU tests running, unlike the stand alone tests, the CPU ran at 900 MHz throughout. Resultant drive speeds [8] were virtually the same as when the drive tests were run by themselves. Testing times, set by the parameters, were not quite right, with one drive test running longer than planned.

Bold times in seconds [8] indicate changes in the volumes of writing and reading. Monitored results reported [9] are at these points. The drive writing and reading speeds, surprisingly, reflected the sums derived from running times, particularly the 44 MB/second on reading. The two CPU tests produced 50% CPU utilisation, or 100% of two cores.
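The agreement between drive speeds and vmstat totals can be shown by summing the MB/second figures from [8] for the drives active at each monitored point and comparing with the KB/second values in [9]. Which drives were writing or reading at each point is an interpretation of the running times in [8], and 1 MB is assumed to be 1024 KB.

```python
KB_PER_MB = 1024  # assumed unit conversion
CORES = 4

checks = [
    # (secs, activity, MB/sec summed from table [8], vmstat KB/sec from [9])
    (80,  "SD, SIGMA and 8GB writing",    9.3 + 3.7 + 2.6,    15951),
    (180, "SIGMA and 8GB writing",        3.7 + 2.6,           6495),
    (180, "SD reading",                   16.2,               16685),
    (270, "SD and SIGMA reading",         16.2 + 7.0,         24115),
    (960, "all three on repeat reads",    16.6 + 13.1 + 13.8, 44296),
]

differences = []
for secs, activity, from_drives, vmstat_kb in checks:
    from_vmstat = vmstat_kb / KB_PER_MB
    diff_pct = abs(from_vmstat - from_drives) / from_drives * 100
    differences.append(diff_pct)
    print(f"{secs:4d} s  {activity:27s} drives {from_drives:5.1f}  "
          f"vmstat {from_vmstat:5.1f} MB/sec  ({diff_pct:.1f}% apart)")

# 50% user CPU utilisation averaged over four cores.
cores_busy = 50 / 100 * CORES
print(f"CPU tests keep {cores_busy:.0f} cores fully busy")
```

Every pair agrees to within a few percent, including the final combined reading speed of about 44 MB/second.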

Maximum CPU temperature [9] was not that high at 62.7C. This can be compared with 54.1C from the single program Livermore Loops Stress Test 2, or 78.3C from Test 3, with three copies and OpenGL1PiR running.

See screen shot of tests running in different windows via Raspberry Pi 2 Stress Test Screen Shot.


 [7] Script File runall.sh 

 lxterminal --geometry=80x15 -e ./burnindrive2 Secs 4 Mins 4 Log 21
 lxterminal --geometry=80x15 -e ./burnindrive2 Secs 4 Mins 1 FPath /media/SIGMA Log 22
 lxterminal --geometry=80x15 -e ./burnindrive2 Secs 4 FPath /media/8GB Log 23
 lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, secs 16
 lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 10 Sect 2 Mins 16 Log 20
 lxterminal --geometry=80x15 -e ./stressIntPiA7 KB 16 Secs 80 Log 11
 lxterminal --geometry=80x15 -e vmstat 10 96 > vmstat.txt

  ################################################################################

  [8] Running Times, Data Volumes and Speed

          Local SD            SIGMA               8GB              fpuPiA7 IntPiA7 
            secs    MB  MB/s   secs    MB  MB/s   secs    MB  MB/s    secs    secs

 Write      70.9   656   9.3  177.5   656   3.7  254.4   656   2.6
 Read      243.0  3936  16.2   94.2   656   7.0  145.2  1968  13.6
 Repeats   661.8 11019  16.6  664.2  8680  13.1  663.6  9159  13.8

 Total     975.7              935.9             1063.2               958.0  960.0

  ################################################################################

  [9] vmstat Performance Monitor Details and CPU Temperature

    After   --KB/sec--   Number/sec    --CPU Utilisation %--
     Secs     bi    bo    in     cs    us    sy     id    wa    C
        0                                                      49.2
       80      1 15951  4161   2743    52     7      3    38   56.8
      180  16685  6495  4597   2627    50     7      5    39   58.9
      270  24115  2748  5650   3070    50     6      5    39   58.9
      960  44296    13  9923   5324    50     9      6    35   62.7
  




Roy Longbottom at Linkedin   Roy Longbottom October 2016

The Official Internet Home for my Benchmarks is via the link
Roy Longbottom's PC Benchmark Collection