# Multitasking with multiple CPU tests
General
The programs used for stress testing are based on those described in Raspberry Pi Benchmarks.htm and Raspberry Pi Stress Tests.htm.
They are generally those compiled to make use of Raspberry Pi 2 ARMv7 CPU features, run by script files for multitasking purposes. The benchmarks, test programs and sample script files are in a single folder, available in Raspberry_Pi_2_Stress_Tests.zip.
A separate folder contains the source code, with the compile commands used shown in the header area.
For stress testing purposes, these programs have command parameters that determine running time and, sometimes, which particular hardware to use. Most also include performance measurements, reported at regular intervals, to identify speed reductions due to causes such as overheating or system interference. The test programs can be run via command lines in shell scripts, which allows multiple programs to be run at the same time, each in its own Terminal window. The commands sometimes include an option for different log file names, for cleaner results when more than one copy of the same program is run. Temperature and CPU MHz recording applications can be included in the mix.
All test programs check numeric answers or data transfers for correct or consistent values and report in the log files if incorrect.
2016 - Results of stress tests are included below, using a new OpenGL GLUT Benchmark and Maximum MFLOPS, also with Livermore Loops.
Later additions are results from a Raspberry Pi 3.
On conducting Raspberry Pi 3 multiprocessor stress tests, starting below, CPU speed was seen to be slowing down, or throttling, as temperature increased. Running the same tests on a tablet, with apparently the same Cortex-A53 processor, did not show this effect. Googling indicated that the Raspberry Pi’s Broadcom BCM2837 version is manufactured using a 40 nm process, the tablet having a cooler Snapdragon implementation with 28 nm lithography.
Raspberry Pi 3 stress tests were carried out using various heatsinks and with the case cover off, but throttling still occurred. Based on advice from the Raspberry Pi Forum, an aluminium FLIRC Case was purchased (relatively expensive). The RPi board is screwed to the lid, which has a cuboid protrusion that acts as a heatsink when clamped to the CPU via a thermal pad. This proved to be effective in reducing temperatures, with no throttling over the usual testing times. Results are included below.
Temperature and CPU MHz Recorder - RPiHeatMHz
This is designed to run in its own window, concurrently with other test programs. It identifies boot time speed settings, then CPU MHz and temperature, as defined by a command. The following command (optionally) opens a new terminal window with parameters for the number of samples and interval between reports.
As with other commands, upper or lower case can be used and only the first character is needed.
lxterminal -e ./RPiHeatMHz passes 60, seconds 15
Following is an example, as displayed and saved in RPiHeatMHz.txt log file, using default settings with 10 samples at 1 second intervals.
2016 - The original MHz was measured by reading scaling_cur_freq. It is now apparent that this does not show dynamic variations, so the latest version of RPiHeatMHz also includes results from the measure_clock arm command.
Below are example reports from the revised program running on a Raspberry Pi 3, showing that the two measures can be the same when nearly idling.
Temperature and CPU MHz Measurement
Start at Sun Mar 1 03:10:15 2015
Using 10 samples at 1 second intervals
Boot Settings
arm_freq=900
hdmi_force_hotplug=1
config_hdmi_boost=4
overscan_left=24
overscan_right=24
overscan_top=16
overscan_bottom=16
disable_overscan=0
core_freq=250
sdram_freq=450
over_voltage=0
Seconds
0.0 900 MHz temp=47.6°C
1.0 600 MHz temp=47.1°C
2.1 600 MHz temp=46.5°C
3.2 600 MHz temp=47.1°C
4.2 600 MHz temp=46.5°C
5.3 600 MHz temp=46.5°C
6.4 600 MHz temp=46.5°C
7.5 600 MHz temp=46.5°C
8.5 600 MHz temp=46.5°C
9.6 600 MHz temp=46.5°C
10.7 600 MHz temp=46.5°C
End at Sun Mar 1 03:10:25 2015
#################### New RPiHeatMHz ####################
Boot Settings
dtparam=audio=on
dtoverlay=vc4-kms-v3d
Seconds
0.0 1200 scaling MHz, 1200 ARM MHz, temp=58.0°C
15.0 1200 scaling MHz, 1200 ARM MHz, temp=67.1°C
30.1 1200 scaling MHz, 1200 ARM MHz, temp=70.9°C
45.1 1200 scaling MHz, 1200 ARM MHz, temp=73.6°C
60.2 1200 scaling MHz, 1200 ARM MHz, temp=75.8°C
75.3 1200 scaling MHz, 1200 ARM MHz, temp=78.4°C
90.5 1200 scaling MHz, 1200 ARM MHz, temp=79.5°C
105.6 1200 scaling MHz, 1160 ARM MHz, temp=80.6°C
120.7 1200 scaling MHz, 1075 ARM MHz, temp=81.1°C
135.8 1200 scaling MHz, 1051 ARM MHz, temp=81.7°C
150.9 1200 scaling MHz, 1023 ARM MHz, temp=81.7°C
166.0 1200 scaling MHz, 1020 ARM MHz, temp=82.2°C
181.1 1200 scaling MHz, 1006 ARM MHz, temp=82.2°C
Seconds
0.0 600 scaling MHz, 600 ARM MHz, temp=55.8°C
1.0 1200 scaling MHz, 1200 ARM MHz, temp=56.4°C
2.0 1200 scaling MHz, 1200 ARM MHz, temp=56.4°C
3.1 1200 scaling MHz, 1200 ARM MHz, temp=56.9°C
4.1 1200 scaling MHz, 1200 ARM MHz, temp=56.9°C
5.2 600 scaling MHz, 600 ARM MHz, temp=56.4°C
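The difference between the two MHz measures in the logs above can be picked out automatically. The following is a minimal sketch, not part of RPiHeatMHz, that assumes the new log line layout shown above; the names parse_samples and first_throttled are hypothetical.

```python
import re

# Matches lines such as "105.6 1200 scaling MHz, 1160 ARM MHz, temp=80.6°C".
SAMPLE = re.compile(
    r"^\s*([\d.]+)\s+(\d+) scaling MHz,\s+(\d+) ARM MHz,\s+temp=([\d.]+)"
)


def parse_samples(log_text):
    """Return (seconds, scaling_mhz, arm_mhz, temp_c) tuples found in a log."""
    samples = []
    for line in log_text.splitlines():
        match = SAMPLE.match(line)
        if match:
            secs, scaling, arm, temp = match.groups()
            samples.append((float(secs), int(scaling), int(arm), float(temp)))
    return samples


def first_throttled(samples):
    """Return the first sample where measured ARM MHz is below scaling MHz."""
    for sample in samples:
        if sample[2] < sample[1]:
            return sample
    return None


# Three lines copied from the Raspberry Pi 3 log above.
EXAMPLE_LOG = """\
 90.5 1200 scaling MHz, 1200 ARM MHz, temp=79.5°C
105.6 1200 scaling MHz, 1160 ARM MHz, temp=80.6°C
120.7 1200 scaling MHz, 1075 ARM MHz, temp=81.1°C
"""

if __name__ == "__main__":
    print(first_throttled(parse_samples(EXAMPLE_LOG)))
```

With the example lines, the first throttled sample is the one at 105.6 seconds, where the temperature has just passed 80°C.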
Livermore Loops Stress Test - liverloopsPiA7R
The Livermore Loops benchmark was converted to act as a stress test, following wrong numeric results being produced on an overclocked PC using a Pentium Pro CPU.
The Loops comprise 24 double precision floating point kernels, with performance measured in Millions of Floating Point Operations Per Second (MFLOPS). The kernel tests are repeated three times, with different data sizes. Adding a running time parameter for each loop converts the benchmark into a stress test, whereby numeric results of calculations are checked for correctness after each of the numerous passes, with errors being logged, along with performance details.
Detailed results are displayed continuously, as the tests are running. There is too much detail for logging. So, as shown below, the start times of each section are reported.
Certain changes were made to liverloopsPiA7, the RPi gcc 4.8 benchmark, firstly to include the expected results of computation. Then, code for the rather convoluted system configuration details was removed and an option was included to use different log files, for when multiple copies are run at the same time.
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 1
Above is an example command to open a new terminal window, run each test for approximately 12 seconds and save results in LoopsLog1.txt. Total time will be around 24 x 3 x 12 = 864 seconds.
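The run time estimate is simple arithmetic, sketched here for other Seconds settings. The function name estimated_run_seconds is hypothetical and the result ignores start-up and checking overheads.

```python
KERNELS = 24     # Livermore Loops kernels
DATA_SIZES = 3   # each kernel is repeated with three data sizes


def estimated_run_seconds(seconds_per_loop):
    """Approximate total running time for a given Seconds parameter."""
    return KERNELS * DATA_SIZES * seconds_per_loop


if __name__ == "__main__":
    total = estimated_run_seconds(12)
    print(f"{total} seconds, about {total / 60:.1f} minutes")
```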
Livermore Loops Benchmark vfpv4 32 Bit via C/C++ Tue Mar 24 12:24:22 2015
Reliability test 12 seconds each loop x 24 x 3
Part 1 of 3 start at Tue Mar 24 12:24:23 2015
Part 2 of 3 start at Tue Mar 24 12:29:10 2015
Part 3 of 3 start at Tue Mar 24 12:33:57 2015
Numeric results were as expected
MFLOPS for 24 loops
130.3 161.0 222.0 220.1 85.3 122.5 220.0 210.2 190.5 138.0 99.1 56.0
67.7 82.6 124.2 132.9 201.7 180.1 160.0 124.2 95.0 42.6 184.2 127.2
Overall Ratings
Maximum Average Geomean Harmean Minimum
222.0 137.1 125.8 113.2 42.4
End of test Tue Mar 24 12:38:47 2015
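The Overall Ratings line uses standard summary statistics. The sketch below shows the usual definitions of the geometric and harmonic means; it is an assumption that the benchmark uses the same formulas, and the logged ratings presumably cover all three data sizes, so they need not match values computed from only the 24 figures listed.

```python
import math


def overall_ratings(mflops):
    """Summary statistics for a list of positive MFLOPS values."""
    n = len(mflops)
    return {
        "maximum": max(mflops),
        "average": sum(mflops) / n,
        # Geometric mean: exponential of the mean of the logarithms.
        "geomean": math.exp(sum(math.log(v) for v in mflops) / n),
        # Harmonic mean: count divided by the sum of reciprocals.
        "harmean": n / sum(1.0 / v for v in mflops),
        "minimum": min(mflops),
    }


if __name__ == "__main__":
    for name, value in overall_ratings([130.3, 161.0, 222.0, 42.6]).items():
        print(f"{name:8s} {value:7.1f}")
```

The harmonic mean is pulled down most by slow loops, which is why it is always the lowest of the three averages.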
OpenGL Stress Test - OpenGL1PiR.bin
This uses the OpenGL ES Benchmark, which has command parameters for window width and height, plus running time in minutes. With the latter specified, the most demanding textured test is run. Following is a typical command.
lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15
The test is run fifteen times, each for 4 seconds per specified minute, giving 15 tests of 60 seconds for the 15 minute example. Actual running times and speed in Frames Per Second (FPS) are displayed and saved in log file OpenGLPi.txt.
NOTE - The original OpenGL benchmark was found to be producing FPS speeds twice as high as they should be. Existing relative performance comparisons of results on Raspberry Pi systems are still valid. The benchmark has been modified. So do not compare old scores with new ones.
Raspberry Pi OpenGL ES Benchmark 1.1, Wed Mar 25 02:41:29 2015
Reliability Mode 15 Tests of 60 Seconds
Test 1 60.24 seconds, 5.61 FPS
Test 2 60.07 seconds, 6.23 FPS
Test 3 60.03 seconds, 6.23 FPS
Test 4 60.15 seconds, 6.18 FPS
Test 5 60.10 seconds, 6.29 FPS
Test 6 60.02 seconds, 6.26 FPS
Test 7 60.02 seconds, 6.30 FPS
Test 8 60.19 seconds, 6.31 FPS
Test 9 60.07 seconds, 6.36 FPS
Test 10 60.22 seconds, 6.34 FPS
Test 11 60.01 seconds, 6.67 FPS
Test 12 60.17 seconds, 6.58 FPS
Test 13 60.00 seconds, 6.57 FPS
Test 14 60.03 seconds, 6.56 FPS
Test 15 60.19 seconds, 6.45 FPS
Screen Pixels 1920 Wide 1080 High
End Time Wed Mar 25 02:56:31 2015
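For comparing runs, the FPS figures can be extracted from a log like the one above. This is a minimal sketch that assumes the "Test n ... seconds, ... FPS" line layout shown; parse_fps and summarise are hypothetical names.

```python
import re

# Matches lines such as "Test 11 60.01 seconds, 6.67 FPS".
TEST_LINE = re.compile(r"^\s*Test\s+(\d+)\s+([\d.]+) seconds,\s+([\d.]+) FPS")


def parse_fps(log_text):
    """Return (test_number, seconds, fps) tuples from OpenGL log text."""
    results = []
    for line in log_text.splitlines():
        match = TEST_LINE.match(line)
        if match:
            results.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
    return results


def summarise(results):
    """Return (minimum, average, maximum) FPS over the parsed tests."""
    fps = [r[2] for r in results]
    return min(fps), sum(fps) / len(fps), max(fps)


# Three lines copied from the log above.
EXAMPLE_LOG = """\
Test 1 60.24 seconds, 5.61 FPS
Test 2 60.07 seconds, 6.23 FPS
Test 11 60.01 seconds, 6.67 FPS
"""

if __name__ == "__main__":
    print(summarise(parse_fps(EXAMPLE_LOG)))
```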
Multiple Livermore Loops Plus OpenGL Scripts
Following are the scripts used for the first set of stress tests, each with the temperature and MHz speed program being run at the same time. The tests were one OpenGL program, one Livermore Loops program, then one OpenGL with three Livermore Loops. The tests were run at the normal CPU speed of 900 MHz and overclocked at 1000 MHz.
Unfortunately, the lxterminal command does not appear to have a screen position option. However, with appropriate size parameters, and an extended testing time, the windows can be moved around with the mouse. The Multiple Windows example below, with MHz/heat, four Livermore Loops and one vmstat performance monitor, is appropriate for displaying six windows on a 1920 x 1080 TV, with some space at the side for a terminal for further commands.
Test 1 - Script 1opengl.sh
lxterminal -e ./RPiHeatMHz passes 60, seconds 15
lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15
Test 2 - Script 1loop.sh
lxterminal -e ./RPiHeatMHz passes 60, seconds 15
lxterminal -e ./liverloopsPiA7R Seconds 12
Test 3 - Script 1ogl3loops.sh
lxterminal -e ./RPiHeatMHz passes 60, seconds 15
lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 1
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 2
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 3
Multiple Windows
lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, seconds 15
lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 1
lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 2
lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 3
lxterminal --geometry=80x15 -e ./liverloopsPiA7R Seconds 12 Log 4
lxterminal --geometry=80x15 -e vmstat 5
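Scripts like those above differ only in the number of copies and the Log numbers, so they can be generated. The following is a sketch of that idea, reusing the command text shown above; loops_commands and build_script are hypothetical helper names.

```python
def loops_commands(copies, seconds=12, geometry="80x15"):
    """One lxterminal line per Livermore Loops copy, each with its own Log number."""
    return [
        f"lxterminal --geometry={geometry} -e ./liverloopsPiA7R "
        f"Seconds {seconds} Log {n}"
        for n in range(1, copies + 1)
    ]


def build_script(copies):
    """Script text: recorder first, then the test copies, then vmstat."""
    lines = ["lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, seconds 15"]
    lines += loops_commands(copies)
    lines.append("lxterminal --geometry=80x15 -e vmstat 5")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints the same six lines as the Multiple Windows example.
    print(build_script(4), end="")
```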
Multiple Livermore Loops Plus OpenGL Results
As confirmed by vmstat, the OpenGL benchmark used very little CPU time and, when run by itself, the CPU did not switch to the full speed at 900 MHz and rarely did so at 1 GHz. On the other hand, the Livermore Loops benchmark ran at full speed continuously.
OpenGL Frames Per Second was slightly faster, at normal CPU speed, when run at the same time as the other programs, probably as the CPU was running at 900 MHz. Other than this, FPS speeds were consistent over all tests. Livermore Loops summary MFLOPS indicated no degradation when running three copies, at the lower MHz, but a little at 1 GHz, both demonstrating efficient multitasking.
The OpenGL benchmark, which is limited by graphics processor speed, generated higher temperatures than one Livermore Loops program. This was confirmed by running the above script for four copies of the latter, at 900 MHz, where maximum temperature was 70°C, compared with 78°C with three CPU tests and OpenGL. Highest temperatures were during the latter, at 1 GHz, where 84.7°C was reached.
Below are later results using a new OpenGL GLUT Benchmark.
900 MHz ---------------------------- 1000 MHz ---------------------------
Test 1 ----- Test 2 - Test 3 ----- Test 1 ----- Test 2 - Test 3 -----
MHz °C FPS MHz °C MHz °C FPS MHz °C FPS MHz °C MHz °C FPS
600 48.2 900 49.2 900 47.6 1000 47.6 600 49.2 600 48.7
600 50.8 900 51.9 900 59.5 600 51.9 1000 54.6 1000 65.4
600 54.1 900 53.0 900 64.3 600 53.0 1000 56.2 1000 71.3
600 54.1 5.7 900 53.0 900 68.1 5.9 600 54.1 5.9 1000 57.3 1000 74.5 5.9
600 55.1 900 53.0 900 68.1 600 54.1 1000 55.7 1000 74.5
600 55.1 900 52.5 900 68.6 600 56.2 1000 57.3 1000 75.6
600 55.7 900 53.0 900 68.1 600 55.1 1000 56.8 1000 75.6
600 55.7 6.3 900 53.0 900 68.6 6.6 600 55.7 6.4 1000 56.2 1000 76.1 6.4
600 56.2 900 53.0 900 70.2 600 57.3 1000 57.3 1000 77.7
600 56.2 900 53.0 900 71.3 600 56.8 1000 57.3 1000 79.4
600 56.8 900 53.0 900 72.4 600 57.3 1000 57.3 1000 79.9
600 56.8 6.3 900 53.5 900 71.3 6.6 1000 59.5 6.5 1000 58.4 1000 79.4 6.5
600 57.8 900 53.0 900 70.8 600 57.3 1000 56.2 1000 78.3
600 58.4 900 53.0 900 70.8 600 57.8 1000 57.3 1000 79.4
600 57.3 900 53.0 900 71.8 600 57.3 1000 56.8 1000 79.9
600 57.8 6.3 900 53.0 900 72.4 6.5 600 57.3 6.2 1000 57.3 1000 80.4 6.2
600 57.8 900 52.5 900 72.9 600 57.8 1000 56.8 1000 81.0
600 58.4 900 53.0 900 72.4 600 57.3 1000 56.8 1000 79.9
600 57.8 900 53.0 900 73.4 600 57.8 1000 56.8 1000 82.0
600 57.8 6.4 900 53.5 900 74.5 6.6 600 57.3 6.4 1000 57.8 1000 82.6 6.4
600 59.5 900 53.5 900 75.6 600 57.8 1000 57.3 1000 83.7
600 57.8 900 53.5 900 75.6 600 57.3 1000 57.8 1000 84.7
600 59.5 900 54.1 900 76.1 600 57.8 1000 57.8 1000 83.7
600 58.4 6.4 900 54.1 900 75.1 6.7 600 57.8 6.3 1000 57.8 1000 83.7 6.3
600 58.4 900 54.1 900 74.5 600 60.5 1000 57.8 1000 83.1
600 58.4 900 53.0 900 75.1 600 57.8 1000 57.3 1000 83.1
600 60.0 900 53.5 900 75.1 600 57.8 1000 57.3 1000 84.2
600 58.9 6.3 900 53.0 900 76.1 6.7 600 58.9 6.4 1000 57.8 1000 83.1 6.4
600 59.5 900 53.5 900 76.7 600 58.4 1000 57.8 1000 84.2
600 58.9 900 53.0 900 75.6 600 58.4 1000 57.3 1000 81.5
600 60.0 900 54.1 900 75.6 600 61.6 1000 58.4 1000 83.7
600 58.4 6.3 900 53.0 900 74.5 6.7 600 59.5 6.5 1000 57.3 1000 83.1 6.5
600 58.4 900 53.0 900 76.1 600 58.9 1000 57.3 1000 84.2
600 58.9 900 53.0 900 75.6 600 58.4 1000 57.3 1000 83.7
600 60.5 900 53.0 900 75.1 600 61.6 1000 57.8 1000 84.2
600 60.0 6.3 900 53.0 900 74.5 6.7 600 60.5 6.5 1000 57.3 1000 83.1 6.5
600 58.4 900 53.0 900 75.6 600 58.4 1000 56.8 1000 83.1
600 59.5 900 53.0 900 76.7 1000 63.8 1000 57.3 1000 84.2
600 59.5 900 53.5 900 76.7 600 62.1 1000 57.3 1000 82.0
600 59.5 6.3 900 53.5 900 78.3 6.7 600 58.4 6.7 1000 57.8 1000 84.2 6.7
600 58.4 900 53.5 900 78.3 600 59.5 1000 57.8 1000 84.2
600 58.4 900 54.1 900 77.7 1000 63.8 1000 58.4 1000 82.6
600 59.5 900 54.1 900 77.7 600 60.5 1000 57.8 1000 83.7
600 59.5 6.5 900 53.5 900 77.2 6.8 600 58.9 6.5 1000 57.8 1000 83.1 6.5
600 59.5 900 53.5 900 77.2 1000 61.6 1000 57.3 1000 83.1
600 59.5 900 53.5 900 77.2 1000 64.8 1000 57.3 1000 84.2
600 59.5 900 54.1 900 77.7 600 60.0 1000 57.3 1000 84.2
600 59.5 6.5 900 53.0 900 78.3 6.9 600 60.0 6.5 1000 57.8 1000 84.2 6.5
600 59.5 900 53.5 900 76.7 600 59.5 1000 57.3 1000 81.5
600 59.5 900 53.5 900 76.7 600 60.0 1000 57.3 1000 83.7
600 59.5 900 53.0 900 76.1 600 59.5 1000 56.8 1000 83.1
600 60.0 6.5 900 53.0 900 77.2 6.9 600 60.0 6.3 1000 56.8 1000 83.1 6.3
600 59.5 900 53.0 900 76.1 600 59.5 1000 56.2 1000 82.0
600 59.5 900 53.0 900 76.7 600 59.5 1000 57.3 1000 81.5
600 58.4 900 53.0 900 76.7 600 59.5 1000 56.2 1000 82.6
600 59.5 6.6 900 53.0 900 75.6 6.9 600 59.5 6.3 1000 56.2 1000 81.0 6.3
600 59.5 900 53.0 900 74.5 600 61.1 1000 56.2 1000 81.5
600 59.5 900 53.5 900 72.4 600 60.0 1000 56.8 1000 81.0
600 59.5 600 50.8 600 69.1 600 59.5 600 53.5 1000 81.5
600 60.0 6.6 600 49.8 600 67.0 6.8 1000 61.6 6.4 600 50.8 600 73.4 6.4
600 59.5 600 49.8 600 65.9 600 60.0 600 50.3 600 71.8
Max 60.5 54.1 78.3 64.8 58.4 84.7
Livermore Loops MFLOPS
          Test 2  Test 3 -----------    Test 2  Test 3 -----------
Maximum   222     224   224   220       247     249   249   247
Average   137     137   137   135       151     149   148   147
Geomean   126     125   126   124       139     136   136   134
Harmean   113     113   113   112       125     122   122   121
Minimum   42      43    43    42        47      47    47    44
2016 - With New OpenGL GLUT Benchmark - videogl32 (Failed)
This benchmark is my main Linux OpenGL program. For further details see:
Raspberry Pi Benchmarks.htm.
The later Operating System would not execute the script with lxterminal commands. However, the details could be copied and pasted as a single list of commands. The extra export command turned off Wait For Vertical Blank (VSYNC) to demonstrate maximum speeds.
Only measurements with the CPU at 1000 MHz are provided, showing much faster OpenGL Frames Per Second than the original benchmark and, maybe, a slightly higher maximum temperature.
Running at the higher temperatures, the OpenGL display disappeared occasionally, displaying the multi-coloured square seen on booting. The display had to be restored by moving the mouse.
The test was rerun on a hotter day, with room temperature around 25°C. The second set of results is included below, where CPU MHz was the same. The test produced many more display failures, with the CPU temperature being higher. Continuously moving the mouse, to avoid failures, probably led to the lower FPS and reduced temperatures (CPU part of chip, not graphics?).
Later in the tests, the coloured square was broken up.
In the Raspberry Pi Forum, it was suggested that the failures were due to an inadequate power supply, indicating an under-voltage warning. The one used was an official 5 volt, 2 amp unit, and an available one, rated at 2.5 amps, produced the same failures. In order to investigate, a DROK Digital Ampere Voltage Multimeter was obtained.
This has USB 2 ports in and out and could be used immediately with the 2.5 amp power supply and a further available one, rated at 1.5 amps. When a suitable micro USB to USB 2 converter is obtained, the 2 amp unit can be tested.
Following the temperature and performance details are power consumption measurements from 15 minute tests, with the CPU overclocked at 1000 MHz, using both 2.5A and 1.5A power supplies, plus with the latter at 900 MHz.
The voltage and current recordings provide no indication of potential power supply issues.
My conclusion is that the cause of the problem appears to be temperature related.
Note - For maximum videogl32 speed, a new driver had to be installed. Attempting to run the original OpenGL1PiR.bin program, via this, produced the same display failure with the coloured square.
Raspberry Pi 3 - The measurements on the RPi 2 were via the 2016-03-18-raspbian-jessie Operating System, released a month after the RPi 3 launch, and needed for OpenGL GLUT support. Running the tests on an RPi 3, where overclocking is not available, produced that rainbow coloured square after a few minutes of testing time. Results of a short run are provided below, showing the high temperatures that led to the failures. Although the normal monitor displays disappeared, it should be noted that all programs continued running.
The newer 2016-05-27-raspbian-jessie was burnt onto a different micro SD card and the tests repeated. As shown below, these tests ran successfully at temperatures similar to those on the RPi 2. The Livermore Loops Benchmark produces results at the end. In this case, RPi 3 soak tests led to a considerable reduction in measured MFLOPS, which were no better than on the RPi 2, where performance remained fairly constant.
Using the older Operating System, display failures were also found to occur running four processor tests, without the OpenGL benchmark. See 2016 Maximum MFLOPS Results that also shows how performance is degraded with time, due to CPU MHz being throttled.
The tests were repeated with the RPi 3 installed in the FLIRC Case, where the whole aluminium enclosure becomes the heatsink. Then, as seen below, the maximum temperature reached was far less than before and Loop MFLOPS measurements indicated no degradation due to throttling. CPU MHz measurements were not recorded properly for the earlier tests (see Temperature & MHz Recorder) but, using the later app, a constant 1200 MHz was recorded for the new measurements.
Test 3
export vblank_mode=0
lxterminal -e ./RPiHeatMHz passes 31, seconds 30
lxterminal -e ./videogl32 Wide 1920, High 1080, Minutes 15
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 1
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 2
lxterminal -e ./liverloopsPiA7R Seconds 12 Log 3
Running OpenGL Benchmark and Three Floating Point Programs
Hotter Old OS New OS FLIRC
Room Case
Rpi 2 Rpi 2 Rpi 3 Rpi 3 Rpi 3
Run 1 Run 2
MHz 1000 1000 1200 1200 1200
FPS 17 17 17 17 18
Minutes °C °C °C °C °C
0 51.9 53.5 49.4 52.1 52.1
0.5 71.3 72.4 66.6 69.3 61.2
1.0 74.5 75.6 73.1 75.8 62.8
1.5 75.6 76.7 76.8 78.4 63.9
2.0 77.2 78.8 80.1 82.2 64.5
2.5 79.4 81.0 81.7 82.7 65.5
3.0 77.7 79.4 81.7 83.8 64.5
3.5 79.9 81.0 82.2 82.7 66.1
4.0 79.9 81.0 82.7 82.7 66.1
4.5 81.0 82.0 80.1 83.8 66.1
5.0 83.1 85.3 80.6 83.3 66.6
5.5 84.2 84.2 Stop 83.8 67.7
6.0 83.1 84.2 84.4 69.3
6.5 83.7 84.2 85.4 68.8
7.0 85.3 84.2 84.4 68.8
7.5 83.1 84.2 84.4 69.8
8.0 83.1 83.1 84.9 68.8
8.5 83.7 83.1 84.9 68.8
9.0 83.1 83.7 84.9 69.3
9.5 82.6 84.7 84.4 70.4
10.0 83.7 83.1 83.8 71.4
10.5 82.6 85.3 84.9 71.4
11.0 85.3 84.2 82.7 70.9
11.5 85.3 83.7 84.9 70.9
12.0 84.2 83.1 83.3 70.9
12.5 84.2 83.1 83.3 70.4
13.0 84.2 84.7 83.8 70.9
13.5 84.2 83.1 84.9 70.9
14.0 83.1 84.2 84.9 68.2
14.5 78.8 84.7 84.9 65.0
15.0 75.6 82.0 82.7 62.3
15.5 63.8 69.1
Maximum 85.3 85.3 82.7 85.4 71.4
MFLOPS MFLOPS MFLOPS MFLOPS MFLOPS
Per Core
Maximum 242.9 240.0 N/A 388.5 387.8
Average 147.1 141.8 140.7 206.7
Geomean 134.0 129.1 121.3 182.3
Harmean 118.9 114.2 102.8 156.9
Minimum 38.9 33.2 33.4 54.7
Stand Alone
Maximum 244.9 244.9 398.4
Average 150.7 150.7 210.7
Geomean 138.2 138.2 186.0
Harmean 46.7 46.7 56.6
########################### Volts and Amps Measurements #########################
                      2.5A PS 1000 MHz      1.5A PS 900 MHz       1.5A PS 1000 MHz
                      Volts  Amps           Volts  Amps           Volts  Amps
Power on              5.14   0.30           5.04   0.29           5.04   0.30
VideoGL32 only        5.14   0.47 to 0.56   5.05   0.44 to 0.48   5.06   0.47 to 0.53
3 x Livermore Loops   5.13   0.50 to 0.58
VideoGL32 and Loops   5.12   0.71 to 0.75   5.05   0.59 to 0.65   5.06   0.66 to 0.75
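The volts and amps readings can be converted to power and compared with the supply ratings. This is only an illustrative calculation, using figures from the measurements above; watts and headroom_amps are hypothetical names.

```python
def watts(volts, amps):
    """Electrical power drawn, in watts."""
    return volts * amps


def headroom_amps(rated_amps, peak_amps):
    """Spare current between the supply rating and the highest reading."""
    return rated_amps - peak_amps


if __name__ == "__main__":
    # Heaviest load measured: VideoGL32 and Loops on the 2.5A supply.
    print(f"{watts(5.12, 0.75):.2f} W")
    # The same peak current set against the smaller 1.5A supply rating.
    print(f"{headroom_amps(1.5, 0.75):.2f} A spare")
```

Even the 1.5 amp supply is asked for only about half its rated current, which is consistent with the readings giving no indication of power supply issues.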
Maximum MFLOPS - burninfpuPiA7, burninfpuPi2
This uses the same program test code as
MP-MFLOPS,
but just for a single CPU. The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f with 2, 8 or 32 operations per input data word.
The same variables are used for each word and final results are checked for consistency, any errors being reported.
The benchmark has input parameters for KWords, Section 1, 2 or 3 (for 2, 8 or 32 operations per word) and log number (0 to 99).
Below is an example log file, followed by sample results using L1 cache, L2 cache and RAM. Normally, 32 operations per word would produce the fastest speed, but it seems that this NEON based compilation runs out of registers, leading to best performance being at 8 operations per word.
Raspberry Pi 3 - Results are included below. Compared with RPi 2 and relative to clock MHz, RPi 3 was slightly faster using L1 cache data, twice as fast from L2 and 2.5 times via RAM. Then see 2016 Maximum MFLOPS, where RPi 3 performance was severely degraded using multiple copies of the program.
A new version, burninfpuPiA7, was produced, following experiments, introducing more calculations in the 2 and 8 operations per word test functions. Results are below, where single core maximum Raspberry Pi 3 speeds are around 3 GFLOPS.
This new version is included in Raspberry_Pi_2_Stress_Tests.zip.
Command:- ./burninfpuPiA7 KWords 4 Section 1 Log 1
Burn-In-FPU Linux/ARM A7 v1.0 Sun Mar 29 12:22:02 2015
Using 16 KBytes, 2 Operations Per Word, For Approximately 1 Minutes
Pass 4 Byte Ops/ Repeat Seconds MFLOPS First All
Words Word Passes Results Same
1 4000 2 888000 15.01 473 0.400158763 Yes
2 4000 2 888000 14.97 475 0.400158763 Yes
3 4000 2 888000 14.97 475 0.400158763 Yes
4 4000 2 888000 14.97 475 0.400158763 Yes
End at Sun Mar 29 12:23:02 2015
############################################################################
Raspberry Pi 2, 900 MHz
16 KB L1 Cache
1 4000 2 888000 15.01 473 0.400158763 Yes
1 4000 8 376000 15.14 795 0.540749788 Yes
1 4000 32 84000 15.58 690 0.353297979 Yes
96 KB L2 Cache
1 24000 2 141192 15.05 450 0.400418609 Yes
1 24000 8 61272 15.14 777 0.580618143 Yes
1 24000 32 13986 15.69 685 0.580735326 Yes
200 MB RAM
1 50000000 2 51 15.25 335 0.998471677 Yes
1 50000000 8 23 15.06 611 0.999586105 Yes
1 50000000 32 7 17.46 641 0.999661446 Yes
############################################################################
Raspberry Pi 3, 1200 MHz
16 KB L1 Cache
1 4000 2 1380000 15.04 734 0.400158763 Yes
1 4000 8 780000 15.02 1661 0.540749788 Yes
1 4000 32 204000 15.28 1709 0.353159577 Yes
96 KB L2 Cache
1 24000 2 225774 15.04 721 0.400158763 Yes
1 24000 8 128538 15.01 1644 0.541149199 Yes
1 24000 32 33300 15.02 1703 0.406550229 Yes
200 MB RAM
1 50000000 2 67 15.00 447 0.997993290 Yes
1 50000000 8 53 15.13 1402 0.999047875 Yes
1 50000000 32 16 15.13 1692 0.999228001 Yes
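The MFLOPS column in the logs above appears to follow from words x operations per word x repeat passes, divided by the measured seconds. The sketch below shows that calculation along with the 8 operations per word form of the arithmetic; it is an illustration in Python, not the benchmark's own code, and the function names are hypothetical.

```python
def kernel_8_ops(x, a, b, c, d, e, f):
    """The 8 operation form: 4 additions, 1 subtraction, 3 multiplications."""
    return (x + a) * b - (x + c) * d + (x + e) * f


def mflops(words, ops_per_word, passes, seconds):
    """Millions of floating point operations per second for one timed phase."""
    return words * ops_per_word * passes / seconds / 1e6


if __name__ == "__main__":
    # Raspberry Pi 2 L1 cache rows from the results above.
    print(round(mflops(4000, 2, 888000, 15.01)))
    print(round(mflops(4000, 8, 376000, 15.14)))
```

These two rows reproduce the logged 473 and 795 MFLOPS. Other rows can differ by one in the last digit, as the seconds shown in the log are rounded.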
Maximum MFLOPS Paging
Selecting an input parameter of KWords 200000, for 800 MB, generated an error message from the program, as it could not allocate that amount of memory. Reducing this allowed the program to run, but painfully slowly, as the data was being swapped out to, and in from, the SD card. In my case, selecting 180000 KW, for 720 MB, provided a demonstration of swapping, whilst checking that no data is corrupted.
Results from the 720 MB test are below. The program calibrates the number of passes for all phases, during the first one. As in this case, other phases can run for a different length of time.
The program starts with 683 MB free, soon falling to 12 MB, plus taking over 25 MB of cache space (696 binary MB x 1.049 = 730 decimal MB). Noting that 25% CPU utilisation (us + sy) implies 100% of one core, after an initial burst of swapping activity (allocating memory space and generating data, outside timed tests), the benchmark executes at full speed. Next there are 30 seconds of swapping out and in, followed by phase 2, 3 and 4 calculations, at some speed, with some swapping in.
Command:- ./burninfpuPiA7 KW 180000 Section 1 Log 1
Burn-In-FPU Linux/ARM A7 v1.0 Sun Mar 29 14:33:48 2015
Using 720000 KBytes, 2 Operations Per Word, For Approximately 1 Minutes
Pass 4 Byte Ops/ Repeat Seconds MFLOPS First All
Words Word Passes Results Same
1 180000000 2 14 15.99 315 0.999579191 Yes
2 180000000 2 14 46.24 109 0.999579191 Yes
3 180000000 2 14 15.40 327 0.999579191 Yes
4 180000000 2 14 15.20 332 0.999579191 Yes
End at Sun Mar 29 14:35:46 2015
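The relationship between the KWords parameter and memory used is shown by the "Using ... KBytes" lines in the logs. A small sketch of the conversion follows, assuming 1 KWord is 1000 words of 4 bytes, which matches 4 KWords giving 16 KBytes; data_megabytes is a hypothetical name.

```python
BYTES_PER_WORD = 4  # per the "4 Byte Words" column in the log


def data_megabytes(kwords):
    """Decimal megabytes of data for a given KWords parameter."""
    return kwords * 1000 * BYTES_PER_WORD / 1e6


if __name__ == "__main__":
    for kwords in (4, 24, 180000, 200000):
        print(f"KWords {kwords}: {data_megabytes(kwords)} MB")
```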
############################################################################
Command for 5 second samples:- vmstat 5
number ------------KB----------- --K/sec-- -KB/sec- -num/sec- -----%-----
procs ----------memory---------- --swaps-- ---io--- --system- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 27864 683936 484 28368 0 0 0 3 1211 135 1 0 99 0
0 0 27864 683804 492 28368 0 0 0 5 1210 150 0 0 99 0
0 0 27864 683812 500 28368 0 0 0 14 1218 159 0 0 99 0
1 0 27840 73568 508 28636 12 0 63 4 1237 217 4 5 91 0
2 3 47176 11956 228 2988 46 3957 2788 3976 1474 301 2 20 8 70
1 0 44164 11776 200 7416 393 0 5644 6 2025 834 25 5 53 17
1 5 39408 11624 232 6240 1288 31 9816 46 3004 1597 28 8 38 26
0 5 37104 11636 220 6712 275 0 15533 14 3330 1604 27 9 3 60
0 8 60764 17852 200 5316 618 5285 10427 5303 4006 1382 7 6 23 64
0 4 39352 11036 188 3472 2915 87 18823 110 3776 2748 0 5 38 56
0 8 41456 10664 188 5344 1080 1684 12798 1684 2991 2018 0 3 59 37
0 5 44024 11860 188 7016 918 1380 9091 1380 2461 1383 2 4 52 42
0 7 62476 18144 188 4448 2076 5368 5546 5370 2190 936 0 2 31 67
1 2 33552 11296 188 3228 3615 0 9820 5 2609 1362 9 5 42 44
1 3 33564 12136 192 2788 10 11 17031 30 2826 1242 25 7 19 49
1 3 33548 12064 196 2796 14 2 17375 6 2945 1310 25 7 17 50
0 4 33544 11748 188 3324 2 0 16335 2 2663 1161 23 6 24 46
1 0 32996 12464 172 3076 29 1 11823 10 2174 895 17 13 47 22
2 0 35800 11876 172 6272 30 574 8144 586 2099 791 25 5 44 26
1 0 36596 12008 176 7236 11 166 1976 179 1425 245 25 1 67 6
1 0 36228 426196 184 7076 225 121 1078 126 1444 286 22 4 72 2
2 0 36032 10900 188 7988 126 0 3898 17 1701 492 22 6 67 5
2 0 36180 12388 188 6044 5 34 2226 44 1486 297 25 1 70 4
2 0 36516 12024 196 6632 0 67 2090 70 1418 241 25 1 70 4
0 0 35800 707804 320 14848 227 23 2976 26 1714 501 15 2 78 5
0 0 34952 704208 328 17128 222 0 680 6 1355 384 1 0 97 1
0 0 34332 702976 336 17692 103 0 225 10 1302 462 1 2 97 0
r = waiting to run, b = sleeping, i = in, o = out, in = interrupts,
cs = context switches, us = user, sy = system, id = idle, wa = wait for I/O
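The point that 25% utilisation means one fully used core can be applied to each vmstat sample. The following sketch assumes the 16 column layout shown above and a four core CPU; parse_vmstat_line and busy_cores are hypothetical names.

```python
FIELDS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_vmstat_line(line):
    """Map one vmstat data line onto the column names shown above."""
    values = [int(v) for v in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of columns")
    return dict(zip(FIELDS, values))


def busy_cores(sample, cores=4):
    """Convert us + sy percent of the whole CPU into equivalent busy cores."""
    return (sample["us"] + sample["sy"]) * cores / 100.0


if __name__ == "__main__":
    # A line from the later part of the run above, with little swapping.
    line = "1 0 36596 12008 176 7236 11 166 1976 179 1425 245 25 1 67 6"
    sample = parse_vmstat_line(line)
    print(busy_cores(sample), "cores busy, swap in", sample["si"], "out", sample["so"])
```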
Multiple Maximum MFLOPS Tests and OpenGL
OpenGL1PiR was run concurrently with four copies of burninfpuPiA7 and the temperature/MHz monitoring program, with data sizes for L1 cache, L2 cache and RAM, at both 900 MHz and 1000 MHz. The scripts and results are shown below.
CPU MHz - This was at maximum frequency during all tests.
CPU performance - With the OpenGL program not using much CPU time, throughput of the four core CPU test was 3.7 to 3.8 times that of one core, via L1 cache, and somewhat less so using the shared L2 cache. With RAM based data, the CPU tests ran for much longer than the 15 minute parameters (calibration complication) but, at least, throughput was greater than using one core.
Temperature - With the CPU temperatures being mainly dependent on the OpenGL test, there were no surprises compared with the Livermore Loops tests. Then, the programs can provide a thorough test of a selected memory area.
OpenGL FPS - As with the CPU tests, performance, using shared RAM, was much slower than cache based speeds.
Below are later results using a new OpenGL GLUT Benchmark.
Script       Kwds (# in the commands below)   Data size
4fpul1.sh    4                                4 x 16 KB
4fpul2.sh    10                               4 x 40 KB
4fpuram.sh   40000                            4 x 160 MB
lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, seconds 15
lxterminal -e ./OpenGL1PiR.bin Wide 1920, High 1080, RunMinutes 15
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 11
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 12
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 13
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds # Sect 2 Mins 15 Log 14
900 MHz
L1 L2 RAM
Minute °C MFLOPS FPS °C MFLOPS FPS °C MFLOPS FPS
0 47.6 45.5 47.6
1 64.3 3037 5.6 63.2 2793 4.9 60.5 668 3.4
2 67.0 3028 6.2 66.4 2760 5.4 61.6 3.3
3 69.7 3018 6.2 69.1 2771 5.4 63.8 665 4.1
4 70.8 3016 6.2 70.8 2771 5.3 64.3 3.3
5 71.8 3046 6.3 71.3 2773 5.2 65.4 650 3.9
6 72.9 3003 6.2 72.9 2724 5.4 65.9 3.9
7 74.0 3074 6.2 72.9 2756 5.3 66.4 666 3.2
8 74.5 2961 6.3 73.4 2750 5.4 66.4 4.1
9 74.0 3018 6.3 74.5 2750 5.5 67.0 693 3.7
10 74.0 3026 6.3 74.5 2727 5.4 68.1 3.4
11 75.1 3015 6.6 74.5 2810 5.8 68.1 741 4.3
12 75.6 3085 6.6 75.1 2798 5.8 68.1 3.7
13 76.1 3056 6.6 75.6 2868 5.8 68.1 704 3.8
14 75.6 3069 6.5 75.6 2825 5.7 66.4 4.3
15 68.1 3039 6.5 68.6 2937 5.8 63.8 963 3.9
Average 3033 6.3 2787 5.5 719 3.8
Maximum 76.1 75.6 68.1
1 Core 795 777 611
1000 MHz
0 50.3 47.6 49.2
1 72.4 3382 6.7 70.8 3244 6.3 65.9 891 4.4
2 75.6 3381 7.2 75.6 3242 6.7 69.7 4.0
3 78.8 3393 7.5 78.8 3241 6.7 70.8 880 4.4
4 79.9 3395 7.4 81.0 3236 6.9 72.4 4.7
5 82.0 3380 7.2 81.0 3228 7.1 74.0 887 4.5
6 83.1 3367 7.2 82.6 3244 6.8 74.5 4.0
7 84.2 3375 7.5 83.7 3246 6.6 75.1 896 4.5
8 83.1 3337 7.4 84.2 3179 6.7 75.6 4.7
9 84.2 3266 7.1 83.7 3111 6.6 76.1 872 4.6
10 83.1 3235 7.0 83.1 3077 6.7 76.7 4.1
11 81.5 3092 7.5 84.2 3025 7.1 77.7 932 4.6
12 83.1 3194 7.7 84.7 2998 7.1 77.2 4.9
13 83.7 3207 7.3 82.0 2997 6.9 77.7 919 4.9
14 83.1 3096 7.2 84.2 3170 6.8 77.7 4.8
15 78.3 3315 7.5 78.3 3352 7.0 76.7 934 4.3
Average 3294 7.3 3173 6.8 901 4.5
Maximum 84.2 84.7 77.7
1 Core 887 866 702
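The multi-core scaling quoted in the text can be checked from the Average and 1 Core rows above. This is a simple sketch of the ratio calculation, with hypothetical function names.

```python
def speedup(multi_core_mflops, single_core_mflops):
    """Throughput of all copies together relative to one core alone."""
    return multi_core_mflops / single_core_mflops


def efficiency(multi_core_mflops, single_core_mflops, cores=4):
    """Fraction of ideal scaling achieved across the given number of cores."""
    return speedup(multi_core_mflops, single_core_mflops) / cores


if __name__ == "__main__":
    # L1 cache averages and single core figures from the tables above.
    print(f"900 MHz:  {speedup(3033, 795):.1f}x")
    print(f"1000 MHz: {speedup(3294, 887):.1f}x")
    print(f"L2 cache, 900 MHz efficiency: {efficiency(2787, 777):.0%}")
```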
2016 - With New OpenGL GLUT Benchmark - videogl32
This benchmark is my main Linux OpenGL program. For further details see:
Raspberry Pi Benchmarks.htm.
The later Operating System would not execute the script with lxterminal commands. However, the details could be copied and pasted as a single list of commands. The extra export command turned off Wait For Vertical Blank (VSYNC) to demonstrate maximum speeds.
Using this benchmark appears to produce lower maximum recorded temperatures than the original OpenGL tests.
It also seems to use more CPU time, reducing the overall throughput of the floating point tests.
The tests were rerun, using three copies of burninfpuPiA7, to make more CPU time available for videogl32. See first column “All 1000 MHz” below, where all programs obtained a fair share of CPU time, at almost full speed. However, this did not lead to the high temperatures obtained using the
Livermore Loops,
that produced a display failure.
For the second results columns, 64 KB was used for each of the burninfpuPiA7 programs, jointly using 192 KB of the 512 KB shared L2 cache. The slow speeds suggest that the additional L2 cache space used by videogl32 makes the total demand too high. The third results columns are for 3 x 48 KB for the MFLOPS tests, using 32 operations per word, with all programs running at high efficiency, and with the high temperature that caused the temporary display failure. Similar burninfpuPiA7 speeds were obtained at 48 KB with 8 operations per word.
Raspberry Pi 3 - These tests were run for 5 minutes, via the newer Operating System, again using three copies of the floating point stress test, but with no display failures. As shown in the results below, maximum temperatures could be reached after two minutes. Instantaneous CPU MHz is sometimes reduced by half, with MFLOPS speeds, averaged over 15 seconds or more, down by more than 40% and not much faster than the overclocked Raspberry Pi 2.
The 16 KB tests were repeated with the RPi 3 system in the FLIRC Case, with a running time of 15 minutes. There was no degradation in performance, up to a maximum of 69.3°C, with each core constantly executing around 1.7 GFLOPS and OpenGL between 17 and 18 FPS.
Next are the 2016 Maximum MFLOPS results, using four processor tests, without the OpenGL benchmark. In this case, the test failed using the old OS (2016-03-18-raspbian-jessie).
######################################################################
export vblank_mode=0
lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 63, seconds 15
lxterminal -e ./videogl32 Wide 1920, High 1080, Minutes 15
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 11
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 12
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 13
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 4 Sect 2 Mins 15 Log 14
900 MHz 1000 MHz
Seconds °C MFLOPS FPS °C MFLOPS FPS °C
Increase
Min 49.8 54.6 4.8
Max 72.4 79.4 7.0
150 65.4 2619 11 74.0 2977 14 8.6
165 65.9 2671 12 75.1 2916 13 9.2
180 66.4 2734 11 75.1 2977 13 8.7
195 66.4 2631 11 75.1 2945 14 8.7
210 67.0 2661 11 75.6 3030 14 8.6
225 67.0 2611 11 76.1 2884 14 9.1
240 67.5 2715 11 76.1 2920 14 8.6
255 68.1 2598 11 76.7 3022 14 8.6
270 68.1 2559 12 76.7 2950 13 8.6
285 68.6 2835 12 77.2 2981 14 8.6
Average 2663 11.3 2960 13.7
1 Program 795 15.9 887 17.2
4 Programs 3180 3548
Efficiency % 83.8 71.1 83.4 79.7
###############################################################################
All 1000 MHz - 3 burninfpuPiA7 programs
16 KB 8 Ops Per Word 64 KB 8 Ops Per Word 48 KB 32 Ops Per Word
Seconds °C MFLOPS FPS °C MFLOPS FPS °C MFLOPS FPS
Min 50.8 55.7 56.2
Max 81.0 83.1 85.3
150 71.8 2620 17 76.7 1920 13 78.8 2202 16
165 72.4 2603 17 77.2 1892 13 79.9 2215 16
180 72.4 2630 17 76.7 1922 13 79.9 2192 16
195 73.4 2616 17 77.7 1967 14 81.0 2233 16
210 73.4 2633 17 77.7 1930 14 81.0 2238 16
225 74.0 2631 17 78.3 1937 13 81.0 2222 16
240 74.5 2608 17 78.8 1898 13 81.5 2218 16
255 74.5 2601 17 78.8 1925 13 81.5 2228 16
270 74.5 2608 17 78.8 1955 13 82.0 2185 16
285 75.1 2628 17 78.8 1941 13 82.0 2237 16
Average 2618 17.0 1929 13.2 2217 16.0
1 Program 882 17.3 871 17.3 771 17.3
3 Programs 2646 2613 2313
Efficiency % 98.9 98.3 73.8 76.3 95.8 92.5
###############################################################################
Raspberry Pi 3, 1200 MHz - 3 burninfpuPiA7 programs
16 KB 8 Ops Per Word 64 KB 8 Ops Per Word 48 KB 32 Ops Per Word
Seconds °C MHz MFLOPS FPS °C MHz MFLOPS FPS °C MHz MFLOPS FPS
0 73.6 1200 69.8 1200 74.1 1200
30 81.1 1130 4921 17 79.5 1200 4821 17 82.7 994 5019 17
60 82.7 922 4351 17 82.7 947 4646 17 84.4 776 3636 17
90 83.8 816 3847 17 83.8 854 3624 17 82.7 600 3189 17
120 84.4 756 3287 17 84.9 766 3337 17 84.4 829 2935 17
150 84.4 735 3152 17 84.4 750 2999 17 83.3 600 2921 17
180 84.4 723 2983 17 84.9 784 2864 17 85.4 708 2834 17
210 84.9 715 2908 17 82.7 897 2838 17 82.7 600 2805 17
240 84.9 753 2895 17 84.4 808 2847 16 85.4 710 2787 16
270 84.9 714 2886 16 83.8 600 2838 16 83.8 600 2741 16
300 82.7 600 2860 16 84.9 707 2819 16 85.4 600 2821 16
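The Efficiency % rows in the Raspberry Pi 2 tables above appear to be the measured multitasking average divided by the total expected from standalone runs (single program MFLOPS times the number of copies, and the standalone FPS for videogl32). The Python sketch below reproduces the 1000 MHz, 16 KB figures under that assumption. The function name is illustrative only and is not part of the benchmark programs.

```python
def efficiency_percent(measured, standalone, copies=1):
    """Measured throughput as a percentage of the ideal standalone total."""
    return 100.0 * measured / (standalone * copies)

# Values from the "All 1000 MHz" table, 16 KB, 8 Ops Per Word columns.
mflops_eff = efficiency_percent(2618, 882, copies=3)  # three burninfpuPiA7 copies
fps_eff = efficiency_percent(17.0, 17.3)              # videogl32 against its standalone FPS

print(f"MFLOPS efficiency {mflops_eff:.1f}%, FPS efficiency {fps_eff:.1f}%")
```

The same calculation with the 64 KB figures (1929 against 3 x 871) gives the much lower efficiency reported for the second results columns.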
Raspberry Pi 3 Multiple Maximum MFLOPS Tests without OpenGL
Following are results from running four copies of the burninfpuPiA7 test program, along with the new
Temperature & MHz Recorder.
The first results are for the initial 13 passes, via the older Operating System (2016-03-18-raspbian-jessie), with the OpenGL GLUT driver enabled, showing how CPU MHz and all stress tests become slower as temperature rises. Note that run time parameters are determined at full speed, leading to increased time at lower speeds.
In this case, the display was replaced by that rainbow coloured square when the temperature reached 80°C but the programs continued running.
Results from other tests, that all ran successfully for more than the 15 minutes heat/speed measurements, are shown below. Two used the older OS, one with no GLUT driver and one with the driver installed, but not enabled. Yet, the latter produced higher temperatures and slower speeds. Finally, details of another run using the newer OS (2016-05-27-raspbian-jessie) are provided, with the driver installed and enabled but with even higher temperatures and slower speeds. Note that, particularly for the latter, higher temperatures and slower speeds, than noted during the failing tests, were recorded.
The third table of Raspberry Pi 3 results provides New OS, Driver Enabled test comparisons using different CPU heatsinks. Old is the original one supplied with the initial kit, Black is the latest from Pi Hut in September 2016, and Copper is the rather swish Enzotech BMR-C1, kindly supplied by Doc Watson in September 2016.
For comparison purposes, the MFLOPS speed is the most reliable measure, representing the average over at least 15 seconds, at the minute intervals. CPU MHz (I believe) is an instantaneous measurement that can go up and down during the test period. All tests suffered from CPU throttling of more than 24%. The black heatsink suffered least, but it did start with the CPU a little cooler.
The fourth table is for the revised program burninfpuPi2, where single core speeds approach 3 GFLOPS, this time for the black and copper heatsinks, plus the latter with the case cover removed. There was not much difference on tests with the two heatsinks, but throttling started earlier than when using the previous stress test version. There was still some throttling with the covers removed.
Again, tests on the RPi 3, in the FLIRC Case, constantly ran at the maximum CPU speed of 1200 MHz, with throughput of around 11.7 GFLOPS from the four cores, and maximum CPU temperature of 69.8°C.
The tests for table 3 were repeated later, where, over the whole testing time, each core ran at around 1730 MFLOPS and 1200 MHz, with maximum CPU temperature of 67.1°C (with a warmer start).
Room temperature was around 22°C for all measurements.
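The MFLOPS column in the first log below seems consistent with words x operations per word x repeat passes, divided by elapsed seconds. This is inferred from the logged numbers, not taken from the program source. The Python sketch shows why a fixed repeat count, calibrated at full speed, leads to longer passes and lower MFLOPS once the CPU is throttled.

```python
def mflops(words, ops_per_word, repeat_passes, seconds):
    """Millions of floating point operations per second for one pass."""
    return words * ops_per_word * repeat_passes / seconds / 1e6

# 4000 words (16 KB), 8 operations per word, 740000 repeat passes per pass.
full_speed = mflops(4000, 8, 740000, 15.08)  # pass 1, CPU at 1200 MHz
throttled = mflops(4000, 8, 740000, 19.15)   # pass 13, CPU at 881 MHz

print(f"pass 1: {full_speed:.0f} MFLOPS, pass 13: {throttled:.0f} MFLOPS")
```

The results agree with the logged 1571 and 1237 MFLOPS to within the rounding of the seconds column.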
######################################################################
4 CPU Tests Old OS, Driver Enabled - Failed with temperature > 80°C.
Burn-In-FPU Linux/ARM A7 v1.0 Sat Jun 25 22:26:57 2016
Using 16 KBytes, 8 Operations Per Word, 15 Seconds Per Pass
Prog2 Prog3 Prog4
Pass 4 B Ops Repeat Secs MFLOPS First All ----- MFLOPS ---- °C MHz
Wds /Wd Passes Results Same
0 60.1 1200
1 4000 8 740000 15.08 1571 0.540749788 Yes 1583 1623 1569 69.8 1200
2 4000 8 740000 14.56 1626 0.540749788 Yes 1648 1635 1621 74.7 1200
3 4000 8 740000 14.55 1627 0.540749788 Yes 1648 1654 1598 76.8 1200
4 4000 8 740000 14.57 1625 0.540749788 Yes 1644 1648 1643 79.5 1200
5 4000 8 740000 14.44 1640 0.540749788 Yes 1635 1629 1613 81.1 1200
6 4000 8 740000 15.52 1525 0.540749788 Yes 1515 1501 1487 81.1 1128
7 4000 8 740000 16.60 1427 0.540749788 Yes 1417 1411 1395 81.7 1052
8 4000 8 740000 17.29 1369 0.540749788 Yes 1357 1348 1329 82.2 1020
9 4000 8 740000 17.93 1320 0.540749788 Yes 1311 1298 1292 82.7 971
10 4000 8 740000 18.19 1302 0.540749788 Yes 1292 1274 1294 82.7 953
11 4000 8 740000 18.41 1287 0.540749788 Yes 1278 1258 1278 83.3 920
12 4000 8 740000 18.69 1267 0.540749788 Yes 1254 1227 1254 83.3 903
13 4000 8 740000 19.15 1237 0.540749788 Yes 1225 1200 1224 82.7 881
######################################################################
4 CPU Tests - All Successful - Above Failed Old OS with Driver Enabled
New OS Old OS Old OS
Driver Enabled No Driver Driver Installed
1 Test 1 Test 1 Test
Minute °C MHz MFLOPS °C MHz MFLOPS °C MHz MFLOPS
0 56.4 1200 42.9 1200 53.7 1200
1 75.8 1200 1661 63.9 1200 1640 73.1 1200 1657
2 81.1 1125 1597 69.8 1200 1655 79.5 1200 1657
3 82.2 1015 1407 73.6 1200 1657 81.1 1088 1517
4 82.7 962 1333 76.3 1200 1653 81.7 1035 1443
5 82.7 938 1272 77.9 1200 1649 82.2 1007 1391
6 82.7 919 1251 80.1 1200 1665 82.2 972 1340
7 82.7 905 1236 80.6 1146 1652 82.2 975 1340
8 83.8 882 1211 81.7 1099 1568 82.2 945 1303
9 83.8 886 1201 81.7 1075 1536 82.2 939 1298
10 83.8 858 1190 81.7 1056 1500 82.7 934 1287
11 83.3 856 1182 81.7 1035 1446 82.7 933 1278
12 83.3 887 1295 81.7 1039 1452 82.7 933 1272
13 82.2 942 1296 81.7 1015 1438 82.7 912 1262
14 82.2 924 1296 82.2 1002 1426 82.7 913 1259
15 82.7 936 1270 82.2 991 1414 82.7 906 1253
16 82.7 927 1149 82.2 1001 1400 83.3 916 1239
Min 856 1149 991 1400 906 1239
Max 83.8 82.2 83.3
######################################################################
Repeat 4 CPU Tests New OS Driver Enabled Different Heatsinks
Old Heatsink Black Heatsink Copper Heatsink FLIRC
Case
1 Core 1 Core 1 Core
Minute °C MHz MFLOPS °C MHz MFLOPS °C MHz MFLOPS °C
0 47.2 1200 44.0 1200 46.2 1200 47.8
1 67.1 1199 1729 65.0 1200 1731 65.0 1200 1726 60.1
2 74.7 1200 1725 71.4 1200 1729 72.5 1199 1727 61.8
3 78.4 1200 1725 76.3 1200 1731 77.4 1200 1732 62.3
4 81.1 1133 1658 79.5 1200 1734 80.6 1157 1725 63.4
5 81.1 1051 1530 81.7 1132 1637 81.7 1044 1532 63.4
6 81.7 1011 1471 81.7 1058 1538 81.7 1005 1459 64.5
7 82.2 1004 1430 80.6 1042 1489 82.2 976 1408 64.5
8 82.2 978 1392 82.2 1011 1467 82.2 966 1374 65.0
9 82.7 942 1360 82.2 976 1429 82.7 935 1344 65.5
10 82.7 939 1341 82.2 985 1407 82.7 939 1331 65.5
11 82.7 930 1327 82.7 963 1381 82.7 922 1320 66.1
12 83.3 904 1309 82.2 939 1353 83.3 892 1302 66.6
13 82.7 892 1290 82.7 936 1339 83.3 895 1292 66.6
14 83.3 877 1269 82.7 915 1315 83.3 878 1267 67.1
15 83.8 872 1253 82.7 910 1321 83.3 873 1267 Fin
min 67.1 872 1253 65.0 910 1315 65.0 873 1267 60.1
max 83.8 1200 1729 82.7 1200 1734 83.3 1200 1732 67.1
Loss
% 27.3 27.5 24.2 24.2 27.3 26.8
######################################################################
Revised Benchmark Max MFLOPS > 2900 Per Core - New OS Driver Enabled
FLIRC Case constant 1200 MHz and 4 x 2925 MFLOPS
Black Heatsink Copper Heatsink Copper No Cover FLIRC
Case
4 Core 4 Core 4 Core
Minute °C MHz MFLOPS °C MHz MFLOPS °C MHz MFLOPS °C
0 49.9 1200 41.9 1200 46.2 1200 41.9
1 73.6 1200 11699 65.0 1200 11706 67.1 1200 11720 56.9
2 81.7 1124 11282 73.6 1200 11709 74.1 1200 11709 59.6
3 82.7 977 9489 79.0 1200 11726 79.0 1200 11682 61.2
4 82.7 917 8954 81.7 1038 10322 80.6 1118 11059 62.3
5 83.8 867 8545 82.2 963 9629 81.7 1048 10296 63.4
6 83.8 846 8252 82.7 932 9165 81.7 1015 10073 65.0
7 83.8 830 8085 83.8 876 8832 81.7 991 9812 65.5
8 83.8 809 7991 83.3 867 8558 81.7 991 9684 66.3
9 83.8 816 7860 83.8 842 8318 82.2 963 9556 67.1
10 83.8 795 7738 83.8 824 8146 82.7 965 9369 67.1
11 84.4 782 7663 83.8 821 8051 82.7 968 9342 68.2
12 84.4 787 7625 83.8 813 7966 82.7 953 9241 69.3
13 83.8 844 8212 83.8 812 7879 82.2 956 9203 69.3
14 83.8 827 8177 84.4 796 7780 82.7 948 9194 69.8
15 84.4 830 8133 84.4 794 7710 82.7 949 9109 Fin
min 73.6 782 7625 65.0 794 7710 67.1 948 9109 56.9
max 84.4 1200 11699 84.4 1200 11726 82.7 1200 11720 69.8
Loss
% 34.8 34.8 33.8 34.2 21.0 22.3
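The Loss % rows in the two heatsink tables above can be reproduced from the min and max rows. A minimal Python sketch, assuming loss is the fall from maximum to minimum expressed as a percentage of the maximum (the function name is illustrative):

```python
def loss_percent(maximum, minimum):
    """Percentage fall from the highest to the lowest recorded value."""
    return 100.0 * (maximum - minimum) / maximum

# (max, min) pairs copied from the min/max rows of the tables.
results = {
    "Old heatsink MHz, 1 core test": loss_percent(1200, 872),
    "Old heatsink MFLOPS, 1 core test": loss_percent(1729, 1253),
    "Black heatsink MHz, revised test": loss_percent(1200, 782),
    "Black heatsink MFLOPS, revised test": loss_percent(11699, 7625),
}

for name, value in results.items():
    print(f"{name}: {value:.1f}%")
```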
Fixed Point Test - stressIntPiA7
This has six tests that alternately write and read data and six tests using write once and read many times, each test using two data patterns out of 24 variations. Some are shown in the results.
The read phase comprises an equal number of additions and subtractions, with the data being unchanged afterwards. This is checked for correctness, at the end of each test, and any errors reported. Run time parameters are provided for KBytes memory used, seconds for each of the twelve tests and log number for use in multitasking. Default parameters are shown below.
The assembly instruction count for the reading test loop, covering 32 words or 128 Bytes, is 18 adds, 16 subtracts, 33 loads, a compare and a branch, a total of 69. This equates to 0.539 instructions per byte read from memory. With a maximum of 1659 MB/second, Millions of Instructions Per Second (MIPS) executed is 894 (with the CPU at 900 MHz).
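The instruction rate arithmetic above can be checked with a few lines of Python. The instruction counts and the 1659 MB/second figure are those quoted in the text. The instructions per clock value is an extra derived figure, included only as a sanity check.

```python
# Instruction mix of the reading loop, which covers 32 words of 4 bytes.
instructions = {"add": 18, "subtract": 16, "load": 33, "compare": 1, "branch": 1}
bytes_per_loop = 32 * 4

total = sum(instructions.values())   # instructions per loop
per_byte = total / bytes_per_loop    # instructions per byte read
mips = 1659 * per_byte               # at the 1659 MB/second maximum read speed
per_clock = mips / 900               # CPU running at 900 MHz

print(f"{total} instructions, {per_byte:.3f} per byte, {mips:.0f} MIPS")
print(f"{per_clock:.2f} instructions per clock cycle")
```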
Command:- ./stressIntPiA7 KB 8 Seconds 1 Log 0
Integer Stress Test Linux/ARM A7 v1.0 Tue Apr 7 11:58:17 2015
8 KBytes Cache or RAM Space, 1 Seconds Per Test, 12 Tests
Write/Read
1 1462 MB/sec Pattern 00000000 Result OK 89262 passes
2 1513 MB/sec Pattern FFFFFFFF Result OK 92357 passes
3 1518 MB/sec Pattern A5A5A5A5 Result OK 92642 passes
4 1514 MB/sec Pattern 55555555 Result OK 92391 passes
5 1508 MB/sec Pattern 33333333 Result OK 92051 passes
6 1497 MB/sec Pattern F0F0F0F0 Result OK 91374 passes
Read
1 1659 MB/sec Pattern 00000000 Result OK 202600 passes
2 1659 MB/sec Pattern FFFFFFFF Result OK 202600 passes
3 1659 MB/sec Pattern A5A5A5A5 Result OK 202600 passes
4 1659 MB/sec Pattern 55555555 Result OK 202600 passes
5 1659 MB/sec Pattern 33333333 Result OK 202600 passes
6 1659 MB/sec Pattern F0F0F0F0 Result OK 202600 passes
End at Tue Apr 7 11:58:29 2015
############################################################################
16 KB L1 Cache
W/Rd 1573 MB/sec Pattern F0F0F0F0 Result OK 47991 passes
Read 1650 MB/sec Pattern F0F0F0F0 Result OK 100800 passes
96 KB L2 Cache
W/Rd 1275 MB/sec Pattern F0F0F0F0 Result OK 6486 passes
Read 1500 MB/sec Pattern F0F0F0F0 Result OK 15300 passes
50 MB RAM
W/Rd 932 MB/sec Pattern F0F0F0F0 Result OK 10 passes
Read 1244 MB/sec Pattern F0F0F0F0 Result OK 26 passes
4 x 160 MB Pattern F0F0F0F0
W/Rd MB/sec 382 + 373 + 372 + 373 = 1500
Read MB/sec 366 + 350 + 335 + 350 = 1401
Fixed Point Tests and OpenGL
Following are results from running four copies of stressIntPiA7 and OpenGL1PiR, whilst monitoring CPU MHz and temperature. The tests were run at both 900 and 1000 MHz, with data using the four L1 caches, the shared L2 cache and RAM.
During all tests, CPU MHz recordings were constant at the maximum speed. Unlike the MFLOPS tests, performance at 1000 MHz was sometimes slower than at 900 MHz, and CPU temperatures were not necessarily higher.
As indicated above, running four RAM tests without OpenGL produced a total throughput of at least 1400 MB/second, with 932/1244 using a single core. This time, reading speeds were much slower than the latter, and the tests continued for a minute longer than the specified duration, due to OpenGL influence.
Maximum temperatures at 900 MHz were much higher than with the MFLOPS tests, L1 cache based calculation speeds were 3.6/3.8 times faster than in single core tests, and OpenGL FPS scores were similar to those in the MFLOPS tests.
Then, at 1000 MHz, FPS measurements were again similar but, inexplicably, fixed point calculation speeds were much slower than at 900 MHz, with little change in maximum temperature.
900 MHz L1 cache 4 x 16 KB L2 cache 4 x 40 KB RAM 4 x 160 MB
4 Tests 4 Tests 4 Tests
Seconds °C MB/sec FPS °C MB/sec FPS °C MB/sec FPS
0 45.5 47.1 47.6
80 66.4 5683 5.4 63.8 2118 4.7 61.1 713 4.3
160 71.3 5649 6.0 67.0 1985 5.2 64.3 675 4.4
240 73.4 5639 6.0 69.1 2016 5.2 65.9 661 4.2
320 74.5 5651 6.0 70.8 1974 5.0 67.0 669 4.6
400 76.1 5649 6.1 71.3 1996 5.0 67.0 680 4.7
480 77.2 5656 6.1 74.5 2019 5.1 67.5 674 4.8
560 77.7 6310 6.2 78.8 5637 6.8 68.1 664 4.9
640 79.9 6298 6.4 81.5 5618 6.8 68.6 668 5.0
720 81.5 6301 6.5 83.1 5617 6.8 69.7 671 5.1
800 82.0 6305 6.7 83.7 5615 6.8 70.8 662 5.2
880 82.0 6296 6.7 84.2 5488 7.0 70.8 670 5.2
960 83.1 6292 6.6 83.1 5408 6.9 71.3 711 5.1
later 1505
1000 MHz L1 cache 4 x 16 KB L2 cache 4 x 40 KB RAM 4 x 160 MB
4 Tests 4 Tests 4 Tests
Seconds °C MB/sec FPS °C MB/sec FPS °C MB/sec FPS
0 46.0 51.9 43.3
80 74.5 3027 7.1 74.0 2948 5.5 60.5 889 4.8
160 78.8 2943 7.5 78.3 2859 5.7 64.8 840 5.0
240 83.1 2871 7.1 79.4 2786 5.7 67.0 855 4.8
320 84.2 2831 7.2 81.5 2744 5.6 70.2 846 4.7
400 84.2 2940 7.4 83.1 2851 5.8 70.8 853 4.7
480 83.7 2983 7.2 82.0 2895 5.7 73.4 849 5.0
560 83.1 5253 7.0 84.7 5161 7.1 73.4 867 5.0
640 83.7 5002 7.3 82.0 4913 7.1 74.5 902 5.5
720 82.0 4905 7.0 84.2 4814 6.8 75.6 849 5.8
800 83.1 4857 7.1 83.1 4767 6.9 76.7 867 5.7
880 84.2 4817 7.3 83.1 4727 7.3 77.7 865 5.7
960 85.3 4774 7.3 85.3 4682 7.0 77.7 853 5.5
later 1912
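The 3.6/3.8 speed ratios quoted for the 900 MHz L1 cache tests can be reproduced as below. This Python sketch assumes that the first six 80 second rows of the table are the write/read phase and the last six the read phase, and it uses one typical row from each. The single core figures are from the 16 KB standalone results shown earlier.

```python
# MB/sec for a single standalone copy using 16 KB (L1 cache).
single_core = {"write/read": 1573, "read": 1650}

# Typical combined MB/sec for four copies at 900 MHz with OpenGL running
# (400 second and 720 second rows of the L1 cache columns).
four_copies = {"write/read": 5649, "read": 6301}

speedup = {phase: four_copies[phase] / single_core[phase] for phase in single_core}

for phase, ratio in speedup.items():
    print(f"{phase}: {ratio:.1f} times the single core speed")
```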
Fixed Point Tests Without OpenGL
The fixed point test stressIntPiA7 was also found to lead to performance being degraded due to Raspberry Pi 3 CPU MHz being throttled. So, tests were run using four copies of this program without running OpenGL benchmarks, each using 40 KB L2 cache space.
The first results below are from running the tests on a Raspberry Pi 2, without the experimental OpenGL GLUT driver being installed. Initial tests showed that CPU temperature did not increase much, and throughput remained at around four times that of a standalone run. A further test, with results below, was run. These are for one stressIntPiA7 log file, the others being virtually identical. A lamp was used to increase temperatures after two minutes. This eventually led to throttling, from the overclocked 1000 MHz to 600 MHz, probably on and off, as MB/second speeds were not reduced in the same proportion.
The second set of results are from using a Raspberry Pi 3 with the OpenGL GLUT driver installed. These indicate CPU throttling, some within the first 80 second pass and later variability, often down from 1200 to 600 MHz, and MB/second approaching half speed. Again, all four programs essentially produced the same degraded performance.
The third table is for three runs on the Raspberry Pi 3, with the copper heatsink attached to the CPU
(see above).
With such variance, it is hardly possible to compare results with those for the first RPi3 above.
The fourth table is one of the virtually identical logs from repeating the tests on the RPi 3 in the FLIRC Case, plus measured CPU temperature and CPU MHz. The latter was constant and there was no degradation in measured MB/second. Unlike the floating point tests, temperature just about reached the point where throttling would start.
##############################################################################
Raspberry Pi 2 Overclocked, Heated with lamp after two minutes, 1 log of 4
Integer Stress Test Linux/ARM A7 v1.0 Tue Jul 12 08:19:32 2016
40 KBytes Cache or RAM Space, 80 Seconds Per Test, 12 Tests
MHz °C
Write/Read 1000 43.3
1 1356 MB/sec Pattern 00000000 Result OK 1323953 passes 1000 63.8
2 1360 MB/sec Pattern FFFFFFFF Result OK 1327729 passes 1000 67.0
3 1360 MB/sec Pattern A5A5A5A5 Result OK 1327936 passes 1000 70.2
4 1361 MB/sec Pattern 55555555 Result OK 1328912 passes 1000 72.4
5 1361 MB/sec Pattern 33333333 Result OK 1328838 passes 1000 75.1
6 1360 MB/sec Pattern F0F0F0F0 Result OK 1328438 passes 1000 77.7
Read
1 1646 MB/sec Pattern 00000000 Result OK 3214400 passes 1000 81.5
2 1634 MB/sec Pattern FFFFFFFF Result OK 3191700 passes 1000 83.1
3 1479 MB/sec Pattern A5A5A5A5 Result OK 2888900 passes 600 82.6
4 1372 MB/sec Pattern 55555555 Result OK 2679800 passes 600 84.7
5 1315 MB/sec Pattern 33333333 Result OK 2569200 passes 600 85.3
6 1257 MB/sec Pattern F0F0F0F0 Result OK 2456400 passes 600 85.3
End at Tue Jul 12 08:35:32 2016
One Program Stand Alone, 1 Second Per Test
Write/Read
1 1377 MB/sec Pattern 00000000 Result OK 1345001 passes
Read
1 1676 MB/sec Pattern 00000000 Result OK 3274700 passes
##############################################################################
Raspberry Pi 3, 1 log of 4
Integer Stress Test Linux/ARM A7 v1.0 Mon Jul 11 19:58:19 2016
40 KBytes Cache or RAM Space, 80 Seconds Per Test, 12 Tests
MHz °C
Write/Read 1200 62.8
1 2472 MB/sec Pattern 00000000 Result OK 2413986 passes 735 84.9
2 1853 MB/sec Pattern FFFFFFFF Result OK 1809189 passes 600 83.3
3 1792 MB/sec Pattern A5A5A5A5 Result OK 1749631 passes 600 83.3
4 1770 MB/sec Pattern 55555555 Result OK 1728301 passes 600 83.3
5 1733 MB/sec Pattern 33333333 Result OK 1692058 passes 891 83.8
6 1745 MB/sec Pattern F0F0F0F0 Result OK 1704346 passes 864 84.4
Read
1 1788 MB/sec Pattern 00000000 Result OK 3491600 passes 714 85.4
2 1723 MB/sec Pattern FFFFFFFF Result OK 3366300 passes 600 84.9
3 1670 MB/sec Pattern A5A5A5A5 Result OK 3261500 passes 600 83.3
4 1661 MB/sec Pattern 55555555 Result OK 3244700 passes 600 83.8
5 1662 MB/sec Pattern 33333333 Result OK 3246100 passes 744 85.4
6 1647 MB/sec Pattern F0F0F0F0 Result OK 3217300 passes 600 83.8
End at Mon Jul 11 20:14:20 2016
One Program Stand Alone, 1 Second Per Test
Write/Read
1 3099 MB/sec Pattern 00000000 Result OK 37825 passes
Read
1 3220 MB/sec Pattern 00000000 Result OK 78700 passes
##############################################################################
Raspberry Pi 3, 3 Runs, Copper Heatsink
----- MB/sec ----- ------- MHz ------ -------- °C -------
Run 1 2 3 1 2 3 1 2 3
Write/Read 1200 1200 1200 68.8 52.6 50.5
1 2889 3145 3171 1000 1200 1200 82.2 73.6 77.9
2 2384 3058 2776 822 1037 916 83.8 81.7 82.7
3 2108 2509 2279 759 910 826 83.8 83.3 82.7
4 1993 2261 2133 742 829 783 84.4 83.8 83.8
5 1924 2137 2038 710 813 762 84.9 83.8 84.9
6 1896 2091 1970 739 805 745 84.4 83.8 83.8
Read
1 2015 2099 2047 729 791 763 84.9 84.9 84.4
2 1934 2016 1983 600 748 720 83.8 84.9 84.9
3 1409 2031 1986 763 767 738 84.4 84.4 84.9
4 1699 1993 1981 751 760 732 85.4 84.9 84.9
5 1543 1857 1966 738 600 731 84.4 82.7 84.9
6 1870 1849 1950 600 600 931 75.8 76.8 81.7
min 1409 1849 1950
max 85.4 84.9 84.9
##############################################################################
Raspberry Pi 3 in FLIRC Case, 1 of 4 logs with virtually the same results
Integer Stress Test Linux/ARM A7 v1.0 Mon Oct 3 22:15:28 2016
40 KBytes Cache or RAM Space, 80 Seconds Per Test, 12 Tests
MHz °C
Write/Read 1200 46.2
1 3084 MB/sec Pattern 00000000 Result OK 3011338 passes 1200 65.5
2 3128 MB/sec Pattern FFFFFFFF Result OK 3054608 passes 1200 68.8
3 3137 MB/sec Pattern A5A5A5A5 Result OK 3063003 passes 1200 70.9
4 3146 MB/sec Pattern 55555555 Result OK 3072197 passes 1200 72.0
5 3151 MB/sec Pattern 33333333 Result OK 3077292 passes 1200 73.1
6 3148 MB/sec Pattern F0F0F0F0 Result OK 3074154 passes 1200 74.1
Read
1 3179 MB/sec Pattern 00000000 Result OK 6209000 passes 1200 74.1
2 3175 MB/sec Pattern FFFFFFFF Result OK 6202000 passes 1200 74.1
3 3177 MB/sec Pattern A5A5A5A5 Result OK 6204800 passes 1200 75.8
4 3178 MB/sec Pattern 55555555 Result OK 6207300 passes 1200 78.4
5 3185 MB/sec Pattern 33333333 Result OK 6220600 passes 1200 79.5
6 3189 MB/sec Pattern F0F0F0F0 Result OK 6229600 passes 1200 80.1
End at Mon Oct 3 22:31:28 2016
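The observation that the lamp heated Raspberry Pi 2 was probably throttled on and off can be illustrated by comparing the fall in sampled MHz with the fall in measured MB/second. A minimal Python sketch, using read tests 1 and 6 from the first log above. The assumption that throughput would scale directly with clock speed under constant throttling is mine, for illustration only.

```python
def drop_percent(before, after):
    """Percentage reduction from one reading to another."""
    return 100.0 * (before - after) / before

mhz_drop = drop_percent(1000, 600)    # sampled CPU MHz, read test 1 to test 6
mb_drop = drop_percent(1646, 1257)    # MB/sec averaged over each 80 second test

# MB/sec expected if the CPU had stayed at 600 MHz for the whole test,
# assuming speed proportional to clock frequency (an assumption).
expected_if_constant = 1646 * 600 / 1000

print(f"MHz down {mhz_drop:.1f}%, MB/sec down {mb_drop:.1f}%")
print(f"Constant 600 MHz would suggest about {expected_if_constant:.0f} MB/sec")
```

The measured 1257 MB/second is well above that figure, which fits with the CPU spending only part of each test at 600 MHz.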
Drive, USB and LAN Test - burnindrive2
This is essentially the same as my program used during hundreds of UK Government and University computer acceptance trials during the 1970s and 1980s, with some significant achievements. Burnindrive writes four files, using 164 blocks of 64 KB, repeated 16 times (164.0 MB), with each block containing a unique data pattern. The files are then read for two minutes, in a sort of random sequence, with data and file ID checked for correct values.
Then each block (unique pattern) is read numerous times, over one second, again with checking for correct values.
Total time is normally about 5 minutes for all tests, with default parameters. For further information, including data patterns and reading sequence example, see
original burnindrive report.
This new version is the same as the older one, except that the configuration details, which are not required, are not produced. Details of input parameters and an example of the results log are below.
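The file sizes follow directly from the default parameters. The Python sketch below shows the arithmetic, taking 1 MB as 1024 KB, which matches the 10.25 MB multiplier and 164.00 MB file size that the program reports. The variable names are illustrative.

```python
BLOCK_KB = 64    # size of each uniquely patterned block
patterns = 164   # default Patterns parameter
repeats = 16     # default Repeats parameter
files = 4        # number of files written

unit_mb = patterns * BLOCK_KB / 1024   # one pass through all patterns
file_mb = unit_mb * repeats            # size of each file
total_mb = file_mb * files             # total data written

print(f"{unit_mb} MB per repeat, {file_mb} MB per file, {total_mb} MB written")
```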
Run Time Parameters - Upper or Lower Case
Default
R or Repeats Data size, multiplier of 10.25 MB, more or less 16
P or Patterns Number of patterns for smaller files < 164 164
M or Minutes Large file reading time 2
L or Log Log file name extension 0 to 99 0
S or Seconds Time to read each block, last section 1
F or FilePath For other than SD card or SD card directory - see examples
C or CacheData Omit O_DIRECT on opening files to allow caching No
O or OutputPatterns Log patterns and file sequences used as above No
D or DontRunReadTests Or only run write tests No
Format ./burnindrive2 Repeats 16, Minutes 2, Log 0, Seconds 1
or ./burnindrive2 R 16, M 2, L 0, S 1
Example Log System SD Card - IOlog0.txt
Command ./burnindrive2
###############################################################
Current Directory Path:
/home/pi/benchmarks/reliability/burnindrive/new
Total MB 6266, Free MB 3428, Used MB 2838
Linux Storage Stress Test for ARM v2.0, Sat Apr 11 12:55:38 2015
File size 164.00 MB x 4 files, minimum reading time 2.0 minutes
File 1 164.00 MB written in 17.57 seconds
File 2 164.00 MB written in 18.22 seconds
File 3 164.00 MB written in 18.06 seconds
File 4 164.00 MB written in 17.98 seconds
Total 71.83 seconds, Elapsed 71.83 seconds
Start Reading Sat Apr 11 12:56:50 2015
Read passes 1 x 4 Files x 164.00 MB in 0.68 minutes
Read passes 2 x 4 Files x 164.00 MB in 1.36 minutes
Read passes 3 x 4 Files x 164.00 MB in 2.04 minutes
Start Repeat Read Sat Apr 11 12:58:52 2015
Passes in 1 second(s) for each of 164 blocks of 64KB:
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280 280
260 280 280 280 280 280 280 280 280 280 280
280 280 280 280 280 280 280 280 280 280
45900 read passes of 64KB blocks in 2.86 minutes
No errors found during reading tests
End of test Sat Apr 11 13:01:44 2015
Drive Test, Multiple Devices
Below is the information needed to set up system tests using drives and the LAN. For the latter, external drives have to be mounted, as shown below [1]. USB drives are mounted when plugged in. File paths to use can then be identified via a df command [2], for use in run commands [3]. In order to organise running time, using multiple devices, it is useful to run each test separately, where slow writing times might need to be considered [4]. Default reading time is shown as between 2 and 3 minutes, with repeated reads as not much longer than 164 x 1 second. However, running times using multiple drives can be unpredictable.
Monitoring - The RPiHeatMHz program [5] shows that drive tests have little impact on CPU temperature, with the CPU remaining at 600 MHz when testing USB drives and CPU utilisation, as monitored by vmstat [6], being negligible. The CPU is switched to 900 MHz when running the LAN test, with CPU utilisation up to 60% of one CPU core. As shown, vmstat provides no details of network traffic volumes, but these can be obtained via the sar function. The latter needs installation of the sysstat package [7].
[1] Identify IP Address and Mount Drive
Windows Command Prompt ipconfig command = 192.168.0.2
Windows share drive (partition) d
sudo mount -t cifs -o dir_mode=0777,file_mode=0777 //192.168.0.2/d /media/public
Password: raspberry - in this case unchanged default password
Linux Terminal command ifconfig eth0 (or eth1) = 192.168.0.3
Linux share directory all
sudo mount -t cifs -o user=UU,password=PP //192.168.0.3/all /media/public
UU and PP are IDs for Linux system, -o dir_mode=0777,file_mode=0777 not needed
NOTE: If wrong IDs are used, a locked file will be generated and this leads to a
failure to open a new file when correct IDs are used. The file must be deleted.
################################################################################
[2] Identify Drive Path on RPi
Command df
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 6416312 2905604 3161732 48% /
/dev/root 6416312 2905604 3161732 48% /
devtmpfs 376896 0 376896 0% /dev
tmpfs 76240 1432 74808 2% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 152460 0 152460 0% /run/shm
/dev/mmcblk0p5 60479 14527 45952 25% /boot
/dev/sdd1 7811072 23156 7787916 1% /media/8GB
/dev/mmcblk0p3 27633 435 24905 2% /media/SETTINGS
/dev/sda1 7873384 822208 7051176 11% /media/SIGMA
//192.168.0.2/d 235519996 160311812 75208184 69% /media/public
SIGMA and 8GB are USB Flash Drives
################################################################################
[3] Commands to Run Benchmark With Logs On RPi
Main SD Card - ./burnindrive2 Log 71
USB Drive - ./burnindrive2 FilePath /media/SIGMA Log 72
USB Drive - ./burnindrive2 FilePath /media/8GB Log 73
Windows Drive - ./burnindrive2 FilePath /media/public/Temp Log 74
################################################################################
[4] Running Times, Data Volumes and Speed
              Local SD            SIGMA               8GB                LAN
           secs    MB  MB/s   secs    MB  MB/s   secs    MB  MB/s   secs    MB  MB/s
Write      70.4   656   9.3  183.3   656   3.6  253.9   656   2.6   69.0   656   9.5
Read      124.2  1968  15.8  183.6  1312   7.1  147.6  1968  13.3  140.4  1312   9.3
Repeats   171.6  2858  16.7  166.2  2259  13.6  171.6  2388  13.9  172.8  1640   9.5
################################################################################
[5] Maximum Temperatures and CPU MHz
Local             SIGMA             8GB               LAN
600 MHz 51.4°C    600 MHz 49.8°C    600 MHz 49.8°C    900 MHz 51.9°C
################################################################################
[6] vmstat and sar -n Performance Monitor Details
----KB/sec--- -Number/sec- -----CPU Utilisation %----
bi bo in cs us sy id wa
Main SD
Write 1 8953 2181 1527 1 2 76 21
Read 16720 6 2768 1653 1 2 74 23
Repeats 16599 7 2570 1491 0 2 75 23
USB
Write 0 3714 2352 1030 0 3 74 23
Read 7311 5 3468 1297 0 3 73 24
Repeats 14000 7 5767 2346 0 5 72 23
LAN
Write 0 12 10692 974 1 14 85 0
Read 0 11 3242 2304 1 9 90 0
Repeats 0 12 3346 2356 0 9 90 0
Command for 10 second samples of network statistics:- sar -n DEV 10
---KB/sec--- Packets/sec --B/packet--
rx tx rx tx rx tx
Write 162 10280 3031 7003 53 1468
Read 9943 123 6886 1362 1444 90
Repeats 10060 126 6973 1390 1443 91
To install sar [7]:- sudo apt-get install sysstat
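The MB/s columns in table [4] are simply data volume divided by elapsed time. A short Python sketch, using the Local SD figures, which may help when planning running times for other devices:

```python
# (seconds, MB) for each phase of the Local SD test in table [4].
local_sd = {
    "Write": (70.4, 656),
    "Read": (124.2, 1968),
    "Repeats": (171.6, 2858),
}

speeds = {phase: mb / secs for phase, (secs, mb) in local_sd.items()}

for phase, mb_per_sec in speeds.items():
    print(f"{phase}: {mb_per_sec:.1f} MB/s")
```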
System Stress Test
A sixteen minute system stress test was run, comprising burninfpuPiA7, stressIntPiA7 and burnindrive2 running on the main SD card, SIGMA USB drive and 8GB USB drive. Results for stand alone tests on the latter three are above. The script file [7], below, included RPiHeatMHz, the CPU MHz/temperature measuring program, and the vmstat performance monitor.
With the CPU tests running, unlike the stand alone tests, CPU MHz was at 900 MHz, throughout. Resultant drive speeds [8] were virtually the same as when run by themselves. Testing times, set by the parameters, were not quite right, with one drive test longer than planned.
Bold times in seconds [8] indicate changes in the volumes of writing and reading. Monitored results reported [9] are at these points. The drive writing and reading speeds, surprisingly, reflected the sums derived from running times, particularly the 44 MB/second on reading. The two CPU tests produced 50% CPU utilisation, or 100% of two cores.
Maximum CPU temperature [9] was not that high at 62.7°C. This can be compared with 54.1°C from the single program
Livermore Loops Stress Test 2
or Test 3 at 78.3°C, with three copies and OpenGL1PiR running.
See screen shot of tests running in different windows via
Raspberry Pi 2 Stress Test Screen Shot.
[7] Script File runall.sh
lxterminal --geometry=80x15 -e ./burnindrive2 Secs 4 Mins 4 Log 21
lxterminal --geometry=80x15 -e ./burnindrive2 Secs 4 Mins 1 FPath /media/SIGMA Log 22
lxterminal --geometry=80x15 -e ./burnindrive2 Secs 4 FPath /media/8GB Log 23
lxterminal --geometry=80x15 -e ./RPiHeatMHz passes 60, secs 16
lxterminal --geometry=80x15 -e ./burninfpuPiA7 Kwds 10 Sect 2 Mins 16 Log 20
lxterminal --geometry=80x15 -e ./stressIntPiA7 KB 16 Secs 80 Log 11
lxterminal --geometry=80x15 -e vmstat 10 96 > vmstat.txt
################################################################################
[8] Running Times, Data Volumes and Speed
              Local SD            SIGMA               8GB          fpuPiA7  IntPiA7
           secs    MB  MB/s   secs    MB  MB/s   secs    MB  MB/s     secs     secs
Write      70.9   656   9.3  177.5   656   3.7  254.4   656   2.6
Read      243.0  3936  16.2   94.2   656   7.0  145.2  1968  13.6
Repeats   661.8 11019  16.6  664.2  8680  13.1  663.6  9159  13.8
Total     975.7              935.9             1063.2                958.0    960.0
################################################################################
[9] vmstat Performance Monitor Details and CPU Temperature
After --KB/sec-- Number/sec --CPU Utilisation %--
Secs bi bo in cs us sy id wa °C
0 49.2
80 1 15951 4161 2743 52 7 3 38 56.8
180 16685 6495 4597 2627 50 7 5 39 58.9
270 24115 2748 5650 3070 50 6 5 39 58.9
960 44296 13 9923 5324 50 9 6 35 62.7
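The agreement between the vmstat figures and the drive logs can be checked by adding up the repeat read speeds from table [8] and comparing the sum with the final vmstat bi reading. In this Python sketch, the conversion of vmstat KB to MB by dividing by 1024 is an assumption, and the four core count is that of the Raspberry Pi 2.

```python
# Repeat read speeds in MB/s from table [8].
repeat_read = {"Local SD": 16.6, "SIGMA": 13.1, "8GB": 13.8}
combined = sum(repeat_read.values())

vmstat_bi_kb = 44296                 # KB/sec read, from the 960 second row
vmstat_mb = vmstat_bi_kb / 1024      # assumed 1024 KB per MB

cores = 4
cores_busy = cores * 50 / 100        # 50% user CPU utilisation over four cores

print(f"Drive logs total {combined:.1f} MB/s, vmstat {vmstat_mb:.1f} MB/s")
print(f"CPU tests occupy about {cores_busy:.0f} cores")
```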
Roy Longbottom October 2016
The Official Internet Home for my Benchmarks is via the link
Roy Longbottom's PC Benchmark Collection