
Roy Longbottom's Extended Android MP Benchmarks

For latest results see Android Benchmarks For 32 Bit and 64 Bit CPUs from ARM, Intel and MIPS.


Contents


General MHz Measurement MP-MFLOPS2
MP-RndMem2 MP-BusSpd2 NEON-MFLOPS2-MP

Systems Used

Download Benchmark Apps


CP_MHz2.apk - Measure CPU MHz
MP-MFLOPS2.apk - MFLOPS Caches and RAM
MP-RndMem2.apk - Serial/Random Memory Speed
MP-BusSpd2.apk - Bus and Memory Speed
NEON-MFLOPS2-MP.apk - NEON MFLOPS Caches/RAM






All have an option to save results via Email.
For maximum and consistent performance, some units might need a CPU Mode setting changed (for example ICS Settings, Developer Options, CPU Mode, change Normal to Performance).




General

On running multithreading benchmarks, it was found that processors could run at lower than expected CPU MHz speeds, and that this was influenced by the running time of individual test procedures. It was also affected by Power Saving settings, for example running at 1000 MHz, the speed of the “On” setting, instead of 1700 MHz, the speed of the selected “Off” option.

Revised versions of some of the multithreading benchmarks were produced, with extended running times, along with a MHz timing program with a higher resolution. Results are provided below from the original and revised benchmarks, with Power Saving Off, and from the original with Power Saving On. In the case of the tests carried out here, it seems that programs need to run for at least 100 milliseconds before switching to a high MHz. To me, this is the wrong way round and I think that, to provide the fastest response times, it would be better to start at a higher MHz and reduce it if the CPU starts overheating, or in order to minimise power drain. A further examination of unthreaded benchmarks indicated that those comprising tests with short running times could run at a lower MHz initially, but switch to maximum speed for the remaining time. With the multithreading benchmarks, the trend is to return to the slow speed mode on a regular basis.

For further details of the original benchmarks, and results on other systems, see www.roylongbottom.org.uk/Android MultiThreading Benchmarks.htm, www.roylongbottom.org.uk/Android NEON Benchmarks.htm and, for CPU MHz measurement, www.roylongbottom.org.uk/Android Benchmarks.htm.

Modifications to standard benchmarks are provided in Android MP Benchmarks.zip, Android NEON Benchmarks.zip and Android Benchmarks.zip.


Measure CPU MHz Version 2 - CP_MHz2.apk

The original CPU_MHz program runs for 30 seconds, reporting every second. This was found to be of little use on some devices, when program running time was less than three seconds. Then, with later systems, CPU MHz was far from constant. The revised version samples every 100 milliseconds and produces 300 reports. In practice, overall running time is greater than 30 seconds, due to overheads and to occasions when Android further delays execution of the timing program.
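The sampling scheme described above can be sketched as follows. This is only an illustration, not the CP_MHz2 source: `sample_mhz`, `read_mhz` and `read_sysfs_mhz` are hypothetical names, and how the original program derives MHz is not described here. The sysfs reader assumes a Linux device that exposes the cpufreq file `scaling_cur_freq` (reported in kHz), which may not be the case on every unit.

```python
import time


def read_sysfs_mhz(cpu=0):
    """Assumed reader: current frequency of one core from Linux cpufreq sysfs."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq"
    with open(path) as f:
        return int(f.read().strip()) // 1000  # file holds kHz


def sample_mhz(read_mhz, samples=300, interval=0.1):
    """Collect (elapsed seconds, MHz) pairs, one per nominal interval."""
    start = time.monotonic()
    report = []
    for _ in range(samples):
        time.sleep(interval)  # sleeps at least `interval`, often longer
        report.append((round(time.monotonic() - start, 2), read_mhz()))
    return report
```

Because each sleep lasts at least the requested interval and the reader itself takes time, 300 samples at 100 milliseconds would be expected to take somewhat more than 30 seconds, in line with the overheads mentioned above.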

The results, given below, demonstrate that, with multithreading, there can be rapid changes in measured CPU MHz and repeated measurements indicate that changes appear to be random and unpredictable. However, it is important to show the fluctuations, where measured performance is the same with Power Saving Off and On.



MP MFLOPS Version 2 - MP-MFLOPS2.apk

MP MFLOPS measures floating point speed on data from caches and RAM. The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f, with 2 and 32 operations per input data word. Three data sizes are used, to exercise L1 cache, L2 cache and RAM, at 12.8, 128 and 12800 KB (3200, 32000 and 3200000 single precision floating point words). The program is run using 1, 2, 4 and 8 threads. The original version executes 32 million single precision floating point calculations per test, at 2 per word, or 512 million, at 32 per word. Each thread executes a proportional share on different segments of the data. The extended version executes the instructions 640 or 10240 times.
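A minimal sketch of this structure is given below. It is not the benchmark source. The two-operation form `(x[i] + a) * b` is an assumption based on the expression above, the names `passes_needed`, `add_multiply` and `run` are hypothetical, and Python floats are double precision rather than the single precision used by the benchmark.

```python
from threading import Thread


def passes_needed(total_ops, words, ops_per_word):
    """Passes over the data needed to reach the stated operation count."""
    return total_ops // (words * ops_per_word)


def add_multiply(x, lo, hi, a, b):
    """Assumed 2 operations per word: one add and one multiply."""
    for i in range(lo, hi):
        x[i] = (x[i] + a) * b


def run(x, threads, passes, a=0.5, b=0.5):
    """Each thread works on its own segment (len(x) assumed divisible)."""
    seg = len(x) // threads
    for _ in range(passes):
        workers = [
            Thread(target=add_multiply, args=(x, t * seg, (t + 1) * seg, a, b))
            for t in range(threads)
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
    return x
```

With the figures quoted above, `passes_needed(32_000_000, 3200, 2)` and `passes_needed(512_000_000, 3200, 32)` both give 5000 passes over the 12.8 KB data.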

Except for RAM speeds, the new extended benchmark results are generally around 1.7 times faster than those with the CPU at 1000 MHz. At 2 operations per word, cache dependent results of the normal benchmark are similar to those at 1000 MHz, and the MHz measuring program results shown confirm that part of the run was at a slower clock speed.

Calculations indicate that minimum single thread testing time was 50 milliseconds.



 ##############################################################
 T11 Samsung EXYNOS 5250 Dual 2.0 GHz Cortex-A15, Android 4.2.2

  Normal 1000 MHz

    FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T      616     469     423     917     889     864
 2T     1169     943     651    1830    1787    1751
 4T     1178    1203     513    1828    1833    1757
 8T     1139    1122     645    1800    1836    1756

          Total Elapsed Time    5.3 seconds

  Normal Full Speed

  Android MP-MFLOPS v7 Benchmark V1.1 09-Aug-2013 17.14

    FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T      619     475     429    1106    1481    1438
 2T     1195     935     654    2420    2992    2986
 4T     2010    1924     635    3054    3134    2573
 8T     1128    1197     640    2259    2796    2981

         Total Elapsed Time    3.7 seconds

  Extended

  Android MP-MFLOPS2 Benchmark V2.1 23-Sep-2013 13.03

    FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T      845     806     554    1504    1496    1434
 2T     1991    1580     636    2978    2972    2968
 4T     1985    1751     640    3126    3112    2988
 8T     1946    2101     651    3112    3153    2983

          Total Elapsed Time   59.1 seconds

 CPU MHz Example Normal Full Speed
 
 5.86   1000   5.98   1000   6.09   1200   6.21   1400   6.33   1600
 6.44   1400   6.56   1600   6.67   1600   6.79   1700   6.91   1200 
 7.08   1600   7.31   1700   7.59   1500   7.74   1700   7.98   1700 
 8.28   1700   8.40   1700   8.69   1700   8.85   1700   9.06   1000  
 9.18   1000   9.31   1000   9.44   1000

   




MP RandMem Version 2 - MP-RndMem2.apk

The RandMem benchmark carries out four tests at increasing data sizes to produce data transfer speeds in MBytes Per Second from caches and memory. Serial and random address selections are employed, using the same program structure, with read and read/write tests, using a 32 bit integer array for indexing purposes. The main purpose is to demonstrate how much slower performance can be when using random access. Here, speed can be considerably influenced by reading and writing in bursts, where much of the data is not used, and by the size of preceding caches. This benchmark uses data from the same array for all threads, but starting at different points.
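The idea of using one program structure for both serial and random access can be sketched as below. This is an illustration only: the function names are hypothetical, summation stands in for whatever operation the benchmark applies to each word, and the even spacing of thread starting points is an assumption.

```python
import random


def make_index(n, randomise, seed=1):
    """Index array: 0..n-1 in order (serial) or shuffled (random)."""
    index = list(range(n))
    if randomise:
        random.Random(seed).shuffle(index)
    return index


def read_pass(data, index, start=0):
    """Read every word once via the index array, beginning at `start`."""
    n = len(index)
    total = 0
    for k in range(n):
        total += data[index[(start + k) % n]]
    return total


def thread_starts(n, threads):
    """Assumed starting points so threads begin at different places."""
    return [t * n // threads for t in range(threads)]
```

The same `read_pass` loop serves both cases, so any speed difference comes from the order of addresses in the index array rather than from the code executed.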

In this case, all measured speeds, at what could be expected to be the highest CPU frequency, were virtually the same as those at 1000 MHz. Extended test speeds were mostly around 1.7 times faster, using caches with serial and random reading. Some writing tests, particularly using random access, produced much lower gains, but these can have issues due to flushing caches and data bursts. Exceptionally slow normal full speed performance is confirmed by the MHz measurements.

The original benchmark is calibrated to run each test for a minimum of 100 milliseconds, with the extended one at 0.5 seconds. Calibration, or particularly slow RAM, can more than double overall elapsed time (for the extended test, 48 x 0.5 = 24 seconds minimum, against 43.9 seconds elapsed).
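How the calibration is done is not described here. One common approach, shown purely as an assumption, is to double the number of passes until a run lasts at least the minimum time. The names `calibrate` and `timed_run` are hypothetical, and the count of 48 tests is taken to be 3 data sizes x 4 thread counts x 4 test types.

```python
TESTS = 3 * 4 * 4  # data sizes x thread counts x test types


def calibrate(timed_run, min_seconds, max_doublings=40):
    """Double the pass count until timed_run(passes) reaches min_seconds.

    Returns (passes, time of final run, total time including calibration).
    """
    passes = 1
    elapsed = timed_run(passes)
    spent = elapsed
    doublings = 0
    while elapsed < min_seconds and doublings < max_doublings:
        passes *= 2
        doublings += 1
        elapsed = timed_run(passes)
        spent += elapsed
    return passes, elapsed, spent


minimum_total = TESTS * 0.5  # 24 seconds for the extended version
```

In this doubling model the earlier, shorter runs together take nearly as long as the final one, which would be one plausible reason for elapsed time approaching twice the 24 second minimum.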


 T11 Samsung EXYNOS 5250 Dual 2.0 GHz Cortex-A15, Android 4.2.2

  

Normal 1000 MHz

    MB/Second Using 1, 2, 4 and 8 Threads

      KB      SerRD SerRDWR   RndRD RndRDWR

   12.29 1T    3733    2600    3882    2542
         2T    7321    2473    7287    2501
         4T    7071    2442    7176    2486
         8T    5725    2400    6380    2394
   122.9 1T    1934    1677     945    1012
         2T    3530    1667     941     997
         4T    3585    1655    1166     974
         8T    3608    1617    1171     974
   12288 1T    1129     894     122     122
         2T    1866     943     219     121
         4T    1855     906     217     121
         8T    1959     956     218     120

          Total Elapsed Time    4.3 seconds

Normal Full Speed

  Android MP-RndMem v7 Benchmark V1.1 09-Aug-2013 23.19

    MB/Second Using 1, 2, 4 and 8 Threads

      KB      SerRD SerRDWR   RndRD RndRDWR

   12.29 1T    3906    2582    3742    2599
         2T    7107    2469    6813    2520
         4T    7231    2482    6825    2475
         8T    6909    2366    6705    2361
   122.9 1T    1979    1651     969    1003
         2T    3555    1679    1107     932
         4T    3549    1654    1146     983
         8T    3577    1617    1150     969
   12288 1T    1184     940     119     121
         2T    1644     966     218     119
         4T    1842     957     208     120
         8T    1675     951     217     121

          Total Elapsed Time    4.4 seconds

Extended

  Android MP-RndMem2 Benchmark V2.1 24-Sep-2013 00.10

    MB/Second Using 1, 2, 4 and 8 Threads

      KB      SerRD SerRDWR   RndRD RndRDWR

   12.29 1T    6635    4435    6631    3355
         2T   12280    2999   12218    3038
         4T   12247    2892   12098    2876
         8T   12055    2702   11018    2747
   122.9 1T    3330    2871    1626    1732
         2T    6292    1934    2027    1129
         4T    6100    2050    2021    1309
         8T    6120    1927    2015    1202
   12288 1T    1327     975     141     121
         2T    2166     932     259     115
         4T    2176     912     267     119
         8T    2182     986     277     124

          Total Elapsed Time   43.9 seconds

CPU MHz Example Normal Full Speed

  6.74   1000    6.90   1000    7.04   1000    7.16   1000    7.30   1000
  7.44   1000    7.57   1000    7.70   1000    7.83   1000    7.98   1200
  8.22   1200    8.36   1200    8.50   1000    8.63   1400    8.97   1000
  9.13   1200    9.27   1200    9.42   1000    9.57   1200    9.69   1700
 10.10   1000   10.24   1200   10.38   1400   10.53   1600   10.66   1600
 10.80   1700   10.93   1000




MP BusSpeed Version 2 - MP-BusSpd2.apk

This benchmark reads data using AND instructions at a range of data sizes covering caches and RAM. The program starts by reading words with 32 word address increments, then reduces the increment to eventually read all words sequentially. Speed reductions of around 50% at each higher increment suggest reading in bursts over the bus. This is normal for reading from RAM and is sometimes found reading cached data. In this case, only 12.3 KB, 123 KB and 12.3 MB memory sizes are used, via 1, 2, 4 and 8 threads. This time, each thread reads all the data.
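The access pattern can be sketched as below, as an illustration rather than the benchmark source. The names `and_read` and `words_read` are hypothetical, and the increments are assumed to halve from 32 down to 1, matching the Inc32 to RdAll columns in the results.

```python
INCREMENTS = (32, 16, 8, 4, 2, 1)  # Inc32 ... Inc2, RdAll


def and_read(data, inc):
    """AND together every inc-th 32 bit word of the data."""
    acc = 0xFFFFFFFF
    for i in range(0, len(data), inc):
        acc &= data[i]
    return acc


def words_read(n_words, inc):
    """Number of words actually used at a given increment."""
    return len(range(0, n_words, inc))
```

At an increment of 32 only one word in 32 is used, so if memory is transferred in bursts, most of each burst is discarded, and the measured speed falls accordingly.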

MHz measurements show that this benchmark runs most of the time at 1700 MHz, as might be expected, and this is reflected in performance gains of 1.7 times for cache based data, compared with the run at 1000 MHz. The extended version is slightly faster on some single thread tests.

Running time is calibrated, as in MP-RandMem, but at a minimum of 0.1 seconds per test for the original benchmark and 0.4 seconds for the extended version. Note: the benchmark was recompiled, calibrated to 0.01 seconds (as MP-RndMem), and the full speed version then produced the same performance as that at 1000 MHz. Running a revised MP-RndMem program, calibrated for 0.1 seconds, produced little improvement over the 0.01 second version, and few instances of 1700 MHz were recorded.



 T11 Samsung EXYNOS 5250 Dual 2.0 GHz Cortex-A15, Android 4.2.2

  

Normal 1000 MHz

  Android MP-BusSpd v7 Benchmark V1.1 18-Sep-2013 12.24

    MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB      Inc32   Inc16    Inc8    Inc4    Inc2   RdAll

    12.3 1T    3238    3528    3740    3831    3807    3851
         2T    5515    5930    6866    7126    7403    6979
         4T    4197    5291    6901    7132    7383    7349
         8T    4209    5207    6898    6662    7258    7538
   122.9 1T     411     450     699    1692    1964    1970
         2T     354     542    1171    2645    3525    3897
         4T     353     586    1160    2750    3777    3874
         8T     354     595    1110    2733    3742    3668
   12288 1T     124     131     184     483     870    1118
         2T     229     255     336     771    1351    2120
         4T     207     220     344     749    1249    2084
         8T     204     229     333     753    1331    2056

          Total Elapsed Time   12.7 seconds

Normal Full Speed

  Android MP-BusSpd v7 Benchmark V1.1 09-Aug-2013 17.16

    MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB      Inc32   Inc16    Inc8    Inc4    Inc2   RdAll

    12.3 1T    3192    4183    3734    4797    5393    6190
         2T    9165   10043   11654   12215   12606   12926
         4T    7811    9174   10998   12110   12584   11769
         8T    7009    9026   11033   12043   12587   12791
   122.9 1T     667     776    1198    2916    3164    2028
         2T     419     938    2036    4556    5943    6111
         4T     593    1011    1899    4250    5955    5829
         8T     472    1015    2031    4499    5872    5460
   12288 1T     125     134     197     582     988    1382
         2T     277     281     349    1097    1377    2347
         4T     180     228     335     795    1524    2411
         8T     274     243     354     824    1467    2468

          Total Elapsed Time   12.5 seconds

Extended

  Android MP-BusSpd2 Benchmark V2.1 24-Sep-2013 12.00

    MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB      Inc32   Inc16    Inc8    Inc4    Inc2   RdAll

    12.3 1T    5505    4738    4318    6406    6492    6569
         2T    9163    9897   11365   11924   12137   12524
         4T    7532    8876   11093   12046   12199   12328
         8T    7164    9020   11451   12037   12384   12689
   122.9 1T     692     898    1191    2684    2612    3325
         2T     606    1010    2029    4569    6066    6312
         4T     604     923    2063    4396    6234    6361
         8T     600    1024    2059    4613    6269    6460
   12288 1T     128     134     185     548     926    1335
         2T     280     278     346    1103    1385    2445
         4T     239     239     353     828    1362    2410
         8T     236     245     354     810    1416    2418

          Total Elapsed Time   49.8 seconds

CPU MHz Example Normal Full Speed

  8.01   1400    8.19   1700    8.41   1700    8.65   1700    8.79   1700
  8.91   1700    9.14   1700    9.26   1700    9.46   1700    9.61   1600
  9.80   1700   10.02   1700   10.23   1700   10.41   1700   10.64   1700
 10.98   1700   11.09   1700   11.21   1700   11.32   1700   11.44   1700
 11.55   1000   11.68   1000   11.80   1200   11.92   1400   12.15   1700
 12.44   1700   12.67   1700   12.92   1700   13.10   1700   13.29   1700
 13.41   1700   13.54   1700   13.74   1300   13.93   1700   14.22   1700
 14.44   1700   14.58   1700   14.72   1600   14.86   1700   15.01   1600
 15.14   1700   15.27   1700   15.40   1700   15.53   1700   15.66   1700
 15.79   1700   15.92   1700   16.04   1700   16.17   1700   16.41   1700
 16.61   1700   16.91   1700   17.30   1700   17.71   1700   18.16   1700
 18.46   1700   18.65   1700   18.97   1700   19.16   1700   19.33   1000




MP NEON MFLOPS Version 2 - NEON-MFLOPS2-MP.apk

This benchmark is the same as MP-MFLOPS, except that NEON SIMD functions are used instead of normal floating point calculations. With this faster program, the extended version executes 32 times the number of operations.

The extended version produces a 1.7 times performance gain at 32 operations per word and for L2 cache results at 2 operations per word, but L1 speeds are not much faster than with the CPU running at 1000 MHz.



 ##############################################################

 T11 Samsung EXYNOS 5250 Dual 2.0 GHz Cortex-A15, Android 4.2.2

  

Normal 1000 MHz

  Android NEON-MFLOPS-MP Benchmark V1.1 13-Sep-2013 14.18

    FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T     1907    1024     619    2484    2326    2357
 2T     3664    2734     652    4871    4769    4609
 4T     3342    3125     656    4768    4855    4482
 8T     3121    3228     667    4763    4902    4582

          Total Elapsed Time    2.4 seconds

Normal Full Speed

  Android NEON-MFLOPS-MP Benchmark V1.1 13-Sep-2013 13.44

    FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T     1878    1433     616    2556    3078    2893
 2T     3672    2720     673    5789    5903    6451
 4T     4833    4606     690    6578    7680    5135
 8T     4019    4474     676    6607    7685    7256

          Total Elapsed Time    1.9 seconds

Extended

  Android NEON-MFLOPS2-MP Benchmark V2.1 24-Sep-2013 16.44

    FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T     2153    2421     602    3930    3975    3955
 2T     4353    4661     684    8422    8219    7744
 4T     4096    5165     668    8367    8372    7707
 8T     4382    5846     681    8275    8412    7696

          Total Elapsed Time   39.4 seconds

CPU MHz Example Normal Full Speed

  9.50   1200    9.64   1000    9.78   1400    9.99   1700   10.18   1700
 10.31   1700   10.44   1700   10.56   1700   10.67   1700   10.79   1000
 10.91   1000   11.07   1000   11.21   1000   11.34   1000   11.47   1000




Systems Used



 T11     Voyo A15, Samsung EXYNOS 5250 Dual core 2.0 GHz Cortex-A15, 
         Mali-T604 GPU, 2 GB DDR3-1600 RAM, dual channel, 12.8 GB/s
         Screen pixels w x h 1920 x 1032 
         Android Build Version      4.2.2  - Jelly Bean
         Processor       : ARMv7 Processor rev 4 (v7l)
         processor       : 0
         BogoMIPS        : 992.87
         processor       : 1
         BogoMIPS        : 997.78
         Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
                           idiva idivt 
         CPU implementer : 0x41
         CPU architecture: 7
         CPU variant     : 0x0
         CPU part        : 0xc0f
         CPU revision    : 4
         Hardware        : SMDK5250
         Linux version 3.4.35Ut
 




Roy Longbottom January 2016

The Official Internet Home for my Benchmarks is via the link
Roy Longbottom's PC Benchmark Collection