NOTE - These benchmarks generally ran successfully on devices running Android versions up to 7. They could be installed under Android 8 but failed to run due to a minor incompatibility. The benchmarks have been regenerated to remove this problem. Newer versions can be downloaded from
Android 9 benchmarks.htm.
This also includes details and results from later technology.
Download Benchmark Apps
All have an option to save results via email. The first set automatically selects benchmark code for ARM, Intel or MIPS processors at run time, for 32 bit architecture or 64 bit when supported. Following are older 32 bit benchmarks that are still relevant.
General
My original Android benchmarks were compiled to only run on ARM CPUs using 32 bit instructions. These are available from android benchmarks32.htm. The newer ones automatically select benchmark code for ARM, Intel or MIPS processors at run time, for 32 bit architecture or 64 bit when supported. These were produced using a later version of the gcc compiler. When evaluating performance differences of 64 bit operation, those at 32 bits should be produced by the same compiler version. These are in the 32 bit zip file that can be downloaded from above. It should be noted that Android recognises these as identical to the 64 bit versions, which might then need to be reinstalled. The version is identified in the output display.
The original ARM native code benchmarks will run on Intel CPUs, but slowly, via an Android based compatibility layer, called Houdini, that maps ARM instructions into those for X86 processors. The new ones use native Intel instructions. After installing Android 5.0 on the Intel tablet, the original ARM native code benchmarks were rerun. As shown in the results below, significant speed gains could be obtained.
The latest benchmarks were compiled using gcc 4.8, via Eclipse Android Development Tools. The project files, with source code, are in Android Intel-ARM Benchmarks.zip. Limited tests show that these projects can also be used to produce the benchmarks via Android Studio. The zip file now includes the projects for the above earlier tests, in a folder named Old.
All Java and native C/C++ based benchmarks use the same Java front end to run the benchmarks and display the results, an example being below.
There are Run, Information and Save buttons, the latter to email results to me and/or whoever. The results are also saved in text based log files that also identify system characteristics. As indicated, results identify whether 32 bit or 64 bit code has been executed.
New Version of android benchmarks.htm - The revised version of this report will contain results from running a wide range of the 32/64 bit benchmarks on a particular tablet or phone, plus any from newer top end devices. For detail and results from the original benchmarks see the last report.
Strategy - These benchmarks, based on 50 years' experience, do not attempt to provide an overall performance rating (the Lies, Damned Lies and Benchmarks type), as it is meaningless in representing the diverse variety of user activities. The programs are intended to identify best and worst performance characteristics that might explain why a particular application is fast or slow.
CPU Benchmarks - The first set is the Classic Benchmarks, the original programs that set standards of performance for computers, comprising Whetstone, Dhrystone, Linpack (including NEON-Linpack) and Livermore Loops.
Memory Benchmarks - Next are programs that measure performance with data from caches and RAM. MemSpeed (including NeonSpeed variant), BusSpeed and RandMem all use the same range of data sizes between 4 KB and 64 MB. Then there is a Fast Fourier Transform benchmark with multiple data sizes.
MultiThreading Benchmarks - These all measure performance using 1, 2, 4 and 8 threads. The first are MP-Whetstone, MP-Dhrystone and MP-Linpack (including NEON-Linpack-MP). The next batch all use memory sized 12.8 KB, 128 KB and 12.8 MB, comprising MP-MFLOPS (including NEON-MFLOPS MP), MP-BusSpeed and MP-RandMem.
Older Benchmarks - These include graphics and SD drive benchmarks.
Windows 10 Tablet - The C code part of the benchmarks has been used as the basis of programs compiled, as 32 bits and 64 bits, to run on Intel processors via Windows. Results are included below for comparison purposes, but performance might not be the same as that from Android versions running on the same Intel processor model (See system W1). The benchmark execution files are in WinTablet.zip.
March 2016 - A second Windows 10 tablet was obtained, using the same Atom CPU, with the added dual boot option to use Android. This uses a 64 bit Linux kernel but, unfortunately, Android is a 32 bit variety. Results for both Operating Systems are included below.
October 2016 - Results now include some using Remix OS for PC that runs Android applications on compatible Intel-based PCs. These include using this for a second boot option on one of the Windows 10 tablets.
May 2017 - Android 7.0 results included, with all 32 bit benchmarks being run on Cortex-A53 based P37. All processor dependent benchmark results were essentially the same as those from Android 6, except Java varieties, where the Whetstone benchmark speed improved considerably.
June 2017 - Floating point and integer arithmetic stress tests were produced. These are multithreaded programs, where number of threads, data size and running time can be defined, plus operations per word for the floating point tests. Unlike previous benchmarks, these display results continuously, over 10 second periods.
Logged Configuration
Following are examples of ARM and Intel based system information included in the log files.
Whetstone Benchmark - NativeWhetstone2.apk, Java Whetstone.apk, WhetsNN.exe
This provides an overall rating in MWIPS, plus separate results for the eight test procedures in MFLOPS (floating point) and MOPS (functions and integer). For full details and results via Windows, Linux, Android and via different programming languages, see this HTM file (including Windows tablet versions running on desktop PCs) and Whetstone Benchmark History and Results from the 1960’s.
Below are results from the original benchmark for comparison with the new one, compiled for 32 bit systems. The initial aim was to show performance improvements of using native code on Intel Atom processors, rather than via the Houdini compatibility translation, where speeds of system A1 were around twice as fast. Note the performance differences of the original ARM version (from here) on A1, the Intel Atom based tablet, following the upgrade to Android 5.0.
The downside of the later gcc 4.8 compilation was much slower MWIPS ratings using ARM CPUs. This was due to the extremely slow speeds on the EXP tests that dominate overall running time. On a given platform, as with other CPU only benchmarks, performance tends to be proportional to CPU MHz. Considering this, the particular code appears to suit the Qualcomm Snapdragon 800 and shows no real advantage of the ARM v8-A53 over the V7 varieties.
In fact, the EXP test also uses the SQRT function. A test in the Livermore Loops Benchmark also uses this function, and it produces unexpectedly worse minimum speed on the same systems (T7, T11, T22 - ARM/Intel 32 Bit Version).
Java results are also included, particularly to show the effects of Android 5 using the ART virtual machine instead of Dalvik. For this particular benchmark, there are gains and losses, but all are slower than the native compiled versions.
A5 and W2 Dual Boot Tablet - Differences in results from Microsoft and Android compilers are reflected. Atom Z8300 results for W2 are slower than W1.
Later, similar results were obtained. 2016 - Note fast Core i7 results using Android via REMIX for PC and slow Java speeds with Android 6.0, that might be due to later Java, as shown with Intel/Windows Version results.
Dhrystone Benchmark - Dhrystone2i.apk, Dhry2NN.exe
The Dhrystone integer benchmark produces a performance rating in Vax MIPS (AKA DMIPS). Further details of the Dhrystone benchmark, and results from Windows and Linux based PCs, can be found in this HTM file (including Windows tablet versions running on desktop PCs) with those up to late 2012. The shown ratio, MIPS/MHz, is often quoted. It depends on compiler optimisation (or over-optimisation) but is normally constant using the same benchmark on the same range of processors.
Using native x86 code, performance of the Intel Atom based tablet A1 is 30% faster than the original ARM to Intel translated program but, on the other systems, the newer 32 bit compilations are slower. At least tablet T22 is nearly twice as fast when compiled for 64 bit operation. Following an upgrade to Android 5.0, A1 ARM to Intel translation produced performance equivalent to native code. Original can be obtained from here.
2016 - Note faster Android operation at 64 bits and REMIX Android on Core i7 outstanding speeds similar to Windows versions.
Linpack Benchmark - LinpackDP2.apk, LinpackSP2.apk, LinpackJava.apk,
System ARM MHz Android LinpackDP LinpackSP NEONLinpack LinpackJava
See MFLOPS MFLOPS SP MFLOPS MFLOPS

Original ARM Version
T7 v7-A9 1200 4.1.2 151.05 201.30 376.00 56.44
T22 v8-A53 1300 5.0.2 156.70 184.09 393.34 86.09
T11 v7-A15 1700 4.2.2 459.17 803.04 1334.90 143.06
T21 QU-800 2150 4.4.3 389.52 751.95 1250.14 340.44
A1 Z3745 1866 4.4.2 168.16 296.63 443.42 252.49
A1 Z3745 1866 5.0 253.83 293.20 680.85 166.09
A5 ## Z8300 1840 5.1 238.04 318.00 746.36 174.67
R1=Atom Z8300 1840 6.0.1 781.17 37.65
R2 Core i7 3900 6.0.1 3717.42 222.23

ARM/Intel 32 Bit Version
T7 v7-A9 1200 4.1.2 159.34 199.84 346.78
T7 v7-A9 1200 5.1.1 160.25 198.96 346.12 89.50
T22 v8-A53 1300 5.0.2 172.28 180.64 407.08
T22 v8-A53 1300 5.1 178.04 187.03 421.86 91.28
P37 v8-A53 1500 6.0.1 207.64 219.03 480.21 23.25
P37 v8-A53 1500 7.0 208.00 220.13 474.21 112.14
T11 v7-A15 1700 4.2.2 826.36 952.88 1411.86 See above
T21 QU-800 2150 4.4.3 629.92 790.83 1325.00 See above
A1 Z3745 1866 4.4.2 362.63 408.87 900.17 See above
A1 Z3745 1866 5.0 363.98 406.59 900.64 See above
A5 ## Z8300 1840 5.1 609.39 644.32 942.12 See above
R1=Atom Z8300 1840 6.0.1 632.56 682.08 1000.00 See above
R2 Core i7 3900 6.0.1 3442.00 1838.99 N/A See above

ARM/Intel 64 Bit Version
T22 v8-A53 1300 5.0.2 338.00 479.69 505.12
T22 v8-A53 1300 5.1 347.55 492.78 520.79 See above
P33 QU-810 2000 5.0.2 1277.76
R1=Atom Z8300 1840 6.0.1 875.82 1473.16 N/A See above
R2 Core i7 3900 6.0.1 5152.85 3950.31 N/A See above

Intel/Windows 32 Bit Version
W1 Atom Z8300 1840 Win 10 615.80 See 64b
W2 ## Z8300 1840 Win 10 613.50 See 64b
PC Core i7 3900 Win 10 3453.72 See 64b

Intel/Windows 64 Bit Version
W1 Atom Z8300 1840 Win 10 638.75 254.73
W2 ## Z8300 1840 Win 10 636.00 265.66
PC Core i7 3900 Win 10 3603.86 465.32

## A5 and W2 Same Dual Boot Tablet
=Atom R1 and W1 Same Tablet
R2 and PC same System
The Livermore Loops comprise 24 kernels of numerical application with speeds calculated in MFLOPS. A summary is also produced, with maximum, minimum and various mean values, geometric mean being the official average. As for other of these benchmarks, details and results from various hardware and software platforms are provided in this HTM file (including Windows tablet versions running on desktop PCs), with results up to late 2012 in
MFLOPS/MHz - The first set of the following comparisons are derived from shown MFLOPS of the 24 kernels for each system, divided by CPU MHz, and compared to those from T7 Cortex-A9 CPU. They can indicate the effectiveness of particular levels of hardware and compiler technology. The low minimum speeds occur in the only loop that uses the SQRT function, where the Whetstone Benchmark is also slow on the same systems. The second Cortex-A53 is running under 64 bit Android that might make a difference. Performance of the systems with better minimum values appears enhanced by the slow T7 Cortex-A9. On average values for ARM CPUs, Qualcomm 800 and Cortex-A15 are somewhat faster. The Intel CPUs are faster on a per MHz basis, with Core i7 being far superior. Note that Android and Windows performance is quite similar for the latter.
64 Bit vs 32 Bit - At least as far as average speeds are concerned, working at 32 bits and 64 bits produces similar performance on Intel based devices but 64 bits can be much faster with ARM processors. Note that Intel CPUs can use the same SSE type SIMD instructions at both settings.
Native 32 Bit vs Original Code - The original benchmarks were compiled for ARM CPUs, producing Intel instructions via the Houdini conversion layer. In this case, performance was much better using native code compilation. ARM speeds were affected by using a later version of the compiler.
Original ARM only version can be obtained from here.
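The MFLOPS/MHz comparison can be sketched as a small C helper. This is an illustration, not code from the benchmark: the function name `per_mhz_ratio` is hypothetical. It also assumes that the Avg column is derived from the geometric mean, an inference that is consistent with the 32 bit T11 and T7 summary values in the results listing.

```c
#include <assert.h>

/* Speed per MHz of one system relative to a reference system.
   Hypothetical helper, not part of the benchmark source. */
double per_mhz_ratio(double mflops, double mhz,
                     double ref_mflops, double ref_mhz)
{
    assert(mhz > 0.0 && ref_mhz > 0.0 && ref_mflops > 0.0);
    return (mflops / mhz) / (ref_mflops / ref_mhz);
}
```

For example, the T11 Cortex-A15 geometric mean of 342.1 MFLOPS at 1700 MHz, against the T7 Cortex-A9 value of 175.6 MFLOPS at 1200 MHz, gives a ratio of about 1.38, matching the Avg entry for T11 in the comparison table.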
MFLOPS/MHz vs Cortex-A9                  Avg   Min   Max
T11 Cortex-A15    Android 32            1.38  0.90  2.51
T22 Cortex-A53    Android 32            0.83  0.93  0.92
P37 Cortex-A53    Android 32            0.95  2.17  0.96
T21 Qualcomm 800  Android 32            1.13  2.34  1.63
A1  Atom Z3745    Android 32            1.57  3.71  1.67
A5  Atom Z8300    Android 32            1.61  3.24  1.82
R1  Atom Z8300    Android 32            1.62  3.07  1.96
W2  Atom Z8300    Windows 32            1.66  4.47  1.62
R2  Core i7       Android 32            3.23  4.10  4.22
PC  Core i7       Windows 32            3.68  5.07  4.52

64 Bit / 32 Bit                          Avg   Min   Max
T22 Cortex-A53    Android 64/32         1.47  3.61  1.96
R1  Atom Z8300    Android 64/32         1.04  1.08  0.95
R2  Core i7       Android 64/32         1.18  1.00  1.72
W2  Atom Z8300    Windows 64/32         0.97  0.78  1.09
PC  Core i7       Windows 64/32         0.96  0.74  1.06

Native/Original
A1  Atom Z3745    Android 32            1.92  2.49  3.17
T7  Cortex-A9     Android 32            1.01  0.97  0.39
T11 Cortex-A15    Android 32            1.13  0.91  0.38
T21 Qualcomm 800  Android 32            1.08  1.00  1.12
System CPU MHz Android MFLOPS 24 Loops

Original ARM Version

A1 Z3745 1866 4.4.2 9.5 secs
201.2 257.3 237.5 205.6 122.5 180.0 308.3 450.0 535.3 370.4 104.8 77.1
80.0 95.1 153.8 136.4 202.0 268.9
179.5 209.7 145.0 95.0 254.2 51.3
Max 535.8 Average 201.9 Geomean 172.4 Harmean 146.7 Min 48.8

A1 Z3745 1866 5.0 9.9 secs
374.9 274.8 327.6 295.6 247.9 227.8 468.5 538.6 569.2 396.2 167.9 141.9
109.6 114.5 210.5 150.5 250.6 333.4
287.9 238.0 261.3 114.9 372.8 64.0
Max 569.8 Average 266.6 Geomean 233.5 Harmean 199.8 Min 59.9

T7 v7-A9 1200 4.1.2 10.0 secs
241.7 233.4 383.5 388.7 98.4 147.1 293.1 258.5 314.6 181.1 99.1 95.3
80.6 68.1 171.6 226.9 346.2 176.9
202.6 184.9 119.5 102.1 200.9 88.5
Max 391.9 Average 202.1 Geomean 181.3 Harmean 160.9 Min 68.1

T11 v7-A15 1700 4.2.2 10.0 secs
646.8 671.1 839.9 789.7 176.2 671.6 1078.4 1243.4 1018.8 367.0 130.0 165.9
117.6 210.7 370.5 521.1 657.3 625.4
270.8 269.1 458.3 196.3 432.5 112.7
Max 1252.8 Average 476.0 Geomean 375.8 Harmean 288.8 Min 90.8

T21 QU-800 2150 4.4.3 10.0 secs
570.4 624.2 915.6 861.4 175.5 545.4 636.9 911.1 750.6 293.9 130.5 207.0
115.0 159.8 330.5 327.1 608.7 592.8
330.2 267.3 244.2 153.8 356.2 106.2
Max 1075.5 Average 437.1 Geomean 356.7 Harmean 284.4 Min 100.3

ARM/Intel 32 Bit Version

A1 Z3745 1866 4.4.2 9.5 secs
484.6 529.2 1031.2 929.2 274.5 365.6 661.9 873.1 825.6 479.1 612.9 520.7
156.8 324.4 339.4 497.8 693.1 481.8
373.0 329.1 388.6 181.8 650.1 169.2
Max 1031.2 Average 480.0 Geomean 429.8 Harmean 378.6 Min 154.7

A5 ## Z8300 1840 5.1 9.6 secs
689.4 701.4 1108.3 873.6 230.1 488.4 662.2 770.0 876.7 404.9 439.6 428.2
141.2 280.7 293.4 466.1 540.3 432.7
313.9 307.8 649.7 176.1 662.0 148.3
Max 1108.3 Average 495.8 Geomean 433.6 Harmean 370.6 Min 133.2

T11 v7-A15 1700 4.2.2 10.0 secs
496.9 814.9 843.7 801.7 175.5 188.6 1223.8 1411.4 760.3 452.5 132.7 120.7
107.1 264.7 34.3 529.0 592.6 728.2
275.2 266.8 530.7 198.8 502.8 117.8
Max 1411.4 Average 471.2 Geomean 342.1 Harmean 219.5 Min 34.3

T21 QU-800 2150 4.4.3 10.1 secs
640.9 814.9 813.8 808.4 201.6 182.0 643.0 1158.9 779.9 351.4 133.1 176.2
113.6 178.4 286.5 294.7 516.7 667.5
327.5 281.7 297.9 153.6 613.1 117.0
Max 1159.4 Average 446.9 Geomean 356.0 Harmean 280.3 Min 112.3

T7 v7-A9 1200 4.4.2 10.2 secs
245.2 268.8 394.7 390.7 118.2 157.2 297.4 308.1 344.7 226.7 90.8 74.7
85.6 81.7 26.9 227.5 338.9 240.3
204.9 180.6 179.9 110.8 271.4 78.5
Max 396.6 Average 207.6 Geomean 175.6 Harmean 136.1 Min 26.8

P37 v8-A53 1500 6.0.1 9.8 secs
201.7 293.7 331.7 327.5 135.5 137.1 346.5 474.9 451.5 271.6 149.7 74.9
81.2 104.5 236.3 278.4 411.1 294.2
208.0 245.7 148.2 128.8 351.7 99.9
Max 474.9 Average 237.4 Geomean 208.3 Harmean 179.9 Min 72.7

P37 v8-A53 1500 7.0 9.7 secs
198.6 295.5 331.3 325.1 131.7 140.5 341.5 475.4 452.1 241.8 149.5 74.8
81.8 105.2 237.0 279.0 412.9 295.1
208.8 238.9 131.1 133.1 353.3 100.4
Max 475.4 Average 237.0 Geomean 208.1 Harmean 180.0 Min 72.9

ARM/Intel 32 Bit Version Then 64 Bit

T22 v8-A53 1300 5.0.2 9.7 secs
163.4 243.4 272.1 270.3 109.5 111.2 282.2 389.0 360.6 219.6 124.0 61.8
67.6 87.4 27.3 224.2 340.1 241.9
168.5 198.8 120.2 120.6 277.7 79.1
Max 393.4 Average 188.3 Geomean 158.3 Harmean 124.6 Min 27.1

R1=Atom Z8300 1840 6.0.1 9.4 secs
746.6 767.9 1194.9 986.9 249.3 520.5 722.7 840.9 978.5 370.5 451.5 450.1
151.3 301.3 331.4 524.9 608.1 465.5
352.1 316.8 578.8 181.8 695.3 166.3
Max 1194.9 Average 501.0 Geomean 435.1 Harmean 366.6 Min 126.1

R2 Core i7 3900 6.0.1 8.4 secs
3664.3 3433.9 2498.9 2509.6 552.5 2201.3 4618.0 5337.8 5345.9 2426.9 1307.3 1888.8
670.6 1211.5 2033.5 1804.4 2382.0 3571.5
840.6 968.8 2967.6 1112.8 1591.4 356.9
Max 5441.5 Average 2259.0 Geomean 1845.3 Harmean 1445.9 Min 356.9

ARM/Intel 64 Bit Version

T22 v8-A53 1300 5.0.2 9.7 secs
451.4 191.4 243.2 272.4 144.9 144.5 749.4 411.1 453.6 261.1 138.0 206.1
122.5 130.1 215.0 249.8 411.6 395.4
241.7 248.1 152.8 118.7 317.2 103.7
Max 772.2 Average 265.9 Geomean 232.5 Harmean 206.3 Min 97.8

R1=Atom Z8300 1840 6.0.1 9.4 secs
881.6 742.9 1130.2 928.7 236.9 554.1 869.1 795.4 854.7 433.5 198.4 604.5
215.7 292.9 320.3 520.5 628.6 528.5
321.3 290.4 692.1 205.4 698.3 164.8
Max 1130.2 Average 524.1 Geomean 451.1 Harmean 380.6 Min 136.1

R2 Core i7 3900 6.0.1 8.9 secs
9376.3 3394.8 2496.3 2523.0 559.6 2219.9 8891.9 5719.5 5828.1 2749.2 439.9 3146.1
1182.7 1272.5 2282.8 2332.7 2379.4 5722.9
1068.6 966.6 2966.5 1435.5 1590.7 357.0
Max 9376.3 Average 2933.0 Geomean 2172.0 Harmean 1556.1 Min 357.0

Intel/Windows 32 Bit Version

W1 Atom Z8300 1840 MHz Win10
721.4 702.3 862.7 988.7 245.3 489.6 875.8 794.8 980.5 441.3 201.1 446.7
201.0 240.8 299.8 499.9 603.5 459.3
446.6 336.0 607.8 199.0 705.3 277.8
Max 988.7 Average 504.9 Geomean 448.8 Harmean 395.8 Min 189.6

W2 ## Z8300 1840 MHz Win10
749.7 731.3 894.0 988.1 251.2 489.2 883.7 797.3 968.3 434.5 200.5 454.0
202.9 240.7 301.1 521.4 604.3 457.0
443.6 333.7 587.1 200.6 697.1 276.4
Max 988.1 Average 503.5 Geomean 446.9 Harmean 393.5 Min 183.5

PC Core i7 3900 MHz Win10
4752.7 3624.0 2593.7 2764.2 564.5 1590.3 5071.8 5284.2 5569.6 2784.2 441.5 1939.4
931.3 1205.6 2284.2 2372.1 2435.2 3500.0
1068.8 1880.4 2819.7 1529.1 1590.4 1616.5
Max 5821.7 Average 2512.1 Geomean 2102.9 Harmean 1712.0 Min 441.5

Intel/Windows 64 Bit Version

W1 Atom Z8300 1840 MHz Win10
655.2 651.9 728.1 688.9 217.3 457.6 732.2 735.7 965.5 378.3 170.7 381.6
196.6 196.4 213.0 434.4 522.6 420.6
385.6 283.1 572.9 156.7 584.5 129.7
Max 965.5 Average 433.6 Geomean 375.8 Harmean 320.0 Min 117.2

W2 ## Z8300 1840 MHz Win10
743.0 734.7 834.0 808.1 233.4 547.9 878.6 857.4 1074.4 440.8 201.7 450.4
215.4 228.6 247.9 500.4 608.1 484.8
440.7 327.7 650.4 180.8 682.2 151.6
Max 1074.4 Average 500.3 Geomean 433.7 Harmean 369.8 Min 143.4

PC Core i7 3900 MHz Win10
4566.1 3465.7 2459.1 2748.1 565.1 2308.3 6142.4 5354.0 5195.9 2518.0 417.8 1838.7
941.0 1096.4 2166.0 2180.9 2291.5 3357.0
1005.1 1780.7 2871.2 1311.6 1600.5 324.9
Max 6142.4 Average 2514.4 Geomean 2014.9 Harmean 1500.5 Min 324.9

## A5 and W2 Same Dual Boot Tablet
=Atom R1 and W1 Same Tablet
R2 and PC same System
This benchmark measures data reading speeds in MegaBytes per second carrying out calculations on arrays of cache and RAM data, sized 2 x 8 KB to 2 x 32 MB. Calculations are x[m]=x[m]+s*y[m] and x[m]=x[m]+y[m], using double and single precision floating point and x[m]=x[m]+s+y[m] and x[m]=x[m]+y[m] with integers. Million Floating Point Operations Per Second (MFLOPS) speed can be calculated by dividing double precision MB/second by 8 and 16, for the two tests, and single precision speeds by 4 and 8. Assembly listings for integer tests show that Millions of Instructions Per Second (MIPS) can be found by multiplying MB/second by 0.78 with 2 adds and 0.66 for the other test. Cache sizes are indicated by varying performance as memory usage changes. For more details and older results see here.
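The MB/second to MFLOPS conversion above can be written out as a short C sketch. It assumes that the reported MB/second counts the bytes read from both arrays (x and y), which is what the quoted divisors of 8, 16, 4 and 8 imply. The function name `mflops_from_mbps` is hypothetical, not taken from the benchmark source.

```c
#include <assert.h>

/* Convert measured MB/second to MFLOPS for the floating point tests.
   elem_bytes: 8 for double precision, 4 for single precision.
   flops_per_result: 2 for x[m]=x[m]+s*y[m], 1 for x[m]=x[m]+y[m].
   Assumes each result reads two operands of elem_bytes each. */
double mflops_from_mbps(double mb_per_sec, int elem_bytes,
                        int flops_per_result)
{
    assert(elem_bytes == 4 || elem_bytes == 8);
    assert(flops_per_result == 1 || flops_per_result == 2);
    double bytes_per_result = 2.0 * elem_bytes;
    return mb_per_sec * flops_per_result / bytes_per_result;
}
```

With an illustrative speed of 1600 MB/second, the double precision multiply and add test works out at 200 MFLOPS (divide by 8) and the add only test at 100 MFLOPS (divide by 16).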
The native ARM/Intel results, on Intel Atom based A1, averaged 44% faster than the original translated speeds via L1 cache data, 27% using L2 and 14% from RAM. Running under Android 5.0, the translated benchmark speeds were similar to the new version, in most cases. (Original ARM only version can be obtained from here).
Initial measurement, running the new 32 bit version on ARM CPUs, produced similar results to the original benchmark.
First results, to provide 64/32 bit comparisons on ARM CPUs, were on Tablet T22, where average 64/32 bit speed ratios were 2.20 times, using cached data, and 1.58 times from RAM.
The benchmark is based on, and is similar to, my original Windows MemSpeed benchmarks, where details and results can be found here. These can be compared with the new Windows tablet version, from later compiler, with 32 bit and 64 bit results included below. Android results R1 and R2 are via REMIX for Intel PCs, running at 64 bits.
Dual Booting - Results include those for Windows and Android running on the same system. They are dual boot A5 and W2, alternative boot W1 and R1, and alternative boot PC and R2.
Following the results are processor technology comparisons with the ARM Cortex-A9 CPU, based on MB/second divided by CPU MHz, demonstrating that each has its strengths and weaknesses. See comments in comparison table.
Results are dependent on the particular compiler used. Those for the Windows version were produced by an earlier compiler and are relatively slow at 64 bits. An example of differences is for the first test, with a source code loop, in double precision, that contains four multiplies and four adds. Assembly code produced for Intel CPUs has four scalar SSE2 multiplies and four adds at 32 bits, with two SIMD SSE2 instructions of each at 64 bits. That for ARM has four fmacd floating-point multiply-accumulate instructions to double precision registers at 32 bits and two fmla fused multiply-add instructions to vector registers at 64 bits. The result is much faster performance at 64 bits.
In principle, SIMD instructions could also be used at 32 bits for Intel, but fmla is only available at 64 bits with ARM.
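As an illustration of a source loop with four multiplies and four adds, the following C sketch unrolls x[m]=x[m]+s*y[m] four times. This is a guess at the general shape only: the actual benchmark source is not reproduced here, and the function name `triad_unrolled4` is hypothetical. Whether a compiler emits scalar or SIMD instructions for it depends on the target and options, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Double precision x[m] = x[m] + s * y[m], unrolled by four, so each
   pass of the loop body has four multiplies and four adds.
   Hypothetical sketch; n must be a multiple of 4. */
void triad_unrolled4(double *x, const double *y, double s, size_t n)
{
    assert(n % 4 == 0);
    for (size_t m = 0; m < n; m += 4) {
        x[m]     = x[m]     + s * y[m];
        x[m + 1] = x[m + 1] + s * y[m + 1];
        x[m + 2] = x[m + 2] + s * y[m + 2];
        x[m + 3] = x[m + 3] + s * y[m + 3];
    }
}
```

The four independent statements give a compiler the option of pairing them into two 128 bit vector operations, which is consistent with the two SIMD instructions of each type reported for the 64 bit builds.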
This benchmark carries out the same calculations as the MemSpeed Benchmark, measuring data reading speeds in MegaBytes per second, with functions accessing arrays of cache and RAM based data, sized 2 x 8 KB to 2 x 32 MB. Calculations are x[m]=x[m]+s*y[m] and x[m]=x[m]+y[m] using single precision floating point, and x[m]=x[m]+s+y[m] and x[m]=x[m]+y[m] with integers. Million Floating Point Operations Per Second (MFLOPS) speed can be calculated by dividing single precision MB/second by 4 and 8, for the two tests. The first set of calculations use normal functions followed by some using NEON Intrinsic Functions. The last two columns are NEON only results. For further details and results see android neon benchmarks.htm.
On tablet A1, with the Intel Atom CPU, the 32 bit native code version produced some significant performance gains over the original ARM benchmark (available from here), but rerunning this via Android 5.0 produced much faster speeds, some better than native code compilation.
The later compiler produced some slower and some faster speeds on ARM based tablets.
Details are provided for the 64 bit version on T22. As with NEON-Linpack, many results from 32 bit and 64 bit compilations, via NEON intrinsic functions, were similar. With normal code, the 64 bit compilations were up to near four times faster than those at 32 bits.
Following the results are further MB per second/CPU MHz comparisons. Subject to variations due to cache occupancy, the comparisons for normal calculations are the same as MemSpeed. Then, more modern processors performed relatively better, using NEON instructions.
See comments in comparison table.
This benchmark (based on PC version with details and results here) is designed to identify reading data in bursts over buses. The program starts by reading a word (4 bytes) with an address increment of 32 words (128 bytes) before reading another word. The increment is reduced by half on successive tests, until all data is read. On reading data from RAM, 64 Byte bursts are typically used. Then, measured reading speed reduces from a maximum, when all data is read, to a minimum on using 16 word increments (64 bytes). Potential maximum speed can be estimated by multiplying this minimum value by 16. With this burst rate, measured speeds at 32 word and 16 word increments are likely to be the same. Cache sizes are indicated by varying speed as memory use changes. Note, with smallest L1 cache demands, measured speed can be low due to overheads when reading little data. For more details and further results see here.
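The reading pattern described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the benchmark's own code: the function names are hypothetical and the words are simply summed, where the real program may combine them differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one 4 byte word every `inc` words and add it to a running sum.
   inc = 32 skips 128 bytes between reads; inc = 1 reads all data.
   Hypothetical sketch of the access pattern only. */
uint32_t sum_with_increment(const uint32_t *data, size_t words, size_t inc)
{
    assert(inc > 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < words; i += inc)
        sum += data[i];
    return sum;
}

/* With 64 byte bursts, a 16 word increment uses one word per burst,
   so the transferred data rate is about 16 times the measured one. */
double estimated_max_mbps(double mbps_at_16_word_inc)
{
    return mbps_at_16_word_inc * 16.0;
}
```

A test run would time `sum_with_increment` with inc values of 32, 16, 8, 4, 2 and 1 over the same array, counting only the bytes actually read when calculating MB/second.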
Comparing results from different versions, on a particular system, there can be unusual differences on burst reading speeds. Those quoted here are for the most important measurements for reading all data.
On Intel Atom based tablet A1, there was little difference between the old ARM version, with conversion, and the new 32 bit native code program, nor between using Android 5.0 and 4.4.
Average revised 32 bit version performance improvements, via caches/RAM, were 8%/17% for T7 Cortex-A9, 11%/27% for T11 Cortex-A15 and 27%/-8% on T21 Snapdragon 800. Corresponding T22 Cortex-A53 64/32 bit improvements were 61%/25%.
After the results are further MB per second/CPU MHz comparisons, for this integer data streaming benchmark that can demonstrate maximum data transfer speed from RAM. As the latter might not be dependent on CPU speed, direct MB/second comparisons are also provided. These are dependent on bus speed, 32 bit or 64 bit bus width and whether one or two channels are available, one problem being that it is often difficult to identify what is provided. Note that multithreaded benchmarks might be needed to fully utilise memory bandwidth - see later results.
Results of the Windows version are also included for a tablet and, for comparison purposes, a desk top PC with 4 memory channels. Intel systems have 64 bit bus widths.
Intel CPUs - Results on Atom Z8300 are similar via different compilers/Operating Systems, using Android A5, REMIX/Android R1 and R2, plus Windows W1 and W2. Of those available, 32 bit and 64 bit versions have similar performance. RAM speeds tend to be faster than those on ARM based systems, due to 64 bit bus widths. As would be expected, Core i7 speeds are superior, based on MB/second per MHz and, particularly, on RAM MB/second comparisons. See also comments in comparison table.
ARM CPUs - With 32 bit versions, MB/second per MHz comparisons, with the older Cortex-A9, tend to be worse using L1 cache but better from L2 and RAM. The only 64 bit version results available are for T22, Cortex-A53, demonstrating faster L1 cache based tests, with lower improvements from L2 and RAM.
RandMem benchmark carries out four tests at increasing data sizes to produce data transfer speeds in MBytes Per Second from caches and memory. Serial and random address selections are employed, using the same program structure, with read and read/write tests using 32 bit integers. The main purpose is to demonstrate how much slower performance can be through using random access. Here, speed can be considerably influenced by reading and writing in bursts, where much of the data is not used, and by the size of preceding caches. For more details and further results see here.
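One way to run serial and random tests through the same program structure is to read the data via an index array that is either in order or shuffled. The C sketch below illustrates that idea only. It is an assumption about the approach rather than the benchmark's code, and all function names and the simple random number generator are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Serial addressing: index i selects word i. */
void fill_serial(uint32_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        idx[i] = (uint32_t)i;
}

/* Random addressing: shuffle the indexes in place (Fisher-Yates)
   with a small linear congruential generator, so that every word
   is still selected exactly once. */
void shuffle_indexes(uint32_t *idx, size_t n, uint32_t seed)
{
    uint32_t state = seed;
    for (size_t i = n; i > 1; i--) {
        state = state * 1664525u + 1013904223u;
        size_t j = (size_t)(state >> 8) % i;
        uint32_t t = idx[i - 1];
        idx[i - 1] = idx[j];
        idx[j] = t;
    }
}

/* Read test: the same loop serves both serial and random selection. */
uint32_t read_indexed(const uint32_t *data, const uint32_t *idx, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[idx[i]];
    return sum;
}
```

Timing `read_indexed` with the serial index array and again after `shuffle_indexes` would expose the cost of random access, since only the order of the addresses changes between the two runs.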
On tablet A1, with the Intel Atom processor, results for the new 32 bit version were essentially the same as the Houdini instruction conversion of original ARM code via Android 5, both averaging 30% improvement over the original Android 4 speeds on read only tests, but similar with reading and writing. The latter pattern of improvements was also apparent for 64 bit versus 32 bit benchmark modes on tablet T22, with the ARM Cortex-A53 processor, but only using cache based data. The later 32 bit benchmark produced inconsistent gains and some losses, running on the other ARM compatible systems (up to October 2015).
The benchmark code is the same as used on the Windows and Linux PC versions, with details and results here, where some of these results are also included.
Further MB per second/CPU MHz comparisons are provided below, showing the usual variability in performance.
See comments in comparison table.
The benchmarks run code for single and double precision Fast Fourier Transforms of size 1024 to 1048576 (1K to 1024K), each one being run three times to identify variance. Results are displayed and saved in a log file (FFT-tests.txt), with FFT running time in milliseconds. Besides Android, the benchmarks are available to run via Windows and Linux. Two versions are available: FFT1, the original version, and FFT3c, with optimised C code. Further details, results, and links for benchmarks and source code are in FFTBenchmarks.htm. The Android benchmarks are only available in the later 32 or 64 bit mode. Example results are below.
Version 3 Improvements - All systems produced significant gains, using the optimised benchmark, but some struggled running the smaller FFTs.
64 Bit Differences - Initially, only one tablet was available that runs at 64 bits, a Lenovo TAB 2 A8-50F using Android 5. In this case, 64 bit and 32 bit results were similar for the non-optimised version, but averaged 40% faster with the more efficient code. Later results, using Intel CPUs, produced similar performance via 32 bit and 64 bit versions.
Double and Single Precision - Using 64 bit DP numbers, instead of 32 bit for SP, can produce much slower speeds when a lower level cache space is exceeded and also through using more RAM based data. Other than these, there are slower and faster results.
Android Upgrades - First identified upgrades to Android 5 indicated better average performance but with wide variations on individual tests.
Intel/Windows 10 - 32 bit and 64 bit Intel/Windows results are now included for Atom and Core i7 CPUs.
A5 and W2 Dual Boot Tablet - Android and Windows speeds are again generally similar, except for Version 3, where W2 is faster. Again W2 results using RAM are slower than W1.
Intel CPU Windows and REMIX/Android performance was quite similar.
Single Precision and Double Precision Results in milliseconds
T7 Nexus 7 T11 VOYO A15 T21 Kindle HDX 7
Cortex-A9 1.2 GHz Cortex-A15 1.7 GHz Qualcomm 800 2.1 GHz
L1/L2 KB 32/1024 32/2048 16/2048
Android 4.1.2 Android 5.0.2 Android 4.2.2 Android 4.4.3
32 Bit 32 Bit 32 Bit 32 Bit
K Size SP DP SP DP SP DP SP DP
Version 1.0
1 0.64 0.38 0.18 0.21 0.10 0.17 0.14 0.18
2 0.77 0.97 0.40 0.67 0.22 0.36 0.33 0.53
4 1.14 1.77 1.13 1.86 0.57 0.90 1.03 1.30
8 3.28 4.40 3.26 5.12 2.12 2.31 2.50 3.09
16 7.76 9.39 7.74 9.69 4.71 5.97 1.95 2.20
32 17.80 22.26 18.09 22.73 10.76 11.37 4.18 5.77
64 61.05 140.58 41.64 84.68 20.10 49.70 14.61 20.01
128 153.19 289.15 139.98 274.54 77.67 213.70 33.19 60.52
256 450.16 645.72 444.09 645.70 408.51 448.95 107.49 310.93
512 1084.11 1457.85 1102.20 1438.29 782.85 1101.70 584.54 497.23
1024 2388.33 3129.21 2388.56 3185.93 1799.89 2280.30 875.95 963.37
Version 3c.0
1 0.66 0.21 0.27 0.25 0.23 0.08 0.35 0.07
2 1.09 0.55 0.65 0.65 0.50 0.17 0.81 0.19
4 2.67 1.38 1.67 1.45 1.07 0.41 1.66 0.41
8 3.56 3.09 4.30 3.23 2.41 0.90 1.08 0.90
16 7.78 9.08 8.33 10.35 5.26 3.23 3.36 2.66
32 17.85 22.02 19.23 25.38 11.88 8.88 6.54 6.07
64 39.52 52.11 46.41 58.90 23.75 23.08 12.57 13.56
128 89.73 118.45 103.31 128.44 49.74 53.11 27.41 33.09
256 203.34 258.56 221.99 267.12 100.25 120.66 63.39 72.55
512 437.25 552.00 464.30 558.13 226.76 264.30 150.38 156.30
1024 918.32 1175.65 933.05 1182.49 505.68 586.18 306.32 337.07
T22 Lenovo TAB 2 A8-50F P37 Lenovo Moto G4
ARM Cortex-A53 1.3 GHz ARM Cortex-A53 1.5 GHz
L1/L2 KB 32/512 32/512
Android 5.0.2 Android 6.0.1 Android 7.0
64 Bit 32 Bit 32 Bit 32 Bit
K Size SP DP SP DP SP DP SP DP
Version 1.0
1 0.20 0.21 0.21 0.21 0.21 0.21 0.17 0.18
2 0.44 0.50 0.43 0.53 0.45 0.51 0.38 0.40
4 1.06 1.26 1.03 1.24 1.16 1.33 0.90 1.17
8 2.52 3.03 2.52 2.85 2.62 2.59 2.29 2.45
16 5.89 6.41 5.68 6.60 5.06 6.09 4.95 5.64
32 14.09 25.29 13.05 30.59 14.10 30.26 11.25 27.12
64 49.97 109.32 45.80 92.16 52.78 113.24 40.72 105.27
128 188.37 256.98 153.25 221.98 173.52 256.88 160.31 236.64
256 447.62 583.33 362.62 504.60 409.24 578.50 383.80 544.43
512 826.77 1019.84 840.44 1107.14 917.86 1265.79 876.99 1198.03
1024 1846.27 2299.97 1835.82 2423.72 2047.09 2750.92 1972.58 2683.18
Version 3c.0
1 0.17 0.20 0.34 0.20 0.28 0.17 0.29 0.16
2 0.37 0.48 0.74 0.47 0.65 0.39 0.64 0.38
4 2.55 1.07 1.62 1.06 1.42 0.85 1.44 0.86
8 1.93 2.40 3.63 2.33 3.35 1.95 3.25 1.95
16 4.59 5.64 8.07 9.12 8.20 8.13 6.95 7.86
32 10.68 15.40 18.20 22.93 15.99 18.95 15.93 19.43
64 28.17 36.16 45.33 50.41 37.84 43.62 37.29 42.46
128 66.87 82.23 101.38 112.46 84.06 96.71 83.55 95.01
256 148.69 193.91 222.13 264.79 190.32 217.23 186.20 213.21
512 347.25 424.72 501.52 550.88 425.97 474.15 416.25 462.13
1024 760.74 960.28 1085.65 1206.83 928.38 1026.33 897.72 1001.54
Intel CPUs Android
Dual Boot with W2
A1 Asus MemoPad 7 A5 Teclast X98 Plus
Atom Z3745 1.86 GHz Atom Z8300 1.84 GHz
L1/L2/L3 24/1024 KB 24/1024/0
Android 4.4.2 Android 5.0 Android 5.1
32 Bit 32 Bit 32 Bit
K Size SP DP SP DP SP DP
Version 1.0
1 0.09 0.11 0.10 0.09 0.09 0.12
2 0.21 0.29 0.16 0.23 0.18 0.31
4 0.61 0.66 0.48 0.52 0.61 0.57
8 1.35 1.17 1.07 1.17 1.17 1.56
16 3.20 2.57 2.38 2.59 3.15 3.34
32 5.41 5.75 5.30 6.02 6.65 9.20
64 11.74 29.95 11.77 28.31 15.62 45.48
128 67.54 99.31 54.05 97.58 49.67 110.14
256 194.13 225.94 189.11 219.98 222.78 264.65
512 438.49 501.59 433.06 487.49 521.72 602.38
1024 970.84 1121.61 968.37 1116.94 1187.13 1433.75
Version 3c.0
1 0.09 0.08 0.10 0.08 0.15 0.13
2 0.21 0.20 0.16 0.20 0.20 0.21
4 0.50 0.43 1.66 0.43 0.45 0.52
8 1.12 0.96 0.87 0.96 0.97 1.05
16 2.64 2.86 2.01 2.34 2.14 2.61
32 4.87 5.56 4.51 5.73 4.82 6.53
64 11.11 15.03 10.01 14.47 11.10 17.79
128 27.29 34.77 26.80 33.71 29.95 43.74
256 62.57 72.93 61.16 72.04 77.43 86.13
512 132.64 157.56 131.10 152.68 152.95 185.74
1024 282.99 332.37 274.01 363.60 314.54 460.91
Intel CPUs - Windows or Windows and Android
W2 Teclast X98 Plus
Atom Z8300 1.84 GHz
KB 24/1024/0
Windows 10
32 Bit 64 Bit
K Size SP DP SP DP
Version 1.0
1 0.11 0.12 0.10 0.12
2 0.24 0.34 0.22 0.33
4 0.65 0.74 0.72 0.74
8 1.46 1.66 1.37 1.68
16 3.25 3.61 3.21 3.78
32 7.33 8.10 6.98 7.97
64 16.40 28.29 15.96 29.96
128 38.56 121.13 76.10 136.39
256 232.47 266.35 259.73 298.24
512 565.20 597.42 596.50 629.28
1024 1205.59 1450.84 1288.20 1439.44
Version 3c.0
1 0.08 0.09 0.09 0.08
2 0.19 0.23 0.18 0.19
4 0.45 0.51 0.48 0.43
8 1.00 1.12 1.08 0.93
16 2.67 2.68 2.51 2.50
32 5.54 5.59 5.74 6.06
64 10.64 14.72 12.54 14.77
128 32.82 36.71 28.28 36.95
256 66.71 77.48 67.25 78.47
512 157.72 153.43 150.14 168.63
1024 332.39 365.36 300.79 370.48
W1 Pipo W1S Tablet R1/W1 Pipo W1S Tablet
Atom Z8300 1.84 GHz Atom Z8300 1.84 GHz
L1/L2/L3 KB 24/1024/0 KB 24/1024/0
Windows 10 REMIX/Android
32 bit 64 bit 32 bit 64 bit
K Size SP DP SP DP SP DP SP DP
Version 1.0
1 0.11 0.12 0.10 0.12 0.31 0.37 0.29 0.37
2 0.24 0.45 0.23 0.35 0.84 0.85 0.65 1.04
4 0.67 0.75 0.63 0.74 1.52 1.46 1.91 2.37
8 1.44 1.80 1.50 1.69 2.56 2.65 4.26 5.31
16 3.29 3.71 3.16 3.65 4.46 3.59 7.42 6.24
32 7.32 7.83 5.94 6.98 6.12 7.93 8.26 6.98
64 14.36 31.51 13.95 25.44 13.03 35.52 17.14 32.47
128 46.45 120.79 50.90 115.44 69.30 105.02 73.44 117.75
256 209.39 235.36 203.02 266.34 228.05 244.75 237.24 295.39
512 455.89 534.68 491.49 576.91 536.19 620.66 502.33 626.71
1024 1024.78 1195.81 1040.39 1182.20 1086.25 1287.63 1039.91 1209.47
Version 3c.0
1 0.08 0.08 0.08 0.09 0.16 0.08 0.26 0.08
2 0.19 0.20 0.20 0.22 0.37 0.21 0.60 0.23
4 0.46 0.44 0.46 0.48 0.89 0.46 1.45 0.44
8 1.20 0.97 1.06 1.07 1.58 1.03 3.21 0.97
16 2.27 2.26 2.26 2.25 3.21 2.53 7.37 2.29
32 5.11 5.54 5.31 5.83 5.28 6.13 11.42 5.62
64 12.48 14.29 11.22 15.59 12.13 18.74 13.93 14.66
128 27.62 34.25 27.47 31.65 31.28 37.99 28.97 31.81
256 71.32 70.99 62.74 67.95 72.23 81.63 57.01 66.84
512 143.07 144.60 140.50 146.76 155.62 196.93 122.36 140.30
1024 298.00 322.13 289.98 334.07 295.55 450.03 271.67 302.49
PC 2015 Top End Desktop PC R2/PC
Corei7-4820K 3.9 GHz Corei7-4820K 3.9 GHz
L1/L2/L3 32/256/10 MB 32/256/10 MB
Windows 10 REMIX/Android
32 bit 64 bit 32 bit 64 bit
K Size SP DP SP DP SP DP SP DP
Version 1.0
1 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.018
2 0.04 0.04 0.04 0.04 0.05 0.04 0.03 0.041
4 0.09 0.12 0.08 0.12 0.10 0.13 0.13 0.181
8 0.26 0.31 0.25 0.30 0.29 0.32 0.38 0.398
16 0.65 0.77 0.62 0.76 0.71 0.81 0.88 0.936
32 1.59 1.96 1.51 1.93 1.69 1.99 2.11 2.506
64 4.33 4.87 3.91 4.78 4.06 4.41 4.78 5.037
128 9.94 10.57 9.21 10.60 9.19 9.92 9.31 9.772
256 21.87 22.00 21.01 22.06 20.68 21.92 19.70 21.974
512 45.09 55.15 44.72 58.29 45.07 52.85 43.68 56.312
1024 105.75 199.77 111.23 199.11 106.39 188.55 110.34 176.725
Version 3c.0
1 0.02 0.02 0.01 0.01 0.02 0.02 0.01 0.018
2 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.04
4 0.07 0.08 0.06 0.07 0.07 0.08 0.06 0.09
8 0.16 0.18 0.14 0.16 0.16 0.17 0.22 0.199
16 0.37 0.41 0.33 0.38 0.39 0.45 0.47 0.402
32 0.81 0.86 0.73 0.82 0.85 0.96 1.11 0.873
64 1.76 1.86 1.56 1.75 1.82 2.05 2.18 1.888
128 3.77 4.05 3.38 3.76 3.94 4.36 4.45 4.047
256 8.24 9.36 7.38 8.78 8.47 9.78 8.66 9.282
512 19.09 22.96 17.28 22.50 19.52 24.29 17.74 23.361
1024 45.68 57.37 42.19 56.66 47.35 57.59 43.23 56.682
For more information on the Whetstone Benchmark, see the stand alone version above. The multithreading version runs multiple copies of the same shared code, with separate variables. In this case, performance on each of the eight test functions, and the overall MWIPS rating, is invariably close to proportional to the number of CPU cores available. The driving program checks that calculations on every thread produce consistent numeric results.
The gcc 4.8 based ARM/Intel version, running on the Intel Atom tablet, is rated at twice the speed of the original, due to the use of native code. The fixed point results indicate overoptimisation, but that test uses little of the overall time, which depends mainly on the Cos, Exp and third MFLOPS tests. Running the original ARM converted code version via Android 5.0 mainly produced better performance but an overall lower rating, due to slower Cos and Exp tests, the same as the stand alone version above.
Also the same as the stand alone version, the new native ARM program was generally slower when running on tablets T7, T11 and T21.
On T22, with the Cortex-A53 CPU, the new 32 bit single thread tests appeared to be slower than the stand alone version, but that was not the case at 64 bits, apparently indicating a 64 bit performance gain.
A5 and W2 Dual Boot Tablet - Android and Windows speeds are significantly different on some tests because of the different compilers, particularly their optimisation, but these tests do not affect the overall MWIPS results much. The latter average 18% faster via Android, but both show 2 and 4 thread performance gains of around 1.9 and 3.5 times.
Intel CPU Windows and REMIX Android, 32 bit and 64 bit versions - overall MWIPS ratings were all quite similar on a Core i7 (PC/R2) and also on an Atom (W1/R1), but there were variations on individual tests, due to the different compilers and instructions used.
MP Efficiency - For those with four cores, average throughput, compared with one core, was 4.0 times on the Core i7 with REMIX and Windows, 3.5 times on the Atom with Windows, 2.7 times with REMIX and 3.7 times with Android, then 3.9 times with ARM/Android. The Core i7 (with Hyperthreading) recorded 6.9 times with 8 threads, and the 8 core P37 6.5 times (cores 1 to 4 at 1.5 GHz and 5 to 8 at 1.2 GHz).
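The kind of ratio behind the MP Efficiency note can be sketched in Python as below, using the MWIPS column of two of the result listings in this section. The function name mwips_gain and the data layout are assumptions for this example. The averages quoted in the text may have been derived differently, for example across the individual tests, so these MWIPS only ratios need not match them exactly.

```python
# MWIPS ratings copied from the result listings in this section:
# thread count -> MWIPS.
MWIPS = {
    "T7 Original": {1: 1033.7, 2: 2058.1, 4: 4122.8, 8: 4163.2},
    "P37 32 Bit Android 6.0.1": {1: 1050.5, 2: 2134.1, 4: 4214.0, 8: 7490.8},
}


def mwips_gain(ratings):
    """Return throughput gains relative to the single thread rating."""
    base = ratings[1]
    return {threads: value / base for threads, value in ratings.items()}


# Hypothetical helper output: system -> {thread count -> gain}.
gains = {system: mwips_gain(ratings) for system, ratings in MWIPS.items()}

for system, by_threads in gains.items():
    summary = ", ".join(f"{t}T x{g:.2f}" for t, g in by_threads.items())
    print(f"{system}: {summary}")
```

On these figures, the four core T7 gains little beyond 4 threads, while the eight core P37 continues to scale at 8 threads.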
##################### T7 Original ######################
T7, ARM Cortex-A9 1300 MHz, Android 4.1.2,
Measured 1200 MHz
Android MP-Whetstone Benchmark V1.0 17-Oct-2012 13.49
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1033.7 247.4 235.4 266.0 25.3 15.0 448.4 630.9 513.5
2T 2058.1 456.3 473.0 532.4 50.0 30.1 898.1 1198.4 1026.6
4T 4122.8 831.9 944.7 1064.6 100.7 60.1 1797.0 2392.2 2053.4
8T 4163.2 1016.0 948.2 1069.5 101.8 60.9 1808.0 2414.2 2051.5
Overall Seconds 5.28 1T, 5.34 2T, 5.42 4T, 10.81 8T
#################### T7 ARM-Intel #####################
ARM/Intel MP-Whetstone Benchmark V1.1 30-Apr-2015 21.32
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 602.2 242.3 242.3 140.2 27.2 4.9 482.8 1425.2 239.1
2T 1208.7 481.2 484.2 280.8 55.0 9.9 970.0 2869.6 478.7
4T 2398.7 805.4 966.7 562.5 109.5 19.5 1938.2 5722.5 957.1
8T 2429.1 974.6 1076.2 562.4 110.9 19.7 1981.5 5816.1 963.6
Overall Seconds 4.94 1T, 4.93 2T, 5.08 4T, 9.93 8T
#################### T11 Original #####################
T11 Samsung EXYNOS 5250 2.0 GHz Cortex-A15, Android 4.2.2
Measured 1.7 GHz
Android MP-Whetstone Benchmark V1.1 06-Sep-2013 12.49
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1308.2 345.9 379.0 294.1 30.8 17.2 1351.4 1265.7 843.1
2T 2886.6 782.1 782.6 614.0 80.1 34.3 2775.2 2463.7 1667.5
4T 3086.0 998.6 788.1 610.6 79.2 44.5 3472.0 2526.4 2191.4
8T 2930.0 788.2 843.5 616.5 80.5 35.0 2846.0 2799.1 1686.2
Overall Seconds 3.54 1T, 3.30 2T, 6.62 4T, 13.16 8T
#################### T11 ARM-Intel ####################
ARM/Intel MP-Whetstone Benchmark V1.1 30-Apr-2015 21.23
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 837.2 340.1 341.7 191.2 39.1 6.2 1521.1 2532.8 629.3
2T 1676.2 596.2 683.2 387.3 77.8 12.4 3056.9 5055.1 1263.6
4T 1697.7 687.5 869.4 394.5 78.1 12.4 2980.7 6518.4 1258.8
8T 1685.2 685.9 691.0 389.7 78.3 12.4 3086.3 5113.7 1262.0
Overall Seconds 4.06 1T, 4.07 2T, 8.12 4T, 16.19 8T
#################### T21 Original #####################
T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Android MP-Whetstone Benchmark V1.1 06-Jul-2015 10.42
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1877.1 645.2 642.6 524.1 44.0 22.3 1364.7 1572.1 898.9
2T 3668.6 1220.2 1262.4 1021.9 85.9 43.8 2663.5 3078.4 1753.4
4T 7426.9 2375.5 2474.7 2097.7 175.7 88.2 5052.6 6240.4 3555.0
8T 7706.6 2692.2 2746.2 2186.9 180.1 90.3 5822.5 6902.7 3681.3
Overall Seconds 4.44 1T, 4.62 2T, 4.64 4T, 9.00 8T
#################### T21 ARM-Intel ####################
ARM/Intel MP-Whetstone Benchmark V1.1 22-Jul-2015 12.02
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1598.0 512.1 508.7 311.7 43.6 22.1 1142.9 2123.3 598.4
2T 3161.2 960.0 996.7 614.2 86.7 43.8 2258.9 3820.9 1194.7
4T 6348.0 1593.5 2019.5 1231.5 174.2 88.5 4471.1 8139.4 2398.3
8T 6419.6 2058.2 2077.5 1252.6 175.0 88.7 4520.9 8875.0 2409.0
Overall Seconds 4.88 1T, 5.00 2T, 5.05 4T, 9.92 8T
###################### P37 32 Bit ######################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache
ARM/Intel MP-Whetstone Benchmark V1.2 14-Nov-2016 11.41
Compiled for 32 bit ARM v7a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1050.5 304.5 268.3 171.7 35.2 17.7 459.4 905.5 338.1
2T 2134.1 540.5 524.8 350.5 68.1 34.9 1316.8 1881.0 679.3
4T 4214.0 1090.4 1022.0 689.4 136.1 70.4 2283.5 3850.4 1348.4
8T 7490.8 1969.8 1759.1 1243.8 244.5 125.3 4038.0 6074.2 2392.9
Overall Seconds 4.67 1T, 4.65 2T, 4.71 4T, 5.75 8T
Android 7.0
ARM/Intel MP-Whetstone Benchmark V1.2 11-May-2017 10.28
Compiled for 32 bit ARM v7a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1069.2 300.7 252.9 176.7 31.5 19.4 646.9 942.2 338.6
2T 2103.2 543.2 490.9 343.7 64.1 38.7 1101.2 1830.5 675.9
4T 4212.2 1072.1 958.5 686.7 128.7 77.5 2251.5 3802.1 1354.9
8T 7564.2 1931.6 1744.2 1242.6 231.8 137.1 4243.9 6856.4 2461.7
Overall Seconds 3.99 1T, 4.06 2T, 4.06 4T, 4.94 8T
###################### T22 32 Bit ######################
T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel MP-Whetstone Benchmark V1.2 10-Aug-2015 11.30
Compiled for 32 bit ARM v7a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 676.4 275.9 281.9 147.9 35.4 5.3 600.3 901.0 285.5
2T 1362.5 533.8 561.7 298.0 70.9 10.8 1203.1 1838.9 574.0
4T 2698.6 903.9 1071.7 594.4 141.2 21.5 2346.1 3305.5 1138.5
8T 2830.1 1463.2 1393.0 614.2 152.5 21.9 3243.9 4418.3 1171.4
Overall Seconds 4.95 1T, 4.94 2T, 5.11 4T, 10.09 8T
###################### T22 64 Bit ######################
ARM/Intel MP-Whetstone Benchmark V1.2 10-Aug-2015 11.34
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1524.8 328.6 348.8 297.6 37.3 19.9 1462579 1867.2 1238.0
2T 3062.5 688.8 697.9 596.0 75.5 39.8 2097113 3726.7 2481.3
4T 6085.4 1214.9 1360.5 1185.4 150.5 79.4 2449153 7055.0 4951.8
8T 6222.4 1495.2 1545.6 1204.2 152.2 80.6 3869846 9218.8 5154.1
Overall Seconds 4.92 1T, 4.90 2T, 5.05 4T, 9.97 8T
#################### A1 Original #######################
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
Android MP-Whetstone Benchmark V1.1 04-Feb-2015 11.39
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 953.7 363.0 382.4 267.8 21.0 13.2 413.1 1842.4 392.3
2T 1921.2 726.0 663.5 541.4 42.6 27.0 816.1 3662.6 793.3
4T 3820.6 1419.2 1514.6 1081.5 84.1 54.0 1543.8 6292.4 1588.5
8T 4003.8 1912.9 1872.4 1114.1 86.5 56.4 2053.1 8292.6 1599.7
Overall Seconds 4.88 1T, 4.87 2T, 4.96 4T, 10.05 8T
################## A1 V1 Android 5.0 ###################
Android MP-Whetstone Benchmark V1.1 05-Nov-2015 11.06
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 748.8 405.9 411.8 367.0 11.3 11.1 898.0 2129.1 459.8
2T 1468.5 822.0 827.5 744.8 22.4 22.2 1088.8 4228.4 924.5
4T 2781.0 1242.8 1638.6 1415.5 40.3 44.3 3404.6 8283.2 1852.1
8T 3050.7 1854.5 1831.0 1566.7 45.4 45.3 4519.7 10332.5 1844.5
Overall Seconds 5.00 1T, 5.09 2T, 5.72 4T, 10.30 8T
#################### A1 ARM-Intel ######################
ARM/Intel MP-Whetstone Benchmark V1.1 30-Apr-2015 17.35
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1916.9 691.4 691.3 497.2 35.3 27.6 10209.8 2787.3 1351.8
2T 3800.3 1377.6 1381.2 980.0 70.1 54.7 20248.0 5252.8 2748.7
4T 7604.9 2713.2 2711.8 1977.1 140.2 110.0 33906.3 9526.5 5550.8
8T 7798.1 3141.5 3627.2 2064.2 141.2 110.2 59590.6 12743.7 5711.5
Overall Seconds 4.94 1T, 5.00 2T, 5.06 4T, 10.11 8T
########### A5 ARM-Intel Dual Boot With W2 #############
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
ARM/Intel MP-Whetstone Benchmark V1.2 14-Apr-2016 17.09
Compiled for 32 bit Intel x86
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 2121.9 695.0 695.7 483.5 39.6 34.8 10102.2 2700.8 1358.9
2T 4123.2 1319.0 1351.2 903.1 78.9 67.2 19593.6 5336.0 2604.5
4T 7368.1 2394.0 2375.9 1668.8 139.0 119.8 35711.8 9359.2 4603.0
8T 7391.0 2397.4 2769.0 1658.4 137.7 121.8 36643.4 9953.9 4670.9
Overall Seconds 4.88 1T, 5.04 2T, 5.84 4T, 11.52 8T
#################### W1 REMIX 32 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-Whetstone Benchmark V1.2 21-Oct-2016 14.34
Compiled for 32 bit Intel x86
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1929.0 566.4 615.3 440.7 38.1 28.7 9518.0 2440.1 1235.3
2T 3528.5 912.9 1188.8 832.1 65.0 57.9 13330.0 4114.1 2272.6
4T 5295.0 1821.0 1784.7 1305.4 95.6 88.5 23671.1 6465.3 3461.3
8T 6406.2 2158.8 2247.6 1588.9 128.2 117.4 24747.2 8243.7 4403.3
Overall Seconds 4.81 1T, 5.38 2T, 7.72 4T, 14.07 8T
#################### W1 REMIX 64 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-Whetstone Benchmark V1.2 11-Nov-2016 21.33
Compiled for 64 bit Intel x86_64
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 2189.0 524.1 488.1 402.0 44.7 41.7 1351656.1 1894.8 1758.8
2T 4036.7 1108.5 1178.5 780.0 78.1 73.2 4361015.9 4752.1 3140.7
4T 5652.4 1694.5 1270.9 1191.6 111.8 95.4 2680231.8 5593.2 4688.4
8T 7075.1 2126.0 2068.2 1522.4 147.6 134.8 3600866.1 6987.4 5694.7
Overall Seconds 4.84 1T, 5.22 2T, 8.26 4T, 14.49 8T
################# W1 Windows 10 32 bit #################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x86
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1816.7 568.3 580.3 477.8 34.9 26.9 1395.8 1100.4 7327.8
2T 3469.7 1145.9 1086.9 905.6 66.1 52.6 2684.4 2118.7 13383.7
4T 6337.0 2026.1 2029.6 1658.4 121.2 95.1 4886.7 3800.8 24933.3
8T 6900.2 2162.4 2326.0 1870.2 134.7 98.8 6089.9 4071.4 29659.9
Overall Seconds 4.80 1T, 5.02 2T, 5.53 4T, 13.07 8T
################# W1 Windows 10 64 bit #################
MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x64
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1994.3 537.7 536.4 476.9 42.3 28.8 1420.0 1099.2 7305.8
2T 3760.6 1080.6 1075.4 894.9 79.9 53.3 2842.5 2115.5 12762.4
4T 6946.5 1850.0 1883.3 1655.9 146.8 101.3 4946.3 3787.9 25246.0
8T 7556.2 1891.4 2159.3 1867.7 163.1 104.8 5362.5 4283.3 26001.8
Overall Seconds 4.89 1T, 5.19 2T, 5.66 4T, 13.26 8T
######## W2 Windows 10 32 bit Dual Boot With A5 ########
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x86
Start of test Fri Apr 15 16:28:12 2016
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1776.5 561.5 581.1 466.3 34.1 26.2 1402.2 1093.2 6981.4
2T 3364.9 1014.1 1020.8 832.8 65.6 51.6 2643.0 2027.2 12415.1
4T 6316.1 1987.1 2016.5 1655.2 121.2 94.2 4860.8 3793.2 24941.8
8T 6563.4 2372.8 2031.4 1850.4 122.8 96.6 5667.8 3844.8 28561.7
Overall Seconds 4.75 1T, 5.06 2T, 5.39 4T, 11.56 8T
######## W2 Windows 10 64 bit Dual Boot With A5 ########
MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x64
Start of test Fri Apr 15 16:38:09 2016
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 1954.1 506.3 538.0 469.7 40.4 29.1 1411.3 1091.8 7280.9
2T 3615.7 1011.5 989.7 873.6 77.1 51.7 2477.6 1907.0 13107.0
4T 6941.8 1877.9 1879.3 1652.7 147.1 100.9 4946.8 3789.6 25046.5
8T 7124.5 2128.2 1975.4 1705.5 149.7 103.3 5058.7 4284.8 28862.8
Overall Seconds 4.95 1T, 5.36 2T, 5.59 4T, 11.72 8T
==================================================================
Top end 2015 PC - Core i7-4820K at 3.9 GHz
==================================================================
32 Bit
MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x86
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 5273.9 1114.8 1119.2 921.1 129.4 90.7 3404.0 5351.3 22213.6
2T 11031.8 2238.4 2304.9 1938.0 271.1 189.4 6973.5 11713.2 46821.3
4T 21347.8 4713.1 4718.0 3879.9 493.4 375.2 14335.7 21161.6 89584.4
8T 39679.6 9374.0 9397.5 7687.6 874.8 726.5 24631.8 23418.6 93465.8
Overall Seconds 4.97 1T, 4.76 2T, 4.99 4T, 5.59 8T
64 Bit
MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x64
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 6200.6 1236.5 1236.2 870.8 206.0 108.8 3359.1 4767.4 23413
2T 13050.4 2603.8 2606.2 1891.4 432.6 217.5 7076.8 10041.6 46840
4T 25336.0 5195.2 5211.7 3707.1 832.8 422.9 13626.9 16962.6 78346
8T 46141.7 10293.2 10379.0 7242.4 1332.7 814.2 24394.5 23451.3 93588
Overall Seconds 4.82 1T, 4.60 2T, 4.91 4T, 5.50 8T
#################### PC REMIX 32 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-Whetstone Benchmark V1.2 21-Oct-2016 12.50
Compiled for 32 bit Intel x86
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 5425.2 1343.1 1343.4 868.1 131.8 87.8 55255 11089 4899
2T 10969.5 2773.7 2475.7 1735.9 274.5 175.4 114023 23300 10637
4T 22989.7 5587.5 5609.8 3889.2 547.5 362.3 131855 44619 19739
8T 41099.9 10957 10752 7683.9 881.4 702.7 235813 46954 23348
Overall Seconds 4.91 1T, 4.80 2T, 4.74 4T, 5.76 8T
#################### PC REMIX 64 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-Whetstone Benchmark V1.2 11-Nov-2016 14.38
Compiled for 64 bit Intel x86_64
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 6033.0 1343.3 1342.8 831.9 162.5 109.3 33291632 11076 5540
2T 12746.2 2673.1 2827.1 1834.6 330.7 231.9 30432979 23301 9592
4T 25953.4 5598.0 5642.9 3788.8 662.1 473.8 44736026 34693 23308
8T 46218.9 11093 11108 7685.5 1035 889.3 99650183 46841 23415
Overall Seconds 5.14 1T, 5.07 2T, 5.04 4T, 6.10 8T
For further details, see Dhrystone Benchmark above and, for further results, Android MultiThreading Benchmark Apps. This multithreading benchmark runs using 1, 2, 4 and 8 threads, executing multiple copies of the same program. An initial calibration, using a single thread, determines the number of passes needed for an overall execution time of 1 second. Then all threads are run using the same pass count, with running time being extended when there are more threads than CPUs. The same calculations are carried out on each thread. Separate data arrays are used for each thread, but some variables can be used by all threads. The latter is probably responsible for the failure to increase throughput much using multiple threads or, in the case of A1 with the Atom CPU, for reduced throughput using more than one thread.
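The calibrate, then run all threads with the same pass count, procedure described above can be sketched in Python as follows. This is not the benchmark's source code: work is a hypothetical stand-in for the Dhrystone pass loop, and the names calibrate and run_threads are assumptions for this example. Standard CPython threads do not execute Python code in parallel, so the sketch shows the control flow and the consistency check only, not real multi-core scaling.

```python
import threading
import time


def work(passes):
    """Hypothetical stand-in for the benchmark pass loop; returns a checksum."""
    total = 0
    for i in range(passes):
        total = (total + i * i) % 1000003
    return total


def calibrate(target_seconds=1.0, start_passes=1000):
    """Estimate the pass count one thread needs to run for target_seconds."""
    passes = start_passes
    while True:
        start = time.perf_counter()
        work(passes)
        elapsed = time.perf_counter() - start
        # Once the trial is long enough to time reliably, scale it up.
        if elapsed >= target_seconds / 10:
            return max(1, int(passes * target_seconds / elapsed))
        passes *= 2


def run_threads(num_threads, passes):
    """Run the same pass count on every thread; return results and seconds."""
    results = [None] * num_threads

    def worker(slot):
        # Each thread writes only to its own slot (separate data per thread).
        results[slot] = work(passes)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start
```

A driver would call calibrate once, then run_threads for 1, 2, 4 and 8 threads with the returned pass count, and compare the per thread checksums to confirm that every thread produced the same numeric result.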
On all the initial results shown, there was little difference in performance between the original and the new 32 bit version but T22, with the Cortex-A53, produced significant gains at 64 bits.
T21, the Kindle Fire with a Quad Core Qualcomm Snapdragon 800 CPU, failed to run using the new ARM/Intel version, and obtained a rather excessive score with 8 threads via the original benchmark (but similar to a possible 4 x 2850).
ARM vs Intel MP - Note that the systems using ARM processors increased performance with multiple threads but those with Intel CPUs did not.
32 Bit vs 64 Bit - The latter was typically 70% faster via Android and REMIX/Android but much less using the Windows compilations.
VAX MIPS or DMIPS
Threads
System CPU MHz Android 1 2 4 8 None
See
Original ARM Version
A1 Z3745 1866 x4 4.4.2 2360 1394 1334 1321 1840
A1 Z3745 1866 x4 5.0 2411 1633 1313 1298 2488
T7 v7-A9 1200 x4 4.1.2 1584 2749 3836 3569 1610
T22 v8-A53 1300 x4 5.0.2 1686 2943 4232 4323 1683
T11 v7-A15 1700 x2 4.2.2 2271 4281 4326 4171 3189
T21 QU-800 2150 x4 4.4.3 2850 4395 7736 11821 3854
ARM/Intel 32 Bit Version
A1 Z3745 1866 x4 4.4.2 2365 1322 1323 1319 2451
A5 ## Z8300 1840 x4 5.1 2256 1155 1163 1054 2318
T7 v7-A9 1200 x4 4.1.2 1464 2399 3575 3737 1317
T22 v8-A53 1300 x4 5.0.2 1412 2559 4038 4291 1423
P37 v8-A53 1500 x8 6.0.1 1720 2923 4839 2618 1649
P37 v8-A53 1500 x8 7.0 1575 2899 4955 2697 1722
T11 v7-A15 1700 x2 4.2.2 2295 4057 3902 4096 2551
T21 QU-800 2150 x4 4.4.3 Failed to run 3319
P38 v8-A57 2700 x4 6.0.1 3094 5612 6849 3776
+V8-A53 1300 x4
R1=Atm Z8300 1840 x4 6.0.1 2174 1150 1170 1139 2390
R2 Core i7 3900 x4 6.0.1 9919 5685 5305 6076 10489
ARM/Intel 64 Bit Version
T22 v8-A53 1300 x4 5.0.2 2548 4311 5560 5613 2569
R1=Atm Z8300 1840 x4 6.0.1 3900 1677 1709 1666 3769
R2 Core i7 3900 x4 6.0.1 16740 7595 7271 8612 17003
Intel/Windows 32 Bit Version
W1 Z8300 1840 x4 Win10 3284 1477 1235 1313 3044
W2 ## Z8300 1840 x4 Win10 2521 1730 1333 1285 2906
PC Core i7 3900 x4 Win10 12776 7175 6116 7876 12090
Intel/Windows 64 Bit Version
W1 Z8300 1840 x4 Win10 3745 1625 1400 1436 3291
W2 ## Z8300 1840 x4 Win10 3717 1566 1386 1441 3195
PC Core i7 3900 x4 Win10 15129 8535 7278 8769 11686
## A5 and W2 Same Dual Boot Tablet
=Atm R1 and W1 Same Tablet
R2 and PC Same PC
R1 and R2 Android via REMIX
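The 32 Bit vs 64 Bit comparison can be checked against the single thread column of the table above with a short Python sketch. The function name percent_gain and the data layout are assumptions for this example, and the values are copied from the ARM/Intel and Intel/Windows rows.

```python
# Single thread VAX MIPS (DMIPS) copied from the table above:
# system -> (32 bit rating, 64 bit rating).
ONE_THREAD_DMIPS = {
    "T22 Android": (1412, 2548),
    "R1 REMIX/Android": (2174, 3900),
    "R2 REMIX/Android": (9919, 16740),
    "W1 Windows": (3284, 3745),
    "W2 Windows": (2521, 3717),
    "PC Windows": (12776, 15129),
}


def percent_gain(rating_32, rating_64):
    """Return the 64 bit gain over 32 bit as a percentage."""
    return (rating_64 / rating_32 - 1.0) * 100.0


# Hypothetical helper output: system -> percentage gain at 64 bits.
gain_64 = {name: percent_gain(r32, r64)
           for name, (r32, r64) in ONE_THREAD_DMIPS.items()}

for name, gain in gain_64.items():
    print(f"{name:18s} 64 bit gain {gain:5.1f}%")
```

On these single thread figures, the Android and REMIX/Android gains fall between roughly 69% and 80%, and the Windows gains between roughly 14% and 47%.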
This is a multithreading version of the above. Further details and results can be found here. The benchmark is run on 100x100, 500x500 and 1000x1000 matrices using 0, 1, 2 and 4 separate threads, the programming code for zero threads being the same as the above example. Multithreading performance, using this standard linear equation solver, is severely degraded due to overheads. The zero thread results are the only ones of real use; the others are fairly constant, probably running one thread at a time and limited by RAM speed.
Performance of A1, with the Intel CPU and using native Intel compilation, is shown to be twice as fast as the Houdini ARM to Intel converted version, except at N = 1000, which is mainly dependent on calculations from data in RAM. Then, when running the ARM only version, using Android upgraded to 5.0, the performance difference was considerably reduced.
On ARM CPUs, speeds obtained from 32 bit and 64 bit compilations were similar, because the programs use a limited number of identical NEON intrinsic functions. For the same reason, the new ARM/Intel version produced similar results to the original.
32 Bit vs 64 bit - Results from 64 bit versions were generally slightly faster than those compiled for 32 bits.
Android vs Windows - Intel based Android and REMIX/Android speeds were around three times faster than Windows results on the Atom CPU and twice as fast on the Core i7.
The program checks that the same numeric results are produced, irrespective of the number of threads used, at each matrix size. However, due to rounding effects, these differ slightly between ARM and Intel hardware, as shown below.
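To put a number on the threading overhead, the Python sketch below divides the unthreaded MFLOPS by the one thread MFLOPS for the T7 Original listing in this section. The function name slowdown and the data layout are assumptions for this example.

```python
# MFLOPS copied from the T7 Original listing in this section:
# matrix size N -> (no threads, 1 thread).
T7_ORIGINAL_MFLOPS = {
    100: (413.47, 45.95),
    500: (253.08, 187.51),
    1000: (148.76, 135.49),
}


def slowdown(unthreaded_mflops, threaded_mflops):
    """Return how many times slower the threaded run is."""
    return unthreaded_mflops / threaded_mflops


# Hypothetical helper output: matrix size N -> slowdown factor.
overhead = {n: slowdown(none, one)
            for n, (none, one) in T7_ORIGINAL_MFLOPS.items()}

for n, factor in overhead.items():
    print(f"N {n:5d}  threaded run is {factor:.2f} times slower")
```

On these figures, the penalty is about 9 times at N = 100 and falls to about 1.1 times at N = 1000, where the run time is dominated by work on data in RAM.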
MFLOPS 0 to 4 Threads, N 100, 500, 1000
##################### T7 Original ######################
Android Linpack NEON SP MP Benchmark 31-Jan-2013 12.14
T7, ARM Cortex-A9 1300 MHz, Android 4.1.2,
Threads None 1 2 4
N 100 413.47 45.95 48.22 48.34
N 500 253.08 187.51 189.69 189.94
N 1000 148.76 135.49 136.08 136.17
#################### T7 ARM-Intel #####################
ARM/Intel Linpack NEON SP MP Benchmark 14-May-2015 15.40
Threads None 1 2 4
N 100 385.49 28.79 29.06 29.25
N 500 272.07 184.85 183.70 183.18
N 1000 147.09 131.92 132.44 130.05
#################### T11 Original #####################
Android Linpack NEON SP MP Benchmark 13-Aug-2013 23.28
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Threads None 1 2 4
N 100 1399.82 54.86 55.31 54.66
N 500 1154.21 434.16 434.06 436.97
N 1000 571.26 482.57 487.25 485.80
#################### T11 ARM-Intel ####################
ARM/Intel Linpack NEON SP MP Benchmark 14-May-2015 15.44
Threads None 1 2 4
N 100 1497.90 61.13 63.13 61.87
N 500 1399.10 491.49 489.29 494.69
N 1000 586.14 499.00 504.97 497.49
#################### T21 Original #####################
T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Android Linpack NEON SP MP Benchmark 26-Jul-2015 11.46
Threads None 1 2 4
N 100 1311.08 12.38 12.93 15.05
N 500 2271.56 344.04 419.52 381.73
N 1000 837.30 540.99 523.52 564.87
#################### T21 ARM-Intel ####################
ARM/Intel Linpack NEON SP MP Benchmark 26-Jul-2015 11.51
Threads None 1 2 4
N 100 1308.07 14.89 11.77 11.63
N 500 2341.17 407.96 481.02 415.12
N 1000 901.21 551.80 566.77 564.31
###################### P37 32 Bit ######################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache
ARM/Intel Linpack NEON SP MP Benchmark 1.2 14-Nov-2016 12.09
Compiled for 32 bit ARM v7a
Threads None 1 2 4
N 100 555.85 26.39 26.62 26.78
N 500 459.23 224.55 207.08 217.47
N 1000 359.47 270.92 275.58 272.08
Android 7.0
ARM/Intel Linpack NEON SP MP Benchmark 1.2 09-May-2017 11.18
Compiled for 32 bit ARM v7a
Threads None 1 2 4
N 100 560.74 25.96 26.35 26.41
N 500 501.69 234.14 237.16 236.78
N 1000 393.49 305.86 310.71 309.85
###################### T22 32 Bit ######################
T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel Linpack NEON SP MP Benchmark 1.2 13-Aug-2015 12.52
Compiled for 32 bit ARM v7a
Threads None 1 2 4
N 100 460.74 22.35 23.16 23.82
N 500 480.63 336.52 339.94 303.66
N 1000 470.02 405.86 403.01 405.98
###################### T22 64 Bit ######################
ARM/Intel Linpack NEON SP MP Benchmark 1.2 13-Aug-2015 12.57
Compiled for 64 bit ARM v8a
Threads None 1 2 4
N 100 548.67 27.70 33.93 37.00
N 500 470.04 285.95 297.79 301.67
N 1000 519.02 441.84 443.47 441.91
#################### A1 Original #######################
Android Linpack NEON SP MP Benchmark 07-Feb-2015 18.42
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Threads None 1 2 4
N 100 452.39 21.00 23.48 17.48
N 500 663.38 275.56 88.66 312.71
N 1000 617.04 380.60 191.26 195.61
################## A1 V1 Android 5.0 ###################
Android Linpack NEON SP MP Benchmark 05-Nov-2015 11.49
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 662.21 25.84 25.59 25.43
N 500 1022.76 317.51 310.52 311.49
N 1000 861.75 549.32 558.52 547.91
#################### A1 ARM-Intel ######################
ARM/Intel Linpack NEON SP MP Benchmark 1.2 06-Nov-2015 22.11
Compiled for 32 bit Intel x86
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 979.81 49.01 42.69 45.34
N 500 1160.24 369.43 349.04 334.87
N 1000 716.94 560.86 535.46 486.61
########## A5 ARM-Intel Dual Boot With W2 ############
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
ARM/Intel Linpack NEON SP MP Benchmark 1.2 14-Apr-2016 17.22
Compiled for 32 bit Intel x86
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 1131.44 16.52 16.05 17.00
N 500 1427.56 234.84 231.15 266.46
N 1000 874.35 474.20 423.36 577.54
#################### W1 REMIX 32 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
Android Linpack NEON SP MP Benchmark 11-Nov-2016 21.35
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 764.63 23.72 18.72 8.77
N 500 1387.27 153.52 153.30 145.98
N 1000 880.43 360.42 357.60 348.40
ARM/Intel Linpack NEON SP MP Benchmark 1.2 21-Oct-2016 14.38
Compiled for 32 bit Intel x86
Threads None 1 2 4
N 100 1095.33 53.33 57.76 57.01
N 500 1589.75 493.68 512.28 511.92
N 1000 886.08 638.19 635.86 638.70
#################### W1 REMIX 64 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel Linpack NEON SP MP Benchmark 1.2 14-Aug-2016 22.33
Compiled for 64 bit Intel x86_64
Threads None 1 2 4
N 100 1221.20 60.54 65.60 64.04
N 500 1405.14 567.66 554.66 568.40
N 1000 1058.21 729.60 734.22 747.03
################# W1 Windows 10 32 bit #################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
Linpack Single Precision MultiThreaded Benchmark
32 Bit, N=500, Wed Dec 23 21:01:12 2015
Threads 0 1 2 4
MFLOPS 740.71 256.40 226.44 163.99
Linpack Double Precision MultiThreaded Benchmark
32 Bit, N=500, Wed Dec 23 21:00:30 2015
Threads 0 1 2 4
MFLOPS 480.73 194.42 196.76 148.52
################# W1 Windows 10 64 bit #################
Linpack Single Precision MultiThreaded Benchmark
64 Bit, N=500, Wed Dec 23 21:17:19 2015
Threads 0 1 2 4
MFLOPS 707.50 263.47 240.46 197.31
Linpack Double Precision MultiThreaded Benchmark
64 Bit, N=500, Wed Dec 23 21:16:42 2015
Threads 0 1 2 4
MFLOPS 488.12 205.02 202.39 165.47
######## W2 Windows 10 32 bit Dual Boot With A5 ########
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
Linpack Single Precision MultiThreaded Benchmark
32 Bit, N=500, Fri Apr 15 16:23:55 2016
Threads 0 1 2 4
MFLOPS 626.40 231.31 183.87 129.48
Linpack Double Precision MultiThreaded Benchmark
32 Bit, N=500, Fri Apr 15 16:23:21 2016
Threads 0 1 2 4
MFLOPS 412.89 221.03 148.56 94.62
######## W2 Windows 10 64 bit Dual Boot With A5 ########
Linpack Single Precision MultiThreaded Benchmark
64 Bit, N=500, Fri Apr 15 16:36:10 2016
Threads 0 1 2 4
MFLOPS 662.15 241.59 228.59 195.97
ResidN 3.96 3.96 3.96 3.96
Linpack Double Precision MultiThreaded Benchmark
64 Bit, N=500, Fri Apr 15 16:35:42 2016
Threads 0 1 2 4
MFLOPS 527.64 195.54 180.62 154.02
#################### PC REMIX 32 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
Android Linpack NEON SP MP Benchmark 11-Nov-2016 14.40
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 3829.87 113.83 90.99 52.76
N 500 6053.91 1024.25 1014.78 985.31
N 1000 6601.66 2628.01 2568.70 2522.01
ARM/Intel Linpack NEON SP MP Benchmark 1.2 21-Oct-2016 12.51
Compiled for 32 bit Intel x86
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 4738.29 284.27 288.92 289.43
N 500 7078.15 3328.75 3287.02 3288.17
N 1000 7556.05 5459.01 5478.02 5461.30
#################### PC REMIX 64 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel Linpack NEON SP MP Benchmark 1.2 11-Nov-2016 14.42
Compiled for 64 bit Intel x86_64
MFLOPS 0 to 4 Threads, N 100, 500, 1000
Threads None 1 2 4
N 100 5622.61 318.61 317.19 320.32
N 500 7355.32 3448.71 3577.17 3541.12
N 1000 7734.14 5566.40 5622.47 5653.65
#################### PC Windows 32 Bit ##################
Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Windows 10,
Linpack Single Precision MultiThreaded Benchmark
32 Bit, N=500, Tue Nov 15 11:29:25 2016
Threads 0 1 2 4
MFLOPS 4018.79 1674.30 1583.93 1199.23
Linpack Double Precision MultiThreaded Benchmark
32 Bit, N=500, Tue Nov 15 11:29:03 2016
Threads 0 1 2 4
MFLOPS 3307.45 1521.69 1453.19 1185.62
#################### PC Windows 64 Bit ##################
Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Windows 10,
Linpack Single Precision MultiThreaded Benchmark
64 Bit, N=500, Tue Nov 15 11:37:57 2016
Threads 0 1 2 4
MFLOPS 4036.32 1891.33 1782.15 1345.03
Linpack Double Precision MultiThreaded Benchmark
64 Bit, N=500, Tue Nov 15 11:37:24 2016
Threads 0 1 2 4
MFLOPS 3370.00 1692.80 1590.42 1304.35
################### Numeric Results ###################
NR=norm resid RE=resid MA=machep X0=x[0]-1 XN=x[n-1]-1
Single Precision
N 100 500 1000
ARM
NR 1.60 3.96 11.32
RE 3.80277634e-05 4.72068787e-04 2.70068645e-03
MA 1.19209290e-07 1.19209290e-07 1.19209290e-07
X0 -1.38282776e-05 5.26905060e-05 1.62243843e-04
XN -7.51018524e-06 3.26633453e-05 -6.65783882e-05
Intel
NR 1.68 3.96 11.39
RE 4.00543213e-05 4.72545624e-04 2.71725655e-03
MA 1.19209290e-07 1.19209290e-07 1.19209290e-07
X0 -1.38282776e-05 5.26905060e-05 1.62243843e-04
XN -7.51018524e-06 3.26633453e-05 -6.65783882e-05
Double Precision Intel SSE2
NR 5.76
RE 1.27986510e-012
MA 2.22044605e-016
X0 5.59552404e-014
XN 3.39728246e-014
This is a multithreading version of the above; see here for further results. In the original MP-BusSpdi benchmark, all threads read data from the beginning. With large shared caches, this could lead to exaggerated data transfer speeds for RAM based data when using multiple threads. The revised MP-BusSpd2i attempts to avoid this by giving the threads staggered starting points, with each thread still reading all the data, and it also has a much longer running time to provide consistent scores. Performance using a single thread is similar to that of the non-threaded version, and it is clear that multiple threads are needed to demonstrate maximum throughput. As usual, maximum RAM speeds can be estimated from burst transfer results, such as 16 times Inc16 MB/second. Some results are provided below.
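The staggered starting points can be pictured with a minimal Python sketch. It is not the benchmark's own code (which is native C/C++). The function names, the even spacing of the start points and the wrap-around at the end of the array are all assumptions made for illustration.

```python
def staggered_offsets(words, threads):
    """Return one start index per thread, spread evenly over the array (assumed spacing)."""
    return [(t * words) // threads for t in range(threads)]


def read_order(words, start):
    """Indices visited by one thread: begin at `start`, wrap round, cover every word once."""
    return [(start + i) % words for i in range(words)]


if __name__ == "__main__":
    words, threads = 16, 4
    for t, start in enumerate(staggered_offsets(words, threads)):
        order = read_order(words, start)
        # Each thread still reads all the data, only from a different starting point.
        assert sorted(order) == list(range(words))
        print(f"thread {t}: starts at word {start}, first words {order[:4]}")
```

With every thread starting at word 0, as in the original version, a later thread would tend to find data already brought into a shared cache by an earlier one, which is the effect the revised benchmark tries to avoid.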
MP-BusSpdi.apk can be downloaded from here.
Using A1, with the Intel Atom CPU, the initial Houdini ARM to Intel conversion speeds were slightly slower than the results from the native code compilations, but this was made up when running via Android 5.
Results for the original version, running on ARM CPUs, are not all shown, as they were similar to those for the new version. See here. On T22, with the Cortex-A53, performance could be more than twice as fast, reading all data, using the 64 bit compilation.
The problem associated with shared caches is probably best identified by wide variations in the burst reading tests, which are not apparent in the long running versions (see T7 and T21 below).
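The burst based estimate, and the check for an impossible value, can be written out as a short sketch. The function names are hypothetical, and the factor of 16 assumes that an Inc16 read touches one 4 byte word in each 16 word (64 byte) burst. The figures are the Inc16 results at the largest data size, and the quoted 14.9 GB/s bandwidth, from the T21 listings below.

```python
BURST_WORDS = 16  # assumed: Inc16 reads one word from each 16-word burst


def max_ram_speed_estimate(inc16_mb_per_sec):
    """Estimate peak RAM MB/second from the best Inc16 result over all thread counts."""
    return max(inc16_mb_per_sec) * BURST_WORDS


def is_plausible(estimate_mb_per_sec, bandwidth_gb_per_sec):
    """True when the estimate does not exceed the quoted RAM bandwidth (1 GB = 1000 MB)."""
    return estimate_mb_per_sec <= bandwidth_gb_per_sec * 1000


if __name__ == "__main__":
    # Inc16 MB/second for 1, 2, 4 and 8 threads (T21 listings).
    t21_short = [382, 766, 1018, 680]  # original short test, 12288 KB
    t21_long = [406, 541, 605, 564]    # MP-BusSpd2 long test, 49152 KB
    for name, inc16 in (("short", t21_short), ("long", t21_long)):
        est = max_ram_speed_estimate(inc16)
        print(name, est, "plausible" if is_plausible(est, 14.9) else "impossible")
```

The short test gives 16288 MB/second, above the 14.9 GB/s the RAM can supply, while the long test gives 9680 MB/second.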
Following the main tables are comparisons of the Read All speeds for the revised benchmarks. They are based on MB/second/MHz for cache based data and MB/second using RAM.
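As a worked example, the MB/second/MHz figures appear to be the RdAll speed divided by the CPU clock in MHz. The sketch below uses a hypothetical helper name and takes its inputs from the long version listings (12.3 KB, 1 thread) and the CPU speeds quoted in the listing headers.

```python
def mb_per_sec_per_mhz(mb_per_sec, cpu_mhz):
    """Normalise a cache read speed by CPU clock so that different CPUs can be compared."""
    return round(mb_per_sec / cpu_mhz, 2)


if __name__ == "__main__":
    # name: (RdAll MB/second at 12.3 KB with 1 thread, CPU MHz)
    samples = {"T7": (3263, 1200), "T21": (5614, 2150), "P37": (2625, 1500)}
    for name, (speed, mhz) in samples.items():
        print(name, mb_per_sec_per_mhz(speed, mhz))  # 2.72, 2.61, 1.75
```

These values match the first row of the comparison table that follows the main listings.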
MP Efficiency - The L1 cache based ratios of 4 thread to 1 thread speeds indicate gains of more than 3.5 times on ARM CPUs but much less on Intel processors, although the gains can be similar using L3 cache. There were also some significant gains reading data from RAM. However, this was influenced by the relatively faster Intel speed using one thread.
64 Bit vs 32 Bit - Windows tests indicated similar performance but 64 bit compilations were much faster than at 32 bits via Android, even using Intel CPUs via REMIX.
Some of the above might be due to the different compilers used.
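The two ratios discussed above, the 4 thread gain and the 64 bit over 32 bit comparison, are both simple quotients of RdAll speeds. The sketch below uses a hypothetical helper name, with values taken from the 12.3 KB (L1 cache) lines of the listings below.

```python
def ratio(value, baseline):
    """Return value/baseline rounded to two places, as shown in the comparison tables."""
    return round(value / baseline, 2)


if __name__ == "__main__":
    # MP efficiency: 4 threads over 1 thread, RdAll MB/second at 12.3 KB.
    print("T7 4T gain", ratio(11777, 3263))         # ARM Cortex-A9, 3.61
    print("A5 4T gain", ratio(19631, 6925))         # Intel Atom Z8300, 2.83
    # 64 bit over 32 bit, 1 thread, RdAll MB/second at 12.3 KB.
    print("T22 Android 64/32", ratio(5841, 2343))   # 2.49
    print("W1 Windows 64/32", ratio(6580, 6743))    # 0.98
```

The results show both patterns: a larger L1 gain on the ARM CPU than on the Atom, and a large 64 bit advantage under Android that is absent under Windows.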
#################### T7 ARM-Intel #####################
T7, ARM Cortex-A9 1.2 GHz, DDR3-1333, 5.3 GB/s Android 4.1.2,
4 x 32 KB L1 cache, 1 MB shared L2 cache
ARM/Intel MP-BusSpd v7 Benchmark V1.1 05-May-2015 14.35
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2853 3392 3376 3511 3551 3494
2T 2857 3389 3542 5540 5730 5595
4T 7257 10326 10289 10997 11373 11100
8T 6584 10325 10485 11175 11322 11189
122.9 1T 362 379 347 546 623 978
2T 516 530 508 726 1227 1840
4T 598 658 548 1181 1556 2657
8T 721 733 736 1181 1548 2653
12288 1T 58 57 84 123 173 334
2T 111 111 182 248 348 664
4T 87 85 276 463 687 1290
8T 154 107 147 429 441 1242
Total Elapsed Time 12.7 seconds
########## T7 New Long Version
ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 15.59
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2166 2774 3181 3307 3377 3263
2T 3924 5188 5207 5754 5759 5805
4T 7570 10011 10252 11165 11375 11777
8T 3510 4786 9011 8318 11351 11544
122.9 1T 383 409 359 558 663 983
2T 525 541 520 741 1241 1814
4T 739 752 753 1219 1590 2776
8T 735 741 753 1218 1607 2737
49152 1T 56 51 81 126 172 330
2T 65 67 107 196 335 620
4T 70 68 108 215 426 835
8T 70 68 109 215 428 851
Total Elapsed Time 48.2 seconds
Maximum RAM Speed Estimate = 68 x 16 = 1088 MB/second
#################### T11 ARM-Intel ####################
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Dual core, 2 x 32 KB L1 cache, 1 MB shared L2 cache
ARM/Intel MP-BusSpd v7 Benchmark V1.1 05-May-2015 14.45
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2165 3591 4256 5587 5998 6109
2T 4121 6469 9530 11381 11846 11936
4T 4106 6438 8827 6793 9802 12080
8T 4098 6390 9534 10141 10996 11603
122.9 1T 464 740 1173 2395 3276 3340
2T 579 989 1934 3994 5431 5792
4T 579 988 1930 3873 5469 5821
8T 580 985 1915 3999 5408 5812
12288 1T 134 172 211 462 602 1904
2T 269 343 387 934 1217 2685
4T 252 231 374 768 991 2625
8T 231 254 367 781 1104 2782
Total Elapsed Time 12.1 seconds
########## T11 New Long Version
ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 17.07
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 3499 4539 5499 5505 6134 6045
2T 3775 7202 8377 10605 10457 11319
4T 3982 6676 7687 9326 9707 10807
8T 2546 3643 7891 8003 10725 11097
122.9 1T 672 901 1336 2784 3274 3334
2T 568 969 1931 3894 5427 5221
4T 574 971 1912 3831 5256 4811
8T 559 971 1917 3878 5387 5162
49152 1T 140 142 193 575 989 1499
2T 221 223 342 769 1379 2355
4T 228 223 344 783 1382 2376
8T 223 223 342 787 1385 2352
Total Elapsed Time 49.9 seconds
Maximum RAM Speed Estimate = 223 x 16 = 3568 MB/second
#################### T21 Original #####################
T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s
L1 caches 4 x 16 KB, L2 cache shared 2048 KB
Android MP-BusSpd v7 Benchmark V1.1 29-Jun-2015 18.37
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2580 2206 5048 5176 5679 5989
2T 4062 5175 9340 9868 10971 11281
4T 4688 10324 16552 17196 21714 23708
8T 8467 9834 16698 18183 21936 23693
122.9 1T 1152 1052 2068 3035 3927 5723
2T 1710 1840 3094 5001 7963 11475
4T 2047 2002 5031 9267 14698 22920
8T 2235 2275 5223 9348 14234 21783
12288 1T 262 382 508 867 1466 2661
2T 464 766 1049 1754 3186 5735
4T 612 1018 1796 3149 5892 9095
8T 575 680 1277 2308 4987 7948
Total Elapsed Time 12.7 seconds
Impossible Maximum RAM Speed 1018 x 16 = 16288 MB/second
#################### T21 ARM-Intel ####################
ARM/Intel MP-BusSpd v7 Benchmark V1.1 23-May-2015 17.05
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 1840 2073 3512 3554 4829 5243
2T 3432 4591 7128 7651 9120 9821
4T 4398 7855 13752 15428 18530 20235
8T 6692 9507 13857 16110 18143 18796
122.9 1T 860 753 2011 2841 3205 5282
2T 1505 1609 3076 5038 8089 10421
4T 1924 1981 4299 7588 14614 20754
8T 1909 1988 4264 7980 13884 19027
12288 1T 270 379 538 856 1626 2859
2T 471 677 1098 1849 3304 5924
4T 549 787 1066 1874 6274 10781
8T 713 853 1649 2258 4664 8321
Total Elapsed Time 13.1 seconds
########## T21 New Long Version
ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 15.39
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2247 2616 4010 4443 4909 5614
2T 3558 4725 7241 9048 9747 10892
4T 6074 8303 13442 16937 18525 21068
8T 3998 5106 14314 13615 18200 20740
122.9 1T 874 1198 2024 2935 4529 5345
2T 1686 1702 3174 5357 7688 10545
4T 1988 2139 4465 8171 14969 21169
8T 1972 2139 4468 8195 15261 21132
49152 1T 292 406 516 899 1663 2929
2T 449 541 962 1569 2851 4776
4T 495 605 1109 2439 4161 8243
8T 530 564 1156 2149 4172 7907
Total Elapsed Time 48.0 seconds
Maximum RAM Speed Estimate = 605 x 16 = 9680 MB/second
#################### P37 32 Bit V1.2 ####################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache
ARM/Intel MP-BusSpd2 Benchmark V1.2 14-Nov-2016 12.11
Compiled for 32 bit ARM v7a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2060 2433 2430 2487 2555 2625
2T 3966 4727 4886 4964 5091 5167
4T 6843 8675 9208 9581 10025 10254
8T 5360 6326 13507 10947 15929 16546
122.9 1T 666 672 1231 2000 2368 2524
2T 1029 1036 1993 3570 4766 5089
4T 1062 1098 2144 4166 7694 9835
8T 1737 1793 3540 6473 10502 14201
49152 1T 164 172 339 658 1247 2014
2T 289 307 591 1124 2192 3839
4T 410 353 813 1692 3015 6058
8T 429 426 842 1495 2949 5790
Total Elapsed Time 56.3 seconds
Maximum RAM Speed Estimate = 426 x 16 = 6816 MB/second
Android 7.0
ARM/Intel MP-BusSpd2 Benchmark V1.2 11-May-2017 10.35
Compiled for 32 bit ARM v7a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2151 2396 2448 2516 2589 2632
2T 4042 4460 4824 4893 5336 5192
4T 6828 8657 9409 9755 10120 10339
8T 5401 6897 13508 11464 15960 16792
122.9 1T 674 692 1267 2019 2402 2584
2T 1031 1043 1999 3591 4737 5047
4T 1064 1164 2168 4185 7761 9879
8T 1734 1857 3429 6438 10447 15287
49152 1T 163 172 337 674 1236 2098
2T 297 282 566 1101 2175 3735
4T 431 390 751 1470 3053 5716
8T 406 369 786 1621 2897 6031
Total Elapsed Time 57.0 seconds
###################### T22 32 Bit ######################
T22, Tab 2 A8-50, 1.3 GHz quad core 64 bit ARM Cortex-A53
Single Channel RAM, LPDDR3 666 MHz, 5.3 GB/second
4 x 32 KB L1 cache, 512 KB L2 cache
ARM/Intel MP-BusSpd Benchmark V1.2 12-Aug-2015 16.13
Compiled for 32 bit ARM v7a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 1849 2140 2079 2211 2270 2297
2T 3663 4252 4294 4400 4370 4580
4T 4630 5574 5691 5893 6015 6083
8T 5331 5775 6033 6622 7968 8023
122.9 1T 597 621 1119 1815 2135 2237
2T 869 943 1644 2992 3740 4412
4T 949 951 1922 3736 6468 7779
8T 948 978 1911 3717 6464 7542
12288 1T 123 174 344 678 1215 1840
2T 243 310 672 1332 2383 3974
4T 302 285 594 1282 2271 4606
8T 279 295 654 1198 2749 4660
Total Elapsed Time 12.8 seconds
########## T22 Long Version
ARM/Intel MP-BusSpd2 Benchmark V1.2 12-Aug-2015 16.14
Compiled for 32 bit ARM v7a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 1877 2124 2176 2266 2296 2343
2T 3625 4198 4341 4468 4536 4613
4T 5733 7541 8293 8830 8024 9042
8T 2985 3829 7438 6117 8108 8923
122.9 1T 604 625 1142 1846 2150 2284
2T 924 950 1793 3277 4270 4504
4T 962 989 1939 3765 6798 8862
8T 965 993 1933 3748 6651 8239
49152 1T 165 175 344 677 1285 1979
2T 234 238 482 961 1907 3547
4T 266 298 562 1224 2296 4478
8T 272 275 538 1098 2149 4282
Total Elapsed Time 48.8 seconds
Maximum RAM Speed Estimate = 298 x 16 = 4768 MB/second
###################### T22 64 Bit ######################
ARM/Intel MP-BusSpd2 Benchmark V1.2 12-Aug-2015 16.18
Compiled for 64 bit ARM v8a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 2610 2472 2586 2727 2748 5841
2T 4404 4681 4994 5369 5420 11297
4T 6546 8125 9105 10243 10319 20610
8T 3380 4023 7919 7146 9871 19852
122.9 1T 604 621 1110 1872 2446 5100
2T 919 948 1855 3433 4853 10037
4T 961 974 1984 3924 7491 14935
8T 963 942 1931 3915 7572 14689
49152 1T 173 177 340 692 1300 2653
2T 266 241 479 968 1883 3724
4T 304 277 556 1130 2126 4328
8T 279 278 544 1138 2179 4275
Total Elapsed Time 49.4 seconds
#################### A1 Original #######################
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
4 x 24 KB L1, 2 x 1 MB L2
Android MP-BusSpd v7 Benchmark V1.1 05-May-2015 13.02
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 3990 4458 6123 6512 6438 6729
2T 3894 5699 8948 10299 11800 12555
4T 5046 7109 11952 14750 15533 23304
8T 4533 7464 13097 16970 21674 22225
122.9 1T 1304 1613 2291 2661 3667 5063
2T 2568 3145 4529 5365 7440 10147
4T 4117 4801 7963 7495 8239 18911
8T 3130 5016 7355 8543 11648 15845
12288 1T 190 265 601 1203 2316 3832
2T 244 448 995 1771 3599 6575
4T 427 584 860 1741 3439 7449
8T 395 510 855 1613 3547 6776
Total Elapsed Time 13.5 seconds
################## A1 V1 Android 5.0 ###################
Android MP-BusSpd v7 Benchmark V1.1 05-Nov-2015 11.52
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5509 6152 6796 6937 7060 7056
2T 4635 6757 9294 11284 12612 13486
4T 4545 9383 15861 21378 15369 23493
8T 4473 8723 15965 18476 23438 22747
122.9 1T 1467 1782 2386 2737 3799 5299
2T 2225 3460 4683 5421 7507 10514
4T 2493 5703 8165 9941 11313 11259
8T 4119 5481 6992 8726 12919 17166
12288 1T 213 253 589 1176 2309 3903
2T 252 396 842 1668 3325 6759
4T 404 437 1130 1659 4562 6911
8T 414 507 836 1902 3607 6670
Total Elapsed Time 13.9 seconds
#################### A1 ARM-Intel ######################
ARM/Intel MP-BusSpd v7 Benchmark V1.1 05-May-2015 14.28
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5925 6494 6778 6979 7047 7026
2T 3966 7029 9689 11689 12856 13654
4T 4438 8698 16739 22057 23946 25729
8T 4455 8619 15787 19934 22576 20804
122.9 1T 1490 1975 2360 2802 3818 5330
2T 2881 3798 4647 5531 7536 10546
4T 4452 6338 5910 10217 14650 19903
8T 4096 5075 6264 9213 12610 15821
12288 1T 206 273 593 1198 2343 3935
2T 276 455 842 1821 3319 6591
4T 445 730 1401 2076 4457 7525
8T 424 539 954 1829 3688 7064
Total Elapsed Time 13.0 seconds
########## A1 New Long Version
ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 15.50
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5431 6110 6780 6262 6655 7313
2T 3550 4464 7375 9825 11777 12442
4T 2027 4442 4399 8841 17611 23509
8T 983 2477 5063 4433 8568 15867
122.9 1T 1499 1991 2357 2839 3818 5382
2T 2816 3808 4708 5592 7557 10677
4T 4316 6313 7991 9816 14335 19993
8T 4235 5610 7917 8791 12828 19661
49152 1T 215 275 611 1183 2328 3922
2T 276 435 787 1671 3323 6507
4T 398 455 884 1754 3490 6971
8T 376 511 867 1746 3512 7510
Total Elapsed Time 48.6 seconds
Maximum RAM Speed Estimate = 511 x 16 = 8176 MB/second
########### A5 ARM-Intel Dual Boot With W2 #############
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
ARM/Intel MP-BusSpd2 Benchmark V1.2 14-Apr-2016 17.28
Compiled for 32 bit Intel x86
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5322 6275 6475 6901 6959 6925
2T 4625 4163 6792 8964 10879 11027
4T 2221 3775 4091 8006 15158 19631
8T 1178 1840 3907 3884 8002 15691
122.9 1T 1438 1891 2342 2601 3477 4957
2T 2509 3489 4597 5115 6807 9275
4T 3591 4849 6905 8356 11204 14596
8T 3868 5327 7014 7860 10754 15998
49152 1T 179 205 391 802 1372 3023
2T 238 310 495 1204 2397 4559
4T 240 336 653 1170 2008 4969
8T 291 321 681 1316 2378 5329
Total Elapsed Time 50.3 seconds
Maximum RAM Speed Estimate = 336 x 16 = 5376 MB/second
#################### W1 REMIX 32 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-BusSpd Benchmark V1.2 21-Oct-2016 14.29
Compiled for 32 bit Intel x86
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5659 5848 5977 6263 6100 6481
2T 4075 6144 7960 9632 10899 11283
4T 3766 6335 7923 9544 10679 11425
8T 3531 6367 7693 7739 8336 7918
122.9 1T 1389 1492 2456 2702 1564 5013
2T 2080 2904 2943 3073 4785 7541
4T 1995 2761 4446 4114 5075 8011
8T 1673 2504 2711 3097 6693 8366
12288 1T 190 230 453 877 1681 2396
2T 222 246 405 1287 2291 3926
4T 180 299 588 1469 2951 5002
8T 303 380 701 1265 2476 6796
Total Elapsed Time 14.2 seconds
Maximum RAM Speed Estimate = 380 x 16 = 6080 MB/second
#################### W1 REMIX 64 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-BusSpd2 Benchmark V1.2 11-Nov-2016 21.25
Compiled for 64 bit Intel x86_64
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 3870 3871 4281 4386 4382 16766
2T 3290 3312 5048 5924 6729 22511
4T 4909 6232 6866 7231 7745 27366
8T 2662 3012 6328 6364 8818 26211
122.9 1T 1506 1534 2471 2433 3510 9204
2T 2071 2479 3727 4428 5757 17952
4T 2636 2833 5013 4918 7263 22352
8T 2552 3360 5211 6178 7819 23389
49152 1T 243 245 565 1037 1469 3522
2T 329 370 565 1425 2421 4783
4T 329 387 673 1501 3148 4866
8T 402 433 858 1681 2838 6987
Total Elapsed Time 53.8 seconds
Maximum RAM Speed Estimate = 433 x 16 = 6928 MB/second
################# W1 Windows 10 32 bit #################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10 4 GB DDR3 1600 dual channel 12.8 GB/s
MP-BusSpeed From C/C++ 18.00.21005.1 for x86
Start of test Wed Dec 23 20:57:34 2015
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 6170 6348 6836 6869 7029 6743
2T 1859 3059 5657 7800 9685 10880
4T 989 1804 3289 5900 10157 16055
8T 473 843 1578 3101 5665 10124
122.9 1T 1476 1532 2319 2679 3515 4824
2T 2234 2733 4337 5226 6710 9655
4T 3428 4628 6956 8606 10978 16225
8T 2675 3965 6432 8355 11139 15714
49152 1T 241 273 565 1090 2130 3848
2T 346 409 734 1591 3082 5762
4T 499 496 947 1887 3818 7634
8T 476 500 930 1888 3932 7625
End of test Wed Dec 23 20:58:22 2015
Maximum RAM Speed Estimate = 500 x 16 = 8000 MB/second
################# W1 Windows 10 64 bit #################
MPbusSpeed64 From C/C++ 18.00.21005.1 for x64
Start of test Wed Dec 23 21:15:07 2015
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5222 6158 6233 6523 6404 6580
2T 1882 3670 6113 8124 9540 10760
4T 1089 1817 3378 6083 10832 15242
8T 505 837 1846 3250 5899 9788
122.9 1T 1424 1540 2285 2544 3490 4854
2T 2567 2756 4233 4920 6579 9820
4T 3444 4858 6699 8186 11628 16690
8T 2593 3644 5671 7370 9304 13630
49152 1T 240 268 566 1097 2070 3860
2T 342 411 754 1448 2940 5836
4T 451 494 894 1902 3804 7526
8T 424 503 935 1830 3710 7180
End of test Wed Dec 23 21:15:55 2015
######## W2 Windows 10 32 bit Dual Boot With A5 ########
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
MP-BusSpeed From C/C++ 18.00.21005.1 for x86
Start of test Fri Apr 15 16:19:46 2016
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5387 5874 6023 6023 6158 6175
2T 2051 3414 5527 6968 9063 9875
4T 1105 1897 3213 5706 9238 13066
8T 452 830 1874 3063 5620 8967
122.9 1T 1266 1286 2041 2420 3084 4283
2T 2258 2657 3976 4624 5973 8438
4T 3163 4119 5893 7241 10447 15588
8T 2540 3404 5628 8170 8647 12274
49152 1T 139 170 319 592 986 2063
2T 202 225 442 802 1633 3542
4T 295 359 597 1220 2489 5001
8T 282 313 651 1159 2359 5166
End of test Fri Apr 15 16:20:38 2016
Maximum RAM Speed Estimate = 313 x 16 = 5008 MB/second
######## W2 Windows 10 64 bit Dual Boot With A5 ########
MPbusSpeed64 From C/C++ 18.00.21005.1 for x64
Start of test Fri Apr 15 16:31:03 2016
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 5414 5881 5982 6593 6320 6915
2T 2004 3844 6095 8469 10032 11237
4T 977 1709 3311 6239 12238 17994
8T 498 862 1737 3185 5915 10456
122.9 1T 1515 1537 2447 2750 3625 5040
2T 2330 2730 4064 4923 6364 9105
4T 3702 4830 7300 8835 11707 16740
8T 2587 3613 5718 7715 9699 16216
49152 1T 183 198 429 834 1652 3143
2T 244 303 565 1144 2221 4537
4T 346 324 644 1284 2552 5123
8T 306 307 618 1249 2421 4874
End of test Fri Apr 15 16:31:54 2016
#################### PC REMIX 32 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-BusSpd2 Benchmark V1.2 21-Oct-2016 12.32
Compiled for 32 bit Intel x86
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 13032 13915 24235 25197 22774 23523
2T 12780 25046 41965 50097 47757 52797
4T 27568 24981 36907 46686 50510 64687
8T 14880 22221 47422 54616 80188 96729
122.9 1T 7133 6612 9381 15623 21204 26016
2T 7641 13474 22117 24280 44150 51649
4T 19935 25520 43348 41204 69425 101560
8T 31478 38036 59094 79377 96106 103008
49152 1T 712 1034 2181 4347 8729 13516
2T 1510 2074 2393 8057 15548 27128
4T 2952 2228 6703 13593 27804 42109
8T 4961 4460 8805 25670 49205 68560
Total Elapsed Time 53.2 seconds
#################### PC REMIX 64 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-BusSpd2 Benchmark V1.2 11-Nov-2016 14.29
Compiled for 64 bit Intel x86_64
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 11234 11268 11549 9728 11075 83709
2T 13975 18788 21241 20376 21981 126069
4T 11950 16021 25702 25888 22591 129598
8T 7847 11333 22999 26446 39027 137208
122.9 1T 7270 7472 9070 11037 11565 57013
2T 12151 13359 18497 21814 22939 110321
4T 23054 19821 35736 42796 23494 145387
8T 25125 32352 39249 44178 46373 261178
49152 1T 651 966 1872 3496 7749 18057
2T 930 1979 3815 6002 11796 33883
4T 2876 3639 7142 13308 26695 60051
8T 3802 4639 12125 22329 39597 106907
Total Elapsed Time 56.2 seconds
==============================================
Top end 2015 PC - Core i7-4820K at 3.9 GHz
Quad core, 8 threads, 10 MB shared L3 cache
RAM 1600 MHz, quad channel, 51.2 GB/sec
==============================================
Intel/Windows 32 Bit Version
MP-BusSpeed From C/C++ 18.00.21005.1 for x86
Start of test Sun Feb 14 18:30:05 2016
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 14262 14567 19724 19553 19374 19743
2T 10737 12187 18359 23285 31442 31491
4T 5537 7660 13862 24507 32888 42530
8T 3967 6138 14340 22999 39199 60117
122.9 1T 7263 7213 11664 16448 19425 20552
2T 10361 9428 20446 31143 34263 40155
4T 18846 21063 38732 54792 57770 56587
8T 22328 32794 54749 69742 79276 80967
49152 1T 668 1031 2141 4185 8650 14974
2T 1210 1726 3867 7731 15627 28522
4T 2161 3177 6122 11449 25009 41192
8T 4728 4106 9842 23118 43257 61779
End of test Sun Feb 14 18:31:00 2016
Intel/Windows 64 Bit Version
MPbusSpeed64 From C/C++ 18.00.21005.1 for x64
Start of test Sun Feb 14 18:46:52 2016
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 14760 14788 21402 20729 20934 21032
2T 12570 19878 27089 35589 37688 41618
4T 7000 11473 21725 34776 51827 74198
8T 3728 6525 14160 23059 40659 66975
122.9 1T 7571 7448 11828 16724 20283 21671
2T 13291 13676 22360 32586 39872 42740
4T 18270 21303 37555 62890 78583 84191
8T 21030 30880 53098 71255 91804 103575
49152 1T 663 1037 2159 4187 8611 15218
2T 1207 1720 3908 6418 15470 27796
4T 2319 2382 7002 13639 23754 46951 #
8T 4728 5602 12178 21784 35170 80274 #
End of test Sun Feb 14 18:47:43 2016
# Some data from shared 10 MB L3 cache
######### Comparison MB/sec/MHz and RAM MB/sec #########
Unless indicated all are quad core CPUs, Core i7 runs
up to 8 threads using HyperThreading
                           dual                              8 core
                    T7      T11      T21       A1       A5      P37
    KB          Cortex   Cortex Qualcomm     Atom     Atom   Cortex
                    A9      A15      800    Z3745    Z8300      A53
MB/sec/MHz
  12.3     1T     2.72     3.56     2.61     3.93     4.81     1.75
           2T     4.84     6.66     5.07     6.69     7.66     3.44
           4T     9.81     6.36     9.80    12.64    13.63     6.84
           8T     9.62     6.53     9.65     8.53    10.90    11.03
 122.9     1T     0.82     1.96     2.49     2.89     3.44     1.68
           2T     1.51     3.07     4.90     5.74     6.44     3.39
           4T     2.31     2.83     9.85    10.75    10.14     6.56
           8T     2.28     3.04     9.83    10.57    11.11     9.47
RAM MB/sec
 49152     1T      330     1499     2929     3922     3023     2014
           2T      620     2355     4776     6507     4559     3839
           4T      835     2376     8243     6971     4969     6058
           8T      851     2352     7907     7510     5329     5790
4T gain    L1     3.61     1.79     3.75     3.21     2.83     3.91
           L2     2.82     1.44     3.96     3.71     2.94     3.90
          RAM     2.53     1.59     2.81     1.78     1.64     3.01
========================================================
64 bit compilations compared with 32 bit
========================================================
               Android              REMIX/Android      8HT      8HT
                T22 32   T22 64    R1 32    R1 64    R2 32    R2 64
    KB          Cortex   Cortex     Atom     Atom   Corei7   Corei7
                   A53      A53    Z8300    Z8300    4820K    4820K
MB/sec/MHz
  12.3     1T     1.80     4.49     3.52     9.11     6.03    21.46
           2T     3.55     8.69     6.13    12.23    13.54    32.33
           4T     6.96    15.85     6.21    14.87    16.59    33.23
           8T     6.86    15.27     4.30    14.25    24.80    35.18
 122.9     1T     1.76     3.92     2.72     5.00     6.67    14.62
           2T     3.46     7.72     4.10     9.76    13.24    28.29
           4T     6.82    11.49     4.35    12.15    26.04    37.28
           8T     6.34    11.30     4.55    12.71    26.41    66.97
RAM MB/sec
 49152     1T     1979     2653     2396     3522    13516    18057
           2T     3547     3724     3926     4783    27128    33883
           4T     4478     4328     5002     4866    42109    60051
           8T     4282     4275     6796     6987    68560   106907
4T gain    L1     3.86     3.53     1.76     1.63     2.75     1.55
           L2     3.88     2.93     1.60     2.43     3.90     2.55
          RAM     2.26     1.63     2.09     1.38     3.12     3.33
64/32 Bit  L1              2.49              2.59              3.56
========================================================
64 bit compilations compared with 32 bit
========================================================
               Windows                                 8HT      8HT
                 W1 32    W1 64    W2 32    W2 64    PC 32    PC 64
    KB            Atom     Atom     Atom     Atom   Corei7   Corei7
                 Z8300    Z8300    Z8300    Z8300    4820K    4820K
MB/sec/MHz
  12.3     1T     4.68     4.57     3.36     3.76     5.06     5.39
           2T     7.56     7.47     5.37     6.11     8.07    10.67
           4T    11.15    10.58     7.10     9.78    10.91    19.03
           8T     7.03     6.80     4.87     5.68    15.41    17.17
 122.9     1T     3.35     3.37     2.33     2.74     5.27     5.56
           2T     6.70     6.82     4.59     4.95    10.30    10.96
           4T    11.27    11.59     8.47     9.10    14.51    21.59
           8T    10.91     9.47     6.67     8.81    20.76    26.56
RAM MB/sec
 49152     1T     3848     3860     2063     3143    14974    15218
           2T     5762     5836     3542     4537    28522    27796
           4T     7634     7526     5001     5123    41192    46951 #
           8T     7625     7180     5166     4874    61779    80274 #
# Core i7 results - some data from shared 10 MB L3 cache
4T gain    L1     2.38     2.32     2.12     2.60     2.15     3.53
           L2     3.36     3.44     3.64     3.32     2.75     3.88
          RAM     1.98     1.95     2.42     1.63     2.75     3.09
64/32 Bit  L1              0.98              1.12              1.07
Android/Win       0.75     1.99                       1.19     3.98
This is an ARM/Intel version of the longer running MP-RndMem Benchmark, as the original short version produced inconsistent performance measurements. It is a multithreading variety of RandMem, above. For further details and more results see here.
On tablet A1, with the Intel Atom CPU, the initial Houdini ARM to Intel conversion speeds were significantly slower than the results from the native code compilations. This problem was overcome via Android 5 procedures, when most results were faster.
On ARM based tablets, the new ARM/Intel compilations were generally slower than the originals, which were produced by an earlier compiler, but most of the difference was regained on a 64 bit version.
Intel/Windows Versions - Maximum data size of 12.3 MB was fine for early Android devices but performance can be affected by shared L2 caches on later ones. The Core i7 test, at this size, is mainly using the 10 MB L3 cache.
Later Intel Comparisons - Later measurements demonstrated inconsistent performance using Intel Atom CPUs. For example, a second run of the tests could be faster, and there were also variations between Windows and REMIX (Android) benchmarks and between 32 bit and 64 bit versions. On the other hand, all these comparisons were fairly consistent on the Intel Core i7 tests.
##################### T7 Original ######################
T7, ARM Cortex-A9 1200 MHz, Android 4.1.2,
4 x 32 KB L1 cache, 1 MB shared L2 cache
Android MP-RndMem2 Benchmark V2.1 06-May-2015 12.17
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 3120 3060 3128 3078
2T 6098 3003 6083 3004
4T 11354 2948 11188 2942
8T 11403 2857 10412 2872
122.9 1T 996 983 661 699
2T 1868 984 1012 697
4T 2600 982 1483 699
8T 2534 976 1459 694
12288 1T 335 286 91 80
2T 640 288 113 82
4T 892 286 130 82
8T 925 287 127 81
Total Elapsed Time 44.7 seconds
#################### T7 ARM-Intel #####################
ARM/Intel MP-RndMem Benchmark V1.1 06-May-2015 11.59
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 3060 2001 2867 1904
2T 5459 1879 5463 1867
4T 10797 1852 10537 1856
8T 10090 1802 10608 1813
122.9 1T 968 823 588 547
2T 1749 785 902 618
4T 2716 812 1328 672
8T 2733 810 1407 673
12288 1T 329 274 90 82
2T 636 272 112 82
4T 849 271 128 82
8T 869 271 126 81
Total Elapsed Time 45.4 seconds
#################### T11 Original #####################
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
2 x 32 KB L1 cache, 1 MB shared L2 cache
Android MP-RndMem2 Benchmark V2.1 06-May-2015 12.13
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 6696 4438 6594 4483
2T 12338 3078 12263 3573
4T 12419 2834 12166 2907
8T 12314 2903 11991 2934
122.9 1T 3371 2916 1639 1748
2T 6409 1922 2052 1097
4T 6155 1892 2027 1186
8T 6045 2105 2015 1192
12288 1T 1394 1048 153 133
2T 2245 985 285 123
4T 2277 1002 285 132
8T 2165 1001 286 127
Total Elapsed Time 44.0 seconds
#################### T11 ARM-Intel ####################
ARM/Intel MP-RndMem Benchmark V1.1 06-May-2015 12.07
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 6315 4486 6345 4484
2T 11837 2910 11846 3112
4T 11864 2835 11553 2858
8T 11821 3003 11805 3198
122.9 1T 3963 2681 1670 1704
2T 6672 1782 2040 1125
4T 6493 1817 2033 1218
8T 6673 1738 2038 1303
12288 1T 1805 1081 177 145
2T 2543 1066 279 137
4T 2600 1065 276 136
8T 2662 1073 281 138
Total Elapsed Time 43.7 seconds
#################### T21 Original #####################
T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s
L1 caches 4 x 16 KB, L2 cache shared 2048 KB
Android MP-RndMem2 Benchmark V2.1 08-Jul-2015 16.33
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 5088 5325 4262 4711
2T 9752 4902 8895 4570
4T 17379 4653 17434 4096
8T 19771 4698 17358 4424
122.9 1T 2714 2578 1923 2163
2T 5614 2502 3483 2107
4T 10859 2219 4835 1972
8T 10654 2410 4904 1923
12288 1T 1798 952 186 204
2T 3489 974 341 195
4T 6515 943 563 196
8T 6218 922 563 187
Total Elapsed Time 42.3 seconds
#################### T21 ARM-Intel ####################
ARM/Intel MP-RndMem Benchmark V1.1 09-Jul-2015 11.48
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 4186 3777 4055 3933
2T 9324 3541 7710 3619
4T 16594 3350 15731 3142
8T 18117 3291 16187 3262
122.9 1T 2423 2043 1610 1683
2T 5235 2029 3013 1641
4T 10148 1935 4662 1565
8T 10015 1834 4611 1474
12288 1T 1363 886 171 186
2T 2643 845 325 187
4T 5197 823 534 184
8T 4801 835 542 184
Total Elapsed Time 42.6 seconds
###################### P37 32 Bit ######################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache
ARM/Intel MP-RndMem Benchmark V1.2 14-Nov-2016 12.13
Compiled for 32 bit ARM v7a
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 3464 2779 3249 2792
2T 6473 2549 6471 2574
4T 12671 2355 12644 2243
8T 20039 2055 19677 1837
122.9 1T 3142 2667 843 847
2T 6072 2463 1552 785
4T 11678 2098 2400 675
8T 15639 2228 3822 668
12288 1T 2404 887 71 70
2T 4058 899 141 69
4T 5665 867 258 67
8T 7169 881 410 66
Total Elapsed Time 49.2 seconds
Android 7.0
ARM/Intel MP-RndMem Benchmark V1.2 17-Mar-2017 10.43
Compiled for 32 bit ARM v7a
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 3497 2803 3267 2770
2T 6443 2600 6495 2585
4T 12818 2264 12751 2318
8T 20056 2121 19918 2160
122.9 1T 3148 2672 824 865
2T 6104 2493 1562 800
4T 11723 2203 2423 698
8T 16376 2120 3930 733
12288 1T 2554 931 73 72
2T 4276 909 148 70
4T 6703 872 267 68
8T 6425 914 407 67
Total Elapsed Time 47.9 seconds
#################### T22 Original ######################
T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
4 x 32 KB L1 cache, 512 KB shared L2 cache
Android MP-RndMem2 Benchmark V2.1 11-Nov-2015 13.03
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 3401 3874 3435 3892
2T 6777 3817 6592 3773
4T 13025 3729 12630 3685
8T 12848 3654 12113 3654
122.9 1T 3257 3583 827 946
2T 6416 3572 1481 943
4T 11897 3564 2205 934
8T 11106 3550 2173 945
12288 1T 2397 1734 82 93
2T 4652 1725 161 94
4T 5834 1748 287 94
8T 4774 1743 276 93
Total Elapsed Time 45.9 seconds
###################### T22 32 Bit ######################
ARM/Intel MP-RndMem Benchmark V1.2 12-Aug-2015 17.13
Compiled for 32 bit ARM v7a
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 2894 2438 2887 2433
2T 5665 2402 5663 2403
4T 10922 2369 11100 2310
8T 10065 2293 10648 2265
122.9 1T 2681 2368 757 758
2T 5351 2360 1398 769
4T 10056 2308 2121 772
8T 8838 2351 1916 742
12288 1T 2309 1662 80 78
2T 3986 1683 164 73
4T 5419 1684 283 82
8T 4658 1694 279 82
Total Elapsed Time 44.6 seconds
###################### T22 64 Bit ######################
ARM/Intel MP-RndMem Benchmark V1.2 12-Aug-2015 17.15
Compiled for 64 bit ARM v8a
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 4445 3109 4455 3089
2T 8010 3100 8072 3105
4T 15909 3057 14711 3040
8T 14764 3036 14570 3037
122.9 1T 3457 2888 842 876
2T 6537 2924 1524 876
4T 11095 2892 2119 861
8T 11729 2916 2080 874
12288 1T 2475 1679 81 78
2T 4155 1713 163 73
4T 5503 1711 285 89
8T 4519 1717 281 89
Total Elapsed Time 48.1 seconds
#################### A1 Original #######################
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
4 x 24 KB L1, 2 x 1 MB L2
Android MP-RndMem2 Benchmark V2.1 06-May-2015 12.14
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 1337 2505 1337 2509
2T 2637 2513 2657 2521
4T 3535 2420 3484 2454
8T 3195 2403 3088 2406
122.9 1T 1305 2280 963 1758
2T 2581 2285 1945 1748
4T 3588 2130 3125 1740
8T 3211 2269 2949 1745
12288 1T 1248 1962 101 215
2T 2469 1940 191 214
4T 3462 1954 323 214
8T 3127 1926 318 212
Total Elapsed Time 43.7 seconds
################## A1 V1 Android 5.0 ###################
Android MP-RndMem2 Benchmark V2.1 05-Nov-2015 11.55
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 5580 5533 5554 5455
2T 10460 5393 8625 5336
4T 15584 5013 12183 5211
8T 14687 4850 9754 4882
122.9 1T 4180 4368 2557 2522
2T 8301 4276 5072 2511
4T 15613 4238 7764 2425
8T 14496 4259 7278 2466
12288 1T 3360 2180 239 239
2T 6219 2140 379 240
4T 6758 2135 418 238
8T 6991 2131 418 232
Total Elapsed Time 47.6 seconds
#################### A1 ARM-Intel ######################
ARM/Intel MP-RndMem Benchmark V1.1 06-May-2015 11.54
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 4643 3593 4710 3641
2T 8583 3552 8761 3564
4T 12707 3450 12496 3384
8T 10410 3389 10796 3408
122.9 1T 3733 2874 2408 2150
2T 7259 2871 4781 2165
4T 11726 2897 7656 2133
8T 11673 2853 7100 2113
12288 1T 3153 2087 226 238
2T 5782 2073 327 238
4T 6451 1997 447 236
8T 6471 2071 446 233
Total Elapsed Time 41.5 seconds
########### A5 ARM-Intel Dual Boot With W2 #############
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
ARM/Intel MP-RndMem Benchmark V1.2 14-Apr-2016 17.41
Compiled for 32 bit Intel x86
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 4395 3558 4562 3346
2T 8094 3465 7975 3372
4T 11923 3377 11375 3220
8T 10165 3207 10220 3205
122.9 1T 3519 2796 2360 1993
2T 6875 2591 4233 1970
4T 10225 2761 5943 1935
8T 10158 2755 6363 2052
12288 1T 2586 1846 187 192
2T 3890 1728 310 213
4T 5035 1986 373 194
8T 3972 1887 359 186
Total Elapsed Time 44.0 seconds
#################### W1 REMIX 32 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-RndMem Benchmark V1.2 21-Oct-2016 14.32
Compiled for 32 bit Intel x86
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 4504 3504 4322 3382
2T 7137 2799 5874 3446
4T 8441 2526 7759 3049
8T 7693 1763 8478 1300
122.9 1T 2947 2777 2389 2086
2T 5791 2196 3345 1799
4T 6721 1821 4257 1475
8T 7466 1129 4926 1201
12288 1T 3026 2278 201 239
2T 3850 1687 326 218
4T 4451 1772 304 215
8T 5007 1407 407 160
Total Elapsed Time 47.0 seconds
#################### W1 REMIX 64 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-RndMem Benchmark V1.2 11-Nov-2016 21.30
Compiled for 64 bit Intel x86_64
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 3501 2736 3655 2561
2T 5999 2462 6015 1922
4T 7295 1306 5998 1930
8T 7895 983 7769 1607
122.9 1T 2851 2036 2273 1861
2T 4950 1772 2973 1623
4T 6384 1405 4053 1292
8T 6409 1046 4598 1049
12288 1T 2362 1826 207 225
2T 3609 1356 349 185
4T 3711 1378 288 174
8T 4910 1131 436 120
Total Elapsed Time 51.0 seconds
################# W1 Windows 10 32 bit #################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10 4 GB DDR3 1600 dual channel 12.8 GB/s
MPRandMem32 From C/C++ 18.00.21005.1 for x86
Start of test Mon Dec 12 16:17:43 2016
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.3 1T 4227 5149 4457 5086
2T 7978 5490 7846 5379
4T 10589 5292 10543 5208
8T 7912 5066 8068 5137
122.9 1T 3571 3893 2345 2380
2T 6453 3867 4227 2327
4T 11784 3845 6403 2385
8T 11449 3950 6431 2373
12288 1T 2948 2750 222 227
2T 4889 2761 408 229
4T 6290 2771 532 231
8T 6256 2724 534 269
End of test Mon Dec 12 16:18:27 2016
################# W1 Windows 10 64 bit #################
MPRandMem64 From C/C++ 18.00.21005.1 for x64
Start of test Mon Dec 12 16:22:12 2016
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.3 1T 3816 4658 3884 4495
2T 7060 4531 6971 4390
4T 12603 4383 12604 4334
8T 12435 4179 12493 4215
122.9 1T 3212 3594 2431 2248
2T 5919 3437 4220 2302
4T 11178 3459 6838 2299
8T 10630 3539 6775 2280
12288 1T 2789 2689 228 229
2T 4688 2663 424 242
4T 6079 2670 561 250
8T 6061 2667 562 270
End of test Mon Dec 12 16:22:55 2016
######## W2 Windows 10 32 bit Dual Boot With A5 ########
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
MPRandMem32 From C/C++ 18.00.21005.1 for x86
Start of test Mon Dec 12 16:29:17 2016
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.3 1T 4151 4929 4126 5104
2T 7501 5063 7496 4887
4T 10549 4933 10620 5206
8T 7259 5126 7278 5072
122.9 1T 3576 3997 2358 2372
2T 6223 3629 3763 2206
4T 11064 3709 6300 2234
8T 11442 3464 5399 2334
12288 1T 2691 2043 195 203
2T 3706 1999 315 217
4T 5382 2098 371 205
8T 5067 1925 352 197
End of test Mon Dec 12 16:30:01 2016
######## W2 Windows 10 64 bit Dual Boot With A5 ########
MPRandMem64 From C/C++ 18.00.21005.1 for x64
Start of test Mon Dec 12 16:26:52 2016
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.3 1T 3606 4076 3535 4068
2T 5461 3879 6031 3761
4T 11092 3779 10265 4028
8T 9485 3753 9284 3728
122.9 1T 2465 2726 1897 1916
2T 4836 2957 3673 2066
4T 8259 3168 4491 1974
8T 10424 3125 6583 2052
12288 1T 2246 1655 187 188
2T 3245 1769 301 187
4T 4933 1560 360 186
8T 4345 1790 344 175
End of test Mon Dec 12 16:27:38 2016
#################### PC REMIX 32 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-RndMem Benchmark V1.2 21-Oct-2016 12.49
Compiled for 32 bit Intel x86
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 25329 28404 22502 27578
2T 45352 28049 43404 29901
4T 67532 27226 66231 26721
8T 73022 27909 70942 29210
122.9 1T 24237 24426 12519 8183
2T 40910 24130 22546 8612
4T 67966 22138 28955 7129
8T 74659 18872 46730 7929
12288 1T 14375 12505 1139 1127
2T 27645 11799 2248 1105
4T 48129 11772 3564 1078
8T 72818 12119 4256 775
Total Elapsed Time 43.6 seconds
#################### PC REMIX 64 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-RndMem Benchmark V1.2 11-Nov-2016 14.35
Compiled for 64 bit Intel x86_64
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 26834 29490 27107 28870
2T 54416 29824 53434 25831
4T 105809 27746 56139 20591
8T 85898 19779 84818 21910
122.9 1T 23931 25524 11601 8270
2T 48842 25062 23859 8412
4T 98110 22674 47244 7154
8T 89250 16270 53559 5951
12288 1T 15175 12540 1077 1127
2T 29600 11483 2342 1095
4T 43737 10585 2200 904
8T 78035 11667 4351 755
Total Elapsed Time 46.1 seconds
==============================================
Top end 2015 PC - Core i7-4820K at 3.9 GHz
Quad core, 8 threads, 10 MB shared L3 cache
RAM 1600 MHz, quad channel, 51.2 GB/sec
==============================================
Intel/Windows 32 Bit Version
MPRandMem32 From C/C++ 18.00.21005.1 for x86
Start of test Tue Feb 23 16:05:00 2016
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.3 1T 26590 29369 27321 27593
2T 52121 30063 48980 27757
4T 66651 29464 72519 27466
8T 58774 28464 57426 26236
122.9 1T 25876 28670 13416 8815
2T 46692 28183 21803 8767
4T 82678 28469 46885 8497
8T 83158 28482 49158 8677
12288 1T 16527 13042 1196 1191
2T 27888 12767 2389 1188
4T 49291 13049 3393 1191
8T 84109 12954 4176 1192
End of test Tue Feb 23 16:05:41 2016
Intel/Windows 64 Bit Version
MPRandMem64 From C/C++ 18.00.21005.1 for x64
Start of test Tue Feb 23 16:06:04 2016
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.3 1T 26322 28220 25930 28695
2T 54658 30081 39512 27874
4T 99694 29950 89274 27925
8T 88620 29773 85848 27924
122.9 1T 25196 27993 13424 8633
2T 44627 28207 21816 8785
4T 65329 28108 44155 8620
8T 91445 28208 53751 8715
12288 1T 17662 13110 1301 1198
2T 32242 12856 2595 1198
4T 57536 13117 4905 1197
8T 85697 13079 4645 1197
End of test Tue Feb 23 16:06:46 2016
|
The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f, with 2 and 32 operations per input data word, using 1, 2, 4 and 8 threads. Three data sizes are used, to exercise L1 cache, L2 cache and RAM, at 12.8, 128 and 12800 KB (3200, 32000 and 3200000 single precision floating point words). Each thread uses the same calculations but accesses a different segment of the data. The program checks for consistent numeric results, primarily to show that all calculations can be run and are carried out. The numeric results start with values of 1.0, with subsequent calculations reducing the values, the amount depending on the number of calculations. Further details, results and links to download the original MP-MFLOPS benchmark can be found here, with more details of the latest MP-MFLOPS2 compilations here. The newer versions have longer running times that avoid the inconsistent speeds produced by the original.
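The structure of the calculation described above can be sketched in Python as follows. This is a minimal illustration, not the benchmark source. It assumes that the 2 operation test uses the first (x + a) * b term alone and that the 32 operation test extends the same alternating chain to 11 terms. The function names are hypothetical, and the segment split assumes the word count divides evenly by the thread count.

```python
def ops_per_word(terms):
    """Count floating point operations for a chain of (x + p) * q terms.

    The first term costs an add and a multiply; every further term costs
    an add, a multiply and the add or subtract that joins it to the chain.
    """
    return 2 + 3 * (terms - 1)


def update_word(x, pairs):
    """Evaluate (x + a) * b - (x + c) * d + (x + e) * f ... for one word."""
    add0, mul0 = pairs[0]
    result = (x + add0) * mul0
    subtract = True  # the second term is subtracted, the third added, ...
    for add, mul in pairs[1:]:
        term = (x + add) * mul
        result = result - term if subtract else result + term
        subtract = not subtract
    return result


def thread_segments(words, threads):
    """Split the word range into one contiguous segment per thread."""
    step = words // threads
    return [(t * step, (t + 1) * step) for t in range(threads)]


def mflops(words, ops, passes, seconds):
    """Millions of floating point operations per second for a timed run."""
    return words * ops * passes / seconds / 1e6
```

Under these assumptions, ops_per_word(1) gives 2, ops_per_word(3) gives 8 for the expression exactly as printed above, and ops_per_word(11) gives 32. With 3200 words and 4 threads, thread_segments returns four segments of 800 words each.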
Using Tablet A1, with the Intel Atom CPU, the original ARM only version was much slower than the native code variety, at 32 operations per word, and running via Android 5.0 was not much faster. Similarly, there was little difference on ARM based systems, between the original and later compilations.
Tablet T22 results, from the 64 bit compilation, showed that it could be much faster than the 32 bit benchmark, up to 3.7 times at 2 operations per word. The reason is that 64 bit vector SIMD instructions were produced, instead of scalars.
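The 64 bit gain quoted above can be checked against the single thread rows of the T22 32 bit and 64 bit results tables. The short Python sketch below does this; the variable and function names are illustrative only.

```python
# MFLOPS at 1 thread, copied from the T22 32 bit and 64 bit tables.
# Columns: 2 ops/word at 12.8, 128 and 12800 KB, then 32 ops/word likewise.
T22_32BIT_1T = [190, 190, 184, 670, 672, 664]
T22_64BIT_1T = [705, 701, 636, 1398, 1394, 1362]


def speedups(new, old):
    """Element-wise ratio new/old for two rows of results."""
    return [n / o for n, o in zip(new, old)]


ratios = speedups(T22_64BIT_1T, T22_32BIT_1T)
print(" ".join(f"{r:.2f}" for r in ratios))
print(f"largest gain: {max(ratios):.1f} times")
```

The largest ratio is at 2 operations per word with 12.8 KB of data, where less arithmetic per word is available to hide the cost of scalar loads and stores.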
MFLOPS/MHz Comparisons - These are provided to compare different CPU technology. None of the Android results are particularly good, the best being the Cortex-A53 at 64 bits, producing just over 1 result per cycle per CPU.
Intel/Windows Versions - The compiler used for these appears to be somewhat more advanced than that used for Intel/Android, implementing full SIMD SSE instructions for 64 bit and 32 bit benchmarks. The result is that a Z8300 Atom CPU core produced up to 1.66 MFLOPS/MHz. The maximum speed of a Core i7, using SSE instructions, is 4 multiplies and 4 linked adds per cycle (8 MFLOPS/MHz). This benchmark demonstrated more than 5.5 MFLOPS/MHz.
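The MFLOPS/MHz figures are simply the measured MFLOPS divided by the CPU clock in MHz. The Python sketch below reproduces three of the single thread, 32 operations per word values from the results tables. The clock speeds are those quoted in the system descriptions, and the names used are illustrative.

```python
def mflops_per_mhz(mflops, mhz):
    """Results per clock cycle: MFLOPS divided by CPU clock in MHz."""
    return mflops / mhz


SSE_PEAK = 8.0  # 4 multiplies + 4 linked adds per cycle, as stated above

# (label, MFLOPS, MHz) taken from single thread, 32 ops/word results
samples = [
    ("T22 Cortex-A53 64 bit, 12.8 KB", 1398, 1300),
    ("W1 Atom Z8300 Windows 64 bit, 128 KB", 3060, 1840),
    ("PC Core i7 Windows 64 bit, 128 KB", 22201, 3900),
]
for label, mflops, mhz in samples:
    print(f"{label}: {mflops_per_mhz(mflops, mhz):.2f} MFLOPS/MHz")

# Share of the theoretical SSE maximum reached by the Core i7 run
core_i7 = mflops_per_mhz(22201, 3900)
print(f"Core i7 share of SSE peak: {core_i7 / SSE_PEAK:.0%}")
```

These give roughly 1.08, 1.66 and 5.69 MFLOPS/MHz, so the Core i7 run reaches about 71% of the 8 MFLOPS/MHz SSE maximum.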
A5 and W2 Dual Boot Tablet - The Windows compilation is much faster than the Android version, as SSE SIMD type instructions are used. For comparable performance see A5 results below in section NEON-MFLOPS-MP Benchmark. This uses hand coded NEON intrinsic functions, rather than compiler generated machine code. Speeds from the 64 bit version appear to be somewhat faster than the 32 bit variety. However, note that there can be wide variations in recorded results.
REMIX/Android vs Windows - Windows was faster at 32 bits but performance was similar at 64 bits, for both Atom Z8300 and Core i7.
Other 64 Bit vs 32 Bit - REMIX/Android produced significantly increased speeds at 64 bits, on the Atom and Core i7.
##################### T7 Original ######################
T7, ARM Cortex-A9 1200 MHz, Android 4.1.2,
Android MP-MFLOPS2 Benchmark V2.1 05-Feb-2015 11.37
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 182 156 114 598 578 572
2T 365 321 194 1194 1163 1141
4T 716 655 233 2367 2316 2240
8T 717 682 233 2347 2371 2246
Total Elapsed Time 135.5 seconds
#################### T7 ARM-Intel #####################
ARM/Intel MP-MFLOPS2 Benchmark V2.1 28-Apr-2015 17.44
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 188 156 116 598 578 574
2T 365 319 197 1195 1161 1145
4T 682 709 237 2372 2345 2249
8T 678 731 237 2361 2381 2254
Total Elapsed Time 135.0 seconds
#################### T11 Original #####################
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Android MP-MFLOPS2 Benchmark V2.1 29-Apr-2015 10.22
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 845 817 544 1546 1539 1512
2T 1593 1668 648 3140 3067 2977
4T 1974 1775 645 2963 3093 2845
8T 1935 2059 652 3108 3147 2985
Total Elapsed Time 58.5 seconds
#################### T11 ARM-Intel ####################
ARM/Intel MP-MFLOPS2 Benchmark V2.1 28-Apr-2015 20.30
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 695 756 536 1537 1501 1476
2T 1319 1527 645 3151 3077 3000
4T 1604 1567 657 3035 3095 2997
8T 1604 1639 658 3108 3125 2996
Total Elapsed Time 59.1 seconds
#################### T21 Original #####################
T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Android MP-MFLOPS2 Benchmark V2.1 05-Jul-2015 15.35
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 718 781 590 1214 1220 1228
2T 1572 1583 1118 2406 2436 2442
4T 2338 2959 1836 4867 4911 4859
8T 3148 3266 1866 4870 4916 4888
Total Elapsed Time 56.4 seconds
#################### T21 ARM-Intel ####################
ARM/Intel MP-MFLOPS2 Benchmark V2.1 05-Jul-2015 16.50
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 822 768 636 1232 1228 1231
2T 1662 1637 1184 2460 2463 2446
4T 2509 3216 1659 4519 4762 4900
8T 2965 3193 1881 4847 4925 4880
###################### P37 32 Bit ######################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache
ARM/Intel MP-MFLOPS2 Benchmark V2.2 14-Nov-2016 12.16
Compiled for 32 bit ARM v7a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 229 226 217 811 810 797
2T 451 446 422 1615 1617 1591
4T 884 857 646 3213 3199 3159
8T 1309 1276 714 5192 5164 5030
Total Elapsed Time 90.7 seconds
Android 7.0
ARM/Intel MP-MFLOPS2 Benchmark V2.2 11-May-2017 10.39
Compiled for 32 bit ARM v7a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 229 227 220 814 813 801
2T 455 450 435 1626 1623 1609
4T 891 867 687 3225 3219 3181
8T 1283 1307 708 5156 5241 5142
Total Elapsed Time 90.1 seconds
###################### T22 32 Bit ######################
T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel MP-MFLOPS2 Benchmark V2.2 09-Aug-2015 21.17
Compiled for 32 bit ARM v7a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 190 190 184 670 672 664
2T 377 378 370 1343 1345 1329
4T 707 755 725 2657 2669 2621
8T 722 736 714 2640 2672 2631
Total Elapsed Time 113.0 seconds
###################### T22 64 Bit ######################
ARM/Intel MP-MFLOPS2 Benchmark V2.2 09-Aug-2015 21.24
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 705 701 636 1398 1394 1362
2T 1376 1395 942 2794 2797 2757
4T 2063 2602 962 5491 5546 5336
8T 2474 2611 957 5367 5500 5417
Total Elapsed Time 51.6 seconds
#################### A1 Original #######################
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
Android MP-MFLOPS2 Benchmark V2.1 04-Feb-2015 11.03
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 502 501 476 575 575 573
2T 1012 975 921 1133 1140 1115
4T 1571 1627 979 2238 2255 2258
8T 1550 1890 1007 2235 2239 2217
Total Elapsed Time 117.4 seconds
################## A1 V1 Android 5.0 ##################
Android MP-MFLOPS2 Benchmark V2.1 05-Nov-2015 11.59
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 607 586 559 556 553 555
2T 1174 1153 1057 1111 1115 1112
4T 1539 2220 992 2181 2207 2179
8T 1736 2097 1011 2184 2194 2178
Total Elapsed Time 119.2 seconds
#################### A1 ARM-Intel ######################
ARM/Intel MP-MFLOPS2 Benchmark V2.1 28-Apr-2015 17.24
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 695 696 661 1061 1061 1055
2T 1335 1382 1058 2088 2086 2102
4T 1832 2635 979 3993 4125 4145
8T 2026 2557 1007 3842 4044 4110
Total Elapsed Time 65.8 seconds
########### A5 ARM-Intel Dual Boot With W2 #############
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
ARM/Intel MP-MFLOPS2 Benchmark V2.2 14-Apr-2016 17.53
Compiled for 32 bit Intel x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 422 450 401 945 964 939
2T 795 849 754 1809 1859 1815
4T 1161 1514 1084 3043 3159 3144
8T 1141 1376 1065 3173 3241 3234
Total Elapsed Time 78.8 seconds
#################### W1 REMIX 32 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-MFLOPS2 Benchmark V2.2 21-Oct-2016 14.27
Compiled for 32 bit Intel x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 386 449 427 922 930 917
2T 579 738 733 1658 1642 1636
4T 894 1011 839 2326 2146 2121
8T 974 1084 1039 2239 2355 2433
Total Elapsed Time 90.6 seconds
#################### W1 REMIX 64 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel MP-MFLOPS2 Benchmark V2.2 14-Aug-2016 22.35
Compiled for 64 bit Intel x86_64
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1365 1369 926 2478 2525 2438
2T 2628 2746 1403 4420 4439 4382
4T 2505 3654 1462 5398 6022 5754
8T 2619 3133 1570 6133 6500 6224
Total Elapsed Time 34.0 seconds
################# W1 Windows 10 32 bit #################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
MP-MFLOPS From C/C++ 18.00.21005.1 for x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1467 1388 1215 2537 2529 2486
2T 2773 2825 1659 4937 4958 4740
4T 3334 4845 1512 8453 8813 8694
8T 2818 5068 1575 8338 8896 8627
################# W1 Windows 10 64 bit #################
MP-MFLOPS From C/C++ 18.00.21005.1 for x64
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1470 1471 1252 2936 3060 2996
2T 2775 2982 1653 5593 5860 5680
4T 3610 5290 1520 9401 10488 10326
8T 3132 5178 1562 8957 8365 10433
######## W2 Windows 10 32 bit Dual Boot With A5 ########
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
MP-MFLOPS From C/C++ 18.00.21005.1 for x86
Start of test Sat May 21 19:10:08 2016
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1415 1346 968 2368 2274 2227
2T 2336 2436 857 4460 4433 4181
4T 2718 4196 1046 7192 7984 7678
8T 3073 3220 1071 6133 8773 6413
######## W2 Windows 10 64 bit Dual Boot With A5 ########
MP-MFLOPS From C/C++ 18.00.21005.1 for x64
Start of test Fri Apr 15 16:41:27 2016
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1560 1584 1034 2952 2965 2877
2T 2590 2757 1160 5369 5862 5333
4T 3852 5094 1090 9407 10478 10331
8T 3480 4973 1133 7748 10417 7742
#################### PC REMIX 32 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-MFLOPS2 Benchmark V2.2 21-Oct-2016 12.47
Compiled for 32 bit Intel x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 3593 3565 3355 5610 5870 5859
2T 6858 7298 6767 10848 11732 11689
4T 7267 14299 7480 18157 23093 20018
8T 10919 13727 11940 22555 22935 22929
Total Elapsed Time 12.1 seconds
#################### PC REMIX 64 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel MP-MFLOPS2 Benchmark V2.2 11-Nov-2016 14.34
Compiled for 64 bit Intel x86_64
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 13176 8885 6002 21867 22182 21447
2T 21999 22460 11030 42151 43598 45387
4T 24740 31790 15002 82615 86988 87136
8T 24161 41857 27639 78321 89838 85588
Total Elapsed Time 3.4 seconds
################# PC Windows 10 32 bit #################
Top end 2015 PC - Core i7-4820K at 3.9 GHz
MP-MFLOPS From C/C++ 18.00.21005.1 for x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 11945 10323 6088 21760 21813 21691
2T 18020 20096 11072 34309 43919 45673
4T 25662 42897 13955 55831 89194 90429
8T 22256 49955 14299 80928 90240 88848
################# PC Windows 10 64 bit #################
MP-MFLOPS From C/C++ 18.00.21005.1 for x64
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 14218 12522 6044 22097 22201 22087
2T 21473 24706 11189 42464 44797 46061
4T 24241 28250 15774 59471 90548 81144
8T 27512 57442 14238 82808 92377 92959
################ Comparison MFLOPS/MHz ################
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
Threads
32 Bit Only
Android
T7 1T 0.16 0.13 0.10 0.50 0.48 0.48
Cortex 2T 0.30 0.27 0.16 1.00 0.97 0.95
A9 4T 0.57 0.59 0.20 1.98 1.95 1.87
T11 1T 0.41 0.44 0.32 0.90 0.88 0.87
Cortex 2T 0.78 0.90 0.38 1.85 1.81 1.76
A15 4T 0.94 0.92 0.39 1.79 1.82 1.76
T21 1T 0.38 0.36 0.30 0.57 0.57 0.57
Qualcomm 2T 0.77 0.76 0.55 1.14 1.15 1.14
800 4T 1.17 1.50 0.77 2.10 2.21 2.28
A1 1T 0.37 0.37 0.36 0.57 0.57 0.57
Atom 2T 0.72 0.74 0.57 1.12 1.12 1.13
Z3745 4T 0.98 1.42 0.53 2.15 2.22 2.23
A5 1T 0.23 0.24 0.22 0.51 0.52 0.51
Atom 2T 0.43 0.46 0.41 0.98 1.01 0.99
z8300 4T 0.63 0.82 0.59 1.65 1.72 1.71
P37 1T 0.15 0.15 0.14 0.54 0.54 0.53
Cortex 2T 0.30 0.30 0.28 1.08 1.08 1.06
A53 4T 0.59 0.57 0.43 2.14 2.13 2.11
8 core 8T 0.87 0.85 0.48 3.46 3.44 3.35
###########################################################
32 Bit and 64 Bit
Android
T22 32b 1T 0.15 0.15 0.14 0.52 0.52 0.51
Cortex 2T 0.29 0.29 0.28 1.03 1.03 1.02
A53 4T 0.54 0.58 0.56 2.04 2.05 2.02
T22 64b 1T 0.54 0.54 0.49 1.08 1.07 1.05
Cortex 2T 1.06 1.07 0.72 2.15 2.15 2.12
A53 4T 1.59 2.00 0.74 4.22 4.27 4.10
REMIX/Android
R1 32b 1T 0.21 0.24 0.23 0.50 0.51 0.50
Atom 2T 0.31 0.40 0.40 0.90 0.89 0.89
Z8300 4T 0.49 0.55 0.46 1.26 1.17 1.15
R1 64b 1T 0.74 0.74 0.50 1.35 1.37 1.33
Atom 2T 1.43 1.49 0.76 2.40 2.41 2.38
Z8300 4T 1.36 1.99 0.79 2.93 3.27 3.13
R2 32b 1T 0.92 0.91 0.86 1.44 1.51 1.50
Core i7 2T 1.76 1.87 1.74 2.78 3.01 3.00
4820K 4T 1.86 3.67 1.92 4.66 5.92 5.13
8HT 8T 2.80 3.52 3.06 5.78 5.88 5.88
R2 64b 1T 3.38 2.28 1.54 5.61 5.69 5.50
Core i7 2T 5.64 5.76 2.83 10.81 11.18 11.64
4820K 4T 6.34 8.15 3.85 21.18 22.30 22.34
8HT 8T 6.20 10.73 7.09 20.08 23.04 21.95
Windows
W1 32b 1T 0.80 0.75 0.66 1.38 1.37 1.35
Atom 2T 1.51 1.54 0.90 2.68 2.69 2.58
Z8300 4T 1.81 2.63 0.82 4.59 4.79 4.73
W1 64b 1T 0.80 0.80 0.68 1.60 1.66 1.63
Atom 2T 1.51 1.62 0.90 3.04 3.18 3.09
Z8300 4T 1.96 2.88 0.83 5.11 5.70 5.61
W2 32b 1T 0.77 0.73 0.53 1.29 1.24 1.21
Atom 2T 1.27 1.32 0.47 2.42 2.41 2.27
z8300 4T 1.48 2.28 0.57 3.91 4.34 4.17
W2 64b 1T 0.85 0.86 0.56 1.60 1.61 1.56
Atom 2T 1.41 1.50 0.63 2.92 3.19 2.90
z8300 4T 2.09 2.77 0.59 5.11 5.69 5.61
PC 32b 1T 3.06 2.65 1.56 5.58 5.59 5.56
Core i7 2T 4.62 5.15 2.84 8.80 11.26 11.71
4820K 4T 6.58 11.00 3.58 14.32 22.87 23.19
8HT 8T 5.71 12.81 3.67 20.75 23.14 22.78
PC 64b 1T 3.65 3.21 1.55 5.67 5.69 5.66
Core i7 2T 5.51 6.33 2.87 10.89 11.49 11.81
4820K 4T 6.22 7.24 4.04 15.25 23.22 20.81
8HT 8T 7.05 14.73 3.65 21.23 23.69 23.84
|
NEON-MFLOPS-MP carries out the same calculations as MP-MFLOPS Benchmarks above, but with NEON intrinsic functions used for all calculations. For further results see here. The effect of using these functions, instead of leaving it to the compiler, is that 32 bit performance, on ARM based systems, was similar between the original and new benchmarks.
T22 NEON 64 bit compilation produced a small performance gain over 32 bit results, at 2 operations per word, but near double speed at 32 operations, the latter benefiting from availability of sufficient registers for all the variables.
On the Intel Atom based tablet A1, via the ARM to Intel conversion layer, performance was similar via Android 4 and 5, but the native code version was more than twice as fast at 32 operations per word.
MFLOPS/MHz Comparisons are also provided, including examples of maximum speeds from the non-NEON version, demonstrating NEON gains of more than three times. A result submitted for P33, with an ARM Cortex-A57, produced the best single core performance (at November 2015) of 3.47 results per cycle at 64 bits, followed by the Cortex-A53 at 2.13. This is still disappointing, compared with Intel desktop processors, such as the Core 2 onwards, at 6 per clock cycle out of a maximum of 8, with SSE SIMD code (See Linux results).
Intel REMIX/Android - For some reason, this native ARM/Intel and 64 bit/32 bit version failed to run. In this case, the compiler probably failed to translate NEON intrinsic functions into appropriate Intel instructions. The original benchmark had pure ARM code, translated by the Houdini interpreter, and that ran successfully, results being included below. This demonstrated up to 4.11 MFLOPS/MHz using a single core.
Following the performance details are the numeric results of calculations from the fixed parameters used in the new version, for both ARM and Intel. It seems that Tablet T11 has an intermittent fault, as it occasionally fails to calculate a correct answer or causes the Tablet to crash and reboot. Now, this also appears to happen using the older version.
The benchmark appeared to run successfully with an Energy Saving On setting, where performance was much slower and CPU MHz was measured as 1000 MHz instead of 1700 (see results below).
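One way to judge whether the Energy Saving slowdown is explained by the lower clock alone is to compare MFLOPS per MHz at the two settings. The Python sketch below does this for the T11 ARM-Intel single thread results at 32 operations per word, using the 1700 and 1000 MHz figures quoted above. The names are illustrative only.

```python
def per_mhz(mflops_row, mhz):
    """Divide each MFLOPS value in a row by the CPU clock in MHz."""
    return [m / mhz for m in mflops_row]


# T11 ARM-Intel, 1 thread, 32 ops/word at 12.8, 128 and 12800 KB
NORMAL_1T = [3792, 4077, 3521]   # measured at 1700 MHz
SAVING_1T = [2516, 2397, 2339]   # Energy Saving On, measured at 1000 MHz

normal = per_mhz(NORMAL_1T, 1700)
saving = per_mhz(SAVING_1T, 1000)
print("normal:", " ".join(f"{v:.2f}" for v in normal))
print("saving:", " ".join(f"{v:.2f}" for v in saving))
```

On these figures the per MHz rates fall in a similar range at both settings, roughly 2.1 to 2.5, which suggests that the slower speeds mainly follow the reduced clock.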
##################### T7 Original ######################
T7, ARM Cortex-A9 1200 MHz, Android 4.1.2,
Android NEON-MFLOPS-MP Benchmark V1.0 20-Dec-2012 16.57
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 532 402 124 1135 1044 960
2T 1255 798 213 2041 1987 1916
4T 2441 1553 229 4185 4034 3450
8T 1922 2403 226 3774 3996 3346
Total Elapsed Time 4.5 seconds
#################### T7 ARM-Intel #####################
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-May-2015 12.24
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 657 407 132 1077 1074 1053
2T 1265 817 222 2147 2150 2078
4T 2024 1695 234 4214 4276 3555
8T 2435 2495 234 4196 4100 3523
Total Elapsed Time 39.0 seconds
#################### T11 Original #####################
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Dual Core
Android NEON-MFLOPS-MP Benchmark V1.1 13-Sep-2013 13.44
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1847 1415 597 3772 4096 3545
2T 3649 3309 664 8065 7966 7505
4T 3670 3922 658 7753 8148 7490
8T 5664 5570 681 8092 8355 7672
Total Elapsed Time 13.0 seconds
#################### T11 ARM-Intel ####################
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-May-2015 12.07
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1965 1630 582 3792 4077 3521
2T 3789 2690 663 8497 8133 7297
4T 5714 4883 654 8364 8192 7554
8T 5414 6316 673 7976 8437 6635
Total Elapsed Time 13.0 seconds
######## T11 ARM-Intel Power Saving On 1.0 GHz ########
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-Nov-2015 16.55
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1935 1290 645 2516 2397 2339
2T 3664 2644 684 4945 4780 4657
4T 3436 3337 690 4911 4931 4674
8T 3133 3543 689 4818 4959 4651
Total Elapsed Time 19.2 seconds
#################### T21 Original #####################
T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s
Android NEON-MFLOPS2-MP Benchmark V2.1 25-Jul-2015 18.44
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 2757 2576 771 2808 2825 2800
2T 5662 5525 1516 5631 5664 5570
4T 6550 7846 1945 11167 11281 10939
8T 10273 10928 1981 10851 11211 11350
Total Elapsed Time 40.0 seconds
#################### T21 ARM-Intel ####################
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 28-Jun-2015 16.32
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 3049 2857 622 2923 2874 2098
2T 5508 4887 1009 5477 5736 4349
4T 5643 5282 1410 11244 11601 8564
8T 9294 11156 1681 11288 11605 8946
Total Elapsed Time 14.0 seconds
###################### P37 32 Bit ######################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 14-Nov-2016 12.18
Compiled for 32 bit ARM v7a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 740 660 399 1739 1729 1691
2T 1334 1228 566 3449 3416 3328
4T 2188 2139 675 6671 6674 6463
8T 2489 3261 722 10379 10466 9768
Total Elapsed Time 22.1 seconds
Android 7.0
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 11-May-2017 10.44
Compiled for 32 bit ARM v7a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 716 686 432 1740 1740 1703
2T 1367 1255 614 3457 3427 3358
4T 2389 2131 726 6814 6682 6644
8T 2914 2776 744 10082 9994 9712
Total Elapsed Time 21.8 seconds
###################### T22 32 Bit ######################
T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 13-Aug-2015 16.35
Compiled for 32 bit ARM v7a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 619 613 575 1444 1446 1426
2T 1174 1206 889 2894 2902 2839
4T 1585 1616 901 5679 5726 5596
8T 2075 2130 944 5400 5585 5519
Total Elapsed Time 25.8 seconds
###################### T22 64 Bit ######################
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 13-Aug-2015 16.38
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 726 745 647 2766 2774 2639
2T 1397 1402 903 5523 5552 5371
4T 1871 1930 898 10780 10479 10439
8T 2496 2876 1011 9736 10679 9900
Total Elapsed Time 15.1 seconds
##################### P33 64 Bit #####################
P33 Quad-core 2 GHz Qualcomm Snapdragon 810, Android 5.0.2
4 x Cortex-A57 and 4 x Cortex-A53
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 16-Sep-2015 17.59
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 2811 3126 1089 6943 6589 6342
2T 2488 4114 1541 12084 10559 8809
4T 4759 5480 2038 16516 14826 11960
8T 4840 8985 2452 22082 23563 12461
Total Elapsed Time 7.6 seconds
#################### A1 Original #######################
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
Android NEON-MFLOPS2-MP Benchmark V2.1 07-Feb-2015 18.38
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1796 1520 1025 1231 1228 1227
2T 3354 2959 1047 2427 2445 2445
4T 4627 5508 978 4690 4791 4733
8T 3861 6307 1030 4611 4869 4742
Total Elapsed Time 88.3 seconds
################## A1 V1 Android 5.0 ##################
Android NEON-MFLOPS2-MP Benchmark V2.1 05-Nov-2015 12.09
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1969 1913 832 1230 1245 1225
2T 3537 3632 1046 2482 2487 2445
4T 3388 6497 982 4546 4847 4819
8T 4197 6863 1026 4640 4899 4828
Total Elapsed Time 87.7 seconds
#################### A1 ARM-Intel ######################
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-May-2015 12.17
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 2151 1962 1064 2619 2694 2650
2T 4421 3849 1048 5296 5463 5343
4T 5886 6652 982 9592 10735 10362
8T 3744 7284 1018 9085 10791 9493
Total Elapsed Time 13.8 seconds
################### W1 REMIX Original ##################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
Android NEON-MFLOPS-MP Benchmark V1.1 11-Nov-2016 21.39
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 392 414 388 1964 1954 2084
2T 1790 2301 1133 3237 3775 3774
4T 2130 2386 1068 4165 3541 4188
8T 2110 2047 1026 4438 4091 3631
#################### W1 REMIX 32 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 21-Oct-2016 14.40
Compiled for 32 bit Intel x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1322 1342 965 2377 2517 2354
2T 2261 2627 1155 4140 4316 4329
4T 2187 2656 1361 5494 6082 5693
8T 1978 2673 1613 5888 6050 6119
Total Elapsed Time 17.7 seconds
#################### W1 REMIX 64 Bit ###################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 11-Nov-2016 21.40
Compiled for 64 bit Intel x86_64
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
Can't run - Not an ARMv7 CPU
Total Elapsed Time 0.0 seconds
#################### A5 ARM Intel ######################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 14-Apr-2016 17.57
Compiled for 32 bit Intel x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 1501 1551 1030 2520 2485 2301
2T 2300 2957 1161 4699 4999 4632
4T 3106 5126 1097 7929 8173 8015
8T 2692 4623 1108 7830 8432 7989
Total Elapsed Time 15.7 seconds
################### PC REMIX Original ##################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
Android NEON-MFLOPS-MP Benchmark V1.1 11-Nov-2016 14.44
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 7381 6891 4206 16044 14885 15134
2T 8892 8294 6078 25814 15291 15897
4T 20783 20566 12919 55052 33458 58857
8T 14049 16003 13811 49462 46915 53373
#################### PC REMIX 32 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 21-Oct-2016 12.53
Compiled for 32 bit Intel x86
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
Can't run - CPU doesn't support NEON
Total Elapsed Time 0.0 seconds
#################### PC REMIX 64 Bit ###################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 11-Nov-2016 14.45
Compiled for 64 bit Intel x86_64
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
Can't run - Not an ARMv7 CPU
Total Elapsed Time 0.0 seconds
################ Comparison MFLOPS/MHz ################
                2 Ops/Word             32 Ops/Word      Not NEON
         KB   12.8    128  12800     12.8    128  12800     12.8
    Threads
                          32 Bit Only
Android
T7       1T   0.55   0.34   0.11     0.90   0.90   0.88     0.50
Cortex   2T   1.05   0.68   0.19     1.79   1.79   1.73     1.00
A9       4T   1.69   1.41   0.20     3.51   3.56   2.96     1.98
T11      1T   1.16   0.96   0.34     2.23   2.40   2.07     0.90
Cortex   2T   2.23   1.58   0.39     5.00   4.78   4.29     1.85
A15      4T   3.36   2.87   0.38     4.92   4.82   4.44     1.79
T21      1T   1.42   1.33   0.29     1.36   1.34   0.98     0.57
Qualcomm 2T   2.56   2.27   0.47     2.55   2.67   2.02     1.14
800      4T   2.62   2.46   0.66     5.23   5.40   3.98     2.10
A1       1T   1.16   1.05   0.57     1.41   1.45   1.42     0.57
Atom     2T   2.38   2.07   0.56     2.85   2.94   2.87     1.12
Z3745    4T   3.16   3.58   0.53     5.16   5.77   5.57     2.15
A5       1T   0.82   0.84   0.56     1.37   1.35   1.25     0.51
Atom     2T   1.25   1.61   0.63     2.55   2.72   2.52     0.98
Z8300    4T   1.69   2.79   0.60     4.31   4.44   4.36     1.65
P37      1T   0.49   0.44   0.27     1.16   1.15   1.13     0.54
Cortex   2T   0.89   0.82   0.38     2.30   2.28   2.22     1.08
A53      4T   1.46   1.43   0.45     4.45   4.45   4.31     2.14
8 core   8T   1.66   2.17   0.48     6.92   6.98   6.51     3.46
###########################################################
                       32 Bit and 64 Bit
Android
T22 32b  1T   0.48   0.47   0.44     1.11   1.11   1.10     0.52
Cortex   2T   0.90   0.93   0.68     2.23   2.23   2.18     1.03
A53      4T   1.22   1.24   0.69     4.37   4.40   4.30     2.04
T22 64b  1T   0.56   0.57   0.50     2.13   2.13   2.03     0.57
Cortex   2T   1.07   1.08   0.69     4.25   4.27   4.13     1.12
A53      4T   1.44   1.48   0.69     8.29   8.06   8.03     2.15
P33      1T   1.41   1.56   0.54     3.47   3.29   3.17      N/A
Cortex   2T   1.24   2.06   0.77     6.04   5.28   4.40
A57 64b  4T   2.38   2.74   1.02     8.26   7.41   5.98
REMIX/Android
R1 32b   1T   0.72   0.73   0.52     1.29   1.37   1.28     0.50
Atom     2T   1.23   1.43   0.63     2.25   2.35   2.35     0.90
Z8300    4T   1.19   1.44   0.74     2.99   3.31   3.09     1.26
R1 64b   1T   Can't run - Not an ARMv7 CPU
Atom     2T
Z8300    4T
R2 32b   1T   Can't run - Not an ARMv7 CPU
Core i7  2T
4820K    4T
8HT      8T
R2 64b   1T   Can't run - Not an ARMv7 CPU
Core i7  2T
4820K    4T
8HT      8T
                  Original Houdini Interpreted           Windows
R2 32b   1T   1.89   1.77   1.08     4.11   3.82   3.88     5.56
Core i7  2T   2.28   2.13   1.56     6.62   3.92   4.08    11.71
4820K    4T   5.33   5.27   3.31    14.12   8.58  15.09    23.19
Windows Not applicable
##################### New Results #####################
Results x 100000, 12345 indicates ERRORS
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1
1T 44934 86735 99850 36770 79897 99759
2T 44934 86735 99850 36770 79897 99759
4T 44934 86735 99850 36770 79897 99759
8T 44934 86735 99850 36770 79897 99759
T11 44934 12345 99850 36770 79897 99759
Android NEON-MFLOPS-MP Benchmark V1.1
1T 86735 98519 99984 79897 97638 99975
2T 86735 98519 99984 79897 97638 99975
4T 86735 98519 99984 79897 97638 99975
8T 86735 98519 99984 79897 97638 99975
Android NEON-MFLOPS2-MP Benchmark V2.1
1T 40015 66980 99522 35216 54898 99234
2T 40015 66980 99522 35216 54898 99234
4T 40015 66980 99522 35216 54898 99234
8T 40015 66980 99522 35216 54898 99234
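The listing above repeats the same six values for every thread count, with 12345 substituted where a calculation went wrong (as in the T11 row). A check of this kind can be sketched as below. This is only an illustration of the idea, not the benchmark's own code, and the names `check_results` and `ERROR_MARKER` are assumptions.

```python
ERROR_MARKER = 12345  # value shown in the log in place of a wrong answer

def check_results(rows, expected):
    """Return (label, column, kind) for every value that differs from `expected`."""
    problems = []
    for label, values in rows.items():
        for col, (got, want) in enumerate(zip(values, expected)):
            if got != want:
                kind = "error marker" if got == ERROR_MARKER else "mismatch"
                problems.append((label, col, kind))
    return problems

# Expected values and two rows copied from the V2.1 ARM/Intel listing
expected_v21 = [44934, 86735, 99850, 36770, 79897, 99759]
rows = {
    "1T": [44934, 86735, 99850, 36770, 79897, 99759],
    "T11": [44934, 12345, 99850, 36770, 79897, 99759],
}

problems = check_results(rows, expected_v21)
print(problems)
```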
|
OpenGL Benchmark - JavaOpenGL1.apk
The benchmark does not rely on complex visual scenes or mathematical functions, the objective being to generate moderate to excessive loading via multiple simple objects. It uses all Java code, with OpenGL ES GL10 statements, to measure graphics performance in Frames Per Second (FPS). Four tests are run. The first two draw a background of 50 cubes, first as wireframes, then colour shaded. The third views the cubes in and out of a tunnel with slotted sides and roof, also containing rotating plates. The last adds textures to the cubes and plates. The 50 cubes are redrawn 15, 30 and 60 times, with randomised positions, colours and rotational settings. With 6 x 2 triangles per cube, minimum triangles per frame for the three sets of tests are 9000, 18000 and 36000.
An example of the last scene is on the right. The tunnel is provided to show 3D effects, the plates rotating in fixed positions. The numerous cubes are in the distant background, the tunnel slots showing that they are still there, with size varying according to proximity. The cubes appear more as jumping objects, with changing colours and position.
Android 5 switched to the ART virtual machine for Java, instead of Dalvik. First results indicate severe degradation in performance with this benchmark. Further details and results can be found here. This includes information on Vertical Synchronisation (VSYNC), which limits Frames Per Second (FPS) to 60 and can lead to heavier loading reducing speed in 50% steps, as is apparent in the results below.
Links to my Windows and Linux OpenGL benchmarks are also provided.
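The minimum triangle counts quoted for the three sets of tests follow directly from the number of cubes and redraws. A short sketch of the arithmetic, with a triangles per second helper added for comparing results (the names are illustrative, not taken from the benchmark source):

```python
CUBES = 50
TRIANGLES_PER_CUBE = 6 * 2  # six faces, two triangles each

def min_triangles_per_frame(redraws):
    """Triangles from the background cubes only; tunnel and plates add more."""
    return CUBES * TRIANGLES_PER_CUBE * redraws

def triangles_per_second(triangles, fps):
    """Lower bound on triangle throughput for a given FPS result."""
    return triangles * fps

counts = [min_triangles_per_frame(n) for n in (15, 30, 60)]
print(counts)

# Example: T15 WireFrame test at 36000+ triangles, 31.33 FPS
t15_rate = round(triangles_per_second(36000, 31.33))
print(t15_rate)
```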
On tablets A1 and T7, Android was upgraded to version 5.0, leading to a reduction in measured speeds of up to 50%, possibly suggesting that the VSYNC limit had changed to 30 FPS. The graphics in A5 appear to be slightly faster than in A1, but maximum speed appears to be similarly restricted to 30 FPS.
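One way to picture the step reductions is a simplified model in which every frame has to wait for the next display refresh, so the frame rate can only be the refresh rate divided by a whole number. The sketch below assumes that model (simple double buffering); it is not derived from the benchmark code, and the reported FPS values are averages over many frames, so intermediate values appear in practice.

```python
import math

def vsync_limited_fps(unsynced_fps, refresh_hz=60.0):
    """FPS when each frame must wait for the next display refresh.

    Assumed model: a frame that takes longer than n refresh periods
    is shown on refresh n + 1.
    """
    # Small tolerance so a frame rate exactly equal to the refresh rate is not penalised
    refreshes_per_frame = math.ceil(refresh_hz / unsynced_fps - 1e-9)
    return refresh_hz / max(refreshes_per_frame, 1)

for raw in (100, 59, 25):
    print(raw, vsync_limited_fps(raw))
```

Under this model, a load that just misses 60 FPS drops straight to 30 FPS, which is the 50% step mentioned earlier, and a 30 FPS limit would correspond to calling the function with `refresh_hz=30.0`.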
Except for tablet T15, none of the results are particularly good at the heavier loading. T15 results were also produced via Android 5, with several measurements at near 60 FPS, suggesting that speed reductions on the other tablets are not solely dependent on Android 5.
P37, with Adreno graphics and Android 6, was also slower than T21, with an inferior Adreno GPU and Android 4. So was the W1/R1 Intel Atom based REMIX/Android 6 tablet. The powerful Intel Core i7 REMIX speeds were some of the fastest, but disappointing for high end GeForce graphics (all effects of the change to Java via ART?).
########################## T7 ##########################
T7 Nexus 7 Quad 1200 MHz Cortex-A9, Android 4.1.2
nVidia ULP GeForce Graphics 12 core, 416 MHz
Android Java OpenGL Benchmark 06-Mar-2013 21.51
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 42.18 43.57 33.38 23.54
18000+ 23.68 23.47 19.91 13.38
36000+ 12.05 11.95 11.00 7.10
Screen Pixels 1280 Wide 736 High
Total Elapsed Time 121.0 seconds
#################### T7 Android 5.0 ####################
Android Java OpenGL Benchmark 12-Oct-2015 16.06
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 22.61 23.23 17.71 13.46
18000+ 12.03 12.11 10.36 7.57
36000+ 6.14 6.01 5.64 4.03
Screen Pixels 1280 Wide 736 High
Total Elapsed Time 121.5 seconds
########################## T11 #########################
T11 Samsung EXYNOS 5250 Dual 1.7 GHz Cortex-A15, Android 4.2.2
Mali-T604 Quad Core GPU
Android Java OpenGL Benchmark 09-Aug-2013 09.42
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 39.13 41.52 32.19 27.25
18000+ 22.03 20.73 19.69 16.30
36000+ 12.24 12.23 10.75 8.68
Screen Pixels 1920 Wide 1032 High
Total Elapsed Time 120.8 seconds
########################## T15 #########################
T15 HTC Nexus 9, dual core Denver CPU 2400 MHz, Android 5.0.1
Kepler DX1 Graphics
Android Java OpenGL Benchmark 28-Jan-2015 22.38
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 59.79 59.84 59.84 57.79
18000+ 59.97 59.26 52.64 32.74
36000+ 31.33 30.95 29.02 17.59
Screen Pixels 2048 Wide 1440 High
Total Elapsed Time 121.0 seconds
########################## T21 #########################
T21 Quad Core 2.2 GHz Snapdragon 800, Android 4.4.3
GPU Qualcomm Adreno 330, 578 MHz
Android Java OpenGL Benchmark 27-Jul-2015 16.50
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 35.05 35.45 25.60 21.58
18000+ 18.04 18.05 15.32 12.73
36000+ 9.28 9.33 8.47 6.91
Screen Pixels 1200 Wide 1803 High
Total Elapsed Time 120.8 seconds
########################## P37 #########################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
GPU Adreno 405 550 MHz
Android Java OpenGL Benchmark 17-Oct-2016 10.01
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 27.46 27.68 21.16 17.96
18000+ 14.56 14.60 12.47 10.36
36000+ 7.17 7.21 6.56 5.37
Screen Pixels 1776 Wide 1080 High
Total Elapsed Time 121.0 seconds
Android 7.0
Android Java OpenGL Benchmark 17-Mar-2017 10.39
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 18.49 18.74 14.45 11.73
18000+ 9.70 9.75 8.40 6.31
36000+ 4.78 4.78 4.45 3.48
Screen Pixels 1776 Wide 1080 High
Total Elapsed Time 121.3 seconds
########################## T22 #########################
T22 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
Android 5.0, GPU Mali T720 MP2
Android Java OpenGL Benchmark 26-Aug-2015 16.24
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 22.55 22.11 16.67 14.27
18000+ 11.55 11.60 9.98 8.27
36000+ 5.92 5.98 5.48 4.48
Screen Pixels 800 Wide 1216 High
Total Elapsed Time 120.9 seconds
########################## A1 ##########################
A1 Asus MemoPad 7, Quad Core 1.86 GHz Intel Atom Z3745
Intel HD Graphics, Android 4.4.2
Android Java OpenGL Benchmark 21-Dec-2014 16.30
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 37.95 37.64 29.86 23.63
18000+ 19.44 19.70 17.26 13.26
36000+ 9.99 9.93 9.35 7.17
Screen Pixels 1280 Wide 736 High
Total Elapsed Time 120.6 seconds
#################### A1 Android 5.0 ####################
Android Java OpenGL Benchmark 10-Oct-2015 13.44
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 25.87 25.89 20.27 16.29
18000+ 13.43 13.56 11.72 9.38
36000+ 6.92 6.73 6.32 4.98
Screen Pixels 800 Wide 1216 High
Total Elapsed Time 120.9 seconds
#################### A5 Android 5.1 ######################
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Intel HD Graphics, Android 5.1
Android Java OpenGL Benchmark 21-May-2016 13.00
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 29.77 30.17 22.58 18.54
18000+ 16.09 16.03 13.70 10.78
36000+ 8.31 8.27 7.79 5.76
Screen Pixels 2048 Wide 1440 High
Total Elapsed Time 121.0 seconds
####################### W1 REMIX ######################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, HD Graphics
Android Java OpenGL Benchmark 14-Aug-2016 22.40
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 19.87 20.29 15.75 12.98
18000+ 11.57 11.68 9.90 7.71
36000+ 6.12 6.14 5.64 4.22
Screen Pixels 1920 Wide 996 High
Total Elapsed Time 121.4 seconds
####################### PC REMIX ######################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
Android 6.0.1, GeForce GTX 650, 64-Bit Windows 10
Android Java OpenGL Benchmark 14-Aug-2016 14.23
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 59.96 59.95 60.00 56.78
18000+ 58.21 58.49 53.97 36.68
36000+ 33.45 33.47 31.29 20.46
Screen Pixels 1920 Wide 996 High
Total Elapsed Time 120.3 seconds
|
OpenGL Drawing Benchmark - JavaDraw.apk
This all Java benchmark draws from a few to a rather excessive number of simple objects to measure drawing performance, again via Frames Per Second (FPS). Five tests draw on a background of continuously changing colour shades. The image on the right is after four tests. Further details and results can be found here, including links to an off line version that runs on PCs via Windows and Linux.
As with Java OpenGL, speeds are limited to 60 FPS by imposed VSYNC. In general, there was not a great deal of difference in performance on the initial systems shown here. Following the Android upgrades to version 5, performance was virtually identical on tablet A1, but T7 was much faster on the tests least dependent on CPU speed.
March 2016 - Results from W1, the Windows 10 based tablet, indicate that VSYNC is not imposed, producing the fastest speeds at this time. The Windows/Android dual boot tablet W2/A5 confirms the faster Windows performance (via Java). However, the Android version runs at full screen, as opposed to a fixed 1280 x 720 window with the Windows variety. The latter was recompiled to use full screen, producing much slower speeds (see below). Windows results from the PC, with a reasonably powerful graphics card, are also shown, to reflect the huge difference in performance.
A5 and W2 Dual Boot Tablet - At 2048 x 1440 screen pixels, Windows was slower than Android on the first test, but faster on the others. A second test on W2, at 1280 x 720, demonstrates faster speed using a smaller window.
REMIX Android vs Windows - Unlike Android, the Windows based tests were not limited to 60 FPS by VSYNC, and the PC results in particular indicated superior performance. As with the OpenGL benchmark, P37 was relatively slow (more ART/Java issues?).
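Some of the logs below end with a "Maximum ... Million Pixels Per Second" line. Those figures appear to be consistent with screen width x height x FPS for the first test, although the exact formula used by the benchmark is not stated here, so the sketch below is an assumption. The same calculation helps when comparing runs at different screen or window sizes, such as the two W2 Windows runs.

```python
def million_pixels_per_second(width, height, fps):
    """Screen pixels redrawn per second, in millions (assumed formula)."""
    return width * height * fps / 1e6

# First test (Display PNG Bitmap Twice) from the T7 and T11 logs
t7 = million_pixels_per_second(1280, 736, 20.38)
t11 = million_pixels_per_second(1920, 1032, 55.74)
print(f"T7 {t7:.1f}, T11 {t11:.0f}")

# W2 Windows 10, first test Pass 1, small window versus full screen
w2_small = million_pixels_per_second(1280, 720, 74.78)
w2_full = million_pixels_per_second(2048, 1440, 27.42)
print(f"W2 1280 x 720 {w2_small:.1f}, W2 2048 x 1440 {w2_full:.1f}")
```

On this measure the full screen W2 run moves slightly more pixels per second than the smaller window, despite the much lower FPS.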
########################## T7 ##########################
T7 Nexus 7 Quad 1200 MHz Cortex-A9, Android 4.2.1
nVidia ULP GeForce Graphics 12 core, 416 MHz
Android Java Drawing Benchmark 12-Apr-2013 19.50
Test Frames FPS
Display PNG Bitmap Twice 204 20.38
Plus 2 SweepGradient Circles 165 16.48
Plus 200 Random Small Circles 145 14.50
Plus 320 Long Lines 113 11.30
Plus 4000 Random Small Circles 39 3.81
Screen pixels 1280 Wide 736 High
Total Elapsed Time 50.4 seconds
Maximum 19.2 Million Pixels Per Second
#################### T7 Android 5.0 ####################
Android Java Drawing Benchmark 01-Oct-2015 12.24
Test Frames FPS
Display PNG Bitmap Twice 487 48.70
Plus 2 SweepGradient Circles 297 29.66
Plus 200 Random Small Circles 231 23.02
Plus 320 Long Lines 149 14.85
Plus 4000 Random Small Circles 39 3.90
Screen pixels 1280 Wide 736 High
Total Elapsed Time 50.1 seconds
########################## T11 #########################
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Mali-T604 quad core GPU
Android Java Drawing Benchmark 09-Aug-2013 09.39
Test Frames FPS
Display PNG Bitmap Twice 558 55.74
Plus 2 SweepGradient Circles 277 27.66
Plus 200 Random Small Circles 244 24.36
Plus 320 Long Lines 169 16.84
Plus 4000 Random Small Circles 68 6.72
Screen pixels 1920 Wide 1032 High
Total Elapsed Time 50.4 seconds
Maximum 110 Million Pixels Per Second
########################## T21 #########################
T21 2.2 GHz Quad Core Snapdragon 800, Android 4.4.3
GPU Qualcomm Adreno 330, 578 MHz
Android Java Drawing Benchmark 27-Jul-2015 16.47
Test Frames FPS
Display PNG Bitmap Twice 533 53.24
Plus 2 SweepGradient Circles 248 24.73
Plus 200 Random Small Circles 218 21.72
Plus 320 Long Lines 158 15.75
Plus 4000 Random Small Circles 57 5.61
Screen pixels 1200 Wide 1803 High
Total Elapsed Time 50.3 seconds
########################## T22 #########################
T22 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
Android 5.0, GPU Mali T720 MP2
Android Java Drawing Benchmark 26-Aug-2015 16.21
Test Frames FPS
Display PNG Bitmap Twice 558 55.72
Plus 2 SweepGradient Circles 368 36.70
Plus 200 Random Small Circles 286 28.52
Plus 320 Long Lines 178 17.76
Plus 4000 Random Small Circles 50 4.99
Screen pixels 800 Wide 1216 High
Total Elapsed Time 51.5 seconds
########################## P37 #########################
P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
GPU Adreno 405 550 MHz
Android Java Drawing Benchmark 17-Oct-2016 09.59
Test Frames FPS
Display PNG Bitmap Twice 246 24.53
Plus 2 SweepGradient Circles 158 15.77
Plus 200 Random Small Circles 130 12.98
Plus 320 Long Lines 98 9.71
Plus 4000 Random Small Circles 27 2.66
Screen pixels 1776 Wide 1080 High
Total Elapsed Time 50.4 seconds
Android 7.0
Android Java Drawing Benchmark 17-Mar-2017 10.32
Test Frames FPS
Display PNG Bitmap Twice 236 23.57
Plus 2 SweepGradient Circles 149 14.85
Plus 200 Random Small Circles 132 13.19
Plus 320 Long Lines 103 10.24
Plus 4000 Random Small Circles 41 4.06
Screen pixels 1776 Wide 1080 High
Total Elapsed Time 50.3 seconds
########################## A1 ##########################
A1 Asus MemoPad 7, Quad Core 1.86 GHz Intel Atom Z3745
Intel HD Graphics, Android 4.4.2
Android Java Drawing Benchmark 21-Dec-2014 16.35
Test Frames FPS
Display PNG Bitmap Twice 599 59.79
Plus 2 SweepGradient Circles 486 48.55
Plus 200 Random Small Circles 383 38.25
Plus 320 Long Lines 219 21.88
Plus 4000 Random Small Circles 64 6.38
Screen pixels 1280 Wide 736 High
Total Elapsed Time 50.1 seconds
#################### A1 Android 5.0 ####################
Android Java Drawing Benchmark 10-Oct-2015 13.42
Test Frames FPS
Display PNG Bitmap Twice 595 59.40
Plus 2 SweepGradient Circles 458 45.79
Plus 200 Random Small Circles 383 38.27
Plus 320 Long Lines 199 19.81
Plus 4000 Random Small Circles 56 5.60
Screen pixels 800 Wide 1216 High
Total Elapsed Time 50.1 seconds
#################### A5 Android 5.1 ####################
Same Tablet as W2
Teclast X98 Plus, Intel Atom Z8300 1.44 GHz, Turbo 1.84
Intel HD Graphics, Android 5.1
Android Java Drawing Benchmark 02-Mar-2016 17.37
Test Frames FPS
Display PNG Bitmap Twice 447 44.62
Plus 2 SweepGradient Circles 212 21.12
Plus 200 Random Small Circles 171 17.02
Plus 320 Long Lines 93 9.25
Plus 4000 Random Small Circles 32 3.13
Screen pixels 2048 Wide 1440 High
Total Elapsed Time 50.4 seconds
####################### W1 REMIX #######################
R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, HD Graphics
Android Java Drawing Benchmark 14-Aug-2016 22.38
Test Frames FPS
Display PNG Bitmap Twice 594 59.39
Plus 2 SweepGradient Circles 375 37.47
Plus 200 Random Small Circles 315 31.43
Plus 320 Long Lines 210 20.96
Plus 4000 Random Small Circles 66 6.57
Screen pixels 1920 Wide 1032 High
Total Elapsed Time 50.1 seconds
############## W1 Windows 10 1280 x 720 ##############
Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, Intel HD Graphics Gen8
Java Drawing Benchmark, Dec 27 2015, 21:51:45
Produced by javac 1.7.0_2
Test Frames FPS
Display PNG Bitmap Twice Pass 1 872 87.13
Display PNG Bitmap Twice Pass 2 991 98.95
Plus 2 SweepGradient Circles 961 95.98
Plus 200 Random Small Circles 782 78.08
Plus 320 Long Lines 605 60.44
Plus 4000 Random Small Circles 164 16.32
Total Elapsed Time 60.1 seconds
Operating System Windows 10, Arch. x86, Version 10.0
Java Vendor Oracle Corporation, Version 1.8.0_66
Intel64 Family 6 Model 76 Stepping 3, GenuineIntel, 4 CPUs
############## W2 Windows 10 1280 x 720 ##############
Same Tablet as A5
Teclast X98 Plus, Intel Atom Z8300 1.44 GHz, Turbo 1.84
Windows 10, Intel HD Graphics Gen8
Java Drawing Benchmark, Mar 2 2016, 21:30:58
Produced by javac 1.7.0_2
Test Frames FPS
Display PNG Bitmap Twice Pass 1 748 74.78
Display PNG Bitmap Twice Pass 2 833 83.24
Plus 2 SweepGradient Circles 828 82.78
Plus 200 Random Small Circles 690 68.99
Plus 320 Long Lines 560 55.94
Plus 4000 Random Small Circles 163 16.30
Total Elapsed Time 60.0 seconds
Operating System Windows 10, Arch. x86, Version 10.0
Java Vendor Oracle Corporation, Version 1.8.0_66
Intel64 Family 6 Model 76 Stepping 3, GenuineIntel, 4 CPUs
############ W2 Windows 10 2048 x 1440 #############
Java Drawing Benchmark, Mar 3 2016, 12:22:42
Produced by javac 1.7.0_2 2048 x 1440
Test Frames FPS
Display PNG Bitmap Twice Pass 1 275 27.42
Display PNG Bitmap Twice Pass 2 301 30.01
Plus 2 SweepGradient Circles 296 29.54
Plus 200 Random Small Circles 286 28.51
Plus 320 Long Lines 225 22.45
Plus 4000 Random Small Circles 118 11.72
Total Elapsed Time 60.3 seconds
Operating System Windows 10, Arch. x86, Version 10.0
Java Vendor Oracle Corporation, Version 1.8.0_66
Intel64 Family 6 Model 76 Stepping 3, GenuineIntel, 4 CPUs
####################### PC REMIX ######################
R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
Android 6.0.1, GeForce GTX 650, 64-Bit Windows 10
Android Java Drawing Benchmark 14-Aug-2016 14.19
Test Frames FPS
Display PNG Bitmap Twice 582 55.49
Plus 2 SweepGradient Circles 601 60.01
Plus 200 Random Small Circles 415 41.41
Plus 320 Long Lines 303 30.25
Plus 4000 Random Small Circles 43 4.20
Screen pixels 396 Wide 674 High
Total Elapsed Time 50.8 seconds
################ PC REMIX Full Screen #################
Android Java Drawing Benchmark 14-Aug-2016 14.21
Test Frames FPS
Display PNG Bitmap Twice 553 55.21
Plus 2 SweepGradient Circles 539 53.86
Plus 200 Random Small Circles 330 32.91
Plus 320 Long Lines 212 21.19
Plus 4000 Random Small Circles 39 3.88
Screen pixels 1920 Wide 996 High
Total Elapsed Time 50.2 seconds
########### PC Windows 10 GeForce GTX 650 ###########
Core i7-4820K at 3.9 GHz
Java Drawing Benchmark, Mar 7 2016, 10:56:24
Produced by javac 1.7.0_2 2048 x 1440
Test Frames FPS
Display PNG Bitmap Twice Pass 1 5237 523.39
Display PNG Bitmap Twice Pass 2 5477 547.04
Plus 2 SweepGradient Circles 5484 548.07
Plus 200 Random Small Circles 5144 513.58
Plus 320 Long Lines 4736 473.32
Plus 4000 Random Small Circles 735 73.49
Total Elapsed Time 60.0 seconds
Operating System Windows 10, Arch. x86, Version 10.0
Java Vendor Oracle Corporation, Version 1.8.0_60
Intel64 Family 6 Model 62 Stepping 4, GenuineIntel, 8 CPUs
|
This program samples CPU MHz over 30 seconds, producing 300 reports at nominal 100 millisecond intervals (timing functions and overheads increase the interval to 120 ms or more). The procedure is: open a benchmark; open the MHz program and run it; switch to the benchmark from the recent screens and run it; save the benchmark results; switch back to the MHz program from the recent screens and save its results when finished. Further details and results can be found here and here.
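The sampling procedure can be sketched as follows. How the program actually obtains the frequency is not stated here, so reading the Linux cpufreq file `scaling_cur_freq` for CPU 0 (reported in kHz) is an assumption, as are all of the function names. The elapsed time is recorded with each sample because, as noted above, the real interval is longer than the nominal 100 ms.

```python
import time

# Assumed source of the current frequency (kHz); may be absent or unreadable on some devices
CPU0_FREQ = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"

def read_cpu_mhz(path=CPU0_FREQ):
    """Read the current CPU 0 frequency and convert kHz to MHz."""
    with open(path) as f:
        return int(f.read().strip()) // 1000

def sample_mhz(samples=300, interval=0.1, read=read_cpu_mhz,
               clock=time.monotonic, sleep=time.sleep):
    """Return a list of (seconds since start, MHz) pairs."""
    start = clock()
    log = []
    for _ in range(samples):
        log.append((round(clock() - start, 2), read()))
        sleep(interval)  # reading and logging overheads add to this delay
    return log
```

Calling `sample_mhz()` with the defaults takes a little over 30 seconds. The `read`, `clock` and `sleep` parameters are there so the loop can be exercised without real hardware.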
Note - This program might not measure the CPU MHz that controls reductions in speed (throttling), introduced to reduce power consumption when temperature increases too much. No simple programming functions appear to be available for logging via a single app. Installing CPU Z might enable independent measurements to be noted. CPU Z can also provide CPU temperature measurement, with a range of values for a number of sensors on different systems. Research might be needed to find which are appropriate for CPU cores.
Below is an example, over the first 18 seconds, whilst running NEON-MFLOPS-MP benchmark (taking 14.6 seconds). In this case, MHz is fairly constant, but the frequency can vary a lot on other devices, or might run at a constant low value, if power saving is switched on.
T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 15-Nov-2015 17.21
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 785 771 669 2862 2851 2739
2T 1485 1499 895 5654 5729 5606
4T 1937 2074 995 10862 11024 10636
8T 2678 3021 1012 9971 10730 10534
Total Elapsed Time 14.6 seconds
Android CPU MHz 100 ms Sampling 15-Nov-2015 17:21:45
0.00 1300 0.12 1300 0.23 299 0.38 299 0.53 1300
0.67 1300 0.88 1300 1.01 1300 1.15 1300 1.29 1300
1.43 1300 1.57 1300 1.70 1300 1.83 1300 1.97 1300
2.11 1300 2.25 1300 2.39 1300 2.53 1300 2.67 1300 X
2.81 1300 2.95 1300 3.09 1300 3.22 1300 3.36 1300
3.49 1300 3.63 1300 3.76 1300 3.90 1300 4.04 1300
4.18 1235 4.30 1300 4.42 1300 4.59 1300 4.76 1300
4.91 1300 5.08 1300 5.23 819 5.40 1300 5.55 299 X
5.74 1300 5.91 1300 6.09 1300 6.25 1300 6.41 1300
6.59 1300 6.76 1300 6.92 1300 7.08 1300 7.24 1300
7.40 1300 7.52 1300 7.68 299 7.88 442 8.06 1300
8.24 1300 8.40 819 8.56 1300 8.71 1300 8.86 1300 X
9.01 1300 9.16 1300 9.32 1300 9.48 1300 9.64 1300
9.80 1300 9.97 1300 10.13 1300 10.27 1300 10.43 1300
10.57 1300 10.72 1300 10.88 1300 11.02 1300 11.17 1300
11.33 1300 11.47 1300 11.62 1300 11.78 1300 11.92 1300 X
12.07 1300 12.22 1300 12.37 1300 12.53 1300 12.68 1300
12.84 1300 12.99 1300 13.15 1300 13.30 1300 13.46 1300
13.61 1300 13.76 1300 13.92 1300 14.08 1300 14.24 1300
14.40 1300 14.56 1300 14.72 1300 14.88 1300 15.04 1300 X
15.20 1300 15.36 1300 15.52 1300 15.69 1300 15.86 1300
16.02 1300 16.21 1300 16.39 1300 16.55 1300 16.71 1300
16.85 1300 17.00 1300 17.14 1300 17.30 1300 17.45 1300
17.60 1300 17.75 1300 17.91 1300 18.05 1300 18.21 1300
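The log above can be summarised with a few lines of code. The Python sketch below, using hypothetical helper names, parses "seconds MHz" pairs, ignores the stray X markers at the end of some rows, and reports the mean sampling interval and the share of samples at the highest frequency seen. Only the first two rows of the log are used as input here.

```python
def parse_samples(text):
    """Turn 'seconds MHz' pairs into a list of (seconds, MHz) tuples.

    Stray 'X' markers at the end of some log rows are ignored.
    """
    tokens = [t for t in text.split() if t != "X"]
    return [(float(tokens[i]), int(tokens[i + 1]))
            for i in range(0, len(tokens) - 1, 2)]


def summarise(samples):
    """Return mean sampling interval (s) and the share of samples at peak MHz."""
    times = [t for t, _ in samples]
    mhz = [m for _, m in samples]
    mean_interval = (times[-1] - times[0]) / (len(times) - 1)
    at_peak = sum(1 for m in mhz if m == max(mhz)) / len(mhz)
    return mean_interval, at_peak


# First two rows of the log above.
rows = """
0.00 1300 0.12 1300 0.23 299 0.38 299 0.53 1300
0.67 1300 0.88 1300 1.01 1300 1.15 1300 1.29 1300
"""
interval, peak_share = summarise(parse_samples(rows))
print(f"mean interval {interval * 1000:.0f} ms, {peak_share:.0%} of samples at peak")
```

For these ten samples the mean interval is about 143 ms, with 80% of samples at 1300 MHz.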
|
The program runs the second most demanding OpenGL drawing benchmark test, except that CPU MHz is displayed along with Frames Per Second (FPS) and running time in minutes. The MHz figure is the average of samples taken once per frame. Default running arrangements are 60 passes of one second each, producing two columns of results that are displayed and saved on the Internal Drive. These results are intended to demonstrate any reductions in speed as the battery discharges. Before running, Display/Power Settings should be changed so that the display never switches off and, if possible, the CPU runs at maximum speed.
Three buttons are provided: Run, the usual Email option to save results, and a Time button that enables manual input of the number of seconds for each pass. If a flat battery turns the device off, then after recharging and rebooting, restarting the program reads and displays the saved results, ready for E-mailing. NOTE: some Android versions will not open a log file for saving results.
Following are results from a test set to run for 2 hours (60 x 120 seconds), which was run twice. The MHz displayed whilst the test was running showed rapid variations that affected the final speed, and FPS varied in a similar way, but the per pass averages were fairly constant over the 4 hours.
Note - the later CPU Stress Tests might be more effective. Also, the CPU MHz app might not work on later systems.
T21 Quad Core Qualcomm Snapdragon 800, Android 4.4.3
GPU Qualcomm Adreno 330, 578 MHz
Up to 60 120 second runs, MHz 1 sample/frame
Log File /storage/emulated/0/BatteryTest.txt
Android Battery Test 28-Jul-2015 11.08 28-Jul-2015 13.28
Run FPS MHz Run FPS MHz Run FPS MHz Run FPS MHz
1 12.0 2100 2 12.1 1937 1 12.1 2014 2 12.2 2005
3 7.6 1874 4 12.1 1966 3 12.0 1975 4 12.0 2005
5 12.2 1993 6 12.2 1996 5 12.1 1962 6 12.1 1948
7 12.2 1996 8 12.2 1966 7 12.2 1979 8 12.2 2004
9 12.0 1935 10 12.3 1925 9 12.1 1959 10 12.1 2060
11 12.3 1983 12 12.0 2015 11 12.0 2017 12 12.2 1992
13 11.9 2013 14 12.1 2000 13 12.1 1987 14 12.2 1964
15 12.1 1934 16 12.1 2005 15 12.1 1973 16 12.1 1978
17 12.0 1948 18 12.0 2000 17 12.1 1998 18 12.1 1977
19 11.9 1979 20 12.0 1972 19 12.0 2007 20 12.0 1956
21 12.0 1997 22 12.0 1994 21 11.9 1966 22 12.1 1975
23 12.2 2035 24 12.1 2013 23 12.0 1978 24 12.0 2012
25 12.2 1981 26 12.1 1977 25 12.1 1988 26 12.1 2010
27 12.2 1976 28 12.2 1991 27 12.1 2004 28 12.0 1994
29 12.2 2000 30 12.3 1984 29 12.1 1989 30 12.2 2004
31 12.2 1986 32 12.3 1964 31 12.2 2009 32 12.1 1979
33 12.2 1955 34 12.1 1980 33 12.1 1945 34 12.0 1951
35 12.1 2002 36 12.2 2045 35 12.1 1997 36 12.1 2022
37 12.1 1993 38 12.2 2010 37 12.2 2038 38 12.1 2024
39 12.1 1947 40 12.1 1959 39 12.1 1997 40 12.1 2049
41 11.9 1949 42 12.0 1993 41 12.2 1996 42 12.1 1994
43 12.1 1953 44 12.2 2005 43 12.0 1978 44 12.0 1985
45 12.4 1928 46 12.3 1989 45 11.9 1947 46 12.2 1982
47 12.1 1987 48 12.1 1969 47 12.1 2022 48 11.8 1964
49 12.1 1992 50 12.1 1999 49 11.9 1985 50 12.1 1991
51 12.2 1929 52 12.1 1955 51 12.1 1988 52 12.0 2002
53 12.4 1950 54 12.3 1990 53 12.0 2009 54 11.9 2018
55 12.3 1930 56 12.2 1922 55 12.0 1994 56 12.0 1974
57 12.4 1952 58 12.1 1977 57 12.1 1950 58 12.1 2009
59 12.1 1986 60 12.2 1962 59 12.0 1976 60 12.0 2010
Total Elapsed Time 7614.9 seconds Total Elapsed Time 7202.9 seconds
|
This is primarily intended for measuring performance of SD cards and internal drives, but can also be used to test USB drives. DriveSpeed carries out four tests.
Test 1 - Write and read three files each of 8 MB and 16 MB; Results given in MBytes/second
Test 2 - Write 8 MB, read can be cached in RAM; Results given in MBytes/second
Test 3 - Random write and read of 1 KB blocks from 4, 8 and 16 MB; Results are average time in milliseconds
Test 4 - Write and read 200 files of 4 KB, 8 KB and 16 KB; Results in MB/sec, msecs/file and delete seconds.
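To show the kind of measurement behind Test 1, here is a minimal Python sketch that writes one file, reads it back and reports MB/second for each direction. It is not the DriveSpeed code, and the function name, temporary file name and block size are assumptions made for illustration. The read is likely to be served from the RAM cache unless the system is rebooted or caching is bypassed.

```python
import os
import tempfile
import time


def write_read_speed(directory, size_mb=8, block_kb=1024):
    """Write then read one file of size_mb; return (write, read) in MB/second."""
    block = bytes(block_kb * 1024)
    blocks = size_mb * 1024 // block_kb
    path = os.path.join(directory, "drivespeed_sketch.tmp")  # hypothetical name

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push data to the device before stopping the clock
    write_secs = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(block_kb * 1024):
            total += len(chunk)
    read_secs = max(time.perf_counter() - start, 1e-9)

    os.remove(path)
    if total != size_mb * 1024 * 1024:
        raise IOError("file read back with the wrong length")
    return size_mb / write_secs, size_mb / read_secs
```

A call such as `write_read_speed(tempfile.gettempdir(), size_mb=8)` returns the two speeds for one 8 MB file. Repeating it three times at 8 MB and 16 MB would give figures laid out like the Write1 to Read3 columns in the results.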
The first DriveSpeed benchmark has two run buttons, RunS for an SD card and RunI for the internal drive, the file path being identified by standard functions. The external SD test worked on earlier Android tablets but failed on later Android versions. RunS ran but provided distorted reading speeds by caching data in RAM. An extra button was added to prevent large files from being deleted and a read only option to measure uncached speeds after rebooting.
DriveSpd2 requires input of the file path to use and this might be identified using a file browser app. The file path can sometimes be selected for internal drives, SD cards and USB devices but there are complications associated with permissions and caching.
Running these benchmarks can require a lot of experimentation. Lots of paths, results and explanations are here and here. Following are example DriveSpd2 results from T22 (Lenovo Tab 2 A8-50) testing an external SD card and T11 (Voyo A15) testing a USB 3 flash drive, followed by read only benchmark results.
Intel/Windows Versions - Results for Tablet W1 main drive are below, with USB 3 and SD card speeds here, along with some via Windows and Linux.
########################## T22 #########################
T22 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
Android DriveSpeed2 Benchmark 1.0 28-Aug-2015 12.56
Data Not Cached
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 3.7 3.7 3.6 20.3 20.6 20.4
16 2.6 3.7 3.7 20.5 20.5 20.5
Cached
8 52.4 107.8 13.2 228.8 226.3 226.7
Random Write Read
From MB 4 8 16 4 8 16
msecs 4.65 4.91 18.23 0.01 0.01 0.66
200 Files Write Read Delete
File KB 4 8 16 4 8 16 secs
MB/sec 0.07 0.18 0.49 2.16 3.79 6.51
msecs 59.14 44.59 33.61 1.90 2.16 2.52 2.099
Total Elapsed Time 85.4 seconds
File Path Used - /storage/sdcard1/
Drive MB 15258 Free 14687
########################## T11 #########################
T11 Samsung EXYNOS 5250 Dual 1.7 GHz Cortex-A15,
Android DriveSpeed2 Benchmark 1.0 10-Dec-2013 12.52
Data Not Cached
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 40.9 46.6 46.2 100.7 95.9 71.4
16 45.2 51.9 51.1 98.8 70.7 66.2
Cached
8 150.4 127.7 50.9 687.6 688.7 709.2
Random Write Read
From MB 4 8 16 4 8 16
msecs 0.91 0.90 0.82 0.01 0.01 0.02
200 Files Write Read Delete
File KB 4 8 16 4 8 16 secs
MB/sec 0.56 1.18 1.85 4.20 13.33 34.79
msecs 7.29 6.96 8.88 0.98 0.61 0.47 0.149
Total Elapsed Time 24.8 seconds
File Path Used - /mnt/udisk/
Drive MB 30517 Free 30466
###################### Read Only #######################
Android DriveSpeed Benchmark Internal Drive Read Only
MBytes/Second
Device Write1 Write2 Write3 Read1 Read2 Read3
T7 0.0 0.0 0.0 41.7 42.8 39.0
T11 0.0 0.0 0.0 53.7 53.5 53.9
T21 0.0 0.0 0.0 102.9 104.0 103.6
T22 0.0 0.0 0.0 127.7 145.7 139.9
A1 0.0 0.0 0.0 155.7 128.6 156.2
################## W1 DriveSpeed32.exe W1 Windows 10 #################
Current Directory Path: C:\Test
Total MB 58722, Free MB 45286, Used MB 13436
Windows Storage Speed Test 32-Bit Version 1.2, Mon Jan 04 16:09:25 2016
Copyright (C) Roy Longbottom 2011
8 MB File 1 2 3 4 5
Writing MB/sec 100.68 101.04 110.81 105.04 113.32
Reading MB/sec 154.58 155.78 132.18 153.97 153.86
16 MB File 1 2 3 4 5
Writing MB/sec 115.96 117.50 118.53 113.16 116.46
Reading MB/sec 150.29 155.47 156.13 150.62 157.92
32 MB File 1 2 3 4 5
Writing MB/sec 118.84 118.26 123.01 123.42 125.39
Reading MB/sec 146.70 153.65 146.41 148.77 155.54
---------------------------------------------------------------------
8 MB Cached File 1 2 3 4 5
Writing MB/sec 176.10 292.34 462.14 201.19 452.46
Reading MB/sec 599.06 830.94 992.19 878.99 1033.57
---------------------------------------------------------------------
Bus Speed Block KB 64 128 256 512 1024
Reading MB/sec 101.09 107.71 123.43 139.70 136.62
---------------------------------------------------------------------
1 KB Blocks File MB > 2 4 8 16 32 64 128
Random Read msecs 0.22 0.18 0.18 0.18 0.18 0.19 0.19
Random Write msecs 0.13 0.13 0.13 0.13 0.14 0.19 0.21
---------------------------------------------------------------------
500 Files Write Read Delete
File KB MB/sec ms/File MB/sec ms/File Seconds
2 0.56 3.68 3.00 0.68 0.629
4 0.84 4.85 6.79 0.60 0.541
8 1.92 4.27 13.34 0.61 0.502
16 1.01 16.17 22.14 0.74 0.528
32 1.95 16.81 38.21 0.86 0.527
64 3.75 17.50 59.57 1.10 0.490
End of test Mon Jan 04 16:10:53 2016
|
Reliability/Stress tests were run using the ARM CPUs on various Raspberry Pi Systems, including 32 bit and 64 bit Operating Systems. Besides attempting to identify any false calculations or system crashes, a main purpose was to demonstrate performance reductions as the CPUs became overheated and identify processor clock throttling. This was aided by the availability of programmable functions that measure CPU MHz and temperature. The Raspberry Pi tests exercised multiple processor cores by running a number of copies of the same programs via script files.
Running multiple copies of the same program does not appear to be possible using Android. So, multithreaded versions were produced, one using floating point calculations and the other integers. Earlier Android CPU benchmarks did not display results until the end of executing all tests. With long running stress tests, it is desirable to display running time and performance on an on-going basis. In this case, unreported calibration phases attempt to set run time parameters that lead to initial reportable test periods of around 10 seconds. This can be longer, if the initial pass takes more than 10 seconds, such as when other programs are running at the same time (as in the screen shot below).
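The calibration idea can be sketched as follows. This is a minimal Python illustration under assumed details (repeat doubling, a 0.1 second threshold, the name `calibrate_passes`), not the method used in the apps: a short unreported timing run is scaled up so that each reported test period lasts around 10 seconds.

```python
import time


def calibrate_passes(run_pass, target_secs=10.0, clock=time.perf_counter):
    """Double the repeat count until timing is measurable, then scale to target."""
    repeats = 1
    while True:
        start = clock()
        for _ in range(repeats):
            run_pass()
        elapsed = clock() - start
        if elapsed >= 0.1:          # long enough to time reliably (assumed value)
            break
        repeats *= 2
    # Scale the measured rate up to the target reporting period.
    return max(1, round(repeats * target_secs / elapsed))
```

If other programs are running during calibration, the measured rate is optimistic or pessimistic and the first reported period drifts away from 10 seconds, which fits the behaviour described above.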
Besides the CPU slowing down due to heating effects, the mobile devices, of course, run slower as the battery becomes discharged. In the event of the latter, or if CPU MHz throttling cannot avoid overheating, the CPU should turn off automatically (OR WORSE! - WATCH IT). It is recommended that stress testing is limited to one or a number of 15 minute sessions, to allow results to be saved and judgments made whether to continue.
Experience with the CPU MHz Benchmark and the Raspberry Pi Stress Tests suggests that the functions required to obtain the effective CPU MHz can vary between systems. This also applies to the measurement of CPU temperature. Hence, there can be no simple program to monitor these. In some cases, manual measurements can be noted after installing CPU-Z from Google Play. One difficulty there is that a number of temperature measurements might be provided, without indications of the location.
The screenshot, below, of both stress tests, was from P37, a Moto G phone running Android 7. This has the option to run two programs at the same time, via a split screen. Besides performance, note the displayed sumchecks. An indication is given if data is not of the expected value.
The source code and project files are included in Android Intel-ARM Benchmarks.zip.
Buttons
RunB - Run Benchmark - Runs most combinations of number of threads, data sizes and, for the FPU tests, calculations per data word. This is mainly to help decide which options to use for stress testing. The benchmark runs using fixed parameters, carrying out exactly the same number of calculations using all thread combinations and data sizes. For the FPU tests, the pass count changes according to the number of calculations per word.
RunS - Run Stress Tests - Default running time is 15 minutes, with the middle data size, intended for containment in L2 cache, using 8 threads and, in the FPU tests, 32 operations per word.
SetS - Specify run time parameters for stress test - These are 1, 2, 4, 8, 16 or 32 threads; 2, 8 or 32 operations per word for FPU tests; data sizes of 12.8 or 16 KB, 128 or 160 KB, 12.8 or 16 MB for the FPU or Integer tests respectively; and running time in minutes.
Info - Test description and details - This is essentially the same as the details provided here.
Save - This offers details of the results and identified CPU hardware and Operating System for E-mail. Default addressee is the program author via results@roylongbottom.org.uk but this can be changed or additional addresses added.
Timing
When benchmarking, the running time of each pass is provided, reducing, where appropriate, on doubling the thread count. Cumulative running time is provided for the stress tests, demonstrating the number of passes carried out in the specified running time. The time for each pass increases as the CPU slows down due to heating effects or a discharged battery.
Benchmark - This is essentially the same program as used for the MP-MFLOPS Benchmark which, besides carrying out calculations with 2 and 32 floating point operations per data word, includes a further function with 8 operations. As a reminder, the benchmark runs using fixed parameters, carrying out exactly the same number of calculations using 1, 2, 4 and 8 threads. Note the sumchecks of numeric results of calculations, where every word is checked for identical values and results of zero are reported if any are incorrect. The number of calculations, and associated sumchecks, vary using different memory sizes and varying speeds of operation of caches and RAM.
Stress Test - As indicated earlier, the stress test runs multiple times, using the same run time parameters for number of threads, data size, floating point operations per data word and operations per pass, for the specified number of minutes. Then, the number of repeat passes can be fewer if CPU MHz is reduced. The calculated sumchecks should be identical for all threads. In the event of any comparison failures, the reported sumcheck is shown as zero.
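The sumcheck logic described above can be sketched in Python as follows. The constants, data layout and function names are illustrative assumptions, not those of the benchmark: each thread applies the same add and multiply sequence to every word, checks that all words are identical, and the pass reports zero if any thread disagrees.

```python
from concurrent.futures import ThreadPoolExecutor


def thread_pass(words, ops_per_word, fault=False):
    """Apply a fixed add/multiply sequence to every word, return a sumcheck."""
    data = [0.999999] * words
    for i in range(words):
        x = data[i]
        for _ in range(ops_per_word // 2):
            x = (x + 0.000020) * 0.999980   # one add and one multiply
        data[i] = x
    if fault:
        data[words // 2] += 1e-6            # simulate a single wrong word
    if any(x != data[0] for x in data):     # every word should be identical
        return 0
    return round(data[0] * 100000)


def run_pass(threads=8, words=1000, ops_per_word=32, faulty_thread=None):
    """Run one pass on all threads; report 0 if any thread disagrees."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(thread_pass, words, ops_per_word, t == faulty_thread)
                   for t in range(threads)]
        checks = [f.result() for f in futures]
    return checks[0] if len(set(checks)) == 1 and checks[0] != 0 else 0
```

Calling `run_pass()` returns a non-zero sumcheck, whereas `run_pass(faulty_thread=3)` returns zero because one simulated thread holds a wrong word.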
Below are results from one minute stress tests using 16 and 32 threads, demonstrating similar throughput of around 6 GFLOPS. This is followed by details from 15 minute runs on various systems using 8 threads, including the same T22 system, that still produced a consistent performance of around 6 GFLOPS. All tests were carried out with fully charged batteries and power connected.
The table demonstrates a wide variation in the number of passes carried out in 15 minutes, where some are influenced by the calibration calculations for 10 seconds test duration, in this case the first pass shown as taking between 9.8 and 11.5 seconds. Besides speed reductions due to heating effects, or little change at the end, there can be short term reductions due to other system activity (worst case like downloading and installing updates).
P37 produced the highest performance degradation for initial tests, at 43%. The next three had similar beginning and end performance, with the occasional short term hiccup. The first T21 session produced slightly slower speed at the end. Repeating this shortly afterwards produced a 12% degradation. Kindle3 was run with the tablet in direct sunlight, with surrounding air around 30°C. This led to a 57% performance degradation. The last set of results were somewhat inconsistent over the whole period.
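As a worked check of the percentages quoted above, the short Python sketch below assumes that degradation is measured from the Maximum to the Minimum MFLOPS rows of the 15 minute table. That assumption reproduces the 43% and 57% figures to the nearest percent, although the original calculation might have used different passes.

```python
def degradation_percent(maximum, minimum):
    """Percentage drop from the fastest to the slowest pass."""
    return 100.0 * (maximum - minimum) / maximum


# (Maximum, Minimum) MFLOPS taken from the 15 minute results table.
results = {"P37": (5451, 3115), "T21 Kindl3": (4894, 2113)}
for system, (mx, mn) in results.items():
    print(f"{system}: {degradation_percent(mx, mn):.0f}%")
```

This prints 43% for P37 and 57% for the Kindle3 run.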
Benchmark Mode Results
ARM/Intel MP-FPU Stress Test V1.0 30-May-2017 19.39
Compiled for 32 bit ARM v7a
MFLOPS Numeric Results
Ops/ KB KB MB KB KB MB
Secs Thrd Word 12.8 128 12.8 12.8 128 12.8
8.6 T1 2 228 227 220 40392 76406 99700
4.4 T2 2 451 449 434 40392 76406 99700
2.4 T4 2 882 882 736 40392 76406 99700
2.0 T8 2 1182 1250 758 40392 76406 99700
16.3 T1 8 477 477 466 54760 85092 99819
8.2 T2 8 951 949 925 54760 85092 99819
4.2 T4 8 1856 1879 1830 54760 85092 99819
2.8 T8 8 2738 2941 2744 54760 85092 99819
38.1 T1 32 811 813 801 35218 66014 99520
19.1 T2 32 1625 1621 1605 35218 66014 99520
9.7 T4 32 3190 3222 3186 35218 66014 99520
6.1 T8 32 4909 5179 5135 35218 66014 99520
End Time 30-May-2017 19.41
Stress Test 16 Threads
ARM/Intel MP-FPU Stress Test V1.0 01-Jun-2017 11.43
Compiled for 64 bit ARM v8a
Data Ops/ Nmeric
Seconds Size Threads Word MFLOPS Results
11.9 128 KB 16 32 6058 35951
22.1 128 KB 16 32 6012 35951
32.8 128 KB 16 32 5717 35951
43.1 128 KB 16 32 5988 35951
53.3 128 KB 16 32 5991 35951
63.6 128 KB 16 32 5962 35951
End Time 01-Jun-2017 11.46
Stress Test 32 Threads
ARM/Intel MP-FPU Stress Test V1.0 01-Jun-2017 11.40
Compiled for 64 bit ARM v8a
Data Ops/ Nmeric
Seconds Size Threads Word MFLOPS Results
11.8 128 KB 32 32 6087 35951
22.0 128 KB 32 32 6040 35951
32.2 128 KB 32 32 6001 35951
42.4 128 KB 32 32 6020 35951
52.7 128 KB 32 32 6001 35951
63.1 128 KB 32 32 5897 35951
End Time 01-Jun-2017 11.43
Various Systems, all 8 Threads, 32 Ops/word, 128 KB, 15 Minutes
System P37 T22 A1 A5 T21 T21 T21 T11
Device moto Leno Asus Tec Kindl1 Kindl2 Kindl3 Voyo
CPU A53 A53 Atom Atom QC800 QC800 QC800 A15
Cores 8 4 4 4 4 4 4 2
GHz 1.5+1.2 1.3 1.86 1.44 2.2 2.2 2.2 2
Test Secs
Start 11.5 10.2 10.0 9.8 10.5 10.5 10.6 10.4
End 17.0 10.3 10.0 9.1 11.1 11.7 23.8 12.2
Pass -------------------------- MFLOPS --------------------------
1 5435 6025 4131 3329 4766 4853 4810 2758
2 5451 5937 4110 3183 4856 4876 4826 2226
3 5451 6005 4114 3097 4886 4886 4846 2937
4 5349 5919 4107 3168 4889 4882 4729 3045
5 5396 5995 4137 3138 4863 4897 4833 3052
6 5332 5997 4117 3154 4895 4766 4712 3032
7 5334 5985 4103 3161 4877 4690 4717 3023
8 5431 6009 4097 3214 4889 4610 4864 3056
9 5195 5977 4099 3193 4894 4609 4873 2726
10 5415 5879 4144 3153 4898 4574 4876 3033
11 5278 5994 4087 3149 4805 4610 4891 2592
12 5315 5989 4109 3140 4835 4592 4878 3046
13 5311 5977 4136 3151 4862 4617 4874 1617
14 5142 5991 4106 3173 4890 4557 4856 2216
15 5069 5964 4138 3113 4890 4569 4894 2719
16 5017 4618 4101 3118 4899 4546 4805 3037
17 5102 5879 4128 3161 4869 4553 4604 2729
18 5073 5945 4098 3135 4871 4520 4733 2727
19 5064 5963 4144 3170 4869 4533 4652 2973
20 5104 5976 4131 3139 4885 4558 4605 2672
21 4625 5824 4139 3152 4882 4512 4594 2699
22 4558 5984 4145 3106 4892 4547 4559 2924
23 4572 5934 4164 3128 4870 4508 4535 2739
24 4701 5968 4128 3132 4860 4497 4524 2626
25 4674 5975 4083 3121 4870 4550 4488 2987
26 4298 5979 4079 3139 4734 4525 4443 2675
27 4384 5963 4124 3034 4697 4485 4413 2623
28 4343 5981 4106 3118 4781 4483 4416 2928
29 4442 5965 4135 3180 4866 4514 4441 2692
30 4147 5974 4141 3130 4817 4492 4436 2619
31 4246 5998 4099 3032 4837 4505 4422 2744
32 4530 6008 4046 3393 4872 4469 3390 2892
33 3903 5951 4120 3380 4876 4488 4259 2615
34 3979 5990 4098 3350 4864 4519 3228 2617
35 4639 5973 4120 3388 4858 4488 3572 2833
36 3934 5953 4107 3364 4889 4499 3408 2801
37 4021 5921 4118 3372 4842 4474 3150 2579
38 3872 5983 4138 3401 4855 4515 3377 2624
39 4002 5925 4109 3397 4853 4464 2772 2613
40 4212 5996 4141 3384 4832 4474 2996 2838
41 3997 5970 4109 3397 4854 4460 2892 2800
42 3998 5986 4084 3397 4856 4446 2686 2645
43 3878 5992 4116 3302 4878 4432 2691 2523
44 3907 5965 4150 3400 4854 4485 2695 2589
45 3955 5922 4113 3402 4818 4429 2696 2840
46 3795 5944 4132 3368 4862 4475 2702 2765
47 3843 5938 4098 3359 4786 4432 2690 2652
48 3799 5979 4118 3379 4817 4464 2690 2492
49 3532 5947 4125 3374 4876 4438 2202 2619
50 3115 5986 4121 3375 4804 4435 2162 2798
51 3962 5979 3728 3399 4840 4435 2162 2694
52 3922 5980 4084 3401 4697 4404 2165 2520
53 3822 5977 4120 3383 4776 4448 2148 2621
54 3669 5967 4067 3364 4732 4383 2113 2607
55 3777 5991 4141 3389 4673 4444 2126 2702
56 3591 5964 4111 3390 4739 4423 2170 2830
57 3660 5992 4113 3372 4700 4428 2137 2627
58 3883 5966 4115 3397 4684 4445 2163 2510
59 3727 5972 4114 3395 4723 4436 2158 2522
60 3710 6002 4105 3209 4700 4356 2152 2792
61 3951 6009 4046 2951 4722 4408 2699
62 3628 5807 4082 3109 4745 4381 2546
63 3572 5929 4124 3069 4728 4390 2527
64 3743 5963 4113 3144 4714 4442 2522
65 5954 4133 3145 4699 4405 2785
66 5949 4142 3074 4688 4360 2698
67 5964 4112 3087 4688 4374 2532
68 5903 4107 3152 4685 4334 2468
69 5956 4088 3037 4527 4370 2630
70 5962 4136 3146 4664 4399 2793
71 5981 4146 3158 4658 4407 2598
72 5985 4107 3119 4647 4382 2508
73 5937 4086 3134 4618 4372 2512
74 5944 4130 3162 4658 4387 2504
75 5965 4086 3143 4602 4395 2787
76 5971 4153 3163 4652 4382 2607
77 5987 4130 3155 4588 4383 2547
78 5957 4150 3145 4581 4391 2512
79 5920 4137 3128 4596 4381
80 5984 4109 3141 4631
81 5989 4146 3121 4623
82 5959 4120 3174 4609
83 5957 4140 3184 4533
84 5982 4102 3143 4634
85 5902 4111 3171
86 5954 3787 3144
87 6000 4097 3167
88 4101 3121
89 4084 3162
90 4155 3141
91 3027
92 3336
93 3397
94 3365
Average 4403 5948 4108 3217 4779 4498 3715 2690
Maximum 5451 6025 4164 3402 4899 4897 4894 3056
Minimum 3115 4618 3728 2951 4527 4334 2113 1617
|
This test writes data, comprising two data patterns out of 24 variations (such as binary 0000, 0101, 0011, 1111) then reads it via alternate additions and subtractions. This leaves the original data unchanged, which is checked for correctness and any errors reported. As with the Floating Point Stress Test, buttons are provided to run a quick benchmark or long running stress test and one to set parameters for the latter. Performance is measured in MB/second.
Benchmark - Below is an example of results, the program using all thread and data size combinations, and the first 6 data patterns. Note fastest speeds are with all threads using different sections of 160 KB.
Stress Test - Following benchmark output are some stress test results, all at the default parameter settings and mainly with the systems connected to the power source. As with MP-FPU-Stress.apk, the number of passes in 15 minutes varies, depending on the initial calibrated time and whether speed is changed due to the CPU clock speed reducing at higher temperatures.
Results include some with the devices running without the power supply connected. One (T21) showed similar performance whether driven by battery or power supply, but the battery was probably fully charged.
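The write, add/subtract and verify cycle described above might look something like the Python sketch below. The exact sequence used by the benchmark is not given here, so this is an assumed arrangement: an accumulator starts at one pattern, words holding a second pattern are alternately added and subtracted with 32 bit wrap-around, and the accumulator should finish at its starting value while the data remains unchanged. The name `integer_pass` is hypothetical.

```python
MASK = 0xFFFFFFFF  # emulate 32 bit unsigned wrap-around


def integer_pass(words, pattern_a, pattern_b):
    """Return (sumcheck, errors) for one write / add-subtract / verify pass."""
    if words % 2:
        raise ValueError("an even word count is needed for adds and subtracts to cancel")
    data = [pattern_b] * words                 # write phase
    acc = pattern_a
    for i in range(0, words, 2):               # read phase
        acc = (acc + data[i]) & MASK           # add one word ...
        acc = (acc - data[i + 1]) & MASK       # ... subtract the next
    errors = sum(1 for w in data if w != pattern_b)   # verify phase
    return acc, errors
```

With this arrangement `integer_pass(1024, 0x5A5A5A5A, 0xCCCCCCCC)` returns a sumcheck of 0x5A5A5A5A and zero errors. Any corrupted word would leave the accumulator at a different value or be counted in the verify phase.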
Benchmark Mode Results
ARM/Intel MP-Int Stress Test V1.0 21-Jun-2017 16.50
Compiled for 32 bit ARM v7a
MB/second
KB KB MB Same All
Secs Thrds 16 160 16 Sumcheck Tests
9.1 1 2970 2855 2336 00000000 Yes
4.7 2 5770 5605 4523 FFFFFFFF Yes
3.0 4 10876 10907 5534 5A5A5A5A Yes
2.4 8 14361 16162 6156 AAAAAAAA Yes
2.3 16 16522 18100 6091 CCCCCCCC Yes
2.3 32 15948 17827 6187 0F0F0F0F Yes
End Time 21-Jun-2017 18.41
Various Systems, 8 Threads, 160 KB, 15 Minutes
System P37 T22 T11 T11 A1 A5 T21 T21
Device moto Leno Voyo Voyo Asus Tec Kindl2 Kindl2
CPU A53 A53 A15 A15 Atom Atom QC800 QC800
Cores 8 4 2 2 4 4 4 4
GHz 1.5+1.2 1.3 2 2 1.86 1.44 2.2 2.2
Test Secs
start 9.5 10.1 9.7 8.5 9.7 7.2 9.4 9.0
end 13.5 9.7 14.2 12.9 9.0 7.1 10.3 9.9
Pass Battery Battery
1 20037 16331 12745 11029 25433 22184 14100 9778
2 19149 16111 12102 10888 26589 21509 14361 14046
3 19451 16127 10349 10629 26185 20577 14570 13577
4 19308 16073 10938 8464 26727 18492 14433 14111
5 19386 16308 10988 9600 26541 18574 14449 12458
6 19714 16511 10713 6264 26841 19075 14468 12866
7 19376 16283 10186 6286 26982 18206 14468 14298
8 19327 16110 9845 6453 26761 18080 14488 13913
9 19224 16036 9792 6239 26804 16131 14101 14174
10 20331 16409 10116 6267 26563 15385 14097 13272
11 19945 16324 10797 6302 26765 14799 13961 12987
12 19101 15923 9830 6348 26946 12244 13875 13444
13 19478 16066 9630 6469 26928 17727 13381 14341
14 19482 16472 9043 6358 26708 16036 13173 14083
15 18492 16146 9873 6441 26985 12831 13121 14339
16 18664 15971 10445 6272 26678 18164 13071 13381
17 18476 16296 9875 6337 26818 18394 13017 13738
18 16615 16371 8887 6402 27028 18204 13033 13732
19 15829 16419 9170 6441 27069 18688 13078 13842
20 16755 16205 9185 6399 26640 18542 13067 13650
21 14564 16059 10952 6297 26796 18379 12936 13692
22 16996 15787 9597 6418 26967 18645 12896 13573
23 14891 16051 9359 6201 26830 19071 12966 13614
24 17154 16219 9178 6244 26765 18641 12759 13540
25 14580 15907 8707 6373 26817 18589 12908 13484
26 17185 15995 10194 6327 26875 18765 12891 13518
27 14063 15978 9824 6362 26716 17421 12781 13392
28 15158 16004 8697 6422 26725 18432 12771 13301
29 14347 16341 8705 6292 26909 16909 12779 13060
30 13116 16060 8774 6420 26854 15801 12689 13445
31 13267 16475 10325 6281 26888 13700 12768 13196
32 13814 16123 9327 6411 26948 20494 12785 13248
33 14348 16107 8643 6313 26960 20499 12723 13073
34 12555 16150 8445 6334 25360 21228 12794 12926
35 12579 16043 8702 6450 26332 21266 12613 12942
36 14506 16026 9960 5991 26047 21142 12613 13107
37 14338 16309 9510 6435 26233 20585 12594 13225
38 12474 16409 8837 6389 26708 23052 12523 13167
39 14030 15855 8564 6435 26985 23206 12551 13142
40 14399 16140 8594 6322 26869 23224 12503 13108
41 13122 15976 8876 6174 26730 23089 12447 13092
42 12340 16181 10637 6233 26445 23183 12443 13055
43 12880 16184 9317 6376 26554 23255 12555 12914
44 12454 16184 8423 6423 26970 23393 12473 13159
45 12220 16341 8614 6447 26637 23267 12408 12927
46 11486 16351 8183 6391 27022 23291 12423 13078
47 13306 16196 10820 6327 26650 23321 12383 12895
48 13629 16254 8968 6312 26702 23192 12234 13033
49 11897 16351 8463 6286 26558 23306 12353 12884
50 14640 16069 8036 6364 26757 23229 12344 12942
51 11354 16054 10331 6440 26781 23166 12320 12918
52 13217 16080 9005 6360 26759 23270 12261 12868
53 12672 15856 8373 6425 26863 23259 12216 12919
54 11752 16150 8603 6441 27063 23352 12255 12808
55 12783 16147 8228 6384 26641 23302 12283 12923
56 12984 16340 9258 6424 26832 23249 12201 12838
57 11459 16434 9690 6247 26491 23395 12228 12829
58 13042 16378 8510 6461 26706 22927 12265 12856
59 11289 16019 8259 6372 26866 23063 12241 12821
60 14140 16443 8089 6360 26919 23335 12212 12743
61 11527 16392 9949 6249 26639 23257 12227 12683
62 12224 16332 9177 6332 26721 23148 12081 12581
63 11942 16209 8383 6430 26806 23201 12302 12732
64 11836 15867 8339 6366 26724 23254 12210 12706
65 12680 16161 8215 6435 26560 23295 12143 12740
66 11171 16228 8719 6338 26823 23278 12166 12731
67 13207 16193 9821 6422 27002 23175 12107 12792
68 11382 16320 8978 6410 26768 23358 12077 12673
69 11365 16309 8073 6353 26942 23214 12135 12575
70 13366 16169 7896 6426 26681 22885 12111 12817
71 10909 16759 9942 6360 26608 23360 12208 12645
72 13351 15734 9088 6300 26883 23238 12098 12650
73 16595 8319 26708 23302 12652 13016
74 16291 26859 23308 12639 13126
75 15874 27044 23292 12656 13135
76 15990 26904 23344 12775 13155
77 16167 26865 23252 12664 13198
78 16320 26734 23181 12554 13108
79 16416 26781 23351 12423 12995
80 15805 27054 23130 12554 12889
81 16097 26781 23135 12464 13011
82 16405 26911 23331 12538 12960
83 16120 26803 19419 12370 13119
84 16338 26739 20700 12478 13037
85 16211 26756 21231 11963 12552
86 16311 26798 21087 12061 12426
87 16082 26798 21321 12123 12465
88 16141 26969 20290 11974 12387
89 16125 26974 21290 11936 12448
90 15928 26737 20739 11999 12499
91 16068 26703 21205 11932 12403
92 16321 26724 21281 12033 12422
93 16329 26595 21068 11915 12455
94 16398 27019 20730 12305
95 26742 21275 12404
96 26705 21495 12317
97 27052 20938
98 26899 21087
99 26613 21154
100 26466 21066
101 26338 20967
102 26626 19663
103 20117
104 20996
105 21445
106 20935
107 20882
108 21328
109 21347
110 20954
111 21175
112 21107
113 20924
114 20812
115 21636
116 21501
117 21184
118 21303
119 21259
120 21161
121 21398
122 20517
123 20803
124 21484
125 21334
126 20291
Average 14780 16186 9356 6615 26739 20978 12713 13046
Maximum 20331 16759 12745 11029 27069 23395 14570 14341
Minimum 10909 15734 7896 5991 25360 12244 11915 9778
End 11365 16398 8319 6300 26626 20291 11915 12317
End Pass
Seconds 13.5 9.7 14.2 12.9 9.0 7.1 10.3 9.9
Passes 72 94 73 72 102 126 93 96
|
The following series of tests comprised running both the floating point and integer stress tests at the same time, both using the default parameters with 8 threads. All tests were run using battery power. Results provided are for the first three test runs and the last three of the overall 15 minutes. Note that the two test programs had different running times.
The first (P37) has 8 cores. On starting, each of the two stress tests, as might be expected, initially ran at around half speed. After 15 minutes, both produced similar performance degradations, essentially the same as in the single program tests using power supplies. Following a slight delay, the second tests started running at slightly decreased temperatures and faster speed, but produced slower end speeds. Run 3 started in a similar manner, then went haywire, with the FPU tests running at a crawl and the other speeding up. The fourth test runs fitted the normal pattern, each ending with performance equivalent to a quarter of the maximum obtained when running a single program.
The second system (T21) has a quad core CPU and produced fairly consistent performance over this particular hour of testing.
Next, log details are provided to demonstrate that a device can handle 64 threads, using 32 from each of the stress tests. In this case (with P37), performance over 5 minutes was similar to that at the start of the 8 thread test, using both apps.
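The %max columns in the table that follows appear to be the measured speed divided by the single program maximum, rounded to the nearest percent. The Python sketch below checks that reading against the P37 figures for the start of Run 1 and the end of Run 4. The helper name is hypothetical.

```python
def percent_of_max(measured, single_program_max):
    """Measured speed as a rounded percentage of the single program maximum."""
    return round(100 * measured / single_program_max)


# P37 single program maxima from the table: MB/sec and MFLOPS.
max_mb, max_mflops = 20037, 5435

print(percent_of_max(10441, max_mb), percent_of_max(2790, max_mflops))  # Run 1 start
print(percent_of_max(5081, max_mb), percent_of_max(1276, max_mflops))   # Run 4 end
```

The second line of output gives 25 and 23, consistent with the description of performance ending at about a quarter of the single program maximum.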
P37 Octa-core Cortex-A53 T21 Quad Core Snapdragon 800
Secs MB/sec %max Secs MFLOPS %max Secs MB/sec %max Secs MFLOPS %max
Max 1 program 20037 5435 14570 4899
Run 1
Start 17 10441 52 11 2790 51 9 7192 49 21 2702 55
16 10677 11 2819 8 7313 20 2482
17 10163 11 2862 8 6848 21 2517
End 22 7703 15 2018 9 6718 25 2482
25 7030 16 1886 9 6419 24 2517
25 6913 35 16 1899 35 9 6564 45 24 2479 51
Run 2
Start 18 8713 43 16 1969 36 10 6414 44 24 2140 44
20 7848 16 1964 10 6669 23 2268
20 7711 16 1949 10 6312 24 2179
End 26 5966 20 1529 10 6513 27 1883
27 5865 20 1522 10 6299 25 2092
27 5733 29 20 1569 29 12 5339 37 27 1900 39
Run 3
Start 21 6957 35 18 1746 32 10 6619 45 23 2247 46
23 6445 20 1553 10 6609 24 2120
24 6135 18 1680 10 6888 24 2168
End 17 8548 53 581 10 6738 26 1996
17 8542 84 367 12 5353 28 1849
18 8414 42 74 413 8 12 5519 38 28 1811 37
Run 4
Start 22 6275 31 12 1738 32 10 6880 47 22 2341 48
24 5941 12 1699 10 6821 23 2183
25 5532 14 1437 10 6563 24 2117
End 26 5309 15 1396 12 5357 28 1834
26 5277 15 1370 12 5629 28 1845
27 5081 25 16 1276 23 12 5572 38 28 1853 38
P37 Both Stress Tests 32 Threads Each
ARM/Intel MP-Int Stress Test V1.0 ARM/Intel MP-FPU Stress Test V1.0
25-Jul-2017 10.53 25-Jul-2017 10.54
Compiled for 32 bit ARM v7a Compiled for 32 bit ARM v7a
Data Same All Data Ops/ Numeric
Secs KB Threads MB/sec Sumcheck Threads Secs KB Threads Word MFLOPS Results
17 160 32 11626 00000000 Yes 15 128 32 32 2771 42157
35 160 32 10456 00000000 Yes 28 128 32 32 2492 42157
52 160 32 10743 00000000 Yes 38 128 32 32 2845 42157
71 160 32 10198 00000000 Yes 50 128 32 32 2758 42157
89 160 32 10534 00000000 Yes 61 128 32 32 2677 42157
106 160 32 10752 00000000 Yes 72 128 32 32 2926 42157
125 160 32 10168 FFFFFFFF Yes 84 128 32 32 2549 42157
142 160 32 11094 FFFFFFFF Yes 94 128 32 32 3017 42157
160 160 32 10389 FFFFFFFF Yes 104 128 32 32 2881 42157
178 160 32 10408 FFFFFFFF Yes 117 128 32 32 2474 42157
195 160 32 11203 FFFFFFFF Yes 127 128 32 32 2920 42157
214 160 32 9938 FFFFFFFF Yes 139 128 32 32 2712 42157
230 160 32 11622 5A5A5A5A Yes 150 128 32 32 2826 42157
249 160 32 9857 5A5A5A5A Yes 161 128 32 32 2816 42157
267 160 32 10381 5A5A5A5A Yes 173 128 32 32 2496 42157
285 160 32 10317 5A5A5A5A Yes 183 128 32 32 3067 42157
305 160 32 9808 5A5A5A5A Yes 194 128 32 32 2779 42157
207 128 32 32 2410 42157
End Time 25-Jul-2017 11.00 217 128 32 32 3020 42157
228 128 32 32 2743 42157
Average 10537 240 128 32 32 2474 42157
250 128 32 32 3147 42157
Started a little earlier 263 128 32 32 2423 42157
273 128 32 32 2927 42157
284 128 32 32 2876 42157
297 128 32 32 2422 42157
305 128 32 32 3758 42157
End Time 25-Jul-2017 11.00
Average 2786
Ended slightly later
|
Running the stress tests did not reveal any real data comparison failures, although one did appear to occur before a flat battery led to a switch off. Also, there were a couple of inexplicable program crashes, where, of course, the recorded results are lost. However, there is an issue regarding false error reports.
All of my Android CPU benchmarks arrange for starting, stopping and displaying results via Java code, with those executing native machine code produced from compiled C. The original benchmarks only display results when all processing is finished and do not appear to demonstrate the peculiar behaviour of the stress tests.
When running these stress tests, rotating the device leads to the initial starting display to be produced. Then, after pressing the Run button, errors, as shown below, are indicated. Before this, running VMSTAT, via a Terminal Emulator app, indicates that processing had not stopped executing the benchmark code. Hence, it seems that two copies of the program were running at the same time, confusing reported results. The same effect is reproduced by pressing the Run button whilst the program is executing.
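One common way to avoid two overlapping runs is a process-wide "already running" flag that the Run button handler checks before starting. The following is only a minimal sketch in plain Java, not the code used in these benchmarks. It assumes the restart seen here comes from Android's default behaviour of recreating the front-end Activity on rotation while the native test continues. The names RunGuard, tryStart, finish and onRunPressed are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard that allows only one benchmark run at a time. */
final class RunGuard {
    // Static, so the flag survives if the front-end object is recreated
    // (for example after a rotation) within the same process.
    private static final AtomicBoolean RUNNING = new AtomicBoolean(false);

    /** Returns true if the caller may start, false if a run is already active. */
    static boolean tryStart() {
        return RUNNING.compareAndSet(false, true);
    }

    /** Marks the run as finished so that a new one may start. */
    static void finish() {
        RUNNING.set(false);
    }
}

final class RunGuardDemo {
    /** Stand-in for the Run button handler (assumed name); returns a status message. */
    static String onRunPressed() {
        if (!RunGuard.tryStart()) {
            return "ignored: test already running";
        }
        // A real front end would start the benchmark thread here.
        return "started";
    }

    public static void main(String[] args) {
        System.out.println(onRunPressed()); // prints "started"
        System.out.println(onRunPressed()); // prints "ignored: test already running"
        RunGuard.finish();                  // the benchmark thread would call this at the end
        System.out.println(onRunPressed()); // prints "started"
    }
}
```

With a guard of this kind, a second press of Run, whether after a rotation or during a test, would be ignored instead of starting a second copy of the test.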
AVOID RUNNING WHEN THE PHONE/TABLET IS HAND HELD
The following are examples of false errors, when it was known that tests had been restarted after a stoppage caused by rotating the device. For a clean restart, the offending program can normally be killed by tapping the “RECENTS” (square) button and swiping the app off the screen, then restarted via the main display. However, on one tablet (T21), the program had to be stopped via Settings, Apps, Force Stop.
ARM/Intel MP-FPU Stress Test V1.0 26-Jun-2017 11.25
          Data          Ops/          Numeric
Seconds   Size Threads  Word   MFLOPS Results
   32.2 128 KB       8    32     1410       0  Zero indicates errors found
   54.1 128 KB       8    32     1172       0  Time/pass much greater than 10 seconds
   83.4 128 KB       8    32     1195       0
  121.9 128 KB       8    32     1385       0
  135.4 128 KB       8    32     1507   49805  Unexpected result
  153.8 128 KB       8    32     1367       0
  189.7 128 KB       8    32     1159       0
  204.5 128 KB       8    32      693       0  Shorter time but worse performance
  323.5 128 KB       8    32      692       0
  350.8 128 KB       8     3234222848   99999  Measured test time near zero in 27
                                               seconds, 99999 reflects initial data
  410.1 128 KB       8    32     1005       0
  433.9 128 KB       8    32     1255   66014  Expected result

ARM/Intel MP-Int Stress Test V1.0 26-Jun-2017 11.25
          Data                             Same All
Seconds   Size Threads     MB/sec Sumcheck Threads
    8.7 160 KB       8       4568 00000000 Yes       Test seconds as expected around 10
   25.3 160 KB       8       6375 00000000 Yes       to 11 seconds
   32.4 160 KB       8       3451 00000000 Yes       Yes means all threads correct result
   39.5 160 KB       8       3492 00000000 Yes
   46.6 160 KB       8 1951607840 00000000 Yes       Impossible MB/sec suggests test did
   79.5 160 KB       8       4728 FFFFFFFF Yes       not run
   98.0 160 KB       8       5205 FFFFFFFF Yes
  114.6 160 KB       8       5760 FFFFFFFF Yes       Should be six times 00000000
  134.4 160 KB       8       4797 5A5A5A5A Yes       then six times FFFFFFFF
  158.9 160 KB       8       3537 5A5A5A5A Yes       then six times 5A5A5A5A
  174.1 160 KB       8       4891 5A5A5A5A Yes       etc.
  237.4 160 KB       8 1951607840 5A5A5A5A No 1      Impossible MB/sec, 1 thread wrong
  374.6 160 KB       8  279600040 CCCCCCCC No 8      Impossible MB/sec, 8 threads wrong
  397.3 160 KB       8       3204 CCCCCCCC Yes
  415.7 160 KB       8       6304 0F0F0F0F No 4      4 threads wrong, were they CCCCCCCC?
  420.5 160 KB       8 1951607840 CCCCCCCC Yes
|
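The notes beside the MP-Int false error listing point to two checks that could be applied to a saved log: an MB/sec figure far beyond anything the hardware can deliver, and a data pattern that does not repeat six times in a row. The sketch below applies both to the MB/sec and Sumcheck columns copied from that listing. It is an illustration only, not part of the benchmarks. The class and method names are hypothetical, and the 1,000,000 MB/sec ceiling is an assumed threshold.

```java
import java.util.ArrayList;
import java.util.List;

final class StressLogCheck {
    // Assumption: any MB/sec above this ceiling is treated as a reporting error.
    static final long MAX_PLAUSIBLE_MB_PER_SEC = 1_000_000L;
    // From the notes in the listing: each data pattern should appear six times in a row.
    static final int EXPECTED_RUN_LENGTH = 6;

    // MB/sec column of the MP-Int false error example.
    static final long[] MB_PER_SEC = {
        4568, 6375, 3451, 3492, 1951607840L, 4728, 5205, 5760,
        4797, 3537, 4891, 1951607840L, 279600040L, 3204, 6304, 1951607840L
    };
    // Sumcheck column of the same example, in the same row order.
    static final String[] PATTERNS = {
        "00000000", "00000000", "00000000", "00000000", "00000000",
        "FFFFFFFF", "FFFFFFFF", "FFFFFFFF",
        "5A5A5A5A", "5A5A5A5A", "5A5A5A5A", "5A5A5A5A",
        "CCCCCCCC", "CCCCCCCC", "0F0F0F0F", "CCCCCCCC"
    };

    /** Counts rows whose speed is zero, negative or above the assumed ceiling. */
    static int countImplausible(long[] mbPerSec) {
        int bad = 0;
        for (long v : mbPerSec) {
            if (v <= 0 || v > MAX_PLAUSIBLE_MB_PER_SEC) {
                bad++;
            }
        }
        return bad;
    }

    /** Returns the lengths of consecutive runs of identical patterns. */
    static List<Integer> runLengths(String[] patterns) {
        List<Integer> runs = new ArrayList<>();
        int count = 0;
        for (int i = 0; i < patterns.length; i++) {
            count++;
            boolean lastOfRun = (i + 1 == patterns.length)
                    || !patterns[i + 1].equals(patterns[i]);
            if (lastOfRun) {
                runs.add(count);
                count = 0;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Prints "Implausible MB/sec rows: 4"
        System.out.println("Implausible MB/sec rows: " + countImplausible(MB_PER_SEC));
        // Prints "Pattern run lengths: [5, 3, 4, 2, 1, 1] (expected 6 each)"
        System.out.println("Pattern run lengths: " + runLengths(PATTERNS)
                + " (expected " + EXPECTED_RUN_LENGTH + " each)");
    }
}
```

None of the run lengths reaches six, which agrees with the notes that the pattern sequence was disturbed by the restarts.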
T7 Nexus 7 quad core CPU 1.3 GHz, 1.2 GHz > 1 core
Device Asus Nexus 7
RAM 1 GB DDR3L-1333 Bandwidth 5.3 GB/sec
Screen pixels w x h 1280 x 736
Twelve-core Nvidia GeForce ULP graphics 416 MHz
Android Build Version 4.1.2
Processor : ARMv7 Processor rev 9 (v7l)
processor : 0 BogoMIPS : 1993.93
processor : 1 BogoMIPS : 1993.93
processor : 2 BogoMIPS : 1993.93
processor : 3 BogoMIPS : 1993.93
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09 - Cortex-A9
CPU revision : 9
Hardware : grouper - nVidia Tegra 3 T30L
Revision : 0000
Linux version 3.1.10
Runs at 1.2 GHz
T11 Voyo A15, Samsung EXYNOS 5250 Dual core 2.0 GHz Cortex-A15,
Device Urbetter VOYO A15
Mali-T604 GPU, 2 GB DDR3-1600 RAM, dual channel, 12.8 GB/s
Screen pixels w x h 1920 x 1032
Android Build Version 4.2.2 - Jelly Bean
Processor : ARMv7 Processor rev 4 (v7l)
processor : 0
BogoMIPS : 992.87
processor : 1
BogoMIPS : 997.78
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
idiva idivt
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc0f
CPU revision : 4
Hardware : SMDK5250
Linux version 3.4.35Ut
Runs at 1.7 GHz
T15 HTC Nexus 9, dual core Denver CPU 2400 MHz
Screen pixels w x h 2048 x 1440
Android Build Version 5.0.1
Processor : NVIDIA Denver 1.0 rev 0 (aarch64)
processor : 0 & 1
Features : fp asimd aes pmull sha1 sha2 crc32
CPU implementer : 0x4e
CPU architecture: AArch64
CPU variant : 0x0
CPU part : 0x000
CPU revision : 0
Hardware : Flounder
Revision : 0000
MTS version : 33410787
Linux version 3.10.40
T21 Kindle Fire HDX 7, 2.2 GHz Quad Core Qualcomm Snapdragon 800 (Krait 400)
2 x 32 Bit LPDDR3-1866 Memory, 14.9 GB/s, GPU Qualcomm Adreno 330, 578 MHz
Device Amazon KFTHWI
Screen pixels w x h 1200 x 1803
Android Build Version 4.4.3
Processor : ARMv7 Processor rev 0 (v7l)
processor : 0, 1, 2, 3
BogoMIPS : 38.40
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
idiva idivt
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x06f
CPU revision : 0
Hardware : Qualcomm MSM8974
Revision : 0000
Linux version 3.4.0-perf (gcc version 4.7)
T22 Lenovo Tab 2 A8-50, 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
1 GB LPDDR3, GPU Mali T720 MP2
Device LENOVO Lenovo TAB 2 A8-50F
Screen pixels w x h 800 x 1216
Android Build Version 5.0.2
Processor : AArch64 Processor rev 3 (aarch64)
processor : 0, 1, 2
BogoMIPS : 26.0
Features : fp asimd aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: AArch64
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 3
Hardware : MT8161
Linux version 3.10.65
P33 Sony Xperia Z3+ E6533, Quad-core 1.5 GHz & Quad-core 2 GHz Qualcomm
Snapdragon 810 64-bit CPU
Screen pixels w x h 1080 x 1776
Android Build Version 5.0.2
Processor : AArch64 Processor rev 1 (aarch64)
processor : 0 to 7
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
Hardware : Qualcomm Technologies, Inc MSM8994
Linux version 3.10.49
P36 LGE LG-H811 Qualcomm Snapdragon 808, 1.8 GHz 64-bit Hexa-Core
Device LGE LG-H811
Screen pixels w x h 1440 x 2392
Android Build Version 5.1
Processor : AArch64 Processor rev 2 (aarch64)
processor : 0, 1, 2, 3, 4, 5
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 2
Hardware : Qualcomm Technologies, Inc MSM8992
Revision : 000b
Linux version 3.10.49-
P37 Lenovo Moto G4 Snapdragon 617, Octa-core Cortex-A53
Cores 4x1.5 GHz 4x1.2 GHz, 2 GB RAM 933 MHz,
GPU Adreno 405 550 MHz
Device Motorola Moto G (4)
Screen pixels w x h 1080 x 1776
Android Build Version 6.0.1
CPU part : 0xd03
CPU revision : 4
Hardware : Qualcomm Technologies, Inc MSM8952
Revision : 82a0
Processor : ARMv7 Processor rev 4 (v7l)
Device : athene_13mp
Radio : EMEA
MSM Hardware : MSM8952
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 5, 6, 7
model name : ARMv7 Processor rev 4 (v7l)
BogoMIPS : 38.00
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
idiva idivt vfpd32 evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
Linux version 3.10.84-g061c37c
P37 Later
Android Build Version 7.0
Linux version 3.10.84-g478d03a
P38 Samsung Galaxy Note 4 EXYNOS 5433, 4x1.9 GHz Cortex A57 + 4x1.3 GHz Cortex A53
Device Samsung SM-N910C
Screen pixels w x h 1440 x 2560
Android Build Version 6.0.1
processor : 4 to 7
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 76.00
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 0
Hardware : Samsung EXYNOS5433
Revision : 0015
Serial : bfc12ce406b30041
Linux version 3.10.9-9186796
P39 Galaxy Tab S2 SM-T710 EXYNOS 5433, 4x1.9 GHz Cortex A57 + 4x1.3 GHz Cortex A53
Device Samsung SM-T710
Screen pixels w x h 1536 x 2048
Android Build Version 6.0.1
processor : 4 to 7
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 76.00
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
idiva idivt
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 0
Hardware : Samsung EXYNOS5433
Revision : 0008
Serial : 5f827412e6280033
Linux version 3.10.9-8374498
P40 Moto X 1st XT1049, dual core 1.7 GHz Qualcomm Snapdragon S4 Pro MSM8960
Device Motorola XT1049
Screen pixels w x h 720 x 1184
Android Build Version 5.1
Processor : ARMv7 Processor rev 0 (v7l)
processor : 0, 1
BogoMIPS : 13.53
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x04d
CPU revision : 0
Hardware : msm8960dt
Revision : 8300
Serial : 0001000c044ef01d
Device : ghost
Radio : 4
Linux version 3.4.42-gd5fa9d8
P41 Moto G Play XT1607, quad core 1.2 GHz Cortex A53 MSM8916 Snapdragon 410
Device Motorola Moto G Play
Screen pixels w x h 720 x 1184
Android Build Version 6.0.1
CPU revision : 0
Hardware : Qualcomm Technologies, Inc MSM8916
Revision : 81b0
Serial : e5c8122300000000
Device : harpia
Radio : US
MSM Hardware : MSM8916
processor : 0 to 3
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 38.00
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva
idivt vfpd32 evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 0
Linux version 3.10.49-g41f86a8
A1 Asus MemoPad 7 ME176CEX, 1.86 GHz Intel Atom Z3745
Device Asus K013
Screen pixels w x h 800 x 1216
Android Build Version 4.4.2
Processor : ARMv7 processor rev 1 (v7l)
BogoMIPS : 1500.0
Features : neon vfp swp half thumb fastmult edsp vfpv3
CPU implementer : 0x69
CPU architecture: 7
CPU variant : 0x1
CPU part : 0x001
CPU revision : 1
Hardware : placeholder
Revision : 0001
Linux version 3.10.20
Mainly runs at 1.86 GHz Turbo Boost
A4 Intel(R) Atom x5-Z8300 1.84 GHz (turbo)
Device Intel cht_cr_rvp
Screen pixels w x h 800 x 1216
Android Build Version 5.1.1
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 76
model name : Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz
stepping : 3
microcode : 0x358
cpu MHz : 1840.000
cache size : 1024 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
aperfmperf nonstop_tsc_s3 pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2
ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes
rdrand lahf_lm 3dnowprefetch ida arat epb dtherm tpr_shadow vnmi
flexpriority ept vpid tsc_adjust smep erms
bogomips : 2879.90
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
Linux version 3.14.37
A5 Same tablet as W2 - Intel Atom Z8300 1.44 GHz, Turbo 1.84
Device Teclast X98 Plus(A5C8)
Screen pixels w x h 2048 x 1440
Android Build Version 5.1
Processor : ARMv7 processor rev 1 (v7l)
BogoMIPS : 1500.0
Features : neon vfp swp half thumb fastmult edsp vfpv3 vfpv4 idiva idivt
CPU implementer : 0x69
CPU architecture: 7
CPU variant : 0x1
CPU part : 0x001
CPU revision : 1
Hardware : placeholder
Revision : 0001
Linux version 3.14.37-x86_64-L1-R429
R1 Same as tablet W1 running via Remix for PC with Android 6
Intel Z8300 quad core 1.44 GHz Turbo 1.8
Device PIPO W1S
Screen pixels w x h 396 x 674
Android Build Version 6.0.1 - 64 bit
flags etc. As A4 above
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 76
model name : Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz
stepping : 3
microcode : 0x34f
cpu MHz : 1599.975
cache size : 1024 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 6
initial apicid
Linux version 4.4.14-android-x86_64
R2 Same as PC - Core i7 4820K quad core + HT at 3900 MHz Turbo
Screen pixels w x h 396 x 674
Android Build Version 6.0.1 - 64 bit
flags: numerous
bogomips : 7421.92
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Core(TM) i7-4820K CPU @ 3.70GHz
stepping : 4
microcode : 0x416
cpu MHz : 2471.484
cache size : 10240 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
Linux version 4.4.14-android-x86_64
W1 Pipo W1S Tablet. Intel Z8300 quad core 1.44 GHz Turbo 1.84
Same as R1 above
Windows 10, 4 GB DDR 3 1600
CPU GenuineIntel, Features Code BFEBFBFF, Model Code 000406C3
Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz Measured 1440 MHz
Has MMX, Has SSE, Has SSE2, Has SSE3, No 3DNow,
AMD64 processor architecture, 4 CPUs
Windows NT Version 6.2, build 9200,
Memory 4020 MB, Free 2520 MB
W2 Same tablet as A5
Teclast X98 Plus, Intel Atom Z8300 1.44 GHz, Turbo 1.84
CPU GenuineIntel, Features Code BFEBFBFF, Model Code 000406C3
Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz Measured 1440 MHz
Has MMX, Has SSE, Has SSE2, Has SSE3, No 3DNow,
Intel processor architecture, 4 CPUs
Windows NT Version 6.2, build 9200,
Memory 4021 MB, Free 2540 MB
User Virtual Space 4096 MB, Free 4083 MB
64 Bit
AMD64 processor architecture, 4 CPUs
User Virtual Space 134217728 MB, Free 134217716 MB
PC Core i7 4820K quad core + HT at 3900 MHz Turbo
Same as R2 above
CPU GenuineIntel, Features Code BFEBFBFF, Model Code 000306E4
Intel(R) Core(TM) i7-4820K CPU @ 3.70GHz Measured 3711 MHz
Has MMX, Has SSE, Has SSE2, Has SSE3, No 3DNow,
AMD64 processor architecture, 8 CPUs
Windows NT Version 6.2, build 9200,
Memory 32705 MB, Free 30584 MB
User Virtual Space 134217728 MB, Free 134217715 MB
|
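The CPU implementer and CPU part fields in the ARM listings above identify the core type, as in the T7 entry where 0xc09 is annotated as Cortex-A9. The following sketch shows one way such fields could be decoded from /proc/cpuinfo text. The lookup tables are a partial selection of publicly documented ID codes covering only values that appear in these listings, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class CpuInfoDecode {
    private static final Map<Integer, String> IMPLEMENTER = new HashMap<>();
    private static final Map<Integer, String> PART = new HashMap<>();

    /** Combines implementer and part codes into a single lookup key. */
    private static int key(int implementer, int part) {
        return (implementer << 12) | part;
    }

    static {
        // Partial tables: only codes seen in the listings above.
        IMPLEMENTER.put(0x41, "ARM");
        IMPLEMENTER.put(0x4e, "NVIDIA");
        IMPLEMENTER.put(0x51, "Qualcomm");
        IMPLEMENTER.put(0x69, "Intel");
        PART.put(key(0x41, 0xc09), "Cortex-A9");
        PART.put(key(0x41, 0xc0f), "Cortex-A15");
        PART.put(key(0x41, 0xd03), "Cortex-A53");
        PART.put(key(0x41, 0xd07), "Cortex-A57");
        PART.put(key(0x4e, 0x000), "Denver");
        PART.put(key(0x51, 0x04d), "Krait");
        PART.put(key(0x51, 0x06f), "Krait");
    }

    /** Returns the first token of the first "name : value" line, or null if absent. */
    static String field(String cpuinfo, String name) {
        for (String line : cpuinfo.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            if (line.substring(0, colon).trim().equals(name)) {
                // Keep only the first token, so "0xc09 - Cortex-A9" yields "0xc09".
                return line.substring(colon + 1).trim().split("\\s+")[0];
            }
        }
        return null;
    }

    /** Describes the core, falling back to the raw codes when they are not in the tables. */
    static String describe(String cpuinfo) {
        String imp = field(cpuinfo, "CPU implementer");
        String part = field(cpuinfo, "CPU part");
        if (imp == null || part == null) {
            return "unknown";
        }
        int i = Integer.decode(imp);
        int p = Integer.decode(part);
        String vendor = IMPLEMENTER.getOrDefault(i, "implementer " + imp);
        String core = PART.getOrDefault(key(i, p), "part " + part);
        return vendor + " " + core;
    }

    public static void main(String[] args) {
        String t11 = "CPU implementer : 0x41\nCPU architecture: 7\nCPU variant : 0x0\nCPU part : 0xc0f\n";
        String t21 = "CPU implementer : 0x51\nCPU architecture: 7\nCPU variant : 0x2\nCPU part : 0x06f\n";
        System.out.println(describe(t11)); // prints "ARM Cortex-A15"
        System.out.println(describe(t21)); // prints "Qualcomm Krait"
    }
}
```

Codes missing from the tables are reported in raw form rather than guessed, since the tables here are deliberately incomplete.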
Roy Longbottom March 2018