Android 12 and 13 Benchmarks and Cortex-X2 CPU With Low MP Efficiency
This new phone’s CPU is based on the Arm®v9.0-A architecture. As can be seen here, the program functions used identify a quite different set of features, but only limited information about the technology used. CPU-Z also provided limited information, and numerous searches did not help in finding more.
System 4 Android 13 Samsung S22
CPUID From Benchmarks, and From CPU-Z or Searches

Device Samsung SM-S901B
1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710
Screen pixels w x h 1080 x 2009
SOC Exynos 2200 4nm
Caches L1 64 KB, L2 between 512 & 1024 KB, L3 between 512 KB and 8 MB
GPU Xclipse 920

Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 bti

CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd48
CPU revision : 0

CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd47
CPU revision : 0

processor : 6
BogoMIPS : 51.20
Maximum CPU Speed Summary
System 1 - 2 x Cortex-A76 at 2.05 GHz, 6 x Cortex-A55 at 2.00 GHz
System 2 - 2 x Cortex-A76 at 2.00 GHz, 6 x Cortex-A55 at 1.80 GHz
System 3 - 2 x Cortex-A75 at 2.00 GHz, 6 x Cortex-A55 at 2.00 GHz
System 4 - 1 x Cortex-X2 at 2.80 GHz, 3 x Cortex-A710 at 2.52 GHz, 4 x Cortex-A510 at 1.82 GHz

The following single threaded CPU benchmarks are expected to run on the fastest CPU core. The same applies to the MP multithreading programs when run using a single thread.
System 1 Android 11 2.05 GHz ARM Cortex-A76

ARM/Intel Native Whetstone Benchmark 4A8 04-Feb-2023 13.11
Compiled for 64 bit ARM v8a

Test        MFLOPS      MOPS  millisecs        Results
N1 float   1087.84                0.018   -1.124750137
N2 float    846.07                0.159   -1.131330490
N3 if                3066.65      0.034    1.000000000
N4 fixpt             5109.38      0.062   12.000000000
N5 cos                147.35      0.565    0.499109805
N6 float    816.02                0.661    0.999999821
N7 equal             2043.99      0.090    3.000000000
N8 exp                 76.12      0.489    0.935364604
MWIPS      4815.37                2.077

Total Elapsed Time 18.3 seconds

System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)

ARM/Intel Native Whetstone Benchmark 4A8 05-Feb-2023 10.04

Test        MFLOPS      MOPS  millisecs        Results
N1 float   1068.88                0.018   -1.124750137
N2 float    886.76                0.152   -1.131330490
N3 if                2991.53      0.035    1.000000000
N4 fixpt             5013.41      0.063   12.000000000
N5 cos                141.39      0.588    0.499109805
N6 float    801.74                0.673    0.999999821
N7 equal             2004.78      0.092    3.000000000
N8 exp                 70.97      0.524    0.935364604
MWIPS      4663.10                2.144

Total Elapsed Time 16.2 seconds

System 3 Android 13 2.0 GHz ARM Cortex-A75

ARM/Intel Native Whetstone Benchmark 4A8 04-Feb-2023 15.31

Test        MFLOPS      MOPS  millisecs        Results
N1 float    819.73                0.023   -1.124750137
N2 float    665.33                0.202   -1.131330490
N3 if                2997.37      0.035    1.000000000
N4 fixpt             3331.87      0.095   12.000000000
N5 cos                130.91      0.636    0.499109805
N6 float    666.54                0.809    0.999999821
N7 equal             1332.93      0.139    3.000000000
N8 exp                 63.31      0.588    0.935364604
MWIPS      3959.52                2.526

Total Elapsed Time 15.6 seconds

System 4 Android 13 1x 2.80 GHz Cortex-X2

ARM/Intel Native Whetstone Benchmark 4A8 20-Apr-2023 20.18

Test        MFLOPS      MOPS  millisecs        Results  System 4/System 2
N1 float   1491.65                0.013   -1.124750137       1.40
N2 float   1231.55                0.109   -1.131330490       1.39
N3 if                3598.79      0.029    1.000000000       1.20
N4 fixpt             6992.04      0.045   12.000000000       1.39
N5 cos                246.11      0.338    0.499109805       1.74
N6 float   1118.73                0.482    0.999999821       1.40
N7 equal             2796.29      0.066    3.000000000       1.39
N8 exp                106.54      0.349    0.935364604       1.50
MWIPS      6986.66                1.431                      1.50

Total Elapsed Time 16.4 seconds
The Dhrystone integer benchmark produces a performance rating in VAX MIPS (AKA DMIPS). Results from two runs are provided for the first three systems, to demonstrate variance in measured MIPS speeds. These are generally in line with performance expectations, but a single run can give a false impression. The program checks for correct numeric results.
With this benchmark often being used to identify performance of ARM CPUs, the designers may have added more hardware tweaks to increase the rating to 12 MIPS per MHz on System 4, roughly twice as high as the other systems shown here. The program does not appear to be suitable for vector operation. In 2015 the rating used to be around 2 MIPS/MHz with my 64 bit program and 4 on high end Intel CPUs.
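As a rough check on the figures quoted, the VAX MIPS rating can be reproduced by dividing Dhrystones per second by 1757, the score conventionally attributed to the VAX 11/780 as the 1 MIPS reference. The sketch below is not taken from the benchmark source. The function names are hypothetical, and the clock speeds are the maximum values given in the CPU speed summary.

```python
# Sketch: reproduce VAX MIPS and MIPS per MHz from the reported results.
# Assumption: rating = Dhrystones per second / 1757 (the usual DMIPS convention).
VAX_11_780_DHRYSTONES_PER_SECOND = 1757


def vax_mips(dhrystones_per_second):
    """VAX MIPS (DMIPS) rating for a measured Dhrystones per second figure."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def mips_per_mhz(dhrystones_per_second, cpu_mhz):
    """Rating normalised by clock speed, for comparing core designs."""
    return vax_mips(dhrystones_per_second) / cpu_mhz


# (Dhrystones per second from the first run, maximum CPU MHz)
RESULTS = {
    "System 1": (24826887, 2050),
    "System 2": (24750676, 2000),
    "System 3": (21287928, 2000),
    "System 4": (59677446, 2800),
}

if __name__ == "__main__":
    for name, (dps, mhz) in RESULTS.items():
        print(f"{name}: {vax_mips(dps):8.0f} VAX MIPS, "
              f"{mips_per_mhz(dps, mhz):5.2f} MIPS/MHz")
```

Normalising by MHz is what separates the Cortex-X2 from the older cores here, since its higher clock alone would account for only a 1.4 times gain over System 2.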
System 1 Android 11 2.05 GHz ARM Cortex-A76

ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 14.25
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        40
Dhrystones per Second          24826887
VAX MIPS rating                   14130

ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 16.32
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        40
Dhrystones per Second          24821062
VAX MIPS rating                   14127

System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)

ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 14.35
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        40
Dhrystones per Second          24750676
VAX MIPS rating                   14087

ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 16.38
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        40
Dhrystones per Second          24841761
VAX MIPS rating                   14139

System 3 Android 13 2.0 GHz ARM Cortex-A75

ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 14.20
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        47
Dhrystones per Second          21287928
VAX MIPS rating                   12116

ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 16.22
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        47
Dhrystones per Second          21373535
VAX MIPS rating                   12165

System 4 Android 13 1x 2.80 GHz Cortex-X2

ARM/Intel Dhrystone 2 Benchmark 4A8 20-Apr-2023 20.26
Compiled for 64 bit ARM v8a

Nanoseconds one Dhrystone run        17   System 4/System 2
Dhrystones per Second          59677446
VAX MIPS rating                   33966        2.40
The Linpack benchmark speed is measured in MFLOPS. Three versions are provided: the original using double precision floating point calculations, one with single precision arithmetic, and a third using NEON SIMD single precision intrinsic functions. Results for this benchmark code should not be compared with those from the High Performance Linpack (HPL) benchmark. Again, the first two systems produced similar performance, with the third much slower. Single precision calculations were somewhat faster than those using double precision, producing different numeric sumchecks, yet consistent across all platforms. NEON functions led to roughly a doubling of measured MFLOPS, with the same single precision sumchecks.
System 4 - This is the first indication of possible heating issues, when running in the preferred power on mode. The third test then appeared to be slower than expected. Note that a number of other benchmarks were run between the last two tests, also indicating slow performance. This benchmark can be compiled to use vector processing, but is limited to two floating point operations per word, similar to MemSpeed and part of the MFLOPS benchmarks.
The later System 4 gains over System 2 were all greater than a factor of two, with the NEON test achieving nearly 9.5 GFLOPS or 3.38 MFLOPS per MHz.
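The gains and the MFLOPS per MHz figure quoted above follow directly from the reported speeds. A minimal sketch of that arithmetic, using the System 2 results and the later (23-Apr-2023) System 4 run, with hypothetical helper names:

```python
# Sketch: Linpack speed ratios and MFLOPS per MHz from the reported results.
SYSTEM_2_MFLOPS = {"DP": 2027.77, "SP": 2150.02, "NEON": 4614.88}
SYSTEM_4_MFLOPS = {"DP": 4826.04, "SP": 5083.03, "NEON": 9466.57}  # later run
SYSTEM_4_MHZ = 2800


def gains(new, old):
    """Ratio of new to old speed for each benchmark version."""
    return {version: round(new[version] / old[version], 2) for version in new}


def mflops_per_mhz(mflops, mhz):
    """Speed normalised by CPU clock."""
    return round(mflops / mhz, 2)


if __name__ == "__main__":
    print(gains(SYSTEM_4_MFLOPS, SYSTEM_2_MFLOPS))
    print(mflops_per_mhz(SYSTEM_4_MFLOPS["NEON"], SYSTEM_4_MHZ))
```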
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark 4A8 06-Feb-2023 12.19 4A8 06-Feb-2023 12.20 4A8 06-Feb-2023 13.38 Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Speed 2047.81 MFLOPS Speed 2186.84 MFLOPS Speed 4705.52 MFLOPS norm. resid 1.7 norm. resid 1.6 norm. resid 1.6 resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05 machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07 x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05 x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06 System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark 4A8 06-Feb-2023 14.59 4A8 06-Feb-2023 15.11 4A8 06-Feb-2023 15.13 Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Speed 2027.77 MFLOPS Speed 2150.02 MFLOPS Speed 4614.88 MFLOPS norm. resid 1.7 norm. resid 1.6 norm. resid 1.6 resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05 machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07 x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05 x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06 System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark 4A8 06-Feb-2023 15.44 4A8 06-Feb-2023 15.45 4A8 06-Feb-2023 15.47 Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Speed 1474.16 MFLOPS Speed 1664.41 MFLOPS Speed 3294.97 MFLOPS norm. resid 1.7 norm. resid 1.6 norm. 
resid 1.6 resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05 machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07 x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05 x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06 System 4 Android 13 1x 2.80 GHz Cortex-X2 Power then Battery ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark 4A8 20-Apr-2023 20.28 4A8 20-Apr-2023 20.30 4A8 20-Apr-2023 20.45 ## Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Speed 4834.32 MFLOPS Speed 4965.85 MFLOPS Speed 6246.93 MFLOPS norm. resid 1.7 norm. resid 1.6 norm. resid 1.6 resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05 machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07 x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05 x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06 After Memory Benchmarks ## System 4/System 2 MFLOPS 2.38 2.36 SLOW 1.35 2 ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark 4A8 23-Apr-2023 14.23 4A8 23-Apr-2023 14.21 4A8 23-Apr-2023 14.19 Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Speed 4826.04 MFLOPS Speed 5083.03 MFLOPS Speed 9466.57 MFLOPS norm. resid 1.7 norm. resid 1.6 norm. resid 1.6 resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05 machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07 x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05 x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06 System 4/System 2 MFLOPS 2.38 2.36 2.05 |
Below are MFLOPS scores for the 24 kernels, at one data span, and overall ratings of Maximum, Average, Geometric mean, Harmonic mean and Minimum MFLOPS.
Again, System 1’s slightly faster CPU clock gave it a lead over System 2, with System 3 far behind. Results are also provided for System 3 from a second power on run and on battery at 45% charge, all indicating the same performance.
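For readers unfamiliar with the five overall ratings, the sketch below computes unweighted versions of them for the System 1 results at Do Span 471. This is an illustration only: the benchmark's own overall figures are weighted and cover three spans (471, 90 and 19), so the values printed here are not expected to match the reported summary lines exactly.

```python
# Sketch: unweighted summary statistics for 24 Livermore kernel speeds.
# Assumption: plain (unweighted) means over one span, unlike the benchmark's
# weighted figures over three spans.
from statistics import fmean, geometric_mean, harmonic_mean

# System 1, Do Span 471, MFLOPS for the 24 kernels as reported.
SYSTEM_1_SPAN_471 = [
    2603.8, 1889.6, 1644.0, 1670.3, 790.6, 1433.2, 2606.3, 3006.5,
    2780.7, 1905.8, 941.0, 2110.0, 524.5, 756.1, 1414.9, 1560.5,
    1533.0, 2645.4, 715.3, 1930.0, 1766.2, 1300.3, 1554.1, 672.2,
]


def summarise(mflops):
    """Return the five ratings used in the overall summary line."""
    return {
        "Maximum": max(mflops),
        "Average": fmean(mflops),
        "Geomean": geometric_mean(mflops),
        "Harmean": harmonic_mean(mflops),
        "Minimum": min(mflops),
    }


if __name__ == "__main__":
    for name, value in summarise(SYSTEM_1_SPAN_471).items():
        print(f"{name:8s} {value:8.1f}")
```

The harmonic mean is always the lowest of the three means and is pulled down most by the slowest kernels, which is why it is often preferred as a conservative single figure.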
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel Livermore Loops Benchmark 4A8 06-Feb-2023 12.22 Compiled for 64 bit ARM v8a MFLOPS for 24 loops Do Span 471 2603.8 1889.6 1644.0 1670.3 790.6 1433.2 2606.3 3006.5 2780.7 1905.8 941.0 2110.0 524.5 756.1 1414.9 1560.5 1533.0 2645.4 715.3 1930.0 1766.2 1300.3 1554.1 672.2 Overall Weighted MFLOPS Do Spans 471, 90, 19 Maximum Average Geomean Harmean Minimum 3007.5 1651.3 1495.8 1335.2 524.5 Results of last two calculations 4.850340602749970e+02 1.300000000000000e+01 Total Elapsed Time 8.8 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel Livermore Loops Benchmark 4A8 06-Feb-2023 15.15 Compiled for 64 bit ARM v8a MFLOPS for 24 loops Do Span 471 2558.0 1853.8 1592.7 1636.9 774.7 1402.6 2553.6 2942.2 2730.5 1869.7 968.9 2086.2 516.0 745.9 1362.3 1525.4 1508.5 2594.3 700.3 1894.8 1736.3 1221.3 1521.9 658.0 Overall Weighted MFLOPS Do Spans 471, 90, 19 Maximum Average Geomean Harmean Minimum 2942.2 1619.1 1466.2 1308.7 516.0 Results of last two calculations 4.850340602749970e+02 1.300000000000000e+01 Total Elapsed Time 8.8 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel Livermore Loops Benchmark 4A8 06-Feb-2023 15.48 Compiled for 64 bit ARM v8a MFLOPS for 24 loops Do Span 471 2138.1 1346.2 1329.3 1308.0 668.8 929.1 2183.1 2718.9 2443.1 1380.8 667.8 1375.9 410.7 534.2 961.6 1003.3 1241.0 1755.8 429.5 1328.1 1256.7 958.1 1234.5 440.7 Overall Weighted MFLOPS Do Spans 471, 90, 19 Maximum Average Geomean Harmean Minimum 2718.9 1258.8 1111.2 964.9 371.3 Results of last two calculations 4.850340602749970e+02 1.300000000000000e+01 Total Elapsed Time 9.0 seconds System 3 Rerun 2137.7 1344.8 1329.4 1307.4 668.3 934.9 2182.5 2719.7 2443.9 1379.3 668.5 1376.4 412.5 533.2 961.2 1012.2 1241.8 1755.9 429.6 1328.9 1255.6 958.0 1234.6 440.7 System 3 Battery 45% 2137.8 1338.8 1329.3 1307.8 668.6 920.2 2181.5 2717.3 2443.8 1380.2 668.5 1380.1 413.1 535.0 961.2 1010.0 1235.2 1756.1 429.7 1328.5 1256.2 
957.9 1233.8 440.7 |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 On Power ARM/Intel Livermore Loops Benchmark 4A8 20-Apr-2023 20.32 Compiled for 64 bit ARM v8a MFLOPS for 24 loops Do Span 471 6669.7 4873.3 2659.1 3066.6 1131.5 2339.6 6444.7 6866.3 6740.1 4898.4 1372.2 6161.1 1871.2 1695.0 3828.5 3432.4 2452.0 6094.3 927.5 2690.4 2831.2 3429.4 2301.9 1363.1 Overall Weighted MFLOPS Do Spans 471, 90, 19 Maximum Average Geomean Harmean Minimum 6867.1 3634.5 3136.9 2652.1 927.5 Results of last two calculations 4.850340602749970e+02 1.300000000000000e+01 Total Elapsed Time 9.6 seconds Test 2 On Battery ARM/Intel Livermore Loops Benchmark 4A8 30-Apr-2023 13.40 Compiled for 64 bit ARM v8a MFLOPS for 24 loops Do Span 471 6827.0 4835.3 2747.5 3172.3 1136.1 2343.6 6520.9 6984.3 6718.6 4888.5 1375.9 6192.2 1928.1 1750.2 3963.8 3588.4 2550.2 6333.6 962.4 2699.2 2932.5 3547.5 2304.3 1361.5 Overall Weighted MFLOPS Do Spans 471, 90, 19 Maximum Average Geomean Harmean Minimum 7032.9 3662.7 3158.9 2667.6 929.0 Results of last two calculations 4.850340602749970e+02 1.300000000000000e+01 Total Elapsed Time 9.3 seconds Test 1/System 2 MFLOPS for 24 loops Do Span 471 2.61 2.63 1.67 1.87 1.46 1.67 2.52 2.33 2.47 2.62 1.42 2.95 3.63 2.27 2.81 2.25 1.63 2.35 1.32 1.42 1.63 2.81 1.51 2.07 Maximum Average Geomean Harmean Minimum 2.33 2.24 2.14 2.03 1.80 Test 2/System 2 MFLOPS for 24 loops Do Span 471 2.67 2.61 1.73 1.94 1.47 1.67 2.55 2.37 2.46 2.61 1.42 2.97 3.74 2.35 2.91 2.35 1.69 2.44 1.37 1.42 1.69 2.90 1.51 2.07 Maximum Average Geomean Harmean Minimum 2.39 2.26 2.15 2.04 1.80 |
This benchmark measures data reading speeds in MegaBytes per second carrying out calculations on arrays of cache and RAM data, sized 2 x 8 KB to 2 x 32 MB. Calculations are x[m]=x[m]+s*y[m] and x[m]=x[m]+y[m], using double and single precision (DP and SP) floating point and x[m]=x[m]+s+y[m] and x[m]=x[m]+y[m] with integers. Million Floating Point Operations Per Second (MFLOPS) speed can be calculated by dividing DP MB/second by 8 and 16, for the two tests, and SP speeds by 4 and 8.
The results clearly demonstrate differences in areas such as CPU, RAM and cache speeds, double and single precision floating point performance and cache sizes, indicating the invalidity of an overall single number rating.
Calculated single precision MFLOPS greater than the CPU MHz, or double precision MFLOPS greater than half that rate, indicate that SIMD instructions are being executed. For some reason, the older technology Cortex-A75 was best of the first three systems on L1 cache based double precision MFLOPS.
This and later benchmarks demonstrate that System 3 RAM speeds are much slower than those for the other two.
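The conversion from MB/second to MFLOPS described above can be sketched as follows. The divisors are those given in the text (8 and 16 for double precision, 4 and 8 for single precision); the function and table names are hypothetical.

```python
# Sketch: convert MemSpeed MBytes/second into MFLOPS.
# x[m]=x[m]+s*y[m] performs 2 floating point operations per pair of words read,
# x[m]=x[m]+y[m] performs 1, giving the bytes-per-operation divisors below.
BYTES_PER_FLOP = {
    ("DP", "x+s*y"): 8,
    ("DP", "x+y"): 16,
    ("SP", "x+s*y"): 4,
    ("SP", "x+y"): 8,
}


def mflops(mbytes_per_second, precision, test):
    """MFLOPS implied by a measured reading speed."""
    return mbytes_per_second / BYTES_PER_FLOP[(precision, test)]


if __name__ == "__main__":
    # System 1 best L1 cache speeds for the x[m]=x[m]+s*y[m] test.
    print(round(mflops(14377, "DP", "x+s*y")))  # reported Max MFLOPS 1797
    print(round(mflops(12749, "SP", "x+s*y")))  # reported Max MFLOPS 3187
```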
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel MemSpeed Benchmark 4A8 07-Feb-2023 10.21 Compiled for 64 bit ARM v8a Reading Speed in MBytes/Second Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m] KBytes Dble Sngl Int Dble Sngl Int 16 14368 12749 13579 25806 13430 13114 L1 32 14377 12612 13629 25300 13078 12931 64 14315 12442 13534 26042 12740 12967 128 13677 12190 13147 21466 12434 12616 L2 256 13537 12097 13036 21231 12311 12491 512 13432 12018 12831 20618 12261 12454 1024 13230 11924 12791 18379 12173 12401 L3 4096 11013 10328 10937 10390 10612 10386 16384 9371 9342 9406 8997 9282 9084 RAM 65536 8799 8846 8878 8636 8801 8665 Max MFLOPS 1797 3187 Total Elapsed Time 12.2 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel MemSpeed Benchmark 4A8 07-Feb-2023 10.26 Compiled for 64 bit ARM v8a Reading Speed in MBytes/Second Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m] KBytes Dble Sngl Int Dble Sngl Int 16 14059 12474 13286 26090 13109 12806 L1 32 14045 12320 13326 26087 13023 12843 64 14061 12187 13323 25871 12544 12729 128 13455 11979 12990 21318 12189 12418 L2 256 13100 11827 12715 20903 12119 12290 512 13309 11892 12791 21008 12129 12291 1024 13295 11932 12788 21078 11992 12281 L3 2 MB 4096 9419 9354 9522 8907 9251 6848 RAM 16384 7912 7797 7883 6614 7549 7320 65536 7722 7788 7530 7333 7467 7255 Max MFLOPS 1757 3119 Total Elapsed Time 11.8 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel MemSpeed Benchmark 4A8 07-Feb-2023 21.49 Compiled for 64 bit ARM v8a Reading Speed in MBytes/Second Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m] KBytes Dble Sngl Int Dble Sngl Int 16 19342 12941 14154 18768 10836 10799 L1 32 19432 12942 14187 18798 10784 10970 64 19430 12940 14184 18651 10803 10971 128 9987 9084 9830 10006 9040 9114 L2 256 10341 9551 10274 10461 10125 10120 512 10239 9563 10283 10398 10030 10021 1024 9249 8657 9109 9267 8923 8959 L3 4096 4942 4881 4926 4879 4917 4888 RAM 16384 4577 4511 4565 4522 4532 4542 65536 4408 4509 4523 4527 
4512 4510 Max MFLOPS 2429 3236 Total Elapsed Time 10.1 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 On Power ARM/Intel MemSpeed Benchmark 4A8 20-Apr-2023 20.40 Compiled for 64 bit ARM v8a Reading Speed in MBytes/Second Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m] KBytes Dble Sngl Int Dble Sngl Int 16 18273 16318 13593 34975 21486 21577 L1 32 15278 13607 13606 34968 21565 21690 64 15230 13584 13562 34953 21214 21543 128 15301 13604 13578 34717 21359 21555 L2 256 15244 13599 13599 34859 21152 21389 512 15311 13611 13610 34911 21257 21269 1024 15236 13590 13529 34630 21168 21299 4096 15269 13588 13570 34599 21601 21495 L3 16384 15075 13472 13449 21727 18962 19053 RAM 65536 13210 13468 13460 18029 16851 14148 Max MFLOPS 2284 4080 Total Elapsed Time 11.3 seconds Test 2 On Battery ARM/Intel MemSpeed Benchmark 4A8 23-Apr-2023 13.52 Compiled for 64 bit ARM v8a Reading Speed in MBytes/Second Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m] KBytes Dble Sngl Int Dble Sngl Int 16 22292 19857 19860 51064 31512 31522 L1 32 22342 19872 19842 51115 31999 32111 64 22229 19706 19782 51115 31400 31663 128 22300 19864 19858 50730 31237 31454 L2 256 22298 19875 19844 50906 31585 31959 512 22265 19873 19859 50290 30853 31149 1024 22346 19865 19872 49249 29985 30510 4096 21319 18952 19300 43691 28347 28916 L3 16384 19239 17066 15105 19805 19700 20244 RAM 65536 16165 15122 15114 17565 17043 17009 Max MFLOPS 2793 4968 Total Elapsed Time 10.4 seconds Test 1/System 2 KBytes Dble Sngl Int Dble Sngl Int 16 1.30 1.31 1.02 1.34 1.64 1.68 32 1.09 1.10 1.02 1.34 1.66 1.69 64 1.08 1.11 1.02 1.35 1.69 1.69 128 1.14 1.14 1.05 1.63 1.75 1.74 256 1.16 1.15 1.07 1.67 1.75 1.74 512 1.15 1.14 1.06 1.66 1.75 1.73 1024 1.15 1.14 1.06 1.64 1.77 1.73 4096 1.62 1.45 1.43 3.88 2.33 3.14 16384 1.91 1.73 1.71 3.29 2.51 2.60 65536 1.71 1.73 1.79 2.46 2.26 1.95 Test 2/System 2 KBytes Dble Sngl Int Dble Sngl Int 16 1.59 1.59 1.49 1.96 2.40 2.46 32 1.59 1.61 1.49 1.96 2.46 2.50 64 1.58 1.62 1.48 1.98 2.50 2.49 128 1.66 1.66 1.53 2.38 2.56 2.53 256 1.70 1.68 1.56 2.44 2.61 
2.60 512 1.67 1.67 1.55 2.39 2.54 2.53 1024 1.68 1.66 1.55 2.34 2.50 2.48 4096 2.26 2.03 2.03 4.91 3.06 4.22 L3 vs RAM 16384 2.43 2.19 1.92 2.99 2.61 2.77 65536 2.09 1.94 2.01 2.40 2.28 2.34 |
This benchmark carries out the same calculations as the MemSpeed Benchmark, except they are all in single precision, as applicable with the NEON calculations. The latter are carried out using NEON intrinsic functions. Using these SIMD instructions, four results per clock cycle are possible or 8 GFLOPS at 2 GHz, rising to 16 GFLOPS with fused multiply and add instructions, as possible with the first two columns. Here we have a maximum of nearly 10 GFLOPS. But more than 12 GFLOPS are demonstrated later under the MP-MFLOPS Benchmark, with compiled code using a single CPU core.
NEON integer operations per second were slightly higher than those for floating point. Instructions executed per second would be somewhat higher again, due to the inclusion of load, store and branching instructions.
With NEON operation, the much slower performance of System 3’s older processor is clearly shown.
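The peak rates quoted above (8 GFLOPS at 2 GHz, or 16 GFLOPS with fused multiply and add) come from simple arithmetic, sketched below. It assumes, as the text does, one 128 bit NEON instruction producing four single precision results per clock cycle; real cores may issue more or fewer, so this is a reference figure, not a hardware limit. The names are hypothetical.

```python
# Sketch: theoretical single-core NEON throughput under the stated assumption
# of four single precision results per clock cycle.
def peak_gflops(clock_ghz, results_per_cycle=4, fused_multiply_add=False):
    """Peak GFLOPS; a fused multiply-add counts as two operations per result."""
    flops_per_cycle = results_per_cycle * (2 if fused_multiply_add else 1)
    return clock_ghz * flops_per_cycle


def fraction_of_peak(measured_mflops, clock_ghz, fused_multiply_add=True):
    """Measured speed as a fraction of the theoretical peak."""
    return (measured_mflops / 1000) / peak_gflops(
        clock_ghz, fused_multiply_add=fused_multiply_add)


if __name__ == "__main__":
    print(peak_gflops(2.0))                            # add only
    print(peak_gflops(2.0, fused_multiply_add=True))   # multiply and add
    # System 1 NeonSpeed maximum of 9899 MFLOPS at 2.05 GHz.
    print(f"{fraction_of_peak(9899, 2.05):.2f}")
```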
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM NeonSpeed Benchmark 4A8 08-Feb-2023 10.50 Compiled for 64 bit ARM v8a Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 13068 39594 13739 43318 54907 54817 L1 32 13074 39493 13764 43255 46180 45660 64 13065 39273 13749 43106 45044 43823 128 12888 28829 13632 29341 29244 29271 L2 256 12647 26631 13425 26850 26852 26837 512 12629 22447 13434 22401 22417 22393 1024 12465 18418 13194 18358 18375 18341 L3 4096 11104 10324 11518 10239 9853 10056 16384 9022 8691 9324 8638 8589 8648 RAM 65536 8898 8365 8936 8322 8374 8312 Max MFLOPS 3269 9899 Total Elapsed Time 11.0 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM NeonSpeed Benchmark 4A8 08-Feb-2023 11.26 Compiled for 64 bit ARM v8a Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 12829 38832 13490 42520 53871 53927 L1 32 12827 38786 13499 42635 53916 53880 64 12804 38518 13479 42122 43667 43600 128 12599 28491 13330 28704 28805 28773 L2 256 12488 27960 13172 28234 28509 28465 512 12547 27304 13238 27373 27753 27759 1024 12499 23922 13222 24250 24376 25347 L3 4096 9494 8896 10109 9242 9403 9242 RAM 16384 7968 7476 8194 7719 7735 7642 65536 7892 7274 7914 6716 7229 7226 Max MFLOPS 3207 9708 Total Elapsed Time 10.6 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM NeonSpeed Benchmark 4A8 08-Feb-2023 12.22 Compiled for 64 bit ARM v8a Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 12933 21026 14176 21588 20680 20761 L1 32 12685 20668 13506 21296 20824 20824 64 12540 20612 13405 21227 20822 20844 128 9358 10086 10182 10055 10007 10016 L2 256 9843 10438 10550 10388 10379 10383 512 9827 10359 10414 10335 10270 10324 1024 8380 8886 8706 8902 8986 9011 L3 4096 4467 4561 4363 4576 4591 4596 RAM 16384 4656 4736 4674 4613 4741 4759 65536 4387 4601 
4514 4588 4588 4588 Max MFLOPS 3233 5257 Total Elapsed Time 10.3 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 On Power ARM NeonSpeed Benchmark 4A8 20-Apr-2023 20.47 Compiled for 64 bit ARM v8a Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 16313 85505 13260 72262 73469 73365 L1 32 13567 71006 12891 72240 73527 73528 64 13591 61035 12889 65553 62412 60633 128 13599 45930 12889 45743 45572 45718 L2 256 13606 46165 12891 46201 46187 46215 512 13595 45389 12878 45385 45550 45544 1024 13603 45930 12886 45922 45797 45865 4096 13595 38351 12878 38425 38827 38993 L3 16384 13482 22725 12767 22666 22942 22846 RAM 65536 13367 15431 12790 17360 18269 18185 Max MFLOPS 4078 21376 Total Elapsed Time 10.3 seconds Test 2 on Battery ARM NeonSpeed Benchmark 4A8 23-Apr-2023 13.55 Compiled for 64 bit ARM v8a Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 19862 102403 18573 102534 103684 103639 L1 32 19381 100863 18167 101897 103666 103409 64 19051 85761 18163 91701 85459 88190 128 19187 64767 18183 64770 64783 64820 L2 256 19199 64334 18184 65047 65140 65178 512 19185 63656 18192 64717 65401 65100 1024 19181 62057 18172 63202 62816 62338 4096 19153 56099 18067 56160 56082 55613 L3 16384 17795 24262 16849 24127 24352 23700 RAM 65536 15837 18834 15683 18968 19080 19083 Max MFLOPS 4966 25601 Total Elapsed Time 10.4 seconds Test 1/System 2 KBytes Norm Neon Norm Neon Float Int 16 1.27 2.20 0.98 1.70 1.36 1.36 32 1.06 1.83 0.95 1.69 1.36 1.36 64 1.06 1.58 0.96 1.56 1.43 1.39 128 1.08 1.61 0.97 1.59 1.58 1.59 256 1.09 1.65 0.98 1.64 1.62 1.62 512 1.08 1.66 0.97 1.66 1.64 1.64 1024 1.09 1.92 0.97 1.89 1.88 1.81 4096 1.43 4.31 1.27 4.16 4.13 4.22 16384 1.69 3.04 1.56 2.94 2.97 2.99 65536 1.69 2.12 1.62 2.58 2.53 2.52 Test2/System 2 KBytes Norm Neon Norm Neon Float Int 16 1.55 2.64 1.38 2.41 1.92 1.92 32 1.51 2.60 1.35 2.39 1.92 1.92 64 1.49 2.23 1.35 2.18 1.96 2.02 128 1.52 2.27 1.36 2.26 2.25 2.25 256 1.54 
2.30 1.38 2.30 2.28 2.29 512 1.53 2.33 1.37 2.36 2.36 2.35 1024 1.53 2.59 1.37 2.61 2.58 2.46 4096 2.02 6.31 1.79 6.08 5.96 6.02 L3 vs RAM 16384 2.23 3.25 2.06 3.13 3.15 3.10 65536 2.01 2.59 1.98 2.82 2.64 2.64 Battery/Power Best Case 512 1.41 1.40 1.41 1.43 1.44 1.43 1024 1.41 1.35 1.41 1.38 1.37 1.36 4096 1.41 1.46 1.40 1.46 1.44 1.43 |
This benchmark is designed to identify the reading of data in bursts over buses. The program starts by reading a word (4 bytes) with an address increment of 32 words (128 bytes) before reading another word. The increment is reduced by half on successive tests, until all data is read. On reading data from RAM, 64 Byte bursts are typically used. Then, measured reading speed reduces from a maximum, when all data is read, to a minimum on using 16 word increments (64 bytes). Potential maximum bus speed can be estimated by multiplying the Inc16 value by 16. Then, for each half reduction in increments, a near doubling of MB/second could be expected. Burst reading is also indicated on some cache based data transfers.
The near constant Read All performance indicates CPU speed limitation, influenced by calculations involved, where RAM Inc 2 to Read All data transfer speeds do not approach doubling on Systems 1 and 2. This effect also disguises System 3’s slower RAM.
See MP-BusSpeed results, indicating that access by multiple cores is necessary to obtain maximum memory throughput, where adequate CPU performance is provided.
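The bus speed estimate and the expected doubling described above can be checked against a row of results. The sketch below uses the System 1 RAM row (65536 KBytes); the helper names are hypothetical, and the estimate assumes 64 byte bursts as stated in the text.

```python
# Sketch: estimate potential bus speed from the Inc16 result and show the
# gain obtained each time the address increment is halved.
COLUMNS = ["Inc32", "Inc16", "Inc8", "Inc4", "Inc2", "Read All"]

# System 1, 65536 KBytes (RAM), MBytes/second as reported.
SYSTEM_1_RAM = [548, 873, 1768, 3199, 6494, 7957]


def estimated_bus_mb_per_second(row):
    """Inc16 reads one 4 byte word per 64 byte burst, so scale by 16."""
    return row[COLUMNS.index("Inc16")] * 16


def halving_gains(row):
    """Speed ratio for each halving of the increment, from Inc16 onwards."""
    start = COLUMNS.index("Inc16")
    return [round(row[i + 1] / row[i], 2) for i in range(start, len(row) - 1)]


if __name__ == "__main__":
    print(estimated_bus_mb_per_second(SYSTEM_1_RAM))
    print(halving_gains(SYSTEM_1_RAM))
```

A final gain well below 2 on the Inc2 to Read All step is the CPU limitation referred to in the text.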
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel BusSpeed Benchmark 4A8 08-Feb-2023 10.52 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 3887 5358 7637 8100 8113 8111 L1 32 7697 7796 7836 8102 8103 8111 64 6288 6426 7983 8114 8118 8111 128 2017 3596 6107 8099 8104 8108 L2 256 1646 2526 4675 7276 8065 8094 512 863 1304 2723 5462 8104 8101 1024 791 1128 2277 4449 7705 7907 L3 4096 608 996 1965 3548 7123 7894 16384 558 886 1791 3198 6659 7945 RAM 65536 548 873 1768 3199 6494 7957 Total Elapsed Time 5.0 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel BusSpeed Benchmark 4A8 08-Feb-2023 11.31 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 6809 6976 7643 7939 7952 7942 L1 32 7561 7650 7685 7951 7958 7952 64 6197 6285 7820 7959 7964 7946 128 1977 3555 5903 7894 7925 7938 L2 256 1526 2513 4872 7650 7913 7945 512 1022 1838 3661 7276 5696 6919 1024 910 1560 3071 5808 7796 6611 L3 4096 648 992 2132 4132 7393 7440 RAM 16384 586 877 1792 3650 6820 7898 65536 570 857 1763 3501 6647 7896 Total Elapsed Time 5.2 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel BusSpeed Benchmark 4A8 08-Feb-2023 12.24 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 6671 6851 7497 7964 7981 7983 L1 32 7330 7498 7498 7979 7980 7990 64 2827 2565 5606 7463 7836 7953 128 1566 1426 2322 4300 6046 7990 L2 256 1213 991 2076 3945 5492 7983 512 604 625 1851 3750 5444 7974 1024 616 588 1726 3202 4796 7103 L3 4096 579 522 1228 2419 4788 7448 RAM 16384 541 537 1135 2230 4545 7510 65536 496 520 1145 2292 4582 7528 Total Elapsed Time 4.9 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 On Power ARM/Intel BusSpeed Benchmark 4A8 20-Apr-2023 20.42 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 7774 8000 8823 9146 9166 7630 L1 32 7262 7357 7375 7641 7648 7635 64 6110 7378 7575 7644 7654 7633 128 3745 3985 7557 7653 7653 7635 L2 256 3742 3917 7567 7648 7654 7633 512 3785 4060 7419 7652 7654 7597 1024 3727 4073 6810 7647 7654 7626 4096 3246 2934 5918 7611 7641 7625 L3 16384 1803 1692 3441 6450 7556 7572 RAM 65536 1485 1535 3175 6175 7495 7544 Total Elapsed Time 5.1 seconds Test 2 On Battery ARM/Intel BusSpeed Benchmark 4A8 23-Apr-2023 14.03 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 9518 9771 10736 11133 11157 11145 L1 32 10614 10747 10799 11160 11174 11153 64 8911 10778 11062 11163 11167 11155 128 5472 5824 11046 11169 11182 11152 L2 256 5504 5782 11121 11174 11179 11155 512 5544 5911 11065 11181 11172 11146 1024 5479 6056 10871 11177 11178 11150 4096 4731 4097 8153 11145 11145 11146 L3 16384 2432 2023 4103 7354 10873 11063 RAM 65536 1484 1712 3572 6648 10627 11050 Total Elapsed Time 5.0 seconds Test1/System 2 KBytes Inc32 Inc16 Inc8 Inc4 Inc2 All 16 1.14 1.15 1.15 1.15 1.15 0.96 32 0.96 0.96 0.96 0.96 0.96 0.96 64 0.99 1.17 0.97 0.96 0.96 0.96 128 1.89 1.12 1.28 0.97 0.97 0.96 256 2.45 1.56 1.55 1.00 0.97 0.96 512 3.70 2.21 2.03 1.05 1.34 1.10 1024 4.10 2.61 2.22 1.32 0.98 1.15 4096 5.01 2.96 2.78 1.84 1.03 1.02 16384 3.08 1.93 1.92 1.77 1.11 0.96 65536 2.61 1.79 1.80 1.76 1.13 0.96 Test2/System 2 KBytes Inc32 Inc16 Inc8 Inc4 Inc2 All 16 1.40 1.40 1.40 1.40 1.40 1.40 32 1.40 1.40 1.41 1.40 1.40 1.40 64 1.44 1.71 1.41 1.40 1.40 1.40 128 2.77 1.64 1.87 1.41 1.41 1.40 256 3.61 2.30 2.28 1.46 1.41 1.40 512 5.42 3.22 3.02 1.54 1.96 1.61 1024 6.02 3.88 3.54 1.92 1.43 1.69 4096 7.30 4.13 3.82 
2.70 1.51 1.50 16384 4.15 2.31 2.29 2.01 1.59 1.40 65536 2.60 2.00 2.03 1.90 1.60 1.40 Battery/Power Best Case 512 1.46 1.46 1.49 1.46 1.46 1.47 1024 1.47 1.49 1.60 1.46 1.46 1.46 4096 1.46 1.40 1.38 1.46 1.46 1.46 |
RandMem benchmark carries out four tests comprising serial and random address selections using the same program structure, with read and read/write tests, where the data read points to the next address, with no arithmetic calculations. The main purpose is to demonstrate how much slower performance can be through using random access. Here, speed can be considerably influenced by reading and writing in bursts, where much of the data is not used, and by the size of preceding caches.
This benchmark demonstrates the best and worst data transfer speeds from RAM, running a single program. Best is serial reading, which has minimum CPU instruction execution time, reading all data in a burst. Worst is random access, with a low probability of reading data from the same burst.
Some of System 3’s results were noticeably slower than those in the other memory benchmarks.
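The idea of data that points to the next address can be illustrated with an index chain, where each array element holds the index of the next element to read. This sketch only shows the access pattern. It is not the benchmark's code and, being interpreted Python, is not suitable for measuring memory speed. The single-cycle random ordering is an assumption made for the illustration.

```python
# Sketch: serial and random index chains, where each element read gives the
# index of the next element (no arithmetic on the data itself).
import random


def build_chain(words, randomised, seed=0):
    """Return a list where chain[i] is the next index to visit after i."""
    order = list(range(words))
    if randomised:
        random.Random(seed).shuffle(order)
    chain = [0] * words
    for position in range(words):
        chain[order[position]] = order[(position + 1) % words]
    return chain


def walk(chain, start=0):
    """Follow the chain until it returns to the start; count words read."""
    index = chain[start]
    words_read = 1
    while index != start:
        index = chain[index]
        words_read += 1
    return words_read


if __name__ == "__main__":
    print(build_chain(8, randomised=False))  # serial: each word points to the next
    print(build_chain(8, randomised=True))   # random: same words, scattered order
    print(walk(build_chain(1000, randomised=True)))
```

In the random case successive reads rarely fall in the same 64 byte burst, which is why the RAM figures in the tables fall to a few hundred MB/second.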
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel RandMem Benchmark 4A8 08-Feb-2023 10.53 Compiled for 64 bit ARM v8a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 8659 13607 14309 13669 L1 32 14800 15595 14275 13640 64 14693 15357 14261 13579 128 12719 13268 8758 7856 L2 256 12616 13060 4867 5225 512 12746 13177 2816 3274 1024 12251 12337 1416 1908 L3 4096 11763 7213 664 717 16384 11472 6327 556 597 RAM 65536 11481 5996 526 565 Total Elapsed Time 8.1 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel RandMem Benchmark 4A8 08-Feb-2023 11.37 Compiled for 64 bit ARM v8a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 14413 15265 14036 13429 L1 32 14467 15309 14068 13413 64 14558 15147 14022 13378 128 12462 13066 6195 6645 L2 256 12480 13083 4764 4853 512 10959 12560 1962 2452 1024 10617 12740 1195 1534 L3 4096 12067 6824 534 538 RAM 16384 12051 6031 409 415 65536 12002 5763 349 364 Total Elapsed Time 8.6 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel RandMem Benchmark 4A8 08-Feb-2023 12.25 Compiled for 64 bit ARM v8a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 12972 15051 12798 12393 L1 32 13116 15184 12788 13243 64 12814 15150 11406 12759 128 8668 8727 2588 3199 L2 256 8078 7972 2279 2567 512 8017 7301 1555 1779 1024 7165 6442 1056 1268 L3 4096 7481 3425 484 410 RAM 16384 7453 3262 343 273 65536 7080 3014 333 292 Total Elapsed Time 8.5 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 ARM/Intel RandMem Benchmark 4A8 20-Apr-2023 20.43 Compiled for 64 bit ARM v8a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 26053 25057 23723 18443 L1 32 23084 22915 22289 18268 64 21887 22732 21187 16691 128 20287 21627 13268 10698 L2 256 20283 21661 10263 9161 512 20217 21467 8842 8383 1024 20015 21326 7138 7354 4096 20218 20853 3323 4499 L3 16384 19874 12556 1568 1962 RAM 65536 19649 11471 983 1328 Total Elapsed Time 7.9 seconds Test 2 Battery ARM/Intel RandMem Benchmark 4A8 23-Apr-2023 14.00 Compiled for 64 bit ARM v8a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 31747 30518 30144 24456 L1 32 30682 30415 29525 24245 64 29039 30172 28299 22411 128 26821 28695 17049 14193 L2 256 26980 28762 13155 11756 512 25989 27680 11462 10935 1024 25887 27358 9344 9597 4096 25894 25909 4078 5770 L3 16384 23440 13046 1647 1987 RAM 65536 22756 11750 1023 1372 Test1/System 2 Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 1.81 1.64 1.69 1.37 32 1.60 1.50 1.58 1.36 64 1.50 1.50 1.51 1.25 128 1.63 1.66 2.14 1.61 256 1.63 1.66 2.15 1.89 512 1.84 1.71 4.51 3.42 1024 1.89 1.67 5.97 4.79 4096 1.68 3.06 6.22 8.36 L3 16384 1.65 2.08 3.83 4.73 RAM 65536 1.64 1.99 2.82 3.65 Test2/System 2 Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 2.20 2.00 2.15 1.82 32 2.12 1.99 2.10 1.81 64 1.99 1.99 2.02 1.68 128 2.15 2.20 2.75 2.14 256 2.16 2.20 2.76 2.42 512 2.37 2.20 5.84 4.46 1024 2.44 2.15 7.82 6.26 4096 2.15 3.80 7.64 10.72 L3 16384 1.95 2.16 4.03 4.79 RAM 65536 1.90 2.04 2.93 3.77 Battery/Power Best and Worst Case 1024 1.29 1.28 1.31 1.31 4096 1.28 1.24 1.23 1.28 16384 1.18 1.04 1.05 1.01 65536 1.16 1.02 1.04 1.03 |
The benchmarks run code for single and double precision Fast Fourier Transforms of size 1024 to 1048576 (1K to 1024K), with running times in milliseconds. Two versions are available: FFT1, the original version, and FFT3c, with optimised C code. Memory used increases with FFT size, eventually requiring data from RAM, and is often accessed on a skipped sequential basis, leading to burst reading effects. The cost of moving to a different cache level or to RAM is demonstrated where execution time more than doubles on doubling the FFT size.
Here, on executing FFT1, System 2 is shown to be faster than System 1. This test was repeated later, showing System 1 slightly faster, as expected. As with all these first tests, the benchmarks were run with power connected, and the reason for the difference is unknown. This demonstrates the danger in assessing performance by running a single benchmark.
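A simple way to look for these steps is to divide each running time by the time for the previous (half sized) FFT. The sketch below does this for the first single precision column of the System 1 FFT1 results reported in this section; the function name `doubling_ratios` is a hypothetical helper, not part of the benchmark.

```python
def doubling_ratios(times_ms):
    """Ratio of each running time to the time for the previous, half sized FFT."""
    return [later / earlier for earlier, later in zip(times_ms, times_ms[1:])]


# System 1, FFT1, single precision, first pass, sizes 1K to 1024K (milliseconds)
sizes_k = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
system1_sp_ms = [0.047, 0.092, 0.197, 0.434, 1.196, 3.331,
                 7.407, 14.196, 43.757, 121.602, 310.438]

for size, ratio in zip(sizes_k[1:], doubling_ratios(system1_sp_ms)):
    print(f"{size:5d}K  {ratio:5.2f} x previous size")
```

Ratios close to 2 suggest the data is still served from the same level of the memory hierarchy, while noticeably larger ratios point to a move to a slower cache or to RAM. Single measurements at these short running times are noisy, so the ratios should be read as indications only.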
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel FFT Benchmark 1 4A8 08-Feb-2023 10.55 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.047 0.044 0.042 0.044 0.044 0.042 2 0.092 0.091 0.091 0.092 0.091 0.091 4 0.197 0.197 0.196 0.204 0.202 0.203 8 0.434 0.429 0.429 0.573 0.461 0.302 16 1.196 1.199 1.183 1.395 1.428 1.265 32 3.331 3.275 3.271 4.362 4.296 4.123 64 7.407 7.325 6.456 8.545 8.260 7.313 128 14.196 13.447 12.777 24.470 24.741 23.636 256 43.757 43.396 43.050 66.080 65.481 65.891 512 121.602 121.637 121.264 157.855 157.641 157.182 1024 310.438 309.197 303.803 369.157 364.380 362.249 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 4.3 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel FFT Benchmark 1 4A8 08-Feb-2023 11.40 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.037 0.030 0.030 0.031 0.031 0.030 2 0.065 0.064 0.064 0.065 0.065 0.064 4 0.140 0.139 0.139 0.144 0.143 0.143 8 0.306 0.303 0.303 0.420 0.411 0.410 16 0.697 0.668 0.666 1.002 0.875 0.836 32 1.740 1.744 1.707 2.158 2.112 2.090 64 4.656 4.247 4.453 5.826 5.675 6.420 128 17.591 12.325 11.902 23.000 23.823 22.929 256 45.956 47.550 46.355 64.257 63.979 63.376 512 120.193 120.099 124.833 156.133 155.517 156.019 1024 295.659 334.325 304.642 361.975 360.212 361.947 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 4.1 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel FFT Benchmark 1 4A8 08-Feb-2023 12.26 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.034 0.030 0.030 0.026 0.025 0.025 2 0.065 0.065 0.065 0.055 0.055 0.054 4 0.141 0.142 0.139 0.154 0.152 0.154 8 0.329 0.337 0.335 0.440 0.442 0.454 16 0.872 0.895 0.877 1.054 1.071 1.089 32 2.182 2.168 2.146 
2.729 2.840 2.793 64 5.401 5.475 5.492 9.277 9.631 9.695 128 16.977 17.529 17.099 39.834 43.928 43.814 256 85.865 82.130 81.941 112.404 108.405 110.697 512 215.935 221.886 219.700 258.905 259.124 258.621 1024 506.663 504.806 500.864 604.900 598.287 595.695 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 6.5 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 ARM/Intel FFT Benchmark 1 4A8 20-Apr-2023 20.58 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.028 0.024 0.022 0.024 0.023 0.022 2 0.050 0.049 0.049 0.050 0.049 0.048 4 0.108 0.130 0.099 0.103 0.102 0.102 8 0.224 0.223 0.223 0.404 0.372 0.365 16 0.803 0.782 0.792 0.827 0.698 0.696 32 1.394 1.428 1.402 1.313 1.343 1.211 64 2.364 2.368 2.373 2.606 2.441 2.213 128 4.666 4.417 4.580 5.713 5.632 5.501 256 11.612 11.316 11.384 14.595 13.892 14.434 512 27.517 26.152 25.995 38.339 41.675 41.686 1024 79.904 78.725 78.795 105.524 105.813 107.723 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 1.3 seconds Test 2 Battery ARM/Intel FFT Benchmark 1 4A8 23-Apr-2023 14.06 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.027 0.023 0.022 0.024 0.023 0.022 2 0.050 0.178 0.049 0.050 0.049 0.049 4 0.108 0.107 0.107 0.112 0.111 0.111 8 0.245 0.245 0.242 0.400 0.412 0.397 16 0.850 0.857 0.865 0.950 0.892 0.694 32 1.524 1.404 1.417 1.391 1.259 1.212 64 2.543 2.188 2.174 2.316 2.287 2.183 128 4.584 4.687 4.464 4.886 4.555 4.635 256 9.222 9.279 9.224 10.926 10.972 10.583 512 22.076 21.046 21.753 33.690 31.855 33.518 1024 59.946 61.047 60.812 89.821 90.799 90.701 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 1.1 seconds Average Comparisons Test 1/Old 2 Test 2/old 2 Battery/Power SP DP SP DP SP DP 1 1.31 1.32 1.35 1.32 1.03 1.00 2 1.30 1.37 0.70 1.29 0.53 0.94 4 1.24 1.15 1.30 1.08 1.05 0.94 8 1.36 1.18 1.25 1.06 0.92 0.90 16 0.85 1.49 0.79 1.42 0.92 0.95 32 1.23 2.18 1.19 2.28 0.97 1.04 64 1.88 3.64 1.93 4.20 1.03 1.16 128 3.06 4.37 3.04 5.61 0.99 1.28 256 4.08 4.00 5.04 5.01 1.24 1.25 512 4.58 3.52 5.63 4.19 1.23 1.19 1024 3.94 3.40 5.14 4.00 1.31 1.18 |
System 1 Android 11 2.05 GHz ARM Cortex-A76 ARM/Intel FFT Benchmark 3c 4A8 08-Feb-2023 10.56 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.035 0.029 0.028 0.030 0.028 0.028 2 0.066 0.062 0.061 0.063 0.059 0.062 4 0.141 0.132 0.134 0.136 0.136 0.134 8 0.307 0.290 0.290 0.360 0.350 0.338 16 0.702 0.676 0.675 0.790 0.766 0.790 32 1.545 1.476 1.472 1.754 1.766 1.783 64 3.423 3.333 3.367 4.380 4.278 4.231 128 8.240 8.024 8.108 11.155 10.916 10.553 256 19.756 19.283 19.493 26.542 26.701 26.368 512 43.903 43.320 43.422 60.771 61.454 60.828 1024 94.409 93.012 93.336 145.439 142.632 144.625 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 2.1 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) ARM/Intel FFT Benchmark 3c 4A8 08-Feb-2023 11.42 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.044 0.030 0.030 0.031 0.028 0.028 2 0.069 0.063 0.063 0.061 0.060 0.060 4 0.162 0.135 0.135 0.135 0.133 0.133 8 0.347 0.301 0.298 0.317 0.314 0.337 16 0.841 0.722 0.908 0.826 1.134 0.840 32 1.795 1.753 1.652 2.089 2.047 1.987 64 3.586 3.422 3.732 4.646 4.674 4.701 128 8.411 8.138 7.877 10.902 10.906 10.933 256 19.554 20.523 19.439 25.088 24.605 26.126 512 47.427 44.633 44.105 56.174 63.102 62.016 1024 107.446 102.961 101.591 145.147 141.521 141.941 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 2.1 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 ARM/Intel FFT Benchmark 3c 4A8 08-Feb-2023 12.27 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.054 0.035 0.034 0.035 0.032 0.032 2 0.076 0.073 0.073 0.073 0.070 0.070 4 0.165 0.157 0.161 0.169 0.167 0.165 8 0.381 0.353 0.360 0.391 0.382 0.379 16 0.856 0.823 0.836 0.991 0.966 0.983 32 1.861 1.852 1.899 2.365 2.297 2.317 
64 4.402 4.224 4.266 6.097 5.913 6.111 128 10.802 10.491 10.793 15.843 15.477 15.512 256 26.539 25.950 26.473 37.175 37.135 37.191 512 58.571 57.610 56.704 88.722 90.241 88.155 1024 125.591 124.655 126.555 217.677 222.146 221.802 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 |
Running time of this benchmark is now less than one second, with some measured FFT times being at microsecond level. These may depend on timer resolution, which calls the validity of comparisons into question.
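One common way of reducing the influence of timer resolution is to repeat a short test until the total measured time is comfortably long, then report the time per pass. The sketch below illustrates that idea only. It is an assumption about how such a calibration could be arranged, not a description of how the FFT benchmarks are written, and `timed_passes` and `min_seconds` are hypothetical names.

```python
import time


def timed_passes(work, min_seconds=0.1):
    """Double the pass count until the measured time reaches min_seconds.

    Returns (passes, seconds_per_pass). `work` is any callable with no
    arguments that performs one pass of the test.
    """
    passes = 1
    while True:
        start = time.perf_counter()
        for _ in range(passes):
            work()
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return passes, elapsed / passes
        passes *= 2


# Example: time a deliberately tiny piece of work
passes, per_pass = timed_passes(lambda: sum(range(1000)), min_seconds=0.05)
print(f"{passes} passes, {per_pass * 1e6:.2f} microseconds per pass")
```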
System 4 Android 13 1x 2.80 GHz Cortex-X2 Test 1 Power ARM/Intel FFT Benchmark 3c 4A8 20-Apr-2023 20.59 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.039 0.026 0.025 0.013 0.011 0.011 2 0.061 0.054 0.054 0.025 0.050 0.023 4 0.128 0.115 0.115 0.053 0.051 0.051 8 0.303 0.254 0.253 0.124 0.121 0.120 16 0.641 0.607 0.606 0.296 0.284 0.285 32 1.345 1.339 1.042 0.627 0.609 0.611 64 2.434 2.049 1.824 1.360 1.406 1.322 128 3.597 3.419 3.412 2.985 2.890 2.957 256 6.718 6.180 6.077 7.266 7.216 7.083 512 13.537 12.908 12.913 17.726 19.994 20.027 1024 31.804 30.518 30.398 46.998 44.458 44.174 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 0.7 seconds Test 2 Battery ARM/Intel FFT Benchmark 3c 4A8 23-Apr-2023 14.10 Compiled for 64 bit ARM v8a Size milliseconds K Single Precision Double Precision 1 0.051 0.026 0.025 0.013 0.011 0.011 2 0.061 0.054 0.053 0.025 0.023 0.023 4 0.139 0.115 0.115 0.053 0.051 0.051 8 0.276 0.257 0.254 0.123 0.121 0.141 16 0.646 0.607 0.604 0.295 0.284 0.284 32 1.366 0.985 0.979 0.632 0.619 0.618 64 2.240 2.054 1.869 1.394 1.333 1.328 128 3.824 3.569 2.988 3.016 2.914 2.882 256 6.934 6.319 6.096 7.240 7.160 7.170 512 13.635 13.227 13.144 17.729 17.652 17.596 1024 30.851 30.087 29.866 40.093 38.497 38.396 1024 Square Check Maximum Noise Average Noise SP 9.999520e-01 3.346482e-06 4.565234e-11 DP 1.000000e+00 1.133294e-23 1.428110e-28 Total Elapsed Time 0.7 seconds Average Comparisons Test 1/Old 2 Test 2/old 2 Battery/Power SP DP SP DP SP DP 1 1.16 2.02 1.02 2.53 0.88 1.25 2 1.15 2.30 1.16 2.58 1.01 1.12 4 1.21 2.63 1.17 2.54 0.97 0.96 8 1.17 3.06 1.20 3.02 1.03 0.99 16 1.33 3.29 1.33 3.27 1.00 0.99 32 1.40 3.39 1.56 3.40 1.12 1.00 64 1.70 3.62 1.74 3.63 1.02 1.00 128 2.34 3.57 2.35 3.57 1.00 1.00 256 3.14 3.24 3.08 3.45 0.98 1.06 512 3.46 3.15 3.40 3.59 0.98 1.14 1024 3.36 3.16 3.44 3.66 1.02 1.16 |
For more information on the Whetstone Benchmark, see the stand-alone version, above. The multithreading version runs multiple copies of the same shared code, with separate variables.
Before comparing results, it should be noted that the high Fixpt MOPS figures are not realistically achievable. They arise because the compiler has found that some of the code can be skipped without changing the calculated result. However, the time for this function has little effect on the overall MWIPS rating.
With mixed MHz CPU cores and big.LITTLE architectures, comparisons become more complex, where each one indicates superior performance in specific areas.
For this benchmark, overall seconds depend on calibrations and should not be compared between systems. However, in an ideal world, on each system the time would be constant up to 8 threads accessing 8 CPU cores. Comparing overall MWIPS ratings with the single thread result, throughput was around twice as high using 2 threads, about 3.4 times at 4 threads, and between 5.1 and 6.3 times with 8 threads.
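The throughput ratios quoted above can be reproduced from the MWIPS column of the results that follow. The sketch below is a worked calculation only, and `speedups` is a hypothetical helper name.

```python
def speedups(mwips_by_threads):
    """MWIPS at each thread count divided by the single thread MWIPS."""
    base = mwips_by_threads[1]
    return {threads: round(mwips / base, 2)
            for threads, mwips in mwips_by_threads.items()}


# MWIPS ratings at 1, 2, 4 and 8 threads, copied from the results tables
system1 = {1: 3886.0, 2: 7695.3, 4: 12943.7, 8: 24326.3}
system2 = {1: 4064.7, 2: 8660.6, 4: 14117.1, 8: 20887.7}
system3 = {1: 3856.7, 2: 7716.2, 4: 13246.4, 8: 20674.2}

for name, results in (("System 1", system1),
                      ("System 2", system2),
                      ("System 3", system3)):
    print(name, speedups(results))
```

With eight cores of equal speed the 8 thread ratio would approach 8, so the shortfall reflects the slower cores in these mixed designs.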
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM/Intel MP-Whetstone Benchmark 4A8 08-Feb-2023 16.56 Compiled for 64 bit ARM v8a Using 1, 2, 4 and 8 Threads MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal 1 2 3 MOPS MOPS MOPS MOPS MOPS 1T 3886.0 680.6 663.1 734.2 119.3 62.5 23326.1 1976.9 741.3 2T 7695.3 1541.7 1409.7 1456.9 240.9 115.2 98493.7 4205.8 1474.0 4T 12943.7 2547.7 2495.0 2575.8 365.2 220.6148870.7 8186.7 2268.0 8T 24326.3 4564.2 4353.4 4700.6 695.7 435.2323353.9 22743.2 4101.4 Overall Seconds 2.91 1T, 2.91 2T, 3.93 4T, 4.83 8T All calculations produced consistent numeric results Total Elapsed Time 14.9 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) ARM/Intel MP-Whetstone Benchmark 4A8 08-Feb-2023 17.22 Compiled for 64 bit ARM v8a Using 1, 2, 4 and 8 Threads MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal 1 2 3 MOPS MOPS MOPS MOPS MOPS 1T 4064.7 957.0 728.2 738.0 129.4 64.6 18308.6 2444.3 751.1 2T 8660.6 1757.5 1505.7 1596.2 270.0 142.0 85717.2 5241.1 1505.1 4T 14117.1 3461.0 3322.1 2696.8 439.0 239.9140592.8 11249.6 2471.6 8T 20887.7 4732.1 4868.8 4176.3 518.0 386.2309958.3 19432.5 3457.2 Overall Seconds 2.74 1T, 2.67 2T, 3.98 4T, 4.57 8T All calculations produced consistent numeric results Total Elapsed Time 14.3 seconds System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 ARM/Intel MP-Whetstone Benchmark 4A8 08-Feb-2023 15.43 Compiled for 64 bit ARM v8a Using 1, 2, 4 and 8 Threads MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal 1 2 3 MOPS MOPS MOPS MOPS MOPS 1T 3856.7 819.3 818.8 666.1 130.1 63.4 50817.9 2984.8 562.2 2T 7716.2 1637.2 1636.4 1332.3 260.3 126.9112199.9 5982.7 1124.6 4T 13246.4 2792.2 2730.5 2385.4 421.9 230.4192831.0 11651.4 1966.1 8T 20674.2 4431.3 4528.9 3840.0 596.8 390.2289064.4 21237.2 3009.2 Overall Seconds 4.99 1T, 4.99 2T, 6.67 4T, 8.09 8T All calculations produced consistent numeric results Total Elapsed Time 25.7 
seconds |
Some of the test function running times were, again, in the microsecond range, possibly distorting comparisons. The single core benchmark obtained an overall speed rating of 1.50 times that of the older phone used for comparison purposes. This time the MWIPS gain was between 1.50 and 1.75 times, depending on the thread count.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710

Test 1 Battery
ARM/Intel MP-Whetstone Benchmark 4A8 23-Apr-2023 14.30
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads

      MWIPS MFLOPS MFLOPS MFLOPS    Cos    Exp    Fixpt      If  Equal
                 1      2      3   MOPS   MOPS     MOPS    MOPS   MOPS
 1T  6937.5 1419.7 1389.3 1092.2  236.7  103.3  83276.2  3985.3 1923.4
 2T 12987.7 2695.4 2562.8 2087.1  449.1  197.9 125540.0  7418.4 3212.5
 4T 24739.0 5315.1 5214.5 4090.7  835.9  384.5 244974.6 14825.7 5227.7
 8T 32198.9 7676.4 7993.3 5510.1 1035.3  510.5 331293.6 25184.5 5594.2

 Overall Seconds 4.20 1T, 5.29 2T, 6.00 4T, 9.04 8T
 All calculations produced consistent numeric results
 Total Elapsed Time 25.1 seconds

Test1/System 2
 1T    1.71   1.48   1.91   1.48   1.83   1.60     4.55    1.63   2.56
 2T    1.50   1.53   1.70   1.31   1.66   1.39     1.46    1.42   2.13
 4T    1.75   1.54   1.57   1.52   1.90   1.60     1.74    1.32   2.12
 8T    1.54   1.62   1.64   1.32   2.00   1.32     1.07    1.30   1.62
This benchmark does not provide reasonable increases in measured performance using multiple cores, probably because many of the variables used are shared by all threads. Results using one thread are only slightly slower than from the single core version, indicating that threading overheads were not excessive.
The lack of improvement using multiple cores probably invalidates comparisons of the two systems.
At least the System 4/System 2 performance comparison indicated a gain of between 2.0 and 2.45 times.

System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 08-Feb-2023 16.58
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads

 Threads                        1        2        4        8
 Seconds                     0.80     2.03     5.47    14.00
 Dhrystones per Second   25133472 19708774 14614211 11428905
 VAX MIPS rating            14305    11217     8318     6505

 Internal pass count correct all threads
 Total Elapsed Time 22.7 seconds

System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 08-Feb-2023 17.24
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads

 Threads                        1        2        4        8
 Seconds                     0.84     2.24     6.23    14.31
 Dhrystones per Second   23687920 17834612 12843313 11183452
 VAX MIPS rating            13482    10151     7310     6365

 Internal pass count correct all threads
 Total Elapsed Time 24.1 seconds

System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 08-Feb-2023 15.45
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads

 Threads                        1        2        4        8
 Seconds                     0.75     1.97     4.98    12.88
 Dhrystones per Second   21326073 16280555 12851505  9937004
 VAX MIPS rating            12138     9266     7314     5656

 Internal pass count correct all threads
 Total Elapsed Time 21.3 seconds

System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 23-Apr-2023 14.32
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads

 Threads                        1        2        4        8
 Seconds                     0.69     2.01     5.08    14.28
 Dhrystones per Second   57735505 39843345 31467495 22401220
 VAX MIPS rating            32860    22677    17910    12750

 Internal pass count correct all threads
 Total Elapsed Time 22.6 seconds

 System 4/System 2           2.44     2.23     2.45     2.00
This benchmark is not generally available with the new 4A8 compilation as overall running time had increased to more than 400 seconds, on a new phone.
Considering Read All, performance of all three systems was virtually the same for cache based data, using the simple integer arithmetic involved. Systems 1 and 2 RAM speeds were quite similar, with system 3 far behind, maybe due to dual channel versus single channel operation.
Estimated bus speeds, calculated as 16 times the Inc16 results, were similar to the measured Read All MB/second when more than one thread was used.
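The estimate is a single multiplication, shown here for the 8 thread RAM results of Systems 2 and 4. The factor of 16 is taken from the text; the reasoning given in the comment (each word read at an increment of 16 words is assumed to bring in a full burst of 16 words) is an interpretation, not something stated in the results, and `estimated_bus_speed` is a hypothetical name.

```python
def estimated_bus_speed(inc16_mb_per_s):
    """Estimated bus speed in MB/second from an Inc16 result.

    Assumption: reading one 4 byte word in every 16 still transfers the
    whole 16 word (64 byte) burst, so 16 times as much data moves over
    the bus as the benchmark counts.
    """
    return 16 * inc16_mb_per_s


# RAM sized test (49152 KB), 8 threads: Inc16 and RdAll from the tables
print("System 2:", estimated_bus_speed(861), "estimated vs 13906 RdAll")
print("System 4:", estimated_bus_speed(2968), "estimated vs 34406 RdAll")
```

For System 2 the estimate and the Read All figure agree closely. For System 4 the estimate is well above the measured Read All speed, so the agreement noted above applies to the older systems rather than to the new phone.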
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM/Intel MP-BusSpd2 Benchmark 4A8 08-Feb-2023 17.01 Compiled for 64 bit ARM v8a MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 7329 7124 7451 7334 7341 7101 L1 2T 10325 13362 14290 7684 8059 7832 <<< Later 14130 4T 17070 19398 25187 24043 27212 20101 8T 14174 17228 36750 29288 41665 29522 122.9 1T 1878 2887 4854 7296 7368 6407 L2 2T 1863 3247 6737 7374 13119 7689 4T 3830 6261 9539 14764 17344 15561 8T 5462 8906 16427 25436 32650 29293 49152 1T 404 569 1155 2233 4053 4376 RAM 2T 409 777 1583 3176 6429 9715 4T 564 942 1821 3646 7426 11040 8T 598 970 1950 3715 7974 15460 No Errors Found Total Elapsed Time 58.4 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) ARM/Intel MP-BusSpd2 Benchmark 4A8 08-Feb-2023 17.26 Compiled for 64 bit ARM v8a MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 7161 7297 7497 7588 7702 7460 L1 2T 8249 12429 13881 13746 15061 15482 4T 7947 10882 15414 19060 22373 19375 8T 12283 11971 29090 27379 39212 26439 122.9 1T 1992 3367 6029 7489 7375 7503 L2 2T 3907 7106 11767 14529 15642 15813 4T 4709 7833 12544 18015 19659 19260 8T 4742 8651 15108 25444 37308 32776 49152 1T 528 789 1730 3469 6325 7353 RAM 2T 726 988 1832 3623 7074 13999 Calculated 4T 719 882 1762 3321 6886 13740 Bus Speed 8T 681 861 1800 3451 7147 13906 13776 No Errors Found Total Elapsed Time 52.9 seconds System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 ARM/Intel MP-BusSpd2 Benchmark 4A8 08-Feb-2023 15.47 Compiled for 64 bit ARM v8a MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 7116 7549 7746 7936 7963 7976 L1 2T 12590 13817 14578 15785 15865 15924 4T 19944 23807 26173 27694 28498 20714 8T 16635 16726 35602 29673 43358 32010 122.9 1T 1232 1142 2415 4406 5734 7975 L2 2T 2718 3123 5270 8813 11478 15947 4T 
3100 4607 7739 13599 18013 20644 8T 3189 6323 9391 19850 27135 30640 49152 1T 547 540 1116 2269 4488 7518 RAM 2T 581 580 1140 2289 4582 9156 4T 642 625 1691 3324 8091 9188 8T 601 687 1586 3099 5079 9027 No Errors Found Total Elapsed Time 48.8 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710 ARM/Intel MP-BusSpd2 Benchmark 4A8 23-Apr-2023 14.34 Compiled for 64 bit ARM v8a MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 9766 10568 10655 10683 10312 10724 L1 2T 16006 17611 19131 19771 19436 19690 4T 29633 30846 35796 35823 37133 37949 8T 17413 18447 42972 39292 53233 52517 122.9 1T 4904 5381 8001 9509 9478 9553 L2 2T 8182 8579 15623 18945 19070 19051 4T 15433 15194 26980 34383 31191 35705 8T 14336 15505 27156 35831 39276 47641 49152 1T 1158 1163 2593 5707 10124 10218 RAM 2T 2580 2145 4723 9139 16890 18311 Calculated 4T 4236 3485 7626 12461 21916 30342 Bus Speed 8T 2821 2968 6508 10792 21131 34406 47488 No Errors Found Total Elapsed Time 50.9 seconds System 4 / System 2 12.3 1T 1.36 1.45 1.42 1.41 1.34 1.44 2T 1.94 1.42 1.38 1.44 1.29 1.27 4T 3.73 2.83 2.32 1.88 1.66 1.96 8T 1.42 1.54 1.48 1.44 1.36 1.99 122.9 1T 2.46 1.60 1.33 1.27 1.29 1.27 2T 2.09 1.21 1.33 1.30 1.22 1.20 4T 3.28 1.94 2.15 1.91 1.59 1.85 8T 3.02 1.79 1.80 1.41 1.05 1.45 49152 1T 2.19 1.47 1.50 1.65 1.60 1.39 2T 3.55 2.17 2.58 2.52 2.39 1.31 4T 5.89 3.95 4.33 3.75 3.18 2.21 8T 4.14 3.45 3.62 3.13 2.96 2.47 |
This program simply reads (or writes) data that supplies the next location to access. This lack of arithmetic calculations apparently provides faster data transfer speeds than BusSpeed.
Repeating the benchmark on System 1 continued to produce variable performance on RndRDWR tests using RAM.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM/Intel MP-RndMem Benchmark 4A8 08-Feb-2023 17.04 Compiled for 64 bit ARM v8a MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 15672 16244 15166 13508 L1 2T 14435 10708 21438 9174 4T 35744 8391 34088 7762 8T 52284 8129 32321 7232 122.9 1T 11052 11762 7956 7209 L2 2T 17349 9400 14378 5457 4T 30743 7405 18898 5343 8T 44553 6837 21266 4174 12288 1T 11287 6549 407 424 RAM 2T 9081 4458 641 223 4T 14381 3463 539 64 8T 16627 2564 1061 121 No Errors Found Total Elapsed Time 47.9 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 15277 15160 13995 13764 L1 2T 27401 14764 27575 13529 4T 30145 14883 29903 13394 8T 43856 14293 33190 13297 122.9 1T 12005 13509 7296 7303 L2 2T 25241 12840 14676 7336 4T 30128 12674 15276 7226 8T 46484 11959 18064 7166 12288 1T 11371 6158 437 429 RAM 2T 15348 5818 471 402 4T 14136 5793 499 404 8T 17555 5276 597 392 No Errors Found Total Elapsed Time 47.2 seconds System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 ARM/Intel MP-RndMem Benchmark 4A8 08-Feb-2023 15.49 Compiled for 64 bit ARM v8a MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 13840 15739 13328 13741 L1 2T 25791 15710 25075 13919 4T 34426 15334 33819 13779 8T 50511 15029 38275 13788 122.9 1T 8965 9269 2727 3397 L2 2T 16943 9249 6348 3391 4T 24738 9152 8399 3410 8T 42321 9190 12827 3402 12288 1T 7704 3364 510 358 RAM 2T 9140 3371 550 334 4T 15521 3367 574 358 8T 14550 3358 747 358 No Errors Found Total Elapsed Time 42.6 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710 ARM/Intel MP-RndMem Benchmark 4A8 23-Apr-2023 14.38 Compiled for 64 bit ARM v8a Battery MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 30966 18582 17061 14184 2T 29434 17173 29474 14222 4T 58965 25538 88024 22464 8T 91141 23590 67089 21167 122.9 1T 26009 19920 12525 10496 2T 39542 23049 23892 13454 4T 71554 23106 39923 12058 8T 75854 20575 42745 9824 12288 1T 23597 12335 1980 2921 2T 33194 11639 3260 2735 4T 44727 10552 5269 2372 8T 50346 9798 5297 1920 No Errors Found System 4 / System 2 KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 2.03 1.23 1.22 1.03 2T 1.07 1.16 1.07 1.05 4T 1.96 1.72 2.94 1.68 8T 2.08 1.65 2.02 1.59 122.9 1T 2.17 1.47 1.72 1.44 2T 1.57 1.80 1.63 1.83 4T 2.38 1.82 2.61 1.67 8T 1.63 1.72 2.37 1.37 12288 1T 2.08 2.00 4.53 6.81 2T 2.16 2.00 6.92 6.80 4T 3.16 1.82 10.56 5.87 8T 2.87 1.86 8.87 4.90 |
The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f, with 2 and 32 operations per input data word, using 1, 2, 4 and 8 threads. Three data sizes are used, 12.8, 128 and 12800 KB (3200, 32000 and 3200000 single precision floating point words), chosen to exercise L1 cache, L2 cache and RAM. Each thread carries out the same calculations but accesses a different segment of the data. The program checks for consistent numeric results, primarily to show that all calculations have actually been carried out.
As indicated earlier, on using SIMD with 128 bit registers and linked (fused) multiply and add, up to eight single precision floating point operations could be expected per clock cycle, or 16 GFLOPS per core at 2 GHz. The first two processors, with Cortex-A76 CPUs, appear to have a reasonable implementation of SIMD, achieving over 12 GFLOPS at 32 operations per word, with System 3 far behind. All show acceptable improvements using two cores, but the gains become disappointing using four cores with these big.LITTLE CPU architectures.
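As an illustration of how the operation counts arise, the sketch below applies the simplest two operation form and the expression quoted above, which contains eight operations (three additions inside the brackets, three multiplications, one subtraction and one further addition). The 32 operation test presumably extends the same pattern with more terms; that is an assumption, as are the function names and the constant values used here. Pure Python will not approach the speeds of the compiled benchmark, so only the arithmetic is being shown.

```python
def update_2_ops(x, a, b):
    """x[i] = (x[i] + a) * b: one add and one multiply per word."""
    for i in range(len(x)):
        x[i] = (x[i] + a) * b


def update_8_ops(x, a, b, c, d, e, f):
    """The quoted expression: eight floating point operations per word."""
    for i in range(len(x)):
        x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f


def mflops(words, ops_per_word, passes, seconds):
    """Millions of floating point operations per second for a timed run."""
    return words * ops_per_word * passes / seconds / 1e6


# Data size check: 3200 single precision words of 4 bytes each
print(3200 * 4, "bytes, reported as 12.8 KB")

data = [1.0, 2.0]
update_2_ops(data, 1.0, 2.0)   # (1 + 1) * 2 and (2 + 1) * 2
print(data)
```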
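The 16 GFLOPS figure follows from the register width, the word size and the two operations delivered by a fused multiply and add. The sketch below is a worked version of that arithmetic under the assumptions stated in the text (one 128 bit fused operation issued per clock cycle); `peak_gflops` and its parameter names are hypothetical.

```python
def peak_gflops(clock_ghz, register_bits=128, word_bits=32, ops_per_lane=2):
    """Expected peak GFLOPS for one core.

    lanes        = words held in one SIMD register (128 / 32 = 4)
    ops_per_lane = 2, as a fused multiply and add counts as two operations
    """
    lanes = register_bits // word_bits
    return clock_ghz * lanes * ops_per_lane


peak = peak_gflops(2.0)                 # 2 GHz core, single precision
measured = 12506 / 1000                 # System 2, 1 thread, 32 Ops/Word, 12.8 KB
print(f"peak {peak} GFLOPS, measured {measured:.1f} GFLOPS, "
      f"{100 * measured / peak:.0f}% of peak")
```

On this basis the single thread System 2 result at 32 operations per word is a little under 80% of the expected maximum.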
Note that all systems obtained the same sumchecks of numeric calculations at all levels of threading.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM/Intel MP-MFLOPS2 Benchmark 4A8 08-Feb-2023 17.06 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 5378 5545 3318 12106 12306 11395 2T 10988 10354 3174 22955 23278 12780 4T 9979 10692 2591 25718 25633 24694 8T 13285 14803 2433 30061 31648 28941 Results x 100000, 0 indicates ERRORS 1T 40392 76406 99700 35218 66014 99520 2T 40392 76406 99700 35218 66014 99520 4T 40392 76406 99700 35218 66014 99520 8T 40392 76406 99700 35218 66014 99520 Total Elapsed Time 8.1 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) ARM/Intel MP-MFLOPS2 Benchmark 4A8 08-Feb-2023 17.31 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 6819 6238 2804 12506 12537 12441 2T 8797 9307 2946 22427 24126 22731 4T 9364 9132 2554 25008 26004 25345 8T 10985 13262 2398 33664 34024 32553 Results x 100000, 0 indicates ERRORS 1T 40392 76406 99700 35218 66014 99520 2T 40392 76406 99700 35218 66014 99520 4T 40392 76406 99700 35218 66014 99520 8T 40392 76406 99700 35218 66014 99520 Total Elapsed Time 7.5 seconds System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 ARM/Intel MP-MFLOPS2 Benchmark 4A8 08-Feb-2023 15.52 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 5825 4972 1567 7724 7327 7052 2T 11131 11772 1673 14574 15183 14065 4T 11598 13049 1775 17670 17991 17216 8T 13773 15038 1748 23906 24232 22806 Results x 100000, 0 indicates ERRORS 1T 40392 76406 99700 35218 66014 99520 2T 40392 76406 99700 35218 66014 99520 4T 40392 76406 99700 35218 66014 99520 8T 40392 76406 99700 35218 66014 99520 Total Elapsed Time 11.5 seconds |
Again, running times of individual tests could be too short to provide accurate performance estimates and comparisons. But it is clear that speeds of more than twice those of the older phone can be achieved. With power connected, heating effects appear to reduce performance, in some cases by more than 25%.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710 Test 1 Power ARM/Intel MP-MFLOPS2 Benchmark 4A8 20-Apr-2023 20.48 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 15822 13964 5340 24338 21850 21784 2T 17818 22599 5582 30511 30811 29994 4T 28770 25695 14235 48935 51359 48815 8T 44099 36862 25214 66160 71096 74910 Results x 100000, 0 indicates ERRORS 1T 40392 76406 99700 35218 66014 99520 2T 40392 76406 99700 35218 66014 99520 4T 40392 76406 99700 35218 66014 99520 8T 40392 76406 99700 35218 66014 99520 Total Elapsed Time 4.0 seconds Test 2 Battery ARM/Intel MP-MFLOPS2 Benchmark 4A8 23-Apr-2023 14.13 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 15285 14091 6830 31790 30388 30516 2T 21438 19857 8629 40890 41320 41764 4T 38093 25569 14322 64398 66969 64473 8T 40847 39072 31887 66206 68989 70401 Results x 100000, 0 indicates ERRORS 1T 40392 76406 99700 35218 66014 99520 2T 40392 76406 99700 35218 66014 99520 4T 40392 76406 99700 35218 66014 99520 8T 40392 76406 99700 35218 66014 99520 Total Elapsed Time 3.2 seconds Test1/System 2 1T 2.32 2.24 1.90 1.95 1.74 1.75 2T 2.03 2.43 1.89 1.36 1.28 1.32 4T 3.07 2.81 5.57 1.96 1.98 1.93 8T 4.01 2.78 10.51 1.97 2.09 2.30 Test2/System 2 1T 2.24 2.26 2.44 2.54 2.42 2.45 2T 2.44 2.13 2.93 1.82 1.71 1.84 4T 4.07 2.80 5.61 2.58 2.58 2.54 8T 3.72 2.95 13.30 1.97 2.03 2.16 Battery/Power 1T 0.97 1.01 1.28 1.31 1.39 1.40 2T 1.20 0.88 1.55 1.34 1.34 1.39 4T 1.32 1.00 1.01 1.32 1.30 1.32 8T 0.93 1.06 1.26 1.00 0.97 0.94 |
All systems produced identical sumchecks. These differ from those produced by MP-MFLOPS, probably due to a variation in initial run time calibration or in SIMD content.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM NEON-MFLOPS2-MP Benchmark 4A8 08-Feb-2023 17.07 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 7929 7999 3322 13136 13104 13090 2T 14163 13998 3171 25686 25710 25825 4T 15732 15495 3008 27646 27012 24837 8T 9105 12776 2439 29803 28991 27127 Results x 100000, 12345 indicates ERRORS 1T 44934 86735 99850 36770 79897 99759 2T 44934 86735 99850 36770 79897 99759 4T 44934 86735 99850 36770 79897 99759 8T 44934 86735 99850 36770 79897 99759 Total Elapsed Time 3.6 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) ARM NEON-MFLOPS2-MP Benchmark 4A8 08-Feb-2023 17.33 Compiled for 64 bit ARM v8a FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 4396 4753 2555 12669 11585 11782 2T 4661 6779 2894 22112 21236 21738 4T 7706 6001 2561 23015 26865 24635 8T 7286 7062 2397 35348 31644 29849 Results x 100000, 12345 indicates ERRORS 1T 44934 86735 99850 36770 79897 99759 2T 44934 86735 99850 36770 79897 99759 4T 44934 86735 99850 36770 79897 99759 8T 44934 86735 99850 36770 79897 99759 Total Elapsed Time 4.1 seconds System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 FPU Add & Multiply using 1, 2, 4 and 8 Threads 2 Ops/Word 32 Ops/Word KB 12.8 128 12800 12.8 128 12800 MFLOPS 1T 5486 5040 1706 7138 7167 7176 2T 11637 11560 1787 14195 14325 14398 4T 10948 10623 1853 17213 17304 17096 8T 12279 11952 1846 23173 23078 23495 Results x 100000, 12345 indicates ERRORS 1T 44934 86735 99850 36770 79897 99759 2T 44934 86735 99850 36770 79897 99759 4T 44934 86735 99850 36770 79897 99759 8T 44934 86735 99850 36770 79897 99759 Total Elapsed Time 5.9 seconds |
Comparing NEON-MFLOPS-MP with MP-MFLOPS indicates that performance was similar at 32 Ops/Word but the latter could be faster at 2 Ops/Word.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710

Test 1 Battery

 ARM NEON-MFLOPS2-MP Benchmark 4A8 23-Apr-2023 14.16
 Compiled for 64 bit ARM v8a

   FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T     8810    9275    5476   29395   28331   30117
 2T    16769   11592    8731   39449   40333   40440
 4T     6822   17552   12335   62263   59255   59455
 8T    25900   24135   18693   66554   64566   65969
 Results x 100000, 12345 indicates ERRORS
 1T    44934   86735   99850   36770   79897   99759
 2T    44934   86735   99850   36770   79897   99759
 4T    44934   86735   99850   36770   79897   99759
 8T    44934   86735   99850   36770   79897   99759

 Total Elapsed Time 1.8 seconds

Test 2 Power

 ARM NEON-MFLOPS2-MP Benchmark 4A8 23-Apr-2023 14.17
 Compiled for 64 bit ARM v8a

   FPU Add & Multiply using 1, 2, 4 and 8 Threads

        2 Ops/Word              32 Ops/Word
 KB     12.8     128   12800    12.8     128   12800
 MFLOPS
 1T     9327    9188    5500   28636   28474   29788
 2T    18024   18392    8359   38319   39596   39531
 4T    31653   20778   10451   61957   64741   61611
 8T    24930   22931   18816   56111   59569   66356
 Results x 100000, 12345 indicates ERRORS
 1T    44934   86735   99850   36770   79897   99759
 2T    44934   86735   99850   36770   79897   99759
 4T    44934   86735   99850   36770   79897   99759
 8T    44934   86735   99850   36770   79897   99759

 Total Elapsed Time 1.8 seconds

Test2/Test1 - Power/Battery
 1T     1.06    0.99    1.00    0.97    1.01    0.99
 2T     1.07    1.59    0.96    0.97    0.98    0.98
 4T     4.64    1.18    0.85    1.00    1.09    1.04
 8T     0.96    0.95    1.01    0.84    0.92    1.01

Test1/System2
 1T     2.00    1.95    2.14    2.32    2.45    2.56
 2T     3.60    1.71    3.02    1.78    1.90    1.86
 4T     0.89    2.92    4.82    2.71    2.21    2.41
 8T     3.55    3.42    7.80    1.88    2.04    2.21

Battery NEON/Normal MFLOPS
 1T     0.58    0.66    0.80    0.92    0.93    0.99
 2T     0.78    0.58    1.01    0.96    0.98    0.97
 4T     0.18    0.69    0.86    0.97    0.88    0.92
 8T     0.63    0.62    0.59    1.01    0.94    0.94
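The ratio rows above (for example Test2/Test1) are element by element divisions of two result rows, shown to two decimal places. A minimal Python sketch, using the 1T rows copied from the two System 4 logs; the function name `ratio_row` is illustrative and is not part of the benchmark:

```python
def ratio_row(numerators, denominators):
    """Divide two result rows element by element, rounded to 2 decimals."""
    return [round(n / d, 2) for n, d in zip(numerators, denominators)]


# 1T MFLOPS rows copied from the System 4 NEON-MFLOPS2-MP logs above.
battery_1t = [8810, 9275, 5476, 29395, 28331, 30117]  # Test 1 Battery
power_1t = [9327, 9188, 5500, 28636, 28474, 29788]    # Test 2 Power

# Prints the Test2/Test1 (Power/Battery) ratios for the 1T row.
print(" ".join(f"{r:.2f}" for r in ratio_row(power_1t, battery_1t)))
# -> 1.06 0.99 1.00 0.97 1.01 0.99
```

The same function applies to the Test1/System2 rows by passing the System 2 results as the denominators.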
Systems 1 and 3 do not appear to offer a refresh rate faster than 60 Hz, so their maximum performance cannot be demonstrated. The System 2 default is much higher, providing up to nearly 90 FPS, but a 60 Hz refresh rate was also set to enable comparisons. The System 2 results still show significantly superior performance. On the other hand, it should be borne in mind that System 2 has fewer than half the number of pixels to deal with.
System 1 Android 11 2.05 GHz ARM Cortex-A76 Graphics Mali-76 MC4, refresh 60 MHz Android Java OpenGL Benchmark 4A8 09-Feb-2023 10.56 --------- Frames Per Second -------- Triangles WireFrame Shaded Shaded+ Textured 9000+ 59.50 60.06 59.37 49.99 18000+ 44.03 44.23 38.75 30.12 36000+ 22.78 23.19 21.54 16.32 Screen Pixels 1200 Wide 1928 High Total Elapsed Time 120.4 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) Graphics 660 MHz Adreno 619, default refresh MHz Android Java OpenGL Benchmark 4A8 09-Feb-2023 11.29 --------- Frames Per Second -------- Triangles WireFrame Shaded Shaded+ Textured 9000+ 88.58 86.98 89.82 76.51 18000+ 63.02 63.01 55.57 45.03 36000+ 33.92 33.76 31.49 25.04 Screen Pixels 1339 Wide 720 High Total Elapsed Time 120.5 seconds System 2 Android 12 2.0 GHz ARM Cortex-A75 Graphics 660 MHz Adreno 619, refresh 60 MHz Android Java OpenGL Benchmark 4A8 09-Feb-2023 19.14 --------- Frames Per Second -------- Triangles WireFrame Shaded Shaded+ Textured 9000+ 50.43 47.05 53.55 56.48 18000+ 59.00 59.39 54.26 44.57 36000+ 33.35 33.50 31.14 25.02 Screen Pixels 1339 Wide 720 High Total Elapsed Time 120.5 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 Graphics Mali-62, refresh 60 MHz Android Java OpenGL Benchmark 4A8 09-Feb-2023 15.07 --------- Frames Per Second -------- Triangles WireFrame Shaded Shaded+ Textured 9000+ 37.88 59.82 54.16 41.19 18000+ 26.59 35.84 31.73 28.13 36000+ 16.46 20.42 19.35 15.65 Screen Pixels 1200 Wide 1848 High Total Elapsed Time 120.6 seconds |
System 4 Android 13 1x 2.80 GHz Cortex-X2 Graphics Xclipse 920

Power

 Android Java OpenGL Benchmark 4A8 20-Apr-2023 21.02

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

   9000+       24.12    24.24    14.92    16.06
  18000+        8.46     8.46     6.11     6.67
  36000+        2.53     2.47     2.07     2.32

 Screen Pixels 1080 Wide 2009 High
 Total Elapsed Time 121.9 seconds

Battery

 Android Java OpenGL Benchmark 4A8 20-Apr-2023 21.05

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

   9000+       24.01    24.20    14.83    15.71
  18000+        8.41     8.37     6.06     6.63
  36000+        2.52     2.45     2.06     2.31

 Screen Pixels 1080 Wide 2009 High
 Total Elapsed Time 122.1 seconds

Battery Later

 Android Java OpenGL Benchmark 4A8 23-Apr-2023 14.49

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

   9000+       33.77    31.61    18.93    18.33
  18000+        9.08     8.81     6.25     6.60
  36000+        2.53     2.46     2.07     2.32

 Screen Pixels 1080 Wide 2009 High
 Total Elapsed Time 121.9 seconds
This all-Java benchmark draws simple objects, in numbers ranging from small to rather excessive, to measure drawing performance, again in Frames Per Second (FPS). Five 10 second tests draw on a background of continuously changing colour shades.
As with the OpenGL benchmark, these results depend on the available refresh rates and screen pixel content.
In this case, System 2 was the only one allowed to run free of the VSYNC imposition that limits the maximum refresh rate to 60 FPS. But, as shown, 60 FPS can be selected in Settings, and at that setting System 2 was slower than System 1.
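The FPS figures for each 10 second test amount to a frame count divided by the measured elapsed time. The following Python sketch shows that idea only; it is not the benchmark's Java code, and `measure_fps`, `draw_frame` and the injectable `clock` are hypothetical names introduced for illustration:

```python
import time


def measure_fps(draw_frame, duration_s=10.0, clock=time.perf_counter):
    """Call draw_frame repeatedly for about duration_s seconds.

    Returns (frames, fps), where fps is frames divided by the elapsed
    time actually measured, which can slightly exceed duration_s.
    """
    start = clock()
    frames = 0
    elapsed = 0.0
    while elapsed < duration_s:
        draw_frame()          # one complete frame of drawing work
        frames += 1
        elapsed = clock() - start
    return frames, frames / elapsed
```

Because the last frame can finish slightly after the time limit, the elapsed time is a little over 10 seconds, which is consistent with log entries such as 599 frames reported as 59.88 FPS.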
System 1 Android 11 2.05 GHz ARM Cortex-A76 Graphics Mali-76 MC4, refresh 60 MHz Android Java Drawing Benchmark 4A809-Feb-2023 11.04 Test Frames FPS Display PNG Bitmap Twice 599 59.88 Plus 2 SweepGradient Circles 601 60.03 Plus 200 Random Small Circles 601 60.03 Plus 320 Long Lines 518 51.75 Plus 4000 Random Small Circles 217 21.68 Screen pixels 1200 Wide 1928 High Total Elapsed Time 50.1 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) Graphics 660 MHz Adreno 619, default refresh MHz Android Java Drawing Benchmark 4A809-Feb-2023 11.25 Test Frames FPS Display PNG Bitmap Twice 879 87.81 Plus 2 SweepGradient Circles 893 89.22 Plus 200 Random Small Circles 844 84.37 Plus 320 Long Lines 202 20.11 Plus 4000 Random Small Circles 136 13.55 Screen pixels 1339 Wide 720 High Total Elapsed Time 50.2 seconds System 2 Android 12 2.0 GHz ARM Cortex-A75 Graphics 660 MHz Adreno 619, refresh 60 MHz Android Java Drawing Benchmark 4A809-Feb-2023 19.18 Test Frames FPS Display PNG Bitmap Twice 497 49.48 Plus 2 SweepGradient Circles 476 47.47 Plus 200 Random Small Circles 516 51.55 Plus 320 Long Lines 209 20.85 Plus 4000 Random Small Circles 139 13.90 Screen pixels 1339 Wide 720 High Total Elapsed Time 50.2 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 Graphics Mali-62, refresh 60 MHz Android Java Drawing Benchmark 4A809-Feb-2023 15.12 Test Frames FPS Display PNG Bitmap Twice 596 59.58 Plus 2 SweepGradient Circles 600 59.98 Plus 200 Random Small Circles 407 40.63 Plus 320 Long Lines 106 10.54 Plus 4000 Random Small Circles 74 7.33 Screen pixels 1920 Wide 1128 High Total Elapsed Time 50.2 seconds |
System 4 Android 13 2.80 GHz Cortex-X2 Graphics Xclipse 920

Battery

 Android Java Drawing Benchmark 4A8 30-Apr-2023 13.48

 Test                              Frames      FPS

 Display PNG Bitmap Twice            1187   118.61
 Plus 2 SweepGradient Circles        1194   119.30
 Plus 200 Random Small Circles       1162   116.19
 Plus 320 Long Lines                  343    34.21
 Plus 4000 Random Small Circles       236    23.51

 Screen pixels 1080 Wide 2009 High
 Total Elapsed Time 50.1 seconds
System 1 Android 11 2.05 GHz ARM Cortex-A76 Android Java Whetstone Benchmark 4A8 02-Mar-2023 17.13 Test MFLOPS MOPS millisecs Results N1 float 620.56 0.031 -1.124750137 N2 float 571.43 0.235 -1.131330490 N3 if 1014.71 0.102 1.000000000 N4 fixpt 2881.98 0.109 12.000000000 N5 cos 139.13 0.598 0.499110132 N6 float 274.09 1.968 0.999999821 N7 equal 630.29 0.293 3.000000000 N8 exp 72.73 0.512 0.935364604 MWIPS 2598.66 3.848 Total Elapsed Time 13.5 seconds System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) Android Java Whetstone Benchmark 4A8 02-Mar-2023 17.26 Test MFLOPS MOPS millisecs Results N1 float 605.30 0.032 -1.124750137 N2 float 559.53 0.240 -1.131330490 N3 if 993.28 0.104 1.000000000 N4 fixpt 2720.21 0.116 12.000000000 N5 cos 134.19 0.620 0.499110132 N6 float 270.51 1.994 0.999999821 N7 equal 405.80 0.455 3.000000000 N8 exp 68.38 0.544 0.935364604 MWIPS 2435.86 4.105 Total Elapsed Time 14.6 seconds System 3 Android 13 2.0 GHz ARM Cortex-A75 Android Java Whetstone Benchmark 4A8 02-Mar-2023 17.33 Test MFLOPS MOPS millisecs Results N1 float 385.54 0.050 -1.124750137 N2 float 359.17 0.374 -1.131330490 N3 if 1000.00 0.104 1.000000000 N4 fixpt 1913.73 0.165 12.000000000 N5 cos 125.02 0.666 0.499110132 N6 float 184.60 2.922 0.999999821 N7 equal 310.33 0.596 3.000000000 N8 exp 59.71 0.623 0.935364604 MWIPS 1818.81 5.498 System 4 Android 13 1x 2.80 GHz Cortex-X2 Battery Android Java Whetstone Benchmark 4A8 30-Apr-2023 13.44 Test MFLOPS MOPS millisecs Results System 4/System 2 N1 float 798.00 0.024 -1.124750137 1.32 N2 float 736.04 0.183 -1.131330490 1.32 N3 if 1352.94 0.077 1.000000000 1.36 N4 fixpt 4186.05 0.075 12.000000000 1.54 N5 cos 227.32 0.366 0.499110132 1.69 N6 float 367.44 1.468 0.999999821 1.36 N7 equal 835.44 0.221 3.000000000 2.06 N8 exp 101.20 0.368 0.935364604 1.48 MWIPS 3595.56 2.781 1.48 Total Elapsed Time 15.8 seconds |
System 1 Android 11 2.05 GHz ARM Cortex-A76

 Android Java Linpack Benchmark 4A8 03-Mar-2023 10.52

 Speed         920.22 MFLOPS
 norm. resid             1.67
 resid         7.41628980e-14
 machep        2.22044605e-16
 x[0]-1       -1.49880108e-14
 x[n-1]-1     -1.89848137e-14

System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)

 Android Java Linpack Benchmark 4A8 03-Mar-2023 10.49

 Speed         884.88 MFLOPS
 norm. resid             1.67
 resid         7.41628980e-14
 machep        2.22044605e-16
 x[0]-1       -1.49880108e-14
 x[n-1]-1     -1.89848137e-14

System 3 Android 13 2.0 GHz ARM Cortex-A75

 Android Java Linpack Benchmark 4A8 03-Mar-2023 10.56

 Speed         645.24 MFLOPS
 norm. resid             1.67
 resid         7.41628980e-14
 machep        2.22044605e-16
 x[0]-1       -1.49880108e-14
 x[n-1]-1     -1.89848137e-14

System 4 Android 13 1x 2.80 GHz Cortex-X2 Battery

 Android Java Linpack Benchmark 4A8 30-Apr-2023 13.46

 Speed        2346.11 MFLOPS
 norm. resid             1.67
 resid         7.41628980e-14
 machep        2.22044605e-16
 x[0]-1       -1.49880108e-14
 x[n-1]-1     -1.89848137e-14

System 4/System 2 MFLOPS 2.65
Test 1 - Write and read three 8 and 16 MB files; Results given in MBytes/second
Test 2 - Write three 8 MB files, read can be cached in RAM; Results given in MBytes/second
Test 3 - Random write and read 1 KB from 4 to 16 MB; Results are average time in milliseconds
Test 4 - Write and read 200 files 4 KB to 16 KB; Results in MB/sec, msecs/file and delete seconds.
Buttons - RunS SD Card (not used now), RunI Main Drive, More > Don't Delete, Read Only or Both, and Save (see below)
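For Test 4, the MB/sec and msecs per file figures are two views of the same timing. A short Python sketch of the conversion, assuming 1 KB = 1024 bytes and 1 MB = 1,000,000 bytes (the units actually used by the program are not stated here, so both are assumptions, and `mb_per_second` is an illustrative name):

```python
def mb_per_second(file_kb, msecs_per_file,
                  bytes_per_kb=1024, bytes_per_mb=1_000_000):
    """Convert a per-file time in milliseconds into a data rate in MB/sec.

    bytes_per_kb and bytes_per_mb are assumptions, exposed as parameters
    so that other unit conventions can be tried.
    """
    bytes_moved = file_kb * bytes_per_kb
    seconds = msecs_per_file / 1000.0
    return bytes_moved / seconds / bytes_per_mb


# A 4 KB file handled in 0.25 msecs corresponds to roughly 16.4 MB/sec.
print(f"{mb_per_second(4, 0.25):.2f}")
```

The logged msecs values are rounded to two decimal places, so only approximate agreement with the logged MB/sec values should be expected.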
As can be seen, there were wide variations in measured performance, making it difficult to declare a winner, but System 3 appears to have a greater number of lowest scores. Random reading speeds were too fast to register within the calculations used.
This was not run on System 4.
System 1 Android 11 2.05 GHz ARM Cortex-A76 Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.30 Internal Drive Data Cached Compiled for 64 bit ARM v8a MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 1249.5 1264.4 1293.2 2927.6 2978.2 3162.4 16 1272.8 1314.8 1335.7 2970.1 3168.8 3539.9 Cached 8 871.2 455.3 1264.1 2847.8 3026.6 3206.2 Random Write Read From MB 4 8 16 4 8 16 msecs 0.16 0.16 0.19 0.00 0.00 0.00 200 Files Write Read Delete File KB 4 8 16 4 8 16 secs MB/sec 16.70 35.42 60.87 126.61 245.65 344.08 msecs 0.25 0.23 0.27 0.03 0.03 0.05 0.027 No delete Total Elapsed Time 16.4 seconds Path Used /data/user/0/com.drivespeed/files/ Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.37 Internal Drive Read Only MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 0.0 0.0 0.0 420.3 396.7 420.9 System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76) Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.43 Internal Drive Data Cached Compiled for 64 bit ARM v8a MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 1661.5 1649.2 1831.3 1993.7 2369.6 2969.9 16 1669.1 1530.7 1117.6 2125.8 2612.2 2167.9 Cached 8 1070.1 1557.8 1790.7 2124.0 2607.2 3217.0 Random Write Read From MB 4 8 16 4 8 16 msecs 0.22 0.43 0.47 0.00 0.00 0.00 200 Files Write Read Delete File KB 4 8 16 4 8 16 secs MB/sec 44.73 83.50 70.39 388.90 455.49 435.65 msecs 0.09 0.10 0.23 0.01 0.02 0.04 0.011 No delete Total Elapsed Time 16.3 seconds Path Used /data/user/0/com.drivespeed/files/ Continued Below System 2 Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.45 Internal Drive Read Only MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 0.0 0.0 0.0 338.4 425.1 393.9 System 3 Android 13 2.0 GHz ARM Cortex-A75 Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.51 Internal Drive Data Cached Compiled for 64 bit ARM v8a MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 849.8 1095.8 1478.3 2370.3 2270.8 2500.7 16 1519.2 1351.3 1234.3 1760.5 1853.6 1810.0 Cached 8 1612.1 1493.0 1262.3 
2056.7 2007.7 1926.3 Random Write Read From MB 4 8 16 4 8 16 msecs 0.36 0.37 0.36 0.00 0.00 0.00 200 Files Write Read Delete File KB 4 8 16 4 8 16 secs MB/sec 66.03 178.95 323.02 519.97 837.431283.88 msecs 0.06 0.05 0.05 0.01 0.01 0.01 0.006 No delete Total Elapsed Time 16.5 seconds Path Used /data/user/0/com.drivespeed/files/ Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.59 Internal Drive Read Only MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 0.0 0.0 0.0 169.3 167.5 192.1System 3 SD Card Option Using RunS produces results on the latest versions of Android, but does not access the SD card. Following is an example of the start of a log after selecting this button without the SD card inserted and is the same with it in place. So, it is using a different file path on the internal drive. Writing speeds were much slower than via RunI but, on using the Read Only procedures, produced the same reading performance. MBytes/Second MB Write1 Write2 Write3 Read1 Read2 Read3 8 60.0 62.5 62.3 1211.5 1140.5 1106.5 16 65.5 67.5 56.5 1256.2 1756.7 2147.9 Path Used /storage/emulated/0/ |
There are two main stress test programs that can use multiple threads to exercise (presently) all CPU cores, one using floating point instructions and the other carrying out integer arithmetic. Further detail is covered in the earlier report - android benchmarks.htm. The third program monitors MHz of up to 8 cores. Each of the stress test applications has five buttons:
RunB - Run Benchmark - Runs most combinations of number of threads, data sizes and calculations per data word for the FPU tests. This is mainly to help to decide which options to use for stress testing. The benchmark runs using fixed parameters, carrying out exactly the same number of calculations using all thread combinations and data sizes. The pass count changes according to the number of calculations per word, for the FPU tests.
RunS - Run Stress Tests - Default running time is 15 minutes, with the middle data size, intended for containment in L2 cache, using 8 threads and, in the FPU tests, 32 operations per word.
False Errors - These can be caused if the run button is tapped again when the tests are running. The main unique symptoms are multiple “End Time” message displays.
SetS - Specify run time parameters for stress test - These are 1, 2, 4, 8, 16 or 32 threads, 2, 8 or 32 Operations per word for FPU tests, 12.8 or 16 KB, 128 or 160 KB, 12.8 or 16 MB for FPU or Integer tests, and running time in minutes.
Info - Test description and details - This is essentially the same as details provided here.
Save - This provides alternative methods to divert the logged output. Currently I select the Google Drive option, allowing me to access the files on my PCs.
Unexpected Faster Speed - Performance depends on whether the data comes from caches or RAM. Then, increasing the number of threads can lead to CPU cores using dedicated smaller and faster caches.
Sumchecks - The programs include sumchecks to show whether the correct arithmetic calculations were produced, as shown for the benchmark results. For integers, each test section uses a different data pattern for all words, checked by the program after manipulation. Floating point numeric results depend on the number of calculations carried out, constant for stress test reported time slots, easily verified manually.
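The integer sumcheck idea can be sketched in Python as follows. The patterns are the ones that appear in the Sumcheck column of the MP-Int logs; the manipulation itself (add then subtract a constant, modulo 32 bits) is an assumption chosen only because it restores the original value, and `run_section` is a hypothetical name, not the program's own code:

```python
# Data patterns seen in the Sumcheck column of the MP-Int stress test logs.
PATTERNS = [0x00000000, 0xFFFFFFFF, 0x5A5A5A5A,
            0xAAAAAAAA, 0xCCCCCCCC, 0x0F0F0F0F]
MASK = 0xFFFFFFFF        # keep values within 32 bits
DELTA = 0x01234567       # arbitrary constant, for illustration only


def run_section(pattern, words=1024, passes=4):
    """Fill a buffer with one pattern, manipulate it, then check it.

    Returns (sumcheck, same_all): the first word after manipulation and
    whether every word still holds that same value.
    """
    data = [pattern] * words
    for _ in range(passes):
        data = [(w + DELTA) & MASK for w in data]   # change every word
        data = [(w - DELTA) & MASK for w in data]   # undo the change
    same_all = all(w == data[0] for w in data)
    return data[0], same_all


for p in PATTERNS:
    sumcheck, same = run_section(p)
    print(f"{sumcheck:08X} {'Yes' if same else 'No'}")
```

A word corrupted during the passes would show up either as a sumcheck that differs from the starting pattern or as a "No" in the second column.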
CP_MHz2 measurements are instantaneous at a constant sampling rate, not averages over that time. The program has Set, Run and Save buttons, as above. Default running time is 15 minutes and sampling rate 10 seconds.
Example results of Stress Test Benchmarks are given further below, followed by extended Reliability type Tests. Those for stress tests are from logs running default parameters, with 15 minutes running time. Some of the latter include only necessary detail. Examples of full output are as follows.
 ARM/Intel MP-Int Stress Test 4A8 09-Mar-2023 10.24.37
 Compiled for 64 bit ARM v8a

           Data                              Same All
 Seconds   Size  Threads   MB/sec  Sumcheck   Threads

     8.7 160 KB        8    57397  00000000       Yes
    17.4 160 KB        8    56966  00000000       Yes

 ARM/Intel MP-FPU Stress Test 4A8 13-Mar-2023 11.59.35
 Compiled for 64 bit ARM v8a

           Data           Ops/            Numeric
 Seconds   Size  Threads  Word   MFLOPS   Results

     9.4 128 KB        8    32    38431     35216
    18.6 128 KB        8    32    37721     35216
As seen via the CPU-Z utility app, core MHz values change extremely rapidly. Here, CP_MHz2.apk provides samples at a selected interval in seconds; each sample is a representative instantaneous reading, not an average. Example output:
 MHz Measurement Test 4A8 13-Mar-2023 12.00.55
 Running time 15 minutes, 30 second samples

               MHz for Core
    Secs     0     1     2     3     4     5     6     7

    0.00  1805  1478  1805  1805  1805  1805  1651  1651
   30.10  1805  1805  1805  1805  1805  1805  2035  2035
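Instantaneous per-core sampling of this kind can be sketched in Python. This is an assumption about one possible method, not a description of how CP_MHz2 obtains its readings: it reads the standard Linux cpufreq file `scaling_cur_freq` (reported in kHz), which may be unreadable for some cores or on some devices. All function names are illustrative.

```python
import time


def khz_text_to_mhz(text):
    """Convert cpufreq text in kHz, such as '1805000', to whole MHz."""
    return int(text.strip()) // 1000


def read_core_mhz(core, root="/sys/devices/system/cpu"):
    """Return the current MHz of one core, or None if it cannot be read."""
    path = f"{root}/cpu{core}/cpufreq/scaling_cur_freq"
    try:
        with open(path) as f:
            return khz_text_to_mhz(f.read())
    except (OSError, ValueError):
        return None


def format_sample(secs, mhz_values):
    """Format one log row: elapsed seconds followed by MHz per core."""
    cells = " ".join("   --" if m is None else f"{m:5d}" for m in mhz_values)
    return f"{secs:8.2f} {cells}"


def sample(cores=8, count=3, interval_s=30.0):
    """Print `count` instantaneous samples, `interval_s` seconds apart."""
    start = time.monotonic()
    for i in range(count):
        row = [read_core_mhz(c) for c in range(cores)]
        print(format_sample(time.monotonic() - start, row))
        if i < count - 1:
            time.sleep(interval_s)
```

Calling `sample(cores=8, count=30, interval_s=30.0)` would produce a 15 minute log in a layout similar to the example output, with `--` wherever a core's frequency could not be read.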
The usual relative performance attributes are shown to apply, with System 2 indicated as much faster, with cache based data, using 1 or 2 threads, then possibly slower at 4 and 8.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM/Intel MP-Int Stress Test 4A8 07-Mar-2023 10.40.16 Compiled for 64 bit ARM v8a MB/second KB KB MB Same All Secs Thrds 16 160 16 Sumcheck Tests 1.8 1 14159 14594 13354 00000000 Yes 1.2 2 21954 29948 13697 FFFFFFFF Yes 1.1 4 32124 32881 13805 5A5A5A5A Yes 1.0 8 41607 40944 14064 AAAAAAAA Yes 1.0 16 42412 44068 13862 CCCCCCCC Yes 0.8 32 42941 50142 20698 0F0F0F0F Yes End Time 07-Mar-2023 10.40.31 System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) ARM/Intel MP-Int Stress Test 4A8 07-Mar-2023 10.44.17 Compiled for 64 bit ARM v8a MB/second KB KB MB Same All Secs Thrds 16 160 16 Sumcheck Tests 1.8 1 15333 14398 12557 00000000 Yes 1.2 2 25656 25554 13615 FFFFFFFF Yes 1.2 4 29025 31166 13079 5A5A5A5A Yes 1.1 8 43667 40739 12317 AAAAAAAA Yes 1.0 16 39954 43161 13182 CCCCCCCC Yes 0.9 32 40849 42656 15047 0F0F0F0F Yes End Time 07-Mar-2023 10.44.27 System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 ARM/Intel MP-Int Stress Test 4A8 07-Mar-2023 10.48.58 Compiled for 64 bit ARM v8a MB/second KB KB MB Same All Secs Thrds 16 160 16 Sumcheck Tests 2.8 1 11252 11433 6011 00000000 Yes 1.9 2 20286 16018 8505 FFFFFFFF Yes 1.7 4 24332 23788 8086 5A5A5A5A Yes 1.5 8 36755 33932 8156 AAAAAAAA Yes 1.4 16 37736 39228 8096 CCCCCCCC Yes 1.1 32 35649 36291 12974 0F0F0F0F Yes End Time 07-Mar-2023 10.49.16 |
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710

System 4 Battery

 ARM/Intel MP-Int Stress Test 4A8 23-Apr-2023 14.41.16
 Compiled for 64 bit ARM v8a

                     MB/second
                  KB      KB      MB            Same All
  Secs  Thrds     16     160      16  Sumcheck   Tests

   1.6      1  19675   16316   13029  00000000     Yes
   1.1      2  31241   28440   15894  FFFFFFFF     Yes
   0.9      4  46282   40016   16222  5A5A5A5A     Yes
   0.7      8  59097   56981   18473  AAAAAAAA     Yes
   0.5     16  63286   67726   30086  CCCCCCCC     Yes
   0.4     32  65657   64560   61397  0F0F0F0F     Yes

 End Time 23-Apr-2023 14.41.27

System 4 Power

 ARM/Intel MP-Int Stress Test 4A8 20-Apr-2023 20.51.13
 Compiled for 64 bit ARM v8a

                     MB/second
                  KB      KB      MB            Same All
  Secs  Thrds     16     160      16  Sumcheck   Tests

   1.2      1  23224   20831   19265  00000000     Yes
   0.9      2  38975   37282   18468  FFFFFFFF     Yes
   0.5      4  62257   66630   40302  5A5A5A5A     Yes
   0.4      8  82663   90286   51540  AAAAAAAA     Yes
   0.3     16  88619   89234   72478  CCCCCCCC     Yes
   0.3     32  94039   86710   74422  0F0F0F0F     Yes

 End Time 20-Apr-2023 20.51.21

System 4/System 2
            1   1.51    1.45    1.53
            2   1.52    1.46    1.36
            4   2.14    2.14    3.08
            8   1.89    2.22    4.18
           16   2.22    2.07    5.50
           32   2.30    2.03    4.95

System 4 Power/Battery
            1   1.18    1.28    1.48
            2   1.25    1.31    1.16
            4   1.35    1.67    2.48
            8   1.40    1.58    2.79
           16   1.40    1.32    2.41
           32   1.43    1.34    1.21
Again, at 12.8 and 128 KB, System 2 was much faster using 1 or 2 threads, but not so at more than 2.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 ARM/Intel MP-FPU Stress Test 4A8 07-Mar-2023 10.41.57 Compiled for 64 bit ARM v8a MFLOPS Numeric Results Ops/ KB KB MB KB KB MB Secs Thrd Word 12.8 128 12.8 12.8 128 12.8 0.3 T1 2 9427 8174 3316 40392 76406 99700 0.4 T2 2 12505 9288 2517 40392 76406 99700 0.4 T4 2 11865 15337 2318 40392 76406 99700 0.4 T8 2 14857 16797 2240 40392 76406 99700 0.7 T1 8 12064 11755 11519 54760 85092 99819 0.5 T2 8 22060 21418 10649 54760 85092 99819 0.5 T4 8 26292 24186 9696 54760 85092 99819 0.5 T8 8 26257 24723 8943 54760 85092 99819 2.5 T1 32 12560 12096 11976 35218 66014 99520 1.4 T2 32 20570 23527 22632 35218 66014 99520 1.2 T4 32 25966 26414 25899 35218 66014 99520 1.1 T8 32 28518 30202 28717 35218 66014 99520 End Time 07-Mar-2023 10.42.09 System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55) ARM/Intel MP-FPU Stress Test 4A8 07-Mar-2023 10.46.20 Compiled for 64 bit ARM v8a MFLOPS Numeric Results Ops/ KB KB MB KB KB MB Secs Thrd Word 12.8 128 12.8 12.8 128 12.8 0.4 T1 2 7773 7983 2859 40392 76406 99700 0.4 T2 2 8975 7726 2545 40392 76406 99700 0.4 T4 2 8026 7542 2467 40392 76406 99700 0.4 T8 2 13882 11752 2336 40392 76406 99700 0.7 T1 8 11229 10090 11035 54760 85092 99819 0.6 T2 8 15553 17641 10259 54760 85092 99819 0.6 T4 8 18031 15945 10135 54760 85092 99819 0.5 T8 8 21272 21474 9410 54760 85092 99819 2.5 T1 32 11955 11956 12435 35218 66014 99520 1.4 T2 32 22202 22806 22787 35218 66014 99520 1.3 T4 32 23857 24021 25369 35218 66014 99520 1.0 T8 32 28250 32201 28726 35218 66014 99520 End Time 07-Mar-2023 10.46.33 System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55 ARM/Intel MP-FPU Stress Test 4A8 07-Mar-2023 10.50.13 Compiled for 64 bit ARM v8a MFLOPS Numeric Results Ops/ KB KB MB KB KB MB Secs Thrd Word 12.8 128 12.8 12.8 128 12.8 0.7 T1 2 5440 4195 1617 40392 76406 99700 0.5 T2 2 9855 10851 1781 40392 76406 99700 0.5 T4 2 8167 8485 1881 
40392 76406 99700 0.5 T8 2 12014 10806 1847 40392 76406 99700 1.3 T1 8 6384 6381 5647 54760 85092 99819 0.8 T2 8 12496 12140 6674 54760 85092 99819 0.8 T4 8 12311 11922 7397 54760 85092 99819 0.6 T8 8 17907 17982 7476 54760 85092 99819 4.5 T1 32 6903 6912 6866 35218 66014 99520 2.2 T2 32 13696 13797 13740 35218 66014 99520 2.0 T4 32 13620 16951 16788 35218 66014 99520 1.4 T8 32 21211 21290 22181 35218 66014 99520 End Time 07-Mar-2023 10.50.32 |
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710 System 4 Battery ARM/Intel MP-FPU Stress Test 4A8 23-Apr-2023 14.45.52 Compiled for 64 bit ARM v8a MFLOPS Numeric Results Ops/ KB KB MB KB KB MB Secs Thrd Word 12.8 128 12.8 12.8 128 12.8 0.2 T1 2 15743 13802 6168 40392 76406 99700 0.1 T2 2 23790 22564 8635 40392 76406 99700 0.1 T4 2 31487 16944 11190 40392 76406 99700 0.1 T8 2 29239 16754 14704 40392 76406 99700 0.5 T1 8 17614 16465 14614 54760 85092 99819 0.4 T2 8 23473 21702 13270 54760 85092 99819 0.4 T4 8 28836 22915 14793 54760 85092 99819 0.3 T8 8 35877 33822 26051 54760 85092 99819 1.7 T1 32 14379 21304 22032 35218 66014 99520 1.1 T2 32 24714 27766 30000 35218 66014 99520 0.7 T4 32 44493 37534 46516 35218 66014 99520 0.7 T8 32 40943 39881 52404 35218 66014 99520 End Time 23-Apr-2023 14.46.02 System 4 Power ARM/Intel MP-FPU Stress Test 4A8 20-Apr-2023 20.49.55 Compiled for 64 bit ARM v8a MFLOPS Numeric Results Ops/ KB KB MB KB KB MB Secs Thrd Word 12.8 128 12.8 12.8 128 12.8 0.2 T1 2 13959 13834 5427 40392 76406 99700 0.1 T2 2 21365 24557 9061 40392 76406 99700 0.1 T4 2 21907 21840 12173 40392 76406 99700 0.1 T8 2 18322 31692 12821 40392 76406 99700 0.5 T1 8 17088 17742 16266 54760 85092 99819 0.4 T2 8 23468 22740 13810 54760 85092 99819 0.4 T4 8 31470 24004 14281 54760 85092 99819 0.3 T8 8 28966 26081 23677 54760 85092 99819 1.7 T1 32 14975 20595 21972 35218 66014 99520 1.2 T2 32 24720 26515 28342 35218 66014 99520 0.8 T4 32 45125 33106 45770 35218 66014 99520 0.7 T8 32 49057 37660 46982 35218 66014 99520 End Time 20-Apr-2023 20.50.18 System 4/System 2 T1 2 1.80 1.73 1.90 T2 2 2.38 3.18 3.56 T4 2 2.73 2.90 4.93 T8 2 1.32 2.70 5.49 T1 8 1.52 1.76 1.47 T2 8 1.51 1.29 1.35 T4 8 1.75 1.51 1.41 T8 8 1.36 1.21 2.52 T1 32 1.25 1.72 1.77 T2 32 1.11 1.16 1.24 T4 32 1.89 1.38 1.80 T8 32 1.74 1.17 1.64 System 4 Battery/Power T1 2 1.13 1.00 1.14 T2 2 1.11 0.92 0.95 T4 2 1.44 0.78 0.92 T8 2 1.60 0.53 1.15 T1 8 1.03 0.93 0.90 T2 8 
1.00 0.95 0.96 T4 8 0.92 0.95 1.04 T8 8 1.24 1.30 1.10 T1 32 0.96 1.03 1.00 T2 32 1.00 1.05 1.06 T4 32 0.99 1.13 1.02 T8 32 0.83 1.06 1.12 |
In all cases, CPU MHz of each of the six LITTLE CPU cores was essentially constant, performance degradation being imposed by MHz reductions on the two main cores.
Performance of System 2 was better than that of System 1, in spite of its LITTLE CPU cores running at lower MHz. This is probably because System 2 is produced at a later fabrication level. As expected, the older technology based System 3 was the slowest.
System 1 Power 1 Battery 2 Power 3 Power Mean MB/second 48110 48088 54838 39839 Usual Slow CPU MHz 2000 2000 1805 2002 System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 MHz for Core Secs MB/sec 0 1 2 3 4 5 6 7 Average 0 52349 30 51608 2000 2000 2000 2000 2000 2000 2050 2050 2013 60 48982 2000 2000 2000 2000 2000 2000 1796 1796 1949 90 46641 1275 875 1275 1175 1375 1275 1986 1986 1403 120 50087 2000 2000 2000 2000 2000 1800 1308 1308 1802 150 49026 2000 2000 2000 2000 2000 2000 1530 1530 1883 180 46743 2000 2000 2000 2000 2000 2000 1530 1419 1869 210 48994 2000 2000 2000 2000 2000 2000 1733 1733 1933 240 49110 2000 2000 2000 2000 2000 2000 1530 1530 1883 270 48631 2000 2000 2000 2000 2000 2000 1419 1419 1855 300 48052 2000 2000 2000 2000 2000 2000 1530 1530 1883 330 48752 2000 2000 2000 2000 2000 2000 1530 1308 1855 360 47384 2000 2000 2000 2000 2000 2000 1419 1530 1869 390 48812 2000 2000 2000 2000 2000 2000 1530 1419 1869 420 47352 2000 2000 2000 2000 2000 2000 1530 1530 1883 450 46944 2000 2000 2000 2000 2000 2000 1419 1419 1855 480 47086 2000 2000 2000 2000 2000 2000 1419 1419 1855 510 47789 2000 2000 2000 2000 2000 2000 1419 1419 1855 540 47799 2000 2000 2000 2000 2000 2000 1169 1308 1810 570 46693 2000 2000 2000 2000 2000 2000 1308 1419 1841 600 49389 2000 2000 2000 2000 2000 2000 1419 1308 1841 630 48092 2000 2000 2000 2000 2000 2000 1419 1308 1841 660 47454 2000 2000 2000 2000 2000 2000 1419 1419 1855 690 46836 2000 2000 2000 2000 2000 2000 1530 1530 1883 720 47261 2000 2000 2000 2000 2000 2000 1308 1419 1841 750 47122 2000 2000 2000 2000 2000 2000 1419 1419 1855 780 47362 2000 2000 2000 2000 2000 2000 1169 1419 1824 810 48045 2000 2000 2000 2000 2000 2000 1419 1419 1855 840 46429 1175 1933 2000 2000 2000 2000 1530 1419 1757 870 46835 2000 2000 2000 2000 2000 2000 1419 1308 1841 900 47738 1866 1866 1866 1866 2000 2000 1419 1530 1802 System 1 Battery - Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55 0 53347 
30 52694 2000 2000 2000 2000 2000 2000 1923 2050 1997 60 48780 2000 2000 2000 2000 2000 2000 1733 1733 1933 90 49702 2000 2000 2000 2000 2000 2000 1670 1530 1900 120 49449 2000 2000 2000 2000 2000 2000 1530 1670 1900 150 49864 1075 1375 1375 1375 1375 1075 1986 1419 1382 180 49477 2000 2000 2000 2000 2000 2000 1530 1530 1883 210 47739 2000 2000 2000 2000 2000 2000 1530 1530 1883 240 47961 2000 2000 2000 2000 2000 2000 1530 1530 1883 270 46765 2000 2000 2000 2000 2000 2000 1419 1419 1855 300 48323 2000 2000 2000 2000 2000 2000 1670 1419 1886 330 46877 2000 2000 2000 2000 2000 2000 919 919 1730 360 48398 2000 2000 2000 2000 2000 2000 1670 1670 1918 390 47699 2000 2000 2000 2000 2000 2000 1419 1419 1855 420 46764 2000 2000 2000 2000 2000 2000 1419 1419 1855 450 48355 2000 2000 2000 2000 2000 2000 1308 1419 1841 480 46643 2000 2000 2000 2000 2000 2000 1419 1419 1855 510 47094 1933 1933 1933 1933 1933 1933 1308 1085 1749 540 47462 2000 2000 2000 2000 2000 2000 1419 1419 1855 570 47156 2000 2000 2000 2000 2000 2000 1530 1530 1883 600 47482 2000 2000 2000 2000 2000 2000 1419 1419 1855 630 47205 2000 2000 2000 2000 2000 2000 1419 1419 1855 660 46806 2000 2000 2000 2000 2000 2000 1419 1419 1855 690 47632 2000 2000 2000 2000 2000 2000 1419 1419 1855 720 45909 1800 1800 1800 1800 1800 1800 1419 1419 1705 750 45615 1866 1866 1866 1866 1866 1866 1085 1419 1713 780 47168 1866 1866 1866 1866 1866 1866 1419 1085 1713 810 26772 2000 2000 2000 2000 2000 2000 774 774 1694 840 46179 2000 2000 2000 2000 2000 2000 1419 1419 1855 870 46743 1933 1933 1933 1933 1933 1933 1308 1419 1791 900 45630 1933 1933 1933 1933 1933 1933 1419 1419 1805 Integer Stress Tests continued Below or Go To Start |
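The Average column in the tables above appears to be the arithmetic mean of the eight per-core MHz samples, rounded to the nearest whole MHz with halves rounded up. That rounding rule is an inference from rows such as the first 30 second sample, where the mean is 2012.5 and the logged value is 2013. A small Python sketch, with `average_mhz` as an illustrative name:

```python
import math


def average_mhz(core_mhz):
    """Mean of the per-core MHz samples, rounded half up to whole MHz."""
    return math.floor(sum(core_mhz) / len(core_mhz) + 0.5)


# Rows copied from the System 1 mains power log above.
print(average_mhz([2000] * 6 + [2050, 2050]))                          # 30 s row
print(average_mhz([2000] * 6 + [1796, 1796]))                          # 60 s row
print(average_mhz([1275, 875, 1275, 1175, 1375, 1275, 1986, 1986]))    # 90 s row
```

Python's built-in `round` is avoided here because it rounds exact halves to the nearest even number, which would give 2012 rather than 2013 for the 30 second row.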
Timeout variance, reference other results
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710 Threads 8 4 2 1 Battery Battery Bat+Pow Power 30-Apr 30-Apr 30-Apr 30-Apr Start 15:00 15:17 15:44 16:06 End 15.17 15.44 16.06 16.32 Secs MB/sec MB/sec MB/sec MB/sec 10 133083 96175 44760 16160 30 119760 88868 47865 15773 60 111445 82186 47151 15757 90 111613 82591 43305 15771 120 109574 81741 43289 15977 150 109483 74503 44553 15769 180 108523 80390 41614 15768 210 106909 79071 43289 15770 240 107657 76151 43296 15768 270 104187 66732 41341 15731 300 104027 73007 40234 15765 330 40548 15985 360 Timeout 42721 15770 390 Timeout 69770 39264 15766 420 61693 38915 15991 450 63592 41352 15768 480 63941 40039 15770 510 111579 62500 39279 15761 540 111350 62786 39488 15769 570 109626 62670 33665 15768 600 108377 62609 39265 15769 630 106509 62758 37640 15771 660 106738 62372 38942 15721 690 105756 62816 37879 16274 720 90875 62794 38051 15769 750 87526 62403 36682 15771 780 89403 62037 37333 15708 810 91222 62149 35351 15746 840 90148 62758 35344 15765 870 90497 62562 37108 15765 900 88864 62803 33745 15769 Start S 133083 96175 44760 16160 End E 88864 62803 33745 15769 %E/S 67 65 75 98 Benchmk 90286 66630 37282 20831 MHz Measurement Test 4A8 30-Apr-2023 15.01.35 Running time 15 minutes, 30 second samples MHz for Core Secs 0 1 2 3 4 5 6 7 0.00 960 960 1152 960 1920 1632 1152 1344 30.09 1728 1728 1728 1728 2112 1824 1824 2304 60.32 1440 1728 1728 1728 1824 1824 1824 960 90.53 1728 1728 1728 1728 2016 1728 1728 2208 821.61 1344 1344 1344 1344 1536 1536 1536 1536 1277.15 1152 1056 1056 1056 2515 2515 2400 2400 End Time 30-Apr-2023 15.23.14 |
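The %E/S row above expresses the final sample as a percentage of the first one, which summarises how much throughput was lost over the 15 minute run. A minimal Python sketch using the Start and End rows of the System 4 integer stress test; `percent_end_of_start` is an illustrative name:

```python
def percent_end_of_start(start, end):
    """End throughput as a whole-number percentage of start throughput."""
    return round(100 * end / start)


# MB/sec at start and end, keyed by thread count, copied from the table above.
start = {8: 133083, 4: 96175, 2: 44760, 1: 16160}
end = {8: 88864, 4: 62803, 2: 33745, 1: 15769}

for threads in start:
    pct = percent_end_of_start(start[threads], end[threads])
    print(f"{threads} threads: {pct}%")
```

The printed values are 67, 65, 75 and 98 for 8, 4, 2 and 1 threads, matching the %E/S row.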
System                    1        2        3
                      Power    Power    Power
Mean MFLOPS           31603    37395    22990
Usual Slow CPU MHz     2000     1805     2002

System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55

              MHz for Core
Secs  MFLOPS     0     1     2     3     4     5     6     7  Average
   0   34841
  30   32620  2000  2000  2000  2000  2000  2000  2050  2050     2013
  60   32965  2000  2000  2000  2000  2000  2000  1796  1796     1949
  90   32142  2000  2000  2000  2000  2000  2000  1733  1733     1933
 120   31115  2000  2000  2000  2000  2000  2000  1733  1733     1933
 150   31404  2000  2000  2000  2000  2000  2000  1670  1670     1918
 180   32130  2000  2000  2000  2000  2000  2000  1530  1796     1916
 210   31275  2000  2000  2000  2000  2000  2000  1670  1530     1900
 240   31024  2000  2000  2000  2000  2000  2000  1796  1796     1949
 270   31986  2000  2000  2000  2000  2000  2000  1670  1670     1918
 300   32255  2000  2000  2000  2000  2000  2000  1530  1530     1883
 330   32591  2000  2000  2000  2000  2000  2000  1530  1733     1908
 360   31627  2000  2000  2000  2000  2000  2000  1419  1670     1886
 390   31064  2000  2000  2000  2000  2000  2000  1530  1530     1883
 420   32626  2000  2000  2000  2000  2000  2000  1530  1530     1883
 450   31898  2000  2000  2000  2000  2000  2000  1530  1530     1883
 480   30940  1866  1933  2000  2000  2000  2000  1530  1530     1857
 510   31994  2000  2000  2000  2000  2000  2000  1860  1419     1910
 540   31563  2000  2000  2000  2000  2000  1933  1419  1419     1846
 570   30872  2000  2000  2000  2000  2000  2000  1733  1169     1863
 600   31143  2000  2000  2000  2000  2000  2000  1670  1670     1918
 630   31670  2000  2000  2000  2000  2000  2000  1419  1419     1855
 660   31703  2000  2000  2000  2000  2000  2000  1530  1530     1883
 690   30936  1866  1800  1800  1800  1800  1800  1670  1670     1776
 720   30664  2000  2000  2000  2000  2000  2000  1530  1530     1883
 750   31153  2000  2000  2000  2000  2000  2000  1530  1530     1883
 780   30367  1933  1933  2000  2000  2000  2000  1670  1308     1856
 810   30412  2000  2000  2000  2000  2000  2000  1733  1733     1933
 840   30837  2000  2000  2000  2000  2000  2000  1530  1530     1883
 870   30699  2000  2000  2000  2000  2000  2000  1419  1308     1841
 900   31165  2000  2000  2000  2000  2000  2000  1530  1530     1883

System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)

              MHz for Core
Secs  MFLOPS     0     1     2     3     4     5     6     7  Average
   0   38431
  30   37700  1805  1805  1805  1805  1805  1805  2035  2035     1863
  60   37537  1805  1805  1805  1805  1805  1805  2035  2035     1863
  90   37643  1805  1805  1805  1805  1805  1805  2035  2035     1863
 120   37777  1805  1805  1805  1805  1805  1805  2035  2035     1863
 150   37524  1805  1805  1805  1805  1805  1805  2035  2035     1863
 180   37956  1805  1805  1805  1805  1805  1805  2035  2035     1863
 210   32704  1805  1805  1805  1805  1805  1805  2035  2035     1863
 240   37343  1805  1805  1805  1805  1805  1805  2035  2035     1863
 270   35775  1805  1805  1805  1805  1805  1805  2035  2035     1863
 300   37173  1805  1805  1805  1805  1805  1805  2035  2035     1863
 330   37469  1805  1805  1805  1805  1805  1805  2035  2035     1863
 360   37749  1805  1805  1805  1805  1805  1805  2035  2035     1863
 390   37643  1805  1805  1805  1805  1805  1805  2035  2035     1863
 420   37404  1805  1805  1805  1805  1805  1805  2035  2035     1863
 450   37339  1805  1805  1805  1805  1805  1805  2035  2035     1863
 480   37850  1805  1805  1805  1805  1805  1805  2035  2035     1863
 510   36378  1805  1805  1805  1805  1805  1805  2035  2035     1863
 540   37348  1805  1805  1805  1805  1805  1805  2035  2035     1863
 570   37537  1805  1805  1805  1805  1805  1805  2035  2035     1863
 600   37885  1805  1805  1805  1805  1805  1805  2035  2035     1863
 630   37787  1805  1805  1805  1805  1805  1805  2035  2035     1863
 660   37526  1805  1805  1805  1805  1805  1805  2035  2035     1863
 690   37721  1805  1805  1805  1805  1805  1805  2035  2035     1863
 720   37841  1805  1805  1805  1805  1805  1805  2035  2035     1863
 750   37871  1805  1805  1805  1805  1805  1805  2035  2035     1863
 780   37513  1805  1805  1805  1805  1805  1805  2035  2035     1863
 810   37863  1805  1805  1805  1805  1805  1805  2035  2035     1863
 840   37711  1805  1805  1805  1805  1805  1805  2035  2035     1863
 870   37709  1805  1805  1805  1805  1805  1805  2035  2035     1863
 900   37528  1805  1805  1805  1805  1805  1805  2035  2035     1863
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710

Threads        8        8        4        2        1
         Battery    Power  Battery  Battery  Battery
          27-Apr   27-Apr   30-Apr   30-Apr   30-Apr
Start      20:35    20:50    14:06    14:22    14:40
End        20:50    21:09    14:22    14:40    14:57

Secs      MFLOPS   MFLOPS   MFLOPS   MFLOPS   MFLOPS
Start      84416    75701    66146    40172    18014
  30       78275    72473    62252    40037    18003
  60       77460    61675    61739  Timeout    18000
  90       76556    65468    60870    41500    18007
 120       75133    62711    60758    38685    18011
 150       74824    62759    60320    39085    18002
 180       74002    62159    60111    38975    18017
 210       71878    58489    59853    38780    18014
 240       73117    58442    59472    38367    18006
 270       72064    55940    58854    38418    18005
 300       72885    53904    35216    37431    18002
 330       71437    55761    58663    36239    18015
 360       71531    54161    57187    36538
 390       70866    53668    54066    35590
 420       70526    53834    51857    35860
 450       69574    53701    55682    35227
 480       62070    53907    50873  Timeout  Timeout
 510       62157    53930    52357    34290
 540       59206    53534    51482    34310
 570       57785    53970    49558    35564
 600       56564    53967  Timeout    36059
 630       59496    68216    59774    36938
 660       55328    53941    47969    35854    31675
 690       55826    59331    52595    34714    31642
 720       56265    57811    57567    36331    30553
 750       53968    58897    49164    36803    25729
 780       56221    55074    59303    34276    22781
 810       54436    56509    49458    34620    22491
 840       55442    56757    58579    35851    22494
 870       53653    53610    51860    35835    22493
 900       54026    52228    50180    34358    22486

Start S    84416    75701    66146    40172    18014
End   E    54026    52228    50180    34358    22486
%E/S          64       69       76       86      125
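The %E/S row at the foot of the System 4 table is the final MFLOPS reading expressed as a percentage of the first. The following is a minimal Python sketch of that arithmetic, using the Start and End values from the table. The dictionary labels and function names are made up for this illustration and are not part of the benchmark program.

```python
# Start (S) and End (E) MFLOPS copied from the System 4 stress test table.
# The labels are hypothetical names chosen for this sketch.
RUNS = {
    "8 threads, battery": (84416, 54026),
    "8 threads, power": (75701, 52228),
    "4 threads, battery": (66146, 50180),
    "2 threads, battery": (40172, 34358),
    "1 thread, battery": (18014, 22486),
}


def end_over_start_percent(start_mflops, end_mflops):
    """Return the end speed as a whole-number percentage of the start speed."""
    # Adding 0.5 before truncating rounds positive values to the nearest integer.
    return int(100 * end_mflops / start_mflops + 0.5)


def summarise(runs):
    """Map each run label to its %E/S value."""
    return {label: end_over_start_percent(s, e) for label, (s, e) in runs.items()}


if __name__ == "__main__":
    for label, percent in summarise(RUNS).items():
        print(f"{label:20s} {percent:4d}%")
```

A value below 100 means the run finished slower than it started. The single thread run is the only one above 100, because its speed rose after 660 seconds.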
The main observations are that average performance can fall during extended running, and that MP gains can be nowhere near proportional to the number of CPU cores used. For example, using 8 cores might give only a threefold improvement over a single core, with a gain of less than four times apparently inevitable.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
System 4 Android 13 1 x 2.80 GHz Cortex-X2, 4 x 1.82 GHz Cortex A510, 3 x 2.52 GHz Cortex A710

System                   1              2              3              4
Threads          MB/sec  Gain   MB/sec  Gain   MB/sec  Gain   MB/sec  Gain
1    Best         14594   1.0    14398   1.0    11433   1.0    20831   1.0
2    Minimum      23529   1.6    30435   2.1    20842   1.8    33665   1.6
     Average      25460   1.7    30707   2.1    21712   1.9    40107   1.9
     Maximum      29863   2.0    30833   2.1    22919   2.0    47865   2.3
4    Minimum      30093   2.1    30379   2.1    23305   2.0    61693   3.0
     Average      34008   2.3    35550   2.5    28169   2.5    69532   3.3
     Maximum      40437   2.8    36440   2.5    29441   2.6    96175   4.6
8    Minimum      44260   3.0    50302   3.5    36674   3.2    87526   4.2
     Average      48066   3.3    55361   3.8    39996   3.5   104589   5.0
     Maximum      53708   3.7    57397   4.0    44521   3.9   133083   6.4
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
System 4 Android 13 1 x 2.80 GHz Cortex-X2, 4 x 1.82 GHz Cortex A510, 3 x 2.52 GHz Cortex A710

System                   1              2              3              4
Threads          MFLOPS  Gain   MFLOPS  Gain   MFLOPS  Gain   MFLOPS  Gain
1    Best         12096   1.0    12413   1.0     6917   1.0    31675   1.0
2    Minimum      22221   1.8    24629   2.0    13358   1.9    34276   1.1
     Average      23468   1.9    24896   2.0    13821   2.0    36783   1.2
     Maximum      24427   2.0    24990   2.0    13830   2.0    41500   1.3
4    Minimum      21944   1.8    26128   2.1    16433   2.4    35216   1.1
     Average      25164   2.1    27510   2.2    16859   2.4    55459   1.8
     Maximum      28083   2.3    27807   2.2    17087   2.5    66146   2.1
8    Minimum      29787   2.5    35775   2.9    20619   3.0    53653   1.7
     Average      31555   2.6    37249   3.0    22881   3.3    65709   2.1
     Maximum      34876   2.9    38431   3.1    24716   3.6    84416   2.7
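The Gain columns are the multithreaded result divided by the best single thread result for the same system. Dividing the gain by the thread count gives a rough MP efficiency figure. The sketch below, in Python, repeats that arithmetic for the 8 thread MFLOPS results. The term "efficiency" as gain per thread, and all function and variable names, are assumptions made for this illustration. The cores are not identical, so a gain equal to the thread count would not be expected even in the ideal case.

```python
# Best 1 thread MFLOPS and 8 thread (minimum, average, maximum) MFLOPS,
# copied from the floating point table. Names are hypothetical.
MFLOPS = {
    "System 1": (12096, (29787, 31555, 34876)),
    "System 2": (12413, (35775, 37249, 38431)),
    "System 3": (6917, (20619, 22881, 24716)),
    "System 4": (31675, (53653, 65709, 84416)),
}
THREADS = 8


def gain(mp_mflops, single_mflops):
    """Speed of the multithreaded run relative to the best single thread run."""
    return mp_mflops / single_mflops


def efficiency_percent(mp_mflops, single_mflops, threads):
    """Gain per thread as a percentage (100 would mean perfect scaling)."""
    return 100 * gain(mp_mflops, single_mflops) / threads


if __name__ == "__main__":
    for system, (single, (low, mean, high)) in MFLOPS.items():
        gains = [round(gain(v, single), 1) for v in (low, mean, high)]
        eff = efficiency_percent(mean, single, THREADS)
        print(f"{system}: gains min/avg/max {gains}, average efficiency {eff:.0f}%")
```

On these figures, System 4 has the highest absolute 8 thread MFLOPS but the lowest gain over its own single thread result.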