Android Benchmarks for 32 bit and 64 bit CPUs from ARM, Intel and MIPS

Roy Longbottom's Android Benchmarks
For 32 Bit and 64 Bit CPUs from ARM, Intel and MIPS


Download Benchmark Apps

General

Logged Configuration

Whetstone Benchmark - NativeWhetstone2.apk, Java Whetstone.apk, WhetsNN.exe

Dhrystone Benchmark - Dhrystone2i.apk, Dhry2NN.exe

Linpack Benchmark - LinpackDP2.apk, LinpackSP2.apk, LinpackJava.apk,
NEON-Linpacki.apk, Linpack32SSE.exe, Linpack64SSE.exe

Livermore Loops Benchmark - LivermoreLoops2.apk, Lloops32.exe, Lloops64.exe

MemSpeed Benchmark - MemSpeedi.apk, MemSpeed32.exe, MemSpeed64.exe

NeonSpeed Benchmark - NeonSpeedi.apk

BusSpeed Benchmark - BusSpeedv7i.apk, BusSpeed32.exe, BusSpeed64.exe

RandMem Benchmark - RandMemi.apk, RandMem32.exe, RandMem64.exe

FFT Benchmarks - fft1.apk, fft3c.apk,
For Windows - fft1-32.exe, fft3c-32.exe, fft1-64.exe, fft3c-64.exe

MP-Whetstone Benchmark - MP-WHETSi, MPWhets32.exe, MPWhets64.exe

MP Dhrystone Benchmark - MP-Dhryi.apk, MPDhry32.exe, MPDhry64.exe

NEON-Linpack-MP Benchmark - NEON-Linpacki-MP.apk
Windows SSE - MPLinpackBBPP.exe - BB 32 or 64, PP SP or DP

MP-BusSpeed Benchmark - MP-BusSpdi.apk, MP-BusSpd2i.apk, MPbusSpeed32.exe, MPbusSpeed64.exe

MP-RandMem Benchmark - MP-RndMemi.apk, MPRandMem32.exe, MPRandMem64.exe

MP-MFLOPS Benchmark - MP-MFLOPS2i, MPmflops32.exe, MPmflops64.exe

NEON-MFLOPS-MP Benchmark - NEON-MFLOPS2i-MP.apk

OpenGL Benchmark - JavaOpenGL1.apk

OpenGL Drawing Benchmark - JavaDraw.apk

CPU MHz Benchmark - CP_MHz2.apk

Battery Drain Test - BatteryTest.apk

DriveSpeed Benchmarks - DriveSpeed.apk, DriveSpd2.apk, drivespeed32.exe

CPU Stress Tests - USE AT YOUR OWN RISK

Floating Point Stress Test - MP-FPU-Stress.apk

Integer Stress Test - MP-Int-Stress.apk

Floating Point Plus Integer Tests

Errors and False Errors

System Details

Roy Longbottom March 2018


NOTE - These benchmarks generally ran successfully on devices running Android versions up to 7. They could be installed under Android 8, but failed to run due to a minor incompatibility. The benchmarks have been regenerated to exclude this problem, and the new versions can be downloaded from a ResearchGate PDF file. This also includes details and results from later technology, including a Cortex-A73 CPU with Android 8.


A Settings, Security option may need changing to allow installation of non-Market applications. All benchmarks have an option to save results via Email. The first set automatically selects benchmark code for ARM, Intel or MIPS processors at run time, for 32 bit architecture or 64 bit when supported.

NativeWhetstone2.apk - First standard benchmark
Dhrystone2i.apk - First integer benchmark
LinpackDP2.apk - First computational benchmark
LinpackSP2.apk - Single precision Linpack
LivermoreLoops2.apk - First supercomputer benchmark
MemSpeedi.apk - Floating Point Cache and RAM Test
BusSpeedv7i.apk - Integer Bus, Cache and RAM Test
RandMemi.apk - Random/Serial Access Cache and RAM Test
fft1.apk - Original FFT Benchmark
fft3c.apk - Optimised FFT Benchmark
MP-WHETSi.apk - Whetstone Floating and Fixed Point Tests
MP-Dhryi.apk - Dhrystone Integer Benchmark
MP-MFLOPS2i.apk - CPU, Cache, RAM MFLOPS Long Running Test
MP-BusSpd2i.apk - Long running version with staggered start
MP-RndMemi.apk - Multithreaded RandMem Benchmark
NEON-Linpacki.apk - Linpack Benchmark using ARM NEON Intrinsic Functions
NeonSpeedi.apk - NEON Memory Speed Test Using Intrinsic Functions
NEON-MFLOPS2i-MP.apk - MP-MFLOPS using ARM NEON Intrinsic Functions
NEON-Linpacki-MP.apk - Linpack MP Benchmark using NEON Intrinsic Functions
ZIP File 32 Bit Versions - To check performance gains of 64 bit benchmarks

Stress Tests

MP-FPU-Stress.apk - Floating Point Stress Test, Variable Run Time Parameters
MP-Int-Stress.apk - Integer Stress Test, Variable Run Time Parameters

The above were produced using gcc 4.8, via Eclipse, running under Linux Ubuntu 14.04.
Following are older 32 bit benchmarks that are still relevant.

My original Android benchmarks were compiled to run only on ARM CPUs using 32 bit instructions. These are available from a copy in British Library Archives or from here. The newer ones automatically select benchmark code for ARM, Intel or MIPS processors at run time, for 32 bit architecture or 64 bit when supported. These were produced using a later version of the gcc compiler. When evaluating performance differences of 64 bit operation, the 32 bit results should come from the same compiler version. These are in the 32 bit zip file that can be downloaded from above. It should be noted that Android recognises these as identical to the 64 bit versions, which might then need to be reinstalled. The version is identified in the output display.

The original ARM native code benchmarks will run on Intel CPUs, but slowly, via an Android based compatibility layer, called Houdini, that maps ARM instructions into those for X86 processors. The new ones use native Intel instructions. After installing Android 5.0, on the Intel tablet, the original ARM native code benchmarks were rerun. As shown in the results below, significant speed gains could be obtained.

The latest benchmarks were compiled using gcc 4.8, via Eclipse Android Development Tools. The project files, with source code, are in Android Intel-ARM Benchmarks.zip. Limited tests show that these projects can also be used to produce the benchmarks via Android Studio. The zip file now includes the projects for the above earlier tests, in a folder named Old.

All Java and native C/C++ based benchmarks use the same Java front end to run the benchmarks and display the results, an example being below. There are Run, Information and Save buttons, the latter to eMail results to me and/or whoever. The results are also saved in text based log files that also identify system characteristics. As indicated, results identify whether 32 bit or 64 bit code has been executed.

New Version of android benchmarks.htm - The revised version of this report will contain results from running a wide range of the 32/64 bit benchmarks on a particular tablet or phone, plus any from newer top end devices. For detail and results from the original benchmarks see the last report.

Strategy - These benchmarks, based on 50 years' experience, do not attempt to provide an overall performance rating (the Lies, Damned Lies and Benchmarks type), as such a rating is meaningless in representing the diverse variety of user activities. The programs are intended to identify best and worst performance characteristics that might explain why a particular application is fast or slow.

CPU Benchmarks - The first set is the Classic Benchmarks, the original programs that set standards of performance for computers, comprising Whetstone, Dhrystone, Linpack (including NEON-Linpack) and Livermore Loops.

Memory Benchmarks - Next are programs that measure performance with data from caches and RAM. MemSpeed (including the NeonSpeed variant), BusSpeed and RandMem all use the same range of data sizes between 4 KB and 64 MB. Then there is a Fast Fourier Transform benchmark with multiple data sizes.

MultiThreading Benchmarks - These all measure performance using 1, 2, 4 and 8 threads. The first are MP-Whetstone, MP-Dhrystone and MP-Linpack (including NEON-Linpack-MP). The next batch all use memory sized 12.8 KB, 128 KB and 12.8 MB, comprising MP-MFLOPS (including NEON-MFLOPS MP), MP-BusSpeed and MP-RandMem.

Older Benchmarks - These include graphics and SD drive benchmarks.

Windows 10 Tablet - The C code part of the benchmarks has been used as the basis of programs compiled, as 32 bits and 64 bits, to run on Intel processors via Windows. Results are included below for comparison purposes, but performance might not be the same as that from Android versions running on the same Intel processor model (See system W1). The benchmark execution files are in WinTablet.zip.

March 2016 - A second Windows 10 tablet was obtained, using the same Atom CPU, with the added dual boot option to use Android. This uses a 64 bit Linux kernel but, unfortunately, Android is a 32 bit variety. Results for both Operating Systems are included below.

October 2016 - Results now include some using Remix OS for PC that runs Android applications on compatible Intel-based PCs. These include using this for a second boot option on one of the Windows 10 tablets.

May 2017 - Android 7.0 results included, with all 32 bit benchmarks being run on Cortex-A53 based P37. All processor dependent benchmark results were essentially the same as those from Android 6, except Java varieties, where the Whetstone benchmark speed improved considerably.

June 2017 - Floating point and integer arithmetic stress tests were produced. These are multithreaded programs, where number of threads, data size and running time can be defined, plus operations per word for the floating point tests. Unlike previous benchmarks, these display results continuously, over 10 second periods.

Logged Configuration

Following are examples of ARM and Intel based system information included in the log files.

 ARM CPU Based System Information

 Device Asus Nexus 7
 Screen pixels w x h 800 x 1205
 Android Build Version 5.0.2
 Processor : ARMv7 Processor rev 9 (v7l)
 processor : 0
 BogoMIPS : 1993.93
 processor : 1
 BogoMIPS : 1993.93
 processor : 2
 BogoMIPS : 1993.93
 processor : 3
 BogoMIPS : 1993.93
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x2
 CPU part : 0xc09
 CPU revision : 9
 Hardware : grouper
 Revision : 0000
 Linux version 3.1.10-g6ff7a51 (android-build@vpbs1.mtv.corp.google.com)
 (gcc version 4.7 (GCC) ) #1 SMP PREEMPT Wed Dec 10 19:55:59 UTC 2014

 Intel CPU Based System Information

 Device Asus K013
 Screen pixels w x h 800 x 1216
 Android Build Version 4.4.2
 processor : 3
 vendor_id : GenuineIntel
 cpu family : 6
 model : 55
 model name : Intel(R) Atom(TM) CPU Z3745 @ 1.33GHz
 stepping : 8
 microcode : 0x81b
 cpu MHz : 1862.000
 cache size : 1024 KB
 physical id : 0, siblings : 4, core id : 3, cpu cores : 4, apicid : 6, initial apicid : 6
 fdiv_bug : no, f00f_bug : no, coma_bug : no, fpu : yes, fpu_exception : yes
 cpuid level : 11, wp : yes
 flags : fpu vme + numerous others including up to SSE4
 bogomips : 2666.77
 clflush size : 64
 cache_alignment : 64
 address sizes : 36 bits physical, 48 bits virtual
 Linux version 3.10.20-g268162b (3.2.23.182) (gcc version 4.7 (GCC) )
 #1 SMP PREEMPT Tue Sep 16 10:49:37 CST 2014
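The logged fields above follow a simple "key : value" layout, so a saved log can be summarised with a few lines of script. The following is a minimal sketch in Python, not part of the benchmarks themselves. The function name parse_system_info and the sample text are assumptions made for illustration, and the sample assumes one field per line. Lines that contain other colons, such as the Linux version line with its time stamp, would need separate handling.

```python
from collections import defaultdict


def parse_system_info(text):
    """Collect 'key : value' lines into a dict mapping each key to a list of values.

    Keys are case sensitive, so 'Processor' and 'processor' stay separate.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()].append(value.strip())
    return dict(fields)


# Hypothetical extract, shortened from the ARM example in this section.
sample = """\
Processor : ARMv7 Processor rev 9 (v7l)
processor : 0
BogoMIPS : 1993.93
processor : 1
BogoMIPS : 1993.93
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
CPU part : 0xc09
"""

info = parse_system_info(sample)
core_count = len(info["processor"])                 # one entry per logged core
has_neon = "neon" in info["Features"][0].split()    # NEON needed for the NEON tests
print(core_count, has_neon)
```

Counting the repeated processor entries gives the number of cores reported, and the Features line shows whether NEON or vfpv4 instructions are available.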

Whetstone Benchmark - NativeWhetstone2.apk, Java Whetstone.apk, WhetsNN.exe

This provides an overall rating in MWIPS, plus separate results for the eight test procedures in MFLOPS (floating point) and MOPS (functions and integer). For full details and results via Windows, Linux, Android and different programming languages, see all modern Whetstone Benchmark results (including Windows tablet versions running on desktop PCs), also British Library Archived Files Results up to 2012 and Whetstone Benchmark History and Results from the 1960s.

Below are results from the original benchmark for comparison with the new one, compiled for 32 bit systems. The initial aim was to show performance improvements of using native code on Intel Atom processors, rather than via the Houdini compatibility translation, where native speeds on system A1 were around twice as fast. Note the performance differences of the original ARM version (from here) on A1, the Intel Atom based tablet, following the upgrade to Android 5.0.

The downside of the later gcc 4.8 compilation was much slower MWIPS ratings using ARM CPUs. This was due to the extremely slow speeds on the EXP tests that dominate overall running time.

On a given platform, as with other CPU-only benchmarks, performance tends to be proportional to CPU MHz. Considering this, the particular code appears to suit the Qualcomm Snapdragon 800 and shows no real advantage of the ARM v8-A53 over the V7 varieties. In fact, the EXP test also uses the SQRT function. One of the Livermore Loops also uses this function, and produces an unexpectedly low minimum speed on the same systems (T7, T11, T22 - ARM/Intel 32 Bit Version).

Java results are also included, particularly to show the effects of Android 5 using ART virtual machine instead of Dalvik. For this particular benchmark, there are gains and losses, but all are slower than the native compiled versions.

A5 and W2 Dual Boot Tablet - Differences in results from Microsoft and Android compilers are reflected. Atom Z8300 results for W2 are slower than W1. Later, similar results were obtained.

2016 - Note fast Core i7 results using Android via REMIX for PC and slow Java speeds with Android 6.0, that might be due to later Java, as shown with Intel/Windows Version results.

 System   ARM      MHz  Android  MWIPS    ------MFLOPS-------          ------------MOPS--------------
 See      CPU           Build                1       2       3     COS     EXP   FIXPT      IF   EQUAL

 Original ARM Version

 A1       Z3745   1866  4.4.2   1075.4   373.8   311.5   284.5    21.9    14.2  1421.1  1839.2   797.0
 A1       Z3745   1866  5.0      773.4   548.6   371.9   368.2    11.4    11.3  1545.4  2139.5   793.1
 A5       Z8300   1840  5.1      747.7   475.7   376.8   315.2    11.1    11.6  1424.5  1988.9   686.0
 T7       v7-A9   1200  4.1.2   1115.0   271.3   250.7   256.4    25.8    14.6  1190.0  1797.0  1198.7
 T22      v8-A53  1300  5.0.2   1433.7   348.0   319.3   308.2    36.3    19.8  1551.4  1861.9   611.0
 T11      v7-A15  1700  4.2.2   1477.7   363.9   220.6   307.5    39.7    18.0  1690.5  2527.9  1127.9
 T21      QU-800  2150  4.4.3   2035.1   665.7   640.0   531.6    45.2    23.1  3535.2  3180.4  2120.0

 ARM/Intel 32 Bit Version

 A1       Z3745   1866  4.4.2   1888.4   665.8   504.4   492.0    35.7    27.5  3191.4  3585.8  2146.7
 A5 ##    Z8300   1840  5.1     1896.6   612.8   472.8   455.5    39.0    26.5  2916.9  3458.9  1943.3
 T7       v7-A9   1200  4.1.2    731.1   273.6   253.0   252.8    28.0     5.0  1185.2  2383.4  1192.1
 T11      v7-A15  1700  4.2.2    907.4   363.3   327.1   303.1    33.6     6.3  1506.9  2476.5  1122.6
 P40      QU-S4   1700  5.1     1032.2   536.3   502.7   430.4    35.2     5.5  2871.2  2582.7  1267.3
 T21      QU-800  2150  4.4.3   1973.8   679.6   648.4   525.6    44.7    21.9  3516.7  3147.2  1567.7
 T22      v8-A53  1300  5.0.2    834.7   348.9   312.7   310.9    36.7     5.4  1556.7  1867.2   570.5
 P41      v8-A53  1200  8.0.1   1225.1   335.4   300.2   294.8    28.9    15.8  1478.3  1776.5   588.0
 P37      v8-A53  1500  6.0.1   1561.1   417.0   377.6   376.4    36.4    20.2  1880.8  2256.5   736.2
 P37      v8-A53  1500  7.0     1554.9   421.7   379.1   377.2    34.7    21.5  1884.5  2258.4   701.7
 P39      v8-A57  1900  6.0.1   1823.3   438.7   367.0   343.1    64.1    20.3  1892.1  2834.1  1099.1
 R1=Atom  Z8300   1840  6.0.1   1839.3   609.7   470.5   456.5    37.0    25.7  2692.6  3431.1  1993.1
 R2       Core i7 3900  6.0.1   5574.4  1276.0  1178.2   977.1   161.5    75.5  6513.8 11702.0  3908.7

 ARM/Intel 64 Bit Version

 T22      v8-A53  1300  5.0.2   1494.2   347.1   307.0   305.9    37.5    20.6  1552.2  1863.7  1239.1
 R1=Atom  Z8300   1840  6.0.1   2075.1   619.3   472.1   487.0    43.8    29.6  2968.7  2500.9  2015.2
 R2       Core i7 3900  6.0.1   6026.9  1268.9  1171.8   977.0   178.2    93.5  6513.2  5861.9  3958.8

 Intel/Windows 32 Bit Version

 W1=Atom  Z8300   1840  Win10     1833     630     661     478    35.0    27.3    1302    1094   10833
 W2 ##    Z8300   1840  Win10     1710     604     616     439    32.6    25.7    1232    1060   10148
 PC       Core i7 3900  Win10     5314    1263    1245     920     130    90.7    3159    5563   44341

 Intel/Windows 64 Bit Version

 W1=Atom  Z8300   1840  Win10     1985     662     477     476    43.0    28.9    1244    1053   10931
 W2 ##    Z8300   1840  Win10     1918     584     458     461    40.7    28.8    1205     992   10527
 PC       Core i7 3900  Win10     6439    1247    1089     944     220     109    3124    5563   46749

 Java Version

 A1       Z3745   1866  4.4.2    687.9   428.4   346.8   105.8    27.4    20.0   723.8   172.6    60.1
 A1       Z3745   1866  5.0      826.9   182.0   307.1   126.8    32.5    22.3   382.1   126.6   129.6
 A5 ##    Z8300   1840  5.1      833.2   286.1   292.3   130.9    32.3    22.1   449.4   127.1   116.9
 T7       v7-A9   1200  4.4.2    341.8    75.6   104.9    62.1    16.1     5.2   214.3   194.1    43.4
 T7       v7-A9   1200  5.1.1    411.3   102.2   157.5    70.4    12.6     8.3   266.1   104.8    78.9
 T11      v7-A15  1700  4.2.2    533.9   131.4   209.4   102.5    20.4     6.7   475.8   174.8   105.7
 T21      QU-800  2150  4.4.3    855.1   124.0   272.9   159.3    36.3    20.1   572.2   169.3    78.3
 T22      v8-A53  1300  5.0.2    363.9    83.0   149.3    65.1    10.8     7.6   268.1    88.2    59.6
 T22      v8-A53  1300  5.1      391.3    86.0   155.7    66.8    12.3     8.6   277.8    89.8    61.4
 P37      v8-A53  1500  6.0.1    197.8    28.5    35.6    34.5     9.8     5.6    82.8    34.0    22.7
 R1=Atom  Z8300   1840  6.0.1    279.8    42.2    63.0    53.5    11.2     6.3   118.6    63.1    33.3
 P37      v8-A53  1500  7.0      925.5   236.9   263.5   121.7    33.7    21.8   468.8   259.7   346.4
 R2       Core i7 3900  6.0.1   1405.7   220.1   325.4   207.5    61.9    47.6   628.7   321.8   173.7

 Intel/Windows Version

 W1 Atom  Z8300   1840  Java8    843.6   435.6   336.0   174.3    20.3    12.7   826.3   154.8   788.4
 W2 ##    Z8300   1840  Java8    749.5   374.7   290.7   151.4    17.8    12.0   738.2   133.5   664.3
 PC       Core i7 3900  J8_60   2520.5   829.7   774.6   420.8    69.1    39.8  3189.6  1003.9  1244.4
 PC       Core i7 3900  J8_111   807.0   275.6   245.7   133.8    22.0    12.9  1050.0   327.7   391.7

 ## A5 and W2 Same Dual Boot Tablet, =Atom R1 and W1 Same Tablet, R2 and PC Same PC
 J8_60=Java 1.8.0_60, J8_111=Java 1.8.0_111

Dhrystone Benchmark - Dhrystone2i.apk, Dhry2NN.exe

The Dhrystone integer benchmark produces a performance rating in Vax MIPS (AKA DMIPS). Further details of the Dhrystone benchmark, and results from Windows and Linux based PCs, can be found in all modern Dhrystone Benchmark results (including Windows tablet versions running on desktop PCs) with those up to late 2012 also in British Library Archives. The shown ratio, MIPS/MHz, is often quoted, with this depending on compiler optimisation (or over-optimisation) but is normally constant using the same benchmark on the same range of processors.
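As a worked illustration of the MIPS/MHz ratio, the sketch below recomputes it for three rows of the results table in this section. It is written in Python purely for illustration. The function names are assumptions, and the constant 1757 is the commonly quoted Dhrystones per second of the VAX 11/780 reference machine, which is not stated in this report.

```python
# Assumed reference score: Dhrystones per second of a VAX 11/780 (1 VAX MIPS).
VAX_11_780_DHRYSTONES_PER_SECOND = 1757.0


def vax_mips(dhrystones_per_second):
    """Convert a raw Dhrystones per second score to VAX MIPS (DMIPS)."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def mips_per_mhz(mips, cpu_mhz):
    """Ratio quoted in the table: VAX MIPS divided by CPU clock in MHz."""
    return mips / cpu_mhz


# (system, VAX MIPS, MHz) taken from the results table in this section.
rows = [
    ("A1 original", 1840, 1866),
    ("T11 original", 3189, 1700),
    ("R2 64 bit", 17003, 3900),
]
for system, mips, mhz in rows:
    print(f"{system:13s} {mips_per_mhz(mips, mhz):.2f}")
# prints 0.99, 1.88 and 4.36, matching the MIPS/MHz column
```

Because the ratio removes the clock speed, it is the figure to compare when judging the compiler and CPU architecture rather than the device.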

Using native x86 code, performance of the Intel Atom based tablet A1 is 30% faster than the original ARM to Intel translated program but, on the other systems, the newer 32 bit compilations are slower. At least tablet T22 is nearly twice as fast when compiled for 64 bit operation. Following an upgrade to Android 5.0, A1 ARM to Intel translation produced performance equivalent to native code. Original can be obtained from here.

2016 - Note faster Android operation at 64 bits, and outstanding Core i7 speeds using REMIX Android, similar to those of the Windows versions.

 System   ARM      MHz  Android    Vax   MIPS
 See                              MIPS   /MHz

 Original ARM Version

 A1       Z3745   1866  4.4.2     1840   0.99
 A1       Z3745   1866  5.0       2488   1.33
 A5 ##    Z8300   1840  5.1       2365   1.29
 T7       v7-A9   1200  4.1.2     1610   1.34
 T22      v8-A53  1300  5.0.2     1683   1.29
 T11      v7-A15  1700  4.2.2     3189   1.88
 T21      QU-800  2150  4.4.3     3854   1.79

 ARM/Intel 32 Bit Version

 A1       Z3745   1866  4.4.2     2451   1.31
 A4       Z8300   1840  5.1.1     2430   1.32
 A5 ##    Z8300   1840  5.1       2318   1.26
 T7       v7-A9   1200  4.1.2     1317   1.10
 T11      v7-A15  1700  4.2.2     2551   1.50
 T21      QU-800  2150  4.4.3     3319   1.54
 T22      v8-A53  1300  5.0.2     1423   1.09
 P37      v8-A53  1500  6.0.1     1649   1.10
 P37      v8-A53  1500  7.0       1722   1.15
 R1=Atom  Z8300   1840  6.0.1     2390   1.30
 R2       Core i7 3900  6.0.1    10489   2.69

 ARM/Intel 64 Bit Version

 T22      v8-A53  1300  5.0.2     2569   1.98
 R1=Atom  Z8300   1840  6.0.1     3769   2.05
 R2       Core i7 3900  6.0.1    17003   4.36

 Intel/Windows 32 Bit Version

 W1 Atom  Z8300   1840  Win 10    3044   1.65
 W2 ##    Z8300   1840  Win 10    2906   1.60
 PC       Core i7 3900  Win 10   10302   2.64

 Intel/Windows 64 Bit Version

 W1 Atom  Z8300   1840  Win 10    3291   1.79
 W2 ##    Z8300   1840  Win 10    3195   1.74
 PC       Core i7 3900  Win 10   15580   3.99

 ## A5 and W2 Same Dual Boot Tablet, =Atom R1 and W1 Same Tablet, R2 and PC same System

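Linpack Benchmark - LinpackDP2.apk, LinpackSP2.apk, LinpackJava.apk, NEON-Linpacki.apk, Linpack32SSE.exe, Linpack64SSE.exe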

The Linpack benchmark speed is measured in MFLOPS, officially for double precision floating point calculations. A version was produced using NEON functions, which only provide single precision operation. So, for comparison purposes, an available C code option, to define single precision data, was used to produce a new version and this has usually led to a higher MFLOPS speed. Results from various hardware and software platforms can be found in all modern Linpack Benchmark results (including Windows tablet versions running on desktop PCs), with those up to 2013 in British Library Archives.

Single and Double Precision (SP and DP) - Performance of the Linpack benchmark is almost entirely dependent on the calculation x[i]=x[i]+c*y[i], where changes in the CPU instructions used can have a dramatic effect. Using the original benchmarks, SP was often nearly twice as fast as DP, affected by the use of instructions such as those of ARM vfpv4, which include fused multiply-accumulate. Tablet T11 has vfpv4 but T7 does not - See System Details. As a reminder, the original programs on Intel CPUs used a slower instruction conversion procedure, whereas the later ones use native Intel code. The newer compilations could make further use of advanced instructions, changing relative SP/DP and 32/64 bit performance.
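To make the dominant calculation concrete, here is a minimal sketch of the x[i]=x[i]+c*y[i] loop with a simple MFLOPS estimate, written in Python for illustration only. It is not the benchmark code: the real programs are C and Java and time a complete solution of linear equations. The names daxpy and measure_mflops, the vector length and the repeat count are assumptions. Each element costs one multiply and one add, so two floating point operations are counted per element.

```python
import time


def daxpy(c, x, y):
    """Update x in place: x[i] = x[i] + c * y[i] for every element."""
    for i in range(len(x)):
        x[i] = x[i] + c * y[i]


def measure_mflops(n=100_000, repeats=20):
    """Time repeated passes of the loop and return millions of operations per second."""
    x = [1.0] * n
    y = [2.0] * n
    start = time.perf_counter()
    for _ in range(repeats):
        daxpy(0.5, x, y)
    elapsed = time.perf_counter() - start
    operations = 2.0 * n * repeats  # one multiply plus one add per element
    return operations / elapsed / 1e6


if __name__ == "__main__":
    print(f"{measure_mflops():.1f} MFLOPS")
```

Interpreted Python will run far slower than the compiled results reported here, so the figure only illustrates how MFLOPS is derived. A fused multiply-accumulate instruction performs the multiply and the add of each element in one step, which is why its availability matters so much for this loop.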

Java results are also included, particularly to show the effects of Android 5 using the ART virtual machine instead of Dalvik. For this particular benchmark, ART produces improvements, except for the original A1 Intel Atom results. All are slower than the native compiled versions.

NEON-Linpack - This is a single precision version, using NEON intrinsic functions, where further details can be found in android neon benchmarks.htm. Performance can be double that of normal SP results. Note little gain on T22, at 64 bits, where normal SP might also use advanced SIMD instructions.

Following an upgrade to Android 5.0, A1 ARM to Intel translation produced some significant performance gains but Java based speeds were slower. Original ARM only version can be obtained from here.

Windows - both the latest 64 bit and 32 bit compilations use SSE2 instructions (32 bit incompatible with old systems), with similar speeds. Sample Single Precision compilations also produced similar performance.

2016 - Again note fast Core i7 speeds using REMIX Android for PC and some poor performance with latest Java release.

 System   ARM      MHz  Android    LinpackDP   LinpackSP NEONLinpack LinpackJava
 See                                  MFLOPS      MFLOPS   SP MFLOPS      MFLOPS

 Original ARM Version

 T7       v7-A9   1200  4.1.2        151.05      201.30      376.00       56.44
 T22      v8-A53  1300  5.0.2        156.70      184.09      393.34       86.09
 T11      v7-A15  1700  4.2.2        459.17      803.04     1334.90      143.06
 T21      QU-800  2150  4.4.3        389.52      751.95     1250.14      340.44
 A1       Z3745   1866  4.4.2        168.16      296.63      443.42      252.49
 A1       Z3745   1866  5.0          253.83      293.20      680.85      166.09
 A5 ##    Z8300   1840  5.1          238.04      318.00      746.36      174.67
 R1=Atom  Z8300   1840  6.0.1                                781.17       37.65
 R2       Core i7 3900  6.0.1                               3717.42      222.23

 ARM/Intel 32 Bit Version

 T7       v7-A9   1200  4.1.2        159.34      199.84      346.78
 T7       v7-A9   1200  5.1.1        160.25      198.96      346.12       89.50
 T22      v8-A53  1300  5.0.2        172.28      180.64      407.08
 T22      v8-A53  1300  5.1          178.04      187.03      421.86       91.28
 P37      v8-A53  1500  6.0.1        207.64      219.03      480.21       23.25
 P37      v8-A53  1500  7.0          208.00      220.13      474.21      112.14
 T11      v7-A15  1700  4.2.2        826.36      952.88     1411.86   See above
 T21      QU-800  2150  4.4.3        629.92      790.83     1325.00   See above
 A1       Z3745   1866  4.4.2        362.63      408.87      900.17   See above
 A1       Z3745   1866  5.0          363.98      406.59      900.64   See above
 A5 ##    Z8300   1840  5.1          609.39      644.32      942.12   See above
 R1=Atom  Z8300   1840  6.0.1        632.56      682.08     1000.00   See above
 R2       Core i7 3900  6.0.1       3442.00     1838.99         N/A   See above

 ARM/Intel 64 Bit Version

 T22      v8-A53  1300  5.0.2        338.00      479.69      505.12
 T22      v8-A53  1300  5.1          347.55      492.78      520.79   See above
 P33      QU-810  2000  5.0.2       1277.76
 R1=Atom  Z8300   1840  6.0.1        875.82     1473.16         N/A   See above
 R2       Core i7 3900  6.0.1       5152.85     3950.31         N/A   See above

 Intel/Windows 32 Bit Version

 W1 Atom  Z8300   1840  Win 10       615.80                             See 64b
 W2 ##    Z8300   1840  Win 10       613.50                             See 64b
 PC       Core i7 3900  Win 10      3453.72                             See 64b

 Intel/Windows 64 Bit Version

 W1 Atom  Z8300   1840  Win 10       638.75                              254.73
 W2 ##    Z8300   1840  Win 10       636.00                              265.66
 PC       Core i7 3900  Win 10      3603.86                              465.32

 ## A5 and W2 Same Dual Boot Tablet, =Atom R1 and W1 Same Tablet, R2 and PC same System

Livermore Loops Benchmark - LivermoreLoops2.apk, Lloops32.exe, Lloops64.exe

The Livermore Loops comprise 24 kernels of numerical application with speeds calculated in MFLOPS. A summary is also produced, with maximum, minimum and various mean values, geometric mean being the official average. As for the other benchmarks, details and results from various hardware and software platforms are provided in all modern Livermore Loops Benchmark results (including Windows tablet versions running on desktop PCs), with results up to late 2012 in British Library Archives.
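The three kinds of mean in the summary weight slow kernels differently, which is why the geometric mean is preferred as the official average. The following Python sketch shows how each is calculated. The function name summarise and the three speeds are hypothetical, chosen only to make the arithmetic easy to follow. The benchmark's own summary lines need not equal the means of the 24 listed speeds; for example, the Max of 535.8 for the first A1 entry is not among its 24 listed values.

```python
import math


def summarise(mflops):
    """Return max, min and the arithmetic, geometric and harmonic means of speeds."""
    n = len(mflops)
    return {
        "max": max(mflops),
        "average": sum(mflops) / n,
        # Geometric mean: exponential of the mean of the logarithms.
        "geomean": math.exp(sum(math.log(v) for v in mflops) / n),
        # Harmonic mean: dominated by the slowest kernels.
        "harmean": n / sum(1.0 / v for v in mflops),
        "min": min(mflops),
    }


speeds = [100.0, 200.0, 400.0]  # hypothetical MFLOPS values
summary = summarise(speeds)
for name, value in summary.items():
    print(f"{name:8s} {value:7.1f}")
```

For these values the average is about 233.3, the geometric mean is 200.0 and the harmonic mean about 171.4. The same ordering (Average above Geomean above Harmean) appears in every summary line of the results.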

MFLOPS/MHz - The first set of the following comparisons is derived from the shown MFLOPS of the 24 kernels for each system, divided by CPU MHz, and compared to those from the T7 Cortex-A9 CPU. They can indicate the effectiveness of particular levels of hardware and compiler technology. The low minimum speeds occur in the only loop that uses the SQRT function, where the Whetstone Benchmark is also slow on the same systems. The second Cortex-A53 is running under 64 bit Android, which might make a difference. Performance of the systems with better minimum values appears enhanced by the slow T7 Cortex-A9. On average values for ARM CPUs, Qualcomm 800 and Cortex-A15 are somewhat faster. The Intel CPUs are faster on a per MHz basis, with Core i7 being far superior. Note that Android and Windows performance is quite similar for the latter.
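The MFLOPS/MHz comparison can be reproduced from the summary lines in the results. The Python sketch below does so for T11, using the Geomean, Min and Max values of T11 and T7 from the ARM/Intel 32 Bit Version results. The function name per_mhz_ratio is an assumption, and so is the reading that the Avg column is based on the geometric mean, although that reading reproduces the published row.

```python
def per_mhz_ratio(mflops, mhz, ref_mflops, ref_mhz):
    """MFLOPS per MHz of a system, relative to a reference system."""
    return (mflops / mhz) / (ref_mflops / ref_mhz)


# Summary values from the ARM/Intel 32 Bit Version results.
T7 = {"mhz": 1200, "geomean": 175.6, "min": 26.8, "max": 396.6}    # reference
T11 = {"mhz": 1700, "geomean": 342.1, "min": 34.3, "max": 1411.4}

for key in ("geomean", "min", "max"):
    ratio = per_mhz_ratio(T11[key], T11["mhz"], T7[key], T7["mhz"])
    print(f"{key:8s} {ratio:.2f}")
# prints 1.38, 0.90 and 2.51, the T11 Cortex-A15 row of the comparison
```

Dividing by MHz first removes the clock speed advantage, so a ratio above 1.0 indicates more work per clock cycle than the Cortex-A9, not merely a faster clock.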

64 Bit vs 32 Bit - At least as far as average speeds are concerned, working at 32 bits and 64 bits produces similar performance on Intel based devices but 64 bits can be much faster with ARM processors. Note that Intel CPUs can use the same SSE type SIMD instructions at both settings.

Native 32 Bit vs Original Code - The original benchmarks were compiled for ARM CPUs, producing Intel instructions via the Houdini conversion layer. In this case, performance was much better using native code compilation. ARM speeds were affected by using a later version of the compiler.

Original ARM only version can be obtained from here.

 
 MFLOPS/MHz vs Cortex-A9           Avg    Min    Max

 T11 Cortex-A15    Android  32    1.38   0.90   2.51
 T22 Cortex-A53    Android  32    0.83   0.93   0.92
 P37 Cortex-A53    Android  32    0.95   2.17   0.96
 T21 Qualcomm 800  Android  32    1.13   2.34   1.63
 A1  Atom Z3745    Android  32    1.57   3.71   1.67
 A5  Atom Z8300    Android  32    1.61   3.24   1.82
 R1  Atom Z8300    Android  32    1.62   3.07   1.96
 W2  Atom Z8300    Windows  32    1.66   4.47   1.62
 R2  Core i7       Android  32    3.23   4.10   4.22
 PC  Core i7       Windows  32    3.68   5.07   4.52

 64 Bit / 32 Bit                   Avg    Min    Max

 T22 Cortex-A53    Android 64/32  1.47   3.61   1.96
 R1  Atom Z8300    Android 64/32  1.04   1.08   0.95
 R2  Core i7       Android 64/32  1.18   1.00   1.72
 W2  Atom Z8300    Windows 64/32  0.97   0.78   1.09
 PC  Core i7       Windows 64/32  0.96   0.74   1.06

 Native/Original                                    

 A1  Atom Z3745    Android  32    1.92   2.49   3.17
 T7  Cortex-A9     Android  32    1.01   0.97   0.39
 T11 Cortex-A15    Android  32    1.13   0.91   0.38
 T21 Qualcomm 800  Android  32    1.08   1.00   1.12


 System   CPU   MHz Android                           MFLOPS 24 Loops
 
 Original ARM Version ----------------------------------------------------------------

   A1    Z3745 1866  4.4.2  9.5 secs     201.2   257.3   237.5   205.6   122.5   180.0
                                         308.3   450.0   535.3   370.4   104.8    77.1
  Max   Average Geomean Harmean   Min     80.0    95.1   153.8   136.4   202.0   268.9
 535.8   201.9   172.4   146.7    48.8   179.5   209.7   145.0    95.0   254.2    51.3
 
   A1    Z3745 1866  5.0       9.9 secs  374.9   274.8   327.6   295.6   247.9   227.8
                                         468.5   538.6   569.2   396.2   167.9   141.9
  Max   Average Geomean Harmean   Min    109.6   114.5   210.5   150.5   250.6   333.4
 569.8   266.6   233.5   199.8    59.9   287.9   238.0   261.3   114.9   372.8    64.0

   T7   v7-A9 1200  4.1.2     10.0 secs  241.7   233.4   383.5   388.7    98.4   147.1
                                         293.1   258.5   314.6   181.1    99.1    95.3
  Max   Average Geomean Harmean   Min     80.6    68.1   171.6   226.9   346.2   176.9
 391.9   202.1   181.3   160.9    68.1   202.6   184.9   119.5   102.1   200.9    88.5

   T11  v7-A15 1700  4.2.2    10.0 secs  646.8   671.1   839.9   789.7   176.2   671.6
                                        1078.4  1243.4  1018.8   367.0   130.0   165.9
  Max   Average Geomean Harmean   Min    117.6   210.7   370.5   521.1   657.3   625.4
1252.8   476.0   375.8   288.8    90.8   270.8   269.1   458.3   196.3   432.5   112.7

   T21  QU-800 2150  4.4.3    10.0 secs  570.4   624.2   915.6   861.4   175.5   545.4
                                         636.9   911.1   750.6   293.9   130.5   207.0
  Max   Average Geomean Harmean   Min    115.0   159.8   330.5   327.1   608.7   592.8
1075.5   437.1   356.7   284.4   100.3   330.2   267.3   244.2   153.8   356.2   106.2
  
 ARM/Intel 32 Bit Version ------------------------------------------------------------

   A1    Z3745 1866  4.4.2     9.5 secs  484.6   529.2  1031.2   929.2   274.5   365.6
                                         661.9   873.1   825.6   479.1   612.9   520.7
  Max   Average Geomean Harmean   Min    156.8   324.4   339.4   497.8   693.1   481.8
1031.2   480.0   429.8   378.6   154.7   373.0   329.1   388.6   181.8   650.1   169.2

   A5 ## Z8300 1840  5.1      9.6 secs   689.4   701.4  1108.3   873.6   230.1   488.4
                                         662.2   770.0   876.7   404.9   439.6   428.2
  Max   Average Geomean Harmean   Min    141.2   280.7   293.4   466.1   540.3   432.7
1108.3   495.8   433.6   370.6   133.2   313.9   307.8   649.7   176.1   662.0   148.3

   T11  v7-A15 1700  4.2.2    10.0 secs  496.9   814.9   843.7   801.7   175.5   188.6
                                        1223.8  1411.4   760.3   452.5   132.7   120.7
  Max   Average Geomean Harmean   Min    107.1   264.7    34.3   529.0   592.6   728.2
1411.4   471.2   342.1   219.5    34.3   275.2   266.8   530.7   198.8   502.8   117.8


   T21  QU-800 2150  4.4.3    10.1 secs  640.9   814.9   813.8   808.4   201.6   182.0
                                         643.0  1158.9   779.9   351.4   133.1   176.2
  Max   Average Geomean Harmean   Min    113.6   178.4   286.5   294.7   516.7   667.5
1159.4   446.9   356.0   280.3   112.3   327.5   281.7   297.9   153.6   613.1   117.0

   T7    v7-A9 1200  4.4.2    10.2 secs  245.2   268.8   394.7   390.7   118.2   157.2
                                         297.4   308.1   344.7   226.7    90.8    74.7
  Max   Average Geomean Harmean   Min     85.6    81.7    26.9   227.5   338.9   240.3
 396.6   207.6   175.6   136.1    26.8   204.9   180.6   179.9   110.8   271.4    78.5

   P37  v8-A53 1500  6.0.1     9.8 secs  201.7   293.7   331.7   327.5   135.5   137.1
                                         346.5   474.9   451.5   271.6   149.7    74.9
  Max   Average Geomean Harmean   Min     81.2   104.5   236.3   278.4   411.1   294.2
 474.9   237.4   208.3   179.9    72.7   208.0   245.7   148.2   128.8   351.7    99.9

   P37  v8-A53 1500  7.0       9.7 secs  198.6   295.5   331.3   325.1   131.7   140.5
                                         341.5   475.4   452.1   241.8   149.5    74.8
  Max   Average Geomean Harmean   Min     81.8   105.2   237.0   279.0   412.9   295.1
 475.4   237.0   208.1   180.0    72.9   208.8   238.9   131.1   133.1   353.3   100.4

 ARM/Intel 32 Bit Version Then 64 Bit ------------------------------------------------

   T22  v8-A53 1300  5.0.2     9.7 secs  163.4   243.4   272.1   270.3   109.5   111.2
                                         282.2   389.0   360.6   219.6   124.0    61.8
  Max   Average Geomean Harmean   Min     67.6    87.4    27.3   224.2   340.1   241.9
 393.4   188.3   158.3   124.6    27.1   168.5   198.8   120.2   120.6   277.7    79.1

 R1=Atom Z8300 1840  6.0.1     9.4 secs  746.6   767.9  1194.9   986.9   249.3   520.5
                                         722.7   840.9   978.5   370.5   451.5   450.1
  Max   Average Geomean Harmean   Min    151.3   301.3   331.4   524.9   608.1   465.5
1194.9   501.0   435.1   366.6   126.1   352.1   316.8   578.8   181.8   695.3   166.3

 R2    Core i7 3900  6.0.1     8.4 secs 3664.3  3433.9  2498.9  2509.6   552.5  2201.3
                                        4618.0  5337.8  5345.9  2426.9  1307.3  1888.8
  Max   Average Geomean Harmean   Min    670.6  1211.5  2033.5  1804.4  2382.0  3571.5
5441.5  2259.0  1845.3  1445.9   356.9   840.6   968.8  2967.6  1112.8  1591.4   356.9

 ARM/Intel 64 Bit Version ------------------------------------------------------------

   T22  v8-A53 1300  5.0.2     9.7 secs  451.4   191.4   243.2   272.4   144.9   144.5
                                         749.4   411.1   453.6   261.1   138.0   206.1
  Max   Average Geomean Harmean   Min    122.5   130.1   215.0   249.8   411.6   395.4
 772.2   265.9   232.5   206.3    97.8   241.7   248.1   152.8   118.7   317.2   103.7

 R1=Atom Z8300 1840  6.0.1     9.4 secs  881.6   742.9  1130.2   928.7   236.9   554.1
                                         869.1   795.4   854.7   433.5   198.4   604.5
  Max   Average Geomean Harmean   Min    215.7   292.9   320.3   520.5   628.6   528.5
1130.2   524.1   451.1   380.6   136.1   321.3   290.4   692.1   205.4   698.3   164.8 

 R2    Core i7 3900  6.0.1     8.9 secs 9376.3  3394.8  2496.3  2523.0   559.6  2219.9
                                        8891.9  5719.5  5828.1  2749.2   439.9  3146.1
  Max   Average Geomean Harmean   Min   1182.7  1272.5  2282.8  2332.7  2379.4  5722.9
9376.3  2933.0  2172.0  1556.1   357.0  1068.6   966.6  2966.5  1435.5  1590.7   357.0

 Intel/Windows 32 Bit Version --------------------------------------------------------

  W1  Atom Z8300 1840 MHz Win10          721.4   702.3   862.7   988.7   245.3   489.6
                                         875.8   794.8   980.5   441.3   201.1   446.7
  Max   Average Geomean Harmean   Min    201.0   240.8   299.8   499.9   603.5   459.3
 988.7   504.9   448.8   395.8   189.6   446.6   336.0   607.8   199.0   705.3   277.8

  W2 ##    Z8300 1840 MHz Win10          749.7   731.3   894.0   988.1   251.2   489.2
                                         883.7   797.3   968.3   434.5   200.5   454.0
  Max   Average Geomean Harmean   Min    202.9   240.7   301.1   521.4   604.3   457.0
 988.1   503.5   446.9   393.5   183.5   443.6   333.7   587.1   200.6   697.1   276.4

  PC     Core i7 3900 MHz Win10         4752.7  3624.0  2593.7  2764.2   564.5  1590.3
                                        5071.8  5284.2  5569.6  2784.2   441.5  1939.4
  Max   Average Geomean Harmean   Min    931.3  1205.6  2284.2  2372.1  2435.2  3500.0
5821.7  2512.1  2102.9  1712.0   441.5  1068.8  1880.4  2819.7  1529.1  1590.4  1616.5

 Intel/Windows 64 Bit Version --------------------------------------------------------

  W1  Atom Z8300 1840 MHz Win10          655.2   651.9   728.1   688.9   217.3   457.6
                                         732.2   735.7   965.5   378.3   170.7   381.6
  Max   Average Geomean Harmean   Min    196.6   196.4   213.0   434.4   522.6   420.6
 965.5   433.6   375.8   320.0   117.2   385.6   283.1   572.9   156.7   584.5   129.7

  W2 ##    Z8300 1840 MHz Win10          743.0   734.7   834.0   808.1   233.4   547.9
                                         878.6   857.4  1074.4   440.8   201.7   450.4
  Max   Average Geomean Harmean   Min    215.4   228.6   247.9   500.4   608.1   484.8
1074.4   500.3   433.7   369.8   143.4   440.7   327.7   650.4   180.8   682.2   151.6

  PC     Core i7 3900 MHz Win10         4566.1  3465.7  2459.1  2748.1   565.1  2308.3
                                        6142.4  5354.0  5195.9  2518.0   417.8  1838.7
  Max   Average Geomean Harmean   Min    941.0  1096.4  2166.0  2180.9  2291.5  3357.0
6142.4  2514.4  2014.9  1500.5   324.9  1005.1  1780.7  2871.2  1311.6  1600.5   324.9

                          ## A5 and W2 Same Dual Boot Tablet
                       =Atom R1 and W1 Same Tablet
                             R2 and PC same System


This benchmark measures data reading speeds in megabytes per second (MB/second), carrying out calculations on arrays of cache and RAM data, sized 2 x 8 KB to 2 x 32 MB. Calculations are x[m]=x[m]+s*y[m] and x[m]=x[m]+y[m], using double and single precision floating point, and x[m]=x[m]+s+y[m] and x[m]=x[m]+y[m] with integers. Speed in Millions of Floating Point Operations Per Second (MFLOPS) can be calculated by dividing double precision MB/second by 8 and 16, for the two tests, and single precision speeds by 4 and 8. Assembly listings for the integer tests show that Millions of Instructions Per Second (MIPS) can be found by multiplying MB/second by 0.78 for the test with two adds and by 0.66 for the other test. Cache sizes are indicated by varying performance as memory usage changes. For more details and older results see here, with results up to 2013 in British Library Archives.
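
The conversions described above can be sketched in a few lines of Python. This is an illustration only, not code from the benchmark, and the function and table names are hypothetical. The divisors and multipliers are those quoted in the text. As a check, the T21 Original results below report Maximum SP MFLOPS 1159, which agrees with the 16 KB single precision speed of 4635 MB/second divided by 4.

```python
# Bytes read per floating point operation, as quoted in the text:
# double precision MB/second is divided by 8 and 16, single by 4 and 8.
BYTES_PER_FLOP = {
    ("double", "x+s*y"): 8,
    ("double", "x+y"): 16,
    ("single", "x+s*y"): 4,
    ("single", "x+y"): 8,
}

# Instructions per byte for the integer tests, as quoted in the text.
MIPS_PER_MB = {"x+s+y": 0.78, "x+y": 0.66}


def mflops(mb_per_sec, precision, test):
    """Convert a floating point MB/second figure to MFLOPS."""
    return mb_per_sec / BYTES_PER_FLOP[(precision, test)]


def mips(mb_per_sec, test):
    """Convert an integer MB/second figure to MIPS."""
    return mb_per_sec * MIPS_PER_MB[test]


# T21 Original, 16 KB, single precision x[m]=x[m]+s*y[m]: 4635 MB/second
print(mflops(4635, "single", "x+s*y"))
```

The same functions apply to any row of the result tables, for example mips(3566, "x+s+y") for the T21 Original 16 KB integer figure.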

The native ARM/Intel results, on the Intel Atom based A1, averaged 44% faster than the original translated speeds using L1 cache data, 27% using L2 and 14% from RAM. Running under Android 5.0, the translated benchmark speeds were similar to those of the new version, in most cases. (The original ARM only version can be obtained from here).

Initial measurements, running the new 32 bit version on ARM CPUs, produced similar results to the original benchmark.

First results, to provide 64/32 bit comparisons on ARM CPUs, were on Tablet T22, where average 64/32 bit speed ratios were 2.20 times using cached data and 1.58 times from RAM.
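
As an illustration of how such ratios arise, the following sketch divides one row of the T22 64 bit results by the matching row of the 32 bit results (the 16 KB rows from the tables below). It covers a single row only and is not the averaging method used for the figures quoted above; the helper name is hypothetical.

```python
def speed_ratios(fast, slow):
    """Element by element ratio of two rows of MB/second figures."""
    return [f / s for f, s in zip(fast, slow)]


# T22 16 KB rows: Dble, Sngl, Int for each of the two calculations
t22_32bit_16kb = [1940, 971, 1693, 2470, 1278, 2084]
t22_64bit_16kb = [4092, 2198, 3951, 5293, 3611, 4408]

ratios = speed_ratios(t22_64bit_16kb, t22_32bit_16kb)
mean_ratio = sum(ratios) / len(ratios)

print([round(r, 2) for r in ratios])
print(round(mean_ratio, 2))
```

Every column of this row is more than twice as fast at 64 bits, in line with the cached data average quoted above.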

The benchmark is based on, and is similar to, my original Windows MemSpeed benchmarks, where details and results can be found here. These can be compared with the new Windows tablet version, from a later compiler, with 32 bit and 64 bit results included below. Android results R1 and R2 are via REMIX for Intel PCs, running at 64 bits.

Dual Booting - Results include those for Windows and Android running on the same system. They are dual boot A5 and W2, alternative boot W1 and R1, and alternative boot PC and R2.

Following the results are processor technology comparisons with the ARM Cortex-A9 CPU, based on MB/second divided by CPU MHz, demonstrating that each has its strengths and weaknesses. See comments in comparison table.
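
A minimal sketch of how a comparison entry can be derived, assuming each entry is MB/second per MHz for the device divided by the same quantity for the Cortex-A9 tablet T7 at 1200 MHz. The function names are hypothetical. Using the 16 KB ARM-Intel rows from the results below, this reproduces the T11 L1 entries of 2.49 (double precision) and 3.02 (single precision). Which rows were used for the L2 and RAM entries is not stated here.

```python
def mb_per_sec_per_mhz(mb_per_sec, cpu_mhz):
    """Normalise a MB/second figure by CPU clock speed."""
    return mb_per_sec / cpu_mhz


def relative_to_reference(mb_per_sec, cpu_mhz, ref_mb_per_sec, ref_mhz=1200):
    """MB/second/MHz relative to a reference CPU (assumed: T7 at 1200 MHz)."""
    device = mb_per_sec_per_mhz(mb_per_sec, cpu_mhz)
    reference = mb_per_sec_per_mhz(ref_mb_per_sec, ref_mhz)
    return device / reference


# T11 (1700 MHz) against T7 (1200 MHz), 16 KB, x[m]=x[m]+s*y[m]
t11_l1_double = relative_to_reference(6540, 1700, 1856)
t11_l1_single = relative_to_reference(4359, 1700, 1019)

print(round(t11_l1_double, 2), round(t11_l1_single, 2))
```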

Results are dependent on the particular compiler used. Those for the Windows version were produced by an earlier compiler and are relatively slow at 64 bits. An example of differences is for the first test, with a source code loop, in double precision, that contains four multiplies and four adds. Assembly code produced for Intel CPUs has four scalar SSE2 multiplies and four adds at 32 bits, with two SIMD SSE2 instructions of each at 64 bits. That for ARM has four fmacd floating-point multiply-accumulate instructions, to double precision registers, at 32 bits and two fmla fused multiply-add instructions, to vector registers, at 64 bits. The result is much faster performance at 64 bits. In principle, SIMD instructions could also be used at 32 bits for Intel, but fmla is only available at 64 bits with ARM.
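
The loop shape described above can be illustrated as follows. This is a sketch in Python for readability, not the benchmark's C source, and the function name is hypothetical. It only shows the four multiplies and four adds per pass that the compilers turn into the scalar, SIMD or fused multiply-add instructions mentioned; timing it would measure interpreter overhead, not memory speed.

```python
from array import array


def triad_unrolled(x, y, s):
    """x[m] = x[m] + s * y[m], unrolled four times.

    Assumes len(x) == len(y) and that the length is a multiple of 4.
    """
    for m in range(0, len(x), 4):
        x[m] = x[m] + s * y[m]
        x[m + 1] = x[m + 1] + s * y[m + 1]
        x[m + 2] = x[m + 2] + s * y[m + 2]
        x[m + 3] = x[m + 3] + s * y[m + 3]


# Two small arrays of double precision values
x = array("d", [1.0] * 8)
y = array("d", [2.0] * 8)
triad_unrolled(x, y, 0.5)
print(list(x))
```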

 ##################### T7 Original ######################

 T7, ARM Cortex-A9 1200 MHz, Android 4.1.2, DDR3 5.3 GB/s
 4 x 32 KB L1 cache, 1 MB shared L2 cache

 Android MemSpeed Benchmark 17-Oct-2012 20.19

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   1735    888   2456   2726   1364   2818 L1
      32   1448    760   1474   1700   1039   1648
      64   1318    719   1290   1468    952   1385 L2
     128   1279    715   1289   1443    944   1336
     256   1268    714   1279   1435    943   1313
     512   1158    691   1204   1321    892   1228
    1024    729    553    735    772    632    742
    4096    445    392    425    442    421    439 RAM
   16384    435    390    428    435    412    431
   65536    445    404    393    450    432    449

          Total Elapsed Time   12.2 seconds

 #################### T7 ARM-Intel #####################

 ARM/Intel MemSpeed Benchmark 1.1 25-Apr-2015 12.24

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   1856   1019   2537   2913   1459   2544
      32   1416    832   1327   1508    920   1345
      64   1286    779   1198   1418    908   1296
     128   1282    781   1195   1424    912   1305
     256   1278    774   1190   1433    878   1298
     512   1197    752   1122   1340    862   1216
    1024    833    626    822    903    695    857
    4096    463    420    456    463    440    459
   16384    459    426    453    455    435    458
   65536    463    430    411    462    443    452

          Total Elapsed Time   11.5 seconds

 #################### T11 Original #####################

 T11 Samsung EXYNOS 5250 1700 MHz Cortex-A15, Android 4.2.2
 2 GB DDR3-1600 RAM, dual channel, 12.8 GB/sec
 Dual core, 2 x 32 KB L1 cache, 1 MB shared L2 cache

 Android MemSpeed Benchmark 1.1 09-Aug-2013 17.04

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   7296   4159   3513   9375   5453   6211 L1
      32   7253   4540   3882   7364   4873   4839
      64   6902   4265   3878   7026   4373   4274 L2
     128   6735   4032   2480   4005   2797   3288
     256   5859   3775   2192   4527   3263   3676
     512   5795   3781   3568   6282   3819   3818
    1024   2609   1757   1754   2607   1805   1825
    4096   1614   1422   1471   1654   1342   1441 RAM
   16384   1624   1412   1474   1642   1336   1443
   65536   1617   1408   1479   1368   1321   1423

          Total Elapsed Time   10.7 seconds

 #################### T11 ARM-Intel ####################

 ARM/Intel MemSpeed Benchmark 1.1 23-Apr-2015 12.26

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   6540   4359   4580  10119   6292   6502
      32   8185   5132   4682   8729   4622   4465
      64   5770   3530   3473   5780   3447   3782
     128   5311   3386   3475   5225   3441   3451
     256   5667   3642   3678   5805   3643   3726
     512   5047   3318   3334   4869   3303   3337
    1024   2015   1469   1423   2050   1452   1386
    4096   1535   1322   1342   1598   1381   1385
   16384   1505   1379   1406   1584   1387   1384
   65536   1509   1306   1332   1585   1387   1382

          Total Elapsed Time   10.8 seconds

 #################### T21 Original #####################

 T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
 Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s
 L1 caches 4 x 16 KB, L2 cache shared 2048 KB

 Android MemSpeed Benchmark 1.1 02-Jun-2015 11.01

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   8922   4635   3566  12412   5648   3774 L1
      32   5116   3542   2773   7594   4827   3657 L2
      64   5174   3393   2684   5652   3757   3130
     128   5286   3387   2648   5443   3758   3194
     256   4937   3446   2889   7469   4624   3449
     512   4941   3459   2915   7452   4566   3724
    1024   4837   3449   2848   7065   4455   3722
    4096   2840   2606   2343   2581   2458   2567 RAM
   16384   2606   2423   2232   2395   2238   2338
   65536   2653   2453   2257   2457   2312   2420

          Total Elapsed Time    9.7 seconds

          Maximum SP MFLOPS 1159   Integer MIPS 2802

 #################### T21 ARM-Intel ####################

 ARM/Intel MemSpeed Benchmark 1.1 02-Jun-2015 11.27

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   8074   4831   2603  11252   5065   3892 L1
      32   5302   4138   3709   7252   4985   3693 L2
      64   4801   3510   2832   5739   3684   3015
     128   4502   3783   3577   5991   3914   3547
     256   4907   3913   3934   6876   4280   4056
     512   4686   3883   3921   6236   4215   4060
    1024   4716   3808   3823   6131   4185   3942
    4096   2691   2603   2679   2249   2634   2709 RAM
   16384   2227   2223   2420   1798   2191   2445
   65536   2099   2106   2306   1738   2040   2346

          Total Elapsed Time    9.9 seconds

          Maximum SP MFLOPS 1207   Integer MIPS 2898

 ###################### P37 32 Bit ######################

 P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
 Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
 8 x 32 KB L1 cache, 512 KB shared L2 cache

 ARM/Intel MemSpeed Benchmark 1.2 01-Nov-2016 11.10
 Compiled for 32 bit ARM v7a

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   2343   1175   2093   2810   1547   2522
      32   1895   1132   1983   2788   1494   2360
      64   2168   1130   1943   2718   1474   2302
     128   2170   1124   1942   2725   1472   2281
     256   2198   1134   1960   2769   1483   2315
     512   2019   1085   1799   2438   1390   2087
    1024   1484    953   1396   1636   1152   1532
    4096   1411    950   1368   1538   1131   1471
   16384   1348    950   1306   1441   1112   1370
   65536   1396    943   1318   1490   1132   1452

          Total Elapsed Time   12.0 seconds

 Android 7.0

 ARM/Intel MemSpeed Benchmark 1.2 10-May-2017 10.49
 Compiled for 32 bit ARM v7a

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   2344   1172   2091   2998   1549   2528
      32   2227   1122   2014   2851   1514   2423
      64   2169   1131   1945   2720   1476   2305
     128   2214   1141   1976   2778   1485   2333
     256   2219   1142   1988   2816   1500   2343
     512   2080   1095   1853   2579   1383   2178
    1024   1474    952   1408   1657   1149   1543
    4096   1333    945   1294   1414   1098   1341
   16384   1328    947   1274   1395   1112   1357
   65536   1351    936   1331   1461   1116   1413

          Total Elapsed Time   12.1 seconds

 ###################### T22 32 Bit ######################

 T22, ARM Cortex-A53 1300 MHz, Android 5.0.2
 Single Channel RAM, LPDDR3 666 MHz, 5.3 GB/second
 4 x 32 KB L1 cache, 512 KB L2 cache

 ARM/Intel MemSpeed Benchmark 1.2 05-Aug-2015 17.16
 Compiled for 32 bit ARM v7a

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   1940    971   1693   2470   1278   2084 L1
      32   1879    955   1676   2378   1255   1967
      64   1801    938   1615   2254   1218   1912 L2
     128   1706    941   1620   2279   1224   1872
     256   1818    935   1570   2291   1155   1875
     512   1633    884   1451   2008   1132   1704
    1024   1276    781   1181   1454    938   1324 RAM
    4096   1335    808   1260   1533   1010   1386
   16384   1342    813   1270   1487   1013   1419
   65536   1346    809   1274   1546   1031   1252

          Total Elapsed Time   11.7 seconds

 ###################### T22 64 Bit ######################

 ARM/Intel MemSpeed Benchmark 1.2 05-Aug-2015 17.29
 Compiled for 64 bit ARM v8a

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   4092   2198   3951   5293   3611   4408 L1
      32   3753   2496   3630   4651   3300   3992
      64   3407   2388   3368   3715   3023   3677 L2
     128   3496   2462   3521   4137   3139   3844
     256   3535   2481   3573   4199   3322   3911
     512   3054   2248   3126   3556   2548   3372
    1024   1714   1704   2029   2069   1854   2099 RAM
    4096   1832   1595   1841   1914   1780   1897
   16384   1844   1601   1850   1925   1798   1891
   65536   1859   1608   1837   1921   1795   1812

          Total Elapsed Time   10.2 seconds

 #################### A1 Original #######################

 A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
 Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
 4 x 24 KB L1, 2 x 1 MB L2

 Android MemSpeed Benchmark 1.1 01-Feb-2015 10.06

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   2773   1745   2821   5993   3274   3094 L1
      32   3088   1690   2451   4849   2769   2896 L2
      64   3066   1694   2245   3883   2434   2568
     128   3084   1695   2261   3886   2466   2524
     256   3158   1732   2285   3964   2264   2176
     512   2666   1721   2295   3959   2505   2561
    1024   2938   1659   2163   3567   2356   2443
    4096   2775   1653   2123   3055   2307   2395 RAM
   16384   2827   1659   2121   3208   2321   2411
   65536   2840   1661   2112   3248   2314   2406

          Total Elapsed Time   10.8 seconds

 ############### A1 Original Android 5.0 ################

 Android MemSpeed Benchmark 1.1 26-Oct-2015 16.01

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   3033   1738   4531   8287   3925   4532 L1
      32   3048   1701   3690   5913   3371   3737 L2
      64   3041   1727   3253   4613   2798   3137
     128   3040   1748   3263   4613   2848   3190
     256   3072   1746   3253   4568   2835   3070
     512   3055   1740   3229   4541   2824   3163
    1024   2841   1672   2969   4114   2689   2949
    4096   2727   1658   2797   3157   2643   2784 RAM
   16384   2746   1658   2786   3156   2646   2814
   65536   2672   1658   2807   3150   2643   2808

          Total Elapsed Time   10.4 seconds

 #################### A1 ARM-Intel ######################

 ARM/Intel MemSpeed Benchmark 1.1 23-Apr-2015 11.46

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   3287   1859   4560   9789   4688   7316 L1
      32   3233   1856   3807   6633   3990   4030 L2
      64   3304   1860   2965   4457   2996   3894
     128   3303   1855   3006   4463   3113   3992
     256   3306   1860   2978   4463   3093   3946
     512   3307   1862   2964   4377   3097   3958
    1024   3031   1778   2766   3993   2867   3472
    4096   2863   1776   2692   3129   2763   3046 RAM
   16384   2857   1776   2702   3063   2768   3050
   65536   2865   1765   2702   3176   2782   3087

          Total Elapsed Time   10.1 seconds

 ########### A5 ARM-Intel Dual Boot With W2 #############

 Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
 Android 5.1, 4 GB DDR 3 1600
 4 x 24 KB L1, 2 x 1 MB L2

 ARM/Intel MemSpeed Benchmark 1.2 24-Mar-2016 16.38
 Compiled for 32 bit Intel x86

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   6580   3487   4236   6996   3453   6933 L1
      32   4446   2719   3320   5506   3068   3445 L2
      64   4781   2870   2783   4127   2670   3449
     128   4800   2800   2692   4107   2722   3482
     256   4847   2885   2642   4043   2715   3460
     512   4740   2954   2603   3948   2633   3220
    1024   3623   2444   2424   3376   2317   2850
    4096   2645   2429   2353   2583   2301   2577 RAM
   16384   2576   2390   2335   2603   2295   2545
   65536   2594   2213   2203   2658   2318   1902

          Total Elapsed Time   10.1 seconds

 ################# W1 Windows 10 32 bit #################

 Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
 Windows 10, 4 GB DDR 3 1600
 4 x 24 KB L1, 2 x 1 MB L2

 Memory Reading Speed Benchmark From C/C++ 18.00.21005.1 for x86

 Start of test Wed Dec 23 20:57:00 2015

  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]         x[m]=y[m]
  KBytes   Dble   Sngl  Int32   Dble   Sngl  Int32   Dble   Sngl  Int32
    Used   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S

      16   6533   3377   6232   6541   3239   7031  14212   3650   3444 L1
      32   4955   3008   3732   4656   2764   3582   8596   2339   2268 L2
      64   4502   2616   3202   4891   2841   3393   5115   1992   2106
     128   4741   2818   3767   4963   2996   3206   5177   2110   2165
     256   4795   2889   3096   4662   2864   3251   4947   2059   2088
     512   4792   2921   3452   4518   2924   3292   4936   2022   2230
    1024   4316   2491   3175   4201   2535   3163   4546   1974   1952
    4096   3585   2618   2993   3485   2503   2961   1856   1710   1798 RAM
   16384   3621   2676   3163   3629   2521   3218   1860   1825   1727
   65536   3567   2579   3219   3521   2555   3250   1858   1749   1771

 End of test Wed Dec 23 20:57:15 2015

 #################### W1 REMIX 32 Bit ###################

 R1 Intel Atom Z8300 quad core 1.84 GHz
 Android 6.0.1, 4 GB DDR 3 1600
 4 x 24 KB L1, 2 x 1 MB Shared L2

 ARM/Intel MemSpeed Benchmark 1.2 21-Oct-2016 14.26
 Compiled for 32 bit Intel x86

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16   5904   3269   4009   7013   3653   6236
      32   4879   3106   3331   5841   3251   3499
      64   4169   3037   2863   4306   2783   3554
     128   5001   2612   2882   4309   2810   3568
     256   4973   3085   2504   4276   2795   3535
     512   5003   3084   2616   4269   2773   2734
    1024   2753   2273   1996   3067   2165   3092
    4096   2753   2709   2577   3399   2508   2668
   16384   3503   2617   2586   3513   2573   2010
   65536   2596   2758   2657   3356   2337   3202

          Total Elapsed Time   10.1 seconds

 ################# W1 Windows 10 64 bit #################

 Memory Reading Speed Benchmark From C/C++ 18.00.21005.1 for x64

 Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
 Windows 10, 4 GB DDR 3 1600
 4 x 24 KB L1, 2 x 1 MB L2

 Start of test Wed Dec 23 21:14:38 2015

  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]         x[m]=y[m]
  KBytes   Dble   Sngl  Int32   Dble   Sngl  Int32   Dble   Sngl  Int32
    Used   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S

      16   6553   3300   6519   6338   3134   6891   6406   3565   3518 L1
      32   5132   2814   3431   5130   3119   3451   5114   2156   2196 L2
      64   4348   2973   3207   4674   2802   3189   3334   2075   2216
     128   4600   2929   3082   4335   2974   3391   3234   2164   2020
     256   4351   2826   3529   4507   2930   3192   3256   1945   2059
     512   4673   3009   3415   4594   2698   3357   3278   2059   2072
    1024   4430   2519   2807   4154   2691   2938   2947   2049   1811
    4096   3541   2629   3227   3563   2654   2860   1858   1763   1829 RAM
   16384   3587   2762   3086   3541   2433   3172   1847   1733   1741
   65536   3508   2616   3097   3591   2568   3095   1841   1794   1684

 End of test Wed Dec 23 21:14:52 2015

 #################### W1 REMIX 64 Bit ###################

 R1 Intel Atom Z8300 quad core 1.84 GHz
 Android 6.0.1, 4 GB DDR 3 1600
 4 x 24 KB L1, 2 x 1 MB Shared L2

 ARM/Intel MemSpeed Benchmark 1.2 14-Aug-2016 22.45
 Compiled for 64 bit Intel x86_64

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16  12905   9950  10127  15445  12973  14328 L1
      32   9741   8674  10622  10591  10582  11198 L2
      64   8973   7375   8052   8473   7900   6987
     128   7867   8286   7279   5715   7939   8210
     256   8918   8469   6080   8102   7164   6799
     512   7617   7620   8162   8291   8071   6543
    1024   5752   5526   5682   5715   5693   3446
    4096   2112   3004   3175   2940   2872   2656 RAM
   16384   3068   3328   3362   3362   3358   3361
   65536   2948   3376   2951   3388   3389   2880

          Total Elapsed Time   10.1 seconds

 ######## W2 Windows 10 32 bit Dual Boot With A5 ########

 Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
 Windows 10, 4 GB DDR 3 1600
 4 x 24 KB L1, 2 x 1 MB L2

 Memory Reading Speed Benchmark From C/C++ 18.00.21005.1 for x86

 Start of test Mon Apr 11 23:42:19 2016

  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]         x[m]=y[m]
  KBytes   Dble   Sngl  Int32   Dble   Sngl  Int32   Dble   Sngl  Int32
    Used   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S

      16   6614   3624   6378   6504   3641   7247  14229   3548   3399 L1
      32   5123   3040   4017   5262   3055   3889   8664   2453   2423 L2
      64   4945   3006   3551   4974   2971   3531   5599   2293   2233
     128   4644   3009   3709   4983   3009   3516   5499   2134   2229
     256   4979   2964   3545   4986   3007   3520   5405   2251   2244
     512   4926   2991   3381   4914   2995   3550   5072   2221   1671
    1024   3400   2670   3343   4601   2723   3166   4483   2089   2093
    4096   2785   2582   2741   2747   2573   2720   1417   1415   1386 RAM
   16384   2678   2391   2730   2778   2501   2719   1409   1376   1412
   65536   2771   2540   2719   2754   2587   2749   1403   1403   1413

 End of test Mon Apr 11 23:42:34 2016

 ######## W2 Windows 10 64 bit Dual Boot With A5 ########

 Memory Reading Speed Benchmark From C/C++ 18.00.21005.1 for x64

 Start of test Mon Apr 11 23:46:41 2016

  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]         x[m]=y[m]
  KBytes   Dble   Sngl  Int32   Dble   Sngl  Int32   Dble   Sngl  Int32
    Used   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S

      16   6304   3285   7069   7052   3639   7304   7244   3641   3634 L1
      32   5264   3120   3665   5317   3117   3687   5451   2462   2406 L2
      64   4998   3074   3529   4999   3084   3568   3553   2294   2184
     128   4718   3084   3541   4973   3052   3552   3557   2226   2240
     256   4971   3074   3552   4977   3003   3538   3421   2245   2228
     512   4932   3083   3514   4953   3049   3546   3352   2228   2034
    1024   4664   2953   3329   4631   2916   3367   3106   2101   2078
    4096   2712   2553   2668   2686   2604   2687   1379   1354   1402 RAM
   16384   2787   2602   2737   2751   2586   2728   1417   1414   1413
   65536   2784   2620   2748   2703   2576   2751   1396   1418   1399

 End of test Mon Apr 11 23:46:56 2016

 #################### PC REMIX 32 Bit ###################

 R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

 ARM/Intel MemSpeed Benchmark 1.2 21-Oct-2016 12.32
 Compiled for 32 bit Intel x86

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16  33681  17997  16567  33854  19686  20530 L1
      32  35646  18564  17469  33563  20770  20563
      64  29646  18807  17361  33636  20333  20312 L2
     128  32277  18825  17385  33675  20368  20335
     256  30728  18777  17344  31768  20100  20087
     512  25499  18128  17017  26094  18350  18479 L3
    1024  25420  15696  17017  25866  18351  18481
    4096  24527  17773  16644  25451  18056  18164
   16384  15718  13805  13411  15827  13467  13619 RAM
   65536  15012  13362  13045  15185  13042  13179

          Total Elapsed Time   11.1 seconds

 #################### PC REMIX 64 Bit ###################

 R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

 ARM/Intel MemSpeed Benchmark 1.2 14-Aug-2016 14.15
 Compiled for 64 bit Intel x86_64

           Reading Speed in MBytes/Second
  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
  KBytes   Dble   Sngl    Int   Dble   Sngl    Int

      16  64599  46528  48775  74152  58161  53413 L1
      32  64632  47018  50799  79043  61563  57263
      64  49628  42403  42997  51183  46691  46244 L2
     128  48556  42075  42700  49985  46092  45831
     256  44225  37976  38268  46194  41269  41254
     512  31103  29151  29336  31472  30662  30681 L3
    1024  30323  28673  28768  30925  29883  29889
    4096  30004  28313  28366  30483  29488  29511
   16384  15360  15647  15577  15118  15409  15430 RAM
   65536  14766  15072  15004  14623  14807  14825

          Total Elapsed Time    9.8 seconds

 ========================================================
      Top end 2015 PC - Core i7-4820K at 3.9 GHz
 ========================================================

  Memory  x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]         x[m]=y[m]
  KBytes   Dble   Sngl  Int32   Dble   Sngl  Int32   Dble   Sngl  Int32
    Used   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S   MB/S

 32 Bit
      16  33625  18739  18205  33727  19627  19643  53146  14729  14722 L1
     256  31208  18912  17930  30724  19812  20422  29081  13025  12998 L2
    4096  24497  17539  17846  22309  14066  16152  17040  10884  10876 L3
   65536  14722  12402  12522  14576  12782  12889   7398   7598   7585 RAM

 64 Bit
      16  33775  16842  14901  33782  16948  16924  29180  14652  14666 L1
     256  31182  17652  15752  33216  17847  17604  23216  13254  13286 L2
    4096  25009  17093  15360  25601  17336  17079  14941  10864  10880 L3
   65536  14696  10754  12497  15026  13235  13400   7496   7652   7667 RAM

 =============================================================
         MB/sec/MHz Comparisons With 32 Bit Cortex-A9
 =============================================================

               x[m]=x[m]+s*y[m] Int+    x[m]=x[m]+y[m]
                Dble   Sngl    Int   Dble   Sngl    Int
 32 Bit Only

 Android

 T11      L1    2.49   3.02   1.27   2.45   3.04   1.80
 Cortex   L2    3.13   3.32   2.18   2.86   2.93   2.03
 A15      RAM   2.30   2.14   2.29   2.42   2.21   2.16

 T21      L1    2.43   2.65   0.57   2.16   1.94   0.85
 Qualcomm L2    2.14   2.82   1.85   2.68   2.72   1.74
 800      RAM   2.53   2.73   3.13   2.10   2.57   2.90

 A1       L1    1.14   1.18   1.16   2.17   2.07   1.86
 Atom     L2    1.67   1.55   1.61   2.01   2.27   1.96
 Z3745    RAM   3.99   2.65   4.24   4.44   4.05   4.41

 A5       L1    2.31   2.23   1.09   1.57   1.54   1.78
 Atom     L2    2.47   2.43   1.45   1.84   2.02   1.74
 z8300    RAM   3.65   3.36   3.50   3.75   3.41   2.74

 P37      L1    1.01   0.92   0.66   0.77   0.85   0.79
 Cortex   L2    1.38   1.17   1.32   1.55   1.35   1.43
 A53      RAM   2.41   1.75   2.57   2.58   2.04   2.57

 ###########################################################

 32 Bit and 64 Bit - Android on ARM CPU and via REMIX with
 Intel are more than twice as fast at 64 bits. Intel CPUs
 are much faster than ARM. Windows speeds are similar at
 32 and 64 bits.

 Android

 T22 32b  L1    0.96   0.88   0.62   0.78   0.81   0.76
 Cortex   L2    1.31   1.12   1.22   1.48   1.21   1.33
 A53      RAM   2.68   1.74   2.86   3.09   2.15   2.56

 T22 64b  L1    2.04   1.99   1.44   1.68   2.28   1.60
 Cortex   L2    2.55   2.96   2.77   2.70   3.49   2.78
 A53      RAM   3.71   3.45   4.13   3.84   3.74   3.70

 REMIX/Android

 R1 32b   L1    2.07   2.09   1.03   1.57   1.63   1.60
 Atom     L2    2.54   2.60   1.37   1.95   2.08   1.78
 Z8300    RAM   3.66   4.18   4.22   4.74   3.44   4.62

 R1 64b   L1    4.53   6.37   2.60   3.46   5.80   3.67
 Atom     L2    4.55   7.14   3.33   3.69   5.32   3.42
 Z8300    RAM   4.15   5.12   4.68   4.78   4.99   4.16

 R2 32b   L1    5.58   5.43   2.01   3.58   4.15   2.48
 Core i7  L2    7.40   7.46   4.48   6.82   7.04   4.76
 4820K    RAM   9.98   9.56   9.77  10.11   9.06   8.97

 R2 64b   L1   10.71  14.05   5.92   7.83  12.27   6.46
 Core i7  L2   10.65  15.10   9.89   9.92  14.46   9.78
 4820K    RAM   9.81  10.78  11.23   9.74  10.28  10.09

 Windows

 PC 32b   L1    5.57   5.66   2.21   3.56   4.14   2.38
 Core i7  L2    6.78   6.99   4.16   6.27   6.63   4.67
 4820K    RAM   3.52   4.90   3.22   3.16   4.33   3.06

 PC 64b   L1    5.60   5.09   1.81   3.57   3.57   2.05
 Core i7  L2    6.78   6.53   3.65   6.78   5.97   4.03
 4820K    RAM   3.52   4.25   3.21   3.26   4.48   3.18

 W1 32b   L1    2.30   2.16   1.60   1.46   1.45   1.80
 Atom     L2    2.45   2.43   1.70   2.12   2.13   1.63
 Z8300    RAM   5.02   3.91   5.11   4.97   3.76   4.69

 W1 64b   L1    2.30   2.11   1.68   1.42   1.40   1.77
 Atom     L2    2.22   2.38   1.93   2.05   2.18   1.60
 Z8300    RAM   4.94   3.97   4.91   5.07   3.78   4.47

 W2 32b   L1    2.32   2.32   1.64   1.46   1.63   1.86
 Atom     L2    2.54   2.50   1.94   2.27   2.23   1.77
 z8300    RAM   3.90   3.85   4.31   3.89   3.81   3.97

 W2 64b   L1    2.22   2.10   1.82   1.58   1.63   1.87
 Atom     L2    2.54   2.59   1.95   2.27   2.23   1.78
 z8300    RAM   3.92   3.97   4.36   3.82   3.79   3.97


This benchmark carries out the same calculations as the MemSpeed Benchmark, measuring data reading speeds in megabytes per second, with functions accessing arrays of cache and RAM based data, sized 2 x 8 KB to 2 x 32 MB. Calculations are x[m]=x[m]+s*y[m] and x[m]=x[m]+y[m] using single precision floating point, with x[m]=x[m]+s+y[m] and x[m]=x[m]+y[m] using integers. Speed in Millions of Floating Point Operations Per Second (MFLOPS) can be calculated by dividing single precision MB/second by 4 and 8, for the two tests. The first set of calculations uses normal functions, followed by some using NEON Intrinsic Functions. The last two columns are NEON only results. For further details and results see android neon benchmarks.htm.
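
The gain from NEON intrinsic functions over normal code can be read from the paired Norm and Neon columns. A minimal sketch, using the T22 32 bit 16 KB row from the results below; the function name is hypothetical and this is not part of the benchmark.

```python
def neon_gain(norm_mb_per_sec, neon_mb_per_sec):
    """Speed of the NEON intrinsics version relative to normal code."""
    return neon_mb_per_sec / norm_mb_per_sec


# T22 32 bit, 16 KB row: Float Norm 971, Float Neon 3853,
# Int Norm 1807, Int Neon 4059 (MB/second)
float_gain = neon_gain(971, 3853)
int_gain = neon_gain(1807, 4059)

print(round(float_gain, 2), round(int_gain, 2))
```

On this row the floating point test gains far more from NEON than the integer test does.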

On tablet A1, with the Intel Atom CPU, the 32 bit native code version produced some significant performance gains over the original ARM benchmark (available from here), but rerunning this via Android 5.0 produced much faster speeds, some better than native code compilation.

The later compiler produced some slower and some faster speeds on ARM based tablets.

Details are provided for the 64 bit version on T22. As with NEON-Linpack, many results from 32 bit and 64 bit compilations, via NEON intrinsic functions, were similar. With normal code, the 64 bit compilations were up to nearly four times faster than those at 32 bits.

Following the results are further MB per second/CPU MHz comparisons. Subject to variations due to cache occupancy, the comparisons for normal calculations are the same as MemSpeed. Then, more modern processors performed relatively better, using NEON instructions. See comments in comparison table.

 ##################### T7 Original ######################

 T7, ARM Cortex-A9 1200 MHz, Android 4.1.2,

 Android NeonSpeed Benchmark 15-Dec-2012 14.38

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16    860   2575   2325   2918   3053   3245 L1
      32    950   2551   2400   2823   2944   3131
      64    744   1396   1329   1434   1465   1496 L2
     128    713   1342   1319   1365   1392   1417
     256    714   1339   1311   1357   1377   1400
     512    708   1323   1299   1348   1358   1383
    1024    608    875    869    917    930    952
    4096    460    493    492    481    488    504 RAM
   16384    460    498    487    507    506    504
   65536    459    495    469    251    503    505

          Total Elapsed Time   11.5 seconds

 #################### T7 ARM-Intel #####################

 ARM/Intel NeonSpeed Benchmark V1.1 09-May-2015 18.07

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16    881   2440   2501   3334   3206   3465
      32    901   1868   1705   2260   2083   2186
      64    801   1395   1365   1573   1548   1581
     128    784   1282   1278   1405   1389   1411
     256    787   1279   1285   1420   1380   1409
     512    777   1266   1267   1409   1370   1394
    1024    604    786    762    769    770    828
    4096    458    479    477    463    486    488
   16384    436    447    448    469    470    469
   65536    450    472    469    240    482    483

          Total Elapsed Time   11.5 seconds

 #################### T11 Original #####################

 T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2

 Android NeonSpeed Benchmark V1.1 09-Aug-2013 17.10

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16   3793   9641   4375  13023  13456  13562 L1
      32   5777  11410   4993  11718  11365  11143
      64   4122   6692   3855   6539   6682   7210 L2
     128   4017   6565   3849   6475   6520   6983
     256   4067   6562   3836   6459   6495   7038
     512   3900   6531   3820   6428   6490   7095
    1024   1821   2544   1774   2532   2554   2539
    4096   1141   1645   1536   1612   1615   1635 RAM
   16384   1437   1695   1490   1576   1694   1668
   65536   1424   1675   1475   1699   1687   1694

          Total Elapsed Time   11.2 seconds

 #################### T11 ARM-Intel ####################

 ARM/Intel NeonSpeed Benchmark V1.1 09-May-2015 18.17

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16   2252   4964   3321   6602   7304   7237
      32   4202   8364   4543   8366   8553   8101
      64   3710   6096   3860   6570   6348   6182
     128   3802   5581   3874   6044   5624   5877
     256   3654   5618   3501   6154   5655   5783
     512   3597   5688   3723   6130   5812   5684
    1024   1727   2466   1659   2481   2454   2472
    4096   1479   1718   1421   1714   1713   1706
   16384   1488   1704   1435   1576   1705   1694
   65536   1477   1755   1453   1754   1759   1752

          Total Elapsed Time   10.8 seconds

 #################### T21 Original #####################

 T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
 Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s

 Android NeonSpeed Benchmark V1.1 23-Jul-2015 13.00

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16   4324  13809   4498  14660  17501  18186 L1
      32   3587   6845   2922   8073   6981   7035 L2
      64   3347   6894   2912   8078   6964   6938
     128   3343   6651   2919   7922   6726   6999
     256   3511   6963   3002   8071   6902   6897
     512   3476   6628   3025   7827   6613   6818
    1024   3172   4627   2773   6424   4800   4806
    4096   2653   2051   2378   3613   2090   2054 RAM
   16384   2356   1891   2118   3165   1955   1962
   65536   2424   1923   2167   3368   1933   1925

          Total Elapsed Time    9.9 seconds

 #################### T21 ARM-Intel ####################

 ARM/Intel NeonSpeed Benchmark V1.1 23-Jul-2015 13.03

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16   3623  16704   4623  15187  17446  16719
      32   3455   9210   2997   8723   9280   9112
      64   3336   7721   3002   8544   8469   8581
     128   3415   7664   3111   8481   7549   7638
     256   3584   7526   3087   8500   7849   7805
     512   3538   7422   3154   8266   7567   7541
    1024   3513   7227   3067   7789   7294   7261
    4096   2302   1673   2413   3107   1693   1677
   16384   2286   1616   2323   3024   1620   1617
   65536   2322   1617   2271   2505   1634   1600

          Total Elapsed Time    9.9 seconds

 ###################### P37 32 Bit ######################

 P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
 Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
 8 x 32 KB L1 cache, 512 KB shared L2 cache

 ARM/Intel NeonSpeed Benchmark V1.2 11-Nov-2016 10.32
 Compiled for 32 bit ARM v7a

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16    993   4666   2185   4895   4775   5308
      32   1171   4637   2177   4874   4745   5267
      64   1132   3995   2037   4148   4058   4370
     128   1142   4110   2073   4280   4196   4492
     256   1143   4136   2072   4332   4224   4524
     512   1091   3462   1837   3145   3147   3658
    1024    942   1831   1462   1839   1810   1857
    4096    935   1701   1377   1643   1648   1731
   16384    934   1623   1321   1513   1631   1649
   65536    916   1556   1303   1559   1567   1621

          Total Elapsed Time   11.4 seconds

 Android 7.0

 ARM/Intel NeonSpeed Benchmark V1.2 10-May-2017 10.53
 Compiled for 32 bit ARM v7a

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16   1174   4673   2189   4908   4779   5312
      32   1152   4271   2100   4478   4370   4784
      64   1129   3961   2030   4122   4036   4335
     128   1131   4001   2002   4195   4108   4386
     256   1129   4038   2049   4230   4138   4393
     512   1068   2970   1815   3043   3012   3158
    1024    933   1724   1422   1733   1723   1756
    4096    914   1533   1279   1297   1385   1438
   16384    887   1502   1287   1511   1515   1516
   65536    902   1322   1221   1113   1530   1479

          Total Elapsed Time   11.7 seconds

 ###################### T22 32 Bit ######################

 T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2

 ARM/Intel NeonSpeed Benchmark V1.2 13-Aug-2015 16.32
 Compiled for 32 bit ARM v7a

        Vector Reading Speed in MBytes/Second
  Memory  Float v=v+s*v   Int v=v+v+s   Neon v=v+v
  KBytes   Norm   Neon   Norm   Neon  Float    Int

      16    971   3853   1807   4059   3957   4397 L1
      32    970   3812   1800   3983   3891   4323
      64    927   3228   1605   3038   3269   3521 L2
     128    926   3321   1681   3343   3354   3596
     256    936   3386   1693   3449   3413   3667
     512    898   2889   1578   2996   2927   3118
    1024    794   1859   1345   2057   1996   1924 RAM
    4096    794   1796   1250   1788   1813   1835
   16384    792   1773   1270   1820   1829   1864
   65536    796   1811   1289   1852   1832   1880

          Total Elapsed Time   11.3 seconds

 ###################### T22 64 Bit ######################

 ARM/Intel NeonSpeed Benchmark V1.2 13-Aug-2015 16.37
 Compiled for 64 bit ARM v8a

        Vector Reading Speed
in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 3054 4055 3605 4376 4911 5094 32 2922 3787 3435 4198 4546 4682 64 2795 3514 3259 3658 4050 4116 128 2886 3529 3373 3924 4148 3963 256 2883 3641 3264 3942 4193 4276 512 2454 3165 2985 3385 3586 3542 1024 1633 2000 1835 2043 2114 2105 4096 1738 1893 1899 1900 1956 1955 16384 1757 1870 1886 1802 1921 1846 65536 1755 1875 1870 1903 1936 1937 Total Elapsed Time 10.2 seconds #################### A1 Original ####################### A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4 Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s Android NeonSpeed Benchmark V1.1 02-Feb-2015 17.09 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 1778 3940 2807 5474 4997 5062 L1 32 1781 3576 2636 4431 4316 4291 L2 64 1772 3589 2639 4480 4337 4332 128 1784 3589 2641 4423 4320 4320 256 1766 3592 2642 4400 4347 4358 512 1784 3585 2633 4375 4350 4355 1024 1705 3253 2448 3760 3789 3788 4096 1673 3021 2366 3257 3245 3237 RAM 16384 1672 2948 2349 3062 3157 3151 65536 1675 2967 2345 3190 3168 3168 Total Elapsed Time 10.8 seconds ############### A1 Original Android 5.0 ################ Android NeonSpeed Benchmark V1.1 04-Nov-2015 10.49 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 1746 5026 4902 7781 8155 8196 32 1720 4652 4031 5868 6168 6192 64 1641 4611 4043 5909 6233 6240 128 1740 4709 4022 5811 6209 6216 256 1740 4710 4040 5887 6222 6218 512 1714 4526 3862 5594 5886 5893 1024 1630 4134 3642 4876 5021 4740 4096 1589 3344 3183 3407 3415 3412 16384 1608 3329 3180 3385 3409 3390 65536 1470 2962 3166 2778 3380 3384 Total Elapsed Time 10.1 seconds #################### A1 ARM-Intel ###################### ARM/Intel NeonSpeed Benchmark V1.1 09-May-2015 16.54 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm 
Neon Float Int 16 1816 5996 4916 6244 6882 6880 32 1851 4703 3985 5200 5609 5711 64 1862 3845 3121 4174 4441 4520 128 1841 3929 3110 4179 4411 4487 256 1863 3932 3092 4179 4412 4493 512 1861 3938 3090 3894 4215 4415 1024 1784 3475 2738 3130 3223 3443 4096 1741 2376 2649 2998 3112 3139 16384 1774 3086 2780 3116 3140 3145 65536 1774 2987 2547 2328 3126 3072 Total Elapsed Time 10.1 seconds #################### A5 ARM Intel ###################### Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Android 5.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 ARM/Intel NeonSpeed Benchmark V1.2 28-Mar-2016 11.19 Compiled for 32 bit Intel x86 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 3523 6058 4676 6052 6245 6163 32 2926 5163 3585 5275 5711 5322 64 2988 4406 2894 4892 4786 4713 128 2691 4085 2876 4648 4657 4662 256 2856 4274 2912 4865 4752 4714 512 2848 3929 2635 4550 4557 4509 1024 2613 3536 2634 3629 3597 3633 4096 2259 1928 2229 2407 2512 2503 16384 1654 1849 2373 2543 2592 2595 65536 2375 2632 2429 1902 2711 2731 Total Elapsed Time 10.3 seconds ################### W1 REMIX Original ################## R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 Android NeonSpeed Benchmark V1.1 13-Nov-2016 11.09 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 3628 9456 4464 11833 11028 10050 32 2642 7006 3741 7068 7350 7470 64 3162 6829 3851 7529 7204 7487 128 2949 7429 3861 6523 6536 6627 256 3153 6595 2768 7480 7501 7505 512 3151 7462 3866 7493 7502 7511 1024 2793 4896 2591 4893 4907 4923 4096 2308 3711 2641 3837 3840 3840 16384 2766 3824 2806 3707 3835 3836 65536 2553 3647 3168 3844 3837 3841 Total Elapsed Time 10.9 seconds #################### W1 REMIX 32 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 
ARM/Intel NeonSpeed Benchmark V1.2 21-Oct-2016 14.41 Compiled for 32 bit Intel x86 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 2932 5981 4826 6094 6518 6259 32 2673 4381 3587 5437 5972 6017 64 3038 4106 2978 4947 4762 4707 128 3042 4475 2738 4566 4777 4740 256 2865 4487 2978 4210 4774 4748 512 2950 4489 2964 4945 4153 4792 1024 2329 2798 2115 2712 2460 2745 4096 2473 2784 2502 3371 3441 3396 16384 2681 3306 2546 3397 3017 2590 65536 2092 3518 2743 2969 3589 3590 Total Elapsed Time 10.7 seconds #################### W1 REMIX 64 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel NeonSpeed Benchmark V1.2 13-Nov-2016 11.11 Compiled for 64 bit Intel x86_64 Can't run - Not an ARMv7 CPU ################### PC REMIX Original ################## R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, Android NeonSpeed Benchmark V1.1 13-Nov-2016 11.29 Vector Reading Speed in MBytes/Second Memory Float v=v+s*v Int v=v+v+s Neon v=v+v KBytes Norm Neon Norm Neon Float Int 16 14705 42623 16797 46797 52332 52696 32 14750 42792 17798 49624 54033 54690 64 15576 37757 17823 41115 41340 43711 128 15603 37884 17864 41382 43787 44439 256 15595 34071 17867 36314 38923 39580 512 15466 27798 17871 28896 30298 30414 1024 15353 27510 17998 28548 29643 29725 4096 14411 27435 18067 28355 29534 29617 16384 13472 15856 14241 15865 15732 15711 65536 13154 15139 13886 14634 14979 14942 Total Elapsed Time 10.6 seconds #################### PC REMIX 32 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel NeonSpeed Benchmark V1.2 21-Oct-2016 12.53 Compiled for 32 bit Intel x86 Can't run - CPU doesn't support NEON #################### PC REMIX 
64 Bit ###################

 R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

 ARM/Intel NeonSpeed Benchmark V1.2 13-Nov-2016 11.30
 Compiled for 64 bit Intel x86_64

 Can't run - Not an ARMv7 CPU

 ========================================================
          MB/sec/MHz Comparisons With Cortex-A9
 ========================================================

                Float v=v+s*v   Int v=v+v+s    Neon v=v+v
                  Norm   Neon   Norm   Neon  Float    Int

 32 Bit Only - Neon not always faster

 Android
 T11       L1     3.04   2.79   1.23   2.76   2.96   2.76
 Cortex    L2     3.65   3.62   2.11   3.21   3.32   3.53
 A15       RAM    2.23   2.50   2.22   5.00   2.47   2.48

 T21       L1     2.30   3.82   1.03   2.54   3.04   2.69
 Qualcomm  L2     2.54   3.28   1.34   3.34   3.17   3.09
 800       RAM    2.88   1.91   2.70   5.83   1.89   1.85

 A1        L1     1.33   1.59   1.27   1.21   1.38   1.28
 Atom      L2     1.53   1.98   1.55   1.90   2.06   2.06
 Z3745     RAM    2.54   4.08   3.50   6.26   4.18   4.10

 A5        L1     2.61   1.62   1.22   1.18   1.27   1.16
 Atom      L2     2.37   2.18   1.48   2.23   2.25   2.18
 z8300     RAM    3.44   3.64   3.38   5.17   3.67   3.69

 P37       L1     0.90   1.53   0.70   1.17   1.19   1.23
 Cortex    L2     1.16   2.59   1.29   2.44   2.45   2.57
 A53       RAM    1.63   2.64   2.22   5.20   2.60   2.68

 ###########################################################

 32 Bit and 64 Bit - Normal code on ARM CPUs faster at 64 bits
 but similar via Neon instructions. Intel CPU 64 bit speeds
 not available via Android/REMIX.

 Android
 T22 32b   L1     1.02   1.46   0.67   1.12   1.14   1.17
 Cortex    L2     1.10   2.44   1.22   2.24   2.28   2.40
 A53       RAM    1.68   3.66   2.62   3.58   3.59   3.67

 T22 64b   L1     3.20   1.53   1.33   1.21   1.41   1.36
 Cortex    L2     3.38   2.63   2.34   2.56   2.80   2.80
 A53       RAM    3.72   3.86   3.89   3.55   3.77   3.63

 R1 32b    L1     2.17   1.60   1.26   1.19   1.33   1.18
 Atom      L2     2.37   2.29   1.51   1.93   2.26   2.20
 Z8300     RAM    3.03   4.86   3.81   8.07   4.86   4.85

 R1 64b    Can't run - Not an ARMv7 CPU

 R2 32b    Can't run - Not an ARMv7 CPU
           Following is original
 R2 32b    L1     5.14   5.37   2.07   4.32   5.02   4.68
 Core i7   L2     6.10   8.20   4.28   7.87   8.68   8.64
 4820K     RAM    8.99   9.87   9.11  18.76   9.56   9.52

 R2 64b    Can't run - Not an ARMv7 CPU


The BusSpeed benchmark (based on the PC version, with details and results here) is designed to identify data being read in bursts over buses. The program starts by reading a word (4 bytes), with an address increment of 32 words (128 bytes) before reading another word. The increment is halved on successive tests, until all data is read. When reading data from RAM, 64 byte bursts are typically used. Measured reading speed then falls from a maximum, when all data is read, to a minimum at 16 word increments (64 bytes), where only one word from each burst is used. Potential maximum speed can be estimated by multiplying this minimum value by 16. With this burst size, measured speeds at 32 word and 16 word increments are likely to be the same. Cache sizes are indicated by varying speed as memory use changes. Note that, with the smallest L1 cache demands, measured speed can be low due to overheads when reading little data. For more details and further results see here, with results up to 2013 in the British Library Archives.
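The address increments and the "multiply by 16" estimate described above can be sketched as follows. This is a minimal illustration in Python, not the benchmark's own code; the function names are hypothetical, and the 64 byte burst size is the typical value quoted in the text rather than a measured one.

```python
WORD_BYTES = 4    # the benchmark reads 4 byte words
BURST_BYTES = 64  # assumption: typical RAM burst size quoted in the text
WORDS_PER_BURST = BURST_BYTES // WORD_BYTES  # 16 words


def increments():
    """Yield the address increments in words: 32, 16, 8, 4, 2, 1 (read all)."""
    inc = 32
    while inc >= 1:
        yield inc
        inc //= 2


def words_used_per_burst(increment_words):
    """Words actually read from each burst that has to be fetched."""
    return max(1, WORDS_PER_BURST // increment_words)


def estimated_max_speed(min_mb_per_sec):
    """Estimate potential maximum MB/second from the 16 word increment speed."""
    return min_mb_per_sec * WORDS_PER_BURST


for inc in increments():
    print(f"Inc{inc:<2} uses {words_used_per_burst(inc):2} of "
          f"{WORDS_PER_BURST} words in each burst fetched")
```

For example, `estimated_max_speed(56)` applies the rule to an Inc16 RAM speed of 56 MB/second. Increments of 32 and 16 words both use a single word per fetched burst, which is why the two measured speeds are likely to be the same.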

When comparing results from different versions on a particular system, there can be unusual differences in burst reading speeds. The comparisons quoted here are for the most important measurements, those for reading all data.

On the Intel Atom based tablet A1, there was little difference between the old ARM version, run with conversion, and the new 32 bit native code program, nor between using Android 5.0 and Android 4.4.

Average performance improvements of the revised 32 bit version, from caches/RAM, were 8%/17% for T7 Cortex-A9, 11%/27% for T11 Cortex-A15 and 27%/-8% on T21 Snapdragon 800. Corresponding T22 Cortex-A53 improvements of the 64 bit version over the 32 bit version were 61%/25%.
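One way of arriving at such averages is sketched below in Python, using the Read All speeds from the T22 32 bit and 64 bit results tables. The averaging method (mean of per-row ratios) and the split into cache and RAM rows are assumptions; the source does not state how its percentages were calculated.

```python
# Read All speeds (MB/second) copied from the T22 32 bit and 64 bit tables,
# for memory sizes 16, 32, 64, 128, 256, 512, 1024, 4096, 16384, 65536 KB.
T22_32BIT = [2263, 2386, 2332, 2351, 2327, 2120, 2129, 2126, 2136, 2129]
T22_64BIT = [4462, 4144, 3585, 3525, 3260, 3202, 2750, 2652, 2663, 2611]

# Assumption: the 16 KB to 512 KB rows count as cache, larger sizes as RAM.
CACHE_ROWS = 6


def average_improvement_percent(old, new):
    """Mean of the per-row speed ratios, expressed as a percentage gain."""
    ratios = [n / o for o, n in zip(old, new)]
    return (sum(ratios) / len(ratios) - 1.0) * 100.0


cache_gain = average_improvement_percent(T22_32BIT[:CACHE_ROWS],
                                         T22_64BIT[:CACHE_ROWS])
ram_gain = average_improvement_percent(T22_32BIT[CACHE_ROWS:],
                                       T22_64BIT[CACHE_ROWS:])
print(f"caches/RAM improvement: {cache_gain:.0f}%/{ram_gain:.0f}%")
```

With these inputs the result rounds to 61%/25%, in line with the T22 figures quoted above.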

Following the results are further MB per second per CPU MHz comparisons for this integer data streaming benchmark, which can demonstrate maximum data transfer speed from RAM. As the latter might not depend on CPU speed, direct MB/second comparisons are also provided. These depend on bus speed, on 32 bit or 64 bit bus width and on whether one or two channels are available, one problem being that it is often difficult to identify what is provided. Note that multithreaded benchmarks might be needed to fully utilise memory bandwidth - see later results.
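The two kinds of comparison can be sketched as below, in Python with hypothetical function names. The example uses the Read All speeds at 65536 KB from the T11 and T7 ARM-Intel tables; that the comparison table is based on this particular row is an assumption, although it reproduces the T11 RAM entries of 3.13 and 4.44.

```python
def relative_mb_per_sec(speed, ref_speed):
    """Direct MB/second ratio against the reference system."""
    return speed / ref_speed


def relative_mb_per_sec_per_mhz(speed, mhz, ref_speed, ref_mhz):
    """MB/second per CPU MHz, relative to the reference system."""
    return (speed / mhz) / (ref_speed / ref_mhz)


# T11 Cortex-A15 at 1700 MHz compared with T7 Cortex-A9 at 1200 MHz,
# Read All at 65536 KB from the ARM-Intel BusSpeed results.
T11_SPEED, T11_MHZ = 1721, 1700
T7_SPEED, T7_MHZ = 388, 1200

print(f"RAM MB/sec ratio:     {relative_mb_per_sec(T11_SPEED, T7_SPEED):.2f}")
print(f"RAM MB/sec/MHz ratio: "
      f"{relative_mb_per_sec_per_mhz(T11_SPEED, T11_MHZ, T7_SPEED, T7_MHZ):.2f}")
```

Dividing by MHz removes the effect of clock speed, which is useful for cache based results, whereas the direct ratio is the more meaningful one when RAM speed is limited by the memory bus instead of the CPU.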

Results of the Windows version are also included for a tablet and, for comparison purposes, a desktop PC with 4 memory channels. Intel systems have 64 bit bus widths.
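The maximum bandwidth figures quoted in the configuration details are consistent with transfer rate times bus width times number of channels. A minimal Python sketch follows; the function name is hypothetical, and the transfer rates are read from the memory type names (for example 1866 from LPDDR3-1866).

```python
def peak_bandwidth_mb_per_sec(mega_transfers_per_sec, bus_width_bits, channels):
    """Theoretical peak in MB/second: transfers x bytes per transfer x channels."""
    return mega_transfers_per_sec * (bus_width_bits // 8) * channels


# T21: dual channel 32 bit LPDDR3-1866, quoted as 14.9 GB/s
print(peak_bandwidth_mb_per_sec(1866, 32, 2) / 1000)

# A1: dual channel LPDDR3-1066 with a 64 bit bus width, quoted as 17.1 GB/s
print(peak_bandwidth_mb_per_sec(1066, 64, 2) / 1000)
```

Measured single thread speeds in the tables are well below these theoretical peaks, which fits the earlier note that multithreaded benchmarks might be needed to fully utilise memory bandwidth.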

Intel CPUs - Results on the Atom Z8300 are similar across different compilers and operating systems, using Android (A5), REMIX/Android (R1) and Windows (W1 and W2). Where both are available, 32 bit and 64 bit versions have similar performance. RAM speeds tend to be faster than those on ARM based systems, due to 64 bit bus widths. As would be expected, Core i7 (R2) speeds are superior, based on MB/second per MHz and, particularly, on RAM MB/second comparisons. See also the comments in the comparison table.

ARM CPUs - With 32 bit versions, MB/second per MHz comparisons with the older Cortex-A9 tend to be worse using L1 cache but better from L2 and RAM. The only 64 bit version results available are for the T22 Cortex-A53, demonstrating faster L1 cache based tests, with smaller improvements from L2 and RAM.

##################### T7 Original ###################### T7, ARM Cortex-A9 1200 MHz, Android 4.1.2, DDR3 5.3 GB/s 4 x 32 KB L1 cache, 1 MB shared L2 cache Android BusSpeed Benchmark 19-Oct-2012 17.29 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 2723 2420 3044 3364 3499 3500 L1 32 1054 1087 1061 1382 1565 2145 64 436 433 419 652 751 1160 L2 128 345 337 337 542 633 943 256 329 309 322 522 614 961 512 339 299 311 506 574 937 1024 170 168 180 269 349 629 4096 59 55 84 127 176 338 RAM 16384 56 56 83 125 173 335 65536 56 56 82 125 174 334 Total Elapsed Time 5.7 seconds #################### T7 ARM-Intel ##################### ARM/Intel BusSpeed Benchmark 1.1 v7 25-Apr-2015 12.30 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 2940 3344 3625 3866 3862 3893 32 698 707 682 1071 1208 1826 64 448 477 465 726 851 1357 128 367 355 292 542 657 1070 256 334 344 341 546 651 1059 512 326 336 336 531 629 1025 1024 169 175 197 309 411 749 4096 58 58 83 131 191 395 16384 56 57 83 129 189 392 65536 56 48 82 129 187 388 Total Elapsed Time 5.6 seconds #################### T11 Original ##################### T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2 2 GB DDR3-1600 RAM, dual channel, 12.8 GB/sec 2 x 32 KB L1 cache, 1 MB shared L2 cache Android BusSpeed Benchmark 1.1 v7 09-Aug-2013 17.07 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 3193 3451 4412 5272 5389 6191 L1 32 1298 1558 1990 3478 4264 4420 64 804 928 1209 2442 3263 3426 L2 128 784 904 1175 2321 3148 3333 256 780 908 1181 2336 3142 3327 512 788 907 1165 2312 3120 3300 1024 360 387 384 803 1348 1744 4096 145 146 194 507 648 1378 RAM 16384 141 136 190 507 638 1373 65536 142 141 191 506 643 1371 Total Elapsed Time 5.3 seconds #################### T11 ARM-Intel #################### 
ARM/Intel BusSpeed Benchmark 1.1 v7 23-Apr-2015 12.15 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 2085 3208 4055 4553 5272 5758 32 1282 1811 2498 4182 4867 5163 64 600 864 1309 2974 3504 3841 128 614 892 1310 3027 3500 3826 256 614 892 1337 3050 3509 3828 512 618 888 1319 3042 3382 3811 1024 425 479 444 1244 1803 2291 4096 146 146 191 590 1050 1751 16384 141 139 186 585 1039 1725 65536 139 139 187 585 1039 1721 Total Elapsed Time 5.3 seconds #################### T21 Original ##################### T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4 Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s L1 caches 4 x 16 KB, L2 cache shared 2048 KB Android BusSpeed Benchmark 1.1 v7 04-Jun-2015 17.00 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 1382 1350 3122 4300 4938 5283 L1 32 1106 1118 2026 2637 3786 5210 L2 64 1064 1118 2058 2679 3820 5251 128 1123 1170 2081 2688 3669 4166 256 1121 1196 2109 2623 3873 3429 512 940 1127 2050 2684 3777 4795 1024 951 1124 2038 2655 3759 4950 4096 239 375 472 806 1486 2679 RAM 16384 239 370 464 806 1476 2656 65536 239 368 495 854 1537 2792 Total Elapsed Time 5.0 seconds #################### P36 ARM-Intel #################### P36 LGE LG-H811 Qualcomm® Snapdragon 808, 1.8 GHz 64-bit Hexa-Core, Android 5.1 ARM/Intel BusSpeed Benchmark 1.2 05-Jan-2016 12.24 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 1211 1532 3673 5686 6244 6484 32 1514 1952 3190 5034 6258 6576 64 794 1096 2109 3840 5643 6171 128 721 1005 1928 3603 5416 6032 256 721 1004 1938 3606 5423 6025 512 724 1009 1941 3610 5428 6022 1024 384 411 698 1112 1732 2607 4096 198 228 459 859 1529 2659 16384 199 231 462 930 1624 2936 65536 194 232 467 923 1635 3036 Total Elapsed Time 5.2 seconds #################### T21 
ARM-Intel #################### ARM/Intel BusSpeed Benchmark 1.1 v7 04-Jun-2015 17.00 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 1328 1442 2797 4291 4699 5685 32 1165 1100 1933 2848 3603 5844 64 1147 1055 2007 2846 3586 5890 128 1181 1136 2008 2711 3600 5878 256 1185 1126 2018 2716 3568 5873 512 1022 1026 1805 2525 3378 5611 1024 796 843 1584 2202 3088 5053 4096 199 294 431 657 1166 2409 16384 200 299 430 659 1167 2408 65536 205 301 436 668 1173 2380 Total Elapsed Time 5.2 seconds ###################### P37 32 Bit ###################### P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1 Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second 8 x 32 KB L1 cache, 512 KB shared L2 cache ARM/Intel BusSpeed Benchmark 1.2 01-Nov-2016 11.16 Compiled for 32 bit ARM v7a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 2314 2497 2739 2871 2897 2939 32 1737 1868 2213 2814 2849 2908 64 775 803 1437 2071 2687 2826 128 727 749 1312 1992 2667 2423 256 691 712 1300 2055 2685 2847 512 450 494 937 1503 2413 2676 1024 191 203 393 663 1454 2535 4096 184 187 372 508 1119 2439 16384 180 183 364 576 1231 2395 65536 177 182 357 486 1015 2439 Total Elapsed Time 5.2 seconds Android 7.0 ARM/Intel BusSpeed Benchmark 1.2 10-May-2017 10.54 Compiled for 32 bit ARM v7a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 2080 2309 2730 2889 2905 2936 32 1081 1134 1734 2349 2806 2888 64 782 806 1437 2075 2689 2823 128 729 749 1338 2026 2687 2843 256 694 711 1294 2001 2686 2847 512 394 429 802 1272 2226 2638 1024 199 196 317 595 1473 2497 4096 182 184 371 600 1242 2534 16384 184 186 372 620 1272 2509 65536 179 188 371 501 1223 2435 Total Elapsed Time 5.2 seconds ###################### T22 32 Bit ###################### T22, ARM Cortex-A53 1300 MHz, Android 5.0.2 LPDDR3 
RAM, single channel, 1333 MHz = 5.3 GB/sec 4 x 32 KB L1 cache, 512 KB shared L2 cache ARM/Intel BusSpeed Benchmark 1.2 06-Aug-2015 10.57 Compiled for 32 bit ARM v7a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 874 932 1814 2302 2355 2263 L1 32 758 803 1309 1820 2323 2386 64 653 671 1203 1741 2206 2332 L2 128 603 620 1107 1693 2222 2351 256 574 589 1075 1711 2211 2327 512 332 372 681 1075 1863 2120 1024 137 193 371 578 1322 2129 RAM 4096 172 179 351 567 1151 2126 16384 172 178 351 504 1117 2136 65536 172 177 349 478 882 2129 Total Elapsed Time 5.3 seconds ###################### T22 64 Bit ###################### T22, ARM Cortex-A53 1300 MHz, Android 5.0.2 ARM/Intel BusSpeed Benchmark 1.2 06-Aug-2015 11.02 Compiled for 64 bit ARM v8a Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 3188 3635 3937 4327 4372 4462 32 1478 1607 2246 3382 3853 4144 64 600 622 1163 2011 2972 3585 128 558 575 1056 1889 2892 3525 256 538 550 1028 1826 2837 3260 512 371 425 813 1490 2403 3202 1024 136 196 382 728 1423 2750 4096 170 177 346 669 1340 2652 16384 169 174 341 678 1352 2663 65536 168 174 341 676 1347 2611 Total Elapsed Time 5.2 seconds #################### A1 Original ####################### A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4 Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s 4 x 24 KB L1, 2 x 1 MB L2 Android BusSpeed Benchmark 1.1 v7 21-Dec-2014 16.06 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 4178 3473 6270 6713 6759 6869 L1 32 1420 1529 2252 2686 3702 5108 L2 64 1385 1498 2276 2629 3657 5108 128 1394 1542 2278 2614 3640 5092 256 1410 1576 2258 2607 3259 5110 512 1417 1574 2274 2602 3700 5119 1024 349 428 888 1431 2848 4306 4096 215 265 593 1181 2289 3891 RAM 16384 210 266 596 1181 2278 3897 65536 220 272 600 1193 2346 3886 Total 
Elapsed Time 5.1 seconds ################## A1 V1 Android 5.0 ################### Android BusSpeed Benchmark 1.1 v7 29-Oct-2015 11.31 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5747 5958 6843 7041 7163 7220 32 1446 1722 2244 2725 3745 5288 64 1436 1745 2322 2738 3777 5314 128 1461 1779 2348 2734 3749 5259 256 1462 1787 2367 2735 3761 5285 512 1450 1757 2331 2593 3707 5258 1024 376 484 1107 1845 2954 4545 4096 217 258 585 1179 2310 3895 16384 217 255 593 1172 2302 3894 65536 214 257 593 1163 2305 3916 Total Elapsed Time 5.1 seconds #################### A1 ARM-Intel ###################### ARM/Intel BusSpeed Benchmark 1.1 v7 22-Apr-2015 21.42 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 4845 5705 6403 6926 7094 7167 32 1407 1716 2255 2646 3713 5094 64 1395 1703 2257 2689 3754 4843 128 1283 1571 2108 2620 3671 5135 256 1416 1753 2288 2679 3687 5178 512 1439 1372 2251 2510 3679 5183 1024 350 409 942 1696 2792 4403 4096 213 253 564 1188 2173 3631 16384 219 259 600 1189 2330 3920 65536 218 259 599 1102 2323 3716 Total Elapsed Time 5.1 seconds ########### A5 ARM-Intel Dual Boot With W2 ############# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Android 5.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 ARM/Intel BusSpeed Benchmark 1.2 28-Mar-2016 11.44 Compiled for 32 bit Intel x86 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5857 5444 6672 6835 6777 6923 32 1432 1525 2134 2492 3511 4828 64 1400 1523 2392 2688 3485 4690 128 1444 1546 2253 2419 3153 4427 256 1410 1588 2387 2750 3593 4949 512 1464 1567 2367 2692 3530 4643 1024 236 284 601 1276 2118 3462 4096 176 202 322 561 1505 3000 16384 173 202 417 796 1585 3061 65536 172 199 429 655 1095 2189 Total Elapsed Time 5.0 seconds #################### W1 REMIX 32 Bit ################### 
R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel BusSpeed Benchmark 1.2 21-Oct-2016 14.19 Compiled for 32 bit Intel x86 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5867 5618 6515 6824 6475 6175 L1 32 1308 1431 2185 2731 3593 4999 L2 64 1432 1346 1831 2646 3600 4973 128 1463 1578 2382 2763 3606 4995 256 1484 1581 2401 2266 2201 3531 512 1447 1579 2376 2725 3564 4920 1024 354 378 808 1592 1516 3945 4096 219 271 566 1095 2116 3696 RAM 16384 244 271 567 1097 1606 3738 65536 186 223 487 870 1773 3581 Total Elapsed Time 5.3 seconds ################# W1 Windows 10 32 bit ################# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 BusSpeed From C/C++ 18.00.21005.1 for x86 Start of test Wed Dec 23 20:52:24 2015 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5481 5833 6688 6396 6329 6647 L1 32 1407 1581 2161 2678 3546 4436 L2 64 1409 1480 2297 2678 3534 5089 128 1461 1588 2441 2618 3505 4762 256 1346 1593 2441 2631 3571 4988 512 1398 1576 2299 2636 3283 4980 1024 902 1038 1845 2011 2994 4384 4096 237 270 570 1095 2100 3684 RAM 16384 239 273 565 1083 2118 3870 65536 240 275 547 1089 2088 3746 End of test Wed Dec 23 20:52:34 2015 #################### W1 REMIX 64 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel BusSpeed Benchmark 1.2 01-Nov-2016 12.23 Compiled for 64 bit Intel x86_64 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5669 4967 6198 6218 6271 7119 L1 32 1480 1557 2368 2771 3616 5025 L2 64 1470 1281 2428 2762 3637 5033 128 1502 1591 2425 2761 3641 5019 256 1501 1590 2444 2193 3644 4984 512 1510 1606 2439 2646 3599 4711 1024 324 353 747 1000 2403 
3921 4096 210 270 566 1091 2105 3751 RAM 16384 243 271 568 850 2113 3761 65536 244 272 567 1093 2114 3765 Total Elapsed Time 5.3 seconds ################# W1 Windows 10 64 bit ################# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 BusSpeed From C/C++ 18.00.21005.1 for x64 Start of test Wed Dec 23 21:06:27 2015 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5047 5323 6188 6349 6742 6265 32 1456 1518 2177 2725 3657 5063 64 1364 1475 2274 2884 3535 4991 128 1485 1584 2180 2797 3627 5055 256 1368 1397 2257 2739 3387 4724 512 1477 1518 2377 2595 3448 4851 1024 707 709 1295 2038 2529 3913 4096 239 270 559 1089 2131 3693 16384 239 272 567 1095 2094 3635 65536 236 269 566 1096 2115 3854 End of test Wed Dec 23 21:06:37 2015 ######## W2 Windows 10 32 bit Dual Boot With A5 ######## Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 BusSpeed From C/C++ 18.00.21005.1 for x86 Start of test Mon Apr 11 23:59:13 2016 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 5166 5420 6209 6078 6748 6330 32 1371 1475 2118 2587 3272 4429 64 1388 1397 2211 2488 3167 4437 128 1296 1444 2245 2586 3353 4427 256 1414 1445 2293 2511 3299 4468 512 1309 1404 2128 2795 3473 4563 1024 677 748 1433 1906 2295 3853 4096 180 210 446 858 1674 3273 16384 182 215 438 845 1649 3244 65536 180 212 434 851 1666 3251 End of test Mon Apr 11 23:59:23 2016 ######## W2 Windows 10 64 bit Dual Boot With A5 ######## BusSpeed From C/C++ 18.00.21005.1 for x64 Start of test Mon Apr 11 23:47:35 2016 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 4767 5335 6232 6677 6468 6875 32 1473 1440 2227 2691 3362 4752 64 1398 1556 2234 2702 3616 4768 128 1404 1571 2256 2606 3485 4869 256 1416 1422 2228 2535 3567 
4747 512 1488 1593 2271 2614 3444 4854 1024 940 755 1698 2256 3179 4449 4096 184 212 443 860 1647 3246 16384 179 211 437 856 1654 3154 65536 183 213 441 842 1651 3271 End of test Mon Apr 11 23:47:45 2016 #################### PC REMIX 64 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel BusSpeed Benchmark 1.2 01-Nov-2016 12.08 Compiled for 64 bit Intel x86_64 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 12958 13094 20980 20905 21003 21077 L1 32 12716 11344 14122 20945 21006 20757 64 6820 7270 11515 16554 19455 21628 L2 128 7313 7511 11450 16387 19483 21674 256 4824 4957 8757 13097 17966 21651 512 2738 2783 5437 9828 16051 21535 L3 1024 2725 2778 5446 9668 16019 21543 4096 2717 2772 5399 9558 15730 21300 16384 756 1089 2329 4508 8828 15692 RAM 65536 723 1044 2219 4318 8809 15722 Total Elapsed Time 5.1 seconds #################### PC REMIX 32 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel BusSpeed Benchmark 1.2 21-Oct-2016 12.30 Compiled for 32 bit Intel x86 Reading Speed 4 Byte Words in MBytes/Second Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read KBytes Words Words Words Words Words All 16 13049 13142 20864 20857 20893 20967 L1 32 12822 11610 13927 20486 20760 20647 64 6546 6837 10799 15898 20131 21634 L2 128 6264 6296 10478 15700 19803 21554 256 3486 3560 6624 12078 18207 21615 512 2810 2869 5575 9912 16920 21513 L3 1024 2716 2784 5448 9692 16778 21508 4096 2706 2780 5390 9582 16467 21285 16384 765 1073 2317 4471 9005 15788 RAM 65536 737 1046 2222 4332 8929 15441 Total Elapsed Time 5.1 seconds ======================================================== Top end 2015 PC - Core i7-4820K at 3.9 GHz ======================================================== 
   Reading Speed 4 Byte Words in MBytes/Second

  Memory  Inc32  Inc16   Inc8   Inc4   Inc2   Read
  KBytes  Words  Words  Words  Words  Words    All

 32 Bit
      16  13170  13179  18786  19440  19399  19373
     256   5411   5708  10003  14452  17994  20098
    4096   2713   2780   5358   9510  15915  19871
   65536    660    963   2064   4220   8659  12891

 64 Bit
      16  13183  13185  20016  20441  20481  20495
     256   6436   6636  11038  16140  19734  21099
    4096   2701   2776   5339   9573  16245  20827
   65536    685   1041   2183   4265   8731  15120

 ======================================================================
    MB/sec/MHz and RAM MB/sec Comparisons With Cortex-A9
    Comments mainly for Read All
 ======================================================================

                 Inc32  Inc16   Inc8   Inc4   Inc2   Read    RAM
                 Words  Words  Words  Words  Words    All MB/sec

 32 Bit Only - All relatively faster with data from L2 cache and RAM

 Android
 T11       L1     0.50   0.68   0.79   0.83   0.96   1.04
 Cortex    L2     1.30   1.83   2.77   3.94   3.80   2.55
 A15       RAM    1.75   2.04   1.61   3.20   3.92   3.13   4.44

 T21       L1     0.26   0.23   0.48   0.62   0.71   0.76
 Qualcomm  L2     1.87   1.94   3.45   2.68   3.32   1.81
 800       RAM    2.38   4.28   3.37   3.69   4.59   4.02   7.20

 A1        L1     1.06   1.10   1.14   1.16   1.19   1.19
 Atom      L2     2.74   3.29   4.33   3.17   3.65   3.15
 Z3745     RAM    2.51   3.48   4.71   5.51   8.01   6.18   9.58

 A5        L1     1.30   1.06   1.20   1.15   1.14   1.16
 Atom      L2     2.75   3.01   4.57   3.28   3.60   3.05
 z8300     RAM    2.00   2.70   3.41   3.31   3.82   3.68   5.64

 P37       L1     0.63   0.60   0.60   0.59   0.60   0.60
 Cortex    L2     1.66   1.66   3.05   3.01   3.30   2.15
 A53       RAM    2.53   3.03   3.48   3.01   4.34   5.03   6.29

 #####################################################################

 32 Bit and 64 Bit - ARM CPU faster at 64 bits but Intel similar
 (or T22 particularly slow at 32 bits?). Intel Windows and
 Android/REMIX speeds similar.

 Android
 T22 32b   L1     0.27   0.26   0.46   0.55   0.56   0.54
 Cortex    L2     1.59   1.58   2.91   2.89   3.14   2.03
 A53       RAM    2.84   3.40   3.93   3.42   4.35   5.07   5.49

 T22 64b   L1     1.00   1.00   1.00   1.03   1.04   1.06
 Cortex    L2     1.49   1.48   2.78   3.09   4.02   2.84
 A53       RAM    2.77   3.35   3.84   4.84   6.65   6.21   6.73

 REMIX/Android
 R1 32b    L1     1.30   1.10   1.17   1.15   1.09   1.03
 Atom      L2     2.90   3.00   4.59   2.71   2.20   2.17
 Z8300     RAM    2.17   3.03   3.87   4.40   6.18   6.02   9.23

 R1 64b    L1     1.26   0.97   1.12   1.05   1.06   1.19
 Atom      L2     2.93   3.01   4.67   2.62   3.65   3.07
 Z8300     RAM    2.84   3.70   4.51   5.53   7.37   6.33   9.70

 R2 32b    L1     1.37   1.21   1.77   1.66   1.66   1.66
 Core i7   L2     3.21   3.18   5.98   6.81   8.61   6.28
 4820K     RAM    4.05   6.71   8.34  10.33  14.69  12.25  39.80

 R2 64b    L1     1.36   1.20   1.78   1.66   1.67   1.67
 Core i7   L2     4.44   4.43   7.90   7.38   8.49   6.29
 4820K     RAM    3.97   6.69   8.33  10.30  14.49  12.47  40.52

 Windows
 PC 32b    L1     1.38   1.21   1.59   1.55   1.55   1.53
 Core i7   L2     9.85  10.04  15.62  14.39  13.47   8.26
 4820K     RAM    3.63   6.17   7.74  10.07  14.25  10.22  33.22

 PC 64b    L1     1.38   1.21   1.70   1.63   1.63   1.62
 Core i7   L2    11.72  11.67  17.24  16.07  14.77   8.67
 4820K     RAM    3.76   6.67   8.19  10.17  14.37  11.99  38.97

 W1 32b    L1     1.22   1.14   1.20   1.08   1.07   1.11
 Atom      L2     2.63   3.02   4.67   3.14   3.58   3.07
 Z8300     RAM    2.80   3.74   4.35   5.51   7.28   6.30   9.65

 W1 64b    L1     1.26   0.97   1.12   1.05   1.06   1.19
 Atom      L2     2.93   3.01   4.67   2.62   3.65   3.07
 Z8300     RAM    2.84   3.70   4.51   5.53   7.37   6.33   9.70

 W2 32b    L1     1.15   1.06   1.12   1.03   1.14   1.06
 Atom      L2     2.76   2.74   4.39   3.00   3.30   2.75
 z8300     RAM    2.10   2.88   3.45   4.30   5.81   5.46   8.38

 W2 64b    L1     1.06   1.04   1.12   1.13   1.09   1.15
 Atom      L2     2.76   2.70   4.26   3.03   3.57   2.92
 z8300     RAM    2.13   2.89   3.51   4.26   5.76   5.50   8.43


The RandMem benchmark carries out four tests at increasing data sizes to produce data transfer speeds in MBytes Per Second from caches and memory. Serial and random address selections are employed, using the same program structure, with read and read/write tests using 32 bit integers. The main purpose is to demonstrate how much slower performance can be when using random access. Here, speed can be considerably influenced by reading and writing in bursts, where much of the data is not used, and by the size of preceding caches. For more details and further results see here, with results up to 2013 in British Library Archives.
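The idea of serial and random address selection sharing one program structure can be sketched as follows. This is an illustration only, not the benchmark's source code: the index-array approach, the function names and the pass count are assumptions, and interpreted Python overhead means the MB/second figures it prints are not comparable with the compiled benchmark results.

```python
import random
import time
from array import array


def make_indices(words, serial, seed=1):
    """Return the order in which words are visited: in sequence, or shuffled."""
    idx = list(range(words))
    if not serial:
        random.Random(seed).shuffle(idx)
    return array("l", idx)


def read_pass(data, indices):
    """Read every word once through the index array and return the sum."""
    total = 0
    for i in indices:
        total += data[i]
    return total


def mbytes_per_second(kbytes, serial, passes=5):
    """Time repeated read passes over a block of 4 byte words."""
    words = kbytes * 1024 // 4           # 4 byte words, as in the logs
    data = array("i", range(words))      # "i" is normally a 32 bit integer
    indices = make_indices(words, serial)
    start = time.perf_counter()
    for _ in range(passes):
        read_pass(data, indices)
    elapsed = time.perf_counter() - start
    return passes * words * 4 / elapsed / 1e6


if __name__ == "__main__":
    for kb in (16, 256, 1024):
        serial = mbytes_per_second(kb, True)
        rand = mbytes_per_second(kb, False)
        print(kb, "KB  serial", round(serial), " random", round(rand))
```

Both modes run the same loop and touch the same words, so any difference in speed comes only from the order of the addresses.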

On tablet A1, with the Intel Atom processor, results for the new 32 bit version were essentially the same as those from the Houdini instruction conversion of the original ARM code via Android 5. Both averaged a 30% improvement over the original Android 4 speeds on read only tests, but were similar with reading and writing. This pattern of improvements was also apparent for 64 bit versus 32 bit benchmark modes on tablet T22, with the ARM Cortex-A53 processor, but only using cache based data. The later 32 bit benchmark produced inconsistent gains and some losses, running on the other ARM compatible systems (up to October 2015).

The benchmark code is the same as used on the Windows and Linux PC versions, with details and results here, where some of these results are also included.

Further MB per second/CPU MHz comparisons are provided below, showing the usual variability in performance. See comments in comparison table.
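The comparison figures appear to be MB/second per MHz, divided by the same measure for the Cortex-A9 system. The sketch below is an assumption about that derivation rather than the author's own calculation, and the function names are hypothetical. It uses the 16 KB rows of the T7 and T11 ARM-Intel logs, with the T11 clock taken as the measured 1.7 GHz.

```python
def mb_per_sec_per_mhz(mbytes_per_sec, mhz):
    """Normalise a transfer speed by CPU clock frequency."""
    return mbytes_per_sec / mhz


def relative_to_reference(mbps, mhz, ref_mbps, ref_mhz):
    """Ratio of MB/sec/MHz for a system against a reference system."""
    return mb_per_sec_per_mhz(mbps, mhz) / mb_per_sec_per_mhz(ref_mbps, ref_mhz)


# 16 KB (L1) results: Serial Read, Serial Rd/Wrt, Random Read, Random Rd/Wrt
T7_ARM_INTEL = (2521, 3175, 2490, 3038)    # Cortex-A9 at 1200 MHz
T11_ARM_INTEL = (3642, 3102, 5464, 4114)   # Cortex-A15, measured 1700 MHz

ratios = [
    round(relative_to_reference(t11, 1700, t7, 1200), 2)
    for t11, t7 in zip(T11_ARM_INTEL, T7_ARM_INTEL)
]
print(ratios)  # [1.02, 0.69, 1.55, 0.96]
```

These four values agree with the T11 L1 row of the RandMem comparison table, which supports the assumed formula for that row at least.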

##################### T7 Original ###################### T7, ARM Cortex-A9 1200 MHz, Android 4.1.2, Android RandMem Benchmark 20-Oct-2012 11.14 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 2788 3041 2795 3041 L1 32 2769 3011 2767 3020 64 1027 1038 839 911 L2 128 916 918 616 649 256 904 905 514 538 512 899 907 475 499 1024 712 699 345 354 4096 323 284 92 88 RAM 16384 316 282 73 70 65536 314 281 65 62 Total Elapsed Time 10.9 seconds #################### T7 ARM-Intel ##################### ARM/Intel RandMem Benchmark 1.1 25-Apr-2015 12.33 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 2521 3175 2490 3038 32 1427 1451 1218 1446 64 1133 1052 853 907 128 1039 871 646 650 256 1028 909 543 518 512 1025 895 499 502 1024 700 489 242 236 4096 487 282 90 88 16384 483 281 71 70 65536 478 274 63 62 Total Elapsed Time 11.3 seconds #################### T11 Original ##################### T11 Samsung EXYNOS 5250 2.0 GHz Cortex-A15, Android 4.2.2 Measured 1.7 GHz Android RandMem Benchmark 1.1 13-Aug-2013 17.29 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 2881 2478 3388 3650 L1 32 4301 2968 3197 3249 64 3669 2511 2201 2249 L2 128 3566 2560 1571 1566 256 3557 2461 1334 1256 512 3524 2547 1136 1098 1024 1933 1144 534 513 4096 1993 1064 184 173 RAM 16384 1970 1086 141 144 65536 1973 1117 106 104 Total Elapsed Time 9.1 seconds #################### T11 ARM-Intel #################### ARM/Intel RandMem Benchmark 1.1 23-Apr-2015 20.42 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 3642 3102 5464 4114 32 5462 3409 4096 3737 64 4800 2785 2028 2064 128 4308 2575 1572 1589 256 4381 2574 1332 1260 512 4311 2544 1215 1097 1024 2033 1156 513 471 4096 1891 1042 213 178 16384 2028 1032 154 139 65536 2033 1055 109 106 Total Elapsed Time 9.2 seconds #################### T21 Original ##################### T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4 Android RandMem Benchmark 1.1 10-Jun-2015 12.43 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 4407 4704 3995 4900 L1 32 2611 3071 2207 2703 L2 64 2496 2797 1821 2139 128 2080 3173 1668 1758 256 2425 3183 1439 1520 512 2359 3116 1193 1355 1024 2366 3117 368 382 4096 2293 2280 201 209 RAM 16384 2293 2237 170 175 65536 2299 2261 146 150 Total Elapsed Time 8.5 seconds #################### T21 ARM-Intel #################### ARM/Intel RandMem Benchmark 1.1 10-Jun-2015 12.45 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 5005 4626 4067 4863 32 3253 2994 2246 2622 64 3223 2855 1986 2072 128 2861 3128 1912 1776 256 3246 3174 1666 1523 512 3195 3111 1469 1372 1024 3190 3079 369 383 4096 3027 2381 212 213 16384 3065 2300 174 177 65536 3080 2281 150 150 Total Elapsed Time 8.6 seconds ###################### P37 32 Bit ###################### P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1 Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second 8 x 32 KB L1 cache, 512 KB shared L2 cache ARM/Intel RandMem Benchmark 1.2 01-Nov-2016 11.19 Compiled for 32 bit ARM v7a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 3386 4336 3389 4357 L1 32 3205 4049 1709 2343 64 3136 3929 1064 1372 L2 128 3139 3929 823 973 256 3138 3893 749 846 512 2898 2794 303 408 1024 2708 949 113 139 RAM 4096 2286 949 77 88 16384 2325 1100 72 77 65536 2302 1115 69 74 Total Elapsed Time 10.8 seconds Android 7.0 ARM/Intel RandMem Benchmark 1.2 10-May-2017 10.56 Compiled for 32 bit ARM v7a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 3379 4351 3368 4353 32 3310 4203 2222 3056 64 3154 3965 1072 1392 128 3156 3947 825 981 256 3140 3902 751 848 512 2956 2675 377 503 1024 2759 962 111 141 4096 2357 916 78 85 16384 2519 940 72 75 65536 2480 939 69 73 Total Elapsed Time 10.7 seconds ###################### T22 32 Bit ###################### T22, ARM Cortex-A53 1300 MHz, Android 5.0.2 ARM/Intel RandMem Benchmark 1.2 06-Aug-2015 12.29 Compiled for 32 bit ARM v7a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 2807 3606 2753 3595 L1 32 2719 3433 1429 1930 64 2615 3266 914 1166 L2 128 2592 3243 705 828 256 2570 3223 637 720 512 2367 2684 237 347 1024 2137 1855 120 163 RAM 4096 1918 1658 83 97 16384 2152 1665 74 85 65536 2104 1652 72 64 Total Elapsed Time 11.6 seconds ###################### T22 64 Bit ###################### T22, ARM Cortex-A53 1300 MHz, Android 5.0.2 ARM/Intel RandMem Benchmark 1.2 06-Aug-2015 12.32 Compiled for 64 bit ARM v8a MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 3865 3033 3798 3027 32 3622 2760 3105 2734 64 3094 2803 1011 1077 128 3074 2740 776 801 256 3050 2771 718 693 512 2420 2463 270 371 1024 1322 1853 131 164 4096 1754 1598 87 100 16384 1791 1586 75 91 65536 1856 1609 57 68 Total Elapsed Time 14.6 seconds #################### A1 Original ####################### A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4 Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s Android RandMem Benchmark 1.1 01-Feb-2015 10.12 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 3434 5064 3462 5113 L1 32 2833 4042 2652 3645 L2 64 2837 4058 2068 2561 128 2822 4041 1809 2205 256 2828 4040 1435 1755 512 2816 3997 1245 1456 1024 2578 3256 379 445 4096 2412 1946 209 268 RAM 16384 2485 2039 179 217 65536 2457 2041 140 170 Total Elapsed Time 11.8 seconds ################## A1 V1 Android 5.0 ################### Android RandMem Benchmark 1.1 29-Oct-2015 12.11 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 5138 5638 5113 5611 32 3778 4226 3683 3774 64 3767 4249 2715 2647 128 3747 4234 2305 2261 256 3652 4218 1805 1847 512 3739 4209 1521 1565 1024 3300 3442 562 653 4096 3026 2094 268 277 16384 3009 2075 213 220 65536 2918 2064 157 176 Total Elapsed Time 16.6 seconds #################### A1 ARM-Intel ###################### ARM/Intel RandMem Benchmark 1.1 23-Apr-2015 17.27 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 4291 5626 4584 5630 32 3217 3792 3492 3783 64 3677 4253 2629 2644 128 3666 4241 2299 2289 256 3688 3930 1829 1850 512 3682 4189 1522 1592 1024 3285 3558 562 667 4096 2999 2007 272 274 16384 3019 2065 210 220 65536 2989 2068 141 186 Total Elapsed Time 8.8 seconds ########### A5 ARM-Intel Dual Boot With W2 ############# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Android 5.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 ARM/Intel RandMem Benchmark 1.2 28-Mar-2016 11.46 Compiled for 32 bit Intel x86 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 4167 5227 4397 5225 L1 32 3425 3779 3123 3518 L2 64 3249 3853 2386 2284 128 3169 3635 2110 2127 256 3271 3604 1752 1758 512 3361 3847 1515 1503 1024 2782 2843 457 564 4096 2565 1845 216 233 RAM 16384 2592 1922 149 186 65536 2583 1286 125 144 Total Elapsed Time 9.5 seconds #################### W1 REMIX 32 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel RandMem Benchmark 1.2 21-Oct-2016 14.42 Compiled for 32 bit Intel x86 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 3808 5481 4520 5548 L1 32 3587 4096 2813 3686 L2 64 3496 4113 2569 2586 128 3571 3305 2262 2240 256 3567 4124 1665 1846 512 3595 4117 1601 1604 1024 2982 2372 666 860 4096 2778 2534 234 223 RAM 16384 2151 2147 159 195 65536 2606 2607 117 155 Total Elapsed Time 9.5 seconds ################# W1 Windows 10 32 bit ################# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 RandMem Benchmark From C/C++ 18.00.21005.1 for x86 Start of test Wed Dec 23 21:04:02 2015 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 4081 5576 4641 5577 L1 32 3436 3900 3561 3247 L2 64 3413 3923 2703 2738 128 3416 3891 2336 2386 256 3437 3901 1819 1830 512 3385 3897 1574 1592 1024 3204 3592 1256 1328 4096 2831 2745 248 272 RAM 16384 2882 2733 203 219 65536 2812 2735 146 160 End of test Wed Dec 23 21:04:13 2015 #################### W1 REMIX 64 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel Benchmark 1.2 01-Nov-2016 12.25 Compiled for 64 bit Intel x86_64 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 3791 2785 3850 2430 L1 32 2894 2234 2977 2217 L2 64 2928 2165 2390 1676 128 3111 2291 2151 1656 256 2573 2215 1750 1527 512 3063 2278 1580 1191 1024 2507 1826 542 683 4096 2515 1856 272 253 RAM 16384 2654 1729 152 176 65536 2388 1889 122 137 ################# W1 Windows 10 64 bit ################# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 RandMem Benchmark From C/C++ 18.00.21005.1 for x64 Start of test Wed Dec 23 21:21:13 2015 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 3075 3919 3494 4406 L1 32 2691 3496 2958 3159 L2 64 2952 3174 2312 2405 128 2734 3342 2002 1984 256 2720 3257 1742 1627 512 2864 3261 1517 1503 1024 2573 2994 1036 1148 4096 2555 2603 248 303 RAM 16384 2543 2576 230 204 65536 2670 2646 142 173 End of test Wed Dec 23 21:21:26 2015 ######## W2 Windows 10 32 bit Dual Boot With A5 ######## Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 RandMem Benchmark From C/C++ 18.00.21005.1 for x86 Start of test Tue Apr 12 00:04:05 2016 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 4036 5400 4334 5181 L1 32 2992 3674 3178 3591 L2 64 2999 3415 2345 2402 128 2985 3405 2162 2087 256 3082 3374 1638 1688 512 3251 3879 1578 1564 1024 3047 3002 905 1099 4096 2469 2054 238 253 RAM 16384 2682 2029 194 205 65536 2532 2050 127 152 End of test Tue Apr 12 00:04:17 2016 ######## W2 Windows 10 64 bit Dual Boot With A5 ######## RandMem Benchmark From C/C++ 18.00.21005.1 for x64 Start of test Tue Apr 12 00:05:25 2016 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 3144 4044 3490 4284 L1 32 2855 3283 2864 3277 L2 64 2660 3387 2408 2396 128 2955 3260 2220 2115 256 2910 3301 1756 1683 512 2935 3463 1600 1482 1024 2545 2705 1127 1223 4096 2368 2031 238 254 RAM 16384 2533 2030 196 199 65536 2466 2043 136 158 End of test Tue Apr 12 00:05:43 2016 #################### PC REMIX 32 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel RandMem Benchmark 1.2 21-Oct-2016 12.57 Compiled for 32 bit Intel x86 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 16 25574 27296 24958 26849 L1 32 25325 26810 25001 27507 64 22296 22973 16045 10549 L2 128 23486 23781 12933 8315 256 23090 21578 9566 6579 512 22335 18054 6923 5384 L3 1024 22334 17998 5701 4777 4096 21991 17421 2213 2176 16384 13490 10946 1038 1015 RAM 65536 13324 10718 690 669 Total Elapsed Time 7.2 seconds #################### PC REMIX 64 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel RandMem Benchmark 1.2 01-Nov-2016 12.09 Compiled for 64 bit Intel x86_64 MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... 
KBytes Read Rd/Wrt Read Rd/Wrt 16 25599 26995 26420 28071 L1 32 26908 28454 26385 28001 64 23275 23952 16065 10326 L2 128 23098 23882 12952 8322 256 22768 21331 9267 6400 512 22134 18233 6986 5423 L3 1024 22103 18157 5761 4810 4096 21765 17453 2214 2178 16384 13294 11069 1034 1006 RAM 65536 13118 10730 696 659 Total Elapsed Time 7.7 seconds ============================================== Top end 2015 PC - Core i7-4820K at 3.9 GHz ============================================== MBytes/Second Transferring 4 Byte Words Memory Serial....... Random....... KBytes Read Rd/Wrt Read Rd/Wrt 32 Bit 16 25790 26457 24349 25976 256 22820 22093 10557 7046 4096 20516 16317 2212 2036 65536 13013 10116 654 644 64 Bit 16 25947 25477 25277 26063 256 21456 21136 9080 6107 4096 21368 17510 2205 2173 65536 12410 10451 676 643 =========================================== MB/sec/MHz Comparisons With Cortex-A9 =========================================== Serial..... Random..... Read Rd/Wrt Read Rd/Wrt 32 Bit Only - All relatively faster Serial RAM speeds than Random. Compared to Cortex-A9, many similar via L2 but faster from L2 ad RAM. Android T11 L1 1.02 0.69 1.55 0.96 Cortex L2 2.93 2.09 1.72 1.73 A15 RAM 3.00 2.72 1.22 1.21 T21 L1 1.11 0.81 0.91 0.89 Qualcomm L2 1.76 1.95 1.71 1.64 800 RAM 3.60 4.65 1.33 1.35 A1 L1 1.10 1.14 1.19 1.20 Atom L2 2.31 2.79 2.17 2.30 Z3745 RAM 4.03 4.87 1.44 1.94 A5 L1 1.08 1.07 1.15 1.12 Atom L2 2.08 2.59 2.10 2.21 z8300 RAM 3.52 3.06 1.29 1.51 P37 L1 1.07 1.09 1.09 1.15 Cortex L2 2.44 3.43 1.10 1.31 A53 RAM 3.85 3.26 0.88 0.95 ########################################################### 32 Bit and 64 Bit - No signs of 64 bits being consistently faster or slower. 
Android T22 32b L1 1.03 1.05 1.02 1.09 Cortex L2 2.31 3.27 1.08 1.28 A53 RAM 4.06 5.57 1.05 0.95 T22 64b L1 1.42 0.88 1.41 0.92 Cortex L2 2.74 2.81 1.22 1.23 A53 RAM 3.58 5.42 0.84 1.01 REMIX/Android R1 32b L1 0.99 1.13 1.18 1.19 Atom L2 2.26 2.96 2.00 2.32 Z8300 RAM 3.56 6.21 1.21 1.63 R1 64b L1 0.98 0.57 1.01 0.52 Atom L2 1.63 1.59 2.10 1.92 Z8300 RAM 3.26 4.50 1.26 1.44 R2 32b L1 3.12 2.65 3.08 2.72 Core i7 L2 6.91 7.30 5.42 3.91 4820K RAM 8.58 12.04 3.37 3.32 R2 64b L1 3.12 2.62 3.26 2.84 Core i7 L2 6.81 7.22 5.25 3.80 4820K RAM 8.44 12.05 3.40 3.27 Windows PC 32b L1 3.15 2.56 3.01 2.63 Core i7 L2 4.92 4.68 2.67 1.50 4820K RAM 3.53 2.96 0.24 0.22 PC 64b L1 3.17 2.47 3.12 2.64 Core i7 L2 4.63 4.48 2.29 1.30 4820K RAM 3.37 3.06 0.24 0.22 W1 32b L1 1.06 1.15 1.22 1.20 Atom L2 2.18 2.80 2.18 2.30 Z8300 RAM 3.84 6.51 1.51 1.68 W1 64b L1 0.80 0.80 0.92 0.95 Atom L2 1.73 2.34 2.09 2.05 Z8300 RAM 3.64 6.30 1.47 1.82 W2 32b L1 1.04 1.11 1.14 1.11 Atom L2 1.96 2.42 1.97 2.13 z8300 RAM 3.45 4.88 1.31 1.60 W2 64b L1 0.81 0.83 0.91 0.92 Atom L2 1.85 2.37 2.11 2.12 z8300 RAM 3.36 4.86 1.41 1.66


The benchmarks run code for single and double precision Fast Fourier Transforms of size 1024 to 1048576 (1K to 1024K), each one being run three times to identify variance. Results are displayed and saved in a log file (FFT-tests.txt), with FFT running time in milliseconds. Besides Android, the benchmarks are available to run via Windows and Linux. Two versions are available: FFT1, the original version, and FFT3c, with optimised C code. Further details, results, and links for benchmarks and source code are in FFTBenchmarks.htm. The Android benchmarks are only available in the later 32 or 64 bit mode. Example results are below.
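The test procedure described above (power of two sizes from 1K to 1024K, three timed runs each, times in milliseconds) can be sketched as below. The recursive radix-2 FFT here is a textbook stand-in, not the FFT1 or FFT3c code, and the function names are hypothetical. Python only offers double precision complex numbers, so the single precision runs are not represented.

```python
import cmath
import time


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    half = n // 2
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def fft_sizes():
    """Sizes 1K, 2K, 4K ... 1024K, matching the K Size column."""
    return [1024 * 2 ** k for k in range(11)]


def time_fft_ms(n, repeats=3):
    """Run the FFT 'repeats' times and return each time in milliseconds."""
    data = [complex(i % 7, 0.0) for i in range(n)]
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fft(data)
        times.append((time.perf_counter() - start) * 1000.0)
    return times


if __name__ == "__main__":
    # Only the smallest sizes are timed here; pure Python is far slower than C.
    for n in fft_sizes()[:3]:
        print(n // 1024, "K", [round(t, 2) for t in time_fft_ms(n)])
```

Reporting all three times, rather than an average, keeps run to run variance visible.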

Version 3 Improvements - All systems produced significant gains, using the optimised benchmark, but some struggled running the smaller FFTs.

64 Bit Differences - Initially, only one tablet was available that runs at 64 bits, a Lenovo TAB 2 A8-50F using Android 5. In this case, 64 bit and 32 bit results were similar for the non-optimised version, but averaged 40% faster with the more efficient code. Later results, using Intel CPUs, produced similar performance via 32 bit and 64 bit versions.

Double and Single Precision - Using 64 bit DP numbers, instead of 32 bit for SP, can produce much slower speeds when a lower level cache space is exceeded and also through using more RAM based data. Other than these, there are slower and faster results.

Android Upgrades - The first identified upgrades to Android 5 indicated better average performance, but with wide variations on individual tests.

Intel/Windows 10 - 32 bit and 64 bit Intel/Windows results are now included for Atom and Core i7 CPUs.

A5 and W2 Dual Boot Tablet - Android and Windows speeds are again generally similar, except for Version 3, where W2 is faster. Again W2 results using RAM are slower than W1.

Intel CPU Windows and REMIX/Android performance was quite similar.

Single Precision and Double Precision Results in milliseconds T7 Nexus 7 T11 VOYO A15 T21 Kindle HDX 7 Cortex-A9 1.2 GHz Cortex-A15 1.7 GHz Qualcomm 800 2.1 GHz L1/L2 KB 32/1024 32/2048 16/2048 Android 4.1.2 Android 5.0.2 Android 4.2.2 Android 4.4.3 32 Bit 32 Bit 32 Bit 32 Bit K Size SP DP SP DP SP DP SP DP Version 1.0 1 0.64 0.38 0.18 0.21 0.10 0.17 0.14 0.18 2 0.77 0.97 0.40 0.67 0.22 0.36 0.33 0.53 4 1.14 1.77 1.13 1.86 0.57 0.90 1.03 1.30 8 3.28 4.40 3.26 5.12 2.12 2.31 2.50 3.09 16 7.76 9.39 7.74 9.69 4.71 5.97 1.95 2.20 32 17.80 22.26 18.09 22.73 10.76 11.37 4.18 5.77 64 61.05 140.58 41.64 84.68 20.10 49.70 14.61 20.01 128 153.19 289.15 139.98 274.54 77.67 213.70 33.19 60.52 256 450.16 645.72 444.09 645.70 408.51 448.95 107.49 310.93 512 1084.11 1457.85 1102.20 1438.29 782.85 1101.70 584.54 497.23 1024 2388.33 3129.21 2388.56 3185.93 1799.89 2280.30 875.95 963.37 Version 3c.0 1 0.66 0.21 0.27 0.25 0.23 0.08 0.35 0.07 2 1.09 0.55 0.65 0.65 0.50 0.17 0.81 0.19 4 2.67 1.38 1.67 1.45 1.07 0.41 1.66 0.41 8 3.56 3.09 4.30 3.23 2.41 0.90 1.08 0.90 16 7.78 9.08 8.33 10.35 5.26 3.23 3.36 2.66 32 17.85 22.02 19.23 25.38 11.88 8.88 6.54 6.07 64 39.52 52.11 46.41 58.90 23.75 23.08 12.57 13.56 128 89.73 118.45 103.31 128.44 49.74 53.11 27.41 33.09 256 203.34 258.56 221.99 267.12 100.25 120.66 63.39 72.55 512 437.25 552.00 464.30 558.13 226.76 264.30 150.38 156.30 1024 918.32 1175.65 933.05 1182.49 505.68 586.18 306.32 337.07 T22 Lenovo TAB 2 A8-50F P37 Lenovo Moto G4 ARM Cortex-A53 1.3 GHz ARM Cortex-A53 1.5 GHz L1/L2 KB 32/512 32/512 Android 5.0.2 Android 6.0.1 Android 7.0 64 Bit 32 Bit 32 Bit 32 Bit K Size SP DP SP DP SP DP SP DP Version 1.0 1 0.20 0.21 0.21 0.21 0.21 0.21 0.17 0.18 2 0.44 0.50 0.43 0.53 0.45 0.51 0.38 0.40 4 1.06 1.26 1.03 1.24 1.16 1.33 0.90 1.17 8 2.52 3.03 2.52 2.85 2.62 2.59 2.29 2.45 16 5.89 6.41 5.68 6.60 5.06 6.09 4.95 5.64 32 14.09 25.29 13.05 30.59 14.10 30.26 11.25 27.12 64 49.97 109.32 45.80 92.16 52.78 113.24 40.72 105.27 128 188.37 256.98 
153.25 221.98 173.52 256.88 160.31 236.64 256 447.62 583.33 362.62 504.60 409.24 578.50 383.80 544.43 512 826.77 1019.84 840.44 1107.14 917.86 1265.79 876.99 1198.03 1024 1846.27 2299.97 1835.82 2423.72 2047.09 2750.92 1972.58 2683.18 Version 3c.0 1 0.17 0.20 0.34 0.20 0.28 0.17 0.29 0.16 2 0.37 0.48 0.74 0.47 0.65 0.39 0.64 0.38 4 2.55 1.07 1.62 1.06 1.42 0.85 1.44 0.86 8 1.93 2.40 3.63 2.33 3.35 1.95 3.25 1.95 16 4.59 5.64 8.07 9.12 8.20 8.13 6.95 7.86 32 10.68 15.40 18.20 22.93 15.99 18.95 15.93 19.43 64 28.17 36.16 45.33 50.41 37.84 43.62 37.29 42.46 128 66.87 82.23 101.38 112.46 84.06 96.71 83.55 95.01 256 148.69 193.91 222.13 264.79 190.32 217.23 186.20 213.21 512 347.25 424.72 501.52 550.88 425.97 474.15 416.25 462.13 1024 760.74 960.28 1085.65 1206.83 928.38 1026.33 897.72 1001.54 Intel CPUs Android Dual Boot with W2 A1 Asus MemoPad 7 A5 Teclast X98 Plus Atom Z3745 1.86 GHz Atom Z8300 1.84 GHz L1/L2/L324/1024 KB 24/1024/0 Android 4.4.2 Android 5.0 Android 5.1 32 Bit 32 Bit 32 Bit K Size SP DP SP DP SP DP Version 1.0 1 0.09 0.11 0.10 0.09 0.09 0.12 2 0.21 0.29 0.16 0.23 0.18 0.31 4 0.61 0.66 0.48 0.52 0.61 0.57 8 1.35 1.17 1.07 1.17 1.17 1.56 16 3.20 2.57 2.38 2.59 3.15 3.34 32 5.41 5.75 5.30 6.02 6.65 9.20 64 11.74 29.95 11.77 28.31 15.62 45.48 128 67.54 99.31 54.05 97.58 49.67 110.14 256 194.13 225.94 189.11 219.98 222.78 264.65 512 438.49 501.59 433.06 487.49 521.72 602.38 1024 970.84 1121.61 968.37 1116.94 1187.13 1433.75 Version 3c.0 1 0.09 0.08 0.10 0.08 0.15 0.13 2 0.21 0.20 0.16 0.20 0.20 0.21 4 0.50 0.43 1.66 0.43 0.45 0.52 8 1.12 0.96 0.87 0.96 0.97 1.05 16 2.64 2.86 2.01 2.34 2.14 2.61 32 4.87 5.56 4.51 5.73 4.82 6.53 64 11.11 15.03 10.01 14.47 11.10 17.79 128 27.29 34.77 26.80 33.71 29.95 43.74 256 62.57 72.93 61.16 72.04 77.43 86.13 512 132.64 157.56 131.10 152.68 152.95 185.74 1024 282.99 332.37 274.01 363.60 314.54 460.91 Intel CPUs - Windows or Windows and Android W2 Teclast X98 Plus Atom Z8300 1.84 GHz KB 24/1024/0 Windows 10 32 Bit 64 Bit K 
Size SP DP SP DP Version 1.0 1 0.11 0.12 0.10 0.12 2 0.24 0.34 0.22 0.33 4 0.65 0.74 0.72 0.74 8 1.46 1.66 1.37 1.68 16 3.25 3.61 3.21 3.78 32 7.33 8.10 6.98 7.97 64 16.40 28.29 15.96 29.96 128 38.56 121.13 76.10 136.39 256 232.47 266.35 259.73 298.24 512 565.20 597.42 596.50 629.28 1024 1205.59 1450.84 1288.20 1439.44 Version 3c.0 1 0.08 0.09 0.09 0.08 2 0.19 0.23 0.18 0.19 4 0.45 0.51 0.48 0.43 8 1.00 1.12 1.08 0.93 16 2.67 2.68 2.51 2.50 32 5.54 5.59 5.74 6.06 64 10.64 14.72 12.54 14.77 128 32.82 36.71 28.28 36.95 256 66.71 77.48 67.25 78.47 512 157.72 153.43 150.14 168.63 1024 332.39 365.36 300.79 370.48 W1 Pipo W1S Tablet R1/W1 Pipo W1S Tablet Atom Z8300 1.84 GHz Atom Z8300 1.84 GHz L1/L2/L3 KB 24/1024/0 KB 24/1024/0 Windows 10 REMIX/Android 32 bit 64 bit 32 bit 64 bit K Size SP DP SP DP SP DP SP DP Version 1.0 1 0.11 0.12 0.10 0.12 0.31 0.37 0.29 0.37 2 0.24 0.45 0.23 0.35 0.84 0.85 0.65 1.04 4 0.67 0.75 0.63 0.74 1.52 1.46 1.91 2.37 8 1.44 1.80 1.50 1.69 2.56 2.65 4.26 5.31 16 3.29 3.71 3.16 3.65 4.46 3.59 7.42 6.24 32 7.32 7.83 5.94 6.98 6.12 7.93 8.26 6.98 64 14.36 31.51 13.95 25.44 13.03 35.52 17.14 32.47 128 46.45 120.79 50.90 115.44 69.30 105.02 73.44 117.75 256 209.39 235.36 203.02 266.34 228.05 244.75 237.24 295.39 512 455.89 534.68 491.49 576.91 536.19 620.66 502.33 626.71 1024 1024.78 1195.81 1040.39 1182.20 1086.25 1287.63 1039.91 1209.47 Version 3c.0 1 0.08 0.08 0.08 0.09 0.16 0.08 0.26 0.08 2 0.19 0.20 0.20 0.22 0.37 0.21 0.60 0.23 4 0.46 0.44 0.46 0.48 0.89 0.46 1.45 0.44 8 1.20 0.97 1.06 1.07 1.58 1.03 3.21 0.97 16 2.27 2.26 2.26 2.25 3.21 2.53 7.37 2.29 32 5.11 5.54 5.31 5.83 5.28 6.13 11.42 5.62 64 12.48 14.29 11.22 15.59 12.13 18.74 13.93 14.66 128 27.62 34.25 27.47 31.65 31.28 37.99 28.97 31.81 256 71.32 70.99 62.74 67.95 72.23 81.63 57.01 66.84 512 143.07 144.60 140.50 146.76 155.62 196.93 122.36 140.30 1024 298.00 322.13 289.98 334.07 295.55 450.03 271.67 302.49 PC 2015 Top End Desktop PC R2/PC Corei7-4820K 3.9 GHz Corei7-4820K 3.9 GHz 
L1/L2/L332/256/10 MB 32/256/10 MB Windows 10 REMIX/Android 32 bit 64 bit 32 bit 64 bit K Size SP DP SP DP SP DP SP DP Version 1.0 1 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.018 2 0.04 0.04 0.04 0.04 0.05 0.04 0.03 0.041 4 0.09 0.12 0.08 0.12 0.10 0.13 0.13 0.181 8 0.26 0.31 0.25 0.30 0.29 0.32 0.38 0.398 16 0.65 0.77 0.62 0.76 0.71 0.81 0.88 0.936 32 1.59 1.96 1.51 1.93 1.69 1.99 2.11 2.506 64 4.33 4.87 3.91 4.78 4.06 4.41 4.78 5.037 128 9.94 10.57 9.21 10.60 9.19 9.92 9.31 9.772 256 21.87 22.00 21.01 22.06 20.68 21.92 19.70 21.974 512 45.09 55.15 44.72 58.29 45.07 52.85 43.68 56.312 1024 105.75 199.77 111.23 199.11 106.39 188.55 110.34 176.725 Version 3c.0 1 0.02 0.02 0.01 0.01 0.02 0.02 0.01 0.018 2 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.04 4 0.07 0.08 0.06 0.07 0.07 0.08 0.06 0.09 8 0.16 0.18 0.14 0.16 0.16 0.17 0.22 0.199 16 0.37 0.41 0.33 0.38 0.39 0.45 0.47 0.402 32 0.81 0.86 0.73 0.82 0.85 0.96 1.11 0.873 64 1.76 1.86 1.56 1.75 1.82 2.05 2.18 1.888 128 3.77 4.05 3.38 3.76 3.94 4.36 4.45 4.047 256 8.24 9.36 7.38 8.78 8.47 9.78 8.66 9.282 512 19.09 22.96 17.28 22.50 19.52 24.29 17.74 23.361 1024 45.68 57.37 42.19 56.66 47.35 57.59 43.23 56.682


For more information on Whetstone Benchmark see stand alone version, above. The multithreading version runs multiple copies of the same shared code, with separate variables. In this case, performance of each of the eight test functions and overall MWIPS ratings is invariably (nearly) proportional to the number of CPU cores available. The driving program checks that calculations on every thread produce consistent numeric results.
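The multithreading arrangement described above (the same code on every thread, separate variables, and a driving program that checks for consistent numeric results) can be sketched as follows. The workload is a hypothetical stand-in for the Whetstone test functions, and the names are assumptions. Standard Python threads share one interpreter lock, so this sketch shows the checking logic only and will not show speed scaling with cores.

```python
import math
import threading


def workload(n):
    """Stand-in floating point loop that uses only local variables."""
    x = 1.0
    for i in range(1, n + 1):
        x = math.sqrt(math.exp(math.log(x + i) / 1.1))
    return x


def run_threads(thread_count, n=20000):
    """Run the same workload on each thread, one result slot per thread."""
    results = [None] * thread_count

    def worker(slot):
        results[slot] = workload(n)

    threads = [
        threading.Thread(target=worker, args=(slot,))
        for slot in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def consistent(results):
    """True when every thread produced the same numeric result."""
    return all(r == results[0] for r in results)


if __name__ == "__main__":
    for count in (1, 2, 4, 8):
        print(count, "threads, consistent:", consistent(run_threads(count)))
```

Because each thread repeats an identical calculation on its own data, any mismatch between result slots points to a fault rather than a legitimate difference.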

The gcc 4.8 based ARM/Intel version, running on the Intel Atom tablet, is rated at twice the speed of the original, due to the use of native code. The fixed point results indicate overoptimisation, but the test uses little of the overall time, this being mainly dependent on the Cos, Exp and third MFLOPS tests. Running the original ARM converted code version via Android 5.0 mainly produced better performance, but an overall lower rating, due to slower Cos and Exp tests, the same as the stand alone version above.

Also the same as the stand alone version, the new native ARM program was generally slower, running on tablets T7, T11 and T21.

On T22, with the Cortex-A53 CPU, the new 32 bit single thread tests appeared to be slower than the stand alone version, but that was not the case at 64 bits, apparently indicating a 64 bit performance gain.

A5 and W2 Dual Boot Tablet - Android and Windows speeds are significantly different, on some tests, because of the different compilers, particularly due to optimisation, but these tests do not affect the overall MWIPS results much. The latter averages 18% faster via Android but both show 2 and 4 thread performance gains of around 1.9 and 3.5 times.

Intel CPU Windows and REMIX Android, 32 bit and 64 bit versions - overall MWIPS ratings were all quite similar on a Core i7 (PC/R2) and also on an Atom (W1/R1), but there were variations on individual tests, due to different compilers and instructions used.

MP Efficiency - For those with four cores, average throughput, compared with one core, was 4.0 times on the Core i7 with REMIX and Windows, 3.5 times on the Atom with Windows, 2.7 times with REMIX and 3.7 times with Android, then 3.9 times with ARM/Android. Core i7 (with Hyperthreading) recorded 6.9 times with 8 threads, and the 8 core P37 6.5 times (1 to 4 cores at 1.5 GHz and 5 to 8 at 1.2 GHz).
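A throughput gain of this kind can be worked out from any of the logs by dividing a multi-thread rating by the single thread rating. The sketch below does this for the overall MWIPS column of the T7 Original log only. The efficiency figures quoted above are described as averages, so they need not equal these MWIPS ratios, and the function name is hypothetical.

```python
# MWIPS by thread count, from the T7 Original MP-Whetstone log
MWIPS_T7_ORIGINAL = {1: 1033.7, 2: 2058.1, 4: 4122.8, 8: 4163.2}


def scaling(mwips_by_threads):
    """Throughput of each thread count relative to the single thread run."""
    base = mwips_by_threads[1]
    return {threads: round(value / base, 2)
            for threads, value in mwips_by_threads.items()}


print(scaling(MWIPS_T7_ORIGINAL))  # {1: 1.0, 2: 1.99, 4: 3.99, 8: 4.03}
```

For this log the gain levels off after four threads, and the 8 thread run takes about twice as long overall (10.81 seconds against 5.42).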

##################### T7 Original ######################

T7, ARM Cortex-A9 1300 MHz, Android 4.1.2, Measured 1200 MHz
Android MP-Whetstone Benchmark V1.0 17-Oct-2012 13.49
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1033.7    247.4    235.4    266.0     25.3     15.0      448.4    630.9    513.5
2T   2058.1    456.3    473.0    532.4     50.0     30.1      898.1   1198.4   1026.6
4T   4122.8    831.9    944.7   1064.6    100.7     60.1     1797.0   2392.2   2053.4
8T   4163.2   1016.0    948.2   1069.5    101.8     60.9     1808.0   2414.2   2051.5

Overall Seconds 5.28 1T, 5.34 2T, 5.42 4T, 10.81 8T

#################### T7 ARM-Intel #####################

ARM/Intel MP-Whetstone Benchmark V1.1 30-Apr-2015 21.32
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T    602.2    242.3    242.3    140.2     27.2      4.9      482.8   1425.2    239.1
2T   1208.7    481.2    484.2    280.8     55.0      9.9      970.0   2869.6    478.7
4T   2398.7    805.4    966.7    562.5    109.5     19.5     1938.2   5722.5    957.1
8T   2429.1    974.6   1076.2    562.4    110.9     19.7     1981.5   5816.1    963.6

Overall Seconds 4.94 1T, 4.93 2T, 5.08 4T, 9.93 8T

#################### T11 Original #####################

T11 Samsung EXYNOS 5250 2.0 GHz Cortex-A15, Android 4.2.2
Measured 1.7 GHz
Android MP-Whetstone Benchmark V1.1 06-Sep-2013 12.49
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1308.2    345.9    379.0    294.1     30.8     17.2     1351.4   1265.7    843.1
2T   2886.6    782.1    782.6    614.0     80.1     34.3     2775.2   2463.7   1667.5
4T   3086.0    998.6    788.1    610.6     79.2     44.5     3472.0   2526.4   2191.4
8T   2930.0    788.2    843.5    616.5     80.5     35.0     2846.0   2799.1   1686.2

Overall Seconds 3.54 1T, 3.30 2T, 6.62 4T, 13.16 8T

#################### T11 ARM-Intel ####################

ARM/Intel MP-Whetstone Benchmark V1.1 30-Apr-2015 21.23
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T    837.2    340.1    341.7    191.2     39.1      6.2     1521.1   2532.8    629.3
2T   1676.2    596.2    683.2    387.3     77.8     12.4     3056.9   5055.1   1263.6
4T   1697.7    687.5    869.4    394.5     78.1     12.4     2980.7   6518.4   1258.8
8T   1685.2    685.9    691.0    389.7     78.3     12.4     3086.3   5113.7   1262.0

Overall Seconds 4.06 1T, 4.07 2T, 8.12 4T, 16.19 8T

#################### T21 Original #####################

T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Android MP-Whetstone Benchmark V1.1 06-Jul-2015 10.42
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1877.1    645.2    642.6    524.1     44.0     22.3     1364.7   1572.1    898.9
2T   3668.6   1220.2   1262.4   1021.9     85.9     43.8     2663.5   3078.4   1753.4
4T   7426.9   2375.5   2474.7   2097.7    175.7     88.2     5052.6   6240.4   3555.0
8T   7706.6   2692.2   2746.2   2186.9    180.1     90.3     5822.5   6902.7   3681.3

Overall Seconds 4.44 1T, 4.62 2T, 4.64 4T, 9.00 8T

#################### T21 ARM-Intel ####################

ARM/Intel MP-Whetstone Benchmark V1.1 22-Jul-2015 12.02
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1598.0    512.1    508.7    311.7     43.6     22.1     1142.9   2123.3    598.4
2T   3161.2    960.0    996.7    614.2     86.7     43.8     2258.9   3820.9   1194.7
4T   6348.0   1593.5   2019.5   1231.5    174.2     88.5     4471.1   8139.4   2398.3
8T   6419.6   2058.2   2077.5   1252.6    175.0     88.7     4520.9   8875.0   2409.0

Overall Seconds 4.88 1T, 5.00 2T, 5.05 4T, 9.92 8T

###################### P37 32 Bit ######################

P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache

ARM/Intel MP-Whetstone Benchmark V1.2 14-Nov-2016 11.41
Compiled for 32 bit ARM v7a
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1050.5    304.5    268.3    171.7     35.2     17.7      459.4    905.5    338.1
2T   2134.1    540.5    524.8    350.5     68.1     34.9     1316.8   1881.0    679.3
4T   4214.0   1090.4   1022.0    689.4    136.1     70.4     2283.5   3850.4   1348.4
8T   7490.8   1969.8   1759.1   1243.8    244.5    125.3     4038.0   6074.2   2392.9

Overall Seconds 4.67 1T, 4.65 2T, 4.71 4T, 5.75 8T

Android 7.0

ARM/Intel MP-Whetstone Benchmark V1.2 11-May-2017 10.28
Compiled for 32 bit ARM v7a
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1069.2    300.7    252.9    176.7     31.5     19.4      646.9    942.2    338.6
2T   2103.2    543.2    490.9    343.7     64.1     38.7     1101.2   1830.5    675.9
4T   4212.2   1072.1    958.5    686.7    128.7     77.5     2251.5   3802.1   1354.9
8T   7564.2   1931.6   1744.2   1242.6    231.8    137.1     4243.9   6856.4   2461.7

Overall Seconds 3.99 1T, 4.06 2T, 4.06 4T, 4.94 8T

###################### T22 32 Bit ######################

T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel MP-Whetstone Benchmark V1.2 10-Aug-2015 11.30
Compiled for 32 bit ARM v7a
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T    676.4    275.9    281.9    147.9     35.4      5.3      600.3    901.0    285.5
2T   1362.5    533.8    561.7    298.0     70.9     10.8     1203.1   1838.9    574.0
4T   2698.6    903.9   1071.7    594.4    141.2     21.5     2346.1   3305.5   1138.5
8T   2830.1   1463.2   1393.0    614.2    152.5     21.9     3243.9   4418.3   1171.4

Overall Seconds 4.95 1T, 4.94 2T, 5.11 4T, 10.09 8T

###################### T22 64 Bit ######################

ARM/Intel MP-Whetstone Benchmark V1.2 10-Aug-2015 11.34
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1524.8    328.6    348.8    297.6     37.3     19.9    1462579   1867.2   1238.0
2T   3062.5    688.8    697.9    596.0     75.5     39.8    2097113   3726.7   2481.3
4T   6085.4   1214.9   1360.5   1185.4    150.5     79.4    2449153   7055.0   4951.8
8T   6222.4   1495.2   1545.6   1204.2    152.2     80.6    3869846   9218.8   5154.1

Overall Seconds 4.92 1T, 4.90 2T, 5.05 4T, 9.97 8T

#################### A1 Original #######################

A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s
Android MP-Whetstone Benchmark V1.1 04-Feb-2015 11.39
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T    953.7    363.0    382.4    267.8     21.0     13.2      413.1   1842.4    392.3
2T   1921.2    726.0    663.5    541.4     42.6     27.0      816.1   3662.6    793.3
4T   3820.6   1419.2   1514.6   1081.5     84.1     54.0     1543.8   6292.4   1588.5
8T   4003.8   1912.9   1872.4   1114.1     86.5     56.4     2053.1   8292.6   1599.7

Overall Seconds 4.88 1T, 4.87 2T, 4.96 4T, 10.05 8T

################## A1 V1 Android 5.0 ###################

Android MP-Whetstone Benchmark V1.1 05-Nov-2015 11.06
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T    748.8    405.9    411.8    367.0     11.3     11.1      898.0   2129.1    459.8
2T   1468.5    822.0    827.5    744.8     22.4     22.2     1088.8   4228.4    924.5
4T   2781.0   1242.8   1638.6   1415.5     40.3     44.3     3404.6   8283.2   1852.1
8T   3050.7   1854.5   1831.0   1566.7     45.4     45.3     4519.7  10332.5   1844.5

Overall Seconds 5.00 1T, 5.09 2T, 5.72 4T, 10.30 8T

#################### A1 ARM-Intel ######################

ARM/Intel MP-Whetstone Benchmark V1.1 30-Apr-2015 17.35
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1916.9    691.4    691.3    497.2     35.3     27.6    10209.8   2787.3   1351.8
2T   3800.3   1377.6   1381.2    980.0     70.1     54.7    20248.0   5252.8   2748.7
4T   7604.9   2713.2   2711.8   1977.1    140.2    110.0    33906.3   9526.5   5550.8
8T   7798.1   3141.5   3627.2   2064.2    141.2    110.2    59590.6  12743.7   5711.5

Overall Seconds 4.94 1T, 5.00 2T, 5.06 4T, 10.11 8T

########### A5 ARM-Intel Dual Boot With W2 #############

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

ARM/Intel MP-Whetstone Benchmark V1.2 14-Apr-2016 17.09
Compiled for 32 bit Intel x86
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   2121.9    695.0    695.7    483.5     39.6     34.8    10102.2   2700.8   1358.9
2T   4123.2   1319.0   1351.2    903.1     78.9     67.2    19593.6   5336.0   2604.5
4T   7368.1   2394.0   2375.9   1668.8    139.0    119.8    35711.8   9359.2   4603.0
8T   7391.0   2397.4   2769.0   1658.4    137.7    121.8    36643.4   9953.9   4670.9

Overall Seconds 4.88 1T, 5.04 2T, 5.84 4T, 11.52 8T

#################### W1 REMIX 32 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel MP-Whetstone Benchmark V1.2 21-Oct-2016 14.34
Compiled for 32 bit Intel x86
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1929.0    566.4    615.3    440.7     38.1     28.7     9518.0   2440.1   1235.3
2T   3528.5    912.9   1188.8    832.1     65.0     57.9    13330.0   4114.1   2272.6
4T   5295.0   1821.0   1784.7   1305.4     95.6     88.5    23671.1   6465.3   3461.3
8T   6406.2   2158.8   2247.6   1588.9    128.2    117.4    24747.2   8243.7   4403.3

Overall Seconds 4.81 1T, 5.38 2T, 7.72 4T, 14.07 8T

#################### W1 REMIX 64 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel MP-Whetstone Benchmark V1.2 11-Nov-2016 21.33
Compiled for 64 bit Intel x86_64
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   2189.0    524.1    488.1    402.0     44.7     41.7  1351656.1   1894.8   1758.8
2T   4036.7   1108.5   1178.5    780.0     78.1     73.2  4361015.9   4752.1   3140.7
4T   5652.4   1694.5   1270.9   1191.6    111.8     95.4  2680231.8   5593.2   4688.4
8T   7075.1   2126.0   2068.2   1522.4    147.6    134.8  3600866.1   6987.4   5694.7

Overall Seconds 4.84 1T, 5.22 2T, 8.26 4T, 14.49 8T

################# W1 Windows 10 32 bit #################

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600

MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x86
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1816.7    568.3    580.3    477.8     34.9     26.9     1395.8   1100.4   7327.8
2T   3469.7   1145.9   1086.9    905.6     66.1     52.6     2684.4   2118.7  13383.7
4T   6337.0   2026.1   2029.6   1658.4    121.2     95.1     4886.7   3800.8  24933.3
8T   6900.2   2162.4   2326.0   1870.2    134.7     98.8     6089.9   4071.4  29659.9

Overall Seconds 4.80 1T, 5.02 2T, 5.53 4T, 13.07 8T

################# W1 Windows 10 64 bit #################

MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x64
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1994.3    537.7    536.4    476.9     42.3     28.8     1420.0   1099.2   7305.8
2T   3760.6   1080.6   1075.4    894.9     79.9     53.3     2842.5   2115.5  12762.4
4T   6946.5   1850.0   1883.3   1655.9    146.8    101.3     4946.3   3787.9  25246.0
8T   7556.2   1891.4   2159.3   1867.7    163.1    104.8     5362.5   4283.3  26001.8

Overall Seconds 4.89 1T, 5.19 2T, 5.66 4T, 13.26 8T

######## W2 Windows 10 32 bit Dual Boot With A5 ########

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x86
Start of test Fri Apr 15 16:28:12 2016
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1776.5    561.5    581.1    466.3     34.1     26.2     1402.2   1093.2   6981.4
2T   3364.9   1014.1   1020.8    832.8     65.6     51.6     2643.0   2027.2  12415.1
4T   6316.1   1987.1   2016.5   1655.2    121.2     94.2     4860.8   3793.2  24941.8
8T   6563.4   2372.8   2031.4   1850.4    122.8     96.6     5667.8   3844.8  28561.7

Overall Seconds 4.75 1T, 5.06 2T, 5.39 4T, 11.56 8T

######## W2 Windows 10 64 bit Dual Boot With A5 ########

MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x64
Start of test Fri Apr 15 16:38:09 2016
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   1954.1    506.3    538.0    469.7     40.4     29.1     1411.3   1091.8   7280.9
2T   3615.7   1011.5    989.7    873.6     77.1     51.7     2477.6   1907.0  13107.0
4T   6941.8   1877.9   1879.3   1652.7    147.1    100.9     4946.8   3789.6  25046.5
8T   7124.5   2128.2   1975.4   1705.5    149.7    103.3     5058.7   4284.8  28862.8

Overall Seconds 4.95 1T, 5.36 2T, 5.59 4T, 11.72 8T

==================================================================
          Top end 2015 PC - Core i7-4820K at 3.9 GHz
==================================================================

32 Bit

MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x86
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   5273.9   1114.8   1119.2    921.1    129.4     90.7     3404.0   5351.3  22213.6
2T  11031.8   2238.4   2304.9   1938.0    271.1    189.4     6973.5  11713.2  46821.3
4T  21347.8   4713.1   4718.0   3879.9    493.4    375.2    14335.7  21161.6  89584.4
8T  39679.6   9374.0   9397.5   7687.6    874.8    726.5    24631.8  23418.6  93465.8

Overall Seconds 4.97 1T, 4.76 2T, 4.99 4T, 5.59 8T

64 Bit

MP-Whetstone Benchmark From C/C++ 18.00.21005.1 for x64
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   6200.6   1236.5   1236.2    870.8    206.0    108.8     3359.1   4767.4    23413
2T  13050.4   2603.8   2606.2   1891.4    432.6    217.5     7076.8  10041.6    46840
4T  25336.0   5195.2   5211.7   3707.1    832.8    422.9    13626.9  16962.6    78346
8T  46141.7  10293.2  10379.0   7242.4   1332.7    814.2    24394.5  23451.3    93588

Overall Seconds 4.82 1T, 4.60 2T, 4.91 4T, 5.50 8T

#################### PC REMIX 32 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel MP-Whetstone Benchmark V1.2 21-Oct-2016 12.50
Compiled for 32 bit Intel x86
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   5425.2   1343.1   1343.4    868.1    131.8     87.8      55255    11089     4899
2T  10969.5   2773.7   2475.7   1735.9    274.5    175.4     114023    23300    10637
4T  22989.7   5587.5   5609.8   3889.2    547.5    362.3     131855    44619    19739
8T  41099.9    10957    10752   7683.9    881.4    702.7     235813    46954    23348

Overall Seconds 4.91 1T, 4.80 2T, 4.74 4T, 5.76 8T

#################### PC REMIX 64 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel MP-Whetstone Benchmark V1.2 11-Nov-2016 14.38
Compiled for 64 bit Intel x86_64
Using 1, 2, 4 and 8 Threads

      MWIPS   MFLOPS   MFLOPS   MFLOPS      Cos      Exp      Fixpt       If    Equal
                   1        2        3     MOPS     MOPS       MOPS     MOPS     MOPS

1T   6033.0   1343.3   1342.8    831.9    162.5    109.3   33291632    11076     5540
2T  12746.2   2673.1   2827.1   1834.6    330.7    231.9   30432979    23301     9592
4T  25953.4   5598.0   5642.9   3788.8    662.1    473.8   44736026    34693    23308
8T  46218.9    11093    11108   7685.5     1035    889.3   99650183    46841    23415

Overall Seconds 5.14 1T, 5.07 2T, 5.04 4T, 6.10 8T


For further details, see Dhrystone Benchmark above and Android MultiThreading Benchmark Apps, which includes further results. This multithreading benchmark runs using 1, 2, 4 and 8 threads, executing multiple copies of the same program. An initial calibration, using a single thread, determines the number of passes needed for an overall execution time of 1 second. All threads are then run using the same pass count, with running time being extended when there are more threads than CPUs. The same calculations are carried out on each thread. Separate data arrays are used for each thread, but some variables can be used by all threads. The latter is probably responsible for the failure to increase throughput much using multiple threads or, in the case of A1 with the Atom CPU, for reduced throughput using more than one thread.
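The calibrate-then-run scheme described above can be sketched as follows. This is a minimal Python illustration, not the benchmark's own code: the function names, the toy workload and the doubling strategy used during calibration are all assumptions made for the example.

```python
import threading
import time


def work(passes):
    """Toy stand-in for one copy of the benchmark loop (hypothetical)."""
    total = 0
    for i in range(passes):
        total += i & 7
    return total


def scale_passes(passes, elapsed, target_seconds):
    """Scale a timed pass count so that one thread runs for the target time."""
    return max(1, int(passes * target_seconds / elapsed))


def calibrate(target_seconds=1.0):
    """Single thread calibration. Doubling until the run is long enough to
    time reliably is an assumed strategy, not taken from the source."""
    passes = 1000
    while True:
        start = time.perf_counter()
        work(passes)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds / 10:
            return scale_passes(passes, elapsed, target_seconds)
        passes *= 2


def run_threads(n_threads, passes):
    """Run n_threads copies, all with the same pass count; return seconds."""
    threads = [threading.Thread(target=work, args=(passes,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def total_rate(n_threads, passes, elapsed):
    """Aggregate passes per second over all threads."""
    return n_threads * passes / elapsed


if __name__ == "__main__":
    passes = calibrate(target_seconds=0.05)  # short target keeps the demo quick
    for n in (1, 2, 4, 8):
        seconds = run_threads(n, passes)
        print(f"{n}T {seconds:6.2f} s {total_rate(n, passes, seconds):12.0f} passes/s")
```

In CPython the global interpreter lock serialises CPU bound threads, so elapsed time here grows with the thread count, much like the case of more threads than CPUs. The native benchmark threads can run in parallel.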

In all the initial results shown, there was little difference in performance between the original and the new 32 bit versions, but T22, with the Cortex-A53, produced significant gains at 64 bits.

T21, the Kindle Fire with a Quad Core Qualcomm Snapdragon 800 CPU, failed to run using the new ARM/Intel version, and obtained a rather excessive score with 8 threads via the original benchmark (but similar to a possible 4 x 2850).

ARM vs Intel MP - Note that the systems using ARM processors increased performance with multiple threads but those with Intel CPUs did not.

32 Bit vs 64 Bit - The 64 bit versions were typically 70% faster via Android and REMIX/Android, but the gain was much less using the Windows compilations.

                        VAX MIPS or DMIPS

                                             Threads
 System  CPU      MHz      Android      1      2      4      8   None
                                                                  See
 Original ARM Version

 A1      Z3745    1866 x4  4.4.2     2360   1394   1334   1321   1840
 A1      Z3745    1866 x4  5.0       2411   1633   1313   1298   2488
 T7      v7-A9    1200 x4  4.1.2     1584   2749   3836   3569   1610
 T22     v8-A53   1300 x4  5.0.2     1686   2943   4232   4323   1683
 T11     v7-A15   1700 x2  4.2.2     2271   4281   4326   4171   3189
 T21     QU-800   2150 x4  4.4.3     2850   4395   7736  11821   3854

 ARM/Intel 32 Bit Version

 A1      Z3745    1866 x4  4.4.2     2365   1322   1323   1319   2451
 A5 ##   z8300    1840 x4  5.1       2256   1155   1163   1054   2318
 T7      v7-A9    1200 x4  4.1.2     1464   2399   3575   3737   1317
 T22     v8-A53   1300 x4  5.0.2     1412   2559   4038   4291   1423
 P37     v8-A53   1500 x8  6.0.1     1720   2923   4839   2618   1649
 P37     v8-A53   1500 x8  7.0       1575   2899   4955   2697   1722
 T11     v7-A15   1700 x2  4.2.2     2295   4057   3902   4096   2551
 T21     QU-800   2150 x4  4.4.3    Failed to run                3319
 P38     v8-A57   2700 x4  6.0.1     3094   5612   6849   3776
        +V8-A53   1300 x4
 R1=Atm  Z8300    1840 x4  6.0.1     2174   1150   1170   1139   2390
 R2      Core i7  3900 x4  6.0.1     9919   5685   5305   6076  10489

 ARM/Intel 64 Bit Version

 T22     v8-A53   1300 x4  5.0.2     2548   4311   5560   5613   2569
 R1=Atm  Z8300    1840 x4  6.0.1     3900   1677   1709   1666   3769
 R2      Core i7  3900 x4  6.0.1    16740   7595   7271   8612  17003

 Intel/Windows 32 Bit Version

 W1      Z8300    1840 x4  Win10     3284   1477   1235   1313   3044
 W2 ##   Z8300    1840 x4  Win10     2521   1730   1333   1285   2906
 PC      Core i7  3900 x4  Win10    12776   7175   6116   7876  12090

 Intel/Windows 64 Bit Version

 W1      Z8300    1840 x4  Win10     3745   1625   1400   1436   3291
 W2 ##   Z8300    1840 x4  Win10     3717   1566   1386   1441   3195
 PC      Core i7  3900 x4  Win10    15129   8535   7278   8769  11686

 ##   A5 and W2 Same Dual Boot Tablet
 =Atm R1 and W1 Same Tablet
      R2 and PC Same PC
      R1 and R2 Android via REMIX
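The 32 bit vs 64 bit comparison can be checked from the 1 thread column of the table above. The following Python sketch simply recomputes the percentage gains. The dictionary labels and the helper name are illustrative choices; the figures are copied from the table.

```python
# 1 thread VAX MIPS from the table above: (32 bit, 64 bit)
one_thread = {
    "T22 Android": (1412, 2548),
    "R1 REMIX": (2174, 3900),
    "R2 REMIX": (9919, 16740),
    "W1 Windows": (3284, 3745),
    "W2 Windows": (2521, 3717),
    "PC Windows": (12776, 15129),
}


def percent_gain(old, new):
    """Percentage by which `new` is faster than `old`."""
    return 100.0 * (new - old) / old


gains = {name: round(percent_gain(a, b)) for name, (a, b) in one_thread.items()}

for name, gain in gains.items():
    print(f"{name:12s} {gain:3d}% faster at 64 bits")
```

On these figures, the Android and REMIX gains fall between 69% and 80%, while the Windows gains range from 14% to 47%.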


This is a multithreading version of the above. Further details and results can be found here. The benchmark is run on 100x100, 500x500 and 1000x1000 matrices using 0, 1, 2 and 4 separate threads, the programming code for zero threads being the same as in the above example. Multithreading performance, using this standard linear equation solver, is severely degraded by overheads. The zero thread results are the only ones of real use; the others are fairly constant, probably running one thread at a time and limited by RAM speed.
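As a rough guide to how an MFLOPS figure relates to matrix size, the sketch below uses the conventional Linpack operation count of 2n^3/3 + 2n^2. The source does not state the formula used in these apps, so this is an assumption, and the function names are illustrative.

```python
def linpack_ops(n):
    """Nominal floating point operations for solving an n x n system
    (conventional Linpack count, assumed here)."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def mflops(n, seconds):
    """Millions of floating point operations per second for one solve."""
    return linpack_ops(n) / seconds / 1.0e6


for n in (100, 500, 1000):
    print(f"N {n:4d}: {linpack_ops(n):14.0f} operations per solve")
```

With so little work per solve at N = 100 (under 0.7 million operations), any fixed per-thread overhead takes a large share of the running time, which is consistent with the very low threaded results at that size.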

Performance of A1, with the Intel CPU and using native Intel compilation, is shown to be around twice as fast as the Houdini ARM to Intel converted version, except at N = 1000, which is mainly dependent on calculations using data in RAM. Then, when running the ARM only version with Android upgraded to 5.0, the performance difference was considerably reduced.

On ARM CPUs, speeds obtained from 32 bit and 64 bit compilations were similar, because the programs use a limited number of identical NEON intrinsic functions. For the same reason, the new ARM/Intel version produced similar results to the original.

32 Bit vs 64 bit - Results from 64 bit versions were generally slightly faster than those compiled for 32 bits.

Android vs Windows - Intel based Android and REMIX/Android speeds were around three times faster than Windows results on the Atom CPU and twice as fast on the Core i7.

The program checks that the same numeric results are produced, irrespective of the number of threads used, at each matrix size. However, due to rounding effects, the results differ slightly between ARM and Intel hardware, as shown below.
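A check of this kind might look like the following Python sketch. The helper names are hypothetical and the comparison method is an assumption; the figures are the single precision NR and RE values from the Numeric Results listing below, and the ResidN values from the W2 64 bit Windows run.

```python
# Single precision results at N = 100, 500 and 1000, copied from the
# Numeric Results listing (NR = norm resid, RE = resid)
arm = {
    "NR": (1.60, 3.96, 11.32),
    "RE": (3.80277634e-05, 4.72068787e-04, 2.70068645e-03),
}
intel = {
    "NR": (1.68, 3.96, 11.39),
    "RE": (4.00543213e-05, 4.72545624e-04, 2.71725655e-03),
}


def same_for_all_threads(results):
    """True when every thread count produced an identical value."""
    return all(r == results[0] for r in results)


def relative_difference(a, b):
    """Difference between two values as a fraction of the larger one."""
    return abs(a - b) / max(abs(a), abs(b))


# ResidN for 0, 1, 2 and 4 threads from the W2 64 bit Windows results
print(same_for_all_threads([3.96, 3.96, 3.96, 3.96]))

for i, n in enumerate((100, 500, 1000)):
    diff = relative_difference(arm["RE"][i], intel["RE"][i])
    print(f"N {n:4d}: ARM vs Intel resid differs by {100 * diff:.2f}%")
```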

MFLOPS 0 to 4 Threads, N 100, 500, 1000

##################### T7 Original ######################

Android Linpack NEON SP MP Benchmark 31-Jan-2013 12.14
T7, ARM Cortex-A9 1300 MHz, Android 4.1.2,

Threads      None         1         2         4

N   100    413.47     45.95     48.22     48.34
N   500    253.08    187.51    189.69    189.94
N  1000    148.76    135.49    136.08    136.17

#################### T7 ARM-Intel #####################

ARM/Intel Linpack NEON SP MP Benchmark 14-May-2015 15.40

Threads      None         1         2         4

N   100    385.49     28.79     29.06     29.25
N   500    272.07    184.85    183.70    183.18
N  1000    147.09    131.92    132.44    130.05

#################### T11 Original #####################

Android Linpack NEON SP MP Benchmark 13-Aug-2013 23.28
T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2

Threads      None         1         2         4

N   100   1399.82     54.86     55.31     54.66
N   500   1154.21    434.16    434.06    436.97
N  1000    571.26    482.57    487.25    485.80

#################### T11 ARM-Intel ####################

ARM/Intel Linpack NEON SP MP Benchmark 14-May-2015 15.44

Threads      None         1         2         4

N   100   1497.90     61.13     63.13     61.87
N   500   1399.10    491.49    489.29    494.69
N  1000    586.14    499.00    504.97    497.49

#################### T21 Original #####################

T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Android Linpack NEON SP MP Benchmark 26-Jul-2015 11.46

Threads      None         1         2         4

N   100   1311.08     12.38     12.93     15.05
N   500   2271.56    344.04    419.52    381.73
N  1000    837.30    540.99    523.52    564.87

#################### T21 ARM-Intel ####################

ARM/Intel Linpack NEON SP MP Benchmark 26-Jul-2015 11.51

Threads      None         1         2         4

N   100   1308.07     14.89     11.77     11.63
N   500   2341.17    407.96    481.02    415.12
N  1000    901.21    551.80    566.77    564.31

###################### P37 32 Bit ######################

P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache

ARM/Intel Linpack NEON SP MP Benchmark 1.2 14-Nov-2016 12.09
Compiled for 32 bit ARM v7a

Threads      None         1         2         4

N   100    555.85     26.39     26.62     26.78
N   500    459.23    224.55    207.08    217.47
N  1000    359.47    270.92    275.58    272.08

Android 7.0

ARM/Intel Linpack NEON SP MP Benchmark 1.2 09-May-2017 11.18
Compiled for 32 bit ARM v7a

Threads      None         1         2         4

N   100    560.74     25.96     26.35     26.41
N   500    501.69    234.14    237.16    236.78
N  1000    393.49    305.86    310.71    309.85

###################### T22 32 Bit ######################

T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2
ARM/Intel Linpack NEON SP MP Benchmark 1.2 13-Aug-2015 12.52
Compiled for 32 bit ARM v7a

Threads      None         1         2         4

N   100    460.74     22.35     23.16     23.82
N   500    480.63    336.52    339.94    303.66
N  1000    470.02    405.86    403.01    405.98

###################### T22 64 Bit ######################

ARM/Intel Linpack NEON SP MP Benchmark 1.2 13-Aug-2015 12.57
Compiled for 64 bit ARM v8a

Threads      None         1         2         4

N   100    548.67     27.70     33.93     37.00
N   500    470.04    285.95    297.79    301.67
N  1000    519.02    441.84    443.47    441.91

#################### A1 Original #######################

Android Linpack NEON SP MP Benchmark 07-Feb-2015 18.42
A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4

Threads      None         1         2         4

N   100    452.39     21.00     23.48     17.48
N   500    663.38    275.56     88.66    312.71
N  1000    617.04    380.60    191.26    195.61

################## A1 V1 Android 5.0 ###################

Android Linpack NEON SP MP Benchmark 05-Nov-2015 11.49
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100    662.21     25.84     25.59     25.43
N   500   1022.76    317.51    310.52    311.49
N  1000    861.75    549.32    558.52    547.91

#################### A1 ARM-Intel ######################

ARM/Intel Linpack NEON SP MP Benchmark 1.2 06-Nov-2015 22.11
Compiled for 32 bit Intel x86
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100    979.81     49.01     42.69     45.34
N   500   1160.24    369.43    349.04    334.87
N  1000    716.94    560.86    535.46    486.61

########## A5 ARM-Intel Dual Boot With W2 ############

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

ARM/Intel Linpack NEON SP MP Benchmark 1.2 14-Apr-2016 17.22
Compiled for 32 bit Intel x86
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100   1131.44     16.52     16.05     17.00
N   500   1427.56    234.84    231.15    266.46
N  1000    874.35    474.20    423.36    577.54

#################### W1 REMIX 32 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

Android Linpack NEON SP MP Benchmark 11-Nov-2016 21.35
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100    764.63     23.72     18.72      8.77
N   500   1387.27    153.52    153.30    145.98
N  1000    880.43    360.42    357.60    348.40

ARM/Intel Linpack NEON SP MP Benchmark 1.2 21-Oct-2016 14.38
Compiled for 32 bit Intel x86

Threads      None         1         2         4

N   100   1095.33     53.33     57.76     57.01
N   500   1589.75    493.68    512.28    511.92
N  1000    886.08    638.19    635.86    638.70

#################### W1 REMIX 64 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel Linpack NEON SP MP Benchmark 1.2 14-Aug-2016 22.33
Compiled for 64 bit Intel x86_64

Threads      None         1         2         4

N   100   1221.20     60.54     65.60     64.04
N   500   1405.14    567.66    554.66    568.40
N  1000   1058.21    729.60    734.22    747.03

################# W1 Windows 10 32 bit #################

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600

Linpack Single Precision MultiThreaded Benchmark
32 Bit, N=500, Wed Dec 23 21:01:12 2015

Threads         0         1         2         4
MFLOPS     740.71    256.40    226.44    163.99

Linpack Double Precision MultiThreaded Benchmark
32 Bit, N=500, Wed Dec 23 21:00:30 2015

Threads         0         1         2         4
MFLOPS     480.73    194.42    196.76    148.52

################# W1 Windows 10 64 bit #################

Linpack Single Precision MultiThreaded Benchmark
64 Bit, N=500, Wed Dec 23 21:17:19 2015

Threads         0         1         2         4
MFLOPS     707.50    263.47    240.46    197.31

Linpack Double Precision MultiThreaded Benchmark
64 Bit, N=500, Wed Dec 23 21:16:42 2015

Threads         0         1         2         4
MFLOPS     488.12    205.02    202.39    165.47

######## W2 Windows 10 32 bit Dual Boot With A5 ########

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

Linpack Single Precision MultiThreaded Benchmark
32 Bit, N=500, Fri Apr 15 16:23:55 2016

Threads         0         1         2         4
MFLOPS     626.40    231.31    183.87    129.48

Linpack Double Precision MultiThreaded Benchmark
32 Bit, N=500, Fri Apr 15 16:23:21 2016

Threads         0         1         2         4
MFLOPS     412.89    221.03    148.56     94.62

######## W2 Windows 10 64 bit Dual Boot With A5 ########

Linpack Single Precision MultiThreaded Benchmark
64 Bit, N=500, Fri Apr 15 16:36:10 2016

Threads         0         1         2         4
MFLOPS     662.15    241.59    228.59    195.97
ResidN       3.96      3.96      3.96      3.96

Linpack Double Precision MultiThreaded Benchmark
64 Bit, N=500, Fri Apr 15 16:35:42 2016

Threads         0         1         2         4
MFLOPS     527.64    195.54    180.62    154.02

#################### PC REMIX 32 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

Android Linpack NEON SP MP Benchmark 11-Nov-2016 14.40
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100   3829.87    113.83     90.99     52.76
N   500   6053.91   1024.25   1014.78    985.31
N  1000   6601.66   2628.01   2568.70   2522.01

ARM/Intel Linpack NEON SP MP Benchmark 1.2 21-Oct-2016 12.51
Compiled for 32 bit Intel x86
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100   4738.29    284.27    288.92    289.43
N   500   7078.15   3328.75   3287.02   3288.17
N  1000   7556.05   5459.01   5478.02   5461.30

#################### PC REMIX 64 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel Linpack NEON SP MP Benchmark 1.2 11-Nov-2016 14.42
Compiled for 64 bit Intel x86_64
MFLOPS 0 to 4 Threads, N 100, 500, 1000

Threads      None         1         2         4

N   100   5622.61    318.61    317.19    320.32
N   500   7355.32   3448.71   3577.17   3541.12
N  1000   7734.14   5566.40   5622.47   5653.65

#################### PC Windows 32 Bit ##################

Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Windows 10,

Linpack Single Precision MultiThreaded Benchmark
32 Bit, N=500, Tue Nov 15 11:29:25 2016

Threads         0         1         2         4
MFLOPS    4018.79   1674.30   1583.93   1199.23

Linpack Double Precision MultiThreaded Benchmark
32 Bit, N=500, Tue Nov 15 11:29:03 2016

Threads         0         1         2         4
MFLOPS    3307.45   1521.69   1453.19   1185.62

#################### PC Windows 64 Bit ##################

Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Windows 10,

Linpack Single Precision MultiThreaded Benchmark
64 Bit, N=500, Tue Nov 15 11:37:57 2016

Threads         0         1         2         4
MFLOPS    4036.32   1891.33   1782.15   1345.03

Linpack Double Precision MultiThreaded Benchmark
64 Bit, N=500, Tue Nov 15 11:37:24 2016

Threads         0         1         2         4
MFLOPS    3370.00   1692.80   1590.42   1304.35

################### Numeric Results ###################

NR=norm resid RE=resid MA=machep X0=x[0]-1 XN=x[n-1]-1

Single Precision

N                      100              500             1000

ARM    NR             1.60             3.96            11.32
       RE   3.80277634e-05   4.72068787e-04   2.70068645e-03
       MA   1.19209290e-07   1.19209290e-07   1.19209290e-07
       X0  -1.38282776e-05   5.26905060e-05   1.62243843e-04
       XN  -7.51018524e-06   3.26633453e-05  -6.65783882e-05

Intel  NR             1.68             3.96            11.39
       RE   4.00543213e-05   4.72545624e-04   2.71725655e-03
       MA   1.19209290e-07   1.19209290e-07   1.19209290e-07
       X0  -1.38282776e-05   5.26905060e-05   1.62243843e-04
       XN  -7.51018524e-06   3.26633453e-05  -6.65783882e-05

Double Precision Intel SSE2

5.76  1.27986510e-012  2.22044605e-016  5.59552404e-014  3.39728246e-014


This is a multithreading version of the above. See here for further results. In the original MP-BusSpdi benchmark, all threads read data from the beginning. With large shared caches, this could lead to exaggerated data transfer speeds for RAM based data, using multiple threads. The revised MP-BusSpd2i attempts to avoid this by arranging for threads to have staggered starting points, with each still reading all the data. It also has a much longer running time, for more consistent scores. Performance using a single thread is similar to the non-threaded version and it is clear that multiple threads are needed to demonstrate maximum throughput. As usual, maximum RAM speeds can be estimated from burst transfer results, such as 16 times Inc16 MB/second. Some results are provided below.
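The staggered starting points and the burst estimate can be illustrated with a short Python sketch. The even spacing of the offsets and the wrap-round read are assumptions about how the staggering might be arranged, and all function names are hypothetical. Only the "16 times Inc16" rule comes from the text.

```python
def staggered_starts(n_words, n_threads):
    """Starting offset for each thread, spread evenly through the data
    (an assumed layout for this sketch)."""
    return [t * n_words // n_threads for t in range(n_threads)]


def read_all(data, start):
    """Read every word once, beginning at `start` and wrapping round,
    so each thread still covers all the data."""
    n = len(data)
    total = 0
    for i in range(n):
        total += data[(start + i) % n]
    return total


def max_ram_estimate(inc16_mb_per_second):
    """Burst transfer estimate: 16 times the Inc16 MB/second figure."""
    return 16 * inc16_mb_per_second


data = list(range(1000))
for start in staggered_starts(len(data), 4):
    print(start, read_all(data, start))

print(max_ram_estimate(68), "MB/second")  # T7 long version, Inc16 = 68
```

Each thread produces the same total, since it reads the same words in a rotated order, but at any moment the threads are working on different parts of the array.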

MP-BusSpdi.apk can be downloaded from here.

Using A1, with the Intel Atom CPU, the initial Houdini ARM to Intel conversion speeds were slightly slower than the results from the native code compilations, but the difference was made up when running via Android 5.

Results for the original version, running on ARM CPUs, are not all shown, as they were similar to those for the new version. See here. On T22, with the Cortex-A53, performance could be more than twice as fast, reading all data, using the 64 bit compilation.

The problem associated with shared caches is probably best identified by wide variations in the burst reading tests, which are not apparent in the long running versions (see T7 and T21 below).

Following the main tables are comparisons of the Read All speeds for the revised benchmarks. They are based on MB/second/MHz for cache based data and MB/second using RAM.

MP Efficiency - The ratios shown for L1 cache based gains of 4 threads over 1 thread indicate more than 3.5 times on ARM CPUs but much less on Intel processors, although the gains can be similar using L3 cache. There were also some significant gains reading data from RAM. However, this was influenced by relatively faster Intel speed using one thread.
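The two measures used in these comparisons can be worked through with the L1 cache (12.3 KB) RdAll figures from the long version results listed below. This Python sketch is only an illustration. The helper names are hypothetical, and taking 1500 MHz for P37, which is listed as 1500/1200 MHz, is an assumption.

```python
# system: (CPU MHz, RdAll MB/second 1 thread, RdAll MB/second 4 threads)
l1_read_all = {
    "T7": (1200, 3263, 11777),
    "T21": (2150, 5614, 21068),
    "T22": (1300, 2343, 9042),
    "P37": (1500, 2625, 10254),  # 1500 MHz assumed for the 1500/1200 MHz part
}


def mp_gain(one_thread, four_threads):
    """4 thread speed as a multiple of the 1 thread speed."""
    return four_threads / one_thread


def mb_per_second_per_mhz(mb_per_second, mhz):
    """Speed normalised by CPU clock, for comparing cache based results."""
    return mb_per_second / mhz


for name, (mhz, t1, t4) in l1_read_all.items():
    print(f"{name:4s} gain {mp_gain(t1, t4):4.2f}  "
          f"1T {mb_per_second_per_mhz(t1, mhz):4.2f} MB/second/MHz")
```

All four ARM systems show a 4 thread gain of more than 3.5 times on these figures.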

64 Bit vs 32 Bit - Windows tests indicated similar performance, but via Android, 64 bit compilations were much faster than 32 bit ones, even using Intel CPUs via REMIX.

Some of the above might be due to the different compilers used.

#################### T7 ARM-Intel #####################

T7, ARM Cortex-A9 1.2 GHz, DDR3-1333, 5.3 GB/s
Android 4.1.2, 4 x 32 KB L1 cache, 1 MB shared L2 cache

ARM/Intel MP-BusSpd v7 Benchmark V1.1 05-May-2015 14.35

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2853   3392   3376   3511   3551   3494
         2T   2857   3389   3542   5540   5730   5595
         4T   7257  10326  10289  10997  11373  11100
         8T   6584  10325  10485  11175  11322  11189
   122.9 1T    362    379    347    546    623    978
         2T    516    530    508    726   1227   1840
         4T    598    658    548   1181   1556   2657
         8T    721    733    736   1181   1548   2653
   12288 1T     58     57     84    123    173    334
         2T    111    111    182    248    348    664
         4T     87     85    276    463    687   1290
         8T    154    107    147    429    441   1242

Total Elapsed Time 12.7 seconds

########## T7 New Long Version

ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 15.59

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2166   2774   3181   3307   3377   3263
         2T   3924   5188   5207   5754   5759   5805
         4T   7570  10011  10252  11165  11375  11777
         8T   3510   4786   9011   8318  11351  11544
   122.9 1T    383    409    359    558    663    983
         2T    525    541    520    741   1241   1814
         4T    739    752    753   1219   1590   2776
         8T    735    741    753   1218   1607   2737
   49152 1T     56     51     81    126    172    330
         2T     65     67    107    196    335    620
         4T     70     68    108    215    426    835
         8T     70     68    109    215    428    851

Total Elapsed Time 48.2 seconds

Maximum RAM Speed Estimate = 68 x 16 = 1088 MB/second

#################### T11 ARM-Intel ####################

T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Dual core, 2 x 32 KB L1 cache, 1 MB shared L2 cache

ARM/Intel MP-BusSpd v7 Benchmark V1.1 05-May-2015 14.45

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2165   3591   4256   5587   5998   6109
         2T   4121   6469   9530  11381  11846  11936
         4T   4106   6438   8827   6793   9802  12080
         8T   4098   6390   9534  10141  10996  11603
   122.9 1T    464    740   1173   2395   3276   3340
         2T    579    989   1934   3994   5431   5792
         4T    579    988   1930   3873   5469   5821
         8T    580    985   1915   3999   5408   5812
   12288 1T    134    172    211    462    602   1904
         2T    269    343    387    934   1217   2685
         4T    252    231    374    768    991   2625
         8T    231    254    367    781   1104   2782

Total Elapsed Time 12.1 seconds

########## T11 New Long Version

ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 17.07

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   3499   4539   5499   5505   6134   6045
         2T   3775   7202   8377  10605  10457  11319
         4T   3982   6676   7687   9326   9707  10807
         8T   2546   3643   7891   8003  10725  11097
   122.9 1T    672    901   1336   2784   3274   3334
         2T    568    969   1931   3894   5427   5221
         4T    574    971   1912   3831   5256   4811
         8T    559    971   1917   3878   5387   5162
   49152 1T    140    142    193    575    989   1499
         2T    221    223    342    769   1379   2355
         4T    228    223    344    783   1382   2376
         8T    223    223    342    787   1385   2352

Total Elapsed Time 49.9 seconds

Maximum RAM Speed Estimate = 223 x 16 = 3568 MB/second

#################### T21 Original #####################

T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s
L1 caches 4 x 16 KB, L2 cache shared 2048 KB

Android MP-BusSpd v7 Benchmark V1.1 29-Jun-2015 18.37

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2580   2206   5048   5176   5679   5989
         2T   4062   5175   9340   9868  10971  11281
         4T   4688  10324  16552  17196  21714  23708
         8T   8467   9834  16698  18183  21936  23693
   122.9 1T   1152   1052   2068   3035   3927   5723
         2T   1710   1840   3094   5001   7963  11475
         4T   2047   2002   5031   9267  14698  22920
         8T   2235   2275   5223   9348  14234  21783
   12288 1T    262    382    508    867   1466   2661
         2T    464    766   1049   1754   3186   5735
         4T    612   1018   1796   3149   5892   9095
         8T    575    680   1277   2308   4987   7948

Total Elapsed Time 12.7 seconds

Impossible Maximum RAM Speed 1018 x 16 = 16288 MB/second

#################### T21 ARM-Intel ####################

ARM/Intel MP-BusSpd v7 Benchmark V1.1 23-May-2015 17.05

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   1840   2073   3512   3554   4829   5243
         2T   3432   4591   7128   7651   9120   9821
         4T   4398   7855  13752  15428  18530  20235
         8T   6692   9507  13857  16110  18143  18796
   122.9 1T    860    753   2011   2841   3205   5282
         2T   1505   1609   3076   5038   8089  10421
         4T   1924   1981   4299   7588  14614  20754
         8T   1909   1988   4264   7980  13884  19027
   12288 1T    270    379    538    856   1626   2859
         2T    471    677   1098   1849   3304   5924
         4T    549    787   1066   1874   6274  10781
         8T    713    853   1649   2258   4664   8321

Total Elapsed Time 13.1 seconds

########## T21 New Long Version

ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 15.39

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2247   2616   4010   4443   4909   5614
         2T   3558   4725   7241   9048   9747  10892
         4T   6074   8303  13442  16937  18525  21068
         8T   3998   5106  14314  13615  18200  20740
   122.9 1T    874   1198   2024   2935   4529   5345
         2T   1686   1702   3174   5357   7688  10545
         4T   1988   2139   4465   8171  14969  21169
         8T   1972   2139   4468   8195  15261  21132
   49152 1T    292    406    516    899   1663   2929
         2T    449    541    962   1569   2851   4776
         4T    495    605   1109   2439   4161   8243
         8T    530    564   1156   2149   4172   7907

Total Elapsed Time 48.0 seconds

Maximum RAM Speed Estimate = 605 x 16 = 9680 MB/second

#################### P37 32 Bit V1.2 ####################

P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache

ARM/Intel MP-BusSpd2 Benchmark V1.2 14-Nov-2016 12.11
Compiled for 32 bit ARM v7a

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2060   2433   2430   2487   2555   2625
         2T   3966   4727   4886   4964   5091   5167
         4T   6843   8675   9208   9581  10025  10254
         8T   5360   6326  13507  10947  15929  16546
   122.9 1T    666    672   1231   2000   2368   2524
         2T   1029   1036   1993   3570   4766   5089
         4T   1062   1098   2144   4166   7694   9835
         8T   1737   1793   3540   6473  10502  14201
   49152 1T    164    172    339    658   1247   2014
         2T    289    307    591   1124   2192   3839
         4T    410    353    813   1692   3015   6058
         8T    429    426    842   1495   2949   5790

Total Elapsed Time 56.3 seconds

Maximum RAM Speed Estimate = 426 x 16 = 6816 MB/second

Android 7.0

ARM/Intel MP-BusSpd2 Benchmark V1.2 11-May-2017 10.35
Compiled for 32 bit ARM v7a

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   2151   2396   2448   2516   2589   2632
         2T   4042   4460   4824   4893   5336   5192
         4T   6828   8657   9409   9755  10120  10339
         8T   5401   6897  13508  11464  15960  16792
   122.9 1T    674    692   1267   2019   2402   2584
         2T   1031   1043   1999   3591   4737   5047
         4T   1064   1164   2168   4185   7761   9879
         8T   1734   1857   3429   6438  10447  15287
   49152 1T    163    172    337    674   1236   2098
         2T    297    282    566   1101   2175   3735
         4T    431    390    751   1470   3053   5716
         8T    406    369    786   1621   2897   6031

Total Elapsed Time 57.0 seconds

###################### T22 32 Bit ######################

T22, Tab 2 A8-50, 1.3 GHz quad core 64 bit ARM Cortex-A53
Single Channel RAM, LPDDR3 666 MHz, 5.3 GB/second
4 x 32 KB L1 cache, 512 KB L2 cache

ARM/Intel MP-BusSpd Benchmark V1.2 12-Aug-2015 16.13
Compiled for 32 bit ARM v7a

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   1849   2140   2079   2211   2270   2297
         2T   3663   4252   4294   4400   4370   4580
         4T   4630   5574   5691   5893   6015   6083
         8T   5331   5775   6033   6622   7968   8023
   122.9 1T    597    621   1119   1815   2135   2237
         2T    869    943   1644   2992   3740   4412
         4T    949    951   1922   3736   6468   7779
         8T    948    978   1911   3717   6464   7542
   12288 1T    123    174    344    678   1215   1840
         2T    243    310    672   1332   2383   3974
         4T    302    285    594   1282   2271   4606
         8T    279    295    654   1198   2749   4660

Total Elapsed Time 12.8 seconds

########## T22 Long Version

ARM/Intel MP-BusSpd2 Benchmark V1.2 12-Aug-2015 16.14
Compiled for 32 bit ARM v7a

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB     Inc32  Inc16   Inc8   Inc4   Inc2  RdAll

    12.3 1T   1877   2124   2176   2266   2296   2343
         2T   3625   4198   4341   4468   4536   4613
         4T   5733   7541   8293   8830   8024   9042
         8T   2985   3829   7438   6117   8108   8923
   122.9 1T    604    625   1142   1846   2150   2284
         2T    924    950   1793   3277   4270   4504
         4T    962    989   1939   3765   6798   8862
         8T    965    993   1933   3748   6651   8239
   49152 1T    165    175    344    677   1285   1979
         2T    234    238    482    961   1907   3547
         4T    266    298    562   1224   2296   4478
         8T    272    275    538   1098   2149   4282

Total Elapsed Time 48.8 seconds

Maximum RAM Speed Estimate = 298 x 16 = 4768 MB/second

###################### T22 64 Bit ######################

ARM/Intel MP-BusSpd2 Benchmark V1.2 12-Aug-2015 16.18
Compiled for 64 bit ARM v8a

MB/Second Reading Data, 1, 2, 4 and 8 Threads

      KB
Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 2610 2472 2586 2727 2748 5841 2T 4404 4681 4994 5369 5420 11297 4T 6546 8125 9105 10243 10319 20610 8T 3380 4023 7919 7146 9871 19852 122.9 1T 604 621 1110 1872 2446 5100 2T 919 948 1855 3433 4853 10037 4T 961 974 1984 3924 7491 14935 8T 963 942 1931 3915 7572 14689 49152 1T 173 177 340 692 1300 2653 2T 266 241 479 968 1883 3724 4T 304 277 556 1130 2126 4328 8T 279 278 544 1138 2179 4275 Total Elapsed Time 49.4 seconds #################### A1 Original ####################### A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4 Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s 4 x 24 KB L1, 2 x 1 MB L2 Android MP-BusSpd v7 Benchmark V1.1 05-May-2015 13.02 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 3990 4458 6123 6512 6438 6729 2T 3894 5699 8948 10299 11800 12555 4T 5046 7109 11952 14750 15533 23304 8T 4533 7464 13097 16970 21674 22225 122.9 1T 1304 1613 2291 2661 3667 5063 2T 2568 3145 4529 5365 7440 10147 4T 4117 4801 7963 7495 8239 18911 8T 3130 5016 7355 8543 11648 15845 12288 1T 190 265 601 1203 2316 3832 2T 244 448 995 1771 3599 6575 4T 427 584 860 1741 3439 7449 8T 395 510 855 1613 3547 6776 Total Elapsed Time 13.5 seconds ################## A1 V1 Android 5.0 ################### Android MP-BusSpd v7 Benchmark V1.1 05-Nov-2015 11.52 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5509 6152 6796 6937 7060 7056 2T 4635 6757 9294 11284 12612 13486 4T 4545 9383 15861 21378 15369 23493 8T 4473 8723 15965 18476 23438 22747 122.9 1T 1467 1782 2386 2737 3799 5299 2T 2225 3460 4683 5421 7507 10514 4T 2493 5703 8165 9941 11313 11259 8T 4119 5481 6992 8726 12919 17166 12288 1T 213 253 589 1176 2309 3903 2T 252 396 842 1668 3325 6759 4T 404 437 1130 1659 4562 6911 8T 414 507 836 1902 3607 6670 Total Elapsed Time 13.9 seconds #################### A1 ARM-Intel ###################### ARM/Intel MP-BusSpd v7 Benchmark V1.1 05-May-2015 14.28 MB/Second 
Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5925 6494 6778 6979 7047 7026 2T 3966 7029 9689 11689 12856 13654 4T 4438 8698 16739 22057 23946 25729 8T 4455 8619 15787 19934 22576 20804 122.9 1T 1490 1975 2360 2802 3818 5330 2T 2881 3798 4647 5531 7536 10546 4T 4452 6338 5910 10217 14650 19903 8T 4096 5075 6264 9213 12610 15821 12288 1T 206 273 593 1198 2343 3935 2T 276 455 842 1821 3319 6591 4T 445 730 1401 2076 4457 7525 8T 424 539 954 1829 3688 7064 Total Elapsed Time 13.0 seconds ########## A1 New Long Version ARM/Intel MP-BusSpd2 Benchmark V1.0 24-Jul-2015 15.50 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5431 6110 6780 6262 6655 7313 2T 3550 4464 7375 9825 11777 12442 4T 2027 4442 4399 8841 17611 23509 8T 983 2477 5063 4433 8568 15867 122.9 1T 1499 1991 2357 2839 3818 5382 2T 2816 3808 4708 5592 7557 10677 4T 4316 6313 7991 9816 14335 19993 8T 4235 5610 7917 8791 12828 19661 49152 1T 215 275 611 1183 2328 3922 2T 276 435 787 1671 3323 6507 4T 398 455 884 1754 3490 6971 8T 376 511 867 1746 3512 7510 Total Elapsed Time 48.6 seconds Maximum RAM Speed Estimate = 511 x 16 = 8176 MB/second ########### A5 ARM-Intel Dual Boot With W2 ############# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Android 5.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 ARM/Intel MP-BusSpd2 Benchmark V1.2 14-Apr-2016 17.28 Compiled for 32 bit Intel x86 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5322 6275 6475 6901 6959 6925 2T 4625 4163 6792 8964 10879 11027 4T 2221 3775 4091 8006 15158 19631 8T 1178 1840 3907 3884 8002 15691 122.9 1T 1438 1891 2342 2601 3477 4957 2T 2509 3489 4597 5115 6807 9275 4T 3591 4849 6905 8356 11204 14596 8T 3868 5327 7014 7860 10754 15998 49152 1T 179 205 391 802 1372 3023 2T 238 310 495 1204 2397 4559 4T 240 336 653 1170 2008 4969 8T 291 321 681 1316 2378 5329 Total Elapsed Time 50.3 seconds Maximum RAM Speed Estimate = 336 x 16 = 5376 
MB/second #################### W1 REMIX 32 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel MP-BusSpd Benchmark V1.2 21-Oct-2016 14.29 Compiled for 32 bit Intel x86 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5659 5848 5977 6263 6100 6481 2T 4075 6144 7960 9632 10899 11283 4T 3766 6335 7923 9544 10679 11425 8T 3531 6367 7693 7739 8336 7918 122.9 1T 1389 1492 2456 2702 1564 5013 2T 2080 2904 2943 3073 4785 7541 4T 1995 2761 4446 4114 5075 8011 8T 1673 2504 2711 3097 6693 8366 12288 1T 190 230 453 877 1681 2396 2T 222 246 405 1287 2291 3926 4T 180 299 588 1469 2951 5002 8T 303 380 701 1265 2476 6796 Total Elapsed Time 14.2 seconds Maximum RAM Speed Estimate = 380 x 16 = 6080 MB/second #################### W1 REMIX 64 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel MP-BusSpd2 Benchmark V1.2 11-Nov-2016 21.25 Compiled for 64 bit Intel x86_64 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 3870 3871 4281 4386 4382 16766 2T 3290 3312 5048 5924 6729 22511 4T 4909 6232 6866 7231 7745 27366 8T 2662 3012 6328 6364 8818 26211 122.9 1T 1506 1534 2471 2433 3510 9204 2T 2071 2479 3727 4428 5757 17952 4T 2636 2833 5013 4918 7263 22352 8T 2552 3360 5211 6178 7819 23389 49152 1T 243 245 565 1037 1469 3522 2T 329 370 565 1425 2421 4783 4T 329 387 673 1501 3148 4866 8T 402 433 858 1681 2838 6987 Total Elapsed Time 53.8 seconds Maximum RAM Speed Estimate = 433 x 16 = 6928 MB/second ################# W1 Windows 10 32 bit ################# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10 4 GB DDR3 1600 dual channel 12.8 GB/s MP-BusSpeed From C/C++ 18.00.21005.1 for x86 Start of test Wed Dec 23 20:57:34 2015 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 6170 6348 6836 6869 
7029 6743 2T 1859 3059 5657 7800 9685 10880 4T 989 1804 3289 5900 10157 16055 8T 473 843 1578 3101 5665 10124 122.9 1T 1476 1532 2319 2679 3515 4824 2T 2234 2733 4337 5226 6710 9655 4T 3428 4628 6956 8606 10978 16225 8T 2675 3965 6432 8355 11139 15714 49152 1T 241 273 565 1090 2130 3848 2T 346 409 734 1591 3082 5762 4T 499 496 947 1887 3818 7634 8T 476 500 930 1888 3932 7625 End of test Wed Dec 23 20:58:22 2015 Maximum RAM Speed Estimate = 500 x 16 = 8000 MB/second ################# W1 Windows 10 64 bit ################# MPbusSpeed64 From C/C++ 18.00.21005.1 for x64 Start of test Wed Dec 23 21:15:07 2015 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5222 6158 6233 6523 6404 6580 2T 1882 3670 6113 8124 9540 10760 4T 1089 1817 3378 6083 10832 15242 8T 505 837 1846 3250 5899 9788 122.9 1T 1424 1540 2285 2544 3490 4854 2T 2567 2756 4233 4920 6579 9820 4T 3444 4858 6699 8186 11628 16690 8T 2593 3644 5671 7370 9304 13630 49152 1T 240 268 566 1097 2070 3860 2T 342 411 754 1448 2940 5836 4T 451 494 894 1902 3804 7526 8T 424 503 935 1830 3710 7180 End of test Wed Dec 23 21:15:55 2015 ######## W2 Windows 10 32 bit Dual Boot With A5 ######## Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 MP-BusSpeed From C/C++ 18.00.21005.1 for x86 Start of test Fri Apr 15 16:19:46 2016 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5387 5874 6023 6023 6158 6175 2T 2051 3414 5527 6968 9063 9875 4T 1105 1897 3213 5706 9238 13066 8T 452 830 1874 3063 5620 8967 122.9 1T 1266 1286 2041 2420 3084 4283 2T 2258 2657 3976 4624 5973 8438 4T 3163 4119 5893 7241 10447 15588 8T 2540 3404 5628 8170 8647 12274 49152 1T 139 170 319 592 986 2063 2T 202 225 442 802 1633 3542 4T 295 359 597 1220 2489 5001 8T 282 313 651 1159 2359 5166 End of test Fri Apr 15 16:20:38 2016 Maximum RAM Speed Estimate = 313 x 16 = 5008 MB/second ######## W2 Windows 10 64 bit Dual Boot With A5 
######## MPbusSpeed64 From C/C++ 18.00.21005.1 for x64 Start of test Fri Apr 15 16:31:03 2016 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 5414 5881 5982 6593 6320 6915 2T 2004 3844 6095 8469 10032 11237 4T 977 1709 3311 6239 12238 17994 8T 498 862 1737 3185 5915 10456 122.9 1T 1515 1537 2447 2750 3625 5040 2T 2330 2730 4064 4923 6364 9105 4T 3702 4830 7300 8835 11707 16740 8T 2587 3613 5718 7715 9699 16216 49152 1T 183 198 429 834 1652 3143 2T 244 303 565 1144 2221 4537 4T 346 324 644 1284 2552 5123 8T 306 307 618 1249 2421 4874 End of test Fri Apr 15 16:31:54 2016 #################### PC REMIX 32 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel MP-BusSpd2 Benchmark V1.2 21-Oct-2016 12.32 Compiled for 32 bit Intel x86 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 13032 13915 24235 25197 22774 23523 2T 12780 25046 41965 50097 47757 52797 4T 27568 24981 36907 46686 50510 64687 8T 14880 22221 47422 54616 80188 96729 122.9 1T 7133 6612 9381 15623 21204 26016 2T 7641 13474 22117 24280 44150 51649 4T 19935 25520 43348 41204 69425 101560 8T 31478 38036 59094 79377 96106 103008 49152 1T 712 1034 2181 4347 8729 13516 2T 1510 2074 2393 8057 15548 27128 4T 2952 2228 6703 13593 27804 42109 8T 4961 4460 8805 25670 49205 68560 Total Elapsed Time 53.2 seconds #################### PC REMIX 64 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel MP-BusSpd2 Benchmark V1.2 11-Nov-2016 14.29 Compiled for 64 bit Intel x86_64 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 11234 11268 11549 9728 11075 83709 2T 13975 18788 21241 20376 21981 126069 4T 11950 16021 25702 25888 22591 129598 8T 7847 11333 22999 26446 39027 
137208 122.9 1T 7270 7472 9070 11037 11565 57013 2T 12151 13359 18497 21814 22939 110321 4T 23054 19821 35736 42796 23494 145387 8T 25125 32352 39249 44178 46373 261178 49152 1T 651 966 1872 3496 7749 18057 2T 930 1979 3815 6002 11796 33883 4T 2876 3639 7142 13308 26695 60051 8T 3802 4639 12125 22329 39597 106907 Total Elapsed Time 56.2 seconds ============================================== Top end 2015 PC - Core i7-4820K at 3.9 GHz Quad core, 8 threads, 10 MB shared L3 cache RAM 1600 MHz, quad channel, 51.2 GB/sec ============================================== Intel/Windows 32 Bit Version MP-BusSpeed From C/C++ 18.00.21005.1 for x86 Start of test Sun Feb 14 18:30:05 2016 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 14262 14567 19724 19553 19374 19743 2T 10737 12187 18359 23285 31442 31491 4T 5537 7660 13862 24507 32888 42530 8T 3967 6138 14340 22999 39199 60117 122.9 1T 7263 7213 11664 16448 19425 20552 2T 10361 9428 20446 31143 34263 40155 4T 18846 21063 38732 54792 57770 56587 8T 22328 32794 54749 69742 79276 80967 49152 1T 668 1031 2141 4185 8650 14974 2T 1210 1726 3867 7731 15627 28522 4T 2161 3177 6122 11449 25009 41192 8T 4728 4106 9842 23118 43257 61779 End of test Sun Feb 14 18:31:00 2016 Intel/Windows 64 Bit Version MPbusSpeed64 From C/C++ 18.00.21005.1 for x64 Start of test Sun Feb 14 18:46:52 2016 MB/Second Reading Data, 1, 2, 4 and 8 Threads KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll 12.3 1T 14760 14788 21402 20729 20934 21032 2T 12570 19878 27089 35589 37688 41618 4T 7000 11473 21725 34776 51827 74198 8T 3728 6525 14160 23059 40659 66975 122.9 1T 7571 7448 11828 16724 20283 21671 2T 13291 13676 22360 32586 39872 42740 4T 18270 21303 37555 62890 78583 84191 8T 21030 30880 53098 71255 91804 103575 49152 1T 663 1037 2159 4187 8611 15218 2T 1207 1720 3908 6418 15470 27796 4T 2319 2382 7002 13639 23754 46951 # 8T 4728 5602 12178 21784 35170 80274 # End of test Sun Feb 14 18:47:43 2016 # Some data from sharesd 10 MB L3 
cache ######### Comparison MB/sec/MHz and RAM MB/sec ######### Unless indicated all are quad core CPUs, Core i7 runs up to 8 threads using HyperThreading dual 8 core T7 T11 T21 A1 A5 P37 KB Cortex Cortex Qualcom Atom Atom Cortex A9 A15 800 Z3745 z8300 A53 MB/sec/MHz 12.3 1T 2.72 3.56 2.61 3.93 4.81 1.75 2T 4.84 6.66 5.07 6.69 7.66 3.44 4T 9.81 6.36 9.80 12.64 13.63 6.84 8T 9.62 6.53 9.65 8.53 10.90 11.03 122.9 1T 0.82 1.96 2.49 2.89 3.44 1.68 2T 1.51 3.07 4.90 5.74 6.44 3.39 4T 2.31 2.83 9.85 10.75 10.14 6.56 8T 2.28 3.04 9.83 10.57 11.11 9.47 RAM MB/sec 49152 1T 330 1499 2929 3922 3023 2014 2T 620 2355 4776 6507 4559 3839 4T 835 2376 8243 6971 4969 6058 8T 851 2352 7907 7510 5329 5790 4T gain L1 3.61 1.79 3.75 3.21 2.83 3.91 L2 2.82 1.44 3.96 3.71 2.94 3.90 RAM 2.53 1.59 2.81 1.78 1.64 3.01 ======================================================== 64 bit compilations compared with 32 bit ======================================================== Android REMIX/Android 8HT 8HT T22 32 T22 64 R1 32 R1 64 R2 32 R2 64 KB Cortex Cortex Atom Atom Corei7 Corei7 A53 A53 Z8300 Z8300 4820K 4820K MB/sec/MHz 12.3 1T 1.80 4.49 3.52 9.11 6.03 21.46 2T 3.55 8.69 6.13 12.23 13.54 32.33 4T 6.96 15.85 6.21 14.87 16.59 33.23 8T 6.86 15.27 4.30 14.25 24.80 35.18 122.9 1T 1.76 3.92 2.72 5.00 6.67 14.62 2T 3.46 7.72 4.10 9.76 13.24 28.29 4T 6.82 11.49 4.35 12.15 26.04 37.28 8T 6.34 11.30 4.55 12.71 26.41 66.97 RAM MB/sec 49152 1T 1979 2653 2396 3522 13516 18057 2T 3547 3724 3926 4783 27128 33883 4T 4478 4328 5002 4866 42109 60051 8T 4282 4275 6796 6987 68560 106907 4T gain L1 3.86 3.53 1.76 1.63 2.75 1.55 L2 3.88 2.93 1.60 2.43 3.90 2.55 RAM 2.26 1.63 2.09 1.38 3.12 3.33 64/32 Bit L1 2.49 2.59 3.56 ======================================================== 64 bit compilations compared with 32 bit ======================================================== Windows 8HT 8HT W1 32 W1 64 W2 32 W2 64 PC 32 PC 64 KB Atom Atom Atom Atom Corei7 Corei7 Z8300 Z8300 z8300 z8300 4820K 4820K MB/sec/MHz 12.3 
        1T    4.68    4.57    3.36    3.76    5.06    5.39
        2T    7.56    7.47    5.37    6.11    8.07   10.67
        4T   11.15   10.58    7.10    9.78   10.91   19.03
        8T    7.03    6.80    4.87    5.68   15.41   17.17
 122.9  1T    3.35    3.37    2.33    2.74    5.27    5.56
        2T    6.70    6.82    4.59    4.95   10.30   10.96
        4T   11.27   11.59    8.47    9.10   14.51   21.59
        8T   10.91    9.47    6.67    8.81   20.76   26.56
RAM MB/sec
 49152  1T    3848    3860    2063    3143   14974   15218
        2T    5762    5836    3542    4537   28522   27796
        4T    7634    7526    5001    5123   41192   46951 #
        8T    7625    7180    5166    4874   61779   80274 #
# Core i7 results - some data from shared 10 MB L3 cache
4T gain
    L1        2.38    2.32    2.12    2.60    2.15    3.53
    L2        3.36    3.44    3.64    3.32    2.75    3.88
    RAM       1.98    1.95    2.42    1.63    2.75    3.09
64/32 Bit L1  0.98  1.12  1.07
Android/Win  0.75  1.99  1.19  3.98
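The MB/sec/MHz and 4T gain figures in the comparison tables can be reproduced from the raw RdAll results. The following is a minimal sketch, using the T21 "New Long Version" numbers and the 2150 MHz clock quoted for that tablet. The function names are illustrative and not part of the benchmark. The multiplier of 16 in the RAM estimate is taken from the logged "Maximum RAM Speed Estimate" lines; the assumption here is that it reflects the Inc16 test touching one word in every 16.

```python
def mb_per_sec_per_mhz(mb_per_sec, cpu_mhz):
    """Normalise a measured speed by CPU clock, rounded as in the tables."""
    return round(mb_per_sec / cpu_mhz, 2)


def thread_gain(multi_thread_speed, single_thread_speed):
    """Ratio of a multi-thread speed to the 1 thread speed."""
    return round(multi_thread_speed / single_thread_speed, 2)


def ram_speed_estimate(inc16_mb_per_sec, factor=16):
    """Estimate as logged: Inc16 speed at the largest data size times 16."""
    return inc16_mb_per_sec * factor


# T21 RdAll results (MB/second), keyed by thread count.
T21_MHZ = 2150
t21_l1 = {1: 5614, 4: 21068}    # 12.3 KB
t21_l2 = {1: 5345, 4: 21169}    # 122.9 KB
t21_ram = {1: 2929, 4: 8243}    # 49152 KB

l1_per_mhz = mb_per_sec_per_mhz(t21_l1[1], T21_MHZ)
l1_gain = thread_gain(t21_l1[4], t21_l1[1])
l2_gain = thread_gain(t21_l2[4], t21_l2[1])
ram_gain = thread_gain(t21_ram[4], t21_ram[1])
t21_ram_estimate = ram_speed_estimate(605)
```

With these inputs the sketch gives 2.61 MB/sec/MHz for one thread in L1, 4T gains of 3.75, 3.96 and 2.81, and a RAM estimate of 9680 MB/second, matching the T21 entries in the tables and log.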


This is an ARM/Intel version of the longer running MP-RndMem Benchmark, introduced because the original short version produced inconsistent performance measurements. It is a multithreaded variant of RandMem above. For further details and more results see here.
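The result columns for this benchmark are headed SerRD, SerRDWR, RndRD and RndRDWR. The sketch below assumes these mean serial read, serial read/write, random read and random read/write, with the access order supplied by an index array. This is an illustration of that access pattern only; the function names and the exact read/write operation are hypothetical and not taken from the benchmark source.

```python
import random


def make_indexes(words, seed=1):
    """Return a serial index list and a shuffled copy covering the same words."""
    serial = list(range(words))
    shuffled = serial[:]
    random.Random(seed).shuffle(shuffled)
    return serial, shuffled


def read_pass(data, index):
    """Read every word once, in the order given by index."""
    total = 0
    for i in index:
        total += data[i]
    return total


def read_write_pass(data, index):
    """Read and update every word once, in the order given by index."""
    for i in index:
        data[i] = data[i] + 1
```

Both index lists visit each word exactly once, so the totals agree and only the order of memory accesses differs. In a compiled benchmark that ordering is what separates cache friendly serial access from random access.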

On tablet A1, with the Intel Atom CPU, the initial Houdini ARM to Intel conversion speeds were significantly slower than the results from the native code compilations. This problem was overcome under Android 5, where most results were faster.

On ARM based tablets, the new ARM/Intel compilations were generally slower than the original, produced by an earlier compiler, but most of the difference was regained with the 64 bit version.

Intel/Windows Versions - The maximum data size of 12.3 MB was fine for early Android devices, but performance can be affected by shared L2 caches on later ones. The Core i7 test, at this size, mainly uses the 10 MB L3 cache.

Later Intel Comparisons - Later measurements demonstrated inconsistent performance using Intel Atom CPUs. For example, a second run of the tests could be faster, and results also varied between Windows and REMIX (Android) benchmarks and between 32 bit and 64 bit versions. On the other hand, all these comparisons were fairly consistent on the Intel Core i7 tests.

##################### T7 Original ###################### T7, ARM Cortex-A9 1200 MHz, Android 4.1.2, 4 x 32 KB L1 cache, 1 MB shared L2 cache Android MP-RndMem2 Benchmark V2.1 06-May-2015 12.17 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 3120 3060 3128 3078 2T 6098 3003 6083 3004 4T 11354 2948 11188 2942 8T 11403 2857 10412 2872 122.9 1T 996 983 661 699 2T 1868 984 1012 697 4T 2600 982 1483 699 8T 2534 976 1459 694 12288 1T 335 286 91 80 2T 640 288 113 82 4T 892 286 130 82 8T 925 287 127 81 Total Elapsed Time 44.7 seconds #################### T7 ARM-Intel ##################### ARM/Intel MP-RndMem Benchmark V1.1 06-May-2015 11.59 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 3060 2001 2867 1904 2T 5459 1879 5463 1867 4T 10797 1852 10537 1856 8T 10090 1802 10608 1813 122.9 1T 968 823 588 547 2T 1749 785 902 618 4T 2716 812 1328 672 8T 2733 810 1407 673 12288 1T 329 274 90 82 2T 636 272 112 82 4T 849 271 128 82 8T 869 271 126 81 Total Elapsed Time 45.4 seconds #################### T11 Original ##################### T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2 2 x 32 KB L1 cache, 1 MB shared L2 cache Android MP-RndMem2 Benchmark V2.1 06-May-2015 12.13 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 6696 4438 6594 4483 2T 12338 3078 12263 3573 4T 12419 2834 12166 2907 8T 12314 2903 11991 2934 122.9 1T 3371 2916 1639 1748 2T 6409 1922 2052 1097 4T 6155 1892 2027 1186 8T 6045 2105 2015 1192 12288 1T 1394 1048 153 133 2T 2245 985 285 123 4T 2277 1002 285 132 8T 2165 1001 286 127 Total Elapsed Time 44.0 seconds #################### T11 ARM-Intel #################### ARM/Intel MP-RndMem Benchmark V1.1 06-May-2015 12.07 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 6315 4486 6345 4484 2T 11837 2910 11846 3112 4T 11864 2835 11553 2858 8T 11821 3003 11805 3198 122.9 1T 3963 2681 1670 1704 2T 6672 1782 2040 1125 4T 6493 1817 2033 1218 8T 
6673 1738 2038 1303 12288 1T 1805 1081 177 145 2T 2543 1066 279 137 4T 2600 1065 276 136 8T 2662 1073 281 138 Total Elapsed Time 43.7 seconds #################### T21 Original ##################### T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4 Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s L1 caches 4 x 16 KB, L2 cache shared 2048 KB Android MP-RndMem2 Benchmark V2.1 08-Jul-2015 16.33 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 5088 5325 4262 4711 2T 9752 4902 8895 4570 4T 17379 4653 17434 4096 8T 19771 4698 17358 4424 122.9 1T 2714 2578 1923 2163 2T 5614 2502 3483 2107 4T 10859 2219 4835 1972 8T 10654 2410 4904 1923 12288 1T 1798 952 186 204 2T 3489 974 341 195 4T 6515 943 563 196 8T 6218 922 563 187 Total Elapsed Time 42.3 seconds #################### T21 ARM-Intel #################### ARM/Intel MP-RndMem Benchmark V1.1 09-Jul-2015 11.48 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 4186 3777 4055 3933 2T 9324 3541 7710 3619 4T 16594 3350 15731 3142 8T 18117 3291 16187 3262 122.9 1T 2423 2043 1610 1683 2T 5235 2029 3013 1641 4T 10148 1935 4662 1565 8T 10015 1834 4611 1474 12288 1T 1363 886 171 186 2T 2643 845 325 187 4T 5197 823 534 184 8T 4801 835 542 184 Total Elapsed Time 42.6 seconds ###################### P37 32 Bit ###################### P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1 Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second 8 x 32 KB L1 cache, 512 KB shared L2 cache ARM/Intel MP-RndMem Benchmark V1.2 14-Nov-2016 12.13 Compiled for 32 bit ARM v7a MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 3464 2779 3249 2792 2T 6473 2549 6471 2574 4T 12671 2355 12644 2243 8T 20039 2055 19677 1837 122.9 1T 3142 2667 843 847 2T 6072 2463 1552 785 4T 11678 2098 2400 675 8T 15639 2228 3822 668 12288 1T 2404 887 71 70 2T 4058 899 141 69 4T 5665 867 258 67 8T 7169 881 410 66 Total Elapsed Time 49.2 seconds Android 7.0 ARM/Intel MP-RndMem Benchmark V1.2 
17-Mar-2017 10.43 Compiled for 32 bit ARM v7a MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 3497 2803 3267 2770 2T 6443 2600 6495 2585 4T 12818 2264 12751 2318 8T 20056 2121 19918 2160 122.9 1T 3148 2672 824 865 2T 6104 2493 1562 800 4T 11723 2203 2423 698 8T 16376 2120 3930 733 12288 1T 2554 931 73 72 2T 4276 909 148 70 4T 6703 872 267 68 8T 6425 914 407 67 Total Elapsed Time 47.9 seconds #################### T22 Original ###################### T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2 4 x 32 KB L1 cache, 512 KB shared L2 cache Android MP-RndMem2 Benchmark V2.1 11-Nov-2015 13.03 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 3401 3874 3435 3892 2T 6777 3817 6592 3773 4T 13025 3729 12630 3685 8T 12848 3654 12113 3654 122.9 1T 3257 3583 827 946 2T 6416 3572 1481 943 4T 11897 3564 2205 934 8T 11106 3550 2173 945 12288 1T 2397 1734 82 93 2T 4652 1725 161 94 4T 5834 1748 287 94 8T 4774 1743 276 93 Total Elapsed Time 45.9 seconds ###################### T22 32 Bit ###################### ARM/Intel MP-RndMem Benchmark V1.2 12-Aug-2015 17.13 Compiled for 32 bit ARM v7a MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 2894 2438 2887 2433 2T 5665 2402 5663 2403 4T 10922 2369 11100 2310 8T 10065 2293 10648 2265 122.9 1T 2681 2368 757 758 2T 5351 2360 1398 769 4T 10056 2308 2121 772 8T 8838 2351 1916 742 12288 1T 2309 1662 80 78 2T 3986 1683 164 73 4T 5419 1684 283 82 8T 4658 1694 279 82 Total Elapsed Time 44.6 seconds ###################### T22 64 Bit ###################### ARM/Intel MP-RndMem Benchmark V1.2 12-Aug-2015 17.15 Compiled for 64 bit ARM v8a 12.29 1T 4445 3109 4455 3089 2T 8010 3100 8072 3105 4T 15909 3057 14711 3040 8T 14764 3036 14570 3037 122.9 1T 3457 2888 842 876 2T 6537 2924 1524 876 4T 11095 2892 2119 861 8T 11729 2916 2080 874 12288 1T 2475 1679 81 78 2T 4155 1713 163 73 4T 5503 1711 285 89 8T 4519 1717 281 89 Total Elapsed Time 48.1 seconds 
#################### A1 Original ####################### A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4 Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s 4 x 24 KB L1, 2 x 1 MB L2 Android MP-RndMem2 Benchmark V2.1 06-May-2015 12.14 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 1337 2505 1337 2509 2T 2637 2513 2657 2521 4T 3535 2420 3484 2454 8T 3195 2403 3088 2406 122.9 1T 1305 2280 963 1758 2T 2581 2285 1945 1748 4T 3588 2130 3125 1740 8T 3211 2269 2949 1745 12288 1T 1248 1962 101 215 2T 2469 1940 191 214 4T 3462 1954 323 214 8T 3127 1926 318 212 Total Elapsed Time 43.7 seconds ################## A1 V1 Android 5.0 ################### Android MP-RndMem2 Benchmark V2.1 05-Nov-2015 11.55 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 5580 5533 5554 5455 2T 10460 5393 8625 5336 4T 15584 5013 12183 5211 8T 14687 4850 9754 4882 122.9 1T 4180 4368 2557 2522 2T 8301 4276 5072 2511 4T 15613 4238 7764 2425 8T 14496 4259 7278 2466 12288 1T 3360 2180 239 239 2T 6219 2140 379 240 4T 6758 2135 418 238 8T 6991 2131 418 232 Total Elapsed Time 47.6 seconds #################### A1 ARM-Intel ###################### ARM/Intel MP-RndMem Benchmark V1.1 06-May-2015 11.54 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 4643 3593 4710 3641 2T 8583 3552 8761 3564 4T 12707 3450 12496 3384 8T 10410 3389 10796 3408 122.9 1T 3733 2874 2408 2150 2T 7259 2871 4781 2165 4T 11726 2897 7656 2133 8T 11673 2853 7100 2113 12288 1T 3153 2087 226 238 2T 5782 2073 327 238 4T 6451 1997 447 236 8T 6471 2071 446 233 Total Elapsed Time 41.5 seconds ########### A5 ARM-Intel Dual Boot With W2 ############# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Android 5.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 ARM/Intel MP-RndMem Benchmark V1.2 14-Apr-2016 17.41 Compiled for 32 bit Intel x86 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 4395 3558 4562 3346 2T 8094 3465 7975 3372 4T 11923 3377 
11375 3220 8T 10165 3207 10220 3205 122.9 1T 3519 2796 2360 1993 2T 6875 2591 4233 1970 4T 10225 2761 5943 1935 8T 10158 2755 6363 2052 12288 1T 2586 1846 187 192 2T 3890 1728 310 213 4T 5035 1986 373 194 8T 3972 1887 359 186 Total Elapsed Time 44.0 seconds #################### W1 REMIX 32 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel MP-RndMem Benchmark V1.2 21-Oct-2016 14.32 Compiled for 32 bit Intel x86 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 4504 3504 4322 3382 2T 7137 2799 5874 3446 4T 8441 2526 7759 3049 8T 7693 1763 8478 1300 122.9 1T 2947 2777 2389 2086 2T 5791 2196 3345 1799 4T 6721 1821 4257 1475 8T 7466 1129 4926 1201 12288 1T 3026 2278 201 239 2T 3850 1687 326 218 4T 4451 1772 304 215 8T 5007 1407 407 160 Total Elapsed Time 47.0 seconds #################### W1 REMIX 64 Bit ################### R1 Intel Atom Z8300 quad core 1.84 GHz Android 6.0.1, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB Shared L2 ARM/Intel MP-RndMem Benchmark V1.2 11-Nov-2016 21.30 Compiled for 64 bit Intel x86_64 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 3501 2736 3655 2561 2T 5999 2462 6015 1922 4T 7295 1306 5998 1930 8T 7895 983 7769 1607 122.9 1T 2851 2036 2273 1861 2T 4950 1772 2973 1623 4T 6384 1405 4053 1292 8T 6409 1046 4598 1049 12288 1T 2362 1826 207 225 2T 3609 1356 349 185 4T 3711 1378 288 174 8T 4910 1131 436 120 Total Elapsed Time 51.0 seconds ################# W1 Windows 10 32 bit ################# Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10 4 GB DDR3 1600 dual channel 12.8 GB/s MPRandMem32 From C/C++ 18.00.21005.1 for x86 Start of test Mon Dec 12 16:17:43 2016 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.3 1T 4227 5149 4457 5086 2T 7978 5490 7846 5379 4T 10589 5292 10543 5208 8T 7912 5066 8068 5137 122.9 1T 3571 3893 2345 2380 2T 6453 3867 4227 2327 4T 11784 3845 6403 
2385 8T 11449 3950 6431 2373 12288 1T 2948 2750 222 227 2T 4889 2761 408 229 4T 6290 2771 532 231 8T 6256 2724 534 269 End of test Mon Dec 12 16:18:27 2016 ################# W1 Windows 10 64 bit ################# MPRandMem64 From C/C++ 18.00.21005.1 for x64 Start of test Mon Dec 12 16:22:12 2016 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.3 1T 3816 4658 3884 4495 2T 7060 4531 6971 4390 4T 12603 4383 12604 4334 8T 12435 4179 12493 4215 122.9 1T 3212 3594 2431 2248 2T 5919 3437 4220 2302 4T 11178 3459 6838 2299 8T 10630 3539 6775 2280 12288 1T 2789 2689 228 229 2T 4688 2663 424 242 4T 6079 2670 561 250 8T 6061 2667 562 270 End of test Mon Dec 12 16:22:55 2016 ######## W2 Windows 10 32 bit Dual Boot With A5 ######## Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84 Windows 10, 4 GB DDR 3 1600 4 x 24 KB L1, 2 x 1 MB L2 MPRandMem32 From C/C++ 18.00.21005.1 for x86 Start of test Mon Dec 12 16:29:17 2016 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.3 1T 4151 4929 4126 5104 2T 7501 5063 7496 4887 4T 10549 4933 10620 5206 8T 7259 5126 7278 5072 122.9 1T 3576 3997 2358 2372 2T 6223 3629 3763 2206 4T 11064 3709 6300 2234 8T 11442 3464 5399 2334 12288 1T 2691 2043 195 203 2T 3706 1999 315 217 4T 5382 2098 371 205 8T 5067 1925 352 197 End of test Mon Dec 12 16:30:01 2016 ######## W2 Windows 10 64 bit Dual Boot With A5 ######## MPRandMem64 From C/C++ 18.00.21005.1 for x64 Start of test Mon Dec 12 16:26:52 2016 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.3 1T 3606 4076 3535 4068 2T 5461 3879 6031 3761 4T 11092 3779 10265 4028 8T 9485 3753 9284 3728 122.9 1T 2465 2726 1897 1916 2T 4836 2957 3673 2066 4T 8259 3168 4491 1974 8T 10424 3125 6583 2052 12288 1T 2246 1655 187 188 2T 3245 1769 301 187 4T 4933 1560 360 186 8T 4345 1790 344 175 End of test Mon Dec 12 16:27:38 2016 #################### PC REMIX 32 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB 
L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel MP-RndMem Benchmark V1.2 21-Oct-2016 12.49 Compiled for 32 bit Intel x86 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 25329 28404 22502 27578 2T 45352 28049 43404 29901 4T 67532 27226 66231 26721 8T 73022 27909 70942 29210 122.9 1T 24237 24426 12519 8183 2T 40910 24130 22546 8612 4T 67966 22138 28955 7129 8T 74659 18872 46730 7929 12288 1T 14375 12505 1139 1127 2T 27645 11799 2248 1105 4T 48129 11772 3564 1078 8T 72818 12119 4256 775 Total Elapsed Time 43.6 seconds #################### PC REMIX 64 Bit ################### R2 Core i7 4820K quad core + HT at 3900 MHz Turbo 4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3 800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1, ARM/Intel MP-RndMem Benchmark V1.2 11-Nov-2016 14.35 Compiled for 64 bit Intel x86_64 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.29 1T 26834 29490 27107 28870 2T 54416 29824 53434 25831 4T 105809 27746 56139 20591 8T 85898 19779 84818 21910 122.9 1T 23931 25524 11601 8270 2T 48842 25062 23859 8412 4T 98110 22674 47244 7154 8T 89250 16270 53559 5951 12288 1T 15175 12540 1077 1127 2T 29600 11483 2342 1095 4T 43737 10585 2200 904 8T 78035 11667 4351 755 Total Elapsed Time 46.1 seconds ============================================== Top end 2015 PC - Core i7-4820K at 3.9 GHz Quad core, 8 threads, 10 MB shared L3 cache RAM 1600 MHz, quad channel, 51.2 GB/sec ============================================== Intel/Windows 32 Bit Version MPRandMem32 From C/C++ 18.00.21005.1 for x86 Start of test Tue Feb 23 16:05:00 2016 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.3 1T 26590 29369 27321 27593 2T 52121 30063 48980 27757 4T 66651 29464 72519 27466 8T 58774 28464 57426 26236 122.9 1T 25876 28670 13416 8815 2T 46692 28183 21803 8767 4T 82678 28469 46885 8497 8T 83158 28482 49158 8677 12288 1T 16527 13042 1196 1191 2T 27888 12767 2389 1188 4T 49291 13049 3393 1191 8T 
84109 12954 4176 1192 End of test Tue Feb 23 16:05:41 2016 Intel/Windows 64 Bit Version MPRandMem64 From C/C++ 18.00.21005.1 for x64 Start of test Tue Feb 23 16:06:04 2016 MB/Second Using 1, 2, 4 and 8 Threads KB SerRD SerRDWR RndRD RndRDWR 12.3 1T 26322 28220 25930 28695 2T 54658 30081 39512 27874 4T 99694 29950 89274 27925 8T 88620 29773 85848 27924 122.9 1T 25196 27993 13424 8633 2T 44627 28207 21816 8785 4T 65329 28108 44155 8620 8T 91445 28208 53751 8715 12288 1T 17662 13110 1301 1198 2T 32242 12856 2595 1198 4T 57536 13117 4905 1197 8T 85697 13079 4645 1197 End of test Tue Feb 23 16:06:46 2016


The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f, with 2 and 32 operations per input data word, using 1, 2, 4 and 8 threads. Three data sizes are used, to exercise L1 cache, L2 cache and RAM, at 12.8, 128 and 12800 KB (3200, 32000 and 3200000 single precision floating point words). Each thread carries out the same calculations but accesses a different segment of the data. The program checks for consistent numeric results, primarily to show that all calculations are carried out and can be run. The numeric results start with values of 1.0, with subsequent calculations reducing the values, the amount depending on the number of calculations. Further details, results and links to download the original MP-MFLOPS benchmark can be found here, with more details of the latest MP-MFLOPS2 compilations here. The newer versions have longer running times, which avoid the inconsistent speeds produced by the original.
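The structure described above can be sketched as follows. This is only an illustration in Python, not the benchmark's own compiled code: the function names, the constants A and B and the small repeat count are assumptions made for the example, and the 2 operations per word case is assumed to be the first term of the expression, x[i] = (x[i] + a) * b.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical constants, chosen so that values drift just below 1.0.
A = 0.000020
B = 0.999980


def update_segment(x, start, end, repeats):
    """Apply x[i] = (x[i] + A) * B to one segment: 2 operations per word."""
    for _ in range(repeats):
        for i in range(start, end):
            x[i] = (x[i] + A) * B


def run_test(words, threads, repeats=2):
    """Run the 2 operations per word test and return (data, MFLOPS)."""
    x = [1.0] * words                  # numeric results start at 1.0
    seg = words // threads             # each thread works on its own segment
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        jobs = [pool.submit(update_segment, x, t * seg, (t + 1) * seg, repeats)
                for t in range(threads)]
        for job in jobs:
            job.result()               # wait, and surface any exception
    seconds = time.perf_counter() - started
    mflops = words * 2 * repeats / seconds / 1e6
    return x, mflops


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        data, mflops = run_test(3200, threads)   # 3200 words = 12.8 KB
        consistent = len(set(data)) == 1         # every word should be identical
        print(threads, "threads", round(mflops, 1), "MFLOPS", data[0], consistent)
```

Python threads do not execute this loop in parallel, so the speeds printed will not scale with thread count as they do in the results below; the sketch only shows the data segmentation, the MFLOPS arithmetic and the consistency check.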

Using Tablet A1, with the Intel Atom CPU, the original ARM only version was much slower than the native code variety at 32 operations per word, and running it via Android 5.0 was not much faster. On ARM based systems, there was little difference between the original and later compilations.

Tablet T22 results, from the 64 bit compilation, showed that it could be much faster than the 32 bit benchmark, up to 3.7 times at 2 operations per word. The reason is that 64 bit vector SIMD instructions were produced, instead of scalars.

MFLOPS/MHz Comparisons - These are provided to compare different CPU technology. None of these are particularly good, the best being the Cortex-A53 at 64 bits, producing just over 1 result per cycle per CPU core.
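The comparison figures are simply measured MFLOPS divided by the CPU clock in MHz. A minimal sketch, using two measurements from the results below (32 operations per word, 12.8 KB); the function name is an assumption for illustration:

```python
def mflops_per_mhz(mflops, cpu_mhz):
    """Floating point results per clock cycle (all active cores combined)."""
    return mflops / cpu_mhz


# (description, measured MFLOPS, CPU MHz) taken from the results below
examples = [
    ("T7 Cortex-A9, 1 thread", 598, 1200),
    ("P37 Cortex-A53, 8 threads", 5192, 1500),
]

for name, mflops, mhz in examples:
    print(f"{name}: {mflops_per_mhz(mflops, mhz):.2f} MFLOPS/MHz")
```

These give 0.50 and 3.46, matching the corresponding entries in the comparison table.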

Intel/Windows Versions - The compiler used for these appears to be somewhat more advanced than that used for Intel/Android, implementing full SIMD SSE instructions for 64 bit and 32 bit benchmarks. The result is that a Z8300 Atom CPU core produced up to 1.66 MFLOPS/MHz. The maximum speed of a Core i7, using SSE instructions, is 4 multiplies and 4 linked adds per cycle (8 MFLOPS/MHz). This benchmark demonstrated more than 5.5 MFLOPS/MHz.
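The arithmetic behind the quoted maximum can be sketched as below. The function names are assumptions for illustration; the measurement used is the single thread Windows 64 bit Core i7 result at 32 operations per word and 128 KB (22201 MFLOPS at 3900 MHz).

```python
def sse_peak_per_cycle(lanes=4, ops_per_lane=2):
    """4 single precision lanes, each with a multiply and a linked add."""
    return lanes * ops_per_lane


def fraction_of_peak(mflops, cpu_mhz, peak_per_cycle):
    """Measured MFLOPS/MHz as a fraction of the theoretical maximum."""
    return (mflops / cpu_mhz) / peak_per_cycle


peak = sse_peak_per_cycle()                      # 8 MFLOPS/MHz
achieved = fraction_of_peak(22201, 3900, peak)   # 5.69 / 8
print(peak, round(achieved, 2))
```

On these figures the benchmark reaches roughly 71% of the single core SSE maximum.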

A5 and W2 Dual Boot Tablet - The Windows compilation is much faster than the Android version, as SSE SIMD type instructions are used. For comparable performance see A5 results below in section NEON-MFLOPS-MP Benchmark. That benchmark uses hand coded NEON intrinsic functions, rather than compiler generated machine code. Speeds from the 64 bit version appear to be somewhat faster than the 32 bit variety. However, note that there can be wide variations in recorded results.

REMIX/Android vs Windows - Windows was faster at 32 bits but performance was similar at 64 bits, for both Atom Z8300 and Core i7.

Other 64 Bit vs 32 Bit - REMIX/Android produced significantly increased speeds at 64 bits, on the Atom and Core i7.

##################### T7 Original ######################

T7, ARM Cortex-A9 1200 MHz, Android 4.1.2,

Android MP-MFLOPS2 Benchmark V2.1 05-Feb-2015 11.37

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         182     156     114     598     578     572
2T         365     321     194    1194    1163    1141
4T         716     655     233    2367    2316    2240
8T         717     682     233    2347    2371    2246

Total Elapsed Time 135.5 seconds

#################### T7 ARM-Intel #####################

ARM/Intel MP-MFLOPS2 Benchmark V2.1 28-Apr-2015 17.44

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         188     156     116     598     578     574
2T         365     319     197    1195    1161    1145
4T         682     709     237    2372    2345    2249
8T         678     731     237    2361    2381    2254

Total Elapsed Time 135.0 seconds

#################### T11 Original #####################

T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2

Android MP-MFLOPS2 Benchmark V2.1 29-Apr-2015 10.22

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         845     817     544    1546    1539    1512
2T        1593    1668     648    3140    3067    2977
4T        1974    1775     645    2963    3093    2845
8T        1935    2059     652    3108    3147    2985

Total Elapsed Time 58.5 seconds

#################### T11 ARM-Intel ####################

ARM/Intel MP-MFLOPS2 Benchmark V2.1 28-Apr-2015 20.30

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         695     756     536    1537    1501    1476
2T        1319    1527     645    3151    3077    3000
4T        1604    1567     657    3035    3095    2997
8T        1604    1639     658    3108    3125    2996

Total Elapsed Time 59.1 seconds

#################### T21 Original #####################

T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4

Android MP-MFLOPS2 Benchmark V2.1 05-Jul-2015 15.35

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         718     781     590    1214    1220    1228
2T        1572    1583    1118    2406    2436    2442
4T        2338    2959    1836    4867    4911    4859
8T        3148    3266    1866    4870    4916    4888

Total Elapsed Time 56.4 seconds

#################### T21 ARM-Intel ####################

ARM/Intel MP-MFLOPS2 Benchmark V2.1 05-Jul-2015 16.50

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         822     768     636    1232    1228    1231
2T        1662    1637    1184    2460    2463    2446
4T        2509    3216    1659    4519    4762    4900
8T        2965    3193    1881    4847    4925    4880

###################### P37 32 Bit ######################

P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache

ARM/Intel MP-MFLOPS2 Benchmark V2.2 14-Nov-2016 12.16
Compiled for 32 bit ARM v7a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         229     226     217     811     810     797
2T         451     446     422    1615    1617    1591
4T         884     857     646    3213    3199    3159
8T        1309    1276     714    5192    5164    5030

Total Elapsed Time 90.7 seconds

Android 7.0

ARM/Intel MP-MFLOPS2 Benchmark V2.2 11-May-2017 10.39
Compiled for 32 bit ARM v7a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         229     227     220     814     813     801
2T         455     450     435    1626    1623    1609
4T         891     867     687    3225    3219    3181
8T        1283    1307     708    5156    5241    5142

Total Elapsed Time 90.1 seconds

###################### T22 32 Bit ######################

T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2

ARM/Intel MP-MFLOPS2 Benchmark V2.2 09-Aug-2015 21.17
Compiled for 32 bit ARM v7a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         190     190     184     670     672     664
2T         377     378     370    1343    1345    1329
4T         707     755     725    2657    2669    2621
8T         722     736     714    2640    2672    2631

Total Elapsed Time 113.0 seconds

###################### T22 64 Bit ######################

ARM/Intel MP-MFLOPS2 Benchmark V2.2 09-Aug-2015 21.24
Compiled for 64 bit ARM v8a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         705     701     636    1398    1394    1362
2T        1376    1395     942    2794    2797    2757
4T        2063    2602     962    5491    5546    5336
8T        2474    2611     957    5367    5500    5417

Total Elapsed Time 51.6 seconds

#################### A1 Original #######################

A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s

Android MP-MFLOPS2 Benchmark V2.1 04-Feb-2015 11.03

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         502     501     476     575     575     573
2T        1012     975     921    1133    1140    1115
4T        1571    1627     979    2238    2255    2258
8T        1550    1890    1007    2235    2239    2217

Total Elapsed Time 117.4 seconds

################## A1 V1 Android 5.0 ##################

Android MP-MFLOPS2 Benchmark V2.1 05-Nov-2015 11.59

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         607     586     559     556     553     555
2T        1174    1153    1057    1111    1115    1112
4T        1539    2220     992    2181    2207    2179
8T        1736    2097    1011    2184    2194    2178

Total Elapsed Time 119.2 seconds

#################### A1 ARM-Intel ######################

ARM/Intel MP-MFLOPS2 Benchmark V2.1 28-Apr-2015 17.24

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         695     696     661    1061    1061    1055
2T        1335    1382    1058    2088    2086    2102
4T        1832    2635     979    3993    4125    4145
8T        2026    2557    1007    3842    4044    4110

Total Elapsed Time 65.8 seconds

########### A5 ARM-Intel Dual Boot With W2 #############

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

ARM/Intel MP-MFLOPS2 Benchmark V2.2 14-Apr-2016 17.53
Compiled for 32 bit Intel x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         422     450     401     945     964     939
2T         795     849     754    1809    1859    1815
4T        1161    1514    1084    3043    3159    3144
8T        1141    1376    1065    3173    3241    3234

Total Elapsed Time 78.8 seconds

#################### W1 REMIX 32 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel MP-MFLOPS2 Benchmark V2.2 21-Oct-2016 14.27
Compiled for 32 bit Intel x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         386     449     427     922     930     917
2T         579     738     733    1658    1642    1636
4T         894    1011     839    2326    2146    2121
8T         974    1084    1039    2239    2355    2433

Total Elapsed Time 90.6 seconds

#################### W1 REMIX 64 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel MP-MFLOPS2 Benchmark V2.2 14-Aug-2016 22.35
Compiled for 64 bit Intel x86_64

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1365    1369     926    2478    2525    2438
2T        2628    2746    1403    4420    4439    4382
4T        2505    3654    1462    5398    6022    5754
8T        2619    3133    1570    6133    6500    6224

Total Elapsed Time 34.0 seconds

################# W1 Windows 10 32 bit #################

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600

MP-MFLOPS From C/C++ 18.00.21005.1 for x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1467    1388    1215    2537    2529    2486
2T        2773    2825    1659    4937    4958    4740
4T        3334    4845    1512    8453    8813    8694
8T        2818    5068    1575    8338    8896    8627

################# W1 Windows 10 64 bit #################

MP-MFLOPS From C/C++ 18.00.21005.1 for x64

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1470    1471    1252    2936    3060    2996
2T        2775    2982    1653    5593    5860    5680
4T        3610    5290    1520    9401   10488   10326
8T        3132    5178    1562    8957    8365   10433

######## W2 Windows 10 32 bit Dual Boot With A5 ########

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Windows 10, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

MP-MFLOPS From C/C++ 18.00.21005.1 for x86
Start of test Sat May 21 19:10:08 2016

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1415    1346     968    2368    2274    2227
2T        2336    2436     857    4460    4433    4181
4T        2718    4196    1046    7192    7984    7678
8T        3073    3220    1071    6133    8773    6413

######## W2 Windows 10 64 bit Dual Boot With A5 ########

MP-MFLOPS From C/C++ 18.00.21005.1 for x64
Start of test Fri Apr 15 16:41:27 2016

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1560    1584    1034    2952    2965    2877
2T        2590    2757    1160    5369    5862    5333
4T        3852    5094    1090    9407   10478   10331
8T        3480    4973    1133    7748   10417    7742

#################### PC REMIX 32 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel MP-MFLOPS2 Benchmark V2.2 21-Oct-2016 12.47
Compiled for 32 bit Intel x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        3593    3565    3355    5610    5870    5859
2T        6858    7298    6767   10848   11732   11689
4T        7267   14299    7480   18157   23093   20018
8T       10919   13727   11940   22555   22935   22929

Total Elapsed Time 12.1 seconds

#################### PC REMIX 64 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel MP-MFLOPS2 Benchmark V2.2 11-Nov-2016 14.34
Compiled for 64 bit Intel x86_64

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T       13176    8885    6002   21867   22182   21447
2T       21999   22460   11030   42151   43598   45387
4T       24740   31790   15002   82615   86988   87136
8T       24161   41857   27639   78321   89838   85588

Total Elapsed Time 3.4 seconds

################# PC Windows 10 32 bit #################

Top end 2015 PC - Core i7-4820K at 3.9 GHz

MP-MFLOPS From C/C++ 18.00.21005.1 for x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T       11945   10323    6088   21760   21813   21691
2T       18020   20096   11072   34309   43919   45673
4T       25662   42897   13955   55831   89194   90429
8T       22256   49955   14299   80928   90240   88848

################# PC Windows 10 64 bit #################

MP-MFLOPS From C/C++ 18.00.21005.1 for x64

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T       14218   12522    6044   22097   22201   22087
2T       21473   24706   11189   42464   44797   46061
4T       24241   28250   15774   59471   90548   81144
8T       27512   57442   14238   82808   92377   92959

################ Comparison MFLOPS/MHz ################

FPU Add & Multiply using 1, 2, 4 and 8 Threads

                  2 Ops/Word           32 Ops/Word
KB            12.8    128  12800   12.8    128  12800
         Threads

32 Bit Only Android

T7       1T   0.16   0.13   0.10   0.50   0.48   0.48
Cortex   2T   0.30   0.27   0.16   1.00   0.97   0.95
A9       4T   0.57   0.59   0.20   1.98   1.95   1.87

T11      1T   0.41   0.44   0.32   0.90   0.88   0.87
Cortex   2T   0.78   0.90   0.38   1.85   1.81   1.76
A15      4T   0.94   0.92   0.39   1.79   1.82   1.76

T21      1T   0.38   0.36   0.30   0.57   0.57   0.57
Qualcomm 2T   0.77   0.76   0.55   1.14   1.15   1.14
800      4T   1.17   1.50   0.77   2.10   2.21   2.28

A1       1T   0.37   0.37   0.36   0.57   0.57   0.57
Atom     2T   0.72   0.74   0.57   1.12   1.12   1.13
Z3745    4T   0.98   1.42   0.53   2.15   2.22   2.23

A5       1T   0.23   0.24   0.22   0.51   0.52   0.51
Atom     2T   0.43   0.46   0.41   0.98   1.01   0.99
z8300    4T   0.63   0.82   0.59   1.65   1.72   1.71

P37      1T   0.15   0.15   0.14   0.54   0.54   0.53
Cortex   2T   0.30   0.30   0.28   1.08   1.08   1.06
A53      4T   0.59   0.57   0.43   2.14   2.13   2.11
8 core   8T   0.87   0.85   0.48   3.46   3.44   3.35

###########################################################

32 Bit and 64 Bit Android

T22 32b  1T   0.15   0.15   0.14   0.52   0.52   0.51
Cortex   2T   0.29   0.29   0.28   1.03   1.03   1.02
A53      4T   0.54   0.58   0.56   2.04   2.05   2.02

T22 64b  1T   0.37   0.37   0.36   0.57   0.57   0.57
Cortex   2T   0.72   0.74   0.57   1.12   1.12   1.13
A53      4T   0.98   1.42   0.53   2.15   2.22   2.23

REMIX/Android

R1 32b   1T   0.21   0.24   0.23   0.50   0.51   0.50
Atom     2T   0.31   0.40   0.40   0.90   0.89   0.89
Z8300    4T   0.49   0.55   0.46   1.26   1.17   1.15

R1 64b   1T   0.74   0.74   0.50   1.35   1.37   1.33
Atom     2T   1.43   1.49   0.76   2.40   2.41   2.38
Z8300    4T   1.36   1.99   0.79   2.93   3.27   3.13

R2 32b   1T   0.92   0.91   0.86   1.44   1.51   1.50
Core i7  2T   1.76   1.87   1.74   2.78   3.01   3.00
4820K    4T   1.86   3.67   1.92   4.66   5.92   5.13
8HT      8T   2.80   3.52   3.06   5.78   5.88   5.88

R2 64b   1T   3.38   2.28   1.54   5.61   5.69   5.50
Core i7  2T   5.64   5.76   2.83  10.81  11.18  11.64
4820K    4T   6.34   8.15   3.85  21.18  22.30  22.34
8HT      8T   6.20  10.73   7.09  20.08  23.04  21.95

Windows

W1 32b   1T   0.80   0.75   0.66   1.38   1.37   1.35
Atom     2T   1.51   1.54   0.90   2.68   2.69   2.58
Z8300    4T   1.81   2.63   0.82   4.59   4.79   4.73

W1 64b   1T   0.80   0.80   0.68   1.60   1.66   1.63
Atom     2T   1.51   1.62   0.90   3.04   3.18   3.09
Z8300    4T   1.96   2.88   0.83   5.11   5.70   5.61

W2 32b   1T   0.77   0.73   0.53   1.29   1.24   1.21
Atom     2T   1.27   1.32   0.47   2.42   2.41   2.27
z8300    4T   1.48   2.28   0.57   3.91   4.34   4.17

W2 64b   1T   0.85   0.86   0.56   1.60   1.61   1.56
Atom     2T   1.41   1.50   0.63   2.92   3.19   2.90
z8300    4T   2.09   2.77   0.59   5.11   5.69   5.61

PC 32b   1T   3.06   2.65   1.56   5.58   5.59   5.56
Core i7  2T   4.62   5.15   2.84   8.80  11.26  11.71
4820K    4T   6.58  11.00   3.58  14.32  22.87  23.19
8HT      8T   5.71  12.81   3.67  20.75  23.14  22.78

PC 64b   1T   3.65   3.21   1.55   5.67   5.69   5.66
Core i7  2T   5.51   6.33   2.87  10.89  11.49  11.81
4820K    4T   6.22   7.24   4.04  15.25  23.22  20.81
8HT      8T   7.05  14.73   3.65  21.23  23.69  23.84


NEON-MFLOPS-MP carries out the same calculations as the MP-MFLOPS benchmark above, but with NEON intrinsic functions used for all calculations. For further results see here. The effect of using these functions, instead of leaving vectorisation to the compiler, is that 32 bit performance on ARM based systems was similar between the original and new benchmarks.

T22 NEON 64 bit compilation produced a small performance gain over 32 bit results, at 2 operations per word, but near double speed at 32 operations, the latter benefiting from availability of sufficient registers for all the variables.

On the Intel Atom based tablet A1, via the ARM to Intel conversion layer, performance was similar via Android 4 and 5, but the native code version was more than twice as fast at 32 operations per word.

MFLOPS/MHz Comparisons are also provided, including examples of maximum speeds from the non-NEON version, demonstrating NEON gains of more than three times in some cases. A result submitted for P33, with an ARM Cortex-A57, produced the best single core performance (at November 2015) of 3.47 results per cycle at 64 bits, followed by the Cortex-A53 at 2.13. This is still disappointing compared with Intel desktop processors, from the Core 2 onwards, at 6 per clock cycle out of a maximum of 8, with SSE SIMD code (see Linux results).

Intel REMIX/Android - For some reason, this native ARM/Intel and 64 bit/32 bit version failed to run. In this case, the compiler probably failed to translate NEON intrinsic functions into appropriate Intel instructions. The original benchmark had pure ARM code, translated by the Houdini interpreter, and that ran successfully, with results included below. This demonstrated up to 4.11 MFLOPS/MHz using a single core.

Following the performance details are the numeric results of calculations from the fixed parameters used in the new version, for both ARM and Intel. It seems that Tablet T11 has an intermittent fault, as it occasionally fails to calculate a correct answer or causes the Tablet to crash and reboot. Now, this also appears to happen using the older version. The benchmark appeared to run successfully with an Energy Saving On setting, where performance was much slower and CPU MHz was measured as 1000 MHz instead of 1700 (see results below).
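The kind of check that the logged numeric results allow can be sketched as below. This is not the benchmark's own checking code: the function name and data layout are assumptions for illustration. The values are the V2.1 results listed after the performance details (results x 100000), including the T11 row where 12345 marks an error.

```python
ERROR_MARKER = 12345   # value logged in place of a result when an error is found


def find_inconsistencies(results):
    """Compare every row against the 1T row, column by column.

    results maps a row label to its six logged values; returns a list of
    (label, column index, value) for each value that differs or is flagged.
    """
    reference = results["1T"]
    problems = []
    for label, values in results.items():
        for column, (got, expected) in enumerate(zip(values, reference)):
            if got == ERROR_MARKER or got != expected:
                problems.append((label, column, got))
    return problems


good_row = [44934, 86735, 99850, 36770, 79897, 99759]
logged = {
    "1T": good_row,
    "2T": list(good_row),
    "4T": list(good_row),
    "8T": list(good_row),
    "T11": [44934, 12345, 99850, 36770, 79897, 99759],
}

print(find_inconsistencies(logged))
```

Only the T11 row is reported, at the second column (2 operations per word, 128 KB).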

##################### T7 Original ######################

T7, ARM Cortex-A9 1200 MHz, Android 4.1.2,

Android NEON-MFLOPS-MP Benchmark V1.0 20-Dec-2012 16.57

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         532     402     124    1135    1044     960
2T        1255     798     213    2041    1987    1916
4T        2441    1553     229    4185    4034    3450
8T        1922    2403     226    3774    3996    3346

Total Elapsed Time 4.5 seconds

#################### T7 ARM-Intel #####################

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-May-2015 12.24

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         657     407     132    1077    1074    1053
2T        1265     817     222    2147    2150    2078
4T        2024    1695     234    4214    4276    3555
8T        2435    2495     234    4196    4100    3523

Total Elapsed Time 39.0 seconds

#################### T11 Original #####################

T11 Samsung EXYNOS 5250 1.7 GHz Cortex-A15, Android 4.2.2
Dual Core

Android NEON-MFLOPS-MP Benchmark V1.1 13-Sep-2013 13.44

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1847    1415     597    3772    4096    3545
2T        3649    3309     664    8065    7966    7505
4T        3670    3922     658    7753    8148    7490
8T        5664    5570     681    8092    8355    7672

Total Elapsed Time 13.0 seconds

#################### T11 ARM-Intel ####################

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-May-2015 12.07

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1965    1630     582    3792    4077    3521
2T        3789    2690     663    8497    8133    7297
4T        5714    4883     654    8364    8192    7554
8T        5414    6316     673    7976    8437    6635

Total Elapsed Time 13.0 seconds

######## T11 ARM-Intel Power Saving On 1.0 GHz ########

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-Nov-2015 16.55

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1935    1290     645    2516    2397    2339
2T        3664    2644     684    4945    4780    4657
4T        3436    3337     690    4911    4931    4674
8T        3133    3543     689    4818    4959    4651

Total Elapsed Time 19.2 seconds

#################### T21 Original #####################

T21 Qualcomm Snapdragon 800 2150 MHz, Android 4.4.4
Dual Channel 32 Bit LPDDR3-1866 RAM 14.9 GB/s

Android NEON-MFLOPS2-MP Benchmark V2.1 25-Jul-2015 18.44

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        2757    2576     771    2808    2825    2800
2T        5662    5525    1516    5631    5664    5570
4T        6550    7846    1945   11167   11281   10939
8T       10273   10928    1981   10851   11211   11350

Total Elapsed Time 40.0 seconds

#################### T21 ARM-Intel ####################

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 28-Jun-2015 16.32

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        3049    2857     622    2923    2874    2098
2T        5508    4887    1009    5477    5736    4349
4T        5643    5282    1410   11244   11601    8564
8T        9294   11156    1681   11288   11605    8946

Total Elapsed Time 14.0 seconds

###################### P37 32 Bit ######################

P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
Single Channel RAM, LPDDR3 933 MHz, 7.5 GB/second
8 x 32 KB L1 cache, 512 KB shared L2 cache

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 14-Nov-2016 12.18
Compiled for 32 bit ARM v7a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         740     660     399    1739    1729    1691
2T        1334    1228     566    3449    3416    3328
4T        2188    2139     675    6671    6674    6463
8T        2489    3261     722   10379   10466    9768

Total Elapsed Time 22.1 seconds

Android 7.0

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 11-May-2017 10.44
Compiled for 32 bit ARM v7a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         716     686     432    1740    1740    1703
2T        1367    1255     614    3457    3427    3358
4T        2389    2131     726    6814    6682    6644
8T        2914    2776     744   10082    9994    9712

Total Elapsed Time 21.8 seconds

###################### T22 32 Bit ######################

T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 13-Aug-2015 16.35
Compiled for 32 bit ARM v7a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         619     613     575    1444    1446    1426
2T        1174    1206     889    2894    2902    2839
4T        1585    1616     901    5679    5726    5596
8T        2075    2130     944    5400    5585    5519

Total Elapsed Time 25.8 seconds

###################### T22 64 Bit ######################

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 13-Aug-2015 16.38
Compiled for 64 bit ARM v8a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         726     745     647    2766    2774    2639
2T        1397    1402     903    5523    5552    5371
4T        1871    1930     898   10780   10479   10439
8T        2496    2876    1011    9736   10679    9900

Total Elapsed Time 15.1 seconds

##################### P33 64 Bit #####################

P33 Quad-core 2 GHz Qualcomm Snapdragon 810, Android 5.0.2
4 x Cortex-A57 and 4 x Cortex-A53

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 16-Sep-2015 17.59
Compiled for 64 bit ARM v8a

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        2811    3126    1089    6943    6589    6342
2T        2488    4114    1541   12084   10559    8809
4T        4759    5480    2038   16516   14826   11960
8T        4840    8985    2452   22082   23563   12461

Total Elapsed Time 7.6 seconds

#################### A1 Original #######################

A1 Quad Core 1.86 GHz Intel Atom Z3745, Android 4.4
Dual Channel LPDDR3-1066 Bandwidth 17.1 GB/s

Android NEON-MFLOPS2-MP Benchmark V2.1 07-Feb-2015 18.38

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1796    1520    1025    1231    1228    1227
2T        3354    2959    1047    2427    2445    2445
4T        4627    5508     978    4690    4791    4733
8T        3861    6307    1030    4611    4869    4742

Total Elapsed Time 88.3 seconds

################## A1 V1 Android 5.0 ##################

Android NEON-MFLOPS2-MP Benchmark V2.1 05-Nov-2015 12.09

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1969    1913     832    1230    1245    1225
2T        3537    3632    1046    2482    2487    2445
4T        3388    6497     982    4546    4847    4819
8T        4197    6863    1026    4640    4899    4828

Total Elapsed Time 87.7 seconds

#################### A1 ARM-Intel ######################

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1 13-May-2015 12.17

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        2151    1962    1064    2619    2694    2650
2T        4421    3849    1048    5296    5463    5343
4T        5886    6652     982    9592   10735   10362
8T        3744    7284    1018    9085   10791    9493

Total Elapsed Time 13.8 seconds

################### W1 REMIX Original ##################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

Android NEON-MFLOPS-MP Benchmark V1.1 11-Nov-2016 21.39

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T         392     414     388    1964    1954    2084
2T        1790    2301    1133    3237    3775    3774
4T        2130    2386    1068    4165    3541    4188
8T        2110    2047    1026    4438    4091    3631

#################### W1 REMIX 32 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 21-Oct-2016 14.40
Compiled for 32 bit Intel x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1322    1342     965    2377    2517    2354
2T        2261    2627    1155    4140    4316    4329
4T        2187    2656    1361    5494    6082    5693
8T        1978    2673    1613    5888    6050    6119

Total Elapsed Time 17.7 seconds

#################### W1 REMIX 64 Bit ###################

R1 Intel Atom Z8300 quad core 1.84 GHz
Android 6.0.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB Shared L2

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 11-Nov-2016 21.40
Compiled for 64 bit Intel x86_64

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS

Can't run - Not an ARMv7 CPU

Total Elapsed Time 0.0 seconds

#################### A5 ARM Intel ######################

Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
Android 5.1, 4 GB DDR 3 1600
4 x 24 KB L1, 2 x 1 MB L2

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 14-Apr-2016 17.57
Compiled for 32 bit Intel x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        1501    1551    1030    2520    2485    2301
2T        2300    2957    1161    4699    4999    4632
4T        3106    5126    1097    7929    8173    8015
8T        2692    4623    1108    7830    8432    7989

Total Elapsed Time 15.7 seconds

################### PC REMIX Original ##################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

Android NEON-MFLOPS-MP Benchmark V1.1 11-Nov-2016 14.44

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS
1T        7381    6891    4206   16044   14885   15134
2T        8892    8294    6078   25814   15291   15897
4T       20783   20566   12919   55052   33458   58857
8T       14049   16003   13811   49462   46915   53373

#################### PC REMIX 32 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 21-Oct-2016 12.53
Compiled for 32 bit Intel x86

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS

Can't run - CPU doesn't support NEON

Total Elapsed Time 0.0 seconds

#################### PC REMIX 64 Bit ###################

R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
4 x 32 KB L1, 4 x 256 KB L2, 10 MB L3
800 MHz RAM, 4 channels, 51.2 GB/s, Android 6.0.1,

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 11-Nov-2016 14.45
Compiled for 64 bit Intel x86_64

FPU Add & Multiply using 1, 2, 4 and 8 Threads

            2 Ops/Word              32 Ops/Word
KB        12.8     128   12800    12.8     128   12800
MFLOPS

Can't run - Not an ARMv7 CPU

Total Elapsed Time 0.0 seconds

################ Comparison MFLOPS/MHz ################

                  2 Ops/Word           32 Ops/Word      Not NEON
KB            12.8    128  12800   12.8    128  12800     12.8
         Threads

32 Bit Only Android

T7       1T   0.55   0.34   0.11   0.90   0.90   0.88     0.50
Cortex   2T   1.05   0.68   0.19   1.79   1.79   1.73     1.00
A9       4T   1.69   1.41   0.20   3.51   3.56   2.96     1.98

T11      1T   1.16   0.96   0.34   2.23   2.40   2.07     0.90
Cortex   2T   2.23   1.58   0.39   5.00   4.78   4.29     1.85
A15      4T   3.36   2.87   0.38   4.92   4.82   4.44     1.79

T21      1T   1.42   1.33   0.29   1.36   1.34   0.98     0.57
Qualcomm 2T   2.56   2.27   0.47   2.55   2.67   2.02     1.14
800      4T   2.62   2.46   0.66   5.23   5.40   3.98     2.10

A1       1T   1.16   1.05   0.57   1.41   1.45   1.42     0.57
Atom     2T   2.38   2.07   0.56   2.85   2.94   2.87     1.12
Z3745    4T   3.16   3.58   0.53   5.16   5.77   5.57     2.15

A5       1T   0.82   0.84   0.56   1.37   1.35   1.25     0.51
Atom     2T   1.25   1.61   0.63   2.55   2.72   2.52     0.98
z8300    4T   1.69   2.79   0.60   4.31   4.44   4.36     1.65

P37      1T   0.49   0.44   0.27   1.16   1.15   1.13     0.54
Cortex   2T   0.89   0.82   0.38   2.30   2.28   2.22     1.08
A53      4T   1.46   1.43   0.45   4.45   4.45   4.31     2.14
8 core   8T   1.66   2.17   0.48   6.92   6.98   6.51     3.46

###########################################################

32 Bit and 64 Bit Android

T22 32b  1T   0.48   0.47   0.44   1.11   1.11   1.10     0.52
Cortex   2T   0.90   0.93   0.68   2.23   2.23   2.18     1.03
A53      4T   1.22   1.24   0.69   4.37   4.40   4.30     2.04

T22 64b  1T   0.56   0.57   0.50   2.13   2.13   2.03     0.57
Cortex   2T   1.07   1.08   0.69   4.25   4.27   4.13     1.12
A53      4T   1.44   1.48   0.69   8.29   8.06   8.03     2.15

P33      1T   1.41   1.56   0.54   3.47   3.29   3.17      N/A
Cortex   2T   1.24   2.06   0.77   6.04   5.28   4.40
A57 64b  4T   2.38   2.74   1.02   8.26   7.41   5.98

REMIX/Android

R1 32b   1T   0.72   0.73   0.52   1.29   1.37   1.28     0.50
Atom     2T   1.23   1.43   0.63   2.25   2.35   2.35     0.90
Z8300    4T   1.19   1.44   0.74   2.99   3.31   3.09     1.26

R1 64b   1T   Can't run - Not an ARMv7 CPU
Atom     2T
Z8300    4T

R2 32b   1T   Can't run - Not an ARMv7 CPU
Core i7  2T
4820K    4T
8HT      8T

R2 64b   1T   Can't run - Not an ARMv7 CPU
Core i7  2T
4820K    4T
8HT      8T

Original Houdini Interpreted                             Windows

R2 32b   1T   1.89   1.77   1.08   4.11   3.82   3.88     5.56
Core i7  2T   2.28   2.13   1.56   6.62   3.92   4.08    11.71
4820K    4T   5.33   5.27   3.31  14.12   8.58  15.09    23.19

Windows       Not applicable

##################### New Results #####################

Results x 100000, 12345 indicates ERRORS

ARM/Intel NEON-MFLOPS2-MP Benchmark V2.1

1T       44934   86735   99850   36770   79897   99759
2T       44934   86735   99850   36770   79897   99759
4T       44934   86735   99850   36770   79897   99759
8T       44934   86735   99850   36770   79897   99759

T11      44934   12345   99850   36770   79897   99759

Android NEON-MFLOPS-MP Benchmark V1.1

1T       86735   98519   99984   79897   97638   99975
2T       86735   98519   99984   79897   97638   99975
4T       86735   98519   99984   79897   97638   99975
8T       86735   98519   99984   79897   97638   99975

Android NEON-MFLOPS2-MP Benchmark V2.1

1T       40015   66980   99522   35216   54898   99234
2T       40015   66980   99522   35216   54898   99234
4T       40015   66980   99522   35216   54898   99234
8T       40015   66980   99522   35216   54898   99234


The benchmark does not rely on complex visual scenes or mathematical functions, the objective being to generate moderate to excessive loading via multiple simple objects. It uses all Java code, with OpenGL ES GL10 statements, to measure graphics performance in Frames Per Second (FPS). The four tests draw a background of 50 cubes, first as wireframes and then colour shaded. The third test views the cubes in and out of a tunnel with slotted sides and roof, also containing rotating plates. The last test adds textures to the cubes and plates. The 50 cubes are redrawn 15, 30 and 60 times, with randomised positions, colours and rotational settings. With 6 x 2 triangles per cube, minimum triangles per frame for the three sets of tests are 9000, 18000 and 36000.
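The minimum triangle counts follow directly from the figures above, as this short sketch shows (the constant and function names are assumptions for illustration):

```python
CUBES = 50               # cubes in the background
FACES_PER_CUBE = 6
TRIANGLES_PER_FACE = 2   # each square face is drawn as two triangles


def min_triangles_per_frame(redraws):
    """Triangles from the cubes alone; tunnel and plates add more (the '+')."""
    return CUBES * FACES_PER_CUBE * TRIANGLES_PER_FACE * redraws


print([min_triangles_per_frame(n) for n in (15, 30, 60)])
```

This reproduces the 9000+, 18000+ and 36000+ row labels used in the results.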

An example of the last scene is on the right. The tunnel is provided to show 3D effects, the plates rotating in fixed positions. The numerous cubes are in the distant background, the tunnel slots showing that they are still there, with size varying according to proximity. The cubes appear more as jumping objects, with changing colours and position.

Android 5 switched to the ART virtual machine for Java, instead of Dalvik. First results indicate severe degradation in performance with this benchmark.

Further details and results can be found here. This includes information on Vertical Synchronisation (VSYNC), which limits Frames Per Second (FPS) to 60 and can lead to heavier loading reducing speed in 50% steps, as is apparent in the results below. Links to my Windows and Linux OpenGL benchmarks are also provided.
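The stepped speed reductions can be illustrated with a simplified model. The Java sketch below assumes a fixed 60 Hz refresh and that a frame missing one refresh waits for the next; this is an assumption for illustration, not the benchmark's code, and the names are hypothetical.

```java
// Simplified VSYNC model (an assumption, not measured behaviour):
// a frame that misses a refresh is shown at the following one.
final class VsyncSteps {
    // Displayed FPS for a constant per-frame render time under VSYNC.
    static double displayedFps(double renderMillis, double refreshHz) {
        double refreshMillis = 1000.0 / refreshHz;
        // Whole refresh intervals consumed by each frame, at least one.
        long intervals = Math.max(1L, (long) Math.ceil(renderMillis / refreshMillis));
        return refreshHz / intervals;
    }

    public static void main(String[] args) {
        for (double ms : new double[] {10.0, 17.0, 34.0}) {
            System.out.println(ms + " ms per frame -> " + displayedFps(ms, 60.0) + " FPS");
        }
    }
}
```

In this model, a frame taking slightly more than 16.7 ms halves the rate from 60 to 30 FPS. Measured averages can fall between steps when frame times vary during a test.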

On tablets A1 and T7, Android was upgraded to version 5.0, leading to a reduction in measured speeds of up to 50%, possibly suggesting that VSYNC had changed to 30 FPS. The graphics in A5 appear to be slightly faster than A1, but maximum speed appears to be similarly restricted to 30 FPS.

Except for tablet T15, none of the results are particularly good at the heavier loading. T15 results were also produced via Android 5, with several measurements at near 60 FPS, suggesting that speed reductions on the other tablets are not solely dependent on Android 5.

P37, with Adreno graphics and Android 6, was also slower than T21, with an inferior Adreno GPU and Android 4. So was W1/R1, the Intel Atom based REMIX/Android 6 tablet. The powerful Intel Core i7 REMIX speeds were some of the fastest, but disappointing for high end GeForce graphics (all effects of the change to Java via ART?).

########################## T7 ##########################

 T7 Nexus 7 Quad 1200 MHz Cortex-A9, Android 4.1.2
 nVidia ULP GeForce Graphics 12 core, 416 MHz

 Android Java OpenGL Benchmark 06-Mar-2013 21.51

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     42.18    43.57    33.38    23.54
   18000+     23.68    23.47    19.91    13.38
   36000+     12.05    11.95    11.00     7.10

 Screen Pixels 1280 Wide 736 High
 Total Elapsed Time 121.0 seconds

#################### T7 Android 5.0 ####################

 Android Java OpenGL Benchmark 12-Oct-2015 16.06

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     22.61    23.23    17.71    13.46
   18000+     12.03    12.11    10.36     7.57
   36000+      6.14     6.01     5.64     4.03

 Screen Pixels 1280 Wide 736 High
 Total Elapsed Time 121.5 seconds

########################## T11 #########################

 T11 Samsung EXYNOS 5250 Dual 1.7 GHz Cortex-A15, Android 4.2.2
 Mali-T604 Quad Core GPU

 Android Java OpenGL Benchmark 09-Aug-2013 09.42

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     39.13    41.52    32.19    27.25
   18000+     22.03    20.73    19.69    16.30
   36000+     12.24    12.23    10.75     8.68

 Screen Pixels 1920 Wide 1032 High
 Total Elapsed Time 120.8 seconds

########################## T15 #########################

 T15 HTC Nexus 9, dual core Denver CPU 2400 MHz, Android 5.0.1
 Kepler DX1 Graphics

 Android Java OpenGL Benchmark 28-Jan-2015 22.38

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     59.79    59.84    59.84    57.79
   18000+     59.97    59.26    52.64    32.74
   36000+     31.33    30.95    29.02    17.59

 Screen Pixels 2048 Wide 1440 High
 Total Elapsed Time 121.0 seconds

########################## T21 #########################

 T21 Quad Core 2.2 GHz Snapdragon 800, Android 4.4.3
 GPU Qualcomm Adreno 330, 578 MHz

 Android Java OpenGL Benchmark 27-Jul-2015 16.50

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     35.05    35.45    25.60    21.58
   18000+     18.04    18.05    15.32    12.73
   36000+      9.28     9.33     8.47     6.91

 Screen Pixels 1200 Wide 1803 High
 Total Elapsed Time 120.8 seconds

########################## P37 #########################

 P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
 GPU Adreno 405 550 MHz

 Android Java OpenGL Benchmark 17-Oct-2016 10.01

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     27.46    27.68    21.16    17.96
   18000+     14.56    14.60    12.47    10.36
   36000+      7.17     7.21     6.56     5.37

 Screen Pixels 1776 Wide 1080 High
 Total Elapsed Time 121.0 seconds

 Android 7.0

 Android Java OpenGL Benchmark 17-Mar-2017 10.39

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     18.49    18.74    14.45    11.73
   18000+      9.70     9.75     8.40     6.31
   36000+      4.78     4.78     4.45     3.48

 Screen Pixels 1776 Wide 1080 High
 Total Elapsed Time 121.3 seconds

########################## T22 #########################

 T22 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
 Android 5.0, GPU Mali T720 MP2

 Android Java OpenGL Benchmark 26-Aug-2015 16.24

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     22.55    22.11    16.67    14.27
   18000+     11.55    11.60     9.98     8.27
   36000+      5.92     5.98     5.48     4.48

 Screen Pixels 800 Wide 1216 High
 Total Elapsed Time 120.9 seconds

########################## A1 ##########################

 A1 Asus MemoPad 7, Quad Core 1.86 GHz Intel Atom Z3745
 Intel HD Graphics, Android 4.4.2

 Android Java OpenGL Benchmark 21-Dec-2014 16.30

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     37.95    37.64    29.86    23.63
   18000+     19.44    19.70    17.26    13.26
   36000+      9.99     9.93     9.35     7.17

 Screen Pixels 1280 Wide 736 High
 Total Elapsed Time 120.6 seconds

#################### A1 Android 5.0 ####################

 Android Java OpenGL Benchmark 10-Oct-2015 13.44

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     25.87    25.89    20.27    16.29
   18000+     13.43    13.56    11.72     9.38
   36000+      6.92     6.73     6.32     4.98

 Screen Pixels 800 Wide 1216 High
 Total Elapsed Time 120.9 seconds

#################### A5 Android 5.1 ####################

 Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
 Intel HD Graphics, Android 5.1

 Android Java OpenGL Benchmark 21-May-2016 13.00

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     29.77    30.17    22.58    18.54
   18000+     16.09    16.03    13.70    10.78
   36000+      8.31     8.27     7.79     5.76

 Screen Pixels 2048 Wide 1440 High
 Total Elapsed Time 121.0 seconds

####################### W1 REMIX ######################

 R1 Intel Atom Z8300 quad core 1.84 GHz
 Android 6.0.1, HD Graphics

 Android Java OpenGL Benchmark 14-Aug-2016 22.40

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     19.87    20.29    15.75    12.98
   18000+     11.57    11.68     9.90     7.71
   36000+      6.12     6.14     5.64     4.22

 Screen Pixels 1920 Wide 996 High
 Total Elapsed Time 121.4 seconds

####################### PC REMIX ######################

 R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
 Android 6.0.1, GeForce GTX 650, 64-Bit Windows 10

 Android Java OpenGL Benchmark 14-Aug-2016 14.23

           --------- Frames Per Second --------
 Triangles WireFrame   Shaded  Shaded+ Textured

    9000+     59.96    59.95    60.00    56.78
   18000+     58.21    58.49    53.97    36.68
   36000+     33.45    33.47    31.29    20.46

 Screen Pixels 1920 Wide 996 High
 Total Elapsed Time 120.3 seconds

OpenGL Drawing Benchmark - JavaDraw.apk

This all Java benchmark uses small to rather excessive numbers of simple objects to measure drawing performance, again via Frames Per Second (FPS). Five tests draw on a background of continuously changing colour shades. The image on the right is after four tests.

Test 1 loads a PNG file twice, the bitmaps moving for each frame, one right to left and back, the other circling.

Plus Test 2 generates 2 SweepGradient multi-coloured circles moving towards the centre and back.

Plus Test 3 draws 200 random small circles in the middle of the screen (mainly hidden on a small screen).

Plus Test 4 draws 80 lines from the centre of each side to the opposite side (320 lines in total), again with changing colours.

Plus Test 5 draws the same small random circles as Test 3, but 4000 of them, filling the screen.

Each test runs for approximately 10 seconds.
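The reported figures are simple rates. The Java sketch below shows FPS as frames divided by elapsed seconds; it also shows a pixel rate of width x height x FPS, which appears consistent with the "Maximum ... Million Pixels Per Second" lines in the T7 and T11 results, although that formula is my inference rather than something the source states. Names are hypothetical.

```java
// Hypothetical helpers for the rates quoted in the drawing results.
final class DrawRates {
    // Frames Per Second over one timed test.
    static double fps(int frames, double elapsedSeconds) {
        return frames / elapsedSeconds;
    }

    // Assumed definition: screen area times frame rate, in millions of pixels per second.
    static double millionPixelsPerSecond(int width, int height, double fps) {
        return (double) width * height * fps / 1.0e6;
    }

    public static void main(String[] args) {
        // T7 first test: 1280 x 736 screen at 20.38 FPS
        System.out.printf("%.1f Million Pixels Per Second%n",
                millionPixelsPerSecond(1280, 736, 20.38));
    }
}
```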

Further details and results can be found here, including links to an offline version that runs on PCs via Windows and Linux.

As with Java OpenGL, speeds are limited to 60 FPS by imposed VSYNC. In general, there was not a great deal of difference in performance on the initial systems shown here. In the cases of Android upgrades to version 5, performance was virtually identical on tablet A1, but T7 was much faster on the tests least dependent on CPU speed.

March 2016 - Results from W1, the Windows 10 based tablet, indicate that VSYNC is not imposed, producing the fastest speeds at this time. Windows/Android dual boot tablet W2/A5 confirms the faster Windows performance (via Java). However, the Android version runs at full screen, as opposed to a fixed 1280 x 720 with the Windows variety. The latter was recompiled to use full screen, producing much slower speeds (see below). Windows results from the PC, with a reasonably powerful graphics card, are also shown, to reflect the huge difference in performance.

A5 and W2 Dual Boot Tablet - At Screen pixels 2048 x 1440, the Windows speed was slower than via Android, on the first test, but faster on others. A second test on W2, at 1280 x 720, demonstrates faster speed using a smaller window.

REMIX Android vs Windows - Unlike Android, Windows based tests were not limited to 60 FPS by VSYNC, and the PC results in particular indicated superior performance. As with the OpenGL benchmark, P37 was relatively slow (More ART/Java issues?).

########################## T7 ##########################

 T7 Nexus 7 Quad 1200 MHz Cortex-A9, Android 4.2.1
 nVidia ULP GeForce Graphics 12 core, 416 MHz

 Android Java Drawing Benchmark 12-Apr-2013 19.50

 Test                              Frames      FPS

 Display PNG Bitmap Twice             204    20.38
 Plus 2 SweepGradient Circles         165    16.48
 Plus 200 Random Small Circles        145    14.50
 Plus 320 Long Lines                  113    11.30
 Plus 4000 Random Small Circles        39     3.81

 Screen pixels 1280 Wide 736 High
 Total Elapsed Time 50.4 seconds
 Maximum 19.2 Million Pixels Per Second

#################### T7 Android 5.0 ####################

 Android Java Drawing Benchmark 01-Oct-2015 12.24

 Test                              Frames      FPS

 Display PNG Bitmap Twice             487    48.70
 Plus 2 SweepGradient Circles         297    29.66
 Plus 200 Random Small Circles        231    23.02
 Plus 320 Long Lines                  149    14.85
 Plus 4000 Random Small Circles        39     3.90

 Screen pixels 1280 Wide 736 High
 Total Elapsed Time 50.1 seconds

########################## T11 #########################

 T11 Samsung EXYNOS 5250 2.0 GHz Cortex-A15, Android 4.2.2
 Mali-T604 quad core GPU

 Android Java Drawing Benchmark 09-Aug-2013 09.39

 Test                              Frames      FPS

 Display PNG Bitmap Twice             558    55.74
 Plus 2 SweepGradient Circles         277    27.66
 Plus 200 Random Small Circles        244    24.36
 Plus 320 Long Lines                  169    16.84
 Plus 4000 Random Small Circles        68     6.72

 Screen pixels 1920 Wide 1032 High
 Total Elapsed Time 50.4 seconds
 Maximum 110 Million Pixels Per Second

########################## T21 #########################

 T21 2.2 GHz Quad Core Snapdragon 800, Android 4.4.3
 GPU Qualcomm Adreno 330, 578 MHz

 Android Java Drawing Benchmark 27-Jul-2015 16.47

 Test                              Frames      FPS

 Display PNG Bitmap Twice             533    53.24
 Plus 2 SweepGradient Circles         248    24.73
 Plus 200 Random Small Circles        218    21.72
 Plus 320 Long Lines                  158    15.75
 Plus 4000 Random Small Circles        57     5.61

 Screen pixels 1200 Wide 1803 High
 Total Elapsed Time 50.3 seconds

########################## T22 #########################

 T22 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
 Android 5.0, GPU Mali T720 MP2

 Android Java Drawing Benchmark 26-Aug-2015 16.21

 Test                              Frames      FPS

 Display PNG Bitmap Twice             558    55.72
 Plus 2 SweepGradient Circles         368    36.70
 Plus 200 Random Small Circles        286    28.52
 Plus 320 Long Lines                  178    17.76
 Plus 4000 Random Small Circles        50     4.99

 Screen pixels 800 Wide 1216 High
 Total Elapsed Time 51.5 seconds

########################## P37 #########################

 P37, 8 Core ARM Cortex-A53 1500/1200 MHz, Android 6.0.1
 GPU Adreno 405 550 MHz

 Android Java Drawing Benchmark 17-Oct-2016 09.59

 Test                              Frames      FPS

 Display PNG Bitmap Twice             246    24.53
 Plus 2 SweepGradient Circles         158    15.77
 Plus 200 Random Small Circles        130    12.98
 Plus 320 Long Lines                   98     9.71
 Plus 4000 Random Small Circles        27     2.66

 Screen pixels 1776 Wide 1080 High
 Total Elapsed Time 50.4 seconds

 Android 7.0

 Android Java Drawing Benchmark 17-Mar-2017 10.32

 Test                              Frames      FPS

 Display PNG Bitmap Twice             236    23.57
 Plus 2 SweepGradient Circles         149    14.85
 Plus 200 Random Small Circles        132    13.19
 Plus 320 Long Lines                  103    10.24
 Plus 4000 Random Small Circles        41     4.06

 Screen pixels 1776 Wide 1080 High
 Total Elapsed Time 50.3 seconds

########################## A1 ##########################

 A1 Asus MemoPad 7, Quad Core 1.86 GHz Intel Atom Z3745
 Intel HD Graphics, Android 4.4.2

 Android Java Drawing Benchmark 21-Dec-2014 16.35

 Test                              Frames      FPS

 Display PNG Bitmap Twice             599    59.79
 Plus 2 SweepGradient Circles         486    48.55
 Plus 200 Random Small Circles        383    38.25
 Plus 320 Long Lines                  219    21.88
 Plus 4000 Random Small Circles        64     6.38

 Screen pixels 1280 Wide 736 High
 Total Elapsed Time 50.1 seconds

#################### A1 Android 5.0 ####################

 Android Java Drawing Benchmark 10-Oct-2015 13.42

 Test                              Frames      FPS

 Display PNG Bitmap Twice             595    59.40
 Plus 2 SweepGradient Circles         458    45.79
 Plus 200 Random Small Circles        383    38.27
 Plus 320 Long Lines                  199    19.81
 Plus 4000 Random Small Circles        56     5.60

 Screen pixels 800 Wide 1216 High
 Total Elapsed Time 50.1 seconds

#################### A5 Android 5.1 ####################

 Same Tablet as W2
 Teclast X98 Plus, Intel Atom Z8300 1.44 GHz, Turbo 1.84
 Intel HD Graphics, Android 5.1

 Android Java Drawing Benchmark 02-Mar-2016 17.37

 Test                              Frames      FPS

 Display PNG Bitmap Twice             447    44.62
 Plus 2 SweepGradient Circles         212    21.12
 Plus 200 Random Small Circles        171    17.02
 Plus 320 Long Lines                   93     9.25
 Plus 4000 Random Small Circles        32     3.13

 Screen pixels 2048 Wide 1440 High
 Total Elapsed Time 50.4 seconds

####################### W1 REMIX #######################

 R1 Intel Atom Z8300 quad core 1.84 GHz
 Android 6.0.1, HD Graphics

 Android Java Drawing Benchmark 14-Aug-2016 22.38

 Test                              Frames      FPS

 Display PNG Bitmap Twice             594    59.39
 Plus 2 SweepGradient Circles         375    37.47
 Plus 200 Random Small Circles        315    31.43
 Plus 320 Long Lines                  210    20.96
 Plus 4000 Random Small Circles        66     6.57

 Screen pixels 1920 Wide 1032 High
 Total Elapsed Time 50.1 seconds

############## W1 Windows 10 1280 x 720 ##############

 Intel Atom Z8300 quad core 1.44 GHz Turbo 1.84
 Windows 10, Intel HD Graphics Gen8

 Java Drawing Benchmark, Dec 27 2015, 21:51:45
 Produced by javac 1.7.0_2

 Test                              Frames      FPS

 Display PNG Bitmap Twice Pass 1      872    87.13
 Display PNG Bitmap Twice Pass 2      991    98.95
 Plus 2 SweepGradient Circles         961    95.98
 Plus 200 Random Small Circles        782    78.08
 Plus 320 Long Lines                  605    60.44
 Plus 4000 Random Small Circles       164    16.32

 Total Elapsed Time 60.1 seconds

 Operating System Windows 10, Arch. x86, Version 10.0
 Java Vendor Oracle Corporation, Version 1.8.0_66
 Intel64 Family 6 Model 76 Stepping 3, GenuineIntel, 4 CPUs

############## W2 Windows 10 1280 x 720 ##############

 Same Tablet as A5
 Teclast X98 Plus, Intel Atom Z8300 1.44 GHz, Turbo 1.84
 Windows 10, Intel HD Graphics Gen8

 Java Drawing Benchmark, Mar 2 2016, 21:30:58
 Produced by javac 1.7.0_2

 Test                              Frames      FPS

 Display PNG Bitmap Twice Pass 1      748    74.78
 Display PNG Bitmap Twice Pass 2      833    83.24
 Plus 2 SweepGradient Circles         828    82.78
 Plus 200 Random Small Circles        690    68.99
 Plus 320 Long Lines                  560    55.94
 Plus 4000 Random Small Circles       163    16.30

 Total Elapsed Time 60.0 seconds

 Operating System Windows 10, Arch. x86, Version 10.0
 Java Vendor Oracle Corporation, Version 1.8.0_66
 Intel64 Family 6 Model 76 Stepping 3, GenuineIntel, 4 CPUs

############ W2 Windows 10 2048 x 1440 #############

 Java Drawing Benchmark, Mar 3 2016, 12:22:42
 Produced by javac 1.7.0_2
 2048 x 1440

 Test                              Frames      FPS

 Display PNG Bitmap Twice Pass 1      275    27.42
 Display PNG Bitmap Twice Pass 2      301    30.01
 Plus 2 SweepGradient Circles         296    29.54
 Plus 200 Random Small Circles        286    28.51
 Plus 320 Long Lines                  225    22.45
 Plus 4000 Random Small Circles       118    11.72

 Total Elapsed Time 60.3 seconds

 Operating System Windows 10, Arch. x86, Version 10.0
 Java Vendor Oracle Corporation, Version 1.8.0_66
 Intel64 Family 6 Model 76 Stepping 3, GenuineIntel, 4 CPUs

####################### PC REMIX ######################

 R2 Core i7 4820K quad core + HT at 3900 MHz Turbo
 Android 6.0.1, GeForce GTX 650, 64-Bit Windows 10

 Android Java Drawing Benchmark 14-Aug-2016 14.19

 Test                              Frames      FPS

 Display PNG Bitmap Twice             582    55.49
 Plus 2 SweepGradient Circles         601    60.01
 Plus 200 Random Small Circles        415    41.41
 Plus 320 Long Lines                  303    30.25
 Plus 4000 Random Small Circles        43     4.20

 Screen pixels 396 Wide 674 High
 Total Elapsed Time 50.8 seconds

################ PC REMIX Full Screen #################

 Android Java Drawing Benchmark 14-Aug-2016 14.21

 Test                              Frames      FPS

 Display PNG Bitmap Twice             553    55.21
 Plus 2 SweepGradient Circles         539    53.86
 Plus 200 Random Small Circles        330    32.91
 Plus 320 Long Lines                  212    21.19
 Plus 4000 Random Small Circles        39     3.88

 Screen pixels 1920 Wide 996 High
 Total Elapsed Time 50.2 seconds

########### PC Windows 10 GeForce GTX 650 ###########

 Core i7-4820K at 3.9 GHz

 Java Drawing Benchmark, Mar 7 2016, 10:56:24
 Produced by javac 1.7.0_2
 2048 x 1440

 Test                              Frames      FPS

 Display PNG Bitmap Twice Pass 1     5237   523.39
 Display PNG Bitmap Twice Pass 2     5477   547.04
 Plus 2 SweepGradient Circles        5484   548.07
 Plus 200 Random Small Circles       5144   513.58
 Plus 320 Long Lines                 4736   473.32
 Plus 4000 Random Small Circles       735    73.49

 Total Elapsed Time 60.0 seconds

 Operating System Windows 10, Arch. x86, Version 10.0
 Java Vendor Oracle Corporation, Version 1.8.0_60
 Intel64 Family 6 Model 62 Stepping 4, GenuineIntel, 8 CPUs

CPU MHz Benchmark - CP_MHz2.apk

This program measures CPU MHz samples over 30 seconds, with 300 reports at 100 millisecond intervals (timing functions and overheads increase this time to 120 ms or above). The procedure is: open a benchmark; open the MHz program and run it; switch to the benchmark from recent screens and run it; save the benchmark results; then switch back to the MHz program from recent screens and save its results when finished. Further details and results can be found here and here.
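A sampling loop of this kind might look like the Java sketch below. It assumes the current frequency is read from the standard Linux cpufreq sysfs entry for core 0, which reports kHz; the source does not say how the app obtains MHz, so the path, class and method names are illustrative assumptions, and some Android versions may deny access to this file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.function.LongSupplier;

// Hypothetical sampler, not the CP_MHz2 source.
final class MhzSampler {
    // Assumption: standard Linux cpufreq entry, value in kHz for core 0.
    static final Path CUR_FREQ =
            Paths.get("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq");

    // Converts the kHz text read from sysfs to whole MHz.
    static long khzToMhz(String text) {
        return Long.parseLong(text.trim()) / 1000;
    }

    // Returns the current MHz, or -1 if the file is missing or unreadable.
    static long readMhz() {
        try {
            byte[] raw = Files.readAllBytes(CUR_FREQ);
            return khzToMhz(new String(raw, StandardCharsets.US_ASCII));
        } catch (IOException | RuntimeException e) {
            return -1;
        }
    }

    // Logs "seconds MHz" lines; sleep overheads stretch the real interval,
    // as noted in the text.
    static String sample(LongSupplier source, int samples, long intervalMillis) {
        StringBuilder log = new StringBuilder();
        long start = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            double seconds = (System.nanoTime() - start) / 1.0e9;
            log.append(String.format(Locale.ROOT, "%6.2f %5d\n", seconds, source.getAsLong()));
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.print(sample(MhzSampler::readMhz, 10, 100));
    }
}
```

Passing the frequency source as a parameter keeps the loop testable on systems where the sysfs file is not available.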

Note - This program might not measure the CPU MHz that controls reductions in speed (throttling), introduced to reduce power consumption when temperature increases too much. No simple programming functions appear to be available for logging via a single app. Installing CPU-Z might enable independent measurements to be noted. CPU-Z can also provide CPU temperature measurement, with a range of values for a number of sensors on different systems. Research might be needed to find which are appropriate for CPU cores.

Below is an example, over the first 18 seconds, whilst running NEON-MFLOPS-MP benchmark (taking 14.6 seconds). In this case, MHz is fairly constant, but the frequency can vary a lot on other devices, or might run at a constant low value, if power saving is switched on.

 T22, Quad Core ARM Cortex-A53 1300 MHz, Android 5.0.2

 ARM/Intel NEON-MFLOPS2-MP Benchmark V2.2 15-Nov-2015 17.21
 Compiled for 64 bit ARM v8a

 FPU Add & Multiply using 1, 2, 4 and 8 Threads

              2 Ops/Word              32 Ops/Word
 KB       12.8     128   12800    12.8     128   12800
 MFLOPS
 1T        785     771     669    2862    2851    2739
 2T       1485    1499     895    5654    5729    5606
 4T       1937    2074     995   10862   11024   10636
 8T       2678    3021    1012    9971   10730   10534

 Total Elapsed Time 14.6 seconds

 Android CPU MHz 100 ms Sampling 15-Nov-2015 17:21:45

  0.00 1300   0.12 1300   0.23  299   0.38  299   0.53 1300
  0.67 1300   0.88 1300   1.01 1300   1.15 1300   1.29 1300
  1.43 1300   1.57 1300   1.70 1300   1.83 1300   1.97 1300
  2.11 1300   2.25 1300   2.39 1300   2.53 1300   2.67 1300 X
  2.81 1300   2.95 1300   3.09 1300   3.22 1300   3.36 1300
  3.49 1300   3.63 1300   3.76 1300   3.90 1300   4.04 1300
  4.18 1235   4.30 1300   4.42 1300   4.59 1300   4.76 1300
  4.91 1300   5.08 1300   5.23  819   5.40 1300   5.55  299 X
  5.74 1300   5.91 1300   6.09 1300   6.25 1300   6.41 1300
  6.59 1300   6.76 1300   6.92 1300   7.08 1300   7.24 1300
  7.40 1300   7.52 1300   7.68  299   7.88  442   8.06 1300
  8.24 1300   8.40  819   8.56 1300   8.71 1300   8.86 1300 X
  9.01 1300   9.16 1300   9.32 1300   9.48 1300   9.64 1300
  9.80 1300   9.97 1300  10.13 1300  10.27 1300  10.43 1300
 10.57 1300  10.72 1300  10.88 1300  11.02 1300  11.17 1300
 11.33 1300  11.47 1300  11.62 1300  11.78 1300  11.92 1300 X
 12.07 1300  12.22 1300  12.37 1300  12.53 1300  12.68 1300
 12.84 1300  12.99 1300  13.15 1300  13.30 1300  13.46 1300
 13.61 1300  13.76 1300  13.92 1300  14.08 1300  14.24 1300
 14.40 1300  14.56 1300  14.72 1300  14.88 1300  15.04 1300 X
 15.20 1300  15.36 1300  15.52 1300  15.69 1300  15.86 1300
 16.02 1300  16.21 1300  16.39 1300  16.55 1300  16.71 1300
 16.85 1300  17.00 1300  17.14 1300  17.30 1300  17.45 1300
 17.60 1300  17.75 1300  17.91 1300  18.05 1300  18.21 1300

Battery Drain Test - BatteryTest.apk

The program runs the second most demanding OpenGL drawing benchmark test, except that CPU MHz is displayed, along with Frames Per Second (FPS) and running time in minutes, the MHz figure being the average of one measurement per frame. Default running arrangements are 60 passes of one second each, producing two columns of results that are displayed and saved on the Internal Drive. These results are to demonstrate any reductions as the battery capacity reduces. Before running, Display/Power Settings should be changed to never switch off and the CPU set to run at maximum speed, if possible.

Three buttons are provided: Run, the usual Email option to save results, and a Time button that enables manual input of the number of seconds for each pass. If a flat battery turns the device off, then after recharging and rebooting, restarting the program reads and displays the saved results, ready for E-mailing. NOTE: some Android versions will not open a log file for saving results.

Following are results from a test set to run for 2 hours (60 x 120 seconds) and run twice. Displayed MHz, whilst the test was running, showed rapid variations that affected the final speed and FPS had a similar variation, but these were fairly constant over 4 hours.

Note - the later CPU Stress Tests might be more effective. Also, the CPU MHz app might not work on later systems.

 T21 Quad Core Qualcomm Snapdragon 800, Android 4.4.3
 GPU Qualcomm Adreno 330, 578 MHz

 Up to 60 120 second runs, MHz 1 sample/frame
 Log File /storage/emulated/0/BatteryTest.txt

 Android Battery Test

        28-Jul-2015 11.08                  28-Jul-2015 13.28

 Run   FPS   MHz   Run   FPS   MHz     Run   FPS   MHz   Run   FPS   MHz

   1  12.0  2100     2  12.1  1937       1  12.1  2014     2  12.2  2005
   3   7.6  1874     4  12.1  1966       3  12.0  1975     4  12.0  2005
   5  12.2  1993     6  12.2  1996       5  12.1  1962     6  12.1  1948
   7  12.2  1996     8  12.2  1966       7  12.2  1979     8  12.2  2004
   9  12.0  1935    10  12.3  1925       9  12.1  1959    10  12.1  2060
  11  12.3  1983    12  12.0  2015      11  12.0  2017    12  12.2  1992
  13  11.9  2013    14  12.1  2000      13  12.1  1987    14  12.2  1964
  15  12.1  1934    16  12.1  2005      15  12.1  1973    16  12.1  1978
  17  12.0  1948    18  12.0  2000      17  12.1  1998    18  12.1  1977
  19  11.9  1979    20  12.0  1972      19  12.0  2007    20  12.0  1956
  21  12.0  1997    22  12.0  1994      21  11.9  1966    22  12.1  1975
  23  12.2  2035    24  12.1  2013      23  12.0  1978    24  12.0  2012
  25  12.2  1981    26  12.1  1977      25  12.1  1988    26  12.1  2010
  27  12.2  1976    28  12.2  1991      27  12.1  2004    28  12.0  1994
  29  12.2  2000    30  12.3  1984      29  12.1  1989    30  12.2  2004
  31  12.2  1986    32  12.3  1964      31  12.2  2009    32  12.1  1979
  33  12.2  1955    34  12.1  1980      33  12.1  1945    34  12.0  1951
  35  12.1  2002    36  12.2  2045      35  12.1  1997    36  12.1  2022
  37  12.1  1993    38  12.2  2010      37  12.2  2038    38  12.1  2024
  39  12.1  1947    40  12.1  1959      39  12.1  1997    40  12.1  2049
  41  11.9  1949    42  12.0  1993      41  12.2  1996    42  12.1  1994
  43  12.1  1953    44  12.2  2005      43  12.0  1978    44  12.0  1985
  45  12.4  1928    46  12.3  1989      45  11.9  1947    46  12.2  1982
  47  12.1  1987    48  12.1  1969      47  12.1  2022    48  11.8  1964
  49  12.1  1992    50  12.1  1999      49  11.9  1985    50  12.1  1991
  51  12.2  1929    52  12.1  1955      51  12.1  1988    52  12.0  2002
  53  12.4  1950    54  12.3  1990      53  12.0  2009    54  11.9  2018
  55  12.3  1930    56  12.2  1922      55  12.0  1994    56  12.0  1974
  57  12.4  1952    58  12.1  1977      57  12.1  1950    58  12.1  2009
  59  12.1  1986    60  12.2  1962      59  12.0  1976    60  12.0  2010

 Total Elapsed Time 7614.9 seconds     Total Elapsed Time 7202.9 seconds

DriveSpeed Benchmarks - DriveSpeed.apk, DriveSpd2.apk, drivespeed32.exe

This is primarily intended for measuring performance of SD cards and internal drives, but can also be used to test USB drives. DriveSpeed carries out four tests.

Test 1 - Write and read three 8 MB and three 16 MB files; Results given in MBytes/second
Test 2 - Write 8 MB, read can be cached in RAM; Results given in MBytes/second
Test 3 - Random write and read 1 KB from 4 to 16 MB; Results are Average time in milliseconds
Test 4 - Write and read 200 files 4 KB to 16 KB; Results in MB/sec, msecs/file and delete seconds.
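As an illustration of the Test 1 style of measurement, the Java sketch below times writing and then reading one large file and reports MBytes/second. It is a minimal sketch under my own assumptions (1 MB blocks, a temporary file, a sync call before the write timer stops), not the DriveSpeed source, and the names are hypothetical.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical timing sketch, not the DriveSpeed implementation.
final class DriveSpeedSketch {
    // Returns {write MB/s, read MB/s} for one file of 'megabytes' MB in 'dir'.
    static double[] writeThenRead(Path dir, int megabytes) {
        byte[] block = new byte[1024 * 1024];
        try {
            Path file = Files.createTempFile(dir, "drivespeed", ".tmp");
            try {
                long t0 = System.nanoTime();
                try (FileOutputStream out = new FileOutputStream(file.toFile())) {
                    for (int i = 0; i < megabytes; i++) {
                        out.write(block);
                    }
                    out.getFD().sync();  // ask the OS to flush to the device before timing stops
                }
                double writeSecs = (System.nanoTime() - t0) / 1.0e9;

                t0 = System.nanoTime();
                long bytesRead = 0;
                try (FileInputStream in = new FileInputStream(file.toFile())) {
                    int n;
                    while ((n = in.read(block)) > 0) {
                        bytesRead += n;
                    }
                }
                double readSecs = (System.nanoTime() - t0) / 1.0e9;

                double readMB = bytesRead / (1024.0 * 1024.0);
                return new double[] {megabytes / writeSecs, readMB / readSecs};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = java.nio.file.Paths.get(System.getProperty("java.io.tmpdir"));
        double[] mbps = writeThenRead(dir, 8);
        System.out.printf("Write %.1f MB/s, Read %.1f MB/s%n", mbps[0], mbps[1]);
    }
}
```

A read immediately after the write is likely to be served from RAM, so the read figure from this sketch corresponds to the cached case rather than the true device speed.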

The first DriveSpeed benchmark has two run buttons, RunS for an SD card and RunI for the internal drive, the file path being identified by standard functions. The external SD test worked on earlier Android tablets but failed on later Android versions. RunS ran but provided distorted reading speeds by caching data in RAM. An extra button was added to prevent large files from being deleted and a read only option to measure uncached speeds after rebooting.

DriveSpd2 requires input of the file path to use and this might be identified using a file browser app. The file path can sometimes be selected for internal drives, SD cards and USB devices but there are complications associated with permissions and caching.

Running these benchmarks can require a lot of experimentation. Lots of paths, results and explanations are here and here. Following are example DriveSpd2 results from T22 (Lenovo Tab 2 A8-50) testing an external SD card, T11 (Voyo A15) from a USB 3 flash drive and read only benchmark results.

Intel/Windows Versions - Results for Tablet W1 main drive are below, with USB 3 and SD card speeds here, along with some via Windows and Linux.

########################## T22 #########################

 T22 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53

 Android DriveSpeed2 Benchmark 1.0 28-Aug-2015 12.56

 Data Not Cached
                         MBytes/Second
    MB   Write1   Write2   Write3    Read1    Read2    Read3

     8      3.7      3.7      3.6     20.3     20.6     20.4
    16      2.6      3.7      3.7     20.5     20.5     20.5
 Cached
     8     52.4    107.8     13.2    228.8    226.3    226.7

 Random          Write                      Read
 From MB        4        8       16        4        8       16
 msecs       4.65     4.91    18.23     0.01     0.01     0.66

 200 Files       Write                      Read              Delete
 File KB        4        8       16        4        8       16     secs
 MB/sec      0.07     0.18     0.49     2.16     3.79     6.51
 msecs      59.14    44.59    33.61     1.90     2.16     2.52    2.099

 Total Elapsed Time 85.4 seconds
 File Path Used - /storage/sdcard1/
 Drive MB 15258 Free 14687

########################## T11 #########################

 T11 Samsung EXYNOS 5250 Dual 1.7 GHz Cortex-A15,

 Android DriveSpeed2 Benchmark 1.0 10-Dec-2013 12.52

 Data Not Cached
                         MBytes/Second
    MB   Write1   Write2   Write3    Read1    Read2    Read3

     8     40.9     46.6     46.2    100.7     95.9     71.4
    16     45.2     51.9     51.1     98.8     70.7     66.2
 Cached
     8    150.4    127.7     50.9    687.6    688.7    709.2

 Random          Write                      Read
 From MB        4        8       16        4        8       16
 msecs       0.91     0.90     0.82     0.01     0.01     0.02

 200 Files       Write                      Read              Delete
 File KB        4        8       16        4        8       16     secs
 MB/sec      0.56     1.18     1.85     4.20    13.33    34.79
 msecs       7.29     6.96     8.88     0.98     0.61     0.47    0.149

 Total Elapsed Time 24.8 seconds
 File Path Used - /mnt/udisk/
 Drive MB 30517 Free 30466

###################### Read Only #######################

 Android DriveSpeed Benchmark Internal Drive Read Only

                         MBytes/Second
 Device  Write1   Write2   Write3    Read1    Read2    Read3

 T7         0.0      0.0      0.0     41.7     42.8     39.0
 T11        0.0      0.0      0.0     53.7     53.5     53.9
 T21        0.0      0.0      0.0    102.9    104.0    103.6
 T22        0.0      0.0      0.0    127.7    145.7    139.9
 A1         0.0      0.0      0.0    155.7    128.6    156.2

################## W1 DriveSpeed32.exe W1 Windows 10 #################

 Current Directory Path: C:\Test
 Total MB 58722, Free MB 45286, Used MB 13436

 Windows Storage Speed Test 32-Bit Version 1.2, Mon Jan 04 16:09:25 2016
 Copyright (C) Roy Longbottom 2011

   8 MB File          1        2        3        4        5
 Writing MB/sec  100.68   101.04   110.81   105.04   113.32
 Reading MB/sec  154.58   155.78   132.18   153.97   153.86

  16 MB File          1        2        3        4        5
 Writing MB/sec  115.96   117.50   118.53   113.16   116.46
 Reading MB/sec  150.29   155.47   156.13   150.62   157.92

  32 MB File          1        2        3        4        5
 Writing MB/sec  118.84   118.26   123.01   123.42   125.39
 Reading MB/sec  146.70   153.65   146.41   148.77   155.54
 ---------------------------------------------------------------------
 8 MB Cached File     1        2        3        4        5
 Writing MB/sec  176.10   292.34   462.14   201.19   452.46
 Reading MB/sec  599.06   830.94   992.19   878.99  1033.57
 ---------------------------------------------------------------------
 Bus Speed Block KB  64      128      256      512     1024
 Reading MB/sec  101.09   107.71   123.43   139.70   136.62
 ---------------------------------------------------------------------
 1 KB Blocks File MB >    2      4      8     16     32     64    128
 Random Read  msecs    0.22   0.18   0.18   0.18   0.18   0.19   0.19
 Random Write msecs    0.13   0.13   0.13   0.13   0.14   0.19   0.21
 ---------------------------------------------------------------------
 500 Files     Write              Read            Delete
 File KB    MB/sec  ms/File    MB/sec  ms/File   Seconds
      2       0.56     3.68      3.00     0.68     0.629
      4       0.84     4.85      6.79     0.60     0.541
      8       1.92     4.27     13.34     0.61     0.502
     16       1.01    16.17     22.14     0.74     0.528
     32       1.95    16.81     38.21     0.86     0.527
     64       3.75    17.50     59.57     1.10     0.490

 End of test Mon Jan 04 16:10:53 2016

CPU Stress Tests - USE AT YOUR OWN RISK

Reliability/Stress tests were run using the ARM CPUs on various Raspberry Pi Systems, including 32 bit and 64 bit Operating Systems. Besides attempting to identify any false calculations or system crashes, a main purpose was to demonstrate performance reductions as the CPUs became overheated and identify processor clock throttling. This was aided by the availability of programmable functions that measure CPU MHz and temperature. The Raspberry Pi tests exercised multiple processor cores by running a number of copies of the same programs via script files.

Running multiple copies of the same program does not appear to be possible using Android. So, multithreaded versions were produced, one using floating point calculations and the other integers. Earlier Android CPU benchmarks did not display results until the end of executing all tests. With long running stress tests, it is desirable to display running time and performance on an on-going basis. In this case, unreported calibration phases attempt to set run time parameters that lead to initial reportable test periods of around 10 seconds. This can be longer, if the initial pass takes more than 10 seconds, such as when other programs are running at the same time (as in the screen shot below).
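The calibration idea can be sketched as follows: time a short trial, then scale the pass count so that one reported period lasts about 10 seconds. This Java sketch is an assumption about how such scaling could be done; the names are hypothetical and the real apps may calibrate differently.

```java
// Hypothetical calibration helper, not taken from the stress test source.
final class Calibration {
    // Scales a timed trial to the pass count expected to fill 'targetSecs'.
    // Never returns fewer than one pass, so a slow trial simply overruns the target.
    static long passesFor(double targetSecs, long trialPasses, double trialSecs) {
        if (trialSecs <= 0.0) {
            return trialPasses;  // timer resolution too coarse; keep the trial count
        }
        return Math.max(1L, Math.round(trialPasses * targetSecs / trialSecs));
    }

    public static void main(String[] args) {
        // 100 trial passes took 0.5 seconds: 2000 passes should take about 10 seconds.
        System.out.println(passesFor(10.0, 100, 0.5));
        // A single pass already took 12 seconds: the reported period exceeds 10 seconds.
        System.out.println(passesFor(10.0, 1, 12.0));
    }
}
```

The second case matches the behaviour described above, where the first reported period is longer than 10 seconds because other programs were running during calibration.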

Besides the CPU slowing down due to heating effects, mobile devices, of course, run slower as the battery becomes discharged. In the event of the latter, or if CPU MHz throttling cannot avoid overheating, the CPU should turn off automatically (OR WORSE! - WATCH IT). It is recommended that stress testing is limited to one or a number of 15 minute sessions, to allow results to be saved and judgments made whether to continue.

Experience from running the CPU MHz Benchmark and the Raspberry Pi Stress Tests indicates that the functions required to obtain effective CPU MHz can vary between systems. This also applies to the measurement of CPU temperature. Hence, there can be no simple program to monitor these. In some cases, manual measurements can be noted after installing CPU-Z from Google Play. One difficulty there is that a number of temperature measurements might be provided, without indications of the location.

The screenshot, below, of both stress tests, was from P37, a Moto G phone running Android 7. This has the option to run two programs at the same time, via a split screen. Besides performance, note the displayed sumchecks. An indication is given if data is not of the expected value.

The source code and project files are included in Android Intel-ARM Benchmarks.zip.

FPU + Int Test
Buttons
RunB - Run Benchmark - Runs most combinations of number of threads, data sizes and calculations per data word for the FPU tests. This is mainly to help to decide which options to use for stress testing. The benchmark runs using fixed parameters, carrying out exactly the same number of calculations using all thread combinations and data sizes. The pass count changes according to the number of calculations per word, for the FPU tests.
RunS - Run Stress Tests - Default running time is 15 minutes, with the middle data size, intended for containment in L2 cache, using 8 threads and 32 operations per word in the FPU tests.
SetS - Specify run time parameters for stress test - These are 1, 2, 4, 8, 16 or 32 threads, 2, 8 or 32 Operations per word for FPU tests, 12.8 or 16 KB, 128 or 160 KB, 12.8 or 16 MB for FPU or Integer tests, and running time in minutes.
Info - Test description and details - This is essentially the same as details provided here.
Save - This offers details of the results and identified CPU hardware and Operating System for E-mail. Default addressee is the program author via results@roylongbottom.org.uk but this can be changed or additional addresses added.
Timing
On benchmarking, the running time of each pass is provided, reducing, where appropriate, on doubling the thread count.
Cumulative running time is provided for the stress tests, demonstrating the number of passes carried out in the specified running time. This increases as the CPU slows down due to heating effects or a discharged battery.

Floating Point Stress Test - MP-FPU-Stress.apk

Benchmark - This is essentially the same program as used for the MP-MFLOPS Benchmark which, besides carrying out calculations with 2 and 32 floating point operations per data word, includes a further function with 8 operations. As a reminder, the benchmark runs using fixed parameters, carrying out exactly the same number of calculations using 1, 2, 4 and 8 threads. Note the sumchecks of numeric results of calculations, where every word is checked for identical values and results of zero are reported if any are incorrect. The number of calculations, and associated sumchecks, vary using different memory sizes and varying speeds of operation of caches and RAM.
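The sumcheck idea described above can be sketched as follows. This is a simplified illustration, not the benchmark's code: the constants, the starting value and the function names are hypothetical. Every data word starts with the same value and receives the same sequence of operations, so all words should finish identical, and zero is reported if any word differs.

```python
def run_pass(words, ops_per_word):
    """Apply ops_per_word floating point operations to each data word."""
    a, b = 0.000020, 0.999980  # hypothetical constants, not the benchmark's
    for _ in range(ops_per_word // 2):  # one multiply plus one add per step
        words = [w * b + a for w in words]
    return words

def sumcheck(words):
    """Return the common result, or 0 if any word differs from the first."""
    first = words[0]
    return first if all(w == first for w in words) else 0
```

Raising ops_per_word from 2 to 8 to 32 increases the arithmetic carried out for each word fetched, which is why the higher settings depend less on cache and RAM speed.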

Stress Test - As indicated earlier, the stress test runs multiple times, using the same run time parameters for number of threads, data size, floating point operations per data word and operations per pass, for the specified number of minutes. Then, the number of repeat passes can be fewer if CPU MHz is reduced. The calculated sumchecks should be identical for all threads. In the event of any comparison failures, the reported sumcheck is shown as zero.
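The structure of a timed stress run, as described above, can be sketched like this. It is an outline under stated assumptions rather than the app's implementation: calibrate, stress and the clock parameter are hypothetical names, and the calibration here simply times one unit of work to choose a repeat count aimed at the target pass time.

```python
import time

def calibrate(work, target_secs, clock=time.perf_counter):
    """Choose a repeat count so one pass takes about target_secs."""
    start = clock()
    work()
    elapsed = max(clock() - start, 1e-9)  # avoid division by zero
    return max(1, round(target_secs / elapsed))

def stress(work, repeats, run_secs, clock=time.perf_counter):
    """Repeat fixed-size passes until run_secs has elapsed; return pass times."""
    pass_times = []
    begin = clock()
    while clock() - begin < run_secs:
        t0 = clock()
        for _ in range(repeats):  # same amount of work in every pass
            work()
        pass_times.append(clock() - t0)
    return pass_times
```

Because the work per pass is fixed after calibration, a CPU that slows down produces longer pass times and therefore fewer passes within the same running time.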

Below are results from one minute stress tests using 16 and 32 threads, demonstrating similar throughput of around 6 GFLOPS. This is followed by details from 15 minute runs on various systems using 8 threads, including the same T22 system, which still produced a consistent performance of around 6 GFLOPS. All tests were carried out with fully charged batteries and power connected.

The table demonstrates a wide variation in the number of passes carried out in 15 minutes, where some are influenced by the calibration calculations for 10 seconds test duration, in this case the first pass shown as taking between 9.8 and 11.5 seconds. Besides speed reductions due to heating effects, or little change at the end, there can be short term reductions due to other system activity (worst case like downloading and installing updates).
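As a rough check on the pass counts, the number of passes that fit in the running time can be estimated from the seconds per pass. The helper below is a hypothetical illustration and ignores any overhead between passes.

```python
def passes_in(run_minutes, secs_per_pass):
    """Whole passes that fit in run_minutes at a constant pass time."""
    return int(run_minutes * 60 // secs_per_pass)

# P37 in the FPU table: 11.5 seconds for the first pass, 17.0 for the last
upper = passes_in(15, 11.5)
lower = passes_in(15, 17.0)
```

For P37 these bounds are 78 and 52 passes, and the 64 passes recorded in the table fall between them, as expected for a system that slowed down during the run.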

P37 produced the highest performance degradation for initial tests, at 43%. The next three had similar beginning and end performance, with the occasional short term hiccup. The first T21 session produced slightly slower speed at the end. Repeating this shortly afterwards produced a 12% degradation. Kindle3 was run with the tablet in direct sunlight, with surrounding air around 30°C. This led to a 57% performance degradation. The last set of results were somewhat inconsistent over the whole period.
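The degradation figures quoted for P37 and Kindle3 can be reproduced from the Maximum and Minimum rows of the table, assuming degradation is taken as the drop from the fastest to the slowest pass. That formula is an assumption, as the calculation is not stated in the text.

```python
def degradation_percent(max_rate, min_rate):
    """Percentage drop from the fastest pass to the slowest pass."""
    return round(100.0 * (1.0 - min_rate / max_rate))

p37 = degradation_percent(5451, 3115)      # P37 maximum and minimum MFLOPS
kindle3 = degradation_percent(4894, 2113)  # Kindl3 maximum and minimum MFLOPS
```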

 Benchmark Mode Results

 ARM/Intel MP-FPU Stress Test V1.0 30-May-2017 19.39
 Compiled for 32 bit ARM v7a

                        MFLOPS           Numeric Results
             Ops/    KB     KB     MB      KB     KB     MB
  Secs Thrd  Word  12.8    128   12.8    12.8    128   12.8

   8.6   T1     2   228    227    220   40392  76406  99700
   4.4   T2     2   451    449    434   40392  76406  99700
   2.4   T4     2   882    882    736   40392  76406  99700
   2.0   T8     2  1182   1250    758   40392  76406  99700
  16.3   T1     8   477    477    466   54760  85092  99819
   8.2   T2     8   951    949    925   54760  85092  99819
   4.2   T4     8  1856   1879   1830   54760  85092  99819
   2.8   T8     8  2738   2941   2744   54760  85092  99819
  38.1   T1    32   811    813    801   35218  66014  99520
  19.1   T2    32  1625   1621   1605   35218  66014  99520
   9.7   T4    32  3190   3222   3186   35218  66014  99520
   6.1   T8    32  4909   5179   5135   35218  66014  99520

 End Time 30-May-2017 19.41

 Stress Test 16 Threads

 ARM/Intel MP-FPU Stress Test V1.0 01-Jun-2017 11.43
 Compiled for 64 bit ARM v8a

            Data           Ops/           Nmeric
 Seconds    Size  Threads  Word  MFLOPS  Results

    11.9  128 KB       16    32    6058    35951
    22.1  128 KB       16    32    6012    35951
    32.8  128 KB       16    32    5717    35951
    43.1  128 KB       16    32    5988    35951
    53.3  128 KB       16    32    5991    35951
    63.6  128 KB       16    32    5962    35951

 End Time 01-Jun-2017 11.46

 Stress Test 32 Threads

 ARM/Intel MP-FPU Stress Test V1.0 01-Jun-2017 11.40
 Compiled for 64 bit ARM v8a

            Data           Ops/           Nmeric
 Seconds    Size  Threads  Word  MFLOPS  Results

    11.8  128 KB       32    32    6087    35951
    22.0  128 KB       32    32    6040    35951
    32.2  128 KB       32    32    6001    35951
    42.4  128 KB       32    32    6020    35951
    52.7  128 KB       32    32    6001    35951
    63.1  128 KB       32    32    5897    35951

 End Time 01-Jun-2017 11.43

 Various Systems, all 8 Threads, 32 Ops/word, 128 KB, 15 Minutes

 System      P37    T22     A1     A5    T21    T21    T21    T11
 Device     moto   Leno   Asus    Tec Kindl1 Kindl2 Kindl3   Voyo
 CPU         A53    A53   Atom   Atom  QC800  QC800  QC800    A15
 Cores         8      4      4      4      4      4      4      2
 GHz     1.5+1.2    1.3   1.86   1.44    2.2    2.2    2.2      2

 Test Secs
 Start      11.5   10.2   10.0    9.8   10.5   10.5   10.6   10.4
 End        17.0   10.3   10.0    9.1   11.1   11.7   23.8   12.2

 Pass    -------------------------- MFLOPS --------------------------

    1       5435   6025   4131   3329   4766   4853   4810   2758
    2       5451   5937   4110   3183   4856   4876   4826   2226
    3       5451   6005   4114   3097   4886   4886   4846   2937
    4       5349   5919   4107   3168   4889   4882   4729   3045
    5       5396   5995   4137   3138   4863   4897   4833   3052
    6       5332   5997   4117   3154   4895   4766   4712   3032
    7       5334   5985   4103   3161   4877   4690   4717   3023
    8       5431   6009   4097   3214   4889   4610   4864   3056
    9       5195   5977   4099   3193   4894   4609   4873   2726
   10       5415   5879   4144   3153   4898   4574   4876   3033
   11       5278   5994   4087   3149   4805   4610   4891   2592
   12       5315   5989   4109   3140   4835   4592   4878   3046
   13       5311   5977   4136   3151   4862   4617   4874   1617
   14       5142   5991   4106   3173   4890   4557   4856   2216
   15       5069   5964   4138   3113   4890   4569   4894   2719
   16       5017   4618   4101   3118   4899   4546   4805   3037
   17       5102   5879   4128   3161   4869   4553   4604   2729
   18       5073   5945   4098   3135   4871   4520   4733   2727
   19       5064   5963   4144   3170   4869   4533   4652   2973
   20       5104   5976   4131   3139   4885   4558   4605   2672
   21       4625   5824   4139   3152   4882   4512   4594   2699
   22       4558   5984   4145   3106   4892   4547   4559   2924
   23       4572   5934   4164   3128   4870   4508   4535   2739
   24       4701   5968   4128   3132   4860   4497   4524   2626
   25       4674   5975   4083   3121   4870   4550   4488   2987
   26       4298   5979   4079   3139   4734   4525   4443   2675
   27       4384   5963   4124   3034   4697   4485   4413   2623
   28       4343   5981   4106   3118   4781   4483   4416   2928
   29       4442   5965   4135   3180   4866   4514   4441   2692
   30       4147   5974   4141   3130   4817   4492   4436   2619
   31       4246   5998   4099   3032   4837   4505   4422   2744
   32       4530   6008   4046   3393   4872   4469   3390   2892
   33       3903   5951   4120   3380   4876   4488   4259   2615
   34       3979   5990   4098   3350   4864   4519   3228   2617
   35       4639   5973   4120   3388   4858   4488   3572   2833
   36       3934   5953   4107   3364   4889   4499   3408   2801
   37       4021   5921   4118   3372   4842   4474   3150   2579
   38       3872   5983   4138   3401   4855   4515   3377   2624
   39       4002   5925   4109   3397   4853   4464   2772   2613
   40       4212   5996   4141   3384   4832   4474   2996   2838
   41       3997   5970   4109   3397   4854   4460   2892   2800
   42       3998   5986   4084   3397   4856   4446   2686   2645
   43       3878   5992   4116   3302   4878   4432   2691   2523
   44       3907   5965   4150   3400   4854   4485   2695   2589
   45       3955   5922   4113   3402   4818   4429   2696   2840
   46       3795   5944   4132   3368   4862   4475   2702   2765
   47       3843   5938   4098   3359   4786   4432   2690   2652
   48       3799   5979   4118   3379   4817   4464   2690   2492
   49       3532   5947   4125   3374   4876   4438   2202   2619
   50       3115   5986   4121   3375   4804   4435   2162   2798
   51       3962   5979   3728   3399   4840   4435   2162   2694
   52       3922   5980   4084   3401   4697   4404   2165   2520
   53       3822   5977   4120   3383   4776   4448   2148   2621
   54       3669   5967   4067   3364   4732   4383   2113   2607
   55       3777   5991   4141   3389   4673   4444   2126   2702
   56       3591   5964   4111   3390   4739   4423   2170   2830
   57       3660   5992   4113   3372   4700   4428   2137   2627
   58       3883   5966   4115   3397   4684   4445   2163   2510
   59       3727   5972   4114   3395   4723   4436   2158   2522
   60       3710   6002   4105   3209   4700   4356   2152   2792
   61       3951   6009   4046   2951   4722   4408          2699
   62       3628   5807   4082   3109   4745   4381          2546
   63       3572   5929   4124   3069   4728   4390          2527
   64       3743   5963   4113   3144   4714   4442          2522
   65              5954   4133   3145   4699   4405          2785
   66              5949   4142   3074   4688   4360          2698
   67              5964   4112   3087   4688   4374          2532
   68              5903   4107   3152   4685   4334          2468
   69              5956   4088   3037   4527   4370          2630
   70              5962   4136   3146   4664   4399          2793
   71              5981   4146   3158   4658   4407          2598
   72              5985   4107   3119   4647   4382          2508
   73              5937   4086   3134   4618   4372          2512
   74              5944   4130   3162   4658   4387          2504
   75              5965   4086   3143   4602   4395          2787
   76              5971   4153   3163   4652   4382          2607
   77              5987   4130   3155   4588   4383          2547
   78              5957   4150   3145   4581   4391          2512
   79              5920   4137   3128   4596   4381
   80              5984   4109   3141   4631
   81              5989   4146   3121   4623
   82              5959   4120   3174   4609
   83              5957   4140   3184   4533
   84              5982   4102   3143   4634
   85              5902   4111   3171
   86              5954   3787   3144
   87              6000   4097   3167
   88                     4101   3121
   89                     4084   3162
   90                     4155   3141
   91                            3027
   92                            3336
   93                            3397
   94                            3365

 Average    4403   5948   4108   3217   4779   4498   3715   2690
 Maximum    5451   6025   4164   3402   4899   4897   4894   3056
 Minimum    3115   4618   3728   2951   4527   4334   2113   1617

Integer Stress Test - MP-Int-Stress.apk

This test writes data, comprising two data patterns out of 24 variations (such as binary 0000, 0101, 0011, 1111), then reads it via alternate additions and subtractions. This leaves the original data unchanged, which is checked for correctness, with any errors reported. As with the Floating Point Stress Test, buttons are provided to run a quick benchmark or a long running stress test, and one to set parameters for the latter. Performance is measured in MB/second.
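The write and read-back scheme can be illustrated with the following sketch. It is a simplification under stated assumptions: a single pattern per block rather than two, 32 bit words emulated with a mask, and hypothetical function names. With an even number of identical words, alternately adding and subtracting returns the accumulator to its starting value, so any other result indicates a corrupted word.

```python
MASK = 0xFFFFFFFF  # emulate 32 bit unsigned wraparound

def write_block(pattern, n_words):
    """Fill a block with one repeated data pattern."""
    return [pattern] * n_words

def read_check(block, pattern):
    """Alternately add and subtract words; an intact block returns pattern."""
    acc = pattern
    for i, word in enumerate(block):
        if i % 2 == 0:
            acc = (acc + word) & MASK
        else:
            acc = (acc - word) & MASK
    return acc

block = write_block(0x5A5A5A5A, 1024)
result = read_check(block, 0x5A5A5A5A)
```

In this sketch the returned value plays the role of the sumcheck shown in the results, where an intact pass reports the data pattern itself.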

Benchmark - Below is an example of results, the program using all thread and data size combinations, and the first 6 data patterns. Note fastest speeds are with all threads using different sections of 160 KB.

Stress Test - Following the benchmark output are some stress test results, all at the default parameter settings and mainly with the systems connected to the power source. As with MP-FPU-Stress.apk, the number of passes in 15 minutes varies, depending on the initial calibrated time and on whether speed changes due to the CPU clock speed reducing at higher temperatures. Results include some with the devices running without the power supply connected. One (T21) showed similar performance on battery and on the power supply, but the battery was probably fully charged.

 Benchmark Mode Results

 ARM/Intel MP-Int Stress Test V1.0 21-Jun-2017 16.50
 Compiled for 32 bit ARM v7a

                   MB/second
                KB     KB     MB            Same All
  Secs Thrds    16    160     16  Sumcheck     Tests

   9.1     1  2970   2855   2336  00000000       Yes
   4.7     2  5770   5605   4523  FFFFFFFF       Yes
   3.0     4 10876  10907   5534  5A5A5A5A       Yes
   2.4     8 14361  16162   6156  AAAAAAAA       Yes
   2.3    16 16522  18100   6091  CCCCCCCC       Yes
   2.3    32 15948  17827   6187  0F0F0F0F       Yes

 End Time 21-Jun-2017 18.41

 Various Systems, 8 Threads, 160 KB, 15 Minutes

 System      P37    T22    T11    T11     A1     A5    T21    T21
 Device     moto   Leno   Voyo   Voyo   Asus    Tec Kindl2 Kindl2
 CPU         A53    A53    A15    A15   Atom   Atom  QC800  QC800
 Cores         8      4      2      2      4      4      4      4
 GHz     1.5+1.2    1.3      2      2   1.86   1.44    2.2    2.2

 Test Secs
 Start       9.5   10.1    9.7    8.5    9.7    7.2    9.4    9.0
 End        13.5    9.7   14.2   12.9    9.0    7.1   10.3    9.9

 Pass                         Battery                     Battery

    1      20037  16331  12745  11029  25433  22184  14100   9778
    2      19149  16111  12102  10888  26589  21509  14361  14046
    3      19451  16127  10349  10629  26185  20577  14570  13577
    4      19308  16073  10938   8464  26727  18492  14433  14111
    5      19386  16308  10988   9600  26541  18574  14449  12458
    6      19714  16511  10713   6264  26841  19075  14468  12866
    7      19376  16283  10186   6286  26982  18206  14468  14298
    8      19327  16110   9845   6453  26761  18080  14488  13913
    9      19224  16036   9792   6239  26804  16131  14101  14174
   10      20331  16409  10116   6267  26563  15385  14097  13272
   11      19945  16324  10797   6302  26765  14799  13961  12987
   12      19101  15923   9830   6348  26946  12244  13875  13444
   13      19478  16066   9630   6469  26928  17727  13381  14341
   14      19482  16472   9043   6358  26708  16036  13173  14083
   15      18492  16146   9873   6441  26985  12831  13121  14339
   16      18664  15971  10445   6272  26678  18164  13071  13381
   17      18476  16296   9875   6337  26818  18394  13017  13738
   18      16615  16371   8887   6402  27028  18204  13033  13732
   19      15829  16419   9170   6441  27069  18688  13078  13842
   20      16755  16205   9185   6399  26640  18542  13067  13650
   21      14564  16059  10952   6297  26796  18379  12936  13692
   22      16996  15787   9597   6418  26967  18645  12896  13573
   23      14891  16051   9359   6201  26830  19071  12966  13614
   24      17154  16219   9178   6244  26765  18641  12759  13540
   25      14580  15907   8707   6373  26817  18589  12908  13484
   26      17185  15995  10194   6327  26875  18765  12891  13518
   27      14063  15978   9824   6362  26716  17421  12781  13392
   28      15158  16004   8697   6422  26725  18432  12771  13301
   29      14347  16341   8705   6292  26909  16909  12779  13060
   30      13116  16060   8774   6420  26854  15801  12689  13445
   31      13267  16475  10325   6281  26888  13700  12768  13196
   32      13814  16123   9327   6411  26948  20494  12785  13248
   33      14348  16107   8643   6313  26960  20499  12723  13073
   34      12555  16150   8445   6334  25360  21228  12794  12926
   35      12579  16043   8702   6450  26332  21266  12613  12942
   36      14506  16026   9960   5991  26047  21142  12613  13107
   37      14338  16309   9510   6435  26233  20585  12594  13225
   38      12474  16409   8837   6389  26708  23052  12523  13167
   39      14030  15855   8564   6435  26985  23206  12551  13142
   40      14399  16140   8594   6322  26869  23224  12503  13108
   41      13122  15976   8876   6174  26730  23089  12447  13092
   42      12340  16181  10637   6233  26445  23183  12443  13055
   43      12880  16184   9317   6376  26554  23255  12555  12914
   44      12454  16184   8423   6423  26970  23393  12473  13159
   45      12220  16341   8614   6447  26637  23267  12408  12927
   46      11486  16351   8183   6391  27022  23291  12423  13078
   47      13306  16196  10820   6327  26650  23321  12383  12895
   48      13629  16254   8968   6312  26702  23192  12234  13033
   49      11897  16351   8463   6286  26558  23306  12353  12884
   50      14640  16069   8036   6364  26757  23229  12344  12942
   51      11354  16054  10331   6440  26781  23166  12320  12918
   52      13217  16080   9005   6360  26759  23270  12261  12868
   53      12672  15856   8373   6425  26863  23259  12216  12919
   54      11752  16150   8603   6441  27063  23352  12255  12808
   55      12783  16147   8228   6384  26641  23302  12283  12923
   56      12984  16340   9258   6424  26832  23249  12201  12838
   57      11459  16434   9690   6247  26491  23395  12228  12829
   58      13042  16378   8510   6461  26706  22927  12265  12856
   59      11289  16019   8259   6372  26866  23063  12241  12821
   60      14140  16443   8089   6360  26919  23335  12212  12743
   61      11527  16392   9949   6249  26639  23257  12227  12683
   62      12224  16332   9177   6332  26721  23148  12081  12581
   63      11942  16209   8383   6430  26806  23201  12302  12732
   64      11836  15867   8339   6366  26724  23254  12210  12706
   65      12680  16161   8215   6435  26560  23295  12143  12740
   66      11171  16228   8719   6338  26823  23278  12166  12731
   67      13207  16193   9821   6422  27002  23175  12107  12792
   68      11382  16320   8978   6410  26768  23358  12077  12673
   69      11365  16309   8073   6353  26942  23214  12135  12575
   70      13366  16169   7896   6426  26681  22885  12111  12817
   71      10909  16759   9942   6360  26608  23360  12208  12645
   72      13351  15734   9088   6300  26883  23238  12098  12650
   73             16595   8319         26708  23302  12652  13016
   74             16291                26859  23308  12639  13126
   75             15874                27044  23292  12656  13135
   76             15990                26904  23344  12775  13155
   77             16167                26865  23252  12664  13198
   78             16320                26734  23181  12554  13108
   79             16416                26781  23351  12423  12995
   80             15805                27054  23130  12554  12889
   81             16097                26781  23135  12464  13011
   82             16405                26911  23331  12538  12960
   83             16120                26803  19419  12370  13119
   84             16338                26739  20700  12478  13037
   85             16211                26756  21231  11963  12552
   86             16311                26798  21087  12061  12426
   87             16082                26798  21321  12123  12465
   88             16141                26969  20290  11974  12387
   89             16125                26974  21290  11936  12448
   90             15928                26737  20739  11999  12499
   91             16068                26703  21205  11932  12403
   92             16321                26724  21281  12033  12422
   93             16329                26595  21068  11915  12455
   94             16398                27019  20730         12305
   95                                  26742  21275         12404
   96                                  26705  21495         12317
   97                                  27052  20938
   98                                  26899  21087
   99                                  26613  21154
  100                                  26466  21066
  101                                  26338  20967
  102                                  26626  19663
  103                                         20117
  104                                         20996
  105                                         21445
  106                                         20935
  107                                         20882
  108                                         21328
  109                                         21347
  110                                         20954
  111                                         21175
  112                                         21107
  113                                         20924
  114                                         20812
  115                                         21636
  116                                         21501
  117                                         21184
  118                                         21303
  119                                         21259
  120                                         21161
  121                                         21398
  122                                         20517
  123                                         20803
  124                                         21484
  125                                         21334
  126                                         20291

 Average   14780  16186   9356   6615  26739  20978  12713  13046
 Maximum   20331  16759  12745  11029  27069  23395  14570  14341
 Minimum   10909  15734   7896   5991  25360  12244  11915   9778
 End       11365  16398   8319   6300  26626  20291  11915  12317

 End Pass Seconds
            13.5    9.7   14.2   12.9    9.0    7.1   10.3    9.9
 Passes       72     94     73     72    102    126     93     96

Floating Point Plus Integer Tests

The following series of tests comprised running both the floating point and integer stress tests at the same time, both using the default parameters with 8 threads. All tests were run using battery power. Results provided are for the first three test runs and the last three of the overall 15 minutes. Note that the two test programs had different running times.

The first (P37) has 8 cores. On starting, each of the two stress tests, as might be expected, initially ran at around half speed. After 15 minutes, both produced similar performance degradations, essentially the same as the single program tests using power supplies. Following a slight delay, the second tests started running at slightly decreased temperatures and faster speed, but produced slower end speeds. Run 3 started in a similar manner, then went haywire, with the FPU tests running at a crawl and the other speeding up. The fourth test runs fitted the normal pattern, each ending with performance equivalent to a quarter of the maximum obtained when running a single program.

The second system (T21) has a quad core CPU and produced fairly consistent performance over this particular hour of testing.

Next, log details are provided to demonstrate that a device can handle 64 threads, using 32 from each of the stress tests. In this case (with P37), performance over 5 minutes was similar to that at the start of the 8 thread test, using both apps.
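The %max columns in the results that follow express each measured speed as a percentage of the best speed obtained when running one program on its own. A minimal sketch of that calculation, with a hypothetical function name, is:

```python
def percent_of_max(rate, single_program_max):
    """Speed as a rounded percentage of the single program maximum."""
    return round(100.0 * rate / single_program_max)

# P37 values: single program maxima are 20037 MB/sec and 5435 MFLOPS
run1_int_start = percent_of_max(10441, 20037)
run1_fpu_start = percent_of_max(2790, 5435)
run4_int_end = percent_of_max(5081, 20037)
run4_fpu_end = percent_of_max(1276, 5435)
```

The Run 4 end values of 25% and 23% correspond to the quarter of maximum performance mentioned for P37.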

                    P37 Octa-core Cortex-A53              T21 Quad Core Snapdragon 800

              Secs MB/sec %max   Secs MFLOPS %max     Secs MB/sec %max   Secs MFLOPS %max

 Max 1 program      20037               5435                14570               4899

 Run 1 Start    17  10441   52     11   2790   51        9   7192   49     21   2702   55
                16  10677          11   2819             8   7313          20   2482
                17  10163          11   2862             8   6848          21   2517
       End      22   7703          15   2018             9   6718          25   2482
                25   7030          16   1886             9   6419          24   2517
                25   6913   35     16   1899   35        9   6564   45     24   2479   51

 Run 2 Start    18   8713   43     16   1969   36       10   6414   44     24   2140   44
                20   7848          16   1964            10   6669          23   2268
                20   7711          16   1949            10   6312          24   2179
       End      26   5966          20   1529            10   6513          27   1883
                27   5865          20   1522            10   6299          25   2092
                27   5733   29     20   1569   29       12   5339   37     27   1900   39

 Run 3 Start    21   6957   35     18   1746   32       10   6619   45     23   2247   46
                23   6445          20   1553            10   6609          24   2120
                24   6135          18   1680            10   6888          24   2168
       End      17   8548          53    581            10   6738          26   1996
                17   8542          84    367            12   5353          28   1849
                18   8414   42     74    413    8       12   5519   38     28   1811   37

 Run 4 Start    22   6275   31     12   1738   32       10   6880   47     22   2341   48
                24   5941          12   1699            10   6821          23   2183
                25   5532          14   1437            10   6563          24   2117
       End      26   5309          15   1396            12   5357          28   1834
                26   5277          15   1370            12   5629          28   1845
                27   5081   25     16   1276   23       12   5572   38     28   1853   38

 P37 Both Stress Tests 32 Threads Each

 ARM/Intel MP-Int Stress Test V1.0 25-Jul-2017 10.53
 Compiled for 32 bit ARM v7a

       Data                             Same All
 Secs    KB  Threads  MB/sec  Sumcheck   Threads

   17   160       32   11626  00000000       Yes
   35   160       32   10456  00000000       Yes
   52   160       32   10743  00000000       Yes
   71   160       32   10198  00000000       Yes
   89   160       32   10534  00000000       Yes
  106   160       32   10752  00000000       Yes
  125   160       32   10168  FFFFFFFF       Yes
  142   160       32   11094  FFFFFFFF       Yes
  160   160       32   10389  FFFFFFFF       Yes
  178   160       32   10408  FFFFFFFF       Yes
  195   160       32   11203  FFFFFFFF       Yes
  214   160       32    9938  FFFFFFFF       Yes
  230   160       32   11622  5A5A5A5A       Yes
  249   160       32    9857  5A5A5A5A       Yes
  267   160       32   10381  5A5A5A5A       Yes
  285   160       32   10317  5A5A5A5A       Yes
  305   160       32    9808  5A5A5A5A       Yes

 End Time 25-Jul-2017 11.00
 Average 10537
 Started a little earlier

 ARM/Intel MP-FPU Stress Test V1.0 25-Jul-2017 10.54
 Compiled for 32 bit ARM v7a

       Data           Ops/          Numeric
 Secs    KB  Threads  Word  MFLOPS  Results

   15   128       32    32    2771    42157
   28   128       32    32    2492    42157
   38   128       32    32    2845    42157
   50   128       32    32    2758    42157
   61   128       32    32    2677    42157
   72   128       32    32    2926    42157
   84   128       32    32    2549    42157
   94   128       32    32    3017    42157
  104   128       32    32    2881    42157
  117   128       32    32    2474    42157
  127   128       32    32    2920    42157
  139   128       32    32    2712    42157
  150   128       32    32    2826    42157
  161   128       32    32    2816    42157
  173   128       32    32    2496    42157
  183   128       32    32    3067    42157
  194   128       32    32    2779    42157
  207   128       32    32    2410    42157
  217   128       32    32    3020    42157
  228   128       32    32    2743    42157
  240   128       32    32    2474    42157
  250   128       32    32    3147    42157
  263   128       32    32    2423    42157
  273   128       32    32    2927    42157
  284   128       32    32    2876    42157
  297   128       32    32    2422    42157
  305   128       32    32    3758    42157

 End Time 25-Jul-2017 11.00
 Average 2786
 Ended slightly later

Errors and False Errors

Running the stress tests did not reveal any real data comparison failures, although one did appear to occur before a flat battery led to a switch off. Also, there were a couple of inexplicable program crashes, where, of course, the recorded results are lost. However, there is an issue regarding false error reports.

All of my Android CPU benchmarks arrange for starting, stopping and displaying results via Java code, with those executing native machine code produced from compiled C. The original benchmarks only display results when all processing is finished and do not appear to demonstrate the peculiar behaviour of the stress tests.

When running these stress tests, rotating the device causes the initial starting display to be shown again. Then, after pressing the Run button, errors, as shown below, are indicated. Before this, running VMSTAT, via a Terminal Emulator app, indicates that the benchmark code had not stopped executing. Hence, it seems that two copies of the program were running at the same time, confusing reported results. The same effect is reproduced by pressing the Run button whilst the program is executing.
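The underlying problem is that a second run can be started while the first is still executing. One general way of preventing this is a guard that refuses a new start request until the current run has finished. The sketch below illustrates the idea only; it is not how the apps are written, and the SingleRun class is hypothetical. An Android app would need an equivalent in Java or native code, held somewhere that survives the display being recreated on rotation.

```python
import threading

class SingleRun:
    """Allow only one job to run at a time; reject overlapping requests."""

    def __init__(self):
        self._lock = threading.Lock()

    def start(self, job):
        """Run job unless one is already running; return True if it ran."""
        if not self._lock.acquire(blocking=False):
            return False  # a run is in progress, so ignore this request
        try:
            job()
            return True
        finally:
            self._lock.release()
```

With such a guard, pressing Run during a test would be ignored rather than starting a second copy that shares the same data and timers.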

AVOID RUNNING WHEN THE PHONE/TABLET IS HAND HELD

The following are examples of false errors, when it was known that tests had been restarted after a stoppage caused by rotating the device. For a clean restart, the offending program can normally be killed by tapping the “RECENTS” (square) button and swiping the app off the screen, then restarted via the main display. However, on one tablet (T21), the program had to be stopped via the Settings, App, Force Stop button.

 ARM/Intel MP-FPU Stress Test V1.0 26-Jun-2017 11.25

            Data           Ops/            Nmeric
 Seconds    Size  Threads  Word    MFLOPS Results

    32.2  128 KB        8    32      1410       0  Zero indicates errors found
    54.1  128 KB        8    32      1172       0  Time/pass much greater than 10 seconds
    83.4  128 KB        8    32      1195       0
   121.9  128 KB        8    32      1385       0
   135.4  128 KB        8    32      1507   49805  Unexpected result
   153.8  128 KB        8    32      1367       0
   189.7  128 KB        8    32      1159       0
   204.5  128 KB        8    32       693       0  Shorter time but worse performance
   323.5  128 KB        8    32       692       0
   350.8  128 KB        8    32  34222848   99999  Measured test time near zero in 27 seconds,
                                                   99999 reflects initial data
   410.1  128 KB        8    32      1005       0
   433.9  128 KB        8    32      1255   66014  Expected result

 ARM/Intel MP-Int Stress Test V1.0 26-Jun-2017 11.25

            Data                                 Same All
 Seconds    Size  Threads      MB/sec  Sumcheck   Threads

     8.7  160 KB        8        4568  00000000  Yes      Test seconds as expected around 10
    25.3  160 KB        8        6375  00000000  Yes      to 11 seconds
    32.4  160 KB        8        3451  00000000  Yes      Yes means all threads correct result
    39.5  160 KB        8        3492  00000000  Yes
    46.6  160 KB        8  1951607840  00000000  Yes      Impossible MB/sec suggests test did
    79.5  160 KB        8        4728  FFFFFFFF  Yes      not run
    98.0  160 KB        8        5205  FFFFFFFF  Yes
   114.6  160 KB        8        5760  FFFFFFFF  Yes      Should be six times 00000000
   134.4  160 KB        8        4797  5A5A5A5A  Yes      then six times FFFFFFFF
   158.9  160 KB        8        3537  5A5A5A5A  Yes      then six times 5A5A5A5A
   174.1  160 KB        8        4891  5A5A5A5A  Yes      etc.
   237.4  160 KB        8  1951607840  5A5A5A5A  No  1    Impossible MB/sec, 1 thread wrong
   374.6  160 KB        8   279600040  CCCCCCCC  No  8    Impossible MB/sec, 8 threads wrong
   397.3  160 KB        8        3204  CCCCCCCC  Yes
   415.7  160 KB        8        6304  0F0F0F0F  No  4    4 threads wrong, were they CCCCCCCC
   420.5  160 KB        8  1951607840  CCCCCCCC  Yes
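The symptoms annotated in these logs (pass times far from the calibrated 10 seconds, impossible speeds and failed checks) could be screened for automatically. The following is a minimal sketch under stated assumptions: the row format, the function name and both thresholds are hypothetical choices for illustration, not part of the apps.

```python
def screen_passes(rows, target_secs=10.0, max_rate=100000.0):
    """rows: (cumulative_secs, rate, ok) tuples; return flagged passes.

    A pass is flagged if it took more than twice the target time, if its
    rate is above max_rate, or if its data check failed.
    """
    flagged = []
    previous = 0.0
    for secs, rate, ok in rows:
        reasons = []
        if secs - previous > 2.0 * target_secs:
            reasons.append("slow pass")
        if rate > max_rate:
            reasons.append("impossible rate")
        if not ok:
            reasons.append("check failed")
        if reasons:
            flagged.append((secs, reasons))
        previous = secs
    return flagged

# First six passes of the integer test log above
log = [(8.7, 4568, True), (25.3, 6375, True), (32.4, 3451, True),
       (39.5, 3492, True), (46.6, 1951607840, True), (79.5, 4728, True)]
suspect = screen_passes(log)
```

Applied to these six passes, the screen flags the pass ending at 46.6 seconds for its impossible MB/sec and the pass ending at 79.5 seconds for taking about 33 seconds.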


 T7 Nexus 7 quad core CPU 1.3 GHz, 1.2 GHz > 1 core

 Device Asus Nexus 7
 RAM 1 GB DDR3L-1333 Bandwidth 5.3 GB/sec
 Screen pixels w x h 1280 x 736
 MHz Twelve-core Nvidia GeForce ULP graphics 416 MHz
 Android Build Version 4.1.2

 Processor : ARMv7 Processor rev 9 (v7l)
 processor : 0  BogoMIPS : 1993.93
 processor : 1  BogoMIPS : 1993.93
 processor : 2  BogoMIPS : 1993.93
 processor : 3  BogoMIPS : 1993.93
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x2
 CPU part : 0xc09 - Cortex-A9
 CPU revision : 9
 Hardware : grouper - nVidia Tegra 3 T30L
 Revision : 0000

 Linux version 3.1.10
 Runs at 1.2 GHz


 T11 Voyo A15, Samsung EXYNOS 5250 Dual core 2.0 GHz Cortex-A15,
 Mali-T604 GPU, 2 GB DDR3-1600 RAM, dual channel, 12.8 GB/s

 Device Urbetter VOYO A15
 Screen pixels w x h 1920 x 1032
 Android Build Version 4.2.2 - Jelly Bean

 Processor : ARMv7 Processor rev 4 (v7l)
 processor : 0  BogoMIPS : 992.87
 processor : 1  BogoMIPS : 997.78
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x0
 CPU part : 0xc0f
 CPU revision : 4
 Hardware : SMDK5250

 Linux version 3.4.35Ut
 Runs at 1.7 GHz


 T15 HTC Nexus 9, dual core Denver CPU 2400 MHz

 Screen pixels w x h 2048 x 1440
 Android Build Version 5.0.1

 Processor : NVIDIA Denver 1.0 rev 0 (aarch64)
 processor : 0 & 1
 Features : fp asimd aes pmull sha1 sha2 crc32
 CPU implementer : 0x4e
 CPU architecture: AArch64
 CPU variant : 0x0
 CPU part : 0x000
 CPU revision : 0
 Hardware : Flounder
 Revision : 0000
 MTS version : 33410787

 Linux version 3.10.40


 T21 Kindle Fire HDX 7, 2.2 GHz Quad Core Qualcomm Snapdragon 800 (Krait 400)
 2 x 32 Bit LPDDR3-1866 Memory, 14.9 GB/s, GPU Qualcomm Adreno 330, 578 MHz

 Device Amazon KFTHWI
 Screen pixels w x h 1200 x 1803
 Android Build Version 4.4.3

 Processor : ARMv7 Processor rev 0 (v7l)
 processor : 0, 1, 2, 3
 BogoMIPS : 38.40
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
 CPU implementer : 0x51
 CPU architecture: 7
 CPU variant : 0x2
 CPU part : 0x06f
 CPU revision : 0
 Hardware : Qualcomm MSM8974
 Revision : 0000

 Linux version 3.4.0-perf (gcc version 4.7)


 T22 Lenovo Tab 2 A8-50, 1.3 GHz quad core 64 bit MediaTek ARM Cortex-A53
 1 GB LPDDR3, GPU Mali T720 MP2

 Device LENOVO Lenovo TAB 2 A8-50F
 Screen pixels w x h 800 x 1216
 Android Build Version 5.0.2

 Processor : AArch64 Processor rev 3 (aarch64)
 processor : 0, 1, 2
 BogoMIPS : 26.0
 Features : fp asimd aes pmull sha1 sha2 crc32
 CPU implementer : 0x41
 CPU architecture: AArch64
 CPU variant : 0x0
 CPU part : 0xd03
 CPU revision : 3
 Hardware : MT8161

 Linux version 3.10.65


 P33 Sony Xperia Z3+ E6533, Quad-core 1.5 GHz & Quad-core 2 GHz
 Qualcomm Snapdragon 810 64-bit CPU

 Screen pixels w x h 1080 x 1776
 Android Build Version 5.0.2

 Processor : AArch64 Processor rev 1 (aarch64)
 processor : 0 to 7
 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
 CPU implementer : 0x41
 CPU architecture: 8
 CPU variant : 0x1
 CPU part : 0xd07
 CPU revision : 1
 Hardware : Qualcomm Technologies, Inc MSM8994

 Linux version 3.10.49


 P36 LGE LG-H811 Qualcomm Snapdragon 808, 1.8 GHz 64-bit Hexa-Core

 Device LGE LG-H811
 Screen pixels w x h 1440 x 2392
 Android Build Version 5.1

 Processor : AArch64 Processor rev 2 (aarch64)
 processor : 0, 1, 2, 3, 4, 5
 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
 CPU implementer : 0x41
 CPU architecture: 8
 CPU variant : 0x1
 CPU part : 0xd07
 CPU revision : 2
 Hardware : Qualcomm Technologies, Inc MSM8992
 Revision : 000b

 Linux version 3.10.49-


 P37 Lenovo Moto G4 Snapdragon 617, Octa-core Cortex-A53
 Cores 4x1.5 GHz 4x1.2 GHz, 2 GB RAM 933 MHz, GPU Adreno 405 550 MHz

 Device Motorola Moto G (4)
 Screen pixels w x h 1080 x 1776
 Android Build Version 6.0.1

 CPU part : 0xd03
 CPU revision : 4
 Hardware : Qualcomm Technologies, Inc MSM8952
 Revision : 82a0
 Processor : ARMv7 Processor rev 4 (v7l)
 Device : athene_13mp
 Radio : EMEA
 MSM Hardware : MSM8952
 CPU variant : 0x0
 CPU part : 0xd03
 CPU revision : 4
 processor : 5, 6, 7
 model name : ARMv7 Processor rev 4 (v7l)
 BogoMIPS : 38.00
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 evtstrm
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x0
 CPU part : 0xd03
 CPU revision : 4

 Linux version 3.10.84-g061c37c

 P37 Later
 Android Build Version 7.0
 Linux version 3.10.84-g478d03a


 P38 Samsung Galaxy Note 4 Snapdragon 805, 4x2.7 GHz Cortex A57 + 4x1.3 GHz Cortex A53

 Device Samsung SM-N910C
 Screen pixels w x h 1440 x 2560
 Android Build Version 6.0.1

 processor : 4 to 7
 model name : ARMv7 Processor rev 0 (v7l)
 BogoMIPS : 76.00
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x1
 CPU part : 0xd07
 CPU revision : 0
 Hardware : Samsung EXYNOS5433
 Revision : 0015
 Serial : bfc12ce406b30041

 Linux version 3.10.9-9186796


 P39 Galaxy Tab S2 SM-T710 EXYNOS 5433, 4x1.9 GHz Cortex A57 + 4x1.3 GHz Cortex A53

 Device Samsung SM-T710
 Screen pixels w x h 1536 x 2048
 Android Build Version 6.0.1

 processor : 4 to 7
 model name : ARMv7 Processor rev 0 (v7l)
 BogoMIPS : 76.00
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x1
 CPU part : 0xd07
 CPU revision : 0
 Hardware : Samsung EXYNOS5433
 Revision : 0008
 Serial : 5f827412e6280033

 Linux version 3.10.9-8374498


 P40 Moto X 1st XT1049, dual core 1.7 GHz Qualcomm Snapdragon S4 Pro MSM8960

 Device Motorola XT1049
 Screen pixels w x h 720 x 1184
 Android Build Version 5.1

 Processor : ARMv7 Processor rev 0 (v7l)
 processor : 0, 1
 BogoMIPS : 13.53
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
 CPU implementer : 0x51
 CPU architecture: 7
 CPU variant : 0x2
 CPU part : 0x04d
 CPU revision : 0
 Hardware : msm8960dt
 Revision : 8300
 Serial : 0001000c044ef01d
 Device : ghost
 Radio : 4

 Linux version 3.4.42-gd5fa9d8


 P41 Moto G Play XT1607, quad core 1.2 GHz Cortex A53 MSM8916 Snapdragon 410

 Device Motorola Moto G Play
 Screen pixels w x h 720 x 1184
 Android Build Version 6.0.1

 CPU revision : 0
 Hardware : Qualcomm Technologies, Inc MSM8916
 Revision : 81b0
 Serial : e5c8122300000000
 Device : harpia
 Radio : US
 MSM Hardware : MSM8916
 processor : 0 to 3
 model name : ARMv7 Processor rev 0 (v7l)
 BogoMIPS : 38.00
 Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 evtstrm
 CPU implementer : 0x41
 CPU architecture: 7
 CPU variant : 0x0
 CPU part : 0xd03
 CPU revision : 0

 Linux version 3.10.49-g41f86a8


 A1 Asus MemoPad 7 ME176CEX, 1.86 GHz Atom Intel Atom Z3745

 Device Asus K013
 Screen pixels w x h 800 x 1216
 Android Build Version 4.4.2

 Processor : ARMv7 processor rev 1 (v7l)
 BogoMIPS : 1500.0
 Features : neon vfp swp half thumb fastmult edsp vfpv3
 CPU implementer : 0x69
 CPU architecture: 7
 CPU variant : 0x1
 CPU part : 0x001
 CPU revision : 1
 Hardware : placeholder
 Revision : 0001

 Linux version 3.10.20
 Mainly runs at 1.86 GHz Turbo Boost


 A4 Intel(R) Atom x5-Z8300 1.84 GHz (turbo)

 Device Intel cht_cr_rvp
 Screen pixels w x h 800 x 1216
 Android Build Version 5.1.1

 : 6
 initial apicid : 6
 fpu : yes
 fpu_exception : yes
 cpuid level : 11
 wp : yes
 flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
         pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
         rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
         nonstop_tsc aperfmperf nonstop_tsc_s3 pni pclmulqdq dtes64 monitor
         ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt
         tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch ida arat epb
         dtherm tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms
 bogomips : 2879.90
 clflush size : 64
 cache_alignment : 64
 address sizes : 36 bits physical, 48 bits virtual
 power management:
 processor : 3
 vendor_id : GenuineIntel
 cpu family : 6
 model : 76
 model name : Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz
 stepping : 3
 microcode : 0x358
 cpu MHz : 1840.000
 cache size : 1024 KB
 physical id : 0
 siblings : 4
 core id : 3
 cpu cores : 4
 apicid

 Linux version 3.14.37


 A5 Same tablet as W2 - Intel Atom Z8300 1.44 GHz, Turbo 1.84

 Device Teclast X98 Plus(A5C8)
 Screen pixels w x h 2048 x 1440
 Android Build Version 5.1

 Processor : ARMv7 processor rev 1 (v7l)
 BogoMIPS : 1500.0
 Features : neon vfp swp half thumb fastmult edsp vfpv3 vfpv4 idiva idivt
 CPU implementer : 0x69
 CPU architecture: 7
 CPU variant : 0x1
 CPU part : 0x001
 CPU revision : 1
 Hardware : placeholder
 Revision : 0001

 Linux version 3.14.37-x86_64-L1-R429


 R1 Same as tablet W1 running via Remix for PC with Android 6
 Intel Z8300 quad core 1.44 GHz Turbo 1.8

 Device PIPO W1S
 Screen pixels w x h 396 x 674
 Android Build Version 6.0.1 - 64 bit

 flags etc. As A4 above
 processor : 3
 vendor_id : GenuineIntel
 cpu family : 6
 model : 76
 model name : Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz
 stepping : 3
 microcode : 0x34f
 cpu MHz : 1599.975
 cache size : 1024 KB
 physical id : 0
 siblings : 4
 core id : 3
 cpu cores : 4
 apicid : 6
 initial apicid

 Linux version 4.4.14-android-x86_64


 R2 Same as PC - Core i7 4820K quad core + HT at 3900 MHz Turbo

 Screen pixels w x h 396 x 674
 Android Build Version 6.0.1 - 64 bit

 flags: numerous
 bogomips : 7421.92
 clflush size : 64
 cache_alignment : 64
 address sizes : 46 bits physical, 48 bits virtual
 processor : 7
 vendor_id : GenuineIntel
 cpu family : 6
 model : 62
 model name : Intel(R) Core(TM) i7-4820K CPU @ 3.70GHz
 stepping : 4
 microcode : 0x416
 cpu MHz : 2471.484
 cache size : 10240 KB
 physical id : 0
 siblings : 8
 core id : 3
 cpu cores : 4
 apicid : 7
 initial apicid : 7
 fpu : yes
 fpu_exception : yes
 cpuid level : 13
 wp : yes

 Linux version 4.4.14-android-x86_64


 W1 Pipo W1S Tablet. Intel Z8300 quad core 1.44 GHz Turbo 1.84
 Same as R1 above
 Windows 10, 4 GB DDR 3 1600

 CPU GenuineIntel, Features Code BFEBFBFF, Model Code 000406C3
 Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz Measured 1440 MHz
 Has MMX, Has SSE, Has SSE2, Has SSE3, No 3DNow,
 AMD64 processor architecture, 4 CPUs
 Windows NT Version 6.2, build 9200,
 Memory 4020 MB, Free 2520 MB


 W2 Same tablet as A5 Teclast X98 Plus, Intel Atom Z8300 1.44 GHz, Turbo 1.84

 CPU GenuineIntel, Features Code BFEBFBFF, Model Code 000406C3
 Intel(R) Atom(TM) x5-Z8300 CPU @ 1.44GHz Measured 1440 MHz
 Has MMX, Has SSE, Has SSE2, Has SSE3, No 3DNow,
 Intel processor architecture, 4 CPUs
 Windows NT Version 6.2, build 9200,
 Memory 4021 MB, Free 2540 MB
 User Virtual Space 4096 MB, Free 4083 MB

 64 Bit
 AMD64 processor architecture, 4 CPUs
 User Virtual Space 134217728 MB, Free 134217716 MB


 PC Core i7 4820K quad core + HT at 3900 MHz Turbo
 Same as R2 above

 CPU GenuineIntel, Features Code BFEBFBFF, Model Code 000306E4
 Intel(R) Core(TM) i7-4820K CPU @ 3.70GHz Measured 3711 MHz
 Has MMX, Has SSE, Has SSE2, Has SSE3, No 3DNow,
 AMD64 processor architecture, 8 CPUs
 Windows NT Version 6.2, build 9200,
 Memory 32705 MB, Free 30584 MB
 User Virtual Space 134217728 MB, Free 134217715 MB


The Official Internet Home for my Benchmarks is via the link
Roy Longbottom's PC Benchmark Collection

Java Whetstone.apk
First standard benchmark

LinpackJava.apk
All Java version

JavaOpenGL1.apk
3D Graphics Frames Per second

JavaDraw.apk
Draw Frames Per Second

CP_MHz2.apk
Measure CPU MHz

BatteryTest.apk
Battery Drain Test using graphics

DriveSpeed.apk
SD card/internal drive tests

DriveSpd2.apk
Drive tests, user defined path