General
Performance investigations of Raspberry Pi USB drives formatted with F2fs, compared with Ext4, were prompted by reports in the XBMC Community Forum that copying files to the former was significantly faster than to the same drive formatted as Ext4. The particular page is no longer directly available but might still be found by searching for some of the following content, which shows the claims.
The initial investigation is reported in Raspberry Pi Benchmarks.htm. This involved a range of file copying exercises on the Raspberry Pi. Results, using a high speed SanDisk Extreme USB 3.0 drive, are repeated below and indicate that F2fs was only faster at larger file sizes.
The most unusual thing in the above Silicon results is the huge difference between copying time and CPU time.
Differences between Ext4 and F2fs file sizes are considered below under Copying Files and Data Volumes. The exercise also included running my DriveSpeed benchmark, where the standard version showed no real benefit from F2fs formatting, except in random access writing speed. Modified versions of the benchmark were produced to run for extended periods with RAM based file caching enabled. These showed some benefit from F2fs in handling small files, and random writing speeds up to 8 times faster.
XBMC is a Media Center that can be installed on different Operating Systems. This was installed under Windows to produce a Thumbnail directory (4T above) of the type used for the claimed high speed operation. Copying and benchmark speed results in this report are with files and directories generated and used by XBMC installed on the Raspberry Pi.
Setup
XBMC for the Raspberry Pi is part of OpenElec (Open Embedded Linux Entertainment Center), a small Linux distribution. Instructions on how to install it for use with USB based files can be found here. Besides OpenElec 3.2.4, which was found not to support the F2fs format, two other varieties were installed on different SD cards (see [1] below). The SanDisk Extreme drive, mentioned above, was formatted via Linux Ubuntu 13.10, initially using GParted Partition Editor, with separate Ext4, F2fs and FAT partitions. For use under OpenElec/XBMC, the F2fs partition needs a label for automatic mounting, the command used being shown below [2]. There is no Terminal available in OpenElec but commands can be executed from an SSH client, in my case via PuTTY installed under Windows 7. Copies of the benchmark programs used were saved in the USB stick’s partitions and run from there.
To allow XBMC to use a USB drive, cmdline.txt has to be produced and copied to the system SD drive (see [3] below). There can be complications in specifying the correct drive/partition path, in my case further complicated as I use a multi-port USB hub. With multiple partitions, the F2fs partition address (/dev/sd??) can be deduced using the df command, but experimenting under the normal Raspberry Pi Operating System might be necessary. When first used, XBMC generates a series of directories (some hidden) on the USB drive. Items such as thumbnail files are added later. These directories can be copied to other drives or partitions but it might be necessary to change ownership. The F2FS mount command needed for Raspbian is shown at [4] below. With OpenElec, the partition is mounted via the file at [3] (when F2FS is available [6]). The different versions of Linux used are also shown [5].
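As an alternative to reading df output by eye, the partition address can be looked up from the kernel's mount table. The following is only a minimal sketch, not part of the original procedure: the function names and the sample text (device names and mount points) are made up for illustration, and it assumes the partition is already mounted.

```python
def parse_mounts(text):
    """Return (device, mount_point, fs_type) tuples from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def partitions_of_type(text, fs_type):
    """List (device, mount_point) pairs whose file system type matches fs_type."""
    return [(dev, mnt) for dev, mnt, typ in parse_mounts(text) if typ == fs_type]


# Hypothetical sample in the /proc/mounts layout; real paths will differ.
SAMPLE_MOUNTS = """\
/dev/mmcblk0p1 /flash vfat ro 0 0
/dev/sda1 /media/ext4part ext4 rw 0 0
/dev/sda2 /media/f2fspart f2fs rw 0 0
"""

for device, mount_point in partitions_of_type(SAMPLE_MOUNTS, "f2fs"):
    print(device, mount_point)
```

On a running Linux system the same functions could be given the contents of /proc/mounts instead of the sample text.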
OpenElec Overheads
OpenElec has an RSS Feeds option, which scrolls news headlines across the bottom of the screen. This is reported to degrade performance. Before considering other test results, it is appropriate to measure these overheads and to see if there are others imposed by running OpenElec. The measurements were made using Raspbian and the three OpenElec versions, the latter with and without RSS feeds. The Milhouse varieties produced similar results, the average being shown.
Performance Monitor - The top command was used to measure utilisation after settling down immediately following booting (top -d 5 -n 5, five second intervals, five samples). Memory utilisation is shown below [7], initial demands for OpenElec being far greater than for Raspbian, particularly with the Milhouse versions, but RSS scrolling makes little difference. My stressInt program, see here, can be used to reduce cache space, as shown below (using ./stressInt KB 200000), but attempting larger demands caused a crash, which was not the case with Raspbian at 450 MB (with swapping). CPU loading measurements at [8] show that OpenElec leads to a much higher utilisation than Raspbian, particularly when running RSS News Feeds. This must mainly be at a lower priority, as demonstrated by some of the following benchmark scores.
Raspberry Pi Benchmarks include Dhrystone and Linpack test programs that measure integer and floating point arithmetic speeds respectively, see [9]. Here, RSS scrolling produces some degradation (lower score).
A similar effect is seen using BusSpeed, which measures data transfer speeds from caches and RAM, a small sample being provided below [10].
Significant RSS performance differences are identified by my multithreading MP-MFLOPS benchmark (see details here). This executes 2 and 32 floating point operations per data word from caches and RAM, using 1, 2, 4, and 8 threads, a sample being shown below [11].
Running the top monitor, whilst running MP-MFLOPS, shows that XBMC/MP-MFLOPS CPU utilisation is up to 45%/52% with RSS Feeds running, but typically 18%/81% without. The results suggest that each thread is given the same priority as RSS, leading to the worst MP-MFLOPS performance when using one thread.
DriveSpeed Benchmark
On running, results are displayed and saved in the driveSpeed.txt log file, as shown below. This shows the file path, helping to confirm that the correct partition has been used. Tests carried out are: [12] Test 1 - Write and read three 8 and 16 MB files; Results given in MBytes/second
The execution file and source code are in Raspberry_Pi_Benchmarks.zip. The source code was modified for other variations.
DriveSpeed Results
The following shows DriveSpeed results, via Raspbian and different releases of OpenElec, providing average speeds or running times over all tests and three runs of the benchmark. Relative F2fs/Ext4 speeds are also provided. It was found that mounting F2fs partitions was not possible in the first OpenElec tests. Further examination of the posts reporting high speed F2fs copying appeared to suggest that Milhouse releases were used, but earlier versions than those currently available. The first and last ones identified at this time were used here.
It is difficult to be conclusive due to wide variations in speed, but F2fs appears to be slower on writing with the large file and random tests. Except for writing large files, performance via OpenElec is slower than running via Raspbian. Milhouse versions of OpenElec at least support F2fs but provide no real advantage over using Ext4 format. Caching requires further consideration.
Cached Small Files
The above DriveSpeed results suggested that caching could benefit F2fs performance. A modified version of the benchmark was produced to measure writing and reading times of 1000 files of increasing sizes, with caching enabled. The first example below [16] shows writing and reading times separately, where reading speeds are particularly fast, handling data cached in RAM. This one runs out of cache space at 256 KB x 1000 files (half RAM size), with reading time increasing by more than six times compared with the earlier measurement at 128 KB. Other results [17-20] are for writing plus reading time, to compare F2fs and Ext4 with something more resembling copying tests, which are also subject to caching.
Generally, F2fs is up to 40% faster at file sizes less than 64 KB, but is then hit harder up to 256 KB, probably due to caching differences. At the larger file sizes, Ext4 tests are slightly faster. Performance of the three OpenElec based systems is quite similar (except for no F2fs support with 3.2.4). All are slower than running via Raspbian [17], in some cases at less than half speed. This is probably caused by XBMC overheads. With scrolling RSS News Feeds enabled [21], Raspbian is typically three times faster.
Anyway, it was here that F2fs demonstrated the best performance relative to Ext4.
As indicated later, there are some issues with a program attempting to allocate more than 200 MB. This and excessive CPU overheads might be associated with OpenElec crashing.
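The shape of the cached small files test described in this section can be illustrated with a short sketch. This is not the modified DriveSpeed source (which is C and is not reproduced here). The function name, file names, file counts and sizes are assumptions chosen so the demo finishes quickly, where the test in the text uses 1000 files per size. No sync or cache flush is issued, so reads are likely to come from the RAM cache, as in the text.

```python
import os
import tempfile
import time


def time_write_read(directory, n_files, size_bytes):
    """Write n_files of size_bytes each, then read them back.

    Returns (write_seconds, read_seconds, bytes_read). File caching is
    left enabled, so read times mostly reflect RAM cache speed until the
    total data volume exceeds the available cache space.
    """
    data = b"\xa5" * size_bytes
    start = time.perf_counter()
    for i in range(n_files):
        with open(os.path.join(directory, f"f{i:04d}.dat"), "wb") as f:
            f.write(data)
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    bytes_read = 0
    for i in range(n_files):
        with open(os.path.join(directory, f"f{i:04d}.dat"), "rb") as f:
            bytes_read += len(f.read())
    read_s = time.perf_counter() - start
    return write_s, read_s, bytes_read


# Small demo: 20 files per size in a temporary directory (illustrative only).
for size_kb in (2, 4, 8, 16):
    with tempfile.TemporaryDirectory() as tmp:
        w, r, n = time_write_read(tmp, 20, size_kb * 1024)
        print(f"{size_kb:3d} KB x 20 files: write {w:.4f} s, read {r:.4f} s")
```

To compare file systems, the directory argument would point at a path on the Ext4 or F2fs partition rather than a temporary directory.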
Copying Files
The command used [22] and example output [23] are shown below. For these tests, a Thumbnails directory was produced on the Raspberry Pi via OpenElec/XBMC. Under Windows the size is indicated as 75 MB occupying 82 MB, with 3,471 files of average size around 22 KB. Using the du command, drive space reported by Raspbian, for Ext4 files, is only slightly higher than that reported by Windows and, as with the earlier data above, average F2fs file sizes are 4 KB greater than those in the Ext4 format. OpenElec du reported file space approximately twice that indicated by Raspbian (IMPOSSIBLE!), with average F2fs files being 8 KB greater than Ext4. This is considered at Data Volumes below.
On measuring speed of copying files, particularly of the volume used here, the system should be powered down and rebooted between tests, to clear the RAM based cache. As shown below, deleting copied files and repeating the copy process can be much faster. The difference is more significant under Raspbian, indicating that OpenElec has caching issues. These are considered later.
Copying, after booting, under Raspbian [24] is a little faster than via OpenElec [25-27] with Ext4 format and more so with F2fs files, also using less CPU time in both cases. OpenElec performance is similar across the three versions, where Ext4 is a little faster than F2fs but somewhat better with repeated tests, due to caching effects. Performance is again degraded with RSS Feeds enabled [28]. The last results [29] shown are for an old slow drive, where F2fs formatting did not show any significant performance improvement over Ext4.
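The gap between data size and space occupied (such as the 75 MB occupying 82 MB reported by Windows above) can be approximated by rounding each file up to a whole number of allocation blocks. The sketch below assumes a 4 KB block and ignores file system metadata and any inline storage of very small files, so it is only a rough model of what du or Windows reports. The function names and the demo file sizes are hypothetical.

```python
import os
import tempfile


def allocated_size(size_bytes, block=4096):
    """Round a file's data size up to a whole number of allocation blocks."""
    return -(-size_bytes // block) * block


def directory_totals(path, block=4096):
    """Return (file_count, data_bytes, estimated_bytes_on_disk) for a tree."""
    count = data = on_disk = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            count += 1
            data += size
            on_disk += allocated_size(size, block)
    return count, data, on_disk


# Demo with made-up file sizes in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for i, size in enumerate((1, 4096, 5000, 22 * 1024)):
        with open(os.path.join(tmp, f"thumb{i}.jpg"), "wb") as f:
            f.write(b"\0" * size)
    count, data, on_disk = directory_totals(tmp)
    print(f"{count} files, {data} data bytes, about {on_disk} bytes on disk")
```

With randomly distributed file sizes, the expected overhead under this model is around half a block per file, which is why small files show the largest relative difference.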
Data Volumes
Linux - Performance monitors were used in an attempt to confirm drive and memory data volumes, again using the SanDisk Extreme drive. For Linux (Ubuntu [31] and Raspbian [32]), the vmstat command [30] was used, taking 1 second samples. Memory and cache sizes used were calculated from start and end free memory and cache occupancy data. Total KBytes read and written are sums of the 1 second samples of bi (KBytes in per second) and bo (KBytes out per second). User and System CPU time, idle time and waiting for I/O time were calculated from the %us, %sy, %id and %wa utilisation entries. The Linux PC CPU is much faster than that in the RPi, as confirmed by the CPU seconds used. Memory and drive I/O data sizes were effectively the same on the two systems. The results further confirm that F2fs file sizes are larger than with Ext4 format (4 KB per file was noted earlier), but CPU time used is not much higher.
OpenElec - Vmstat is not available using OpenElec (at least in the versions I installed). For recording memory occupancy, the top command was used, shown below [33], along with a sample of output. Then, iostat was executed [34] to measure drive data volumes and CPU utilisation (-m for MB/second, -z to omit output when there is no activity). Each of these, and the copy command, were run using three separate PuTTY SSH client terminals via Windows. Of particular note, memory and drive data volumes [35], recorded under OpenElec, are effectively the same as with Raspbian and Ubuntu. This implies that those measured by the du command [25-29] reflect memory demands and should not be used to calculate drive MB/second speeds. Recorded CPU utilisation for OpenElec Ext4 and F2fs copying is not much different between the two but quite a bit higher than using Raspbian. Note that the start and end cache occupancies are different between Raspbian and OpenElec.
However, attempts were made to reduce initial demands by running my stressInt reliability test to clear some space (command at least ./stressInt KB 200000) - see Raspberry Pi Stress Tests.htm. Initial demands and the impact of running stressInt are shown below [36]. The program managed to reduce demands under Raspbian to the extent that pages could be swapped out to the SD drive (see [32] above). Running under OpenElec, stressInt could not allocate more than 200 MBytes and initial cache occupancy could not be reduced below 100 MB. Perhaps there are some tuning options that allow this and enable swapping.
In the initial Raspbian copying exercise above, test 5 data, at 472 MB, clearly could not be cached and cache occupancy would not reflect data volume. It was also found that vmstat results for test 1, with 22945 files, did not reflect the memory expected to be used. Two further directories were created with fewer of the larger and smaller files. Results [37] below show that measured memory occupancy still reflects twice the size of data read. It seems that the latter equates to size on disk, rather than data size (the small files data size/size on disk under Windows was 34/46 MB), but much smaller files would be needed to show significant differences. This depends on the drive's allocation unit (sector or cluster) size, most frequently 4 KB.
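The vmstat calculations described under Data Volumes (summing bi and bo, differencing free memory and cache, and converting utilisation percentages to seconds) might be sketched as follows. This is an illustration, not the procedure actually used for the results: the function name, the dictionary keys and the sample figures are all made up, and CPU seconds are taken as percentage times interval, which assumes a single core system such as the Raspberry Pi used here.

```python
def summarise_vmstat(text, interval_s=1.0, skip_first=False):
    """Summarise vmstat output captured at a fixed sampling interval.

    bi/bo are treated as KBytes per second, as in the text. The first
    vmstat report gives averages since boot, so skip_first=True drops it
    from the summed columns (an optional refinement, not from the source).
    """
    names, rows = None, []
    for line in text.splitlines():
        fields = line.split()
        if "bi" in fields and "bo" in fields:
            names = fields  # column header line (may repeat in long runs)
        elif names and len(fields) == len(names) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(names, (int(f) for f in fields))))
    if not rows:
        raise ValueError("no vmstat samples found")

    rate_rows = rows[1:] if skip_first else rows

    def total(column):
        return sum(r[column] for r in rate_rows)

    return {
        "kb_read": total("bi") * interval_s,
        "kb_written": total("bo") * interval_s,
        "free_kb_used": rows[0]["free"] - rows[-1]["free"],
        "cache_kb_added": rows[-1]["cache"] - rows[0]["cache"],
        "user_s": total("us") * interval_s / 100,
        "system_s": total("sy") * interval_s / 100,
        "idle_s": total("id") * interval_s / 100,
        "wait_s": total("wa") * interval_s / 100,
    }


# Made-up sample in the vmstat layout; values are not measured results.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 300000  10000 100000    0    0  1000     0  500  600 10 20 60 10
 1  1      0 250000  10000 150000    0    0  2000  1500  500  600 20 30 20 30
 0  1      0 200000  10000 200000    0    0     0  2500  500  600  5 15 50 30
"""

for key, value in summarise_vmstat(SAMPLE).items():
    print(f"{key:15s} {value}")
```

Whether the first sample was excluded in the original calculations is not stated, so skip_first is left off by default.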