RC2CryptoServiceProvider and parts of
Those are all the symmetric algorithms that Mono provides (using managed code). For MonoTouch they will now all be native, except for the non-AES parts of Rijndael, i.e. when its block size is not the default 128 bits.
This time hardware acceleration is easier to detect, as we can compare AES Electronic Codebook (ECB) mode with AES Cipher Block Chaining (CBC) mode – the latter being documented as accelerated. In software the most basic mode is ECB and the other modes are built on top of it, so ECB is generally a bit faster than CBC. If CBC turns out to be faster, then it was either further optimized (small margin) or hardware accelerated (large margin).
The graphic above, made from the iPad 3 results, shows the performance (MB/s) on the vertical axis versus the buffer size (bytes) on the horizontal axis. Unlike with the digest results, it’s hard to suspect anything other than hardware acceleration for such a drastic difference.
The next graphic compares the performance of the native/hardware implementations (of CommonCrypto) with the managed implementations (of Mono). Values are the average, in MB/s, across all my devices. You’ll see that using older algorithms, like DES or TripleDES, is not the way to better performance, managed or native.
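To make the ECB/CBC relationship concrete, here is a minimal sketch of how CBC is layered on top of the same per-block primitive that ECB uses. It is written in Python purely for illustration. `toy_block_encrypt` is a hypothetical, insecure stand-in for a real block cipher such as AES, and the sketch assumes the input length is a multiple of the block size (no padding).

```python
BLOCK = 16  # AES block size in bytes


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """Hypothetical stand-in for a real block cipher. NOT secure."""
    mixed = bytes(b ^ k for b, k in zip(block, key))
    return mixed[1:] + mixed[:1]  # rotate left by one byte


def ecb_encrypt(data: bytes, key: bytes) -> bytes:
    # ECB: every block goes through the primitive independently.
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        out += toy_block_encrypt(data[i:i + BLOCK], key)
    return bytes(out)


def cbc_encrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    # CBC: XOR each plaintext block with the previous ciphertext block
    # (the IV for the first one), then apply the same primitive as ECB.
    out = bytearray()
    prev = iv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        chained = bytes(b ^ p for b, p in zip(block, prev))
        prev = toy_block_encrypt(chained, key)
        out += prev
    return bytes(out)
```

The extra XOR and the dependency on the previous block are the reason a pure software CBC is normally no faster than ECB, which is what makes a much faster CBC result stand out.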
Wait! Where’s the 35.4x performance increase?
Uh-oh, the AES results look more like 10x than 35x, right? That’s (a bit) because I used the average values of my devices and (mostly) because I did not use the magic tricks. IMO that gives a more honest picture – at least as honest as a benchmark can be.
But, as promised, here are the exact incantations and sacrifices required to get up to a 35x increase. You’ll need three things:
1. A 1st generation iPad. It simply has the best speedup – well above the iPad 3 at 18.8x and the iPod 4 at 25.2x (see graphics below). That’s a large and surprising variation. Yet all results are better than the non-magical 10x average;
2. Your device needs to run on AC power. That will give more juice to the crypto processor – roughly doubling its performance. On battery, power management does not allow this level of performance. That’s part of the sacrifice required to get the maximum performance;
3. Use the optimal buffer size (again), but this time we’re talking about huge buffers. In case you did not notice them on the ECB/CBC graphic above, you will need 512KB buffers to get the maximum throughput (86.9 MB/s on iPad1). If you drop the buffers to 256KB you lose more than 3 MB/s. Drop them to 128KB and you’re down to 68.1 MB/s. Drop to 16KB (the optimal size for battery operation) and you’ll only get 44.1 MB/s – a long way from 86.9 MB/s but still better than 29.4 MB/s (same buffer on battery). Keep in mind the I/O required to keep the buffers full – it might be easier (and ultimately faster) to use smaller ones…
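If you want to find the sweet spot for your own device and power source, a buffer-size sweep like the one behind these numbers can be sketched as follows. This is not the harness used for the graphics, only an illustration in Python. The names `measure_throughput`, `sweep` and `best_buffer_size` are made up for this example, and `transform` is whatever callable wraps the cipher you want to measure.

```python
import time


def measure_throughput(transform, buffer_size, total_bytes=1 << 20):
    """Return MB/s for pushing total_bytes through transform in chunks."""
    buffer = bytes(buffer_size)
    iterations = max(1, total_bytes // buffer_size)
    start = time.perf_counter()
    for _ in range(iterations):
        transform(buffer)
    elapsed = time.perf_counter() - start
    megabytes = (iterations * buffer_size) / (1024 * 1024)
    return megabytes / max(elapsed, 1e-9)  # guard against a zero timer delta


def sweep(transform, sizes):
    """Measure every buffer size and return {size_in_bytes: MB/s}."""
    return {size: measure_throughput(transform, size) for size in sizes}


def best_buffer_size(results):
    """Pick the buffer size with the highest measured throughput."""
    return max(results, key=results.get)
```

Fed with the AC-powered iPad1 figures quoted above, `best_buffer_size({16 * 1024: 44.1, 128 * 1024: 68.1, 512 * 1024: 86.9})` picks the 512KB buffer. Run the sweep once on battery and once on AC power, since the optimum differs between the two.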
Running iOS applications on AC power is uncommon for most people. However, it’s not uncommon when developing and testing applications, including benchmarking them. Take care when testing yours! It took me a while, and a bit of juggling between computers and devices, to figure out why some numbers were so different from others.
Here are the new, battery-operated performance graphics and the original, AC-powered ones (from the first blog post).