Apart from everything else here, I'd point out that anyone taking TabletMark seriously really ought to read the BAPCo whitepaper about it. What I find particularly problematic are the compiler options it uses.
These are, and I quote:
Android (compiler: default, included with Android NDK r10c)
Compiler options:
-O3 (enables many general optimizations)
-ftree-vectorize (implied by -O3)
-ffast-math (enables common math optimizations for code that doesn't require strict IEEE compliance)
-fomit-frame-pointer (implied by -O3, reduces memory consumption to support lower-RAM, e.g. 1 GB, devices)

iOS (compiler: default, included with Xcode 6.0.1)
Compiler options: default (-Os and Automatic Reference Counting)

Windows (compiler: default, included with Microsoft Visual Studio 2013)
Compiler options: default (/O2) + /Oi (generate intrinsic functions)
Note how Android is using the most aggressive optimizations, iOS the least aggressive, and Windows something in between.
This strikes me as making the benchmark misleading for many purposes.
You can argue that the default compiler options represent some sort of "average user experience", but my suspicion is that for code that actually MATTERS, developers on each platform do rather better than just use the defaults. The defaults don't matter when you're using your Citibank app or diet-logging app (and in that case, I'd argue Apple's default is the most sensible), but for cases where performance does matter, I would imagine that the top-tier game developers, Chrome, and Apple's internal developers, to give a few examples, are being rather more aggressive in their choices.
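For concreteness, this is the sort of thing I mean. The flag sets below are hypothetical illustrations of "more aggressive than default" release builds, not taken from any real project's build configuration:

```shell
# Android NDK (gcc/clang): going beyond the NDK defaults
cc -O3 -flto -fomit-frame-pointer -o app app.c

# iOS (clang via Xcode): trading the -Os default for speed
clang -O3 -flto -o app app.c

# Windows (MSVC): adding whole-program optimization to /O2 /Oi
cl /O2 /Oi /GL app.c
```

The point is simply that serious performance work rarely stops at whatever the IDE ships with, which undercuts the "defaults = real-world experience" argument.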
The point is not to slander BAPCo --- different benchmarks have different uses --- but to point out that TabletMark is not an especially useful metric of CPU performance. It is VERY MUCH a whole-system metric. (Even as a whole-system metric, I'm not at all convinced that it works well, but I don't know enough to comment on that. As far as I can tell, what it is TRYING to do is simulate a stream of user events into apps and then measure how long some process takes from "simulated event" to "process completion". The problem is that if the OS does not provide specific hooks that let you know when the process being timed is complete, then you have to use proxies, and those proxies may be problematic. And iOS seems like exactly the sort of place that is NOT going to be favorable to this kind of monkeying around and trying to fake user behavior...)
To compare base CPU performance, I consider GeekBench a lot more helpful because it is substantially less subject to these problems.