Stop the presses. The PC is dead, and so is the Mac. Killed by the more powerful iPad Pro. At least that’s what some tech writers proclaimed after Apple’s latest iPad Pro wonder hit the streets.
But is the iPad Pro really a PC killer? After days of poking and prodding, I can safely say: Hell no. Far from it.
Before this turns into a flame-fest, let me say this: The iPad Pro is shockingly fast, and Apple has again worked its mastery of hardware, software and virtually unlimited resources to build an amazingly fast chip for the iPad Pro. But let’s not get ahead of ourselves. My battery of tests shows that in some things, it ain’t that fast.
How we got here
What started the “Intel and its CPUs are doomed” talk were benchmarks showing the A9X SoC in the iPad Pro overpowering Intel’s older Haswell chips and even its newest Skylake CPUs.
Many of those conclusions were based on performance results from the popular multi-platform Geek Bench 3 benchmark, as well as browser-based benchmarks such as Mozilla’s Kraken and Google’s Octane 2.0. This limited data set had the faithful buzzing that the end was nigh for x86.
If you like to test hardware, you know the weakness of the last two tests: A browser test isn’t a test of the CPU/SoC, it’s a test of the chip plus the browser and OS optimizations underneath it. On the iPad Pro the browsers are pretty much the same, as Apple makes them all use its highly optimized rendering engine. On the PC, your browser pick matters. Browser-based benchmarks are hardly the best tools on the PC either.
Geek Bench 3 is different. The creators of Geek Bench 3 have stated that their goal is to create a cross-platform test that isolates the CPU as much as possible, using algorithms that they believe are valid for chip performance. If you peep at the chart below, you can see what got people in a tizzy.
Yes: Whoa. That iPad Pro in single-core performance (which is a good metric to use to judge across platforms where some chips have more cores) is every bit as fast as the CPU in the newest mid-range Core i5 Surface Pro 4 in Geek Bench 3. It’s uncomfortably close to that Core i7-6600U in the far pricier top-end Surface Book, too.
For the record: Almost all of the tests in this section were run within the last few days, with the latest OSes and updates applied. The only OS that was out of date was my corporate-issue Windows 8.1 box with its 3.4GHz Core i7-2600, which I threw in for kicks.
Although I think it matters less, I’ll hit you with the results from Geek Bench 3 for multi-core too. The iPad in multi-core performance is on a par with the older Haswell-based Surface Pro 3, but it loses to the newer Skylake-based Surface Pro 4. Why? I’m not sure, but the Intel chips' Hyper-Threading resource management could be a factor. That’s why I think the single-core performance is more meaningful.
First up is BAPCo’s TabletMark V3. While Geek Bench 3 attempts to create what its makers think is an accurate measure of CPU performance using seconds-long “real world” algorithms, BAPCo’s approach is actually more “real world.” The consortium of mostly hardware makers set out to create workloads across all the different platforms that would simulate what a person does, such as actually editing a photo with HDR, browsing the web, or sending email. Because there’s no universal app that runs in Windows, Windows RT, Android and iOS, BAPCo custom-created apps that do the same thing with the same interface across all platforms. When you watch it run, it looks like someone is using the same application to do the same task on every platform.
A white paper on the benchmark discloses the approach as well as the libraries, compilers and APIs used in the test. The test runs in real time, which could take a few hours on some devices. Here’s how the iPad Pro fares.
In TabletMark V3, the iPad Pro doesn’t look quite as threatening, does it? Even the Intel Haswell Core i5-4300U in the two-year-old Surface Pro 3 easily outpaces the A9X here. The A9X isn’t even far ahead of the tablet pack. The worst performer for x86 is the budget Surface 3 with its Atom X7-Z8700. For shame, Atom, for shame.
The benchmark has two performance modules, one of which gives you an idea of how fast the device would be in web browsing and email. The iPad Pro’s result there is tepid, with even the Nexus 9 and its Tegra K1 just behind it in performance. If you were to put stock in this test, the iPad Pro is maybe a little faster than a Surface 3 and its Atom.
TabletMark V3 also measures photo and video performance, where the iPad Pro has a healthy lead over the ARM competitors and the Atom X7-Z8700. The Atom X7 is the only x86 chip it beats, though: The A9X doesn’t come close to the Core i5 or Core i7 devices, or even the Core M.
The puzzler is the performance of the Surface Pro 3 and the Dell Venue 11 Pro, which use older chips. I expected this to be in the bag for the Skylake parts, but the Broadwell-based Core M and the even older Haswell Core i5 are hanging right there.
Every other test I’ve run shows Skylake with a healthy performance bump over Broadwell and Haswell. I attribute that to the chip's running at higher clock speeds, and other micro-architecture improvements. For what it’s worth, I don’t generally bother with TabletMark V3 when I test anything with any actual performance. I haven’t found it to scale with faster CPUs, and other tests are far more intensive.
3DMark and graphics performance
For graphics performance I turned to 3DMark’s Ice Storm Unlimited. It’s a popular test that happens to run on iOS, Windows and Android. It renders the test without regard to the screen resolution and is a pretty good measurement of lower-grade graphics performance. By lower grade I mean, this isn’t Assassin’s Creed Syndicate, which will reduce even a $650 GeForce 980 Ti to 45fps.
All of the devices here used the integrated graphics. The Surface Book was in Clipboard mode, with its GPU disconnected and two feet away. The overall score factors in game physics and the graphics performance.
Apple put a lot of resources into giving the A9X a metric ton of graphics performance, and it shows. It slightly outpaces the Nvidia Tegra K1 in the Nexus 9 and the Shield Tablet in 3DMark. But if you keep looking up that chart, the A9X is still a good clip behind the Dell Venue 11 Pro and the Surface Pro 3. Mind you, that Venue 11 Pro’s Core M is an older power-sipping chip that uses 4.5 watts, not a 15-watt chip like the Surface Pro 3's.
The improved graphics core in the Skylake Core m3 is even more impressive. I'm currently testing the Asus UX305 with the Skylake-based Core m3, and it's posted an overall 3DMark score of 51,181, which would make it third in the chart above.
I had access to an Nvidia Shield TV, which can run 3DMark in Android TV, so I threw the score from the Tegra X1 into the mix for reference. The idea is to show where Google’s Pixel C could fall, as it should be the first mobile use of a Tegra X1. Before you think the Tegra X1 will whip the A9X, you should remember that the Shield TV is thicker than any tablet and runs on unlimited AC power, not a battery. There’s no need to worry about chewing through the battery in the Shield TV, unlike with the upcoming Pixel C, so the latter's graphics performance could fall short of this. We’ll see.
3DMark breaks out performance for two areas: Graphics and physics. Here are the scores for the same devices in graphics. The Asus UX305 with its Core m3 isn’t on the chart, but it produces a score of 65,904, so third again.
One thing I will say after all of this is my opinion on Atom X7 is changing for the worse. It would be nice if Intel’s budget chip didn't drag its butt across the finish line dead last in just about every test.
3DMark also runs a physics test, which measures how the platform would run a theoretical game engine. In short, it’s supposed to measure how fast a device’s CPU would be, not its GPU. The result here actually puts the iPad Pro and the A9X at a pretty big disadvantage against all of the x86 chips—yes, even the lowly Atom. Nvidia’s Shield Tablet and the Shield TV also run past the A9X. The rest of the legit x86 chips are sipping lemonade and reading the paper while the iPad Pro crosses the finish line.
The search for answers
The reason for the iPad Pro's demise here may lie with how the A9X works and the way Futuremark builds its benchmark. Futuremark has been through this before, when the iPhone 5s proved no faster than the iPhone 5 in the physics test despite claims of double the performance from Apple. Futuremark’s investigation pointed to how the A7 chip in the iPhone 5s (and iPad Air) handles non-sequential memory structures. Futuremark said it was a conscious design change Apple made between the A6 and A7 that hurt its performance, and 3DMark was showing the result of that.
But rather than make a change just to help show off Apple’s performance, Futuremark chose to stick to its benchmarking method, declaring:
“3DMark is designed to benchmark real world gaming performance. The Physics test uses an open source physics library that is used in Grand Theft Auto V, Trials HD and many other best-selling games for PC, console and mobile. Higher scores in 3DMark Ice Storm Physics test directly translate into improved performance in games that use the Bullet Physics Library and are a good indicator of improved performance in other games.”
My own test
I didn’t want to rely on just third-party benchmarks for this. Instead I wanted to keep it real by finding an actual heavy-duty function that would be done on a device. Such as, say, a test that measures how fast a device handles an Excel spreadsheet used in trading scenarios. For that I turned to a publicly available test put out by Excel Trader.
Excel, unlike Word, can use all the CPU power you throw at it if you use it for intensive purposes. My thought was, it didn’t matter what any of the benchmark theories were. This is a real app. A real task. By someone who can drive an Excel spreadsheet like Richard Petty navigating Riverside International Raceway.
I tried both the Excel 2007 and the Excel 2011 tests available from Excel Trader on the iPad Pro. No dice. Meanwhile, on the Surface Pro 3 with Excel 2013 I had no issues running either of the site’s test files.
It’s Microsoft’s fault
To be fair, some of this failure goes back to Microsoft's uneven support of Excel on the Mac platform. But it also shows that you can't quite do “everything” on an iPad Pro that you can on a PC, even now that Microsoft is actively supporting the iPad. People really do rely on Excel to push some very complicated financial and statistical modelling, or use a lot of Office’s Visual Basic scripting. Is it what 75 percent of people do? No, but if you’re in that 25 percent that needs it, you’ll be pissed it doesn’t work. Or you’ll just buy a Surface.
Well, hell, I needed to test something. I certainly can’t run AutoCAD 2016, Photoshop CC or Premiere Pro CC on the iPad Pro. Ahem.
I decided to settle for something a person would do on both platforms that makes you drum your fingers on the desk. Like decompressing a 1GB ZIP file that also had 256-bit AES encryption applied to it. I took several thousand tiny 5K .ini files, added low-resolution screen shots and web photos, then ladled on some higher-res JPG files, a 267MB .MTS video file shot on my Sony NEX, maybe a hundred PDF files and a dozen or so MP3 files. Finally I compressed them all with 256-bit AES using 7-Zip. The file was copied to each device.
On the Windows machines I used 7-Zip 15.11 beta to decompress the files. The results were timed with a stopwatch, and an average of the last three runs of a four-run series was recorded, with the first test discarded. This type of testing is inherently unreliable and sometimes can’t be reproduced, but I’ve run it enough times on all of the platforms to have some confidence in it.
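The run protocol described above (four timed runs, the first discarded as a warm-up, the remaining three averaged) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tooling used for this article, which relied on a stopwatch; the function names `time_runs` and `average_after_warmup` are hypothetical, and the workload is a stand-in.

```python
import time
from statistics import mean


def time_runs(task, runs=4):
    """Call task() `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def average_after_warmup(durations, discard=1):
    """Drop the first `discard` runs (warm-up) and average the rest."""
    kept = durations[discard:]
    if not kept:
        raise ValueError("need more runs than discarded warm-ups")
    return mean(kept)


if __name__ == "__main__":
    # Stand-in workload; a real test would decompress the archive here.
    durations = time_runs(lambda: sum(range(1_000_000)))
    print(f"average of last three runs: {average_after_warmup(durations):.4f} s")
```

Discarding the first run keeps one-time costs, such as cold caches or an app's first launch, from skewing the average.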
With the Windows machines, there were no issues. Not one. The iPad, though, gave me fits depending on the app I used. WinZip, for example, just hung even trying to decompress an encrypted file a quarter the size. I finally found an app that would work in iZipPro. It reliably decompressed the file over and over and over again without issue.
The result in my jury-rigged test? Pretty good, but no cigar. Check it out:
Keep in mind, this test is more of a system-level test than a pure CPU test. Memory bandwidth, storage performance, file system and CPU are all working for the result you see here. Once the app has decrypted the .MTS file, for example, it’s just writing a copy back to the drive as fast as it can.
In this one test, the iPad Pro can’t touch even the two-year-old Surface Pro 3 and its Haswell chip, but the Core M in the Dell Venue 11 Pro is just barely ahead of the iPad Pro. It’s a good show for the A9X, but a two-year-old Haswell chip finishes the job in half the time. Atom X7? Um, yeah, Coach wants to talk to you after practice today.
But what about Geek Bench 3?
The bulk of these tests would indicate the A9X is a really fast ARM chip, at the top of the mountain, at least for ARM-based tablets. If you still want laptop-like performance with a full-service OS, though, you’ll need a full Core-class chip.
That’s not what you’d think if you peeped at Geek Bench 3. Per core, remember, it’s a CPU test that shows the iPad Pro to be every bit as fast as a current Core i5 Skylake CPU.
Some have claimed the problem is how Geek Bench 3 weights its results. SHA2 hashing, for example, is overly represented in the CPU score. Given the hardware acceleration in the A9X, it’s showing Apple’s chip to be far faster than it actually is.
Geek Bench 3 lets you view sub scores, and here’s how SHA2 performance played out across various devices, including a desktop 3.4GHz Core i7-2600 and a water-cooled 4GHz Core i7-4790K chip. Because it's single-core performance, the Hyper-Threading and additional cores don’t cloud the issue. The results are rather interesting.
If you believe Geek Bench 3’s SHA2 numbers, the A9X in the iPad Pro and Nvidia’s Tegra K1 are actually faster than all of Intel’s current mobile dual-cores in SHA2 hashing performance.
Both are also faster than a desktop Sandy Bridge chip. To put that in perspective, the SHA2 performance of the Tegra K1 and A9X isn’t too far behind that of an 88-watt 4GHz Core i7-4790K chip.
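To see why the weighting complaint matters, consider how one hardware-accelerated sub-test can move a composite score. The sketch below uses made-up sub-scores and a plain geometric mean as the aggregation. Both are assumptions for illustration only: they are not Geek Bench 3's measured data, and the formula is not presented as its documented scoring method.

```python
from math import prod


def geometric_mean(scores):
    """Geometric mean of a non-empty list of positive sub-scores."""
    if not scores or any(s <= 0 for s in scores):
        raise ValueError("scores must be positive and non-empty")
    return prod(scores) ** (1 / len(scores))


# Hypothetical sub-scores, NOT measured results.
# Chip A has a fast crypto unit; chip B is faster at everything else.
chip_a = {"sha2": 4000, "compress": 2000, "sort": 2000, "parse": 2000}
chip_b = {"sha2": 2000, "compress": 2500, "sort": 2500, "parse": 2500}

for name, chip in (("A", chip_a), ("B", chip_b)):
    with_sha2 = geometric_mean(list(chip.values()))
    without_sha2 = geometric_mean([v for k, v in chip.items() if k != "sha2"])
    print(f"chip {name}: with SHA2 {with_sha2:.0f}, without SHA2 {without_sha2:.0f}")
```

With these invented numbers, chip A edges ahead once the SHA2 result is included, while chip B leads comfortably when it is left out. That is the shape of the objection: a single accelerated workload can flip a composite ranking.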
Geek Bench 3 is hardly without controversy either. Often cited by its detractors is a post by calm sage Linus Torvalds on a RealWorldTech.com forum, where he went Kanye West on Geek Bench 3.
“Wilco, Geek Bench has apparently replaced dhrystone as your favourite useless benchmark,” Torvalds wrote. “Geekbench is SH*T.”
“It actually seems to have gotten worse with version 3, which you should be aware of,” Torvalds wrote. “So basically a quarter to a third of the ‘integer’ workloads are just utter BS. They are not comparable across architectures due to the crypto units, and even within one architecture the numbers just don’t mean much of anything. And quite frankly, it’s not even just the crypto ones. Looking at the other GB3 ‘benchmarks,’ they are mainly small kernels: not really much different from dhrystone. I suspect most of them have a code footprint that basically fits in a L1 cache.”
Geek Bench’s side of the story
To get Geek Bench’s side of the story, I spoke with John Poole, one of the primary developers behind the test. Poole said he understands the controversy and has taken it to heart, but he also disagrees with Torvalds.
“We have a lot of respect for him,” Poole said. “I think he’s wrong in this case.”
Torvalds argues against the value of small code loops in measuring performance, but Poole said the future is mostly about smaller loops. Poole said moving a window around a screen or opening a window is mostly a solved problem for CPUs.
Poole said they’ve been very transparent with what the test measures and have provided extensive documentation as well. In order to measure the chip performance, Geek Bench tries to execute the same code on every platform, Poole said.
Poole claims the question of whether the A9X is faster than, say, a Core m3 is beside the point. The software that can do what you can on a laptop just isn’t there on the iPad Pro today, rendering the iPad Pro mostly a curiosity until software changes that.
Still, Poole said, he does wonder how the A9X would run OS X. I do too, as it would make comparisons far easier to measure.
To be fair to Geek Bench, benchmark developers are often accused of serving internal political needs. BAPCo, the software and hardware consortium, has its roots in the PC industry and some feel it’s too PC-centric. Even among PC vendors, there was strife when AMD quit the group and accused it of being too closely aligned with Intel on its once-popular SYSmark.
BAPCo officials have always said their intentions are to create benchmarks that offer insight to the public. If the test is cooked to favor Intel or x86, it certainly didn’t show when the new version was released earlier this year and Intel’s older Atom chips were kicked down the stairs by Apple’s and Nvidia’s CPUs.
I spoke with BAPCo’s John Peterson about the philosophical differences between the two.
“TabletMark utilizes a wide variety of APIs provided by each platform to represent productivity/media app performance and battery life. The workloads are implemented in a way that’s meant to reflect the implementation choices app developers would make for each of the platforms,” Peterson said. “GeekBench generally utilizes its own libraries to perform tasks, while TabletMark utilizes the available platform APIs to perform equivalent functionality in a more platform-tailored way.”
Peterson also said relying on browser-based benchmarks, which many testers have done, is inherently problematic, as you’re limited to what hardware is exposed to the browser.
What about editing three 4K streams?
One of Apple’s most impressive claims is the ability to “edit” three simultaneous 4K movie streams—something that is no fun on a desktop and probably out of reach of most laptops.
To test that, I decided to throw the platform a fastball. I grabbed several 4K RAW video files shot on a Red digital cinema camera.
On the Core i7 Surface Book I installed Premiere Pro CC, created a project, added several of the R3D files to the timeline, and tried to play it back. Without doing a render, it wasn’t going to happen at full-res without major hitching—and this with the assistance of the GeForce GPU. Once rendered out, editing and scrubbing through the timeline was possible.
On the iPad Pro I tried to open up the same R3D file in iMovie with no luck. I’m pretty sure the issue is iMovie’s inability to read the raw R3D files. I also copied over a 4K-resolution .MOV file to the device but had no luck opening it in iMovie either. I'm not saying Apple is wrong, but I had no luck trying to do what Apple says the iPad Pro can do.
So what does this all mean?
For one thing, I don’t think you can look at just a couple of numbers from one or two benchmarks and draw a conclusion. That is, unless you’re looking to bend the truth to fit a pre-conceived agenda. That’s called benchmarketing, not benchmarking, and there’s a difference.
The truth is, most of the testing I’ve run shows the iPad Pro isn’t faster than a current or even two-year-old Core-class Intel CPU. (Atom, now that’s another story.)
It’s just not.
But it’s still one hell of a chip.
I tried to heat up the A9X to check for performance throttling by repeatedly running 3DMark and simply gave up. It just does not get hot. I can’t say the same for the admittedly smaller (and harder-to-cool) Google Nexus 9, which gets hot just browsing the web. Watching the same 4K movie file (that I couldn’t actually edit) on the iPad Pro was buttery-smooth even playing in a background window. It’s a very impressive tablet.
It still won’t replace my Surface Pro 3 or my laptop because it’s not up to desktop-grade functionality. That’s not the A9X's fault. It’s because the OS and apps aren’t up to the “pro” requirements for multi-tasking, or for the precision control of a mouse-and-keyboard experience.
If Intel’s Atom X7 can run a full-service OS, the superior A9X theoretically could, too. That’s a win for Apple in the long run, and it should be a wake-up call for Intel.