Apple comments on Consumer Reports’ faulty MacBook Pro tests

Back in December, Consumer Reports issued a statement saying they could not recommend Apple’s new MacBook Pro because the latest batch of MacBook Pro laptops exhibited “battery life results (that) were highly inconsistent from one trial to the next.”

Many people saw issues with the tests as soon as they were published, and as it turns out, they were right. Consumer Reports were using hidden settings meant for developers, instead of using the normal settings that people use every day, to test the battery.

“We appreciate the opportunity to work with Consumer Reports over the holidays to understand their battery test results,” Apple said in a statement provided to The Loop. “We learned that when testing battery life on Mac notebooks, Consumer Reports uses a hidden Safari setting for developing web sites which turns off the browser cache. This is not a setting used by customers and does not reflect real-world usage. Their use of this developer setting also triggered an obscure and intermittent bug reloading icons which created inconsistent results in their lab. After we asked Consumer Reports to run the same test using normal user settings, they told us their MacBook Pro systems consistently delivered the expected battery life. We have also fixed the bug uncovered in this test. This is the best pro notebook we’ve ever made, we respect Consumer Reports and we’re glad they decided to revisit their findings on the MacBook Pro.”

Consumer Reports updated their MacBook Pro page, but they are blaming the bug for the previous faulty results. The true problem was their methodology.



  • Agreed, the main issue is the methodology. Why would you turn off browser caching if it’s not a standard option that users would use anyway? It’d be like finding a secret setting that would force the computer to constantly look for wi-fi signals (which will also kill the battery).

    Consumer Reports should publish their findings, but note that they are working with manufacturers to discover the cause of their lower findings. That would make them more useful to consumers, especially on software-based hardware where the issue isn’t necessarily the hardware.

    • Meaux

      You would turn off browser caching so that you could load the same page multiple times to simulate the real customer experience of visiting a lot of pages. CR’s test loads a handful of pages a bunch of times; real users would go to a bunch of different websites. So, in order to make loading a few websites act like going to a lot of different websites, you would need to turn off browser caching or the browser would just pull from the cache. There’s nothing unreasonable about doing it that way.

      • Blair Hanley Frank

        I concur (as someone who used to benchmark Macs).

        The 100% real life scenario is that users will get some performance benefit from the cache, but it won’t be consistent and can’t really be reproduced in a lab setting.

      • John Parkinson

        Most people go to a handful of websites and multiple pages on those websites a lot, so caching is highly relevant if you’re testing for real life usage.

        • Meaux

          But a lot of those websites are places like ESPN, Youtube or the NYTimes, where it is the same place, but you go to different pages within it and the bulk of what you’re downloading is new. I guess if there’s a setting for, “Only use 20% of the cache,” it would be more realistic, but using no cache is closer to reality than using a cache.

          • rick gregory

            Not really because even ESPN etc will have some static resources from page to page (the CSS likely, the logos, etc). I get what you’re saying, but if CR wanted to simulate that they should, perhaps, cycle through pages from the top 10 or 20 sites. Or build a set of sites that update themselves throughout the test.

            But don’t say you’re setting up your computers with the settings that normal people will use when you’re not.

          • Meaux

            Which is why I said 20% of the cache. And let’s face it, the static resources are dwarfed by the updated stuff like the high res images of Alabama-Clemson or autoplayed videos of Stephen A. Smith yelling at you.

          • rick gregory

            20% is pulling a figure out of thin air. What they SHOULD do is construct a test on the server that simulates actual sites and leave the client alone. Record some sessions, play those back from the server (i.e., grab a set of ESPN pages and have them change periodically on the internal server). Altering client settings to simulate what someone thinks really happens is bullshit.

      • Billy Razzle

        Except that Apple clearly doesn’t do that when they test for “real customer experience”. Why would you even call it that if you’re doing a simulation in a lab?

        • Glaurung-Quena

          Apple probably does what Anandtech does and takes a series of recorded real world browsing sessions and then has them played back on the test machine. Or they just use a long enough list of websites to visit that the battery runs down before you run out of sites to visit. This isn’t rocket science.

        • Cranky Observer

          Why doesn’t the EPA take cars out on the road and report the fuel economy after a few hours of random driving, different on each run?

      • rick gregory

        However, people tend to visit a set of pages more regularly and some new pages.

        The point to me is that CR claimed they were using settings that normal people would use. They weren’t. They LIED. Credibility = zero now.

        • Meaux

          As you note below, they didn’t lie, you made up the point that they claimed they were using settings that normal people would use. So by your own standard, is your credibility = zero?

          • rick gregory

            You know.. this is why people are so loath to admit any mistakes. People like you beat them over the head with it rather than address the substantive points that don’t rely on the mistaken assertion.

          • Meaux

            Perhaps you didn’t pick up that I was ironically using your exact same words in response to you. So perhaps you can take that as a lesson to be less extreme in your rhetoric about people making mistakes.

      • Glaurung-Quena

        Sorry, no. CR should have used a larger corpus of sites instead of turning off the cache.

      • komocode

        No, CR should reset Safari instead of disabling the cache. Consumers visit Twitter and Facebook multiple times a day, so there shouldn’t be a reason to reload all of the static images and js files over and over.

  • well this is fairly major. if CR doesn’t admit it was their test then they are bunk.

  • totalitat

    Er. I’m a regular user and I use that setting. And way to avoid the fact that there was a software bug causing the actual issues.

    • rick gregory

      Then you are deliberately compromising the speed at which Safari loads pages. There’s no reason not to load the cached resources of a site if you just visited that page unless you’re a developer debugging things. Even if the content changes, you will gain real advantages caching resources like the CSS, site logo, etc.

      The point is that CR lied. They presented their tests as just how most users use the MacBook, with out of the box settings. They didn’t in fact do this.

      • totalitat

        Actually, I’m really not deliberately compromising the speed that Safari loads. I have a MBP with a non-SSD harddrive, and the speed of my network is comparable to the speed with which it can access things from the cache.

        “They presented their tests as just how most users use the MacBook, with out of the box settings.”

        If you’d like to quote for me where they said “out of the box settings” on their test cases, I’d love to see it. Hint: you won’t be able to, because they don’t say that.

        • john doofus

          “I have a MBP with a non-SSD harddrive, and the speed of my network is comparable to the speed with which it can access things from the cache.”

          You’re still rocking the original Mac Portable? Respect.

          • totalitat

            Hah! It’s a bit heavy to carry but you should see the eyes widen of other hipsters in the coffee shop when I arrive.

        • rick gregory

          You’re right in your last point. In fact, they admit to drastically altering settings from what people will use in order to… do something. Not sure what. But it means you cannot take their results seriously since they don’t reflect what people will experience if they use default, out of the box settings.

          • totalitat

            So, in fact, they didn’t lie, did they?

            “Not sure what”

            Er…they explain it in the article you link to:

            “Because people use laptops differently and because their usage can vary from day to day, our battery tests are not designed to be a direct simulation of a consumer’s experience. Rather, we look to control as many variables as possible, then perform a test that gives potential users a reasonable expectation of battery life when a computer’s processors, screen, memory, and antennas are under a light to moderate workload . This test has served as a good proxy for battery life on the hundreds of laptops in our ratings.”

            It does help to read the article, you know.

            “But it means you cannot take their results seriously since they don’t reflect what people will experience if they use default, out of the box settings”

            Is your contention then that the average consumer uses their laptop exactly as it came out of the box? They don’t install applications, adjust the brightness, or do anything to alter the way it came? Would you care to defend that assertion?

          • rick gregory

            I’ll easily defend the assertion that very few people ever disable the cache since to do so they need to reveal a hidden menu and then alter a setting in that menu. Once we talk about brightness, etc the permutations become too much to test, but I think it’s perfectly reasonable to expect a test to be conducted with out of the box settings since you KNOW those are at least a common starting place. How is that even slightly controversial?? Sure, I can see turning off screen dimming since that simulates what will happen if someone is using the machine, but aside from that defeating features meant to help battery life and then measuring battery life is deeply silly.

            As for what they said in the article, I did read it. My point was entirely about the headline which makes it sound like the variation was entirely because of the Safari bug. A secondary point is that they evaluate battery life after deliberately defeating features meant to maximize it. That’s idiotic. It’s at the “Did you know if you shoot your foot, it will bleed??!!” level.

          • totalitat

            “I’ll easily defend the assertion that very few people ever disable the cache since to do so they need to reveal a hidden menu and then alter a setting in that menu”

            That’s not what I asked.

            “Once we talk about brightness, etc the permutations become too much to test, but I think it’s perfectly reasonable to expect a test to be conducted with out of the box settings since you KNOW those are at least a common starting place.”

            It’s a common starting place among all the different manufacturers? Microsoft sets up their computers the same way Apple does the same way Lenovo does? If they don’t, then you’ve just lost any ability to compare battery life between manufacturers.

            “defeating features meant to help battery life and then measuring battery life is deeply silly.”

            Trying to simulate consumers going to a range of different web pages is “deeply silly”? That’s an odd statement.

            “My point was entirely about the headline which makes it sound like the variation was entirely because of the Safari bug”

            Uh, the comment I responded to had nothing to do with the headline, so I hardly see how the above could be true. Please don’t change your argument midstream. It just gets everyone wet.

            “deliberately defeating features meant to maximize it.”

            If by “deliberately defeating features” you mean “trying to simulate actual usage,” then sure. Otherwise, no.

          • Cranky Observer

            Very few people need to drive automobiles in a closed 2.5 mile loop with exactly one hill, 4 lane changes, one wet zone, one rough patch, etc. Yet that’s what the automakers, the NHTSA, the EPA, the car enthusiast magazines, and Consumer Reports all do.

        • David Stewart

          Even a really fast internet connection isn’t going to be nearly as fast as a local cache. Also the way browsers load network resources can create real bottlenecks.

          • totalitat

            That has not been my experience.

    • Dan Andersen

      Er, now you know you should not use that setting. Get to it, totalitat!

      • totalitat

        Er. If I didn’t use any setting that might have the chance of provoking an obscure bug in OS X, I would never turn the computer on in the first place.

  • rick gregory

    So to all of the people slamming Apple and pooh-poohing the idea that CR should have used the results to look at their methodology (moons them).

    THIS is why I and others questioned the CR results in other comments. They made no sense and should have triggered some self-examination on the part of CR. Instead, they went for clickbait.

  • RodoBobJon

    People are being a bit unfair to Consumer Reports. At the end of the day, this was a bug in Apple’s software that caused the erratic battery results. Turning off caching is a totally reasonable testing methodology if you want to simulate users visiting new sites, and it’s not CR’s fault that this triggered a weird bug.

    • David Stewart

      The problem is that their testing was designed to model average users and it did not. They should have recognized that something was wrong with their testing and worked to fix it rather than release misleading results (if they care about credibility).

    • rick gregory

      No, they deliberately defeated settings that are there precisely to help users get better performance and battery life. In doing so they bias the results in unpredictable ways not only for the MBP but for any laptop they test.

      They also are not using the settings that most people will use, thus making their results not indicative of what people will see.

      • Cranky Observer

        The inconsistency was a critical issue, not just the average lifetime.

    • John

      If you’re testing battery life for end users then it’s not a reasonable methodology. If you’re developing or testing website content then it is reasonable.

      • Colin Mattson

        Heck, I’d argue it’s barely even reasonable if you’re developing or testing web content: It’s 2017, dammit, and most development tooling is perfectly capable of telling the browser whether or not an asset has changed.

        I mean, sure, if you’re using a really dumb development server and some totally barebones roll-your-own code, maybe you don’t want your browser caching anything. But the vast majority of modern frameworks, CMSes, and servers will properly use or invalidate the cache in the vast majority of circumstances with zero effort on your part.

        • John

          It’s nice when you get a greenfield project and all the newer stuff. You’re right, the cycle is so much smoother.

          However it’s not so smooth when you’re asked to fix some 10 year old piece of crap hung on the front of some SAP system and it only works with IE 6 or something. 😉

        • Bill_the_binman

          I’d like someone independent of CR to test a non-Apple notebook with the same settings and see if they get similar results. If not, this adjustment was made just for Apple…

    • Cranky Observer

      Apple doesn’t have bugs – the universe has flaws in its fine structure that create the illusion of bugs in Apple software; these need to be repaired.

  • GS

    The comments on here clearly show reality for humans is based on what they think it is. Feel free to disagree.

  • jfutral

    Regardless of what you think about CR and its “methodologies” or their update comment and Apple’s admission to a bug, it was pretty cool that CR was open to Apple’s input and cooperative with the investigation AND amended their review. That’s still pretty straight up good behavior.

    Joe

    • GS

      I like too that Apple remains calm and polite towards CR when disputing their results, as that is how reasonable adults handle disagreements.

    • rick gregory

      Sure, they got the clickbait traffic and publicity. But look at their headline in the post about this result now “Apple Releases Fix to MacBook Pros in Response to Consumer Reports’ Battery Test Results” as if their methodology wasn’t at fault at all.

      Nothing about how their test methods alter settings from what someone will get if they just start using Safari.

      Note that they do this on ALL laptops – so this goes beyond the MBP result and calls into question every result they’ve produced.

      • Hieronymus Washington

        So if they’ve always used this methodology and always previously recommended MacBook Pros does that call into question all those past recommendations?

        • rick gregory

          Honestly… it kind of does. At least on battery life. But not only Macs, ALL laptops. Mind you, battery life is only part of what makes any laptop good, but to the degree that people relied on the accuracy of the CR test, yes, all past evaluations of battery life have to be suspect.

      • GS

        Totally agree, gave up on CR ages ago. I don’t think they gained even one new paying customer, which I believe is their real goal here.

      • jfutral

        One thing is clear, though: regardless of CR methodology, there is something about this MBP or the OS/software that comes with it that makes battery life crazy across the board, as other users report (such as the recent article at Verge).

        CR’s methodology didn’t change as far as we know and this MBP responded differently than ANY MBP before. And this did uncover a bug. None of that is disputable. That doesn’t mean everyone else having issues is doing the same thing CR did, but it does make me wonder if the bug was related to the issue everyone else is having. Not so much that the bug is the issue, but a symptom.

        Just a thought, Joe

        • rick gregory

          Oh I agree that there was a bug and that the CR methods uncovered it. To that degree, this was a good thing to have happened. My issue is with how CR presented this initially and how they still refused to note in their headlines etc that they weren’t really testing a configuration that most people would use. Screen dimming prevention aside, it makes one wonder about their other tests. Defeating settings meant to help battery life and then measuring battery life just feels like a deeply weird methodology to me.

  • Some nice selective quoting of the Apple statement in the CR update article, specifically omitting the line “This is not a setting used by customers and does not reflect real-world usage”.

  • The Cappy

    Rene Ritchie took a lot of crap for being an Apple yes-man, because he said CR needed to be more rigorous about their results. People pointed out that CR had seen what a “regular user” would have seen, and a regular user wouldn’t have known to do a bunch of tricksy regressions. But this is pretty ironic. It turns out that CR’s results were the opposite of what a regular user would have seen. If you don’t know what the hell you’re doing, you shouldn’t be blasting your results to the world, misinforming everybody.

  • John

    You turn off or clear caching in developer mode when developing and testing so you can reload the website fresh. That’s why developer mode is off by default under the advanced tab. Apple reported an intermittent bug with the browser in developer mode but, long story short, no developer or QA would turn on developer mode to test the battery life of a laptop. They would turn it on to test web site content and behavior actively under development.

  • Hieronymus Washington

    You may not agree with the methodology but the fact remains that it uncovered a bug that affected the tests. Saying the methodology was the true problem is an incorrect and misleading conclusion.

    Regular users have been seeing inconsistent battery results since it’s been released. I’d even wager that more have been having issues with it than have been having problems with Apple Music…

    • David Stewart

      Their test was designed to simulate real-world behavior; it didn’t. That’s a methodological problem.

      • Hieronymus Washington

        But the inconsistent results were caused by the bug, as Apple has said. My point is everyone has their tits in a wringer over them turning off the local cache, but if they have always done that and it hasn’t affected their past recommendations, it shouldn’t have made a difference this time, at least as far as them recommending it goes. Saying the inconsistent results they got were solely because of the methodology is wrong.

        • David Stewart

          The bug may have been the proximate cause of the erratic data, but it is purely down to methodological error that those results were published as they were. It is up to the investigator to validate their results and ensure their models accurately reflect their subject.

          • Hieronymus Washington

            If they get good results with the cache off after the bug fix then it doesn’t matter.

          • David Stewart

            It matters because it reveals a lack of competency in their testing.

  • John

    My favourite quote from the original CR article:

    “Once our official testing was done, we experimented by conducting the same battery tests using a Chrome browser, rather than Safari. For this exercise, we ran two trials on each of the laptops, and found battery life to be consistently high on all six runs. That’s not enough data for us to draw a conclusion, and in any case a test using Chrome wouldn’t affect our ratings, since we only use the default browser to calculate our scores for all laptops. But it’s something that a MacBook Pro owner might choose to try.”

    • That’s pretty funny. Without disabling the cache on Safari, it would probably perform much better than Chrome. So their recommendation screwed anyone who followed it.

  • lattermanstudio

    Sooooooo will Apple PUT BACK the ‘time remaining’ info that they TOOK OUT to help hide the battery issues… !!!

  • Do people still read Consumer Reports? I thought everybody read reviews on The Wirecutter!

  • Денис

    They turn off caching because if you leave it turned on, the browser doesn’t download anything over wifi, so the laptop doesn’t consume much power. And when you work on the web, do you really refresh only one page? Or do you look at different pages, so that your browser loads data from the web? Sorry for bad English.