Debunking Google’s Internet Optimization Tips

Category: Internet

Google recently published a website called Let’s make the web faster.

The site is aimed mostly at newbies, and a few of its tips made people cringe despite carrying Google’s Seal of Approval. We will look at some of these optimizations and see if they really help.

Micro-optimization

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. – Donald Knuth

PHP

Let’s first take a look at their PHP tips.

Don’t copy variables for no reason.

Sometimes PHP novices attempt to make their code “cleaner” by copying predefined variables to variables with shorter names before working with them. What this actually results in is doubled memory consumption, and therefore, slow scripts. In the following example, imagine if a malicious user had inserted 512KB worth of characters into a textarea field. This would result in 1MB of memory being used!

This claim is wrong. PHP shares one underlying string between both variables until you attempt to change one of them, so the assignment alone does not double memory use. This behavior is called “copy on write”.
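To see copy on write in action, here is a minimal sketch that samples memory_get_usage() around an assignment and around the first write. The function name, the variable names, and the 512KB size are assumptions chosen to mirror Google’s example; exact byte counts vary by PHP version and allocator, so none are claimed here.

```php
<?php
// Sketch: observe copy on write with memory_get_usage().
// $payload stands in for a large form field; all names are illustrative.

function measureCopyOnWrite(int $bytes): array
{
    // Built at runtime, so this is an ordinary reference-counted string.
    $payload = str_repeat('x', $bytes);

    $start = memory_get_usage();
    $alias = $payload;                 // assignment only: both names share one string
    $afterAssign = memory_get_usage();

    $alias[0] = 'y';                   // first write: PHP now makes a private copy
    $afterWrite = memory_get_usage();

    return [
        'assign' => $afterAssign - $start,       // growth caused by the assignment
        'write'  => $afterWrite - $afterAssign,  // growth caused by the first write
    ];
}

$cost = measureCopyOnWrite(512 * 1024);
printf("after assignment:  +%d bytes\n", $cost['assign']);
printf("after first write: +%d bytes\n", $cost['write']);
```

The assignment should add little or nothing, while the first write should add roughly the size of the string.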

Use single-quotes for long strings.

The PHP engine allows both single-quotes and double-quotes for string variable encapsulation, but there are differences! Using double-quotes for strings tells the PHP engine to read the string contents and look for variables, and to replace them with their values. On long strings which contain no variables, that can result in poor performance.

Now this might make sense. But … Google and common sense are not always right!

The Benchmark – Testing 1000 repetitions in creation of a string

Single Quote with $'s : 1431 µs
Double Quote with $'s : 3120 µs

Single Quote without $'s : 437 µs
Double Quote without $'s : 374 µs

Double Quote with $'s escaped : 285 µs

Surprised? I was too. In fact, I tried it on 2 different machines and with different repetitions to make sure I wasn’t seeing things.

The results were the same. The lesson here is that single quotes are only faster when you have $ signs. If you escape the $’s in double quoted strings, they are the fastest!

If you don’t believe me, you can try it here online with Codepad, an online script evaluator.

Update
This test is being re-examined because rearranging the statements seems to change the speed a lot more than the choice of double or single quotes does. The current conclusion is that they are approximately the same speed.
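Since statement order affects the timings, one way to reduce that effect is to run every candidate in several rounds, shuffle the order each round, and keep the best time per candidate. The sketch below is not the original benchmark: the helper name benchmarkShuffled, the round and repetition counts, and the sample strings are all assumptions.

```php
<?php
// Sketch of an order-insensitive micro-benchmark (helper name is hypothetical).
// Each candidate runs in several rounds, the order is reshuffled every round,
// and the best time per candidate is kept so warm-up noise matters less.

function benchmarkShuffled(array $candidates, int $rounds = 5, int $reps = 1000): array
{
    $best = array_fill_keys(array_keys($candidates), INF);

    for ($round = 0; $round < $rounds; $round++) {
        $names = array_keys($candidates);
        shuffle($names);                       // a different order every round

        foreach ($names as $name) {
            $fn = $candidates[$name];
            $start = microtime(true);
            for ($i = 0; $i < $reps; $i++) {
                $fn();
            }
            $elapsedUs = (microtime(true) - $start) * 1e6;
            $best[$name] = min($best[$name], $elapsedUs);
        }
    }

    return $best;                              // best microseconds per $reps calls
}

// Sample strings are made up for illustration.
$results = benchmarkShuffled([
    'single, no $'      => function () { return 'The quick brown fox'; },
    'double, no $'      => function () { return "The quick brown fox"; },
    'single, with $'    => function () { return 'Costs $5 and $10'; },
    'double, with $'    => function () { return "Costs $5 and $10"; },
    'double, $ escaped' => function () { return "Costs \$5 and \$10"; },
]);

foreach ($results as $name => $us) {
    printf("%-20s %8.1f µs\n", $name, $us);
}
```

Keeping the minimum rather than the mean is a design choice: it discards rounds disturbed by warm-up or other processes, at the cost of hiding variance.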

Use switch/case instead of if/else.

Using switch/case statements rather than loose-typed if/else statements when testing against a single variable results in better performance, readability, and maintainability. It’s important to note that using switch/case does a loose-comparison, and should be taken into consideration when being used.

For this myth, I’d like to introduce you to this excellent website phpbench.com. It benchmarks many common PHP myths.

PHPBench’s results of 1000 switches vs if/else

  • if and elseif (using ==) : 171 µs
  • switch / case : 174 µs
  • if and elseif (using ===) : 105 µs

Here we see Google’s statements are off the mark. If statements and switches are essentially the same. Using “===”, or exact equality in the if statement (which is not possible in a switch) is even 60% faster!
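The difference between the loose comparison done by switch and a strict if can be shown with a short sketch. The function names and sample values are illustrative assumptions.

```php
<?php
// Sketch: switch compares loosely (==); an if can compare strictly (===).
// Function names and sample values are illustrative.

function viaSwitch($value): string
{
    switch ($value) {
        case '1':
            return 'one';
        default:
            return 'other';
    }
}

function viaStrictIf($value): string
{
    if ($value === '1') {
        return 'one';
    }
    return 'other';
}

var_dump(viaSwitch('01'));    // string(3) "one"   : '01' == '1', both are numeric strings
var_dump(viaStrictIf('01'));  // string(5) "other" : '01' === '1' is false
var_dump(viaSwitch(1));       // string(3) "one"   : int 1 == '1'
var_dump(viaStrictIf(1));     // string(5) "other" : the types differ
```

So the strict version is not just a faster spelling of the same test; it can take a different branch for the same input.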

I think they probably fell into the trap of assuming that what applies to C also applies to dynamic languages.

HTML

Let’s take a look at their suggestions for HTML.

I commend them for suggesting that optional tags be omitted. This can really help sometimes, but I was always afraid to do it in case some bad browser chokes when they are left out. (Anyone know for sure?)

Update

According to maht from the comments, leaving out </td> and </tr> will result in errors for Netscape 4.

What baffles me is Google’s inconsistency. If you go to google.com, you will see that they leave out <html>. This is expected. However, for some reason they include both <head> and </head>, which are optional, and the <body> tag, which is also optional.

This is strange because in their privacy center, they leave all of that out.

Google only saves 6 bytes from leaving out the <html> </html> and </body> tags after gzip. Is this really worth saving 6 bytes, or is this another micro-optimization?
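A saving like this can be measured by compressing two versions of a page and comparing sizes. The sketch below uses a made-up sample document rather than Google’s actual markup, so its numbers will not match the 6 bytes quoted above. It assumes the zlib extension is available for gzencode().

```php
<?php
// Sketch: compare raw and gzipped sizes with and without optional tags.
// The markup is a made-up sample, not Google's page; requires the zlib extension.

// Full version: includes <html>, </html> and </body>.
$withTags = "<!doctype html><html><head><title>Search</title></head>"
          . "<body><p>Hello, web.</p></body></html>";

// Trimmed version: omits <html>, </html> and </body>, keeps <head> and <body>.
$withoutTags = "<!doctype html><head><title>Search</title></head>"
             . "<body><p>Hello, web.</p>";

function sizes(string $markup): array
{
    return [
        'raw'  => strlen($markup),
        'gzip' => strlen(gzencode($markup, 9)),   // level 9 = maximum compression
    ];
}

$full    = sizes($withTags);
$trimmed = sizes($withoutTags);

printf("raw saving:  %d bytes\n", $full['raw'] - $trimmed['raw']);
printf("gzip saving: %d bytes\n", $full['gzip'] - $trimmed['gzip']);
```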

Let’s take a look at an often overlooked source of wasted bandwidth: the HTTP headers. Here are the headers from google.com:

Cache-Control: private, max-age=0
Date: Sun, 28 Jun 2009 04:49:01 GMT
Expires: -1
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Server: gws
Content-Length: 3275
X-Cache: MISS from .
Via: 1.0 .:80 (squid)
Connection: keep-alive

242 Bytes. Can we reduce this?

(Note: Google omits a few of these headers when you request HTTP 1.0 instead of 1.1. Also, this page on Google does not have X-Cache or Via headers, so it’s not my ISP or router.)

  • Cache-Control: private, max-age=0: In theory, Expires: -1 should be enough. But there may be an odd proxy program that publicly caches the page. We’ll keep it.
  • Date: Sun, 28 Jun 2009 05:18:12 GMT: Shouldn’t be needed if it’s not supposed to be cached.
  • Server: gws: No need to tell the world about a custom server that no one else can use, or is there?
  • Content-Length: 3275 : The only use for this is to show a progress bar. We’ll keep it though.
  • X-Cache: MISS from . : Not needed.
  • Via: 1.0 .:80 (squid) : Not needed.

With those removed, the response headers become:

Cache-Control: private, max-age=0
Expires: -1
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Content-Length: 3275
Connection: keep-alive

151 Bytes. A reduction of 91 bytes (37.6%) in headers sent. Note that, unlike the HTML body, HTTP headers are not compressed.
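The byte counts above can be reproduced with a few lines of PHP. This sketch assumes the figures were taken by joining the header lines with a single newline, which is the assumption that yields 242 and 151. The variable and function names are illustrative.

```php
<?php
// Sketch: reproduce the byte counts above by joining header lines with "\n".
// On the wire, HTTP/1.1 ends each line with "\r\n", so real totals are slightly higher.

$original = [
    'Cache-Control: private, max-age=0',
    'Date: Sun, 28 Jun 2009 04:49:01 GMT',
    'Expires: -1',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Encoding: gzip',
    'Server: gws',
    'Content-Length: 3275',
    'X-Cache: MISS from .',
    'Via: 1.0 .:80 (squid)',
    'Connection: keep-alive',
];

// Headers the article proposes to drop.
$dropped = ['Date', 'Server', 'X-Cache', 'Via'];

$reduced = array_values(array_filter($original, function (string $line) use ($dropped): bool {
    $name = strstr($line, ':', true);   // header name = text before the first colon
    return !in_array($name, $dropped, true);
}));

function headerBytes(array $lines): int
{
    return strlen(implode("\n", $lines));
}

$before = headerBytes($original);   // 242
$after  = headerBytes($reduced);    // 151

printf("%d -> %d bytes, saving %d (%.1f%%)\n",
    $before, $after, $before - $after, 100 * ($before - $after) / $before);
// prints "242 -> 151 bytes, saving 91 (37.6%)"
```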

Conclusion

It appears that the doubt surrounding Google’s new guide is not unfounded. Several errors and half-truths were found.

Google’s use of optional HTML tags could save some bandwidth, but for pages like google.com it doesn’t save that much compared to what could have been saved with the headers.

Update
It appears that the PHP team has also verified some of the same findings.

22 Comments

  1. orip says:

    The Squid headers, including X-Cache, may be on your end (or your ISP’s), not Google’s.

    • admin says:

      Here are my headers for another website.

      Server: nginx/0.8.2
      Date: Sun, 28 Jun 2009 17:31:55 GMT
      Content-Type: text/html
      Last-Modified: Mon, 08 Jun 2009 08:28:13 GMT
      Transfer-Encoding: chunked
      Connection: keep-alive
      Content-Encoding: gzip

    • K says:

      Response Headers:
      Cache-Control[private, max-age=0]
      Date[Mon, 29 Jun 2009 07:00:22 GMT]
      Expires[-1]
      Content-Type[text/html; charset=UTF-8]
      Content-Encoding[gzip]
      Server[gws]
      Content-Length[3364]
      X-Cache[MISS from .]
      Via[1.0 .:80 (squid)]
      Connection[keep-alive]

      I got them for Google. I’m not behind any proxies and my ISP doesn’t add any. BTW, the request was HTTP 1.1

  2. Somebody says:

    I checked, and Google doesn’t send me Content-Length, X-Cache, Via or Connection. Those are added on your side.

    What is with the endless spam errors? It seems impossible to get a post through here.

    • admin says:

      I found the problem. You guys are using HTTP 1.0 requests instead of 1.1.

      That probably also explains why you are getting spam errors. You’re probably running an old or limited browser.

  3. Paolo Bonzini says:

    Regarding single-quotes, you should try permuting the tests. Example:

    http://codexon.codepad.org/54L3miwN (your code)

    0.027257204055786
    0.097667932510376
    0.079727172851562
    0.016388893127441
    0.014074087142944 — fastest: double quote with $ escaped

    http://codexon.codepad.org/vYj282cF (reversing the order of the test; output reversed to match the previous order):

    0.013306856155396 — fastest: single quote with $
    0.12155199050903
    0.016154050827026
    0.015068054199219
    0.066812038421631

    • admin says:

      That’s very interesting. PHP is a very odd beast.

      • sapphirecat says:

        In my own testing, I’ve discovered that PHP takes some time to warm up. If I run 10 sets of iterations worth 5 seconds each, there’s an obvious trend downward in time. The first one may take 10s (or more), followed by 7.5s and 6s, before settling down to ~5s each for the final seven sets. These numbers are for the sake of argument, but the effect is always visible, whether a set takes 10s or 30s or whatever. The first one is always significantly longer.

        I suggest Zed Shaw’s rant on benchmarking, http://www.zedshaw.com/essays/programmer_stats.html, which is what changed my own methods and caused me to discover this behavior.

  4. Paolo Bonzini says:

    Also, the Server header is mandatory. Omitting it would violate the RFC.

  5. Concerning if-elseif statements versus switch-case, McCabe’s Cyclomatic Complexity should be considered. Aside from performance, the testability of the code is also important. A switch statement will always have a complexity of 1, that is, very testable. An if statement starts off with a higher complexity and quickly grows to untestable with each consecutive elseif statement.

    • atomi says:

      oh lol hey Doug. Fancy seeing you here. BTW I agree with op. Have a great day buddy!

  6. John WOods says:

    I dunno dude, one day Google is going to rule the world!

    Russ
    http://www.complete-privacy.tk

  7. Angelo says:

    I really can’t think of a reason why Mr. Evil even came up with those stupid solutions.

    There is no alternative to raw IF statements. WTF is google thinking?

    PHP has come far into making these kinds of things not be a big issue (like you explain copy-on-write and many others).

    Google should just keep working on world domination and not worry about how we build applications.

  8. Mike says:

    oh man, google is going to be pissed now :) Expect your site to plunge in SE results :) )

  9. JurgenHennig says:

    I’m sorry but you are just plain wrong about the double variables. The suggestion was to not store the results of computations. Look at Google’s examples:

    $description = strip_tags($_POST['description']);
    echo $description;

    Here the variable $_POST['description'] is duplicated because it is processed by strip_tags(). The result persists after the echo. vs.

    echo strip_tags($_POST['description']);

    Here the result of the strip_tags() function is discarded after the echo.

  10. tedivm says:

    I was with you right up until you got to the HTTP headers sections. A few of the headers you said were optional are actually required (date, content-length), not just for caching.

    If you’re up for an interesting read, http://www.ietf.org/rfc/rfc2616.txt

    • admin says:

      I actually read that before I wrote this article.

      Date is actually not “required” for servers that don’t have a reliable clock. Following this rule seems pedantic to me.

      Content-length is unnecessary beyond progress bars and is in fact, neglected in case of streaming because you don’t know the potential size of the file. Also if you set the content-length too small, many browsers will cut off the transfer.

      • Content-Length is required when using keep-alives (multiple requests over the same TCP connection). Without it, the browser has no way to know when the current response ends and another one starts.

        “In order to remain persistent, all messages on the connection MUST have a self-defined message length (i.e., one not defined by closure of the connection)”. RFC 2616 section 8.1.2.1.

  11. maht says:

    Netscape 4 requires </td> and </tr> or else it stops processing.

  12. maht says:

    Netscape 4 requires you close td and tr or else it stops processing and you get a whole lot of grey

  13. Marlin says:

    These are general tips too, every “tip” can be different depending on the case involved. Different scenarios yield different results.

    Always do your own tests in your own environment.

    http://www.dealpi.com
