Google recently published a website called Let’s make the web faster.
The site is aimed mostly at newbies, and a few of its tips made people cringe despite carrying Google’s Seal of Approval. We will look at some of these optimizations and see if they really help.
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. – Donald Knuth
PHP
Let’s first take a look at their PHP tips.
Don’t copy variables for no reason.
Sometimes PHP novices attempt to make their code “cleaner” by copying predefined variables to variables with shorter names before working with them. What this actually results in is doubled memory consumption, and therefore, slow scripts. In the following example, imagine if a malicious user had inserted 512KB worth of characters into a textarea field. This would result in 1MB of memory being used!
PHP will use the same string for multiple variables until you attempt to change it. This behavior is called “copy on write”.
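The copy-on-write behavior can be observed with a rough sketch like the one below. It assumes a 512KB string built with str_repeat() as a stand-in for the textarea input in Google's scenario; the variable names are made up for illustration, and the exact byte counts depend on the PHP version and allocator.

```php
<?php
// Hypothetical stand-in for a large form field (512KB of characters).
$input = str_repeat('x', 512 * 1024);

$usageStart = memory_get_usage();

// Assigning to a second variable does not duplicate the string yet.
$copy = $input;
$usageAfterAssign = memory_get_usage();

// Modifying the copy forces PHP to give it its own string.
$copy .= '!';
$usageAfterWrite = memory_get_usage();

printf("after assignment: +%d bytes\n", $usageAfterAssign - $usageStart);
printf("after write:      +%d bytes\n", $usageAfterWrite - $usageAfterAssign);
```

The first figure should be close to zero and the second roughly the size of the string, which is the point of the rebuttal: the extra memory is only paid when the copy is changed.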
Use single-quotes for long strings.
The PHP engine allows both single-quotes and double-quotes for string variable encapsulation, but there are differences! Using double-quotes for strings tells the PHP engine to read the string contents and look for variables, and to replace them with their values. On long strings which contain no variables, that can result in poor performance.
Now this might make sense. But … Google and common sense are not always right!
The Benchmark – Testing 1000 repetitions in creation of a string
- Single Quote with $'s : 1431 µs
- Double Quote with $'s : 3120 µs
- Single Quote without $'s : 437 µs
- Double Quote without $'s : 374 µs
- Double Quote with $'s escaped : 285 µs
Surprised? I was too. In fact, I tried it on 2 different machines and with different repetitions to make sure I wasn’t seeing things.
The results were the same. The lesson here is that single quotes are only faster when you have $ signs. If you escape the $’s in double quoted strings, they are the fastest!
If you don’t believe me, you can try it here online with Codepad, an online script evaluator.
Update
This test is being re-examined because rearranging the statements seems to change the speed a lot more than whether the string is double or single quoted. The current conclusion is that they are approximately the same speed.
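One way to reduce the ordering effect is to shuffle the variants on every round, discard a warm-up round, and keep the best time per variant. The following is only a sketch under those assumptions: the function names, the sample strings, and the round counts are invented for illustration and are not the original Codepad test.

```php
<?php
// Time $reps calls of $fn and return the elapsed microseconds.
function timeVariant(callable $fn, int $reps): float
{
    $start = microtime(true);
    for ($i = 0; $i < $reps; $i++) {
        $fn();
    }
    return (microtime(true) - $start) * 1e6;
}

// Run every variant in a fresh random order each round and keep the
// fastest time seen. Round 0 is treated as warm-up and discarded.
function benchmark(array $variants, int $reps = 1000, int $rounds = 10): array
{
    $best = array_fill_keys(array_keys($variants), INF);
    for ($round = 0; $round <= $rounds; $round++) {
        $names = array_keys($variants);
        shuffle($names);
        foreach ($names as $name) {
            $elapsed = timeVariant($variants[$name], $reps);
            if ($round === 0) {
                continue; // warm-up round, not recorded
            }
            $best[$name] = min($best[$name], $elapsed);
        }
    }
    return $best;
}

// Hypothetical sample strings; all three "$" variants yield the same text.
$variants = [
    'single quote with $'         => function () { return 'price: $ 100'; },
    'double quote with $'         => function () { return "price: $ 100"; },
    'double quote with $ escaped' => function () { return "price: \$ 100"; },
    'single quote without $'      => function () { return 'price: 100'; },
];

$results = benchmark($variants);
foreach ($results as $name => $microseconds) {
    printf("%-30s %8.1f µs\n", $name, $microseconds);
}
```

If the differences between variants shrink or change sign from run to run, that supports treating the quoting style as noise rather than a real optimization.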
Use switch/case instead of if/else.
Using switch/case statements rather than loose-typed if/else statements when testing against a single variable results in better performance, readability, and maintainability. It’s important to note that using switch/case does a loose-comparison, and should be taken into consideration when being used.
For this myth, I’d like to introduce you to this excellent website phpbench.com. It benchmarks many common PHP myths.
PHPBench’s results of 1000 switches vs if/else
- if and elseif (using ==) : 171 µs
- switch / case : 174 µs
- if and elseif (using ===) : 105 µs
Here we see Google’s statements are off the mark. If statements and switches are essentially the same. Using “===”, or exact equality, in the if statement (which is not possible in a switch) is even about 60% faster (105 µs versus 171 µs)!
I think they were probably a victim of thinking that if it applies to C, it applies to dynamic languages.
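The loose comparison done by switch/case, as opposed to “===”, can be illustrated with a small sketch. The two function names are hypothetical; the example relies on PHP comparing two numeric strings as numbers under loose comparison.

```php
<?php
// switch compares loosely (like ==), so two numeric strings that
// represent the same number match each other.
function classifyWithSwitch(string $value): string
{
    switch ($value) {
        case '10':
            return 'ten';
        default:
            return 'other';
    }
}

// The same check with strict equality only matches the exact string.
function classifyWithStrictIf(string $value): string
{
    if ($value === '10') {
        return 'ten';
    }
    return 'other';
}

echo classifyWithSwitch('1e1'), "\n";   // ten   ('1e1' == '10' as numbers)
echo classifyWithStrictIf('1e1'), "\n"; // other ('1e1' !== '10')
```

So besides the timing difference, the two forms are not interchangeable when the input may contain numeric-looking strings.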
HTML
Let’s take a look at their suggestions for HTML.
I commend them for suggesting optional tags. This can really help sometimes, but I was always afraid to do it in case some bad browser chokes because of leaving them out. (Anyone know for sure?)
Update
According to maht from the comments, leaving out </td> and </tr> will result in errors for Netscape 4.
What baffles me is Google’s inconsistency. If you go to google.com, you will see that they leave out <html>. This is expected. However, for some reason they include both <head> and </head>, which are optional, as well as the <body> tag, which is also optional.
This is strange because in their privacy center, they leave all of that out.
Google only saves 6 bytes from leaving out the <html> </html> and </body> tags after gzip. Is this really worth saving 6 bytes, or is this another micro-optimization?
Let’s take a look at an often overlooked source of bandwidth wasted: the HTTP headers. Here are the headers from google.com
Cache-Control: private, max-age=0
Date: Sun, 28 Jun 2009 04:49:01 GMT
Expires: -1
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Server: gws
Content-Length: 3275
X-Cache: MISS from .
Via: 1.0 .:80 (squid)
Connection: keep-alive
242 Bytes. Can we reduce this?
(Note: Google omits a few of these headers when you request HTTP 1.0 instead of 1.1. Also, this page on Google does not have X-Cache or Via headers, so it’s not my ISP or router.)
- Cache-Control: private, max-age=0: In theory, Expires: -1 should be enough. But there may be an odd proxy program that publicly caches the page. We’ll keep it.
- Date: Sun, 28 Jun 2009 05:18:12 GMT: Shouldn’t be needed if it’s not supposed to be cached.
- Server: gws: There seems little need to advertise a custom server that no one else can use, or is there?
- Content-Length: 3275 : The only use for this is to show a progress bar. We’ll keep it though.
- X-Cache: MISS from . : Not needed.
- Via: 1.0 .:80 (squid) : Not needed.
Cache-Control: private, max-age=0
Expires: -1
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Content-Length: 3275
Connection: keep-alive
151 Bytes. A reduction of 91 bytes (37.6%) in headers sent. Note that HTTP headers, unlike the HTML body, are not compressed.
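The byte counts can be reproduced with a short sketch. It assumes the header lines exactly as listed above, joined with a single newline between lines, which is what matches the 242 and 151 figures; on the wire each line ends with CRLF, so the real totals would be slightly higher. The function and variable names are made up for illustration.

```php
<?php
// Header lines as listed in the article.
$originalHeaders = [
    'Cache-Control: private, max-age=0',
    'Date: Sun, 28 Jun 2009 04:49:01 GMT',
    'Expires: -1',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Encoding: gzip',
    'Server: gws',
    'Content-Length: 3275',
    'X-Cache: MISS from .',
    'Via: 1.0 .:80 (squid)',
    'Connection: keep-alive',
];

// Header names the article proposes dropping.
$droppedNames = ['Date', 'Server', 'X-Cache', 'Via'];

// Size of the header block with one newline byte between lines (assumption).
function headerBytes(array $lines): int
{
    return strlen(implode("\n", $lines));
}

// Keep only the lines whose header name is not in the drop list.
$trimmedHeaders = array_values(array_filter(
    $originalHeaders,
    function (string $line) use ($droppedNames): bool {
        $name = strstr($line, ':', true); // text before the first colon
        return !in_array($name, $droppedNames, true);
    }
));

$bytesBefore = headerBytes($originalHeaders);
$bytesAfter  = headerBytes($trimmedHeaders);
$bytesSaved  = $bytesBefore - $bytesAfter;

printf(
    "%d -> %d bytes, saved %d (%.1f%%)\n",
    $bytesBefore,
    $bytesAfter,
    $bytesSaved,
    100 * $bytesSaved / $bytesBefore
);
// prints: 242 -> 151 bytes, saved 91 (37.6%)
```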
Conclusion
It appears that the doubt surrounding Google’s new guide is not unfounded. Several errors and half-truths were found.
Google’s use of optional HTML tags could save some bandwidth, but for pages like google.com it doesn’t save that much compared to what could have been saved with the headers.
Update
It appears that the PHP team has also verified some of the same findings.
The Squid headers, including X-Cache, may be on your end (or your ISP’s), not Google’s.
Here are my headers for another website.
Server: nginx/0.8.2
Date: Sun, 28 Jun 2009 17:31:55 GMT
Content-Type: text/html
Last-Modified: Mon, 08 Jun 2009 08:28:13 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip
Response Headers:
Cache-Control[private, max-age=0]
Date[Mon, 29 Jun 2009 07:00:22 GMT]
Expires[-1]
Content-Type[text/html; charset=UTF-8]
Content-Encoding[gzip]
Server[gws]
Content-Length[3364]
X-Cache[MISS from .]
Via[1.0 .:80 (squid)]
Connection[keep-alive]
I got them for Google. I’m not behind any proxies and my ISP doesn’t add any. BTW, the request was HTTP 1.1
I checked, and Google doesn’t send me Content-Length, X-Cache, Via or Connection. Those are added on your side.
What is with the endless spam errors? It seems impossible to get a post through here.
I found the problem. You guys are using HTTP1.0 requests instead of 1.1.
That probably also explains why you are getting spam errors. You’re probably running an old or limited browser.
Regarding single-quotes, you should try permuting the tests. Example:
http://codexon.codepad.org/54L3miwN (your code)
0.027257204055786
0.097667932510376
0.079727172851562
0.016388893127441
0.014074087142944 — fastest: double quote with $ escaped
http://codexon.codepad.org/vYj282cF (reversing the order of the test; output reversed to match the previous order):
0.013306856155396 — fastest: single quote with $
0.12155199050903
0.016154050827026
0.015068054199219
0.066812038421631
That’s very interesting. PHP is a very odd beast.
In my own testing, I’ve discovered that PHP takes some time to warm up. If I run 10 sets of iterations worth 5 seconds each, there’s an obvious trend downward in time. The first one may take 10s (or more), followed by 7.5s and 6s, before settling down to ~5s each for the final seven sets. These numbers are for the sake of argument, but the effect is always visible, whether a set takes 10s or 30s or whatever. The first one is always significantly longer.
I suggest Zed Shaw’s rant on benchmarking, http://www.zedshaw.com/essays/programmer_stats.html, which is what changed my own methods and caused me to discover this behavior.
Also, the Server header is mandatory. Omitting it would violate the RFC.
According to RFC 2616, I don’t see any section where it says the Server header is mandatory.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html
Concerning if-elseif statements versus switch-case, McCabe’s Cyclomatic Complexity should be considered. Aside from performance, the testability of the code should be important. A switch statement will always have a complexity of 1, that is, very testable. An if statement starts off with a higher complexity and quickly grows to untestable with each consecutive elseif statement.
oh lol hey Doug. Fancy seeing you here. BTW I agree with op. Have a great day buddy!
I dunno dude, one day Google is going to rule the world!
Russ
http://www.complete-privacy.tk
I really can’t think of a reason why Mr. Evil even came up with those stupid solutions.
There is no alternative to raw IF statements. WTF is google thinking?
PHP has come a long way in making these kinds of things not be a big issue (like the copy-on-write you explain, and many others).
Google should just keep working on world domination and not worry about how we build applications.
oh man, google is going to be pissed now
Expect your site to plunge in SE results
I’m sorry but you are just plain wrong about the double variables. The suggestion was to not store the results of computations. Look at Google’s examples:
$description = strip_tags($_POST['description']);
echo $description;
Here the variable $_POST['description'] is duplicated because it is processed by strip_tags(). The result persists after the echo. vs.
echo strip_tags($_POST['description']);
Here the result of the strip_tags() function is discarded after the echo.
I was with you right up until you got to the HTTP headers sections. A few of the headers you said were optional are actually required (date, content-length), not just for caching.
If you’re up for an interesting read, http://www.ietf.org/rfc/rfc2616.txt
I actually read that before I wrote this article.
Date is actually not “required” for servers that don’t have a reliable clock. Following this rule seems pedantic to me.
Content-Length is unnecessary beyond progress bars and is, in fact, omitted when streaming, because you don’t know the final size of the file. Also, if you set the Content-Length too small, many browsers will cut off the transfer.
Content-Length is required when using keep-alives (multiple requests over the same TCP connection). Without it, the browser has no way to know when the current response ends and another one starts.
“In order to remain persistent, all messages on the connection MUST have a self-defined message length (i.e., one not defined by closure of the connection)”. RFC 2616 section 8.1.2.1.
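To make the keep-alive argument concrete, here is a rough sketch of how a client could separate two responses that arrive back to back on one connection by reading Content-Length. The function name and the sample responses are invented for illustration, and real clients also have to handle chunked transfer encoding, which this sketch ignores.

```php
<?php
// Split a buffer of back-to-back HTTP responses into their bodies,
// using each response's Content-Length to find where it ends.
function splitResponses(string $stream): array
{
    $bodies = [];
    $offset = 0;
    while ($offset < strlen($stream)) {
        // The header block ends at the first blank line.
        $headerEnd = strpos($stream, "\r\n\r\n", $offset);
        if ($headerEnd === false) {
            break;
        }
        $head = substr($stream, $offset, $headerEnd - $offset);
        if (!preg_match('/^Content-Length:\s*(\d+)\s*$/mi', $head, $match)) {
            break; // without a length, the boundary cannot be determined here
        }
        $length    = (int) $match[1];
        $bodyStart = $headerEnd + 4;
        $bodies[]  = substr($stream, $bodyStart, $length);
        $offset    = $bodyStart + $length; // the next response starts here
    }
    return $bodies;
}

// Two hypothetical responses sent over the same connection.
$stream = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
        . "HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye";

print_r(splitResponses($stream));
```

Without the length, the sketch stops at the first response, which mirrors the point made above: on a persistent connection the client needs some way other than connection close to know where a message ends.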
Netscape 4 requires </td> and </tr> or else it stops processing.
Netscape 4 requires you close td and tr or else it stops processing and you get a whole lot of grey
These are general tips too, every “tip” can be different depending on the case involved. Different scenarios yield different results.
Always do your own tests in your own environment.
http://www.dealpi.com