Part 2 of 3: Reducing Server Calls

In This Tutorial

  1. Introduction
  2. Empty href and src
  3. DNS Lookups
  4. Sprite Sheets
  5. Ajax Requests
  6. URL Redirects
  7. CSS/JS
  8. Cache Headers
  9. Cache Ajax
  10. Conclusion

Introduction

You have likely reached this tutorial from Part 1: Reducing File Size. This second section will cover minimizing server calls. In layman's terms, that means decreasing the number of times a client will have to connect to a server to download files. With all the bandwidth saved from the first tutorial, why are server calls important? A server can handle ten times the amount of traffic now and still use the same amount of bandwidth as before. Server calls are not entirely about bandwidth. They are mostly about the maximum number of connections a server can handle.

Think of the server as a telethon, and the clients are the callers who are trying to donate. One person calls and is speaking to an operator to make a donation. Someone else calls, and the first line is busy, so they're automatically redirected to the second line. This pattern repeats for each caller, as they are redirected to whichever phone line isn't busy. If all lines are busy, the caller gets a busy signal and is unable to donate. That's terrible news if you are on the receiving end of that donation.

A server can only communicate with so many people at a time. Thankfully, it communicates much faster than human speech and doesn't require an entire human per connection, so it can communicate with a lot of users at once. However, we're not talking about a telethon; we're talking about a website, which receives much more traffic, especially since every external file on the webpage requires another "phone call" or connection to the server. If all the lines are busy, the client simply won't get the file.

Using small numbers for easy reference, if a server can only handle ten connections at a time, and its document is composed of ten files, no two people will be able to receive the document at the same time. By reducing those file references to five, the server can deliver the page to twice as many viewers at once. Imagine this reduction applied to every webpage on a legitimate server which can handle much more than ten measly connections. Still displaying the exact same web page - every color, every image, every file - the number of visitors the server can handle at once has doubled (or better!). This greatly reduces the risk of encountering the "Slashdot effect".
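The arithmetic in the paragraph above can be sketched in a few lines of JavaScript. This is only an illustration: the function name simultaneousViewers is made up for this example, and it assumes the simplified model used here, where every file on a page occupies exactly one connection.

```javascript
// Hypothetical helper (not part of any library): how many visitors can load
// a page at the same moment, assuming each file occupies one connection.
function simultaneousViewers(maxConnections, filesPerPage) {
  if (!Number.isInteger(filesPerPage) || filesPerPage <= 0) {
    throw new RangeError("filesPerPage must be a positive integer");
  }
  return Math.floor(maxConnections / filesPerPage);
}

console.log(simultaneousViewers(10, 10)); // 1
console.log(simultaneousViewers(10, 5));  // 2
```

Halving the number of files per page doubles the result, whatever the server's actual connection limit happens to be.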

How is it possible for the client to receive the same amount of information without connecting to a server to get it? This tutorial will cover avoiding empty src and href attributes, reducing DNS lookups, using sprite sheets, using GET for Ajax requests, avoiding URL redirects, making JavaScript and CSS external, adding Expires headers (telling the client to cache the document), and making Ajax cacheable.

Avoid Empty src and href Attributes

While YSlow specifically addresses an anchor element's href value, the main concern here is the src value of an image element. The problem with an empty href value is that each browser interprets it differently. Should it lead to the current page, directory of the current page, root directory, or anywhere at all? This is more of a standards issue than a page optimization issue, but YSlow includes it nonetheless.

Just like the href attribute, the src attribute will automatically be determined by the browser if left blank. Some browsers, such as Safari or Chrome, will load the current page within the image element (which is the same as setting the src to $_SERVER['PHP_SELF']). Why would a browser do this? It's more about standards compliance than logical decisions. The reference of an empty string as a URL is interpreted by the browser to mean the current page, whether in a src attribute or any other location.

Why does this matter? If an image has an empty src attribute, it likely isn't intended to appear; and if the image source is an HTML page, it won't appear anyway. However, the browser does have to reconnect to the server to redownload the current page. Unnecessary connections to the server are what should be decreased for the reasons outlined in the introduction.
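One way to catch these before they reach a visitor is to scan generated markup for empty src and href values. The following is a rough sketch, not a real HTML parser: the function name findEmptyUrlAttributes is hypothetical, and the regular expression assumes reasonably well-formed tags with quoted attribute values.

```javascript
// Hypothetical helper: report tags in an HTML string whose src or href
// attribute is an empty string. A regex is used for brevity, so unusual
// markup (unquoted values, ">" inside attributes) is not handled.
function findEmptyUrlAttributes(html) {
  const pattern = /<([a-z][a-z0-9]*)\b[^>]*?\s(src|href)\s*=\s*(?:""|'')[^>]*>/gi;
  const found = [];
  for (const match of html.matchAll(pattern)) {
    found.push({
      tag: match[1].toLowerCase(),
      attribute: match[2].toLowerCase(),
    });
  }
  return found;
}

const sample = '<img src="" alt="x"><a href="">y</a><img src="logo.png">';
console.log(findEmptyUrlAttributes(sample));
```

Run against the sample string, the helper reports the first image and the anchor, and ignores the image that has a real source.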

Reduce DNS Lookups

Reducing DNS lookups is a unique item on this list in that it doesn't fall into the category of decreasing connections to the website server. It is still a connection to a server though, and thus takes time for the user to connect to the DNS server in order to perform the DNS lookup.

A DNS lookup is how the browser converts a domain name (such as www.charlesstover.com) to an IP address (such as 127.0.0.1) in order to connect to the server, similar to how a person uses the phone book to convert a name to a phone number. DNS lookups may also provide additional information, but that is irrelevant to optimization.

It is easy to forget that DNS lookups don't only occur when typing a domain into your address bar. They can occur whenever a file within a web page is requested. Browsers are smart enough to cache the IP address of a domain for the session. This means whenever displaying an image from the same domain as the web page, the browser doesn't have to do another DNS lookup to determine the IP address of the server on which the image is hosted. If, however, a webpage were to display an image from an external website, the browser would have to do a DNS lookup for both domains.

This connection to a DNS server, combined with the sending and receiving of the data (domain and IP address), takes time. When the client has to convert multiple domains to IP addresses on a page, it can take a noticeable amount of time. Consider a page with twenty external files, each from a separate host (such as a social media widget, external image host, Google Analytics script, ad network, etc.). At approximately fifty milliseconds per DNS lookup, that adds up to an entire second of absolutely nothing happening on the page! That's not including the time it takes to download the file content itself. Users will not understand that it is the fault of DNS lookups; to them, it's this dumb website that's slow. This is why reducing DNS lookups is an important step in web page optimization.
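To get a feel for how many lookups a page triggers, the distinct hostnames among its resources can be counted. The sketch below is a rough estimate only: the function name estimateDnsCost and the example.com URLs are invented for illustration, the fifty-millisecond figure is the approximation used above, and the total assumes uncached lookups performed one after another.

```javascript
// Rough per-lookup cost taken from the tutorial's estimate.
const DNS_LOOKUP_MS = 50;

// Hypothetical helper: count the distinct hostnames a page references and
// estimate the time spent on DNS, assuming uncached, sequential lookups.
function estimateDnsCost(pageUrl, resourceUrls) {
  const hosts = new Set([new URL(pageUrl).hostname]);
  for (const resource of resourceUrls) {
    // Relative URLs resolve against the page, so they add no new host.
    hosts.add(new URL(resource, pageUrl).hostname);
  }
  return {
    hosts: [...hosts],
    lookups: hosts.size,
    milliseconds: hosts.size * DNS_LOOKUP_MS,
  };
}

const cost = estimateDnsCost("https://www.example.com/", [
  "/style.css",
  "https://static.example.com/logo.png",
  "https://static.example.com/app.js",
  "https://cdn.example.net/widget.js",
]);
console.log(cost.lookups, cost.milliseconds); // 3 150
```

Four resources, but only three hosts: the relative stylesheet shares the page's host and the two static files share theirs, which is exactly the saving this section is after.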

Reduce the number of domains (this includes subdomains) that are referenced to decrease the page loading times for clients. A simple mistake some people may make is using a subdomain for each file type (e.g. css.domain.com, img.domain.com, js.domain.com). Each subdomain requires its own DNS lookup. A webmaster would be much better off simply using static.domain.com for all of their static stylesheets, images, and scripts. The reason static files get their own domain is discussed in part three of this tutorial.

If an external file host must be used due to space or bandwidth restrictions, it should be restricted to as few hosts as possible. Do not switch between one image host and another just because one is more convenient at the time.

Use Sprite Sheets

Images are the fastest way to use up your server's available connections. Some people go overboard on images: pre-CSS3 rounded borders, graphical borders and horizontal rules, rotating banners, social media icons, author pictures, background images, footer and header styles, advertisements, and then maybe some images that are actually a part of the content that the client reads. Images are undoubtedly a part of modern web design, and are something a website can't live without; and nobody's advocating for a website without images!

The proposed solution is to download multiple images without connecting to the server multiple times. This is accomplished by sending them all in one connection. As an example, this is demonstrated using social media icons. Instead of using a separate image for a Facebook fan page link, your Steam Community link, and Twitter account link, a "sprite sheet" would contain all three in a single image:

[Image: social network sprite sheet]

That sprite sheet alone is of no use. The user can't click separate parts of it, and there need to be three different links; and what if they aren't to be displayed right next to each other? That's the beauty of sprite sheets. There are multiple methods of implementing a sprite sheet (such as an image's clip property), but the most common and easiest method, using a background image, is the one demonstrated here.

The final goal for your output is to appear as such:

  • Become a fan on Facebook:
  • Befriend on Steam:
  • Follow on Twitter:

Typically, this would take three images. With sprite sheets and the magic of background images, it only takes one (the one shown above).

/* set the attributes that are the same for all three images */
main a.sprite-icon {
	background-image: url('sprites.png');
	display: inline-block;
	height: 16px;
	padding: 0;
	width: 16px;
}

/* the Facebook icon is located at the top left of the image */
main a.sprite-facebook {
	background-position: 0 0;
}

/* the Steam icon is located 16px from the left of the image */
main a.sprite-steam {
	background-position: -16px 0;
}

/* the Twitter icon is located 32 pixels from the left of the image */
main a.sprite-twitter {
	background-position: -32px 0;
}

<ul>
	<li>
		Become a fan on Facebook: <a class="sprite-icon sprite-facebook" href="http://www.facebook.com/pages/Charles-Stover/143062495713195" rel="nofollow" target="_blank" title="Facebook | Charles Stover"></a>
	</li>
	<li>
		Befriend on Steam: <a class="sprite-icon sprite-steam" href="http://www.steamcommunity.com/id/gamechief" rel="nofollow" target="_blank" title="Steam Community :: ID :: Charles Stover"></a>
	</li>
	<li>
		Follow on Twitter: <a class="sprite-icon sprite-twitter" href="http://twitter.com/charlesstover" rel="nofollow" target="_blank" title="Charles Stover (charlesstover) on Twitter"></a>
	</li>
</ul>

By using the background-position property of an element, the sprite sheet background is shifted to the image we want displayed. Using height and width, the rest of the sprite sheet is cropped out. Although not shown in the example, the sprite sheet can be shifted vertically using the second value in background-position, allowing images to be stacked horizontally or vertically. Play Tetris with images to fit as many as possible into as small a space as possible. Check out Google's sprite sheet to see this principle in action.

It should be noted that this is not necessary for every image on the server. This should only be done if most of the images are displayed on the single page on which the sprite sheet is loaded.
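For larger sheets, the background-position offsets can be computed rather than typed by hand. The sketch below assumes a sheet laid out as a grid of equally sized square icons; the function name spritePosition and the grid layout are assumptions made for this example, not something the sheet above requires.

```javascript
// Hypothetical helper: compute the background-position value for the icon
// at a given column and row of a grid-shaped sprite sheet.
function spritePosition(column, row, iconSize = 16) {
  // Offsets are negative because the background is shifted left and up
  // so that the wanted icon lands in the element's top-left corner.
  const format = (index) => (index === 0 ? "0" : `-${index * iconSize}px`);
  return `${format(column)} ${format(row)}`;
}

console.log(spritePosition(0, 0)); // 0 0        (Facebook in the example sheet)
console.log(spritePosition(1, 0)); // -16px 0    (Steam)
console.log(spritePosition(2, 0)); // -32px 0    (Twitter)
console.log(spritePosition(1, 2)); // -16px -32px
```

The first three results match the hand-written rules in the stylesheet earlier in this section; the last shows the vertical offset in use.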

Use GET for Ajax Requests

Whenever an Ajax request (XMLHttpRequest) is used, the client obviously makes a connection to the server. A little-known fact is that browsers typically send a POST request in two steps: the headers first, then the content. A GET request is sent in a single step, as all the data travels together in the URL and headers. If only a small amount of data is being sent (no more than a couple of kilobytes, since browsers and servers limit URL length), formatting it as a GET request may be more beneficial than a POST request.
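The decision can be made automatically when the request is built. In this sketch the function name buildAjaxRequest and the 2000-character limit are assumptions for illustration (the limit is a deliberately conservative guess, not a figure from this tutorial), and the approach only makes sense for requests that read data rather than change it.

```javascript
// Assumed conservative ceiling on URL length; adjust for your own audience.
const MAX_GET_URL_LENGTH = 2000;

// Hypothetical helper: prefer GET when the encoded data fits in the URL,
// and fall back to POST with a form-encoded body when it does not.
function buildAjaxRequest(endpoint, params) {
  const query = new URLSearchParams(params).toString();
  const getUrl = query ? `${endpoint}?${query}` : endpoint;
  if (getUrl.length <= MAX_GET_URL_LENGTH) {
    return { method: "GET", url: getUrl, body: null };
  }
  return { method: "POST", url: endpoint, body: query };
}

console.log(buildAjaxRequest("contact.php", { list: "admins" }));
// { method: 'GET', url: 'contact.php?list=admins', body: null }
```

The returned object can be handed to XMLHttpRequest's open and send methods, or to fetch, whichever the page already uses.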

Avoid URL Redirects

URL redirects, to be blunt, make two connections instead of one, on top of screwing up the user's Back button. Someone may use a URL redirect for a page that changes often or to handle multiple actions through a single file, among other reasons.

For the former, say there exists a Featured Page of the Day feature. This link should exist on every page of the website, but no one would want to have to update every page of the website every single day to update the feature. The solution? Link to daily-feature.php and update the single file:

// today's featured article
header('Location: /articles/reducing-server-calls');

Unfortunately, the user has to connect to the server to receive the redirection in the aforementioned file. The user then makes a connection to the server to receive the featured page of the day. That's two connections for one page. Some websites, for whatever reason, will redirect their users multiple times before finally spitting them out at the intended destination. This should be avoided, and there are a few ways to do so.

Using Dynamic Links

Displaying a link directly to the featured page is ideal over linking to daily-feature.php on every page. As an example in PHP, here's a before and after of this hypothetical redirection scenario:

<a title="Featured Page of the Day" href="daily-feature.php">Featured Redirect of the Day</a>
<a title="Featured Page of the Day" href="<?php echo file_get_contents('daily-feature.txt'); ?>">Featured Direct Link of the Day</a>

By updating daily-feature.txt, the link will then update on all pages while allowing the webmaster the convenience of only having to update one file. Of course, alternatively, the link can be queried from a database or retrieved from any other location. The desired end result is merely to link directly to the page instead of redirecting to it.

Using Dynamic Files

Some programmers may use one file to quickly and easily retrieve information on a lot of other files. It's not too uncommon to see something along the lines of get.php?file=123. Once accessed, get.php will redirect you to the actual file. This is more useful than the aforementioned text file because it can handle multiple redirects within a single PHP file, instead of dedicating a TXT file to each link; but there are alternatives to even the almighty get.php, without resorting to hundreds of text files.

One alternative is to run the get.php algorithm within the link itself:

<a href="<?php
// instead of get.php?file=1
include 'array-of-files.php';
echo $files[1];
?>">my file</a>

What if the situation isn't that easy? The URLs are stored in a database, and it's too resource-intensive to run a query every time the link is displayed. Assuming the files are stored on the same server as get.php, the file can just be displayed instead of redirected. This has negative SEO connotations, but this is a micro-optimization tutorial, not an SEO tutorial.

include 'convenient-mysql-file.php';
$url = convenient_mysql_function($_GET['id']);

// Wrong! Redirections are for squares.
// header('Location: ' . $url);

// Right! Display the file without making another server call.
echo file_get_contents($url);

Voilà. With the convenience of a database to store the files, a redirection is not needed to display them.

Make CSS/JS External

While it may seem contradictory to the idea of minimizing file requests to take inline stylesheets and scripts and make them external (thus requiring another server call), there are actually two reasons to do this. Both revolve around saving bandwidth; but since external JavaScript and CSS have to do with cache and server calls, the topic fits better into this tutorial as a precursor to Add Expires Headers (Cache) than it does into parts one or three.

The first reason to make JavaScript and CSS external is that browsers with CSS turned off or with JavaScript disabled won't waste bandwidth or download time by downloading external stylesheets or scripts. On these minority occasions, it's a win-win situation.

The second reason, which is the most important and most relevant, is cache. The browser cannot cache just part of a single document, thus cannot cache inline stylesheets or scripts. It can, however, cache entire documents, such as external stylesheets and scripts. By taking heavily-used scripts and storing them in their own file, combined with the Expires headers outlined in the next section, the browser won't have to re-download that file each time it is used; the browser will simply use the local cached copy.

Add Expires Headers (Cache)

The most important aspect of all web optimization is the cache! While it is ultimately up to the user's browser to determine whether or not it wants to cache a file, servers have the option to suggest a cache duration, and most browsers listen to the recommendation.

As mentioned under Reduce DNS Lookups, a subdomain should be used solely for static files - files that don't change depending on if the user is logged in, don't undergo periodic updates, etc. This will come into play again in part three of this tutorial, but for now this mass movement of files to their own subdomain will make it easier to keep track of them and quickly set all of their Expires headers.

An Expires header is a header sent with the file by the server that tells how long the client should cache the file. It is, as it implies, an expiration date for when the cache will become invalid. It is good practice to format your files in such a way that they will never become invalid. It's impossible to foresee problems that will arise in the future with your scripts, stylesheets, and images; but there are workarounds.

Perhaps you have a simple JavaScript file that updates content:

var updateStatus = function(author, content) {
	document.getElementById(author.replace(/\s/g, "_") + "-status").appendChild(
		document.createTextNode(content)
	);
};
updateStatus("Charles Stover", "is writing a tutorial.");

The arguments passed to updateStatus are going to change on a fairly regular basis, even though the function itself won't, so the user can't just cache this file permanently. The static part can still be cached by splitting the original into two pieces: an external file containing the function, and an inline script containing the call.

<script src="update-status.js" type="text/javascript"></script>
<script type="text/javascript">
updateStatus("Charles Stover", "is writing a tutorial.");
</script>

Separating unchanging (or rarely changing) content from dynamic content allows caching to be harnessed to decrease bandwidth and server calls.

But what if the function or stylesheet updates every now and then, if there's a bug fix or minor update to the template? Is it still classified as static? That's a good question. The answer is yes - it is static and should be cached, and there are two things that can be done to prevent the conflicts of a browser using an outdated cache.

The first option is to give the updated file a new name, e.g. update-status-v2.js. The client won't have a cache of update-status-v2.js and will thus download the updated file. The problem some have with this is that references to the file must be updated to the new URL.

This leads to the second option, which requires a bit more tweaking and the ability to use mod_rewrite. When separating your static (updateStatus function) from your dynamic (updateStatus call), reference the static file using its filemtime:

<script src="update-status/<?php echo filemtime('update-status.js'); ?>.js" type="text/javascript"></script>
<script type="text/javascript">
updateStatus("Charles Stover", "is writing a tutorial.");
</script>

The URL is now update-status/timestamp-of-last-modification.js. The only problem is that that file doesn't actually exist. To mod_rewrite!

RewriteEngine On

# If the client is trying to access a file that doesn't exist,
RewriteCond %{REQUEST_FILENAME} !-f

# And it's in the format of some-name/numeric-value.extension,
# Send them the some-name.extension file.
RewriteRule ^(.+?)\/(\d+)\.(.+?)$ $1.$3 [L,NC,QSA]

By simply uploading an updated update-status.js file, the PHP will automatically reference it by a new file name, the client won't have a cache of the new file name and will attempt to download it, and the server will send them the new update-status.js file! Beautiful. Placing an .htaccess file with the above mod_rewrite contents into a static subdomain gets it halfway to helping clients cache documents.
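The round trip from real file name to versioned URL and back can be sketched in JavaScript to check that the two halves agree. Both function names here are hypothetical, and the modification time is passed in as a plain number rather than read from disk; the second function mirrors the some-name/numeric-value.extension pattern that the rewrite rule is described as handling.

```javascript
// Hypothetical helper: turn "update-status.js" plus a modification time
// into "update-status/<timestamp>.js", as the PHP snippet does.
function versionedUrl(path, modifiedSeconds) {
  const dot = path.lastIndexOf(".");
  if (dot === -1) throw new Error("path needs a file extension");
  return `${path.slice(0, dot)}/${modifiedSeconds}${path.slice(dot)}`;
}

// Hypothetical helper: map a versioned URL back to the real file,
// imitating what the server-side rewrite is meant to do.
function resolveVersionedUrl(url) {
  const match = /^(.+?)\/(\d+)\.(.+?)$/.exec(url);
  // Group 1 is the name, group 2 the timestamp, group 3 the extension.
  return match ? `${match[1]}.${match[3]}` : url;
}

const url = versionedUrl("update-status.js", 1700000000);
console.log(url);                      // update-status/1700000000.js
console.log(resolveVersionedUrl(url)); // update-status.js
```

Note that the timestamp is thrown away on the way back: its only job is to make the URL differ from the one the browser has cached.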

Only the most important part remains: actually recommending a cache length. Given the above method of file name reference, there is no reason to cache a document for any length less than permanently. Unfortunately, there is no standard for permanent cache. The common length, and often browser limit, to which a cache can be set is one or two years. You can do this two ways - via .htaccess or via PHP.

To add expiration via .htaccess, just add this piece of code to the same .htaccess in your static subdomain:

ExpiresActive On
ExpiresDefault "access plus 1 year"

To add expiration via PHP, for any static file that's displayed via a dynamic file (e.g. get.php?file=1):

header('Cache-Control: max-age=31536000, public');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 31536000) . ' GMT');
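For readers serving files from JavaScript rather than PHP, the same two headers can be produced as follows. This is a minimal sketch: the function name cacheHeaders is made up for this example, and it only builds the header values, leaving it to whatever server code you use to actually send them.

```javascript
// One year in seconds, matching the 31536000 used in the PHP example.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Hypothetical helper: build Cache-Control and Expires header values.
// "now" is a parameter so the result can be checked with a fixed date.
function cacheHeaders(maxAgeSeconds = ONE_YEAR_SECONDS, now = new Date()) {
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    "Cache-Control": `max-age=${maxAgeSeconds}, public`,
    // toUTCString() produces the HTTP date format, e.g.
    // "Fri, 01 Jan 1971 00:00:00 GMT".
    "Expires": expires.toUTCString(),
  };
}

console.log(cacheHeaders(ONE_YEAR_SECONDS, new Date(0)));
```

Starting from the Unix epoch, one year later lands on Friday, 1 January 1971, which makes a convenient fixed value to test against.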

By caching static content, clients will no longer have to request them from the server. The majority of server calls are cut, thus so is the bandwidth, and page loading times are reduced immensely. Since the client will have static files stored locally, the client won't have to redownload them; they will display almost instantaneously.

This is likely the largest and most important step a web programmer can take in increasing page loading speeds.

Make Ajax Cacheable

Just like other files, Ajax is cacheable. Dynamic files can have static results (such as get.php?file=1). While Ajax requests may go through a dynamic file, their content is not always so dynamic. No matter how many times the user accesses contact.php?list=admins, it will display the same thing. There is no reason for the user to connect to the server to get this content a second time when the browser could simply cache it after the first. You should be well equipped by now with the knowledge necessary to apply the same rules for files to your Ajax content as well. YSlow presumably only adds a check for cacheable Ajax content as a reminder.

Conclusion

You should no longer be making unnecessary DNS lookups, making a server connection for each image on your webpage, or redirecting between pages unnecessarily. Most importantly, you should be sending appropriate cache headers! This reduction in server connections will improve the speed at which your website loads and displays, a metric often taken for granted.

To get into the final nitty gritty, Part 3: Reducing Parse Time will deal with handling information after it has loaded.