HTTP notes
| This article/section is a stub — probably a pile of half-sorted notes and assertions some of which may well be wrong, and not verified as a whole. Feel free to add or refine. |
Extent of response and connection, persistent and pipelining connections
Persistent connections, a.k.a. keepalive: allows a series of request/response pairs on the same connection. Avoids having more TCP connections and more latency for the case where you want many objects transferred from the same server. (Often disconnected after one or a few minutes)
Pipelining: persistent connections without the requirement for things to happen strictly in series, meaning many requests can be sent without waiting for each response (often ending up in the same TCP packet, even) and many responses can be received at once (in the same order as requested).
Pipelining is not negotiated. Clients may simply try to pipeline requests on a persistent connection, but should be able to detect failure (usually because the client assumed a persistent connection when it was not), and be prepared to make a new connection and not try to pipeline on that. There are some further limitations. (See e.g. RFC 2616, section 8.1.2.2).
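As a rough illustration of what "sending many requests at once" means on the wire, the sketch below only builds the bytes a pipelining client would write in one go; it does not open a connection. The function name, host, and paths are made up for the example.

```python
def build_pipelined_requests(host, paths):
    """Return the bytes for several GET requests written back to back.

    A pipelining client writes all of these before reading any response,
    then reads the responses in the same order.
    """
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"  # empty line ends the header block; GET has no body
        )
    return "".join(requests).encode("ascii")


# Hypothetical host and paths, for illustration only.
payload = build_pipelined_requests("example.com", ["/a.css", "/b.js"])
```

A real client would also have to notice when the server closes the connection early and then retry the unanswered requests on a new connection, without pipelining.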
There are a number of HTTP 1.1 servers and HTTP 1.1 proxies that have limited or broken support for pipelining (or, in the case of some proxies, even for persistent connections), so some browsers selectively enable pipelining, and some disable it outright.
In HTTP 1.0, persistent connections should not be assumed, only used after specific negotiation. Unless they negotiated it, HTTP 1.0 clients simply read until end of stream; a closed connection marked the end of the response and its body. With persistent connections and pipelining (HTTP 1.1), that approach won't work. Since persistent connections are the default in HTTP 1.1, servers are expected to support them and may assume a persistent connection unless talking to an HTTP 1.0 client (verify), or to an HTTP 1.1 client that has signalled it does not support or want one by including a Connection: close header in a request (which is also the only way the protocol itself signals disconnection; in general, both HTTP 1.1 server and client can choose to disconnect between requests).
HTTP 1.0 can negotiate persistent connections using:
Connection: keep-alive
If the server sends back this same header in the response, it signals that it will leave the connection open, i.e. that this is a persistent connection.
HTTP 1.1 servers are expected to close the connection after sending a response whose end the client cannot determine. (Which depends mostly on presence/absence of a Content-Length header in the response, the request's HTTP version, its request headers, and the response code being sent)
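The rules above can be sketched as a small decision function. This is a simplification under stated assumptions, not what any particular server does: the function name and parameters are invented here, and headers are assumed to be passed as a dict with lowercase names.

```python
def keep_connection_open(request_version, request_headers, response_is_self_delimiting):
    """Decide whether a server may leave the connection open after a response.

    request_version: (major, minor) tuple, e.g. (1, 1)
    request_headers: dict with lowercase header names (an assumption of this sketch)
    response_is_self_delimiting: True if the client can tell where the response ends
    """
    if not response_is_self_delimiting:
        # Closing the connection is the only way left to mark the end.
        return False
    connection = request_headers.get("connection", "")
    tokens = {token.strip().lower() for token in connection.split(",")}
    if "close" in tokens:
        return False
    if request_version >= (1, 1):
        # Persistent is the default in HTTP 1.1.
        return True
    # HTTP 1.0: persistent only when explicitly asked for.
    return "keep-alive" in tokens
```

For example, `keep_connection_open((1, 0), {"connection": "keep-alive"}, True)` is true, while the same call without the header is false.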
As to Content-Length
(See e.g. RFC 2616, section 4.4)
When you have multiple transfers on a connection, transfers must somehow exactly imply their own length or their end.
The (slightly over-)simplified summary is that for responses that may carry a body (exceptions include 1xx, 204, 304 responses, among others) this means a Content-Length should be set.
Use of Transfer-encoding: chunked (which all HTTP 1.1 servers are required to understand) will also work, but does require both ends to be HTTP 1.1, so arguably this shouldn't be the only thing that you rely on.
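A simplified sketch of how a client might work out where a response body ends, loosely following the order in RFC 2616 section 4.4. The function name and the returned labels are invented for this example, headers are assumed to be a dict with lowercase names, and cases such as multipart/byteranges are left out.

```python
def response_body_framing(status, headers, request_method="GET"):
    """Return a label for how the end of the response body is determined."""
    # Some responses never carry a body, whatever the headers say.
    if request_method == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return "no-body"
    codings = [c.strip().lower()
               for c in headers.get("transfer-encoding", "").split(",")]
    if "chunked" in codings:
        # Chunked framing takes precedence over any Content-Length.
        return "chunked"
    if "content-length" in headers:
        return "content-length"
    # Nothing delimits the body: the server has to close the connection.
    return "until-close"
```

Only the last case prevents the connection from being reused.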
For most static content it is (resource-)trivial to calculate Content-Length, so this all mostly matters for dynamic pages, in which you may have reason to not set it, such as laziness or the wish to start streaming out data without buffering all of it, whenever that means you can't know Content-Length at the time of sending headers.
An upside of such streaming, and of immediately flushing into the connection, is that it often causes browsers to render incrementally, which means pages that take long to build completely will be partly viewable before they are finished (as opposed to the server waiting for all data to be ready before it can set the length header and send any data at all). Such a response ties up its connection for as long as it takes anyway, so not being able to reuse that connection immediately is rarely a real loss.
Note that this is not necessarily a problem, as browsers generally open two connections to a server; the second can be used to pipeline fetching external resources.
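Transfer-Encoding: chunked (see RFC 2616, section 3.6.1) means that the body will be sent as a sequence of chunks, each consisting of:
- the chunk length as a hex string, followed by CRLF
- that amount of data
- CRLF
...which is terminated with a zero-length chunk ("0<cr><lf>"), optionally followed by trailer headers, and a final CRLF.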
There are no real constraints on the chunk sizes(verify).
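A minimal sketch of the chunked format, with an encoder and a matching decoder. The function names are invented for this example; the decoder assumes the whole body is already in memory, skips chunk extensions, and ignores any trailer headers.

```python
def encode_chunked(pieces):
    """Encode an iterable of bytes objects as a chunked body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would signal the end of the body
        out += b"%x\r\n" % len(piece)  # chunk length in hex, then CRLF
        out += piece + b"\r\n"         # chunk data, then CRLF
    out += b"0\r\n\r\n"                # last chunk, no trailer headers
    return bytes(out)


def decode_chunked(data):
    """Decode a complete chunked body back into the original bytes."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; skip it.
        size = int(data[pos:line_end].split(b";", 1)[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the data and the CRLF that follows it
    return bytes(body)


encoded = encode_chunked([b"Hello, ", b"world"])
decoded = decode_chunked(encoded)
```

Here `encoded` is `b"7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n"` and `decoded` is the original `b"Hello, world"`.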
HTTP 1.1 implementations are required to understand the chunked transfer-coding. Of course, a sender cannot use chunked towards an HTTP/1.0 recipient.
A server can opt to delay certain headers (e.g. Content-MD5) until the trailer after the chunked body. There are a number of restrictions on this (for example, Content-Length itself may not be sent in the trailer). See e.g. RFC 2616, section 14.40.
Note that a client may close a connection while the server is not done, such as a user reloading the page while it was still loading; various servers will log this as an error (in Apache these seem to be logged as 500 errors) while no one is actually bothered; the user will not see an error.
See also:
HTTP redirect
What
The common HTTP statuses (because they are the only ones in HTTP/1.0(verify)) are:
- 302 (MOVED_TEMPORARILY, "Found"), meaning "fetch this content over there for now, but in the future come back to the URL you just used."
- 301 (MOVED_PERMANENTLY, "Moved Permanently"), meaning "it has moved, and in the future always go where I'm pointing."
Both require the new URL in a return header called Location (verify). Location should contain an absolute URL, although browsers may choose to work with relative ones as well.
A regular browser will end up at the new URL in either case. The larger difference is in how spiders react.
Other codes that can be used since HTTP/1.1(verify):
- TEMPORARY_REDIRECT (307): useful for pages that use POST, since it instructs the browser to re-POST the POSTed data (while 301 and 302 seem to imply GET)(verify)
- SEE_OTHER (303): like 307, but forcing a GET instead. Primarily useful for scripts that may/always take POST requests to redirect to a basic URL / GET. (while 302/301 are relatively method-agnostic)(verify)
Note that older user agents may understand only 301/302 and not 303/307.
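The difference between these codes comes down to which method the client uses for the follow-up request. The sketch below summarizes that; the function name is invented here, and the 301/302 branch describes common browser behaviour (turning a POST into a GET), not what the status codes were originally specified to mean.

```python
def method_after_redirect(status, method):
    """Return the HTTP method a typical client uses when following a redirect."""
    if status == 303:
        # See Other: always fetch the target with GET (HEAD stays HEAD).
        return "HEAD" if method == "HEAD" else "GET"
    if status == 307:
        # Temporary Redirect: repeat the request with the same method and body.
        return method
    if status in (301, 302):
        # Assumption of this sketch: mimic browsers, which change POST to GET.
        return "GET" if method == "POST" else method
    raise ValueError(f"not a redirect status handled here: {status}")
```

So a script that accepts a POST and wants the browser to land on a plain page would answer 303, while one that wants the POST repeated elsewhere would answer 307.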
What and when
The 301 probably has the most practical uses, including:
- Redirecting between domains and sites (e.g. from example.com to www.example.com)
- moving domain
- effectively moving the pagerank from an old to a new location (not sure this is as true as some people make it sound, but google itself suggests using 301 over 302)
- site reorganisation (that is, using URL aliasing to keep old URLs non-broken), e.g. into a consistent or human-readable URL scheme, to a CMS, or some such thing
Uses for 302 include:
- redirect services, such as tinyURL
- Temporarily redirecting to backup content while restoring main content. (Though in practice backups are often out of date enough to make a "We're working on restoring the site" notice more practical)
Practice
The common example is to redirect the site root (/) to a new URL, but in practice this can be one of many locations you want to redirect, in which case you probably want to offload these individual redirects to the web server.
If you are moving sites and want to redirect the main url, you can easily use a line in .htaccess. If you want all URLs to be mapped (with a simple mapping), such as in a domain name move, or for mass URL aliasing, you can use something like mod_rewrite. See also pages like [1].
You may also be served by a script that fetches the new URL from a database.
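A minimal sketch of such a script as a WSGI application. The dict stands in for the database lookup, and the paths and target URLs are made up for the example.

```python
# Hypothetical mapping; in practice this would be a database lookup.
REDIRECTS = {
    "/old-page": ("301 Moved Permanently", "http://www.example.com/new-page"),
    "/maintenance": ("302 Found", "http://www.example.com/backup"),
}


def app(environ, start_response):
    """WSGI app that answers known paths with a redirect, anything else with 404."""
    path = environ.get("PATH_INFO", "/")
    entry = REDIRECTS.get(path)
    if entry is None:
        body = b"Not found\n"
        start_response("404 Not Found", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    status, location = entry
    # Absolute URL in Location; an empty body with an explicit length
    # keeps the connection reusable.
    start_response(status, [("Location", location), ("Content-Length", "0")])
    return [b""]
```

This can be tried locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library.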
See also rel="canonical"
See also
- RFC2616, section 10 (HTTP 1.1 Status Code Definitions)
- ".htaccess, 301 Redirects & SEO"
- "301 Vs 302 redirect"
Some header notes
Location
A response header mostly used in (external) redirects (and also in a few other places, such as 201 Created to refer to the URL that was created)
See also:
Content-Location
An optional response header to signal
See also
- http://www2.research.att.com/~bala/papers/h0vh1.html ( 'Key Differences between HTTP/1.0 and HTTP/1.1' )

