Cookie notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, or tell me)

Introduction

Cookies give a website some memory on your computer.

Mechanically:

A server can ask a visiting browser to remember a piece of text
On later visits, the client will send back that same text.


The browser sends back exactly that text, verbatim and nothing more, which is why this isn't a security issue in itself: the server can only hear back what it originally set.

Consider giving someone a secret word to repeat back later. The word itself can be entirely meaningless, but the fact they use it will let you know it's them.

This is what cookies end up being used for ninety percent of the time.



Why remember things?

You may want websites to remember who you are.

Websites may want to remember who you are.



Logins could barely work without them. The best you could do is, after you authenticate successfully once, send some sort of (verifiable) token meaning "yes they successfully authenticated before" in the URL of every single request. If you squint, this is mechanically almost the same as cookies (just in a different place in the request), except that it's awkward: it will fall out of some link somewhere, and then you're no longer logged in.

That would be a lot of complaining and/or helpdesk calls. From this perspective, cookies are just the "do it for all requests to the same hosts for the next X time" convenience.


Other purposes include identifying returning users regardless of login

...which can be used e.g. to remember shopping basket state from before you logged in
or to remember other state within a session, for a person who hasn't logged into an account, or for a site without accounts
Can be done anonymously, e.g. by storing a number that is randomly generated and itself meaningless, but remembered for a while


Plus an extension of the previous point: tracking user visits, usage, or the order in which pages are accessed within a site, for example

doing so requires identifying returning users, and cookies happen to be the most convenient way of doing so.
anonymously recording the set of pages that a specific browser visits on a site (or within the domain in which the cookie is set)
otherwise maintaining specific information about users (often relatively anonymously unless you choose to identify yourself to a site/domain that does this, or happen to be identifiable another way)


Many uses of cookies are some variant of this "yup, this is still me" use, but there are others. It depends mostly on what is stored.

Mechanism and syntax

Cookies are just text.


Their values and related metadata are sent in a HTTP header.

A cookie set consists of

  • a set of semicolon-separated predefined parts
  • most of which are optional (will default to certain behaviour when omitted)
  • most of which take the form of attr=value, a few of which take no value

...and actual values to store in the cookie, in the form name=value, which is actually the only required part.


For example, a server may have a response containing:

Set-Cookie: foo=bar; Expires=Mon, 09-Dec-2002 13:46:00 GMT; Secure

A UA may then later send a request containing:

Cookie: foo=bar

If multiple cookies apply, they are merged into one Cookie header, separated by semicolons (RFC 6265 forbids multiple Cookie header fields[1]).
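
As a rough illustration of the two header formats above, here is a minimal Python sketch that splits them by hand. The function names are made up for this example, and it is a simplification rather than a full RFC 6265 parser (it ignores quoting and malformed input).

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes).

    Simplified sketch: the first semicolon-separated part is name=value,
    the rest are attributes. Flags without a value (like Secure) map to True.
    """
    parts = [part.strip() for part in header_value.split(";")]
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        if not part:
            continue
        key, has_value, attr_value = part.partition("=")
        # Attribute names are case-insensitive, so normalise to lowercase.
        attributes[key.strip().lower()] = attr_value.strip() if has_value else True
    return name, value, attributes


def parse_cookie_header(header_value):
    """Split a request's Cookie value into a {name: value} dict."""
    cookies = {}
    for part in header_value.split(";"):
        name, has_value, value = part.strip().partition("=")
        if has_value:
            cookies[name] = value
    return cookies


name, value, attributes = parse_set_cookie(
    "foo=bar; Expires=Mon, 09-Dec-2002 13:46:00 GMT; Secure"
)
print(name, value, attributes)
print(parse_cookie_header("foo=bar; theme=dark"))
```

Note that only the name=value pair travels back in the Cookie header; the attributes stay with the browser and only steer when it sends the cookie.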


Scripting?

Originally, it was only the HTTP response that could ask for cookie sets.

Assuming the browser agrees to do so, it will then send that exact information whenever it returns to that same domain/server/site/application.

Later, scripting added some extra abilities, which meant that scripting on the same page that served the cookie could alter it.

This does not change who can interact with this information - unless people made the security mistake of allowing anyone to add code to their page.






Standards and the real world
  • The first standard was Set-Cookie:, the basics of which were standardized in RFC 2109 (from 1997)
  • Set-Cookie2 was written to extend that, standardized in RFC 2965 (from 2000), with the idea that it would replace Set-Cookie
but it never really caught on, and was deprecated in 2011 by RFC 6265.
Modern browsers do not support Set-Cookie2.
  • RFC 6265 is basically an update to Set-Cookie:
  • There have been various things used in the real world since the RFC 2109 spec that became widely supported.
Some of them made it into RFC 6265, some of them are just common enough that you'd want to support them.


Name and value

Required.

Basically just name=value

Notes:

  • attribute names are case-insensitive
  • the value should be escaped so that it will not clash with parsing -- see RFC 2616's definition of quoted-string
  • if a particular name appears multiple times in a set-cookie, only the first should be used
  • Names starting with $ are reserved and not to be used by applications

You can set a variable name to contain a new value to make the browser overwrite it.

Note that set-cookies with new values only overwrite values for a name when the old and new Domain and Path values are also equal.

Note that an Expires in the past and/or a Max-Age of 0 will cause a cookie to be discarded regardless of value (a common way to delete a cookie).


Expires

Optional, but common because a lot of uses want persistent cookies, not session cookies.


Cookies without an Expires=

are removed by the UA when the UA closes.
Sometimes called session cookies - in the per-run-of-the-browser sense, not to be confused with cookies that support login sessions.


Cookies with an Expires=

should persist between different runs of the browser
until the given expiration
..or until the UA removes the cookie for other reasons
These are sometimes called persistent cookies.
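
To make the session/persistent difference concrete, here is a small Python sketch that builds both kinds of Set-Cookie value. The helper name and its parameters are invented for this example. It writes the date in the space-separated form that email.utils.format_datetime produces, rather than the hyphenated form in the earlier example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_set_cookie(name, value, lifetime=None, now=None):
    """Return a Set-Cookie value.

    lifetime=None  -> no Expires, so a session cookie
    lifetime=delta -> Expires set to now + delta, so a persistent cookie
    """
    header = f"{name}={value}"
    if lifetime is not None:
        if now is None:
            now = datetime.now(timezone.utc)
        # usegmt=True requires a UTC-aware datetime and emits a 'GMT' suffix.
        header += "; Expires=" + format_datetime(now + lifetime, usegmt=True)
    return header


print(build_set_cookie("foo", "bar"))                      # session cookie
print(build_set_cookie("foo", "bar", timedelta(days=30)))  # persistent cookie
```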


Domain

Optional.


If omitted, the UA will decide to send the cookie for the hostname that set it, excluding subdomains.

If set, the UA will decide to send the cookie for the requested domain (unless refused for some reason), including subdomains.


For example,

if domain was omitted, and set-cookie was sent from app1.example.org, cookies will be sent back on visits to app1.example.org, and not example.org
if domain was omitted, and set-cookie was sent from example.org, cookies will be sent back on visits to example.org and not app1.example.org
if domain=example.org, cookies will be sent back on visits to example.org, app1.example.org, and any other subdomain of example.org
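
The three cases above can be sketched as a small Python function. The function name is hypothetical, and this only models the "which hosts get the cookie back" decision; it does not model whether a browser would accept the Domain value in the first place.

```python
def cookie_applies_to_host(request_host, set_by_host, domain_attr=None):
    """Decide whether a cookie is sent to request_host.

    set_by_host: the host whose response carried the Set-Cookie
    domain_attr: the Domain= value, or None if it was omitted
    """
    request_host = request_host.lower()
    if domain_attr is None:
        # No Domain: only the exact host that set it, no subdomains.
        return request_host == set_by_host.lower()
    # Domain given: that domain itself, plus anything under it.
    domain = domain_attr.lower().lstrip(".")
    return request_host == domain or request_host.endswith("." + domain)


print(cookie_applies_to_host("example.org", "app1.example.org"))
print(cookie_applies_to_host("app1.example.org", "example.org"))
print(cookie_applies_to_host("app1.example.org", "example.org", "example.org"))
```

The check compares against "." + domain rather than doing a plain suffix match, so that a host like badexample.org does not count as being under example.org.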


If you meant it as a site-wide login, you might want to send Domain=example.org so that it will be sent to example.org and anything under it, e.g. app2.example.org


Notes

  • can be a feature, e.g. for site-wide logins
  • can be a security risk. Consider e.g. the case where app2.example.org is hosted by someone else.
  • You can only send one value
so you can't craft a list of specific allowed hosts
  • that value will only be accepted if the host that requests it is part of that domain
  • browsers may have further restrictions, e.g. most will refuse 'Domain=org' - see supercookies[2]

Path

Optional.


You can ask that a cookie be sent for all requests that are, directory-logic-wise, under a specific path.


If omitted, "the user agent will use the 'directory' of the request-uri's path component as the default value." (basically the request path up to the rightmost /)

Setting this is often done to widen that.

For example, app1.example.org/log/me/in may want to set Path=/ when it sets a login cookie


Matching will be

  • from the start
e.g. Path=/docs will not match /my/docs
  • full directory name (up to the next slash or end of the string), not substring
e.g. Path=/docs will match /docs and /docs/
e.g. Path=/docs will not match /docsets
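
The matching rules above, plus the default used when Path is omitted, can be sketched in Python as follows. The function names are made up here, and the logic follows my reading of the path rules in RFC 6265, so treat it as an illustration rather than a reference implementation.

```python
def default_path(request_path):
    """Default cookie path: the request path up to (not including) the rightmost '/'."""
    if not request_path.startswith("/") or request_path.count("/") == 1:
        return "/"
    return request_path[: request_path.rfind("/")]


def path_matches(cookie_path, request_path):
    """True if a cookie with this Path should be sent for request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Match whole directory names only: the prefix must end at a slash.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(default_path("/log/me/in"))        # /log/me
print(path_matches("/docs", "/docs/"))   # True
print(path_matches("/docs", "/docsets")) # False
print(path_matches("/docs", "/my/docs")) # False
```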


Note that when you have a reverse proxy in front of your app, the path (also host, and potentially domain) for the application may not be the one the browser sees, which can lead the browser to reject or just not send a cookie.

Having such a proxy rewrite cookies is possible, but not always easy to do well.

Secure

Optional.

Is a flag, takes no value.


This is the server requesting that the UA only send the cookie when doing HTTPS requests to the originating server, and not in HTTP requests.

This should make it more resistant to snooping and certain man-in-the-middle attacks.

HttpOnly

Optional.

Is a flag, takes no value.

Supported since 2010 or so. [3][4]


Requests that the browser sends this cookie back on requests to the domain (including XHR/Fetch requests(verify)), but does not expose its value to the page's browser-side scripting.


This lets you make a hard split between

cookies that don't need to be read out by scripting (like login tokens),
cookies that you specifically want to use from scripting (e.g. remembering parts of UI state).


The idea being even if your page has XSS (Cross-Site Scripting) issues, inserted scripting cannot read out or alter that cookie.

However, HttpOnly was only ever meant as a useful mitigation, never as a secure solution.

While XSS cannot read/steal the cookie, there are still certain flaws.

  • XSS may in certain cases still effectively replace/overwrite the cookie's value (but not read it)(verify). Consider e.g. an attack that creates many new cookies when the browser has a per-domain limit: this can flush out the oldest cookie, after which a new cookie with the same name can take its place.
  • CSRF
  • XST
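
For illustration, Python's standard http.cookies module can produce a Set-Cookie value carrying the Secure and HttpOnly flags. The cookie name and value below are placeholders, not anything a real application would use.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "opaque-token"    # placeholder name and value
cookie["session"]["path"] = "/"
cookie["session"]["secure"] = True    # only send over HTTPS
cookie["session"]["httponly"] = True  # hide from browser-side scripting

# OutputString() gives just the header value, without the "Set-Cookie:" prefix.
header_value = cookie["session"].OutputString()
print("Set-Cookie:", header_value)
```

Both flags are requests to the browser; the server has no way to verify that a particular UA honours them.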

SameSite

Third-party cookies


Rejection of cookies

Rejection of invalid cookies

Incompleteness and invalidity will lead browsers to reject cookies.


According to RFC 2965, section 3.3.2, invalid cookie-store requests are those that satisfy one of the following:

  • the path in the cookie is not a prefix of the URI path of the page that set-cookie was requested from (you can't set other paths' cookies)
  • the UA request's effective host does not match the cookie's domain (this can be a potential bother when using reverse proxies, named virtual hosts and such. You often want to use request information for the cookie)


And, in practice, a number of domain/host requirements

  • often the minimum-total-dot requirement:
    • Two for .com, .net, .org, .edu, .gov, .mil, and .int (verify) (For example, there are two in .example.com, so it passes this test)
    • one for .local (verify)
    • Three otherwise (verify)
  • when the domain in the cookie implies that the host part has a dot
    • e.g. a set-cookie from www1.webservers.example.com for domain .example.com implies that the host is www1.webservers, which contains a dot so is invalid. For this example, you would want to specify the domain .webservers.example.com
  • the domain does not follow general DNS rules (made of letters, digits, and hyphens [a-z0-9.-]). Note that intranet naming does not always necessarily keep to DNS rules, e.g. by containing underscores. See also Microsoft KB 909264, RFC 952, RFC 1123.


In browsers, incompleteness or invalidity of a cookie may mean the cookie will not be set at all, or will be used but not persist. This may differ between browsers and between specific types of invalidity.


Note that in reverse proxying, the path and host for the application may not be the one the browser sees, which can (rightly) lead it to reject the cookie (or just not send it as expected). Having a proxy rewrite cookies is possible, but not always easy to do well.

Limitations

Size

Storing data directly in a cookie (rather than a token to refer to data elsewhere) is possible, but you cannot count on cookies storing more than approximately 4KB.

Browsers have rules about per-site size as well as total cookie size, in part because large cookies make every request to the host/path or application (or even domain, if you set the domain and path broadly) larger and so a little slower, particularly if there are many large cookies.

Amount

The spec gives no hard limits. In the real world you may get maybe 50-300 cookies if you're lucky, but you probably shouldn't count on getting more than 20.


Keep in mind that browsers are free to e.g. delete the least-used cookies once you reach such a per-domain limit, or some 'total' limit.

Further notes

A server can effectively remove a cookie in a few ways:

  • you can set the name to an empty value (the cookie then still exists, but no longer carries data), or
  • you can mention the name with a new expiration date - one in the past.
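
A minimal Python sketch of such a removal header follows. The function name and its defaults are assumptions for this example. Since a Set-Cookie only overwrites an existing cookie when Domain and Path are the same, the sketch takes both as parameters so they can be set to whatever the original cookie used.

```python
def build_removal_header(name, path="/", domain=None):
    """Return a Set-Cookie value asking the browser to discard a cookie.

    Sends an empty value, an Expires far in the past, and Max-Age=0.
    path and domain should equal those of the cookie being removed.
    """
    parts = [
        f"{name}=",
        "Expires=Thu, 01 Jan 1970 00:00:00 GMT",
        "Max-Age=0",
        f"Path={path}",
    ]
    if domain is not None:
        parts.append(f"Domain={domain}")
    return "; ".join(parts)


print(build_removal_header("foo"))
print(build_removal_header("login", path="/", domain="example.org"))
```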

Javascript


Flaws relating to security

Privacy

Login

When you log into a website, what it typically does is

  • use some mechanism to check the supplied credentials, once,
and in the same interaction, store some value that signifies "I recently checked your credentials" that can be actively verified (and reveals little else), here in a cookie
  • on all later requests (until the cookie and thereby the login expires), actively verify that cookie


That cookie usually stores a large randomly generated number, that is completely meaningless in itself, except that both sides remember it for a while, and on the server side it is associated with the user you logged in as.


There are a handful of basic security details, such as

make that number large, so that guessing it by trying random numbers would take thousands of years, and is therefore infeasible
not accidentally giving someone runtime control of your webpage scripting, as they could just read it off
it's a good idea to use the HttpOnly flag, which tells the browser "this cookie is just for you to send back, not for scripting to have access to at all" (by default it's both)


...but given such care, this is a pretty good system.
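
The server side of this scheme can be sketched in a few lines of Python. The class and method names are invented for this example, the in-memory dict stands in for whatever session storage a real site would use, and expiry of old sessions is left out to keep it short.

```python
import secrets


class SessionStore:
    """Maps large random tokens to the user they were issued for."""

    def __init__(self):
        self._user_by_token = {}

    def log_in(self, username):
        """Call once, after the credentials were checked. Returns the cookie value."""
        # 32 random bytes (256 bits): meaningless in itself, infeasible to guess.
        token = secrets.token_urlsafe(32)
        self._user_by_token[token] = username
        return token

    def user_for(self, token):
        """Call on every later request. Returns the username, or None if unknown."""
        return self._user_by_token.get(token)

    def log_out(self, token):
        """Forget the token, so the cookie no longer means anything."""
        self._user_by_token.pop(token, None)


store = SessionStore()
token = store.log_in("alice")      # value to send in Set-Cookie
print(store.user_for(token))       # alice
print(store.user_for("a-guess"))   # None
```

The token would go into a cookie set with HttpOnly (and Secure, on HTTPS sites), so that only the browser's own requests carry it.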


It's also the basis of most login systems, largely because without any way to persist that "I've checked you", you would have to send your credentials on every new connection (which for a browser means at least one, and typically a few, for every page refresh).

Just a tool? Minor evil? Better than the alternative?

On cookie laws

See also

  • RFC 2109, 'HTTP State Management Mechanism' ((technically) obsoleted by...)
  • RFC 2965, 'HTTP State Management Mechanism'
  • perhaps read RFC 2964, Use of HTTP State Management