Memcached usage notes

These are primarily notes.
They won't be complete in any sense.
They exist to contain fragments of useful information.
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, or tell me)


The main command line options:

  • -d run as a daemon
  • -m 2048 use up to 2048MB of memory
the default when unspecified is 64MB, which is usually too conservative, so distros tend to set something larger already
  • -l bind to this IP address
  • -p 11211 ...and this port (11211 is also the default)

(On 32-bit machines, a single process usually cannot use more than 2GB or 3GB of memory (see 4GB of memory on a 32-bit machine). On 32-bit machines with PAE-style extensions, you can run multiple daemons to still use most of the memory.)

Some client, server, and interaction details

Client libraries typically don't break when a server drops out: they react as if all cache lookups are misses, so your app falls back to always generating things the slow way.

You would usually want to bring memcached back up at the same address; clients will then start using it again (starting from an empty cache).

A single server is mostly just a simple and fast hashmap.

When given multiple servers, clients can distribute objects among them based on a hash of the key. Some clients also give you control over that hash, so that you can ensure related objects are stored on the same server (if you do that, do it consistently on all clients).

The client effectively adds another layer of hashing, in that it chooses the server to store an object on based on the key's hash and the list of available servers.

Note that this means all clients should use the same server list and the same hashing method, or they will send the same item to different servers, duplicating items and missing each other's entries.

It helps to use the same client implementation everywhere (also because some do transparent serialization, which may not be compatible between implementations).
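As a rough illustration of that extra layer of hashing, here is a minimal sketch that picks a server by hashing the key and taking it modulo the number of servers. The function name pick_server, the choice of crc32, and the server addresses are assumptions for illustration only; real clients may use other hashes or consistent hashing, so don't expect this to match what any particular library does.

```python
import zlib


def pick_server(key, servers):
    """Pick a server for a key: hash the key, then index into the server list."""
    if not servers:
        raise ValueError("need at least one server")
    # crc32 is used because it gives the same result in every process;
    # Python's built-in hash() is randomized per process for strings.
    digest = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return servers[digest % len(servers)]


# Hypothetical addresses, purely for illustration.
servers_a = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
servers_b = list(reversed(servers_a))  # same servers, different order

keys = ["myapp:user:%d" % i for i in range(100)]
# Keys that two clients would store on different servers
# purely because their server lists are ordered differently.
disagreements = [k for k in keys
                 if pick_server(k, servers_a) != pick_server(k, servers_b)]
```

With this kind of scheme even the order of the server list matters, which is one more reason to configure all clients identically.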

Client interfaces

There are APIs/clients for most major languages (see [1], [2], [3]).

They wrap a text protocol you could also implement yourself.

Exactly what such an API takes and returns varies. Some pickle values transparently (easier), some only handle strings (more interoperable with non-Python clients), etc.
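To give an idea of what the clients wrap, here is a small sketch that builds a set and a get command in the classic text protocol and parses a get response. The helper names (build_set, build_get, parse_get_response) are made up for this example, it does no socket I/O, and it ignores details such as the extra cas field returned by gets and error responses.

```python
def build_set(key, value, flags=0, exptime=0):
    """Return the bytes of a text-protocol 'set' command; value must be bytes."""
    # Format: set <key> <flags> <exptime> <bytes>, CRLF, then the data block, CRLF
    header = "set %s %d %d %d\r\n" % (key, flags, exptime, len(value))
    return header.encode("ascii") + value + b"\r\n"


def build_get(*keys):
    """Return the bytes of a 'get' command for one or more keys."""
    return ("get " + " ".join(keys) + "\r\n").encode("ascii")


def parse_get_response(data):
    """Parse a complete 'get' response (VALUE lines ended by END) into a dict."""
    results = {}
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        line = data[pos:line_end]
        pos = line_end + 2
        if line == b"END":
            return results
        parts = line.split()
        if parts[0] != b"VALUE":
            raise ValueError("unexpected line: %r" % line)
        # Each item is announced as: VALUE <key> <flags> <bytes>
        key, length = parts[1].decode("ascii"), int(parts[3])
        results[key] = data[pos:pos + length]
        pos += length + 2  # skip the data block and its trailing CRLF


example_command = build_set("greeting", b"hello")
example_reply = parse_get_response(b"VALUE a 0 1\r\nx\r\nVALUE b 0 2\r\nyz\r\nEND\r\n")
```

Note how a single get can name several keys, and how a missing key simply does not show up in the response.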

Basic use notes



  • get: note that you can ask for multiple entries in a single request
  • gets: fetches more information, including the value needed for the cas operation (see below)

Storage commands are:

  • set: creates entry if necessary
  • replace: like set, but only stores if the key already existed
  • add: like set, but only stores if the key was not already present
  • append: append to current cached value (only if it already existed)
  • prepend: prepend to current cached value (only if it already existed)
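  • cas: 'check-and-set': store, but only if no one else has updated the entry since you fetched it (with gets)
  • incr and decr: increment and decrement a 64-bit unsigned integer
the entry must already exist, so start with e.g. set(key,'0') first
old versions (up to 1.2.6) interpreted a non-integer value as 0; newer versions return an error instead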
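The differences between these commands are easiest to see side by side. The following is a toy in-memory model, not a real client: the class name ToyCache, the return values, and the way the cas token is handed out are all assumptions made for illustration, and real client libraries differ in these details.

```python
class ToyCache:
    """In-memory stand-in that mimics the storage-command rules described above."""

    def __init__(self):
        self._data = {}      # key -> (value, cas token)
        self._next_cas = 1

    def _store(self, key, value):
        # Every successful store gives the entry a fresh cas token.
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1
        return True

    def set(self, key, value):
        return self._store(key, value)                            # always stores

    def add(self, key, value):
        return key not in self._data and self._store(key, value)  # only if absent

    def replace(self, key, value):
        return key in self._data and self._store(key, value)      # only if present

    def gets(self, key):
        return self._data.get(key, (None, None))                   # (value, cas token)

    def cas(self, key, value, token):
        # Store only if nobody has stored to this key since the token was fetched.
        if key not in self._data or self._data[key][1] != token:
            return False
        return self._store(key, value)

    def incr(self, key, delta=1):
        if key not in self._data:
            return None                                            # entry must already exist
        new_value = int(self._data[key][0]) + delta
        self._store(key, str(new_value))
        return new_value


cache = ToyCache()
cache.add("counter", "0")                  # stored, key was absent
second_add = cache.add("counter", "5")     # refused, key already exists
value, token = cache.gets("counter")
cache.set("counter", "1")                  # someone else updates the entry
stale_cas = cache.cas("counter", "9", token)   # refused, token is out of date
```

The usual cas loop is: gets, compute the new value, cas, and start over if the cas was refused.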

The expiry time argument on storage commands is interpreted as either

  • if at most 2592000 (which is 30 days in seconds): a delta from the current server time (0 means the item never expires)
  • if larger than that: Unix time (seconds since epoch)
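The rule above can be sketched as a small function. The name resolve_expiry is made up for this example, and it assumes that the 30-day boundary value itself still counts as relative and that 0 means "never expires".

```python
THIRTY_DAYS = 60 * 60 * 24 * 30  # 2592000 seconds


def resolve_expiry(exptime, server_now):
    """Return the absolute Unix time at which an item would expire, or None for never."""
    if exptime == 0:
        return None                   # assumption: 0 means no expiry
    if exptime <= THIRTY_DAYS:
        return server_now + exptime   # small values are a delta from the server's clock
    return exptime                    # large values are already a Unix timestamp


now = 1700000000  # an arbitrary "current server time" for the example
in_five_minutes = resolve_expiry(300, now)
absolute = resolve_expiry(1700003600, now)
```

One consequence: you cannot ask for "expire in 31 days" as a delta, since it would be read as a timestamp in 1970 and the item would expire immediately. Pass an absolute time for anything beyond 30 days.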

Clients may have their own automatic conversions.

Say, some translate unicode into UTF-8 transparently, others require you to do that explicitly.

Some pickle objects transparently, others require you to do that explicitly.

...and explicit conversion is a great idea when you interoperate with other languages.
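A minimal sketch of explicit conversion, assuming JSON encoded as UTF-8 is the shared format (the choice of JSON and the helper names are assumptions for this example, not something any client requires):

```python
import json


def to_cache_bytes(obj):
    """Serialize a JSON-compatible object to UTF-8 bytes that any language can read."""
    return json.dumps(obj, ensure_ascii=False, sort_keys=True).encode("utf-8")


def from_cache_bytes(data):
    """Inverse of to_cache_bytes."""
    return json.loads(data.decode("utf-8"))


user = {"id": 12, "name": "Zo\u00eb", "groups": ["admin", "staff"]}
payload = to_cache_bytes(user)       # this is what you would hand to the client's set
restored = from_cache_bytes(payload)
```

Compared to pickle this handles fewer types, but the bytes in the cache mean the same thing to a PHP or Java client.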


get_multi: fetch for many keys at once. Avoids latency overhead from doing multiple requests

flush_all: Clears the cache. Useful when developing, since you don't get to list items to clear.

Usage considerations

  • Tackle the most obvious cases first
what you want to cache often follows 90-10 patterns, in terms of cost to generate and/or frequency of fetch
You can leave this to the LRU-ness of the cache, but you may be able to avoid storing a bulk of rarely used entries in the first place.
  • avoid accidental collisions by typing your keys
People seem to like a style that tells myapp:user:12 apart from myapp:doc:12
  • when mixing languages, either
avoid collisions between their varying serializations with keys like py:user:12, php:user:12
or decide on a shared serialization
  • for speed
    • consider serialization (marshalling) costs.
    • consider network overhead - write your interaction to minimize the number of round trips (i.e. number of commands)
consider multi-key calls such as get_multi and set_multi.
this can be micro-optimization that barely matters, though


  • don't count too much on touching individual cache elements to keep them in there
badly designed setups can mean a lot of touches per page view, which unnecessarily becomes your new bottleneck
  • It can help to layer your cache a little more.
For example, fragments of pages may be more constant than the overall page ever is, but it might make sense to cache at both levels depending on cost
Some of this can also be cached in whatever end process you have, to lighten the load on memcached for things that have fairly simple/obvious/static use cases.
  • You may want to monitor memcached use statistics while developing, to see long-term patterns; you may spot some obvious mistakes this way, say, in the relative amounts of gets and sets.
  • you can't e.g. say that certain types of item should always be removed first (e.g. drop images before you drop page elements)
it is sometimes useful to set up multiple daemons to do this
  • bulk invalidation doesn't exist (because that'd be a potentially slow wildcard thing)
you can fake it in a few ways, which generally means using a version, time, or counter as part of the key, so that the entire system moves on at every increment, and letting memcached evict the now unused items as they age. (this gets called namespacing, which sounds fancier). Footnotes:
if all things move on at the same time, this may mean a lot of cache misses at the same time
it may lead to duplication if not all clients do this the same way
so consider putting a "currently applicable version number" in memcached too, so that you can tell other clients to move on to new keys (and control the delay with which they do so by having them only fetch it every so often).
either case can be annoying to debug because it relies on client-side logic
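Typed keys and version-based "bulk invalidation" can be sketched together. In this example a plain dict stands in for a memcached client, and the helper names and the exact key layout (app:kind:vN:id) are assumptions for illustration; with a real client you would use its get, set, and incr calls instead of dict access.

```python
def make_key(app, kind, item_id, version=None):
    """Build a typed key such as myapp:user:12, or myapp:user:v3:12 when versioned."""
    parts = [app, kind]
    if version is not None:
        parts.append("v%d" % version)
    parts.append(str(item_id))
    return ":".join(parts)


def current_version(cache, app, kind):
    """Read the namespace's version number from the cache itself (defaults to 1)."""
    return int(cache.get(make_key(app, kind, "version"), 1))


def invalidate_namespace(cache, app, kind):
    """Fake bulk invalidation: bump the version so older keys are never asked for again."""
    cache[make_key(app, kind, "version")] = current_version(cache, app, kind) + 1


cache = {}  # a plain dict standing in for a memcached client
old_key = make_key("myapp", "user", 12, current_version(cache, "myapp", "user"))
cache[old_key] = "alice"

invalidate_namespace(cache, "myapp", "user")
new_key = make_key("myapp", "user", 12, current_version(cache, "myapp", "user"))
hit_after_invalidation = new_key in cache   # the old entry is still stored, just unreachable
```

A caveat of keeping the version number only in the cache: if that entry is evicted or the server restarts, the version falls back to its default and old keys could become visible again, so you may want the authoritative number stored somewhere more durable.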

See also

Python memcache notes

There are a few different modules, including:


pure python
supports noreply flag, unlike various others




pure python

PyPI's memcache

unfinished, don't use[7]