General WSGI notes

WSGI (Web Server Gateway Interface) is a callback-based API defined by PEP 333 and later revised by PEP 3333.

It is an interface standardization that allows freer combination of web apps, web servers, and middleware. This is its primary grace: it made Python on the web pluggable, and for apps it does this pretty well.

Apps are fairly easy to write, though due to a somewhat unclear/underspecified spec, servers and middleware are harder to get fully correct. Fortunately you don't have to touch that side unless you really want to.

For tutorials, see e.g.

For a reference implementation of WSGI, see [1].

There is now also ASGI, for async-capable code, which you may well prefer for things like WebSockets.

Introduction by example

See also: #Code_snippets_for_a_quick_start for hosting it in a server.

Example apps

Very basic apps look something like:

def application(environ, start_response):
    start_response('200 OK', [('Content-type','text/plain')])
    return ['Hello world!\n']


# Shows your environ within WSGI
def application(environ, start_response):
    start_response('200 OK', [ ('Content-type', 'text/plain') ])
    for k in sorted(environ):
        yield '%-30s:  %r\n'%(k,environ[k])


# Some people prefer to wrap apps in a class. 
#   This is equivalent (WSGI just wants a callable)
#   and useful when you want to instantiate with some state (I never have, but hey)
class Enver(object):
    def __call__(self, environ, start_response):
        start_response('200 OK', [ ('Content-type', 'text/plain') ])
        for k in sorted(environ):
            yield '%-30s:  %r\n'%(k,environ[k])

application = Enver()

If you want to quickly serve these as a test, see #Code_snippets_for_a_quick_start

Helpers, higher levels, frameworks and such

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

There are various other styles you can write in, anything between 'a few helpers' and 'a whole framework'. There are also some Python frameworks whose setup lets them wrap-and-return things as WSGI apps, to serve them that way.

These formalisms/frameworks/helpers include:

  • falcon
  • Werkzeug
  • bobo
  • WebOb[2] (uses a Request/Response object style that you may be used to, and adds other parsing), part of paste
  • Routes, a python version of Ruby on Rails's routing
  • restish (REST-style URL dispatcher, uses WebOb)
  • CherryPy (in that its application objects can also be used as WSGI application objects)
  • repoze (Zope stuff) [3] [4]
  • Grok

...and various others, see e.g.

Note that there is overhead inherent in all convenience, which can matter particularly when comparing to minimalistic bare-WSGI setups - request rate can be pretty impressive (for a dynamic language) when you do just barely what is necessary. (The above list is very roughly ordered from less to more overhead/convenience.)

Chances are that in larger projects you would end up writing your own approximation of functionality that already exists in one of these, so you might as well choose one for projects of any real complexity.

You might also wish to write some critical site portions in bare WSGI code to get maximum per-node requests/sec throughput, write most of the HTML stuff using easier, higher-level frameworks, and count on various (implicit and explicit) caches for certain speed aspects.

This is usually not very hard since no matter what the framework, the thing you hand to the WSGI hoster is a WSGI compliant application object.

Various things can decide to act as WSGI apps. For example, Pylons is modeled around WSGI and can be used that way. CherryPy often runs its own threadpooled HTTP server, but its application objects can also act as WSGI apps. Various other frameworks support WSGI in similar ways.

More technical


The above examples are callables that conform to WSGI's basic structure:

  • environ is a dict (containing string keys and mostly string values, but often also a few objects from the WSGI host)

  • It is an application's responsibility to:
    • build the headers
    • call start_response() (with the headers and status)
    • return data in an iterable of zero or more strings
      • a list or tuple is often sensible
      • a generator (using yield) is sometimes nicer, though doing this brings in some extra footnotes
(there are a few subtle differences in the way the data leaves your application, which can have some effect on the way it is served, but you don't need to care about that unless you want to)

  • start_response is a callable that takes three positional arguments:
    • a string containing the HTTP Status-code and Reason-phrase [5], for example '200 OK'
    • headers, in a list of (name,value) tuples
    • exc_info (optional, defaults to None), used to pass exception information around. In the presence of an exception it should be a (exceptiontype, exceptionvalue, tracebackobject) tuple (which is what sys.exc_info() returns when handling an exception)
    • returns a write function -- but you should probably generally ignore that

  • You leave the lower-level serving to the server. It may do some of its own thing.

Because this is primarily a description of behaviour, there is some flexibility while still conforming - including the use of middleware. For example:

# Wrap paste's error handling around this app, so that exceptions show
# in the browser, instead of a generic '500 internal server error'
from paste.exceptions.errormiddleware import ErrorMiddleware
application = ErrorMiddleware(application, debug=True)

No form, cookie, or even URL/path parsing is provided by the server interface.

You may wish to use a third-party library for that (e.g. paste, or things like WebOb). The standard library has the pieces too, but that tends to be a little more tedious.

For example, the following code is already relatively fleshed out, using paste:

import paste.request

def application(environ, start_response):
    output = []
    response_headers = []
    status = '200 OK' # As a default, since you probably return it most of the time

    # A little parsing convenience:
    path      = environ.get('PATH_INFO','')
    reqvars   = paste.request.parse_formvars(environ, include_get_vars=True) # Note: MultiDict
    cookies   = paste.request.get_cookie_dict(environ) # ...for example

    if path.startswith('/hello'): #Gotta have one of these (apparently)
        response_headers.append( ('Content-type', 'text/plain') )
        output.append('Hello world!')

    elif path.startswith('/env'): # To give you an idea what's in the environment dict
        response_headers.append( ('Content-type', 'text/plain') )
        for k in sorted(environ):
            output.append('%-30s:  %r\n'%(k,environ[k]))
    # You could add elifs for your real handlers here, or of course do your 
    # URL dispatching differently, which on a larger scale you likely would.
    else: # Can be handy for debug, such as:
        status='404 Not Found'
        response_headers.append( ('Content-type', 'text/plain') )
        output.append('\nNo handler for this URL. Request details:\n\n')
        output.append(' Path: %r\n'%path)
        output.append(' Request/form variables: %r\n'%reqvars)
        output.append(' Cookies: %r\n'%cookies)

    #We can add this one:
    response_headers.append( ('Content-Length', str(sum(len(e) for e in output))) )

    start_response(status, response_headers)
    return output

Dispatching, mounting, and path parsing
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

A server may wish to host more than one app.

That means you'll want one root app to dispatch to others.

If your framework lets you mount apps on paths, most of what it's doing below the covers is altering the PATH_INFO and SCRIPT_NAME variables, in this case in the environ dict:

  • SCRIPT_NAME: The part of the URL's path part that was used to dispatch to the current application object -- basically the location this particular instance was placed at. (Can be empty, when the app is mounted at the root path)
  • PATH_INFO: The request path without SCRIPT_NAME -- i.e. the virtual path within the application


  • Not all leaf apps need to care about these values, but a good amount do, so dispatchers (be it your framework or your own code) must do this correctly, or will have tomatoes thrown at them later.
  • An app itself usually doesn't have to care about SCRIPT_NAME, in that many need only take information from PATH_INFO
    • ...except when reconstructing absolute paths or full URLs to themselves - but you may want to use existing tools to do that for you, to catch a bunch of special cases (e.g. special headers in reverse-proxy cases) that apply in the real world
  • On escaping of those two: (unsure so far)

Pluggability, applications, and middleware

There are two sides to the WSGI API: the side the app sees, and the side the server sees.

Because of the well-defined and no-frills nature of both sides, things can easily be WSGI applications, be gateways to WSGI applications, or both.

When they are both, they sit between written applications and actual WSGI hosts, and we usually call them middleware.

Middleware can be useful for things like logging errors or mailing them to you, transparently supporting sessions (e.g. [6]), to do simple load balancing, to do selective content gzipping, authentication, support debugging, support profiling, give "please wait" user feedback when an application is being slow[7], and various other potentially useful things.

Writing middleware
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

It is suggested that you write middleware only when you can explain clearly why middleware is the best way of doing the thing.

If it interacts with your actual app, it's not really middleware, but rather an entangled pseudo-library, and is hazardous to portability. Put it in a real library and use that.

Simple middleware is fairly easy. Compliant and robust middleware is a little harder. When writing something serious it is suggested you work wsgiref.validate into your process, on both ends.

Remember that a WSGI application is basically a callable. Middleware both takes such a callable (to be executed later, as it must act like an application itself) and is itself a callable.

It's typical to see middleware written as a class. Not necessary, but it makes it easier to hand some state, like configuration, into the middleware at instantiation time.

The bare bones would be:

class AddsNothingMiddleware(object):
    def __init__(self, app): = app

    def __call__(self, environ, start_response):
        return, start_response)

# given some application (or other middleware class instantiation) hello:
wrapped_app = AddsNothingMiddleware(hello)

This adds absolutely nothing. It passes through environ and the start_response callable, roughly the minimum of what it must do to conform to what WSGI expects.

Say you want to do something, like set a header. This implies changing what start_response does, and the specs imply that you have to wrap that function (otherwise the responsibilities of the response become very fuzzy indeed). For example:

class StupidCookieMiddleware(object):
    def __init__(self, app): = app

    def __call__(self, environ, start_response):
        def my_start_response(status, headers, exc_info=None):
            headers.append(('Set-Cookie', "name=value"))
            return start_response(status, headers, exc_info)

        return, my_start_response)

Much middleware will do one or more of:

  • take some configuration at instantiation time
  • wrap the call to the application, for example to return an error page when it raises an exception
  • wrap start_response so that we can intercept and change the headers (instead of letting the call pass through to, eventually, the WSGI server)
  • check out exc_info to do error handling (you can capture, handle, log, and/or re-raise the exception if you wish, of course)

Doing all of those takes a little knowledge and practice to do right.


  • The environment is a handy place to place data to move it around(verify). Please use keys with names that are likely to be unique, though.

Notes on...

Input headers

Input headers are copied into the environ, and normalized in the process (capitalised, underscored, more?(verify)). For example, if you add "My-Header: 1", you'll get an entry like 'HTTP_MY_HEADER': '1'

It seems it's impossible to get at the underlying data (from strict WSGI; some servers may allow you to cheat(verify)).

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


  • HTTP_* (header values, mostly as in the HTTP request. Names capitalized), e.g. HTTP_HOST,
  • SSL variables [8] (see the way apache uses these), for example HTTPS, SSL_PROTOCOL
  • wsgi.something: (see the PEP for more details)
    • wsgi.version: e.g. (1,0), referring to WSGI 1.0
    • wsgi.url_scheme: e.g. 'http' or 'https'
    • wsgi.input: input stream
    • wsgi.errors: error output stream (often ends up in a log, sometimes sys.stderr)
    • wsgi.multithread
    • wsgi.multiprocess
    • wsgi.run_once
  • non-pre-defined variables, including those set by:
    • server
      • for example paste.parsed_formvars, paste.throw_errors, paste.httpserver.thread_pool
    • libraries
    • you

Note that your own variables should be named so that they are unlikely to clash, and lowercased(verify). Also note it is often a bad idea to rely on placing things in environ - unless you know why people say that and can explain why it doesn't apply to you.

Most of these details are copied from the PEP.

The return iterable, Content-Length, streaming (and write())
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

You return an iterable, usually either

  • a list of zero or more bytestrings
  • a generator, by using yield

There are implications from
  • the WSGI spec (e.g. the server may not delay a block before reading the next),
  • HTTP,
  • Content-Length

...that determine whether the server can do
  • persistent connections
  • chunked responses

...sort of. That is, the WSGI spec leaves chunking to the server: applications must not apply hop-by-hop features like Transfer-Encoding themselves.

The short summary of take-home suggestions:

  • try to make chunks larger when easy/possible
    • ...regardless of whether you use yield or list-of-strings, and whether you are aiming at an HTTP-chunked response
    • the reason being that the server behind it has to handle each chunk separately, and
      • writing very small things to a socket is inefficient
      • any per-chunk overhead adds up to unnecessary inefficiency
      • the server cannot decide to merge things -- that would take all control away from you. So you have to do it yourself
    • when you don't care about incremental loading, a return [''.join(parts)] at the end of your handlers makes a lot of sense
    • you can write a bit of middleware-style code that aggregates things into few-kilobyte-at-least chunks
  • Calculating Content-Length allows the underlying connection to be persistent, which is good for speed
    • not everything will do it, but it's a very good habit
    • when you are using the return style (and don't start your response before you know what your response is), something like response_headers.append( ('Content-Length', str(sum(len(e) for e in output))) ) goes a long way
    • note that a length-one response (such as that produced by the join mentioned above) allows backing servers to add Content-Length when not present. Not all will, but it's valid for them to do so. (verify)
  • avoid write() unless you know more than you wanted to know about WSGI. It's hard to use correctly, and (even) when you know what you're doing it adds little over yield style

More details

Specs say that the server/gateway/middleware must not delay a block (even in the case of a list of strings): they must send one string to whatever underlies it (eventually the client connection) before requesting the next.

This makes it easier to avoid complex concurrency mechanisms (and related problems) in servers, but also has some implications for your coding.

For example, if you want output buffering, you must do it in your app (doing it in middleware is technically a violation of the spec - one you shouldn't break by design, though one you could possibly justify in some specific situations).

HTTP's rules around (absence of) the Content-Length header have implications for WSGI server behaviour.

  • It can send out the response without that header, and (have to) close the connection immediately afterwards (non-persistent connection; slower)
  • It could use a chunked response (if the request was HTTP/1.1), in which case the connection can be reused
    • this does rely on compliance with quite a bit of RFC 2616
  • If the response is a length-one list/tuple, it can itself add a Content-Length header (since it can do so without violating the one-chunk-at-a-time rule), and so the connection can be reused

If you use a generator-style iterable, the server may stream it, and will try to do so if implemented -- though details may vary(verify).

This can be handy for serving large amounts of data, or when using incremental parsing or such, to avoid tying up other resources.

Again, without a Content-Length header it may have to close the connection(verify).

On use of write()

write() is a function returned by start_response.

Some people associate the use of write() with "low resource because I'm bypassing stuff", but this is not necessarily true.

Its use should be functionally equivalent to the generator case (although it is more producer-based than consumer-based).

If the server implements chunked coding (and technically it has to if it says it's HTTP 1.1), it may back write() calls with a chunked response.

The iterable setup is more flexible - you can choose it for best throughput as well as get a streamed response with it.

On the other hand, write() complicates things slightly:

  • in an application that uses write(), you should return an empty list/tuple
  • the write() implementation must guarantee that the data was sent or buffered for transmission before it returns, so it may block
  • write() is considered a hack, and its use is discouraged
  • in the case of middleware, write() adds some extra rules: you must not use write() to transmit data that the wrapped application yielded via its iterable, and you must use write() to pass through data that the wrapped application itself sent via write()

It seems it is possible to use correctly, just more bother than it's ever worth.


On strings and encoding

WSGI is made mostly for (encoded) data transfer, and makes no hard assumptions about content (just as HTTP doesn't).

As such, all encoding/decoding must be done by the application. Strings you hand into WSGI functions or return as data must not contain unicode characters.

The strings you pass should contain only byte values (0x00-0xff). Where there is a str/unicode distinction (e.g. CPython before 3) you should use str (which is a bytestring); in Jython, IronPython, Py3k, etc., where str is unicode-based, you can use that type as long as it contains only U+00-U+FF.

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

On cleanup

If you go the list-returning way, try/finally is a simple and handy way to get cleanup code to always run - though indefinitely blocking calls can still stall the handler.

Since py2.5, you can use yield in try-finally [9].

Servers can choose to kill stalling threads.

In many cases, you can also use the interpreter's atexit.

See e.g.:

Early client connection closes; blocking resources
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

The effect that the TCP connection closing (e.g. pressing escape, a broken connection, etc.) has on execution of your handler depends on a few things, including how you return your data, on the server, and whether you (can) check.

The best-defined case is probably the one in which your handler does all of its work in one go and returns a list (of one or more strings): unless you access wsgi.input or such yourself, it is independent of the connection, and you can assume all the work (including any cleanup you need to do) happens before the result is handed to the server, which may then decide to throw it away. This is why it may be the safest option in terms of dangling file handles, connections, and such.

You generally want timeouts on all the IO work you do in a handler. If you don't, you risk the handler hanging endlessly, independently of the server's connection to the client. In the general case you can't count on the WSGI host killing off IO, particularly since even inserting a Python exception won't do anything if the blocking is in C code.

A word of warning on calling apps from apps

It may be tempting to write an app that relays to others using something like:

if path_info.startswith('/other_app'):
    import other_app
    output = other_app.application(environ,start_response)
# ...and others

Don't do that.

Yes, it can work, and sometimes you can justify a quick hack. But don't do it without realizing how you are blurring the responsibilities of the response.

  • other_app.application calls start_response (it has to, to be a valid app itself), so you cannot call it in the wrapping code (in this specific code path - you may have to in others).
  • You give up all control of the status and headers. (Also, possible use of the write() function can become even less clear).
  • The above example does not alter SCRIPT_NAME or PATH_INFO like you would expect, so applications that rely on those being set may not work properly (...when not mounted directly under the root path).

If you insist on doing this, you'll probably want to at least take back control of start_response, and probably alter SCRIPT_NAME and PATH_INFO so that the application works properly under URL mounting.

For start_response, you would wrap in a way similar to what middleware does (you are selectively being middleware), something like:

if path_info.startswith('/other_app'):
    import other_app

    def my_start_response(status, headers, exc_info=None):
        # captures the call's values
        headers.append(('Set-Cookie', "name=value")) # and you can change them if you want
        # then emits it as our own:
        return start_response(status, headers, exc_info)

    output = other_app.application(environ, my_start_response)

Hosting WSGI

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

A WSGI app's longevity (and of course the request processing overhead) depends on the type of hosting and the amount of wrapping that implies.

You probably want to host WSGI apps in moderately long-lived processes, since this lowers latency for most requests.

One way is embedding the interpreter in the web server, e.g. with mod_wsgi for apache2 (which runs your app in apache children, or proxies to a set of apache-spawned daemon processes).

A slightly more flexible way is reverse proxying to a standalone server - often a separate daemon/server that you HTTP-proxy to, in a process that won't be killed, or only after serving a whole load of requests. mod_wsgi's daemon mode is one variant of this, but you can also do it yourself.

WSGI connectors/hosts/clients include (I've not used many of these yet):

  • twisted web (twistd) [13]
  • paste.httpserver [14] (fairly simple threadpooling server)
  • mod_wsgi [15], an apache2 module (embeds interpreters in apache children, or proxies to a set of (apache-spawned) daemon processes). See also mod_wsgi notes
  • green unicorn(gunicorn)[16], prefork-style (ported from unicorn for ruby)
  • wsgiref.simple_server [17] - useful as a simple test server (hosts one app, no URL resolution)
  • waitress[18] (pure-python)
  • cherrypy.wsgiserver (foregoes most of CherryPy, mostly uses its networking. See also [19])
  • Spawning [20] (threadpooling, more)
  • isapi-wsgi [23], an IIS plugin (there are some others)
  • some other major and minor web servers.
  • ...and more

On speed: there are a number of comparisons, though no good or complete ones that I've seen so far. That said, at least mod_wsgi, Spawning (*nix-only), and CherryPy's WSGI server seem like fast enough options. paste's server can be handy for development, but is a bit slower (still better than e.g. wsgiref's simple_server, though).


  • wsgiref.handlers.CGIHandler: wraps WSGI into CGI (using sys.stdin, sys.stdout, sys.stderr and os.environ)
  • paste.cgiapp (takes a CGI app and wraps it into a WSGI interface)
  • Flup [24] servers/gateways:
    • flup.server.ajp (Host WSGI apps in an AJP interface(verify))
    • flup.server.fcgi (Host WSGI via FastCGI; persistent apps)
    • flup.server.scgi (Host WSGI via SCGI; persistent apps)
    • flup.server.cgi (this will obviously be slow)
  • ajp-wsgi[25] (low level is C, with an embedded interpreter to run the WSGI. Faster than flup's ajp)
  • ...and more

See also:


Code snippets for a quick start

There are many ways of hosting code. Some few-liners, to help you choose one and get started:

These are just meant as copy-paste stuff to get above examples running.

Note: the below assumes you've defined/imported an object called application.

WSGI reference implementation server:

import wsgiref.simple_server
wsgiref.simple_server.make_server('', 8282, application).serve_forever()

paste's httpserver:

import paste.httpserver
paste.httpserver.serve(application, host='', port=8282)

Werkzeug / Flask (note: can also be done from code):

FLASK_DEBUG=1 python -m flask run --host= --port=8282

(FLASK_DEBUG is quite nice for debug feedback, but it's not how you'd serve in production: it doesn't like concurrency, it doesn't like clients closing while generating data, and it has the CLOSE_WAIT problem. Some of those go away when running it in typical mode, but you may also want to find another server, e.g. tornado)

Tornado[26][27] (a bit more capable, may have to install):

import tornado.wsgi
import tornado.httpserver
import tornado.ioloop 

server = tornado.httpserver.HTTPServer( tornado.wsgi.WSGIContainer( application ) )
tornado.ioloop.IOLoop.instance().start()

Note: It logs via logging, so you can get more feedback like logging.getLogger('tornado.access').setLevel(logging.INFO). See also [28]

Cherrypy's WSGI server:

from cherrypy import wsgiserver
server = wsgiserver.CherryPyWSGIServer(('', 8282), application, server_name='localhost')
try: # if you don't have this try-except, Ctrl-C doesn't always stop the server
except KeyboardInterrupt:

Spawning works from the shell:

# spawn pythonfilename.appobjectname

If you want to do this without the shell, look at - running spawn calls that module's main(), which mostly just calls run_controller() after working out the options. Google around for details. One example might be:

from spawning import spawning_controller
args = { 'args': ['modulename.application'], 'host': '', 'port': 8282, }
spawning_controller.run_controller('spawning.wsgi_factory.config_factory', args, None)

Note that the args entry is an application specified in the same way as it works from the command line, and that most settings are left at their defaults.

Tornado notes
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Tornado is centered around a single event loop [29].

On concurrency

To keep things fast, your code must always be able to return control to the ioloop quickly, whether that means coroutines, separate calculations, etc.

Quoth [30], in general you should think about IO strategies for tornado apps in this order:

  • Use an async library if available (e.g. AsyncHTTPClient instead of requests).
  • Make it so fast that even synchronously blocking the IOLoop is not noticeable in typical practice
    • e.g. fine to implement a memcache client this way; a good goal to do this as much as possible, e.g. a well-indexed database
  • Do hard calculation in a ThreadPoolExecutor
    • remember that worker threads cannot access the IOLoop (even indirectly), so you must return to the main thread before writing any responses. In many cases you don't win that much.
  • If there is work that can be done separately and cached, do that
    • and consider being clever about what you keep warmest in that cache (e.g. bias it by the most-requested things)
  • Things that do not need a response can be handled by a background script
    • e.g. "send mail" can easily be "store mail in database, count on a separate mailer process to get to it", unless you must tell the user exactly when it was sent to the mail server
  • Accept that occasional slow code in the IOLoop slows everything down.

Note that you can get a forking server, which gets you one independent ioloop per process (helps when clients use multiple connections, and helps saturate your cores). Basically, bind rather than listen, and call start() after you construct the HTTPServer and before you start the ioloop.

A single-process server will often be started like:

server.listen(8282)
tornado.ioloop.IOLoop.instance().start()

A forked one like:

server.bind(8282)
server.start(0)  # 0 for 'detect amount of CPUs'
tornado.ioloop.IOLoop.instance().start()

There will be footnotes to the forking itself (I e.g. ran into trouble because of duplicated sockets to other services, so open those after forking, i.e. initialize them from the app).

See also

On logging

See also