Python usage notes/Networking and web


URL fetching

Nice third-party libraries

requests

While urllib2 and httplib are in the standard library, their APIs are a little tedious for basic things, and more so for various custom stuff (see below).


requests is a library you have to install, but it makes life so much simpler that you may care to do so.

It uses urllib3 under the hood. Where urllib3 does fancier low-level features, requests adds higher-level convenience (e.g. less typing for most common tasks, easier auth).


Simple examples:

import requests

# simple text GET
r = requests.get('http://api.ipify.org')

print( r.text )     # str, decoded according to r.encoding
print( r.content )  # raw bytes object

print( r.status_code )
print( r.headers )
print( r.cookies )


# fancier
r = requests.put('http://example.com/in',
                 data    = {'key':'value'},
                 headers = {'user-agent':'my-app/0.0.1'},
                 timeout = 8)


It also has a nice interface to many of the less usual things you may occasionally need, like OAuth, certificates, streaming, multiple file uploads. Also timeouts are a bit easier, as they should be.
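A rough sketch of a few of those (the host, credentials, and chunk size here are made up):

import requests

s = requests.Session()               # reuses connections, shares cookies/headers
r = s.get('https://example.com/big',
          auth=('user','pass'),      # basic auth from a tuple
          stream=True,               # don't fetch the body up front
          timeout=8)
for chunk in r.iter_content(chunk_size=65536):
    pass                             # ...handle each chunk as it arrives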


If you see something like:

RequestsDependencyWarning: urllib3 (1.26.12) or chardet (3.0.4) doesn't match a supported version!

(or other versions), then a pip3 install --upgrade requests seems to fix that.


urllib3
This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

urllib3 is a third-party library that, compared to urllib2 and such, adds higher-level features like connection pooling, retries, TLS verification, easier file uploads, and compressed transfers.


It's the library that requests uses under the hood.
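A minimal fetch looks something like:

import urllib3

http = urllib3.PoolManager()    # does the connection pooling for you
r = http.request('GET', 'http://api.ipify.org')
print( r.status )
print( r.data )                 # bytes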


https://urllib3.readthedocs.io/en/stable/

https://github.com/urllib3/urllib3




Standard library

urllib2

In py2, URL fetching is probably most commonly done with urllib2, a built-in library.

In py3, its contents were merged into (mostly) urllib.request and urllib.error, and the API is largely the same.


The general flow is to:

  • create a Request object (...optionally alter it to change exactly what request to make)
  • do the actual request on the network
  • read out the response (often just for its body data)

The simplest version is something like:

req  = urllib2.Request(url)
response = urllib2.urlopen(req)
bodydata = response.read()
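The py3 equivalent, given the merge mentioned above:

from urllib.request import Request, urlopen

req      = Request(url)
response = urlopen(req)
bodydata = response.read()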

Comments:

  • on exceptions:
    • The most common exceptions are probably socket.error, urllib2.URLError, and httplib.HTTPException. You can separate them out if you want different handling/logging for each.
    • HTTPError is a more specific subclass of URLError - in case you want more detailed cases (e.g. split out timeouts). For error reporting, note that the former has e.code, the latter e.reason.
    • HTTPError (note: not HTTPException) also lets you .read() the error page contents, since it doubles as a response object.




urllib2 fetcher function

Note: If you don't mind installing extra modules, there are now much better, e.g. requests for easier high level stuff and urllib3 for more advanced lower level stuff.


...but when you're tied to the standard library (e.g. minimal scripts), I use a helper function that makes basic fetches simpler.

At one point it looked like:

import socket, httplib, urllib, urllib2

def urlfetch(url, data=None, headers=None, raise_as_none=False, return_reqresp=False):
    """ Does a HTTP fetch from an URL.   (Convenience wrapper around urllib2 stuff)
        By default either returns the data at the URL, or raises an error.
       
        data:          May be
                        - a dict               (will be encoded as form values),
                        - a sequence of tuples (will be encoded as form values),
                        - a string  (not altered - you often want to have used urllib.urlencode)
                        When you use this at all, the request becomes a POST instead of the default GET
                           and seems to force  Content-type: application/x-www-form-urlencoded
        headers:        a dict of additional headers.
                          Its values may be a string, or a list/tuple (all will be add_header()'d)
        raise_as_none:  In cases where you want to treat common connection failures as 'try again later',
                          using True here can save a bunch of your own typing in error catching

        Returns:
        - if return_reqresp==False (default), the data at the URL, as a string
        - if return_reqresp==True,            the request and response objects
        The latter can be useful when reading from streams, inspecting response headers, and such
    """
    try:
        if type(data) in (tuple, dict):
            data=urllib.urlencode(data)
        req  = urllib2.Request(url, data=data)
        if headers!=None:
            for k in headers:
                vv = headers[k]
                if type(vv) in (list,tuple): # allow multiple values for a header name
                    for v in vv:             #  (emit as multiple headers)
                        req.add_header(k,v)
                else: # assume single string.  TODO: consider unicode
                    req.add_header(k,vv)
        response = urllib2.urlopen(req)
        if return_reqresp:
            return req,response
        else:
            return response.read()
    except (socket.error, urllib2.URLError, httplib.HTTPException), e:
        #print 'Networking problem, %s: %s'%(e.__class__, str(e)) # debug
        if raise_as_none:
            return None
        raise

# Example:
formdata    = ( ('id','fork22'),  ('submit','y') )
headerdict  = {'User-Agent':'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'}

htmldata = urlfetch('http://db.example.com', data=formdata, headers=headerdict)

CLOSE_WAIT problem
This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)


It seems that older versions of urllib2 (apparently in unices up to py2.5ish, windows up to ~py3(verify)) had a bug where it left some uncollectable objects behind (including an open file descriptor), which could cause a few different problems.

See also http://bugs.python.org/issue1208304


One result was that a connection would linger in CLOSE_WAIT state (meaning it's closed on the other end, but not by us), which can potentially be a problem for both ends if they are very busy connection-wise and/or these connections may stay around for very long.

When you cannot upgrade past this bug or don't want to assume everyone has new versions, there's a fairly simple code fix too (setting some reference to None when you are done with the handle and will be closing it). I put it in my convenience function.

try: # where handle is the object you get from urllib2's urlopen
    handle.fp._sock.recv = None
except: # in case it's not applicable, ignore this.
    pass


TODO: check history - when was this fixed, and to what degree did this vary with OS?

httplib

If you want POSTs with arbitrary data or methods, as some protocols require, you'll notice urllib2's POST forces Content-Type: application/x-www-form-urlencoded, where some protocols put looser things in the body.


If tied to the standard library, you'll probably want to use httplib to write slightly lower-level code, for example:

conn = HTTPConnection('localhost', 80)   # from httplib import HTTPConnection
conn.request('POST', '/path', post_data, headers)
resp = conn.getresponse()
body = resp.read()   # read before close()
conn.close()


httplib fetcher function

A similar make-life-simpler helper, for httplib:

import base64, httplib, urlparse

def httplib_request(url, username=None, password=None, data=None, headers=None, method='GET'):
    ' headers should be a mapping,  data already encoded '
    if headers is None:
        headers = {}
    uo = urlparse.urlsplit(url)

    host = uo.netloc # includes port, if any                                                                                        
    port = None
    if ':' in host:
        host,port = host.split(':',1)

    if username and password: # Basic HTTP auth                                                                                     
        base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
        headers['Authorization'] = "Basic %s" % base64string

    if uo.scheme.lower() in ('http',):
        conn = httplib.HTTPConnection(host, port)
    elif uo.scheme.lower() in ('https',):
        conn = httplib.HTTPSConnection(host, port)
    else:
        raise ValueError("Don't understand URL scheme %r in %r"%(uo.scheme, url))

    path = uo.path
    if uo.query: 
        path+='?'+uo.query # note that this doesn't cover all uses
    conn.request(method, path, data, headers)
    resp = conn.getresponse()
    data = resp.read() # close() can cut off before we read data, so we read it here.
    conn.close()

    return resp, data
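Usage might look like (URL and payload made up):

resp, body = httplib_request('http://localhost:8080/api/thing',
                             data='{"x":1}',
                             headers={'Content-Type':'application/json'},
                             method='POST')
# resp.status is the integer status code,  body is the raw response data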




Timeouts

By default, sockets block without a timeout, which may cause processes to hang around too long.


Since Python 2.3, you can set an interpreter-wide default for anything that uses sockets:

import socket
socket.setdefaulttimeout(300)

The fact that this is interpreter-global is hairy - the last value that was set applies. Particularly threaded programs can become less predictable.

But more to the point, some protocol implementations may rely on this global being sensibly high, so setting it too low can break things.


Since python 2.6, you have per-socket control, though few higher-level functions expose it.
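For example, urlopen() grew a timeout argument in 2.6 (a sketch; in py3 this lives in urllib.request):

import socket, urllib2

# per-call timeout
response = urllib2.urlopen('http://example.com', timeout=10)

# per-socket timeout, for sockets you manage yourself
s = socket.socket()
s.settimeout(10)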

See 'Socket Objects' in the documentation, and perhaps 'Socket Programming HOWTO' for more detail.


TODO: figure out when exactly they started doing this properly.

DNS lookup

Much of the socket library is specific to IPv4.

The exception is socket.getaddrinfo, which follows the system configuration.

I've used a convenience function like:

import socket,urlparse

def dns_lookup(hostname, prefer_family=socket.AF_INET):
    """ Looks up IP for hostname  (probably FQDN)
        
        Mostly is a function to also take the hostname out of URLs
        (looks for presence of :// and picks out hostname using urlparse)

        Returns an IP in a string,
                or None  in case of failure (currently _any_ exception)

        prefer_family can be a socket.AF_* value (e.g. AF_INET for IPv4, AF_INET6 for IPv6), 
                          or -1  for don't care
    """
    if '://' in hostname: # is a URL
        hostname = urlparse.urlparse(hostname).netloc
        if '@' in hostname:  # take out  user@  or  user:pass@  if present
            hostname = hostname[hostname.index('@')+1:]
    try:
        retval = None
        for entry in socket.getaddrinfo(hostname,0):
            (family, socktype, proto, canonname, sockaddr) = entry
            # Assumptions: first member of sockaddr tuple is address (true for at least IPv4, IPv6)
            #              'prefer' here means "accept any, but break on the first match"
            retval = sockaddr[0]
            if prefer_family==-1  or  prefer_family==family:
                retval = sockaddr[0]
                break
        return retval
    except Exception, e:
        print "Exception",e
        return None

# examples:
dns_lookup('google.com', prefer_family=socket.AF_INET6)
dns_lookup('http://google.com')

# ...which on my initial test system returned  IPv4 addresses for both,
# because I don't have IPv6 enabled


Getting MAC address

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)


  • uuid.getnode()
    • cross-platform, in the standard library
    • returns the MAC as a 48-bit int -- except when it fails to find one, in which case it returns a random 48-bit number instead
    • also not necessarily predictable with more than one network controller (e.g. LAN + wifi)


  • ioctl on a UDP socket
    • not really cross-platform
    • can ask for a specific interface


  • netifaces module [1]
    • cross-platform
    • not standard library
    • compiled
  • getmac
    • cross-platform
    • not standard library
    • compiled(verify)
  • psutil
    • cross-platform(verify)
    • not standard library


  • /sys/class/net/<interface>/address
    • linux-only
  • ifconfig
    • linux-only


  • wmi module
    • windows-only


https://stackoverflow.com/questions/159137/getting-mac-address
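A minimal sketch of the uuid.getnode() option:

import uuid

node = uuid.getnode()    # 48-bit int
print( ':'.join( '%02x'%((node>>shift)&0xff)  for shift in range(40,-8,-8) ) )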


Python library details

cgitb

Tracebacks for basic CGI, to give more useful error reports than 'internal server error'.

The simplest use in bare-bones CGI is:

import cgitb
cgitb.enable()


In other frameworks it needs a little more work, because the above assumes it can write to stdout (being CGI).

Also, you often want a little more control over the exception catching.

You might wrap your real handler call in something like:

try:
    return realhandler()
except:  
    # it may help to force content type to 'text/html' through whatever mechanism you have
    cgitb.Hook(file=output_file_object).handle() #the interesting line


Forms, File uploads

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)



Parsing an URL, resolving relative URLs

The urlparse module (urllib.parse in py3k) splits a URL into its various parts, can join them again, resolve relative URLs, and such.

urlparse() splits a URL into a tuple: (scheme, netloc, path, params, query, fragment). The object it returns is actually a subclass of tuple that also exposes the same information as attributes.
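For example:

from urlparse import urlparse   # py3: from urllib.parse import urlparse

u = urlparse('http://example.com/foo;param?q=1#frag')
# ParseResult(scheme='http', netloc='example.com', path='/foo',
#             params='param', query='q=1', fragment='frag')
print( u.netloc )   # the same parts, as attributes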


urljoin( context_url, path) also deserves mention, being useful to take links that may be relative or absolute paths, or entire URLs, and resolve them into full URLs (in the context of the page's URL). The behaviour is basically what you would expect:

from urlparse import urljoin   # py3: from urllib.parse import urljoin

assert urljoin('http://www.example.com',       '')         == 'http://www.example.com'
assert urljoin('http://www.example.com/',      '')         == 'http://www.example.com/'
assert urljoin('http://www.example.com',       '/')        == 'http://www.example.com/'
assert urljoin('http://www.example.com/',      '/')        == 'http://www.example.com/'
assert urljoin('http://www.example.com',       '/bar')     == 'http://www.example.com/bar'
assert urljoin('http://www.example.com/',      'bar')      == 'http://www.example.com/bar'
assert urljoin('http://www.example.com',       'bar')      == 'http://www.example.com/bar'
# starting from page
assert urljoin('http://www.example.com/foo',   'bar')      == 'http://www.example.com/bar'                
assert urljoin('http://www.example.com/foo',   '/bar')     == 'http://www.example.com/bar'
# starting from directory:
assert urljoin('http://www.example.com/foo/',  '/bar')     == 'http://www.example.com/bar'                
assert urljoin('http://www.example.com/foo/',  'bar')      == 'http://www.example.com/foo/bar'
assert urljoin('http://www.example.com/foo/',  'bar/')     == 'http://www.example.com/foo/bar/'
assert urljoin('http://www.example.com/foo/',  'bar/boo')  == 'http://www.example.com/foo/bar/boo'
# absolute:
assert urljoin('http://www.example.com/foo/',  'http://elsewhere.com')  == 'http://elsewhere.com'

WSGI

See CGI,_FastCGI,_SCGI,_WSGI,_servlets_and_such#WSGI, mod_wsgi notes and some notes elsewhere (e.g. those on the CherryPy page)

Servers

Note that this section describes things that either are servers, or bring their own. Frameworks focus more on the logic on top of a server, though since some depend tightly on a specific server, the distinction isn't strict.

CherryPy

Various WSGI

See Python_notes_-_WSGI#Hosting_WSGI

Twisted Web

  • http://twistedmatrix.com/projects/web/ (Twisted/Web)
  • Twisted is higher-level networking things, but also has a HTTP server - though I'm guessing you still need to work in a twisted sort of model:
  • Provides concurrency without threading: is event-based and requires any potentially blocking things to be deferred (verify)
  • Looks like it'll work under most OSes, with a manageable extra dependency here and there.


BaseHTTPServer and friends

  • In the standard python library, so always available
  • Consists of quite basic handling for HTTP requests.
  • You have to decide what to do with incoming URLs yourself (no 'run script if there, fall back to file serving' logic at all)
  • A single-purpose server like this can be written in a dozen or two lines, see the example.
  • Not threaded or asynchronous: requests are served in sequence, so if one handler is heavy, it will block the next requests.


There are also SimpleHTTPServer and CGIHTTPServer, building on (and interface-compatible with) the BaseHTTPServer:

  • Both in the standard library
  • Simple... serves files based on a simple URL-path-to-directory-path map.
  • CGI... adds the ability to run external scripts for output. (except on MacOS (pre-X or general?(verify))). It sends a 200 status, then hands output to the executable (so it doesn't serve all needs).


You can make the above servers thread or fork using the two SocketServer mixins. It seems to make the TCPServer (that the HTTPServers are based on) wrap requests in a function that creates a thread/forked process, then calls the actual handler.

Minimal code for the basic, threaded and forked versions:

from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn, ForkingMixIn
import cgi

class ThreadingBaseHTTPServer(ThreadingMixIn, HTTPServer):
    pass

class ForkingBaseHTTPServer(ForkingMixIn, HTTPServer):
    " Not on windows, of course, since it doesn't do forking. "

class ExampleHandlerSet(BaseHTTPRequestHandler):
    """ Hello-world type handler """
    def do_GET(self):
        import threading # only imported for the print below
        vars = cgi.parse_qs( self.path[1:], keep_blank_values=True )
        if 'name' in vars:
            name = vars['name'][0]
            self.wfile.write( "Hello WWW, hey %s"%name )
            print "%d threads"%threading.activeCount()
            self.wfile.flush()
            self.wfile.close()
        else:
            self.send_error(404, "No name given.")

# Choose one:
#srv = HTTPServer( ('0.0.0.0',8765), ExampleHandlerSet )
srv = ThreadingBaseHTTPServer( ('0.0.0.0',8765), ExampleHandlerSet )
#srv = ForkingBaseHTTPServer( ('0.0.0.0',8765), ExampleHandlerSet )

srv.serve_forever() # there are alternative ways of starting; this one's the shortest to type

A very basic handler like this is pretty fast. ab2 -n 1000 -c 1 http://localhost:8765/name=Booga shows that 99% of requests are served within 1ms, or 6ms when causing the 404. I'm not sure why the other 1-2% took up to 100-300ms. Perhaps some occasional cleanup.


About the threaded versions:

  • This is taken from an example apparently quoted from the book "Python Web Programming," along with the note that the MixIn class should be the first in that inheritance list.
  • As always with threading, watch your race conditions and all that mess.
  • This sort of concurrency isn't geared for speed. These threads aren't as lightweight as they could be, nor the cheapest solution in the first place.
  • You can't control the number of threads (without implementing thread pooling), and various OSes start having problems above a few thousand threads, so this won't scale to very heavy load.
  • On windows (Python 2.5) it started refusing connections when I ran ab2 with a concurrency of 6 or higher. I'm not sure why (could be windows being pissy, perhaps the firewall)


Something based on asyncore will be more efficient and still based fairly simply on python internals. Medusa does this.

Medusa



Framework and tool notes

Django

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

Django is a server-side framework, intended to make it quick to make a basic dynamic site. (...like a whole list of simpler frameworks, but this one stuck around)

Django is probably the most fully featured in this particular list.


Has a pure-python development server; for production it is suggested you host it via WSGI, FastCGI, or ASGI.



It has a lot of tools to avoid boilerplate, like

  • routing URLs to views
  • ORM for data access, with some extra tools (aggregation, search) controlled by code
  • security features
  • eases creating forms [2]
  • helps testing [3]
  • makes dealing with exceptions a little less painful [4]
  • optional Internationalization and localization [5]

The MTV split is fairly loosely coupled, but still allows reuse.

Arranges things into

Model - define the structure of stored data
Template - how to represent data
Views - the function that takes model+template and renders the result

(It helps to not draw parallels to MVC; or if you do, to remember that actual MVC is awkward to apply to the web to begin with, since MVC is meant for GUI elements.)
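A minimal sketch of that split (hypothetical app; the file names follow Django convention):

# views.py -- a view takes a request and returns a response (often via a template)
from django.http import HttpResponse

def hello(request):
    return HttpResponse( 'Hello %s'%request.GET.get('name','world') )

# urls.py -- routes URLs to views
from django.urls import path
from . import views

urlpatterns = [
    path('hello/', views.hello),
]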



Pylons

Pyramid

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

Pyramid may be for you when:

  • things like Flask are too lean for your more complex project, and would force too much repetitive work
  • Django is too coupled and monolithic a macroframework, forcing too much of a way of working on you

Pyramid gives roughly the same sort of tools as Django, but tries to make them a little more optional and modular.


Pedigree/history

You could see Pyramid as a merger of Pylons and repoze (done around 2010), and so a successor to both (which also brings influences from Zope), though it seems to merge/allow other things as well.

Dispatch and models seem to mostly be Pylons's (e.g. the model page notes specific differences from Pylons).

Response seems to mostly be webob (Request less so?(verify)). Templating can use jinja2, mako, and others.


It seems to take some names from Django, but has a different take, calling itself a lot less opinionated (though it still does have preferences).

Sure, you don't need to use MVT architecture or ORMs, but its scaffolding betrays that that is its background.

In practice you may easily end up with SQLAlchemy (or ZODB) as the ORM, and it seems to like alembic for database migration.


A minimal example does little more than basic routing and request parsing:

from pyramid.config import Configurator
from pyramid.response import Response

def hello_world(request): # view callable
    return Response('Hello %s!'%request.params.get('name','world') )

if __name__ == '__main__':
    with Configurator() as config:
        config.add_route('hello', '/')
        config.add_view(hello_world, route_name='hello')
        app = config.make_wsgi_app()

    from wsgiref.simple_server import make_server
    server = make_server('0.0.0.0', 16543, app)
    server.serve_forever()


  • The Configurator
  • database / model
  • Requests and responses
  • Scaffolding
  • templating
  • serving - from console
  • debug and testing
  • auth stuff
  • events (internal pub/sub?)
See also

https://docs.pylonsproject.org/projects/pyramid/en/latest/index.html

https://trypyramid.com/

Emmett notes

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

(previously weppy)

async.




starlette

async


https://www.starlette.io/

Flask notes

Flask eases some common serving and routing.

Consider the following routing examples:

Static files

 
# from dirs
@app.route('/css/<path:path>')
def send_css(path):
    return flask.send_from_directory('css', path,                 mimetype='text/css')

@app.route('/css/graphics/<path:path>')
def send_image(path):
    return flask.send_from_directory('image', path,               mimetype='image/png')


# more hardcoded URL->files
@app.route('/favicon.ico')
def fav1():
    return flask.send_from_directory('ico', 'favicon-96x96.png',  mimetype='image/png')

@app.route('/favicon-96x96.png')
def fav2():
    return flask.send_from_directory('ico', 'favicon-96x96.png',  mimetype='image/png')


Dynamic handlers, where URL-path parts can be arguments if you wish:

 
@app.route('/shiftpng/<fid>')
def shiftpng(fid):
    fid = int(fid)
    return "Shift: %d"%fid

# the above passed it in as a string, though you can restrict types and get conversion:
@app.route('/shiftpng/<int:fid>')
def shiftpng(fid):
    return "Shift: %d"%fid


# another choice would be to skip path-based arguments and do it all in code
# (allows them to be optional. You may prefer this if you're going to do heavy sanitation anyway)
@app.route('/shiftpng')
def shiftpng():
    fid = int( flask.request.args.get('fid', '-1') )
    return "Shift: %d"%fid

The converter types are string (anything sans slash, also the default), path (like string but also accepts slashes), int, float, uuid, and any (matches one of a given set of options).


Catch-all: (TODO: explain which applies)

 
@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def catch_all(path):
    return 'You want path: %s' % path
Request context

So why does:

from flask import request

@app.route('/hello')
def hello():
    name = request.args.get('name', 'world')   # default to 'world' when the parameter is absent
    return 'Hello %s'%name

...work? request is a global, and not explicitly filled.

Short answer: request is a context-local proxy; Flask points it at the current request for the duration of each handler, so your handler always sees the right values (and this is thread-safe).


The Request object carries things like request.method, request.args, and request.cookies.


Responses
This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

Flask ends up using a Response object one way or the other (created by you, via the return shorthands, by render_template, make_response, etc.)(verify)


The most controlled (but longer) response is creating a Response object yourself, because it lets you at all the details, for example:

from flask import Response

resp = Response( ''.join(page_chunks), mimetype='text/plain' )
resp.status_code        = 200
resp.headers['Expires'] = 'blah' # TODO: actual value
return resp


Returning something else is a one-line shorthand. The variants include:

  • response
  • (response, status)
  • (response, status, headers)
  • (response, headers)

Where:

  • anything not mentioned gets a default
    • e.g. status 200
    • some defaults can be configured
  • status is an integer
  • response can be:
    • byte data
    • a list of byte data
    • a Response object (see e.g. the example above). It sometimes makes sense to subclass.
    • a generator, to stream content from a function. Tends to look something like:
@app.route('/csv')
def generate_large_csv():
    def generate():
        for row in iter_all_db_rows():
            yield ','.join(row) + '\n'
    return Response(generate(), mimetype='text/csv')
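Stepping back to the tuple shorthands, a made-up route using the three-part form:

@app.route('/nope')
def nope():
    return 'no such thing', 404, {'X-Reason':'example'}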



If you write generator-style and get

UnboundLocalError: local variable referenced before assignment

Then you've just run into Python's stupid scoping rules. What's happening is a combination of two things:

  • at some point in the inner function you're (re)assigning to it. Without that it would come from the outer scope; the assignment makes it local to the inner one.
  • in that inner function you're reading from it before that assignment. Which makes it a generic case of reference before assignment.

Since there is no easy way to say "I want the parent scope" (py3 adds nonlocal for exactly this; py2 has nothing comparable short of global), your options here are roughly:

  • if the use is actually local, and you don't mean to write to the parent scope's variable (which in Flask is typically true for request params), then assign into a different-named variable
  • if you have a class, use of self can make sense
  • there are some other tricks for this issue - search around
  • use global, if that's not a horrible hack
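A minimal reproduction, outside Flask (names made up):

def outer():
    count = 0
    def inner():
        count = count + 1   # this assignment makes count local to inner(),
        return count        #  so the read on the right happens before any assignment
    return inner()

outer()   # raises UnboundLocalError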


internal requests / unit tests
This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

When what you want isn't a redirect but an actual network request (and there are enough reasons you can't just share the code behind the other handler, e.g. it's heavily entangled with its argument handling), you can get such a fetcher (from the underlying Werkzeug) via app.test_client()

Meant for testing, this lives in the context of the given app, so lets you use its paths (verify)
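A short sketch (assuming app is your Flask app, and the /hello handler from earlier):

with app.test_client() as client:
    resp = client.get('/hello?name=test')
    # resp.status_code, resp.data, and resp.headers are all inspectable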


Paste notes

Note that paste is sort of maintained, but no longer developed. Look at other options (like?)


Parsing stuff

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)


paste.httpserver

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

paste.httpserver is relatively simple, based on BaseHTTPServer, but does allow handler threadpooling.

Simple usage example:

if __name__ == '__main__':
    from paste.httpserver import serve
    serve(application, host='0.0.0.0', port=8080)


serve() has a bunch of keyword arguments:

  • application
  • host: IP address to bind to (hand in an IP, or name to look up). Defaults to 127.0.0.1. (non-public)
  • port: port to bind to
  • ssl_pem: path to a PEM file. If '*', a development certificate will be generated for you. (verify)
  • ssl_context: by default based on ssl_pem
  • server_version, defaults to something like PasteWSGIServer/0.5
  • protocol_version (defaults to HTTP/1.0. There is decent but not complete support for HTTP/1.1)
  • start_loop: whether to call server.serve_forever(). You would set this to false if you want to call serve() to set up as part of more init code, but not start serving yet, or want to avoid blocking execution.
  • socket_timeout (default is None, which may let connections hang around too long)
  • use_threadpool - if False, creates threads in response to requests. If True, it keeps a number of threads (threadpool_workers) around to use. Reduces the request time that comes from thread startup.
  • threadpool_workers: see above. Default is 10
  • threadpool_options - a dict of further options; see [6]
  • request_queue_size=5 - maximum number of connections that listen() keeps in queue (instead of rejecting them)
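For example, combining a few of those (values made up):

serve(application, host='0.0.0.0', port=8080,
      use_threadpool=True, threadpool_workers=20)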

Paste Deployment

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)


deploy is an optional part of paste. It is useful to ease configuration/mounting/use of WSGI apps.


From code, it can return a WSGI application, created/composited based on a given configuration file.

It allows (URL) mounting of WSGI apps by configuration, and can load code from eggs (and ?).


Together with packaging as eggs, this allows you to make life easier for sysadmins (since they need much less python knowledge).


AuthKit notes

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

Primarily provides user login logic. Has three main pieces:

  • Authentication middleware:
    • Intercept HTTP 401 (instead of entangling on a code level)
    • present one of several supported authentication methods
    • Sets REMOTE_USER on sign-in.
  • Permission system to assign/check for specific permissions that users may or may not have
  • Authorization adaptors:
    • Handles the actual permission check
    • Throws a PermissionError on a problem, intercepted by the middleware.



Unsorted / older

This hasn't been updated for a while, so could be outdated (particularly if it's about something that evolves constantly, such as software or research).


The more complex frameworks often employ a (framework-wise) MVC model in that their major components are usually:

  • an Object-Relational Mapper,
  • a templating framework, and
  • their own URL dispatching and/or webserver

...and often various tools.


Turbogears

For a good part just a nice combination of things out there, plus a set of useful scripts.


Bottle

routing, utilities, server, and templates


https://bottlepy.org/docs/dev/


Zope

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)





Nevow

  • https://github.com/twisted/nevow
  • Builds on Twisted (...web) and provides a web-app framework including templating, a javascript bridge, and such things.
  • more specific than most of the others mentioned here, but powerful in its way


Karrigell


Quixote

  • http://www.mems-exchange.org/software/quixote/
  • geared to integrate relatively easily with basic Python (though that's true for all, just to different degrees)
  • Fairly general; apparently works as/under plain CGI, FastCGI, SCGI, mod_python, Twisted, Medusa.



Grok

http://grok.zope.org/


CMS

Django CMS

https://www.django-cms.org/en/


Wagtail

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

A CMS built on top of (parts of) Django.

(It's comparable to Django CMS, but probably just a tad more flexible(verify))


Model and template editing is still developer stuff, but has an admin interface to do content management.

To get an idea of workflow, see e.g.

https://docs.wagtail.io/en/stable/getting_started/tutorial.html


Plone

Templating

This hasn't been updated for a while, so could be outdated (particularly if it's about something that evolves constantly, such as software or research).

Django templating

https://docs.djangoproject.com/en/4.0/ref/templates/language/

Jinja


Cheetah


Genshi


Spyce