Python usage notes - PIL


This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


Importing

While I habitually used

import Image, ImageChops


It's better to use:

from PIL import Image, ImageChops

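because that is the form that works with Pillow: it installs everything under the PIL package, and the old top-level import Image no longer works there (this is by design)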


PIL and Pillow

tl;dr: the fork named Pillow is now the more complete replacement of the original PIL.


Pillow is an interface-compatible drop-in fork of PIL. We even tend to just call it PIL, except when pointing out this particular install difference.

PIL has apparently not seen development since 2009 or so (which also means it's not very py3k, and is annoying to package), so the fork named Pillow is now the more complete replacement of the original PIL.


Pillow intentionally uses the import name PIL - to be a drop-in at the time. This was also slightly confusing, in that for a while you might get either, depending on which you installed.

These days it's pretty much all Pillow.



Pixels and speed


An image behaves like a two-dimensional mapping (or a one-dimensional one if you like). You want fast array/pixel access, and you want loops to be fast.


Your options include (roughly from fast to slow):

  • if you like the idea of Halide, there's a python interface
takes a little getting used to, though.
  • PIL supports the array interface, meaning you can use numpy, scipy, and image-aware libraries and often get their C-like speed.
I would suggest this for all nontrivial calculation
details vary, see notes below
  • get/put arrays - which works for a few simpler operations,
such as per-pixel lookup tabling, e.g. out.putdata([lut[x] for x in im.getdata()])
treats the image as a flattened, one-dimensional iterable (of color tuples according to whatever mode/bands it has)
putdata also lets you scale (multiply) and offset (add) arguments
it seems that psyco may help
  • assignment/fetches via the pixel access object, i.e. px = im.load() and then px[x,y]
  • getpixel((x,y)) and putpixel((x,y),v)
slow because instead of C working on C arrays, even assuming the operation on the python side is perfectly fast, you still have overhead to the tune of amount_of_pixels * (convert from C to python, convert from python to C)
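
As a rough illustration of the options above, here is a sketch that inverts a tiny grayscale image three ways: point() with a lookup table, getdata()/putdata(), and the pixel access object from load(). The image contents and the name lut are made up for the example, and no timings are shown since those vary by version and machine.

```python
from PIL import Image

# A tiny synthetic grayscale image (values are arbitrary test data)
im = Image.new("L", (4, 2))
im.putdata([0, 10, 20, 30, 40, 50, 60, 255])

lut = [255 - v for v in range(256)]   # inversion lookup table

# 1. point(): the table is applied on the C side
via_point = im.point(lut)

# 2. getdata()/putdata(): flattened one-dimensional view, the loop runs in Python
via_putdata = Image.new("L", im.size)
via_putdata.putdata([lut[v] for v in im.getdata()])

# 3. pixel access object from load(): per-pixel indexing from Python
via_load = im.copy()
px = via_load.load()
for y in range(via_load.height):
    for x in range(via_load.width):
        px[x, y] = lut[px[x, y]]
```

All three should give the same pixels; the difference is how much of the work crosses the Python/C boundary per pixel.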


Note that ImageIO reads a lot of formats, which means that if you focus on Halide or numpy/scipy, you can sometimes skip PIL.

This can make sense if your data is varied, rather than necessarily 3-channel uint8 photos. PIL does support varied formats, but sometimes makes you work for it.

Loading / converting

Note: This is biased to converting RGB images. (Grayscale may be simpler.)

Other-colorspace images and alpha channels can make things more involved, and sometimes be very hard.


Read from various typical image files

Image.open(filename)

Read compressed file data from memory

e.g. if you have fetched it from the network, memcache, etc.

Image.open( io.BytesIO( pngdata ) )   # on Python 2 this was StringIO.StringIO


Read from uncompressed raw pixel data

When knowing its size and type, e.g.

Image.frombuffer("RGB",(640,480), rawpixels)

For more control: the next parameter is the decoder (e.g. "raw"), and any beyond are parameters to that decoder
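
A minimal sketch of the in-memory and raw-data cases, assuming Python 3 and Pillow. The PNG bytes are produced locally as a stand-in for something fetched from a network or cache, and the 3x2 size and pixel values are arbitrary.

```python
import io
from PIL import Image

# Stand-in for fetched data: encode a small image to PNG in memory
src = Image.new("RGB", (3, 2), (10, 20, 30))
buf = io.BytesIO()
src.save(buf, format="PNG")
pngdata = buf.getvalue()

# Compressed file data -> Image (the format is detected from the data)
im = Image.open(io.BytesIO(pngdata))
im.load()

# Raw pixel data -> Image: you have to supply mode and size yourself
rawpixels = bytes([10, 20, 30] * (3 * 2))      # 3x2 RGB, 3 bytes per pixel
raw_im = Image.frombytes("RGB", (3, 2), rawpixels)
# Same data, spelling out the decoder ("raw") and its arguments (rawmode, stride, orientation)
buf_im = Image.frombuffer("RGB", (3, 2), rawpixels, "raw", "RGB", 0, 1)
```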

Numpy, scipy

Assuming you have a PIL after 1.1.6, you can use the fact that it supports the array interface (before that it was more involved).


You can deal with arbitrary numpy arrays. Image-oriented options on top of that include:

  • scipy.ndimage, a module that is a little more tuned to images(verify) (where a plain array is completely generic)
  • skimage, a.k.a. scikit-image[1], a separate package in the SciPy ecosystem that has some very useful things



numpy to PIL

you'd often control your dimensionality and type/bit depth on the numpy side (though you can also do the latter by converting it to a known type first), e.g.

Image.fromarray( ary )
Image.fromarray( ary.astype(numpy.uint8) )


PIL to numpy

ary = numpy.array( im ) # You can also convert while loading by specifying a dtype.

And you can get a no-copy (read-only, same-type) view like:

ary = numpy.asarray( im )




On array versus asarray
imar = numpy.array(im)   # writable copy
# or, if the first step isn't in-place, you can avoid a copy by doing:
#imar = numpy.asarray(im) # readonly view 

# [do fancy stuff here]

# as things are likely to become float along the way, you often want to 
#  choose an output type to work towards (and scale/clamp values if necessary)
im  = Image.fromarray(  imar.astype(numpy.uint8)  )
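
To make the scale/clamp remark concrete, a small sketch assuming numpy and Pillow: brighten an image in float, clamp to the 0..255 range, and only then convert back to uint8. The 1.5 factor and the pixel values are arbitrary.

```python
import numpy
from PIL import Image

im = Image.new("RGB", (4, 4), (200, 100, 50))

imar = numpy.array(im).astype(numpy.float32)   # writable float copy, shape (height, width, 3)
imar *= 1.5                                    # some values now exceed 255
imar = numpy.clip(imar, 0, 255)                # clamp before converting back
out = Image.fromarray(imar.astype(numpy.uint8))
```

Without the clip, out-of-range floats do not saturate when cast to uint8, which tends to show up as speckles in bright areas.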


Notes:

  • on asarray() versus array():
    • array() always constructs something new(verify) based on the input
    • asarray() is meant as a 'guarantee we now see it as a numpy array', copying only when it has to(verify)
      • If the object handed to it is already an ndarray it is returned as-is; if it exposes the array interface (which PIL image objects do since 1.1.6(verify)), the result may be a read-only view rather than a writable copy
      • If the object only presents itself as python tuples, lists, or nestings of such, a new array is returned
      • TODO: check about a.flags.writeable = True -- possibly copy-on-write?


Some tests I did to verify I understood the array/asarray difference:

>>> im = Image.fromstring('L',(2,2), '\x50\x00\xff\x12','raw')
>>> aa = numpy.asarray(im)
>>> aa
array([[ 80,   0],
      [255,  18]], dtype=uint8)
>>> aa[0][0]=130
RuntimeError: array is not writeable

>>> a2 = numpy.array(im)
>>> a2[0][0]=130
>>> a2
array([[130,   0],
      [255,  18]], dtype=uint8)
>>>
>>> im2=Image.fromarray(a2)
>>> im2
<Image.Image image mode=L size=2x2 at 0x83DCFCC>
>>> im2.getpixel((0,0))
130



OpenCV


OpenCV has a matrix type and image types:

  • Image formats are based on a per-channel depth:
    • IPL_DEPTH_8U - 8-bit unsigned int
    • IPL_DEPTH_8S - 8-bit signed int
    • IPL_DEPTH_16U - 16-bit unsigned int
    • IPL_DEPTH_16S - 16-bit signed int
    • IPL_DEPTH_32S - 32-bit signed int
    • IPL_DEPTH_32F - 32-bit float
    • IPL_DEPTH_64F - 64-bit float
  • Multi-channel storage is implied by calls like:
cv.CreateImage((320,200), cv.IPL_DEPTH_8U, 3)  # three 8-bit unsigned int channels
# Note that  cv calls will less flexible ideas
  • Matrix (Mat) data types are named like:
CV_<bit_depth>(S|U|F)C<number_of_channels>
e.g. CV_8UC3


OpenCV to PIL:

# assuming a 3-channel IPL_DEPTH_8U image that you store RGB into:
pi = Image.frombytes("RGB", cv.GetSize(cv_im), cv_im.tostring())
# OpenCV seems to like BGR, so you may wish to first have swapped channels, e.g. with the cv2 API:
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)


  • RGB
  • BGR
  • also has HLS, HSV, CIE Lab, CIE Luv, Bayer
  • CvtColor does color space conversions
  • mixChannels allows some reordering
  • To/from compressed file:
pyopencv.imread("file.png") and pyopencv.imwrite("file.png") (or in cv: cv.LoadImage() and cv.SaveImage())
  • OpenCV to PIL:
Image.frombytes("RGB", cv.GetSize(cimg), cimg.tostring())
  • PIL to OpenCV:
cimg2 = cv.CreateImageHeader(pimg.size, cv.IPL_DEPTH_8U, 3)      # cimg2 is an OpenCV image
cv.SetData(cimg2, pimg.tobytes())
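
With the newer cv2 interface, images are plain numpy arrays in (height, width, channel) layout and BGR order, so the conversion comes down to reversing the channel axis. The sketch below uses only numpy and Pillow; the bgr array is a made-up stand-in for what cv2.imread() would hand you, so cv2 itself is not needed to try it.

```python
import numpy
from PIL import Image

# Stand-in for a cv2 image: (height, width, 3) uint8 in BGR order
bgr = numpy.zeros((2, 3, 3), dtype=numpy.uint8)
bgr[..., 0] = 255            # channel 0 is blue in OpenCV's ordering

# OpenCV-style array -> PIL: reverse the channel axis, then make it contiguous
rgb = numpy.ascontiguousarray(bgr[..., ::-1])
pimg = Image.fromarray(rgb)

# PIL -> OpenCV-style array: the same reversal the other way
bgr_again = numpy.ascontiguousarray(numpy.array(pimg)[..., ::-1])
```

Note that PIL sizes are (width, height) while array shapes are (height, width, channels).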

http://www.comp.nus.edu.sg/~cs4243/conversion.html


Pygame


Pygame has various pixel formats, pick one that suits your needs, probably RGB or RGBA.

Pygame:

  • P - 8bpp, palettized
  • RGB - 24bpp image
  • RGBX - 32bpp, unused channel
  • RGBA - 32bpp
  • ARGB - 32bpp
  • RGBA_PREMULT - 32bpp, colors premultiplied by alpha
  • ARGB_PREMULT - 32bpp, colors premultiplied by alpha
  • pygame.image.fromstring(data, size, format, flipped=False), pygame.image.tostring(surface, format, flipped=False)

http://pygame-users.25799.x6.nabble.com/BGRA-to-RGB-RGBA-fastest-way-to-do-so-td163.html


Cairo to PIL


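If you don't need alpha, then it seems Cairo's RGB24 and PIL's RGBX are compatible in layout (32 bits per pixel with one unused byte; compressors may not like RGBX without first doing a convert('RGB')). Cairo stores each pixel as a native-endian 32-bit word, so on little-endian machines the bytes in memory are ordered B,G,R,X.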

If you need alpha, you need some in-memory reordering.


  • Cairo to PIL (no alpha)
surface = cairo.ImageSurface(cairo.FORMAT_RGB24, w,h)
# "BGRX" assumes a little-endian machine (native-endian 0x00RRGGBB words); on big-endian it would be "XRGB"
im = Image.frombuffer("RGB", (w,h), surface.get_data(), "raw", "BGRX", 0,1)
  • PIL to Cairo (no alpha)
TODO
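
A sketch of the byte-order issue in both directions, without needing pycairo: the data buffer below is a hand-built stand-in for surface.get_data() of a FORMAT_RGB24 surface (one native-endian 0x00RRGGBB word per pixel, no row padding). The pixel values are arbitrary, and the choice of raw mode per endianness is an assumption based on that word layout.

```python
import sys
import struct
from PIL import Image

w, h = 2, 1
# Stand-in for Cairo's buffer: native-endian 32-bit words, 0x00RRGGBB
words = [0x00FF0000, 0x000000FF]               # a red pixel, then a blue one
data = struct.pack("=%dI" % (w * h), *words)

# Byte order in memory depends on the machine's endianness
rawmode = "BGRX" if sys.byteorder == "little" else "XRGB"

# Cairo-style buffer -> PIL (no alpha)
im = Image.frombuffer("RGB", (w, h), data, "raw", rawmode, 0, 1)

# PIL -> Cairo-style buffer (no alpha): pack back into the same layout
back = im.tobytes("raw", rawmode)
```

For the PIL to Cairo direction you would presumably copy back into a writable buffer (e.g. a bytearray) and wrap it with cairo.ImageSurface.create_for_data; that step is not shown and not tested here.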


Cairo pixel formats[2]:

  • CAIRO_FORMAT_ARGB32 - pre-applied alpha
  • CAIRO_FORMAT_RGB24 - 32 bits per pixel, 8 per channel, upper 8 bits unused
  • CAIRO_FORMAT_A8 - 8-bit alpha
  • CAIRO_FORMAT_A1 - 1-bit alpha, packed.
  • CAIRO_FORMAT_RGB16_565
  • CAIRO_FORMAT_RGB30 - RGB, 10 bits per channel

PIL pixel formats:

  • RGB (24 bits per pixel, 8-bit-per-channel RGB)
  • RGBA (8-bit-per-channel RGBA)
  • RGBa (8-bit-per-channel RGBA, premultiplied alpha)
  • RGBX (8-bit-per-channel, padded), considered an internal format, compressors may not like it
  • 1 - 1bpp, often for masks
  • L - 8bpp, grayscale
  • P - 8bpp, paletted
  • I - 32-bit integers, grayscale
  • F - 32-bit floats, grayscale
  • CMYK - 8 bits per channel, 4 channels
  • YCbCr - 8 bits per channel, 3 channels



Agg to PIL



You can ask Agg's figure/canvas for RGB pixel data, which you can send to PIL.

You could then save into an in-memory file object such as io.BytesIO (StringIO/cStringIO on Python 2), probably using a compressed format, if you want to bypass disk entirely (e.g. in web serving).


See an example at Python usage notes - Matplotlib, pylab#Agg
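
The save-to-memory half of that, as a minimal sketch assuming Python 3 and Pillow. The solid-color image stands in for pixels rendered elsewhere (such as by Agg).

```python
import io
from PIL import Image

im = Image.new("RGB", (8, 8), (0, 128, 255))   # stand-in for rendered pixels

buf = io.BytesIO()
im.save(buf, format="PNG")      # format must be given: there is no filename to infer it from
png_bytes = buf.getvalue()      # e.g. usable as the body of an HTTP response
```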

Semi-sorted

On paletted images ("P" mode)

Paletted image data are uint8, with pixel values actually being indexes into the palette.

Internally, they are almost identical to "L"-mode images (code easily makes the mistake of using what are palette indices as grayscale values, which rarely makes any sense)


The palette (see im.getpalette()) is basically a lookup table for colors, for RGB meaning a 768-byte array (R0,G0,B0, R1,G1,B1, ...)

You can change the palette with im.putpalette(palette)

  • palette should be a list of 768 entries, but is zero-padded if a shorter list is handed to putpalette
  • for "P" mode images: changes the palette.
  • for "L" mode images: converts to "P" mode image, using this palette (could be useful for false coloring)


  • Paletted images are also a memory-compact way of using ImageDraw
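
The false-coloring use of putpalette() might look like the sketch below. The black-to-red ramp is a made-up palette for the example.

```python
from PIL import Image

gray = Image.new("L", (3, 1))
gray.putdata([0, 128, 255])

# Hypothetical ramp from black to red: 256 (R, G, B) entries, flattened to 768 values
palette = []
for i in range(256):
    palette.extend((i, 0, 0))

false_color = gray.copy()
false_color.putpalette(palette)     # turns the "L" image into a "P" image
rgb = false_color.convert("RGB")    # looks each index up in the palette
```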


Conversion:

  • .convert()ing it to RGB simply converts it through the palette
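  • .convert()ing it to L also goes through the palette, giving each pixel the luminance of its palette color, so the indices themselves are gone (verify)
which is usually not what you want when you care about specific indices
e.g. when you know a single color to make a mask from, such as the transparent color in a GIF, work on the indices directly with point():
transparency_index = transparent_gif_image.info['transparency']  # palette index, present only if the GIF sets one
def nontransparency_mask(pci):
    if pci == transparency_index:
        return 0
    else:
        return 255
mask = transparent_gif_image.point(nontransparency_mask, '1')
# ...a "1" mode mask: 255 where the image is not transparent, 0 where it is.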


ImageFont IOError: cannot open resource

Means file not found for the specific font you requested.


When you don't specify a directory, it'll look in a few places (system fonts, current directory) -- I should figure out which and how exactly.

Arguably the best-defined way is to package a font with your code and give it the absolute path to that.
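
One way to keep code running when the font file is missing is to fall back to the built-in font. A sketch, where load_font is a hypothetical helper and the path is deliberately nonexistent:

```python
from PIL import ImageFont

def load_font(path, size=16):
    """Try a TrueType file at an explicit path; fall back to PIL's built-in font."""
    try:
        return ImageFont.truetype(path, size)
    except (OSError, ImportError):
        # OSError: "cannot open resource" (file not found or unreadable)
        # ImportError: this build lacks FreeType support
        return ImageFont.load_default()

# Hypothetical path; in real code, build it from the location of your own package
font = load_font("/nonexistent/fonts/Example.ttf")
```

The built-in font will look different, so this is more of a keep-going measure than a real substitute.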



Some experiments

PNG transparency

''' Create a transparent PNG from a non-transparent image.

    In the code below, white becomes transparent and black opaque.
    Transparency is currently linear with the luminance -- which is _rarely_ what you want.

    You would probably want to create a LUT (e.g. via function to im.point) with much more biased responses.
'''
import sys
from PIL import Image, ImageChops

try:
    im = Image.open(sys.argv[1]).convert("RGB")  # make sure split() yields exactly three bands
    (r,g,b) = im.split()
    a = im.convert("L")
    a = ImageChops.invert(a) #comment this out if you want black transparent, white opaque
    im2 = Image.merge("RGBA",(r,g,b,a))
    im2.save(sys.argv[1]+"-trans.png")
except (IndexError, IOError):
    print("Failed. Possibly you supplied no filename, or it's not a valid image.")
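
The biased-response LUT mentioned in the docstring could look something like this sketch. The function name luminance_to_alpha and the gamma parameter are made up for illustration; the curve is just one of many you might pick.

```python
from PIL import Image, ImageChops

def luminance_to_alpha(im, gamma=3.0):
    """Return an RGBA copy where lighter pixels are more transparent.

    gamma is a made-up knob for this sketch: values above 1 keep more of the
    mid-tones opaque, so only near-white areas become clearly transparent.
    """
    rgb = im.convert("RGB")
    alpha = ImageChops.invert(rgb.convert("L"))            # white -> 0, black -> 255
    lut = [round(255 * (1 - (1 - v / 255.0) ** gamma)) for v in range(256)]
    alpha = alpha.point(lut)
    r, g, b = rgb.split()
    return Image.merge("RGBA", (r, g, b, alpha))

demo = Image.new("RGB", (3, 1))
demo.putdata([(0, 0, 0), (128, 128, 128), (255, 255, 255)])
result = luminance_to_alpha(demo)
```

With gamma=1.0 this reduces to the linear behavior of the script above.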

Animated GIFs

For some notes on GIF structure, see Image_notes#GIF

Detecting

Old PIL doesn't have an existing function (recent Pillow versions expose im.is_animated and im.n_frames for formats like GIF), but it's easy enough:

def is_animated(im, returntostart=True):
    ''' Returns whether this image has more than one frame.
        Since GIF is often incrementally drawn, this reader can only seek to the next frame and rewind to 0.
        Since the test is seeing if we can seek, we return to the start afterwards.
    '''
    try:
        im.seek(0) # redundant in almost all cases. Why would you have gone to other frames if you didn't know it was animated?
        im.seek(1)
        if returntostart:
            im.seek(0)
        return True
    except EOFError:
        return False
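
On a reasonably recent Pillow, the same question can be asked through the n_frames and is_animated attributes, which exist only on format plugins that support multiple frames (hence the getattr defaults). A sketch, with frame_count as a hypothetical helper name:

```python
import io
from PIL import Image

def frame_count(im):
    """Number of frames, using Pillow's n_frames where the format plugin provides it."""
    return getattr(im, "n_frames", 1)

# Single-frame GIF written to memory and read back, as a quick check
buf = io.BytesIO()
Image.new("P", (4, 4)).save(buf, format="GIF")
still = Image.open(io.BytesIO(buf.getvalue()))
still_frames = frame_count(still)
still_animated = getattr(still, "is_animated", False)
```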

Reading

# you can't tell the number of frames before reading,
# but you can break when it's done (basically what ImageSequence does too)
frame = 0  # start at the start
while True:
    try:
        im.seek(frame) # The image will now be at this position
                       # and look like a grayscale or palette image (L or P mode)
        print("\nFrame %d" % frame)

        # There is also metadata, both global and per-frame (somewhat mixed)
        print(im.info) # new keys:       version      (global, "GIF87a" or "GIF89a")
                       #  if applicable: background   (global, palette colour index, only present when there is a global color table)
                       #                 duration     (per-frame. PIL uses milliseconds, not centiseconds as the GIF stores)
                       #                 transparency (per-frame, palette colour index)
        print(im.tile) # used by decoding, can be useful to inspect only the region that was updated.

        frame += 1 # for next loop
    except EOFError:
        break
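
The ImageSequence variant of that loop, as a self-contained sketch assuming Pillow: it first builds a small three-frame GIF in memory (arbitrary gray levels and durations) so that there is something to iterate over.

```python
import io
from PIL import Image, ImageSequence

# Build a small three-frame GIF in memory so the example is self-contained
frames = [Image.new("L", (4, 4), c) for c in (0, 128, 255)]
buf = io.BytesIO()
frames[0].save(buf, format="GIF", save_all=True,
               append_images=frames[1:], duration=[100, 200, 300])

im = Image.open(io.BytesIO(buf.getvalue()))
durations = []
for frame in ImageSequence.Iterator(im):
    # frame is the same image object, seeked to the next position
    durations.append(frame.info.get("duration"))
```

If you want to keep the frames around, copy() each one inside the loop, since the iterator keeps reusing the one image object.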


PIL's GIF reader consists of python code reading the structure of a GIF file, while LZW decoding and (uncompressed) writing is done in C code (the im.tile data is used to tell it where to look). The code in seek() loads the next image frame, has it decoded and copied into the overall image (done by shared code in ImageFile).

It looks like PIL does not yet correctly handle all variants of incremental rendering (verify)


im.tile looks like ('gif', (30, 55, 280, 383), 95045L, (8, False)). The contents are (decodertag, (left,top,right,bottom), file_offset, extra), where extra is extra information for the decoder, in this case (bits,interlaced).

Writing


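The original PIL itself won't write animated GIFs, or compressed GIFs. Recent Pillow versions can write animated GIFs directly via im.save(..., save_all=True, append_images=[...]), so the notes below mostly matter for older installs.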

Animation:

  • There is gifwriter.py for PIL and Pillow, which was apparently merged into Pillow at some point(verify)
  • There is also the more independent images2gif.py, which lets you control a few more things more easily, e.g. durations per frame.
  • uncompressed GIFs are quickly megabytes large, though...

Compression:

  • There's a line you can change in GifImagePlugin.py that makes it call ppmtogif to write LZW-compressed GIFs into a file, but it's not very helpful for animated GIFs.


Animation + compression:

  • While there has been some movement, the best solution (in terms of file size and such) still seems to use some external program
either to hand it the uncompressed gif, or e.g. hand it pngs and how to compose it, depending on which program you actually use.
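
For reference, writing an animated GIF with a recent Pillow looks roughly like the sketch below. The frame contents, duration, and loop values are arbitrary, and how small the resulting file is depends on the Pillow version, so an external optimizer may still be worth it.

```python
import io
from PIL import Image

frames = [Image.new("RGB", (16, 16), c) for c in ("red", "green", "blue")]

buf = io.BytesIO()               # a filename works too
frames[0].save(
    buf,
    format="GIF",
    save_all=True,               # write every frame, not only the first
    append_images=frames[1:],    # the remaining frames
    duration=200,                # milliseconds per frame (a list gives per-frame values)
    loop=0,                      # 0 means loop forever
)
gif_bytes = buf.getvalue()
```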


(You can also try bugging me. I've got a mostly-finished pure-python LZW-compressed halfway-color-clever animated-GIF writer somewhere)