Python usage notes - PIL

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Importing

I always used

import Image, ImageChops


It's better to use:

from PIL import Image, ImageChops

because that also works with Pillow, which only supports the from PIL import ... form (this is apparently by design)(verify)


PIL and Pillow

Pillow is an interface-compatible fork of PIL.

The motivation:

  • PIL wasn't very actively developed
    • (and therefore wasn't very py3k)
  • PIL was apparently sort of nasty to package

Pillow does its own development, and tried to stay in sync with PIL, although since PIL sees little development, it has started to drift. (verify)


tl;dr: If PIL continues to not be very actively developed, then we all want to move to Pillow now.



Pixels and speed


An image behaves like a two-dimensional mapping - or a one-dimensional one. You want fast array/pixel access, and you want loops to be fast.


Your options (roughly from fast to slow) include:

  • PIL supports the array interface, meaning you can use numpy, scipy, and often get their C-like speed.
I would suggest this for everything
details vary, see notes below


  • get/put arrays - which works for a few simpler operations,
such as per-pixel lookup tabling, e.g. out.putdata([lut[x] for x in im.getdata()])
treats the image as a flattened, one-dimensional iterable (of color tuples according to whatever mode/bands it has)
putdata also takes scale (multiply) and offset (add) arguments
it seems that psyco may help
  • assignment/fetches via pix[x,y], where pix = im.load()
  • getpixel((x,y)) and putpixel((x,y),v)
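
To make the slower options above concrete, here is a small sketch on a made-up 4x1 grayscale image. It assumes Pillow on Python 3; the image contents and the inverting lookup table are only for illustration.

```python
from PIL import Image

# A tiny 4x1 grayscale test image (made up for illustration).
im = Image.new("L", (4, 1))
im.putdata([0, 64, 128, 255])

lut = [255 - v for v in range(256)]   # lookup table that inverts values

# Per-pixel lookup in Python, via getdata/putdata (flattened, one-dimensional)
out_py = Image.new("L", im.size)
out_py.putdata([lut[v] for v in im.getdata()])

# The same table applied by PIL itself, which avoids the Python loop
out_point = im.point(lut)

# Pixel access object from load(); fine for a few pixels, slow in big loops
pix = im.load()
pix[0, 0] = 10
first = pix[0, 0]
```

For anything heavier than a lookup table, the numpy route is probably still the better choice.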

Loading / converting

Note: This is biased to converting RGB images. (Grayscale may be simpler.)

Other-colorspace images and alpha channels can make things more involved, and sometimes be very hard.


Read from various typical image files

Image.open(filename)

Read compressed file data from memory

e.g. if you have fetched it from the network, memcache, etc.

Image.open( StringIO.StringIO( pngdata ) )   # on Python 3: io.BytesIO( pngdata )
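
A self-contained sketch of the in-memory case, assuming Python 3 and Pillow (so io.BytesIO rather than StringIO). The PNG bytes are generated on the spot only to have something to read; in practice pngdata would come from the network or a cache.

```python
import io
from PIL import Image

# Stand-in for compressed data you fetched from somewhere
buf = io.BytesIO()
Image.new("RGB", (3, 2), (255, 0, 0)).save(buf, format="PNG")
pngdata = buf.getvalue()

# Wrap the bytes in a file-like object and let PIL detect the format
im = Image.open(io.BytesIO(pngdata))
im.load()   # force decoding now, rather than lazily on first access
```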


Read from uncompressed raw pixel data

When you know its size and type, e.g.

Image.frombuffer("RGB",(640,480), rawpixels)

For more control: the next parameter is the decoder (e.g. "raw"), and any beyond are parameters to that decoder
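
As a sketch of those extra decoder parameters (assuming Pillow on Python 3): for the "raw" decoder they are the raw mode, the stride (0 meaning tightly packed) and the line order (1 meaning top to bottom). The six bytes below are made up.

```python
from PIL import Image

width, height = 2, 1
# Two pixels, three bytes each: (255,0,0) then (0,0,255)
rawpixels = bytes([255, 0, 0,   0, 0, 255])

# Interpret the bytes as R,G,B
im = Image.frombuffer("RGB", (width, height), rawpixels, "raw", "RGB", 0, 1)

# Interpret the very same bytes as B,G,R instead
swapped = Image.frombuffer("RGB", (width, height), rawpixels, "raw", "BGR", 0, 1)
```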

Numpy, scipy

Assuming you have a PIL after 1.1.6, you can use the fact that it supports the array interface (before that it was more involved).


You can deal with arbitrary numpy arrays. There are also:

  • scipy's ndimage, which is, as a type, a little more tuned to images(verify) (where array is completely generic)
  • skimage, a.k.a. scikit-image[1], which has some very useful things



numpy to PIL

you'd often control your dimensionality and type/bit depth on the numpy side (though you can also do the latter by converting it to a known type first), e.g.

Image.fromarray( ary )
Image.fromarray( ary.astype(numpy.uint8) )


PIL to numpy

ary = numpy.array( im ) # You can also convert while loading by specifying a dtype.

And you can get a no-copy (read-only, same-type) view like:

ary = numpy.asarray( im )




On array versus asarray
imar = numpy.array(im)   # writable copy
# or, if the first step isn't in-place, you can avoid a copy by doing:
#imar = numpy.asarray(im) # readonly view 
 
# [do fancy stuff here]
 
# as things are likely to become float along the way, you often want to 
#  choose an output type to work towards (and scale/clamp values if necessary)
im  = Image.fromarray(  imar.astype(numpy.uint8)  )
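
A runnable version of that round trip, as a sketch assuming Pillow and numpy on Python 3. The "fancy stuff" here is just a brightness multiplication, picked because it pushes values out of the 0..255 range and so shows why you clamp before converting back.

```python
import numpy
from PIL import Image

im = Image.new("RGB", (2, 2), (100, 150, 200))   # made-up input

imar = numpy.array(im).astype(numpy.float32)     # writable float copy
imar *= 1.5                                      # values may now exceed 255
imar = numpy.clip(imar, 0, 255)                  # clamp to the uint8 range

out = Image.fromarray(imar.astype(numpy.uint8))  # back to an RGB image
```

Without the clip, the cast to uint8 would wrap around rather than saturate.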


Notes:

  • on asarray() versus array():
    • array() always constructs something new(verify) based on the input
    • asarray is part of the array protocol(verify), meant as a 'guarantee we now see it as a numpy array'
      • If the object handed to it is an ndarray, it is returned as-is. If it exposes the array interface (which PIL image objects do since 1.1.6(verify)), a read-only view is returned, without a copy where possible
      • If the object only presents itself as python tuples, lists, or nestings of such, a new array is returned
      • TODO: check about a.flags.writeable = True -- possibly copy-on-write?


Some tests I did to verify I understood the array/asarray difference (done on old PIL; on Pillow, Image.fromstring was replaced by Image.frombytes):

>>> im = Image.fromstring('L',(2,2), '\x50\x00\xff\x12','raw')
>>> aa = numpy.asarray(im)
>>> aa
array([[ 80,   0],
      [255,  18]], dtype=uint8)
>>> aa[0][0]=130
RuntimeError: array is not writeable
 
>>> a2 = numpy.array(im)
>>> a2[0][0]=130
>>> a2
array([[130,   0],
      [255,  18]], dtype=uint8)
>>>
>>> im2=Image.fromarray(a2)
>>> im2
<Image.Image image mode=L size=2x2 at 0x83DCFCC>
>>> im2.getpixel((0,0))
130



OpenCV


OpenCV has a matrix type and image types:

  • Image formats are based on:
    • IPL_DEPTH_8U - 8-bit unsigned int, grayscale
    • IPL_DEPTH_8S - 8-bit signed int, grayscale
    • IPL_DEPTH_16U - 16-bit unsigned int, grayscale
    • IPL_DEPTH_16S - 16-bit signed int, grayscale
    • IPL_DEPTH_32S - 32-bit signed int, grayscale
    • IPL_DEPTH_32F - 32-bit float
    • IPL_DEPTH_64F - 64-bit float
  • Multi-channel storage is implied by calls like:
cv.CreateImage((320,200), cv.IPL_DEPTH_8U, 3) # three 8-bit unsigned int channels
# Note that  cv calls will less flexible ideas
  • Matrix (Mat) data types are named like:
CV_<bit_depth>(S|U|F)C<number_of_channels>
e.g. CV_8UC3


OpenCV to PIL:

# assuming a 3-channel IPL_DEPTH_8U image that you store RGB into:
# (on Pillow, fromstring was replaced by frombytes)
pi = Image.fromstring("RGB", cv.GetSize(cv_im), cv_im.tostring())
# OpenCV seems to like BGR, so you may wish to first have done (cv2 API):
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)


  • RGB
  • BGR
  • also has HLS, HSV, CIE Lab, CIE Luv, Bayer
  • CvtColor does color space conversions
  • mixChannels allows some reordering
  • To/from compressed file:
pyopencv.imread("file.png") and pyopencv.imwrite("file.png", img) (or in cv: cv.LoadImage() and cv.SaveImage())
  • OpenCV to PIL:
Image.fromstring("RGB", cv.GetSize(cimg), cimg.tostring())
  • PIL to OpenCV:
cimg2 = cv.CreateImageHeader(pimg.size, cv.IPL_DEPTH_8U, 3)      # cimg2 is an OpenCV image 
cv.SetData(cimg2, pimg.tostring())

http://www.comp.nus.edu.sg/~cs4243/conversion.html
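
With the newer cv2 bindings, images are plain numpy arrays, usually uint8 in B,G,R channel order. Under that assumption the conversion does not need OpenCV calls at all, just a channel reversal. A sketch, in which the helper names are made up and the array stands in for what cv2.imread would return:

```python
import numpy
from PIL import Image

def bgr_array_to_pil(bgr):
    """(height, width, 3) uint8 array in B,G,R order -> PIL RGB image."""
    rgb = numpy.ascontiguousarray(bgr[:, :, ::-1])   # reverse the channel axis
    return Image.fromarray(rgb)

def pil_to_bgr_array(im):
    """PIL image -> (height, width, 3) uint8 array in B,G,R order."""
    rgb = numpy.array(im.convert("RGB"))
    return numpy.ascontiguousarray(rgb[:, :, ::-1])

# Stand-in for an OpenCV image: a single pure-blue pixel, B,G,R order
bgr = numpy.zeros((1, 1, 3), dtype=numpy.uint8)
bgr[0, 0] = (255, 0, 0)

pimg = bgr_array_to_pil(bgr)
back = pil_to_bgr_array(pimg)
```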


Pygame


Pygame has various pixel formats; pick one that suits your needs, probably RGB or RGBA.

Pygame:

  • P - 8bpp, palettized
  • RGB - 24bpp image
  • RGBX - 32bpp, unused channel
  • RGBA - 32bpp
  • ARGB - 32bpp
  • RGBA_PREMULT - 32bpp, colors premultiplied by alpha
  • ARGB_PREMULT - 32bpp, colors premultiplied by alpha
  • pygame.image.fromstring(data, size, format, flipped=False), pygame.image.tostring(surface, format, flipped=False)

http://pygame-users.25799.x6.nabble.com/BGRA-to-RGB-RGBA-fastest-way-to-do-so-td163.html


Cairo to PIL


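If you don't need alpha, then it seems Cairo's RGB24 and PIL's RGBX are compatible in layout (32 bits per pixel, one byte unused; compressors may not like RGBX without first doing a convert('RGB')). Cairo stores each pixel as a native-endian 32-bit value, so on little-endian machines the bytes in memory are B,G,R,X, and you may need the "BGRX" raw mode to avoid swapped red and blue. (verify)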

If you need alpha, you need some in-memory reordering.


  • Cairo to PIL (no alpha)
surface = cairo.ImageSurface(cairo.FORMAT_RGB24, w,h)
im = Image.frombuffer("RGBX", (w,h), surface.get_data(), "raw", "RGBX", 0,1).convert("RGB")
  • PIL to Cairo (no alpha)
TODO
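
On the byte order in the Cairo-to-PIL direction: a sketch that does not need Cairo installed, assuming the surface data arrives as B,G,R,X bytes (which is what a little-endian machine should give you for FORMAT_RGB24). The eight bytes are a stand-in for surface.get_data().

```python
from PIL import Image

w, h = 2, 1
# Two pixels, four bytes each, in B,G,R,X order: red, then blue.
# (Byte order is an assumption about a little-endian machine.)
data = bytes([0, 0, 255, 0,    255, 0, 0, 0])

# The "BGRX" raw mode reorders to RGB and drops the padding byte
im = Image.frombuffer("RGB", (w, h), data, "raw", "BGRX", 0, 1)
```

If red and blue come out swapped on your machine, the raw mode is the thing to change.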


Cairo pixel formats[2]:

  • CAIRO_FORMAT_ARGB32 - pre-applied alpha
  • CAIRO_FORMAT_RGB24 - 32 bits per pixel, 8 per channel, upper 8 bits unused
  • CAIRO_FORMAT_A8 - 8-bit alpha
  • CAIRO_FORMAT_A1 - 1-bit alpha, packed.
  • CAIRO_FORMAT_RGB16_565
  • CAIRO_FORMAT_RGB30 - RGB, 10 bits per channel

PIL pixel formats:

  • RGB (24 bits per pixel, 8-bit-per-channel RGB)
  • RGBA (8-bit-per-channel RGBA)
  • RGBa (8-bit-per-channel RGBA, premultiplied alpha)
  • RGBX (8-bit-per-channel, padded), considered an internal format, compressors may not like it
  • 1 - 1bpp, often for masks
  • L - 8bpp, grayscale
  • P - 8bpp, paletted
  • I - 32-bit integers, grayscale
  • F - 32-bit floats, grayscale
  • CMYK - 8 bits per channel, 4 channels
  • YCbCr - 8 bits per channel, 3 channels



Agg to PIL



You can ask Agg's figure/canvas for RGB pixel data, which you can send to PIL.

You could then save into a (c)StringIO (probably using a compressed format) if you want to bypass disk entirely (e.g. in web serving).
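
The save-to-memory half of that, as a sketch assuming Pillow on Python 3 (io.BytesIO instead of StringIO). The helper name is made up, and the plain image stands in for one built from Agg's pixel data.

```python
import io
from PIL import Image

def image_to_png_bytes(im):
    """Compress a PIL image to PNG without touching disk."""
    buf = io.BytesIO()
    im.save(buf, format="PNG")   # format must be given; there is no filename to guess from
    return buf.getvalue()

# Stand-in for an image built from Agg's RGB pixel data
png = image_to_png_bytes(Image.new("RGB", (8, 8), (0, 128, 255)))
```

The resulting bytes can go straight into an HTTP response body.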


See an example at Python usage notes - Matplotlib, pylab#Agg

Semi-sorted

On paletted images ("P" mode)

Paletted image data are uint8, with pixel values actually being indexes into the palette.

Internally, they are almost identical to "L"-mode images (code easily makes the mistake of using what are palette indices as grayscale values, which rarely makes any sense)


The palette (see im.getpalette()) is basically a lookup table for colors, for RGB meaning a 768-byte array (R0,G0,B0, R1,G1,B1, ...)

You can change the palette with
im.putpalette(palette)
  • palette should be a list of 768 entries, but is zero-padded if a shorter list is handed to putpalette
  • for "P" mode images: changes the palette.
  • for "L" mode images: converts to "P" mode image, using this palette (could be useful for false coloring)


  • Paletted images are also a memory-compact way of using ImageDraw
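
A sketch of the false-coloring use of putpalette on an "L" image, assuming Pillow on Python 3. The blue-to-red ramp and the three-pixel image are made up.

```python
from PIL import Image

gray = Image.new("L", (3, 1))
gray.putdata([0, 128, 255])

# 768 entries: R0,G0,B0, R1,G1,B1, ... here a blue-to-red ramp
palette = []
for i in range(256):
    palette.extend((i, 0, 255 - i))

gray.putpalette(palette)     # the "L" image becomes a "P" image, in place
rgb = gray.convert("RGB")    # goes through the palette
```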


Conversion:

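  • .convert()ing it to RGB simply converts it through the palette
  • .convert()ing it to L seems to just chuck the palette (verify); in Pillow it appears to go through the palette to luminance instead, in which case the indices have to be read another way (e.g. getdata())
which usually makes no sense
except e.g. when you know a single color to make a mask from, e.g. the transparent color in a GIF:
# transparency_index is a palette index, e.g. from im.info['transparency']
def transparency_mask(pci):
    # gives 255 for the transparent index, 0 for everything else
    if pci==transparency_index:
        return 255
    else:
        return 0
mask = transparent_gif_image.convert('L').point(transparency_mask) 
# ...which is an "L" mode mask storing a mask that could be stored in a "1" mode image.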
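
A variant that sidesteps the question of what convert('L') does, by reading the palette indices through numpy. This is a sketch assuming Pillow and numpy on Python 3; the function name, the tiny test image and index 5 are made up.

```python
import numpy
from PIL import Image

def mask_from_index(im, transparency_index):
    """'L' mode mask: 255 where a 'P' mode image uses the given palette index, else 0."""
    indices = numpy.array(im)   # for "P" mode these are palette indices
    mask = numpy.where(indices == transparency_index, 255, 0).astype(numpy.uint8)
    return Image.fromarray(mask)

im = Image.new("P", (2, 1))
im.putdata([0, 5])              # second pixel uses palette index 5

mask = mask_from_index(im, 5)
```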



Some experiments

PNG transparency

''' Create a transparent PNG from a non-transparent image.
 
    In the code below, white becomes transparent and black opaque.
    Transparency is currently linear with the luminance, which is rarely what you want.
 
    You would probably want to create a LUT (e.g. via function to im.point) with much more biased responses.
'''
import sys
from PIL import Image, ImageChops
 
try:
    im = Image.open(sys.argv[1]).convert("RGB") # make sure split() gives exactly three bands
    (r,g,b) = im.split()
    a = im.convert("L")
    a = ImageChops.invert(a) #comment this out if you want black transparent, white opaque
    im2 = Image.merge("RGBA",(r,g,b,a))
    im2.save(sys.argv[1]+"-trans.png")
except (IndexError, IOError):
    print("Failed. Possibly you supplied no filename, or it's not a valid image.")
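
The biased LUT that the docstring suggests could look something like the following. It is only a sketch, assuming Pillow on Python 3; the function name and the gamma parameter are made up, and the exponent is something you would tune by eye.

```python
from PIL import Image

def add_luminance_alpha(im, gamma=3.0):
    """White becomes transparent, black opaque, with a response biased by gamma."""
    rgb = im.convert("RGB")
    lum = rgb.convert("L")
    # 0 (black) -> alpha 255, 255 (white) -> alpha 0; a higher gamma keeps
    # midtones more opaque than the linear version does
    lut = [int(round(255 * (1.0 - (v / 255.0) ** gamma))) for v in range(256)]
    out = rgb.copy()
    out.putalpha(lum.point(lut))   # turns the copy into RGBA
    return out

white = add_luminance_alpha(Image.new("RGB", (1, 1), (255, 255, 255)))
black = add_luminance_alpha(Image.new("RGB", (1, 1), (0, 0, 0)))
```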

Animated GIFs

For some notes on GIF structure, see Image_notes#GIF

Detecting

Old PIL doesn't have an existing function (newer Pillow versions expose im.is_animated and im.n_frames on multi-frame formats), but it's easy enough:

def is_animated(im, returntostart=True):
    ''' Returns whether this image has more than one frame.
        Since GIF is often incrementally drawn, this reader can only seek to the next frame and rewind to 0.
        Since the test is seeing if we can seek, we return to the start afterwards.
    '''
    try:
        im.seek(0) # redundant in almost all cases. Why would you have gone to other frames if you didn't know it was animated?
        im.seek(1)
        if returntostart:
            im.seek(0)
        return True
    except EOFError:
        return False

Reading

# you can't tell the number of frames before reading,
# but you can break when it's done (basically what ImageSequence does too)
frame = 0  # start at the start
while True:
    try:
        im.seek(frame) # The image will now be at this position
                       # and look like a grayscale or palette image (L or P mode)
        print("\nFrame %d"%frame)
 
        # There is also metadata, both global and per-frame (somewhat mixed)
        print(im.info) # new keys:       version      (global, "GIF87a" or "GIF89a")
                       #  if applicable: background   (global, palette colour index, only present when there is a global color table)
                       #                 duration     (per-frame. PIL uses milliseconds, not centiseconds as the GIF stores)
                       #                 transparency (per-frame, palette colour index)
        print(im.tile) # used by decoding, can be useful to inspect only the region that was updated.
 
        frame += 1 # for next loop
    except EOFError:
        break


PIL's GIF reader consists of python code reading the structure of a GIF file, while LZW decoding and (uncompressed) writing is done in C code (the im.tile data is used to tell it where to look). The code in seek() loads the next image frame, has it decoded and copied into the overall image (done by shared code in ImageFile).

It looks like PIL does not yet correctly handle all variants of incremental rendering (verify)


im.tile looks like ('gif', (30, 55, 280, 383), 95045L, (8, False)). The contents are (decodertag, (left,top,right,bottom), file_offset, extra), where extra is extra information for the decoder, in this case (bits,interlaced).
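
The same frame walk using ImageSequence, as a sketch assuming Pillow on Python 3. The three-frame GIF is generated in memory only so that the example has something to read; normally im would come from Image.open on a file.

```python
import io
from PIL import Image, ImageSequence

# Build a small three-frame GIF in memory (made-up content)
frames = [Image.new("L", (4, 4), v) for v in (0, 100, 200)]
buf = io.BytesIO()
frames[0].save(buf, format="GIF", save_all=True,
               append_images=frames[1:], duration=100)
buf.seek(0)

im = Image.open(buf)
count = 0
sizes = []
for frame in ImageSequence.Iterator(im):   # seeks frame by frame until EOFError
    count += 1
    sizes.append(frame.size)
```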

Writing


PIL itself won't write animated GIFs, or compressed GIFs. (Recent Pillow versions can do both, via save() with save_all=True.)

Animation:

  • There is gifwriter.py for PIL and Pillow, which was apparently merged into pillow at some point(verify)
  • There is also the more independent images2gif.py, which lets you control a few more things more easily, e.g. durations per frame.
  • uncompressed GIFs are quickly megabytes large, though...

Compression:

  • There's a line you can change in GifImagePlugin.py that makes it call ppmtogif to write LZW-compressed GIFs into a file, but it's not very helpful for animated GIFs.


Animation + compression:

  • While there has been some movement, the best solution (in terms of file size and such) still seems to use some external program
either to hand it the uncompressed gif, or e.g. hand it pngs and how to compose it, depending on which program you actually use.
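
For comparison, writing an animated GIF with a recent Pillow alone looks roughly like this. It is a sketch assuming Pillow on Python 3; the helper name and the two solid-color frames are made up. It says nothing about how the file size compares to what an external optimizer would give you.

```python
import io
from PIL import Image

def write_animated_gif(frames, fileobj, duration_ms=100):
    """Write a list of PIL images as one looping animated GIF."""
    first, rest = frames[0], frames[1:]
    first.save(fileobj, format="GIF", save_all=True,
               append_images=rest, duration=duration_ms, loop=0)

frames = [Image.new("RGB", (8, 8), c) for c in ((255, 0, 0), (0, 0, 255))]
out = io.BytesIO()
write_animated_gif(frames, out)

out.seek(0)
check = Image.open(out)
n = check.n_frames   # read it back to see how many frames were stored
```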


(You can also try bugging me. I've got a mostly-finished pure-python LZW-compressed halfway-color-clever animated-GIF writer somewhere)