Introduction to gevent

One of the best kept secrets of Python

by Matias Guijarro

ESRF Software Group Lab presentation, 11/02/2013

Presenter Notes

A refresher on Concurrency

Presenter Notes

Concurrency

  • Simultaneous execution

  • Potentially interacting tasks

  • Examples

    • a network server that communicates with several hundred clients all connected at once
    • a big number crunching job that spreads its work across multiple CPUs

Presenter Notes

Multitasking

  • Concurrency implies multitasking

  • If only 1 CPU is available, the only way to run multiple tasks is by rapidly switching between them

Presenter Notes

Parallel processing

  • In the case of multiple CPUs

  • If total number of tasks exceeds the number of CPUs, then each CPU also multitasks

Presenter Notes

Task execution

  • All tasks execute by alternating between CPU processing and I/O handling

  • For I/O, tasks must wait (sleep)

  • Behind the scenes, the underlying system will carry out I/O operation and wake the task when it's finished

Presenter Notes

Overview of Concurrency models

Presenter Notes

Ways to Concurrency

  • Traditional multithreading

    • OS threads
    • Shared memory, locks, etc.
  • "Actor model" (from Erlang)

    • Multiple processes, share nothing
    • Messaging
  • Async or Evented I/O

    • I/O loop + callback chains

Presenter Notes

OS Threads

  • What most programmers think of when they hear about "concurrent programming"

  • Independent task running inside a parent program

    • relatively lightweight
    • share the memory and state of its parent
    • gets its own stack
    • independent flow of execution

Presenter Notes

Programming with threads is hard

Presenter Notes

Really hard

Presenter Notes

Shared memory

  • Tasks may run in the same memory space

  • Simultaneous access to objects

  • Often a source of unspeakable peril

Presenter Notes

Access to Shared Data

  • Threads share all data in your program

  • Thread scheduling is non-deterministic

  • Operations often take several steps and might be interrupted mid-stream (non-atomicity)

  • Thus, access to any kind of shared data is also non-deterministic !

Presenter Notes

Problems with threads

  • Race conditions

    • The corruption of shared data due to thread scheduling is often known as a "race condition"
    • It's quite diabolical--a program may produce slightly different results each time it runs
    • Or it may just flake out mysterioulsy once every two weeks...
  • Threads needs to be synchronized

    • Synchronization primitives: locks, semaphores, mutexes
    • A lot harder than it seems => deadlocks, nasty corner cases

Presenter Notes

Problems with threads with Python

  • Global Interpreter Lock

    • Only one Python thread can execute in the interpreter at once
    • GIL ensures each thread gets exclusive access to entire interpreter
  • Whenever a thread run, it holds the GIL

    • Impossible to have true parallelism
    • Except when having C extensions that release the GIL

Presenter Notes

The Actor Model (Erlang) "in Python"

  • An alternative to threads is to run multiple independent copies of the Python interpreter
    • In separate processes
    • Possibly on different machines
    • Get the interpreters to communicate through "message passing" (IPC)
      • Pipes, FIFOs, memory mapped regions, sockets, etc.

Presenter Notes

Multiprocessing module

  • Multiprocessing module is part of standard Python since 2.6

    • It mirrors Python's threading API
  • Messaging implies serialization

    • convert Python objects to a byte stream
    • done through the standard "pickle" module

Presenter Notes

The third path to Concurrency: Async. or Evented I/O

Presenter Notes

CPU bound tasks

  • A task is "CPU bound" if it spends most of its time with little I/O

  • Examples

    • Crunching big matrices
    • Image processing

Presenter Notes

I/O bound tasks

  • A task is "I/O bound" if it spends most of its time waiting for I/O

  • Examples

    • Reading input from user
    • GUI
    • Networking
    • File processing
  • Most programs are I/O bound

Presenter Notes

Async. or Evented I/O

  • "non-blocking I/O"

  • permits other processing to continue before I/O operation has completed

    • one of the main function of Operating Systems (e.g old "select" sys. call)
  • more scalable than threads or processes (much less memory consumption)

  • usually gives great response time, latency and CPU usage on I/O bound programs

  • one single main thread (SPED: Single Process Event Driven)

    • lots of troubles are avoided, easier to debug

Presenter Notes

Async. or Evented I/O

  • No silver bullet, though

    • a loop has to run to dispatch I/O operations
    • callbacks "spaghetti" make code less readable
  • Recent progress on Linux kernel since 2.6 (better threading) and 64-bits architecture (more memory) make it less interesting than before in term of pure performance

Presenter Notes

What is the best approach ?

Presenter Notes

Depends on the problem :)

Presenter Notes

The C10k problem (D. Kegel, 1999)

  • Comes from web servers world (I/O bound process)
    • How to handle 10.000 simultaneous connections ?

Presenter Notes

The C10k problem (D. Kegel, 1999)

  • Since 1999, solutions have emerged
    • 2002: Nginx, Lighttpd
      • successfully uses the Asynchronous I/O paradigm
      • 1 single thread to serve them all
    • 2003: "Why events are a bad idea (for High-concurrency servers)"
      • paper from Behren, Condit and Brewer (University of California, Berkeley)
      • read the paper! Conclusions:
        • weaknesses of threads (in term of performance) are artifacts of current implementations
        • compilers will evolve to help with thread safety issues
        • threads => classic programming model (no I/O callbacks)

Presenter Notes

Presenter Notes

Presenter Notes

2nd part: gevent

Presenter Notes

What is gevent ?

  • Python concurrency library

  • based around:

  • clean API for a variety of concurrency and network related tasks

Presenter Notes

Talk is cheap - show me the code

  • Download of 4 files over the Internet
    • Using threads
    • Using async I/O
    • Using gevent

Files are taken from the W3C web site: http://www.w3.org

1 files = ("/TR/html401/html40.txt", 
2          "/TR/2002/REC-xhtml1-20020801/xhtml1.pdf",
3          "/TR/REC-html32.html",
4          "/TR/2000/REC-DOM-Level-2-Core-20001113/DOM2-Core.txt")

Presenter Notes

Threaded version

 1 class download_task(threading.Thread):
 2   def __init__(self, host, filepath, proxy="proxy.esrf.fr:3128"):
 3     threading.Thread.__init__(self)
 4 
 5     self.host = host
 6     self.filepath = filepath
 7     self.proxy = proxy
 8     self.file_contents = ""
 9 
10   def run(self):
11     s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
12     proxy_host, proxy_port = self.proxy.split(":")
13     s.connect((proxy_host, int(proxy_port)))
14 
15     s.send("GET http://"+self.host+self.filepath+" HTTP/1.0\r\n\r\n")
16 
17     buf = []
18     while True:
19       data = s.recv(1024)
20       if not data:
21         break
22       buf.append(data)
23 
24     s.close()
25     self.file_contents = "".join(buf)
26 
27 tasks = []
28 for filepath in files:
29   tasks.append(download_task("www.w3.org", filepath, "proxy.esrf.fr:3128"))
30 [task.start() for task in tasks]
31 [task.join() for task in tasks]

Presenter Notes

Asynchronous flavor

Use of the 'asyncore' standard Python module. Callbacks are fired when socket is ready to do a non-blocking read or write operation.

 1 class download_task(asyncore.dispatcher):
 2   def __init__(self, host, filepath, proxy="proxy.esrf.fr:3128"):
 3     asyncore.dispatcher.__init__(self)
 4 
 5     self.host = host
 6     self.filepath = filepath
 7     self.proxy = proxy
 8     self.buffer = []
 9     self.file_contents = ""
10     self.request_sent = False
11 
12   def start(self):
13     self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
14     proxy_host, proxy_port = self.proxy.split(":")
15     self.connect((proxy_host, int(proxy_port)))
16 
17   def handle_close(self):
18     self.close()
19     self.file_contents = "".join(self.buffer)
20 
21   def handle_read(self):
22     data = self.recv(1024)
23     self.buffer.append(data)
24 
25   def writable(self):
26     return not self.request_sent
27 
28   def handle_write(self):
29     self.send("GET http://"+self.host+self.filepath+" HTTP/1.0\r\n\r\n")
30     self.request_sent = True
31 
32 tasks = []
33 for filepath in files:
34   tasks.append(download_task("www.w3.org", filepath, "proxy.esrf.fr:3128"))
35 [task.start() for task in tasks]
36 
37 asyncore.loop()

Presenter Notes

Gevent version

Fully asynchronous without callbacks

 1 def download_task(host, filepath, proxy="proxy.esrf.fr:3128"):
 2   s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
 3   proxy_host, proxy_port = proxy.split(":")
 4   self.connect((proxy_host, int(proxy_port)))
 5 
 6   s.send("GET http://"+host+filepath+" HTTP/1.0\r\n\r\n")
 7 
 8   buf = []
 9   while True:
10     data = s.recv(1024)
11     if not data:
12       break
13     buf.append(data)
14 
15   s.close()
16   return "".join(buf)
17 
18 tasks = []
19 for filepath in files:
20   tasks.append(gevent.spawn(download_task, "www.w3.org", filepath, "proxy.esrf.fr:3128"))
21 gevent.joinall(tasks)

Presenter Notes

How does it work ?

Presenter Notes

Presenter Notes

greenlet

  • greenlet provides coroutines to Python via a C extension module

  • spin-off of Stackless, a version of Python supporting micro-threads

    • micro-threads are coroutines + a scheduler
  • the idea of coroutine is from a 1963 paper from Melvin Conway

    • like a normal subroutine, except that it has yielding points instead of a single return exit
    • when yielding, execution goes to another coroutine
  • gevent trick #1: yield automatically when doing blocking I/O (or when 'sleeping')

    • libev provides efficient async. I/O and a scheduler

Presenter Notes

greenlet

Execution flow example

Greenlets execution is deterministic. No preemption.

Cooperative scheduling

Presenter Notes

Presenter Notes

gevent trick #2 to the rescue

Presenter Notes

Monkey-patching

from gevent import monkey; monkey.patch_all()

  • Python allows for most objects to be modified at runtime including modules, classes, and even functions

    • this is generally an astoudingly bad idea!!!
    • nevertheless in extreme situations where a library needs to alter the fundamental behavior of Python itself monkey patches can be used
  • gevent patches blocking system calls in the standard library including those in socket, ssl, threading and select modules to instead behave cooperatively

  • Beware

    • libraries that wrap C libraries
      • it is possible to patch those: greenify (but requires recompiling)
    • disk I/O

Presenter Notes