I know it's an area of some debate, but at least to me, greenlets are easier to write and understand than callbacks, by an order of magnitude.
Greenlets work especially well with Python's "with" statement. You can have a database pool like this:
with (yield from pool.connection()) as conn:
    users = yield from conn.query('select * from users')
    for user in users: ...
Now your database connection is only locked up inside the "with" statement, and the whole thing is asynchronous even though it's written in a synchronous style.
Nice, thorough article. Kudos for showing all the alternatives. That said, I think you had a strong preference for one, and that shone through. For example:
With greenlets, you say: Why are greenlets great? Because they allow writing asynchronous code in a synchronous fashion. They allow using existing, synchronous libraries asynchronously. Context-switching magic is hidden by the greenlet implementation.
Greenlets don't magically make synchronous libraries asynchronous! That's gevent! Greenlets are just the coroutines. Plus, you're forgetting that "magically" making sync libraries async doesn't actually work in plenty of cases, and the way that it works generally involves monkeypatching half the IO bits in the standard library!
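(For reference, that monkeypatching is a gevent thing, and it looks roughly like this; monkey.patch_all is the real API:)

from gevent import monkey
monkey.patch_all()  # swaps socket, ssl, time.sleep, threading, ... for cooperative versions

import socket  # unchanged, synchronous-looking stdlib code now yields to gevent's hub instead of blocking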
Also, there's a bit of a false dichotomy. You claim that there are two options: coroutines or callbacks. That might be true, but then you need to keep in mind that callbacks don't necessarily mean something-looking-like-the-Tornado-callback-demo-code. That might seem like a nitpicky implementation detail, but that's the sort of thing that lets Twisted do inlineCallbacks (where you write generators instead of coroutines) or Corotwine (a third-party package where you get actual coroutines) or geventreactor (a third-party package where the event loop is run by gevent, since the other way around isn't actually supported).
The inline callbacks equivalent, given the same API, would be something along the lines of:
data = yield get_more_data()
return make_response(data)
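Spelled out a bit more (a sketch; get_more_data and make_response are the same hypothetical API as above, while the decorator and returnValue are real Twisted):

from twisted.internet import defer

@defer.inlineCallbacks
def handle():
    data = yield get_more_data()  # yields a Deferred; the generator resumes with its result
    defer.returnValue(make_response(data))  # how inlineCallbacks "returns" (a plain return also works on Python 3)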
The twisted equivalent of this wouldn't even look like data = yield get_more_data(); Twisted's API calls you when there's data, so it looks even simpler:
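Something along these lines (a sketch, not the article's code; dataReceived is the real Protocol callback, make_response is the same hypothetical helper):

from twisted.internet import protocol

class Echoish(protocol.Protocol):
    def dataReceived(self, data):
        # Twisted calls this whenever bytes arrive; no yield, no explicit callback wiring.
        self.transport.write(make_response(data))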
Also, txsockjs now integrates great with the Twisted Resource API, so it will live neatly side by side with existing web stuff you're serving from twisted, which can include e.g. WSGI apps (since twisted comes with a wsgi server :))
I'd love to have a cogent argument for the things you didn't like about Twisted, but two out of three appear to be more or less the same thing ("complex", "hard to learn"), and, on top of that, subjective. I'm sorry you had a hard time; it shouldn't have to be that hard. If you have any specific issues, I would like to address them. I'm assuming that by "not PEP-8" you mean mostly "uses camelCase", in which case I'll give the usual apologist answer:
- Twisted predates PEP-8's recommendation of snake_case ;-)
- It's actually PEP-8 compliant: the PEP says to do what the code around you does, and something about consistency being a hobgoblin ;-)
Also, you mention that you can run Twisted on Tornado, but not Cyclone. Is there a particular reason for that? As far as I can tell, they get you the same result (mixing Twisted and Tornado code), but the reactor that ships with Tornado just gives you fewer event loop options (and generally inferior ones).
It's not inherently async, but it supports async. Although database support for that isn't great, so I'd recommend just sticking to synchronous stuff for now unless you don't mind using raw SQL.
If it were me, I'd probably use something else before I tried to squeeze that much performance out of Python at this point, though. I hope library support catches up and it becomes more convenient, but I don't think a predominantly synchronous ecosystem has really successfully transitioned to async before.
thanks for sqlalchemy. it's a shining star in the Python package ecosystem.
> interface asyncio frontend and backend (like asyncpg for the driver) while maintaining all the "in the middle" code as synchronous style. the dark secret is that for this approach you have to use greenlet to propagate the "await" operations. It's been in our release for 18 months now with 1M downloads a day and there have been no problems reported with it, so i continue to wonder why nobody else seems to want to look at this approach.
do you have a blog post or some other kind of writeup to read about it? maybe nobody else had the same idea and it works so well nobody is aware that it can be done...
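For anyone else wondering, my rough mental model of the trick is something like this (purely an illustrative sketch of a greenlet-based await bridge, not SQLAlchemy's actual code; it needs the third-party greenlet package):

import asyncio
import greenlet

def await_(coro):
    # Called from sync-style code running inside a child greenlet:
    # hand the coroutine to the parent, which awaits it and switches back with the result.
    return greenlet.getcurrent().parent.switch(coro)

async def greenlet_spawn(fn, *args):
    # Run a sync-style function in a child greenlet, awaiting whatever it hands back via await_().
    result = {}

    def runner():
        result["value"] = fn(*args)

    child = greenlet.greenlet(runner)
    value = child.switch()
    while not child.dead:
        value = child.switch(await value)  # await the coroutine, resume the child with its result
    return result["value"]

async def fetch_row_async(pk):
    await asyncio.sleep(0.1)  # stand-in for an async driver call like asyncpg
    return {"id": pk}

def sync_looking_code(pk):
    row = await_(fetch_row_async(pk))  # looks blocking, actually suspends on the event loop
    return row["id"]

print(asyncio.run(greenlet_spawn(sync_looking_code, 42)))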
"""
if we squint a little it looks almost like a synchronous call..
you can take a piece of code that was written using synchronous calls, and mechanically translate it into a tasklet: just decorate your functions with @tasklets.tasklet and change your synchronous calls from
result = some_function(args)
into
result = yield some_function_async(args)
"""
I feel like this style of coding (e.g. Twisted) is stretching Python generators a bit far... are callbacks really that hard to work with? And this is clever and all, but what happens when you forget the yield call, or when there is an exception somewhere down the line? How hard will it be to figure out what is going wrong? I would prefer the work went into better function literals in Python along with a node.js-style approach.
That aside, glad to see better asynchronous support coming to App Engine.
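For example, something along these lines (a sketch; UserModel is made up, but ndb.tasklet, ndb.Return and get_by_id_async are the real ndb API):

from google.appengine.ext import ndb

class UserModel(ndb.Model):
    name = ndb.StringProperty()

@ndb.tasklet
def get_user_name(user_id):
    user = yield UserModel.get_by_id_async(user_id)  # yields an ndb Future; other tasklets can run meanwhile
    raise ndb.Return(user.name)  # tasklets "return" values via ndb.Return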
The author of SQLAlchemy (which recently released great async support) wrote a great post about how async can actually lead to a small decrease in performance compared to the older threading model, at least for db related functions in a web backend context. [1]
For websockets or lots of network calls, you can't really beat async though.
> Greenlets don't magically make synchronous libraries asynchronous! That's gevent!
That's right - I wrote that they allow writing code in synchronous fashion and I didn't say that it happens automagically though.
> That might be true, but then you need to keep in mind that callbacks don't necessarily mean something-looking-like-the-Tornado-callback-demo-code.
Yep, I also showed how Futures and generator-based coroutines can be built on top of the callbacks.
Just in case, Tornado supports both out of the box.
> The twisted equivalent of this wouldn't even look like data = yield get_more_data(); Twisted's API calls you when there's data, so it looks even simpler
Very similar to some Tornado APIs too: a method of a class gets called when something happens. For example, sockjs-tornado follows this convention.
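E.g. (a sketch; EchoConnection is made up, while SockJSConnection, on_message and SockJSRouter are the real sockjs-tornado API):

from sockjs.tornado import SockJSConnection, SockJSRouter
import tornado.web

class EchoConnection(SockJSConnection):
    def on_message(self, message):
        # sockjs-tornado calls this when a message arrives on the connection
        self.send(message)

router = SockJSRouter(EchoConnection, '/echo')
app = tornado.web.Application(router.urls)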
> - Twisted predates PEP-8's recommendation of snake_case ;-)
Yes, I'm aware of that, but before I started playing with asynchronous libraries, I had some Python experience and got used to the underscore naming convention. While it is easy to switch between the two, I'd prefer a consistent code style, especially when I mix a typical Flask web application with a real-time portion in the same application.
No doubt, Twisted is a mature and featureful framework. When I was investigating different options, Twisted was the first one I tried. However, I also tried Tornado and found it easier to start with. And because Tornado worked for me, I decided to stick with it.
I personally like the coroutine approach (greenlets) better, since your code looks synchronous without squinting and usually performs well (there are various optimizations in Stackless Python, and PyPy has support as well). With libraries like gevent and its API inspired by multiprocessing, asynchronous handling of code blocking on I/O just becomes another option next to using threads, processes or a computing cluster (sketched below).
Whereas with the node.js/Twisted approach (excluding @inlineCallbacks), the implementation detail of how you talk to the OS kernel about I/O becomes the force shaping how you design your program and your API.
Sure, there's a place for callbacks in Python. But as Python programmers, we can afford to reserve them for where they belong, like handling high-level events (i.e. the observer pattern).
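A quick illustration of that "just another option" feel (a sketch; work is made up and uses gevent.sleep to stand in for blocking I/O, while gevent.pool.Pool mirrors multiprocessing.Pool):

import gevent
from gevent.pool import Pool

def work(n):
    gevent.sleep(1)  # pretend this is a blocking network call
    return n * n

pool = Pool(5)  # same shape as multiprocessing.Pool, but greenlets instead of processes
print(pool.map(work, range(10)))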
Eventlet is a networking library that uses greenlet, a Python implementation of coroutines [http://codespeak.net/py/0.9.2/greenlet.html], to handle asynchronous events in a synchronous way. A greenlet can be thought of as an analogue to Python's enhanced generators, but without the "yield" keyword and the limitations it brings.
For applications that are network-bound, it effectively brings Erlang-like scalability to Python: you can spawn as many greenlets as you like (they're cheap), e.g. one or several per incoming connection, and control their execution, etc.
If that sounds unclear, just look at the examples:
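(The general flavor is something like this; a generic eventlet sketch, not the branch's own example code:)

import eventlet
from eventlet.green import urllib2  # "green" stdlib modules that yield on I/O

def fetch(url):
    return urllib2.urlopen(url).read()

pool = eventlet.GreenPool(200)
for body in pool.imap(fetch, ["http://www.google.com", "http://wiki.secondlife.com"]):
    print("fetched %d bytes" % len(body))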
Compared to the original eventlet (home: http://wiki.secondlife.com/wiki/Eventlet), our branch has:
- a number of bugs fixed and "onions" removed
- twisted integration - you can use any twisted reactor or twisted protocol or
any other feature you like (this part is heavily influenced by corotwine)
- proc module for advanced coroutine control (spawn, link, waitall, etc)
I used that at some point, but I tend to prefer the explicit yield of asyncio to the implicit context change of greenlets. (This blog post is a good explanation of that: https://glyph.twistedmatrix.com/2014/02/unyielding.html)
I wouldn't look at that. That is 5 years old, and that was a different Diesel framework.
Ok, same name (a cool one, I must admit), and I think the same group of people. But it did concurrency differently. It used to be a yield-based API, which, funny enough, is like Guido's latest Tulip/asyncio approach. And it seems to me they (Diesel) decided against it, and I 100% agree. Having yields in the library code, aside from demos and small examples, doesn't work very well.
I think Diesel creators came to their approach from a practical standpoint. And I think Guido (and other proponents of async yield based IO) came from a purist/theoretical approach. We'll see what will happen in the future. It is hard to tell. But I don't see yields everywhere as a good thing. Just like I don't see callbacks or deferreds everywhere as a good way to do concurrency.
greenlet (the base module that gevent and eventlet are built on) is probably the best thing that happened to Python in its recent history. It is really unfortunate it wasn't adopted as the main concurrency mechanism in the latest version.
We've been using Python (2.7) and Postgres, where I work, and I must say I really miss the async feature. It's really convenient to be able to parallelize and saturate I/O very quickly. Right now, we're using a combination of threads and processes without proper messaging, which leads to deadlocks and other nasty things.
That's one thing I've been quite pleased with, when using Scala.
In asyncio and similar systems, the choice between "is this code synchronous or asynchronous" is made at declaration time: either you define a sync function or an async one, and you can only use it in that way.
In gevent and similar systems, the same choice is made at call time: there's no distinction between sync and async functions; instead, you can call synchronously with "foo()" or asynchronously with "future = gevent.spawn(foo)" (I'm taking some liberties by calling the returned greenlet object a future, but it can be used as such).
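A tiny illustration of the difference (a sketch; fetch is made up, the asyncio and gevent calls are real):

import asyncio
import gevent

# asyncio: the sync/async split is fixed when the function is defined.
async def fetch_async():
    await asyncio.sleep(1)
    return "data"

data = asyncio.run(fetch_async())  # callers must run/await it; calling it plainly just builds a coroutine object

# gevent: one ordinary function; the choice is made at the call site.
def fetch():
    gevent.sleep(1)
    return "data"

data_sync = fetch()  # call it synchronously...
g = gevent.spawn(fetch)  # ...or concurrently, getting back a greenlet that acts like a future
data_async = g.get()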
We experimented with different async DB approaches, but settled on
synchronous at FriendFeed because generally if our DB queries were
backlogging our requests, our backends couldn't scale to the load
anyway. Things that were slow enough were abstracted to separate
backend services which we fetched asynchronously via the async HTTP
module.
I may open source the async MySQL client I wrote, but I am still
skeptical of the long term value given the code complexity it
introduces.
Bret