Asynchronous Python

It can’t be premature optimisation if it took 20 years to start

March 24, 2018 — November 18, 2024

computers are awful
concurrency hell
python
Figure 1

Not covered: Niceties of asynchrony, when threads run truly concurrently (it’s complicated), when evented poll systems are “truly” asynchronous (never, but it doesn’t matter).

🏗 cover uvloop.

1 Baseline asyncio ecosystem

Modern Python async-style stuff.

Raw asyncio is getting civilised these days and might be worth using. But the relationship between the various bits is complicated, and I no longer need to do this, so my advice might be stale. G’luck.
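For orientation, here is a minimal sketch of what raw asyncio looks like using only the standard library. The coroutine name fake_fetch and the delays are made up for illustration; asyncio.sleep stands in for real network I/O.

```python
import asyncio


async def fake_fetch(name, delay):
    """Pretend to do some I/O; asyncio.sleep stands in for a network call."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs the coroutines concurrently on one event loop and
    # returns their results in the order the awaitables were passed in.
    return await asyncio.gather(
        fake_fetch("a", 0.02),
        fake_fetch("b", 0.01),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both sleeps overlap, so the whole thing takes roughly as long as the slowest task rather than the sum of the two.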

I’ve noticed that it has sometimes been easier to use the event loop from tornado or pyzmq. They are comparatively easy and well-documented.

Here are some links I found helpful in getting asynchronous Python working:

  • BBC’s tutorial Python Asyncio Part 1 – Basic Concepts and Patterns

  • HTTPX “is a fully featured HTTP client for Python 3, which provides sync and async APIs, and support for both HTTP/1.1 and HTTP/2.”

    Seems to aim to be the future version of the popular Python requests library.

  • aiohttp seems to be the ascendant asynchronous server/client Swiss army knife for HTTP stuff.

  • sanic is a hip, Python3.8+-only, Flask-like web server. Supports websocket and graphql extensions if you really want them.

  • pallets/quart seems similar to sanic but with an even more Flask-like API.

  • backoff is a handy Python library for a menial and common task, retrying with a slightly longer delay.

  • rx exists for Python as rxpy and is tornado compatible.

  • terminado provides a terminal for tornado, for quick and dirty interaction.

    This feels over-engineered to me, but looks easy for some common cases.

  • 0mq itself is attractive because it already uses tornado loops, and can pass numpy arrays without copying.

  • aiomonitor injects a REPL into a running async Python program.
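The retry-with-a-growing-delay chore mentioned in the list above can be sketched in plain asyncio. This is not the backoff library's API, just a hand-rolled illustration of the idea; retry_with_backoff and its parameters are hypothetical names.

```python
import asyncio
import random


async def retry_with_backoff(make_call, max_tries=4, base_delay=0.01, factor=2.0):
    """Await make_call(), retrying on any exception with a growing delay.

    make_call is a zero-argument function returning a fresh awaitable.
    All names and defaults here are illustrative assumptions.
    """
    delay = base_delay
    for attempt in range(1, max_tries + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == max_tries:
                raise  # out of attempts: let the last error propagate
            # Sleep for a jittered delay, then lengthen it for next time.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
            delay *= factor
```

In real code we would probably catch only the exception types that are worth retrying (timeouts, connection errors) rather than a bare Exception.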

2 Trio

trio is what my colleagues seem to use for green-field developments:

The Trio project’s goal is to produce a production-quality, permissively licensed, async/await-native I/O library for Python. Like all async libraries, its main purpose is to help you write programs that do multiple things at the same time with parallelized I/O. A web spider that wants to fetch lots of pages in parallel, a web server that needs to juggle lots of downloads and websocket connections at the same time, a process supervisor monitoring multiple subprocesses… that sort of thing. Compared to other libraries, Trio attempts to distinguish itself with an obsessive focus on usability and correctness. Concurrency is complicated; we try to make it easy to get things right.

Trio was built from the ground up to take advantage of the latest Python features, and draws inspiration from many sources, in particular Dave Beazley’s Curio. The resulting design is radically simpler than older competitors like asyncio and Twisted, yet just as capable. Trio is the Python I/O library I always wanted; I find it makes building I/O-oriented programs easier, less error-prone, and just plain more fun. Perhaps you’ll find the same.

3 Alternative asynchronous ecosystems

Yes, as always there is something newer and hipper and more artisanal.

curio:

Curio is a library for performing concurrent I/O and common system programming tasks such as launching subprocesses and farming work out to thread and process pools. It uses Python coroutines and the explicit async/await syntax introduced in Python 3.5. Its programming model is based on cooperative multitasking and existing programming abstractions such as threads, sockets, files, subprocesses, locks, and queues. You’ll find it to be small and fast.

The essay that explains why there is a different asynchronous ecosystem: Nathaniel J. Smith, Some thoughts on asynchronous API design in a post-async/await world.

Curio doesn’t have much in the way of tooling and seems to be on hiatus. For HTTP requests, for example, you might use curequests or asks. For a server, you might import a raw HTTP2 library and go bareback.

There is also the ancient and justified Twisted which is a monster, but has a lot of features.

4 Idioms

5 Threaded asynchrony

Sometimes we need it? But I don’t have much to say, and am no expert.

For threaded and multi-process concurrency we sometimes need simple shared variables. See, e.g., the counters HOWTO.
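As a rough sketch of the shared-variable case for threads only (not taken from the HOWTO, and the names Counter and count_in_threads are invented here), a counter guarded by a threading.Lock might look like this:

```python
import threading


class Counter:
    """A counter that is safe to increment from several threads."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write a single atomic step.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def count_in_threads(n_threads=4, n_increments=1000):
    """Increment one shared counter from n_threads threads and return the total."""
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(count_in_threads())
```

For multiple processes the same shape applies, but the counter has to live in shared memory, e.g. a multiprocessing.Value with its own lock.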

6 Locking resources

If we are doing parallel stuff, we need locking to avoid two things doing something at the same time that should not be done at the same time. portalocker is a handy tool to lock files and optionally other stuff.
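To make the file-locking idea concrete without guessing at portalocker's interface, here is a Unix-only sketch using the standard library's fcntl.flock instead. The helper name append_locked is hypothetical, and this is advisory locking: it only protects against other processes that also take the lock.

```python
import fcntl
import os
import tempfile


def append_locked(path, text):
    """Append text to path while holding an exclusive advisory lock (Unix only)."""
    with open(path, "a") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until no one else holds the lock
        try:
            fh.write(text)
            fh.flush()  # push the data out before releasing the lock
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "log.txt")
    append_locked(demo_path, "hello\n")
    with open(demo_path) as fh:
        print(fh.read(), end="")
```

A library such as portalocker is attractive mostly because it papers over the platform differences that this sketch ignores.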