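  • So, the word here is concurrency. It’s not something specific to Python; asyncio is just Python’s implementation of asynchronous execution, which lets a single thread juggle many tasks that spend most of their time waiting. (Doing several things at the very same instant is parallelism, which is a different thing.)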

    Imagine a pizza restaurant that has one cook. This is your typical non-async, non-threading Python script: single-threaded.
    The cook checks for new orders, picks up the first one and starts making the pizza one instruction at a time: fetching the dough, waiting for the ham slicer to finish slicing, … eventually putting the unbaked pizza into the oven and sitting there waiting for it to bake.
    The cook is rather inefficient here. Instead of waiting for the ham slicer and the oven to finish their jobs, he could be picking up new orders, starting new pizzas and fetching/making other ingredients.

    This is where asynchronicity comes in as a solution. The cook is your single thread, and the machines are mechanisms that have to be started but don’t have to be waited on. In a program these are usually various sockets and file buffers (notice that these are things your OS can handle for you on the side, hence async IO).
    So, the cook configures the ham slicer (puts a block of ham in) and starts it, but does not stand there waiting for each slice to fall out so he can put it on the pizza. Instead he picks up a new order and goes through the motions until the ham slicer is done (or until he needs the slicer for a different ingredient; in that case he has to wait for the ham task to finish first, put …cheese there and switch back to finishing the first order with ham).

    With proper asynchronicity your cook can now handle a lot more pizza orders, simply because less of his time is spent waiting.
    Making a single pizza is not any faster, but in total the cook can make more of them in the same time. This is the important bit.
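To make the "more pizzas in the same time" point concrete, here is a minimal sketch. The names make_pizza and bake_seconds are made up for illustration, and asyncio.sleep just stands in for a machine (oven, slicer) that works without the cook watching it.

```python
import asyncio
import time

async def make_pizza(order, bake_seconds):
    # asyncio.sleep stands in for the oven: the cook (event loop) is free
    # to work on other orders while this one is "baking".
    await asyncio.sleep(bake_seconds)
    return f"{order} done"

async def main():
    # Start all three orders and wait until every one of them is finished.
    return await asyncio.gather(
        make_pizza("margherita", 0.2),
        make_pizza("ham", 0.2),
        make_pizza("funghi", 0.2),
    )

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Each pizza still takes its 0.2 seconds, but the three waits overlap, so the whole batch should take roughly as long as one pizza rather than three.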


    Coming back to why an async REPL is useful: it simply comes down to how Python implements async, with special (“colored”) functions:

    async def prepare_and_bake(pizza):
        # `oven` is assumed to be some object with async methods (not defined here)
        # await: a context switch can occur here, and Python checks whether other asynchronous tasks can be continued/finalized,
        # so instead of blocking while the oven is busy, the cook looks for other tasks to do
        await oven.is_empty()
        await oven.bake(pizza)
        ...
    

    The function prepare_and_bake() is an asynchronous function (async def), which makes it special. I would have to dive into event loops to fully explain why an async REPL is useful, but in short: you can’t just call an async function to execute it. Calling it only creates a coroutine object, which you then have to schedule on an event loop.
    The async REPL is here to help with that, allowing you to do await prepare_and_bake() directly in the REPL.
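A small sketch of what "you have to schedule it" looks like in a normal script. The body of prepare_and_bake here is a stand-in I made up (a short asyncio.sleep instead of a real oven), so only the calling pattern is the point.

```python
import asyncio

async def prepare_and_bake(pizza):
    # Stand-in for the real work: pretend the oven takes a moment.
    await asyncio.sleep(0.01)
    return f"{pizza} baked"

# Calling an async function does not run its body, it only builds a coroutine object.
coro = prepare_and_bake("margherita")
print(type(coro).__name__)  # coroutine

# asyncio.run() starts an event loop, runs the coroutine to completion and returns its result.
result = asyncio.run(coro)
print(result)  # margherita baked
```

In the asyncio REPL (started with python -m asyncio, available since Python 3.8) an event loop is already running for you, so you can type await prepare_and_bake("margherita") at the prompt without the asyncio.run() wrapper.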


    And to give you an example where async does not help: you can’t speed up cutting onions with a knife, or grating cheese.
    Now, if every ordered pizza required a lot of cheese, you might want to employ a second cook to do these tasks ahead of time (and “buffer” the processed ingredients in a bowl so that your primary cook does not always have to wait for the other cook to start and finish).

    This is called parallelism: multiple tasks that require direct work and can’t be relegated to a machine (the OS; or to be precise, tasks that can’t just be started and awaited) are done at the same time by separate workers.
    In a real example, if something requires a lot of computation (calculating something, like getting the nth Fibonacci number, applying a function to a list with a lot of entries, …) you would want to employ secondary threads or processes so that your main thread does not get blocked.
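As a rough sketch of the "second cook" idea: below, a deliberately slow fib runs in a worker thread via asyncio.to_thread (Python 3.9+), while the event loop keeps handling other work. The names fib and take_orders are made up for illustration. One caveat: because of CPython's global interpreter lock, a thread keeps the main loop responsive but does not make pure-Python number crunching faster; for that, a process pool (for example concurrent.futures.ProcessPoolExecutor) would be the usual choice.

```python
import asyncio

def fib(n):
    # Plain, CPU-bound function: there is nothing to await in here.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def take_orders(stop):
    # The "primary cook": keeps handling small tasks until told to stop.
    handled = 0
    while not stop.is_set():
        await asyncio.sleep(0.001)
        handled += 1
    return handled

async def main():
    stop = asyncio.Event()
    orders = asyncio.create_task(take_orders(stop))
    # Hand the heavy computation to a worker thread so the event loop is not blocked.
    value = await asyncio.to_thread(fib, 22)
    stop.set()
    return value, await orders

value, handled = asyncio.run(main())
print(value, handled)
```

If fib(22) were called directly inside main() instead, take_orders would not get to run at all until the computation had finished.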

    To summarize, async/concurrency helps in cases where you can delegate (IO) processing to the OS (usually reading from or writing to a buffer). It does not make anything go faster in itself, just more efficient, as you don’t spend so much time waiting, which is often a problem in single-threaded applications.

    Hopefully this was a somewhat understandable explanation haha. Here is some recommended reading: https://realpython.com/async-io-python/

    Final EDIT: Having read it myself a few times, a pizza bakery example is not optimal. A better example would have been something where one has to talk with other people who don’t respond immediately, to better drive home that this is mainly used for Input/Output tasks.