Ask HN: event loops vs. greenthreading in modern languages

7 points by hot_gril 4 months ago

Most explanations I've seen on "async programming" bury the reason why we're doing this in the first place. It's purely to avoid making OS threads wait on I/O in a concurrent application, such as a webserver. So languages or even libraries instead have various ways to switch tasks on a single thread. If OS threads were cheap enough to just spawn one for each request, this wouldn't be a thing.

Some languages with built-in event loops have async/await syntax, like JS, C#(?), and now Python and Rust. This is convenient syntax. Java, ObjC, etc also had a lot of futures-based stuff that mangled up your code somewhat. Callbacks were also common in ObjC, which gave you crazy nesting. Either way, you're explicitly saying when it can switch tasks, aka what you expect to block.
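
A minimal sketch of that last point, in Python and assuming `asyncio` as the event loop (the names `worker` and `main` are made up for illustration). Each `await` marks a spot where the loop is allowed to switch tasks; between two awaits a task runs without interruption:

```python
import asyncio


async def worker(name: str, log: list[str]) -> None:
    log.append(f"{name} start")
    # The only point in this function where the event loop may switch tasks.
    await asyncio.sleep(0)
    log.append(f"{name} end")


async def main() -> list[str]:
    log: list[str] = []
    # Both workers share one OS thread; gather() starts them in argument order.
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


log = asyncio.run(main())
print(log)  # ['a start', 'b start', 'a end', 'b end']
```

The interleaving is deterministic here because the switch happens at the `await` and nowhere else.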

Golang has greenthreading ("goroutines") instead, and Java recently got them back (as "virtual threads"), which is what I'm less sure about. I get that N greenthreads can run on 1 OS thread, and they're cheap enough to spawn whenever you want. This is less explicit than the async/await way, because the runtime automatically decides what is blocking (i.e. when to switch out the greenthread running on the OS thread), which I assume is based on what syscalls it uses.

So now I'm wondering, if Golang and now Java can task-switch without the user having to tell it when, is there any point of doing it explicitly like in JS or Rust? Is it faster or something, or are other runtimes just not able to tell when exactly things are blocking?

steveklabnik 4 months ago

> It's purely to avoid making OS threads wait on I/O in a concurrent application

This isn't true. That is, this is an important part of it, but that's not the only part. Thread APIs don't support things like cancellation natively, and async/await lets you write state machines in a way that doesn't look like a state machine.
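
A rough sketch of the cancellation point, in Python's `asyncio` rather than Rust (an assumption made only to get a short runnable example; Rust cancels a future by dropping it, which works differently). `slow_request` and `main` are hypothetical names. The task is cancelled while suspended at an `await`, and gets a chance to clean up:

```python
import asyncio


async def slow_request(events: list[str]) -> str:
    try:
        await asyncio.sleep(10)  # stand-in for a slow I/O operation
        return "done"
    except asyncio.CancelledError:
        events.append("cleanup")  # release resources here
        raise  # re-raise so the task is actually marked as cancelled


async def main() -> list[str]:
    events: list[str] = []
    task = asyncio.create_task(slow_request(events))
    await asyncio.sleep(0)  # let the task start and reach its await
    task.cancel()  # request cancellation at the task's current await
    try:
        await task
    except asyncio.CancelledError:
        events.append("cancelled")
    return events


events = asyncio.run(main())
print(events)  # ['cleanup', 'cancelled']
```

Python's `threading` module has no equivalent way to cancel a thread from the outside; the usual workaround is a flag the thread has to poll itself.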

> If OS threads were cheap enough to just spawn one for each request, this wouldn't be a thing.

These are both advantages regardless of the overhead of spawning OS threads.

> So now I'm wondering, if Golang and now Java can task-switch without the user having to tell it when, is there any point of doing it explicitly like in JS or Rust?

Yes. I actually gave two talks on this a while back; there are transcripts on these pages:

* https://www.infoq.com/presentations/rust-2019/

* https://www.infoq.com/presentations/rust-async-await/

The first one is more of what you're asking about, and the second one is how Rust's design here works.

One short way to answer the question though, is about this part:

> because the runtime automatically decides what is blocking

Yes, in Rust specifically, there is no runtime, and so you cannot make these guarantees.

I hope the first link answers things more thoroughly than that, but that's one simple way into thinking about this.

hot_gril 4 months ago

Thanks for the links, Steve, good to hear from someone on the Rust team. Somehow I automatically assumed Rust's executors sorta acted like a runtime, but now that you mention it, no reason they have to intercept syscalls. Anyway, I'm watching the rest of those videos now.

And yeah, there's always that extra step of handling thread cancellation yourself. I probably shouldn't have said it's purely about cooperative multitasking, but the docs on this (e.g. Microsoft's C# guide) still obscure these points a lot of the time.
- steveklabnik 4 months ago

  You're welcome! Just to be clear, I'm not on the Rust team anymore.

  > I automatically assumed Rust's executors sorta acted like a runtime,

  I mean, they do; it's just that you can't force all code to go through the executor, and forcing everything through the runtime is how a runtime gets to "intercept syscalls."

oftenwrong 4 months ago

The reason for avoiding OS threads is not only the cost of creating them, but also the overhead of scheduling them. The kernel uses preemptive multitasking, so it does work to determine when it should switch control between threads, and which threads should be running, and there is a cost of performing that context switch. The OS also can tell from syscalls when a thread is waiting on I/O.

Java virtual threads are more similar to async/await since they are a form of cooperative multitasking. That is, virtual threads must explicitly tell the runtime when they are ready to yield execution to other tasks. In practice, the programmer does not need to do this manually; the low-level 'blocking' operations in the JDK have been modified to signal this state. This is why you do not need to use an explicit syntax like async/await. Under the bonnet, virtual threads are mounted and unmounted to/from platform threads (JVM wrappers for OS threads) in a pool. All of this involves overhead of copying virtual thread contexts to and from the heap, and of managing the scheduling of virtual threads.

Why don't other languages use this implicit approach? I am sure there are various reasons, but I cannot really speak to the specifics. I would be interested to know as well. I would guess that the overhead of the implementation is one major reason.

hot_gril 4 months ago

Yeah, it makes a lot of sense in the JVM or Golang runtime where you're already paying for indirection. My first guess for why JS didn't do greenthreading is simplicity of implementation.

I know that threads are expensive to schedule in the OS, but admittedly I don't remember exactly why. There's context-switching and such, but greenthreads also have to do their own analog of that. I know there are answers online of why OS threads can't be cheap like Golang greenthreads, just gotta dig through them.

Also turns out someone asked a similar question on Reddit, but I don't see an answer that nails it: https://www.reddit.com/r/ProgrammingLanguages/comments/prr8j...