Well, to a degree this is also done by the OS: blocking IO syscalls are frequent points where the OS scheduler may decide to run another thread, but that is a very slow context switch (flushing caches including the TLB, plus the switch to kernel mode and back — and since the Meltdown/Spectre mitigations, crossing into the kernel has become even more expensive).
Loom implements blocking IO on top of the more modern asynchronous OS calls, and a virtual thread context switch costs on the order of a function call, so both the per-switch overhead and the number of expensive OS-level switches are much, much lower.
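To make that concrete, here is a minimal sketch (JDK 21+; the class and method names are my own, only the `java.util.concurrent` API is real). It runs thousands of tasks that all block at once — with platform threads this would pin thousands of OS threads, but each `Thread.sleep` here merely parks the virtual thread and frees its carrier, so the switch stays inside the JVM:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadDemo {
    // Runs `tasks` blocking tasks, one virtual thread each, and
    // returns how many completed.
    static int runBlockingTasks(int tasks) {
        AtomicInteger done = new AtomicInteger();
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                exec.submit(() -> {
                    try {
                        // Parks only the virtual thread; the carrier
                        // OS thread is released to run other tasks.
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    done.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runBlockingTasks(10_000));
    }
}
```

Note that the 10 000 concurrent sleeps complete in roughly the time of one sleep, not 10 000 of them, because no task ever monopolizes an OS thread while blocked.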