It all eventually makes its way down to the OS’s scheduler, which hands out timeslices to processes and threads.
sleep(n) says “I’m done with my timeslice, and please don’t give me another one for at least n milliseconds.” The OS doesn’t even try to schedule the sleeping thread until the requested time has passed.
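As a rough illustration of the "at least n milliseconds" part, here is a minimal Java sketch. The class and method names (SleepDemo, sleepAndMeasure) are made up for this example. It measures how long a sleep actually took; the result is usually a bit more than requested, because the thread still has to wait for the scheduler to pick it again.

```java
class SleepDemo {
    // Hypothetical helper: sleeps for the requested time and returns
    // the elapsed wall-clock time in milliseconds.
    static long sleepAndMeasure(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis);   // give up the CPU for at least `millis` ms
        } catch (InterruptedException e) {
            // An interrupt can end the sleep early; restore the flag for callers.
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = sleepAndMeasure(50);
        System.out.println("requested 50 ms, slept about " + elapsed + " ms");
    }
}
```

The exact overshoot depends on the OS timer resolution and on system load, so treat any number you see as specific to your machine.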
yield() says “I’m done with my timeslice, but I still have work to do.” The OS is free to immediately give the thread another timeslice, or to give some other thread or process the CPU the yielding thread just gave up.
.wait() says “I’m done with my timeslice. Don’t give me another timeslice until someone calls notify().” As with sleep(), the OS won’t even try to schedule your task unless someone calls notify() (or one of a few other wakeup scenarios occurs).
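To make the wait()/notify() handshake concrete, here is a small Java sketch. The names WaitNotifyDemo, awaitReady, signal and runDemo are invented for illustration. It assumes the usual pattern of waiting in a loop on a guarded flag, which covers the "other wakeup scenarios" such as spurious wakeups.

```java
class WaitNotifyDemo {
    private final Object lock = new Object();
    private boolean ready = false;   // condition guarded by `lock`

    // Blocks the calling thread until signal() has set the flag.
    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {     // re-check: a wakeup does not prove the condition holds
                lock.wait();     // releases the monitor and stops being scheduled
            }
        }
    }

    // Sets the flag and wakes every thread waiting on `lock`.
    void signal() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    // Starts a waiter, signals it, and reports whether the waiter finished.
    static boolean runDemo() {
        WaitNotifyDemo demo = new WaitNotifyDemo();
        Thread waiter = new Thread(() -> {
            try {
                demo.awaitReady();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        try {
            Thread.sleep(100);    // let the waiter block first (not needed for correctness)
            demo.signal();
            waiter.join(2000);    // wait up to 2 s for the waiter to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter finished: " + runDemo());
    }
}
```

Because the flag is checked under the lock, the demo works even if signal() happens to run before the waiter reaches wait().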
Threads also lose the remainder of their timeslice when they perform blocking IO and under a few other circumstances. If a thread works through the entire timeslice, the OS forcibly takes control roughly as if yield() had been called, so that other processes can run.
You rarely need yield(), but if you have a compute-heavy app with logical task boundaries, inserting a yield() might improve system responsiveness (at the expense of time: context switches, even just to the OS and back, aren’t free). Measure and test against goals you care about, as always.
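For what a yield() at a logical task boundary might look like, here is a hedged Java sketch. The class YieldDemo, the method sumOfSquares and the chunk size are all arbitrary choices for this example. Note that Thread.yield() is only a hint, and the scheduler is free to ignore it.

```java
class YieldDemo {
    // Sums the squares 1^2 + ... + n^2, offering up the CPU after every
    // `chunkSize` iterations (the "logical task boundary" in this sketch).
    static long sumOfSquares(int n, int chunkSize) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
            if (i % chunkSize == 0) {
                Thread.yield();   // hint: other runnable threads may go first
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000_000, 10_000));
    }
}
```

Whether this helps or hurts depends on the workload and the platform, so time it with and without the yield() before keeping it.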