Here is a short piece on an important topic, but with some levity sprinkled into the article to make it more fun to read. I hope it is useful.
When we use a spin lock, we are asking a processor to simply spin (loop) trying and trying to gain a lock so it can proceed. We can insert a little time into the loop, but it is still a spin lock.
In simpler times (decades ago - I think there might have been dinosaurs around), this was a fine thing to do. Those days are long gone, so when I was recently asked what the best way to do a spin lock was these days my immediate reaction was to say, "do not use spin locks!"
But I quickly acknowledged that I should not be so hasty. A spin lock is appropriate if the expectation is for very few iterations or it is really important to try to fool the processor into not going into a sleep state. This is the sort of code we may see in an operating system kernel, and especially in a device driver, but not in application code.
What's So Bad About a Spin Lock?
Two key problems with a spin lock are:
- While spinning, a thread is considered busy even though it is actually not moving forward with useful work. As a result, the thread is failing to yield quickly to another thread that may actually do useful work. It's a case of "if everyone did it, no one would get any work done." It is not polite to be a hog and use long running spin locks.
- While spinning, a thread is executing instructions which keeps the processor running even if there is no useful work to do. Busy processors consume power. They probably wear out sooner too, but I do not have such proof handy. We can say that such behavior is not a very green thing to do. For any system using battery power, it is a real shame to run out of power while doing nothing useful.
For both cases, there are more polite locks that yield control so that useful threads may proceed to do real work. If there is nothing else to do, we can enter sleep states to help save power.
What Should We Use Instead of a Spin Lock?
Assuming a spin lock is not the best choice, use any lock that yields instead of spinning is preferred. Of course, we need to trust when we yield that we get to run again when we can get the lock. Even security conscientious nerds agree this is an okay example of "trust." Mutex and semaphores are commonly useful locks that are not spin locks. There is excellent reading material about locks available from many sources. I googled it and found endless tutorials - the sort that a CS student studies in their first year of CS studies such as CS 110 at Stanford.
Even without spin locks, using locks properly is important. First rule is to never hold a lock longer than we must. More importantly than this first rule, we should never hold a lock less than we should. Also, do not hold a lock an indefinite amount of time if we are trying to gain another lock (Dining philosophers problem). Finally, always obtain locks in a defined sequence (if you have locks A B C... always get lock A then B then C in any thread... do not be careless and grab them in different orders). This final advice is not exactly the Dining philosophers problem but it is close. While technically not required that we behave that way, it is very good advice. Many system 'hangs' can be traced back to bad lock hygiene... especially grabbing nested locks in haphazard order.
Enough with the "Do Not": What Should We Do?
Sometimes using spin locks are a good idea and doing them properly is important. Here is how to do our best.
There is a bit of "spin lock etiquette":
- Use a spin lock only when we expect to spin very few times (do not create a long busy wait loop in your code that prevents real work and/or power saving)
- When we are granted a spin lock, get work done very quickly (do not make others wait)
For advice on the best code sequences, we should always turn to the Intel® 64 and IA-32 Architectures Optimization Reference Manual. This is a rich source of information. Here are tips on what to read (and not read) about spin locks:
- Section 11.4.3 covers spin locks in general, and Section 11.4.6 has the advice to not have a spin lock span a cache line.
- Section 2.1.3, it says for Hybrid Architectures with p-cores and e-cores, "Software should replace active spin-waits with lightweight waits ideally using the new UMWAIT/TPAUSE and older PAUSE instructions which will allow for better hints to the scheduler on time spinning." Architectures have evolved with more and more controls to help with locks. Recent processors have Intel Thread Director built-in to give the OS more controls - such as done by Linux 5.18 and Windows 11.
- Section 2.5.4 it has a discussion of using pause and exponential back-off. It also talks about a general trend of increase latency in some detail, including the warning: As the PAUSE latency has been increased significantly, workloads that are sensitive to PAUSE latency will
suffer some performance loss. Thankfully, the guide is about optimization and had advice on that should be done for better performance from now into the future (hence the prior mention of Section 2.1.3). - Ignore Chapter 16 (including advice about spin locks). You can code them, but your time will be wasted because transactional memory extensions are almost always turned off, or otherwise not supported, due to security issues and concerns.
Summary
Locks are an important topic. I once made the mistake of saying "use them less" as a way to get better performance. Unfortunately, since using them too little is far worse it turns out my advice could encourage bad behavior. Of course, my advice now is "use them as little as possible, but no less."
Since locking is such an important topic, it has received no shortage of attention by architects. And, as architecture evolve so does the advice on the best code sequences.
Therefore, the Intel® 64 and IA-32 Architectures Optimization Reference Manual is an invaluable resource when looking for such answers.
"