Threads and Atomicity
By schveiguy
Introduction
For those of you who have not used threads before, they are flows of execution within a single process that share the same memory and name space. Each has its own stack and is run preemptively by the system scheduler. However, they all access the same variables with the same names, which makes communication between threads much simpler than communication between processes. In addition, creating a thread is much less taxing on the OS than creating a process, since no new memory space is needed.
Atomicity
It is very important to understand that normally any operation can be non-atomic. Even machine instructions can be non-atomic if multiple CPUs are running the threads. To demonstrate, let's view the following simple C/C++ code:

    int var = 0;

    void thread_main()
    {
        var++;
    }

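If you execute one instance of thread_main, then var will have a value of 1 after the thread is done. What happens if you execute multiple instances of thread_main? To see, we will dissect the statement in question. The compiler typically breaks the statement "var++;" into three machine instructions:

1. Load the value of var from memory into a register.
2. Increment the value in the register.
3. Store the value in the register back to var in memory.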
Let's say there are two threads executing, and after one undergoes the first step, it is preempted by the second thread. The second thread performs all three steps, so now the value of var is 1. When the first thread wakes back up, it still has a value of 0 in its register, which it increments and stores back to var. The result is that var is 1, not 2, after both threads have executed. This corrupted access of the var variable is a good example of a race condition.

Of course, the threads may both execute steps 1, 2, and 3 before being preempted, which will result in var having a value of 2. This is what makes thread debugging so difficult: you may only have a one in a million shot at seeing erroneous behavior.

The best way to avoid this is to synchronize the threads with a mutex lock. By synchronizing the threads, you are forcing them to cooperate and let only one of them execute certain instructions at a time. A mutex lock works by putting a thread to sleep until the lock can be acquired; once a thread holds the lock, no other thread can obtain it until the holder releases it. Not coincidentally, mutex locks are usually implemented using atomic machine instructions. With mutex locks, the code above now reads (in pthread style):

    int var = 0;
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    void thread_main()
    {
        pthread_mutex_lock(&mutex);
        var++;
        pthread_mutex_unlock(&mutex);
    }

This code will always result in var having a value equal to the number of threads run. By using the locks, the "var++;" statement now looks atomic to the two threads. I say "looks atomic" because there is nothing stopping some other function from changing the value of var in another thread without taking the lock. In order to maintain atomicity for instructions that operate on a variable, all places that change or read the variable need to lock the mutex before accessing it.
Always Use Mutexes
As a general rule, whenever you access variables that are shared between threads, including just reading them, you should use a mutex lock. There are some exceptions, generally when the operation in question is atomic. For example, in our above code, we could have a thread that reads the value of var without changing it. On a typical machine, this code would not require a mutex lock because the store from the register to var (step 3) IS atomic, so the reader sees either the old value or the new one. However, this shortcut is not portable. For example, on an 8-bit system, if var is a 16-bit value, the store to memory is NOT atomic (2 instructions are required). Very bizarre errors could occur if var is only partially written when it is read. In short, the answer is always to lock. Locking cannot hurt; it can only help.
Further Resources
For (POSIX) pthreads, and other UNIX-supported thread libraries:
For Win32 threads:
For Java threads:
It is a good idea to look through at least one threading book instead of just jumping into the APIs. Most of the APIs do not tell you about how to code with threads correctly, they just tell you the function calls. Coding with threads can bring great benefits to your application, but it can also bring great problems. Thread responsibly!