• Operating Systems: Three Easy Pieces --- Locks: Test and Set (Note)


    Because disabling interrupts does not work on multiple processors, system designers started to

    invent hardware support for locking. The earliest multiprocessor systems, such as the Burroughs

    B5000 in the early 1960s, had such support; today all systems provide this type of support, even

    for single-CPU systems.

    The simplest bit of hardware support to understand is what is known as a test-and-set instruction,

    also known as atomic exchange. To understand how test-and-set works, let's first try to build a 

    simple lock without it. In this failed attempt, we use a simple flag variable to denote whether the

    lock is held or not.

    In this first attempt, the idea is quite simple: use a simple variable to indicate whether some

    thread has possession of a lock. The first thread that enters the critical section will call lock(), 

    which tests whether the flag is equal to 1 (in this case, it is not), and then sets the flag to 1 to

    indicate that the thread now holds the lock. When finished with the critical section, the thread

    calls unlock() and clears the flag, thus indicating that the lock is no longer held.

    typedef struct __lock_t { int flag; } lock_t;

    void init(lock_t* mutex) {
        // 0 -> lock is available, 1 -> lock is held
        mutex->flag = 0;
    }

    void lock(lock_t* mutex) {
        while (mutex->flag == 1)  // TEST the flag
            ;                     // spin-wait (do nothing)
        mutex->flag = 1;          // now SET it!
    }

    void unlock(lock_t* mutex) {
        mutex->flag = 0;
    }

    If another thread happens to call lock() while that first thread is in the critical section, it will

    simply spin-wait in the while loop for that thread to call unlock() and clear the flag. Once that first

    thread does so, the waiting thread will fall out of the while loop, set the flag to 1 for itself, and

    proceed into the critical section.

    Unfortunately, the code has two problems: one of correctness, and another of performance. The

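    correctness problem is simple to see once you get used to thinking about concurrent programming.

    Imagine this interleaving; assume flag = 0 to begin. Thread 1 calls lock(), sees that flag is 0,

    and is interrupted before it sets the flag. Thread 2 then calls lock(), also sees 0, and sets flag

    to 1. When Thread 1 resumes, it sets flag to 1 as well.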

    As you can see from this interleaving, with timely (untimely?) interrupts, we can easily produce

    a case where both threads set the flag to 1 and both threads are thus able to enter the critical

    section. This behavior is what professionals call "bad" - we have obviously failed to provide the

    most basic requirement: providing mutual exclusion.
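
    To make the failure concrete, the interleaving can be replayed by hand in ordinary
    single-threaded C. This is only an illustrative sketch: the function name
    replay_bad_interleaving and the in_critical_section counter are made up for this note,
    and the "context switches" are simply the order in which the statements are written.

```c
typedef struct __lock_t { int flag; } lock_t;

/* Replays the problematic schedule step by step and returns how many
 * "threads" end up inside the critical section at the same time.
 * (Hypothetical helper for illustration; no real threads are created.) */
int replay_bad_interleaving(void) {
    lock_t mutex = { 0 };            /* flag = 0 to begin */
    int in_critical_section = 0;

    /* Thread 1: evaluates the test in while (flag == 1); it is false */
    int t1_saw_held = (mutex.flag == 1);
    /* interrupt: switch to Thread 2 before Thread 1 sets the flag */

    /* Thread 2: evaluates the same test, also sees 0, then sets the flag */
    int t2_saw_held = (mutex.flag == 1);
    if (!t2_saw_held) {
        mutex.flag = 1;
        in_critical_section++;
    }
    /* interrupt: switch back to Thread 1 */

    /* Thread 1: resumes after its now-stale test and sets the flag too */
    if (!t1_saw_held) {
        mutex.flag = 1;
        in_critical_section++;
    }
    return in_critical_section;
}
```

    Calling replay_bad_interleaving() returns 2: the test and the set are separate steps, so
    both threads pass the test before either one has set the flag.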

    The performance problem, which we will address more later on, is the way a thread

    waits to acquire a lock that is already held: it endlessly checks the value of flag, a technique 

    known as spin-waiting. Spin-waiting wastes time waiting for another thread to release a lock. The

    waste is exceptionally high on a uniprocessor, where the thread that the waiter is waiting for 

    cannot even run (at least, until a context switch occurs)! Thus, as we move forward and develop

    more sophisticated solutions, we should also consider ways to avoid this kind of waste.

                        Building A Working Spin Lock

    While the idea behind the example above is a good one, it is not possible to implement without

    some support from the hardware. Fortunately, some systems provide an instruction to support

    the creation of simple locks based on this concept. This more powerful instruction has different

    names -- on SPARC, it is the load/store unsigned byte instruction (ldstub), whereas on x86, it is the

    atomic exchange instruction (xchg) -- but basically does the same thing across platforms, and is

    generally referred to as test-and-set. We define what the test-and-set instruction does with the

    following C code snippet:

    int TestAndSet(int* old_ptr, int new) {
        int old = *old_ptr;  // fetch old value at old_ptr
        *old_ptr = new;      // store 'new' into old_ptr
        return old;          // return the old value
    }

    What the test-and-set instruction does is as follows. It returns the old value pointed to by old_ptr,

    and simultaneously updates said value to new. The key, of course, is that this sequence of 

    operations is performed atomically. The reason it is called test-and-set is that it enables you to 

    test the old value (which is what is returned) while simultaneously setting the memory location to

    a new value; as it turns out, this slightly more powerful instruction is enough to build a simple

    spin lock, as we now examine in figure 28.3. Or better yet: figure it out first yourself!
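
    In portable C, one way to get this atomic behavior without writing assembly is the C11
    atomic_exchange operation from <stdatomic.h>. The sketch below is an assumption about how
    TestAndSet might be mapped onto it, not the book's code; the atomic_int type and the names
    test_and_set and new_val are choices made for this note.

```c
#include <stdatomic.h>

/* Atomically stores new_val into *old_ptr and returns the value that was
 * there before. The read and the write happen as one indivisible step,
 * which the plain C version of TestAndSet cannot guarantee by itself. */
int test_and_set(atomic_int *old_ptr, int new_val) {
    return atomic_exchange(old_ptr, new_val);
}
```

    On a free flag (0), test_and_set(&flag, 1) returns 0 and leaves the flag at 1; on a held
    flag (1), it returns 1 and the flag stays 1.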

    Let's make sure we understand why this lock works. Imagine first the case where a thread calls

    lock() and no other thread currently holds the lock; thus, flag should be 0. When the thread calls

    TestAndSet(flag, 1), the routine will return the old value of flag, which is 0; thus, the calling 

    thread, which is testing the value of flag, will not get caught spinning in the while loop and will

    acquire the lock. The thread will also atomically set the value to 1, thus indicating that the lock

    is now held. When the thread is finished with its critical section, it calls unlock() to set the flag

    back to zero.

    typedef struct __lock_t {
        int flag;
    } lock_t;

    void init(lock_t* lock) {
        // 0 indicates that the lock is available, 1 that it is held
        lock->flag = 0;
    }

    void lock(lock_t* lock) {
        while (TestAndSet(&lock->flag, 1) == 1)
            ;  // spin-wait (do nothing)
    }

    void unlock(lock_t* lock) {
        lock->flag = 0;
    }

    The second case we can imagine arises when one thread already has the lock held (i.e., flag is 1).

    In this case, this thread will call lock() and then call TestAndSet(flag, 1) as well. This time,

    TestAndSet() will return the old value at flag, which is 1 (because the lock is held), while simultaneously

    setting it to 1 again. As long as the lock is held by another thread, TestAndSet() will repeatedly 

    return 1, and thus this thread will spin and spin until the lock is finally released. When the flag is

    finally set to 0 by some other thread, this thread will call TestAndSet() again, which will now 

    return 0 while atomically setting the value to 1 and thus acquire the lock and enter the critical

    section.

    By making both the test of the old lock value and set of the new value a single atomic operation,

    we ensure that only one thread acquires the lock. And that's how to build a working mutual

    exclusion primitive!
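
    As a rough, runnable illustration of the same idea, the sketch below builds the spin lock
    on C11's atomic_flag, whose atomic_flag_test_and_set operation is a test-and-set in the
    sense described above, and uses POSIX threads to have two workers increment a shared
    counter. The names tas_lock_t, spin_lock, spin_unlock, worker, and run_two_workers are
    hypothetical and chosen only for this note; error checking is omitted for brevity.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct { atomic_flag flag; } tas_lock_t;

static tas_lock_t counter_lock = { ATOMIC_FLAG_INIT };  /* starts clear: lock is free */
static long counter = 0;                                /* shared state protected by the lock */

void spin_lock(tas_lock_t *l) {
    /* Returns the previous state while setting the flag: true means
     * another thread already holds the lock, so keep spinning. */
    while (atomic_flag_test_and_set(&l->flag))
        ;  /* spin-wait (do nothing) */
}

void spin_unlock(tas_lock_t *l) {
    atomic_flag_clear(&l->flag);  /* flag back to clear: lock is free */
}

static void *worker(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        spin_lock(&counter_lock);
        counter++;                /* critical section */
        spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts two workers, waits for both, and returns the final counter. */
long run_two_workers(long increments_each) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, worker, &increments_each);
    pthread_create(&b, NULL, worker, &increments_each);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

    With the lock in place, run_two_workers(100000) returns 200000 because no increment is
    lost. On some toolchains the program must be built with -pthread.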

    You may also now understand why this type of lock is usually referred to as a spin lock. It is the

    simplest type of lock to build, and simply spins, using CPU cycles, until the lock becomes available.

    To work correctly on a single processor, it requires a preemptive scheduler (i.e., one that will

    interrupt a thread via a timer, in order to run a different thread, from time to time). Without

    preemption, spin locks don't make much sense on a single CPU, as a thread spinning on a CPU

    will never relinquish it.

                      Tip: Think About Concurrency As A Malicious Scheduler

    From this example, you might get a sense of the approach you need to take to understand 

    concurrent execution. What you should try to do is to pretend you are a malicious scheduler, one

    that interrupts threads at the most inopportune of times in order to foil their feeble attempts at

    building synchronization primitives. What a mean scheduler you are! Although the exact sequence

    of interrupts may be improbable, it is possible, and that is all we need to demonstrate that a

    particular approach does not work. It can be useful to think maliciously! (At least, sometimes.)

  • Original article: https://www.cnblogs.com/miaoyong/p/4991344.html