Hello folks,
Today I am going to talk about the synchronization mechanisms available in the Linux kernel. This post may help you choose the right synchronization mechanism for a uni-processor or SMP system. Choosing the wrong mechanism can crash the kernel or even damage a hardware component.
Before we begin, let's closely examine three terms that will be used frequently in this post.
1. Critical Region: A critical region (or critical section) is a piece of code which must be executed under mutual exclusion.
Suppose two threads are updating the same variable in the parent process's address space. The code area where both threads access/update the shared variable/resource is called a critical region. It is essential to protect the critical region to avoid collisions (corruption) in the code/system.
2. Race Condition: The developer has to protect the critical region so that at any instant only one thread/process is executing inside it (accessing the shared resources). If the critical region is not protected with a proper mechanism, there is a chance of a race condition.
A race condition is a flaw that occurs when the timing or ordering of events affects a program's correctness. By using an appropriate synchronization mechanism and properly protecting the critical region we can avoid, or at least reduce, the chance of this flaw.
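As an illustration, here is a minimal user-space sketch (plain C with pthreads, not the kernel APIs discussed below; the names are made up) where two threads increment a shared counter with no protection. Because counter++ is a read-modify-write sequence, the threads can interleave inside the critical region and the final value often comes out lower than expected.
#include <pthread.h>
#include <stdio.h>

static long counter = 0;            /* shared resource */

static void *worker(void *arg)
{
    for (int i = 0; i < 1000000; i++)
        counter++;                  /* critical region: unprotected read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Expected 2000000, but often less: a race condition. */
    printf("counter = %ld\n", counter);
    return 0;
}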
3. Deadlock: This is another flaw which can result from NOT using a proper synchronization mechanism. It is a situation in which two threads/processes sharing the same resource are effectively preventing each other from accessing it, resulting in both ceasing to function.
So the question that comes to mind is: "what is a synchronization mechanism?"
A synchronization mechanism is a set of APIs and objects which can be used to protect critical regions and avoid deadlocks/race conditions.
The Linux kernel provides a couple of synchronization mechanisms.
1. Atomic operation: This is the simplest approach to avoid a race condition.
Atomic operations, like add and subtract, execute as a single uninterruptible operation. The atomic integer methods operate on a special data type, atomic_t. A common use of the atomic integer operations is to implement counters which are updated by multiple threads.
The kernel provides two sets of interfaces for atomic operations: one that operates on integers and another that operates on individual bits.
All atomic functions are inline functions.
1. APIs for operations on integers:
atomic_t i;          /* define atomic variable i */
atomic_set(&i, 10);  /* atomically assign a value (here 10) to i */
atomic_inc(&i);      /* atomically increment i (i++), so i = 11 */
atomic_dec(&i);      /* atomically decrement i (i--), so i = 10 */
atomic_add(4, &i);   /* atomically add 4 to i (i = 10 + 4) */
atomic_read(&i);     /* atomically read and return the value of i (14) */
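As mentioned above, a typical use is a counter updated from several contexts. Below is a minimal kernel-module sketch of that idea; the module and counter names (event_count, record_event) are invented for illustration.
#include <linux/module.h>
#include <linux/atomic.h>

/* Hypothetical event counter updated from several contexts. */
static atomic_t event_count = ATOMIC_INIT(0);

static void record_event(void)
{
    atomic_inc(&event_count);           /* safe without any extra locking */
}

static int __init atomic_demo_init(void)
{
    record_event();
    pr_info("events so far: %d\n", atomic_read(&event_count));
    return 0;
}

static void __exit atomic_demo_exit(void)
{
    pr_info("final count: %d\n", atomic_read(&event_count));
}

module_init(atomic_demo_init);
module_exit(atomic_demo_exit);
MODULE_LICENSE("GPL");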
2. APIs for operations on individual bits:
Bit-wise APIs operate on a generic memory address (an unsigned long), so there is no need to explicitly define an object of type atomic_t.
unsigned long i = 0;   /* define a normal variable (i = 0x00000000) */
set_bit(5, &i);        /* atomically set bit 5 of i (i = 0x00000020) */
clear_bit(5, &i);      /* atomically clear bit 5 of i (i = 0x00000000) */
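A common pattern is a flags word describing driver state, where test_and_set_bit() atomically tests the old value and sets the bit in one step. This is only a hedged fragment sketch; the flag and function names (DEV_FLAG_BUSY, try_start_io) are illustrative, not a real driver API.
#include <linux/bitops.h>
#include <linux/errno.h>
#include <linux/types.h>

#define DEV_FLAG_BUSY   0               /* illustrative bit numbers */
#define DEV_FLAG_ERROR  1

static unsigned long dev_flags;         /* bit-ops work on an unsigned long */

static int try_start_io(void)
{
    /* Atomically read the old value and set the bit in one step. */
    if (test_and_set_bit(DEV_FLAG_BUSY, &dev_flags))
        return -EBUSY;                  /* someone else already started I/O */
    return 0;
}

static void finish_io(void)
{
    clear_bit(DEV_FLAG_BUSY, &dev_flags);
}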
2. Semaphore: This is another kind of synchronization mechanism provided by the Linux kernel. When a process tries to acquire a semaphore which is not available, the semaphore puts the task on a wait queue (FIFO) and puts it to sleep. That is why a semaphore is known as a sleeping lock. The processor is then free to jump to some other task which does not require this semaphore. As soon as the semaphore becomes available, one of the tasks from the wait queue is woken up.
There are two flavors of semaphore:
- Basic semaphore
- Reader-Writer semaphore
A Reader-Writer semaphore is used when multiple threads/processes want to share data and read operations are far more frequent than writes. Multiple threads can read the data at the same time; the data is locked exclusively (all reader threads have to wait) only while one thread writes/updates it. On the other side, a writer has to wait until all readers release the read lock. When the writer releases the lock, the readers on the wait queue (FIFO) are woken up.
A couple of observations about the nature of semaphores (a minimal usage sketch follows this list):
- A semaphore puts a task to sleep, so it can only be used in process context. Interrupt context cannot sleep.
- Putting a task to sleep is a relatively expensive operation (overhead) for the CPU, so a semaphore is suitable for locks that are held for a long time. Sleeping and waking tasks wastes CPU if the semaphore is locked and unlocked for short durations by multiple tasks.
- Code holding a semaphore can be preempted; a semaphore does not disable kernel preemption.
- A semaphore should not be acquired after disabling interrupts, because the task would sleep if it fails to acquire the semaphore, and with interrupts disabled the current task cannot be scheduled out.
- The semaphore wait list is FIFO in nature, so the task which tried to acquire the semaphore first is woken up from the wait list first.
- A semaphore can be acquired/released from any process/thread.
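Here is a minimal sketch of both flavors, assuming process-context callers; the variable and function names (cfg_sem, stats_rwsem, update_cfg, and so on) are invented for illustration.
#include <linux/semaphore.h>
#include <linux/rwsem.h>
#include <linux/errno.h>

static struct semaphore cfg_sem;            /* counting semaphore, used as a mutex */
static DECLARE_RWSEM(stats_rwsem);          /* reader-writer semaphore */

static int cfg_value;
static int stats_value;

static void locks_setup(void)
{
    sema_init(&cfg_sem, 1);                 /* count of 1: one holder at a time */
}

static int update_cfg(int v)
{
    /* May sleep, so this must only be called from process context. */
    if (down_interruptible(&cfg_sem))
        return -EINTR;                      /* woken by a signal before acquiring */
    cfg_value = v;
    up(&cfg_sem);
    return 0;
}

static int read_stats(void)
{
    int v;

    down_read(&stats_rwsem);                /* many readers can hold this at once */
    v = stats_value;
    up_read(&stats_rwsem);
    return v;
}

static void write_stats(int v)
{
    down_write(&stats_rwsem);               /* exclusive: waits for all readers */
    stats_value = v;
    up_write(&stats_rwsem);
}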
3. Spin-lock: This is a special type of synchronization mechanism which is preferable on multi-processor (SMP) systems. Basically it is a busy-wait locking mechanism: until the lock becomes available, the thread spins in a tight loop, repeatedly checking the lock's availability. A spin-lock is not recommended on a single-processor system. If process_1 has acquired a lock and process_2 tries to acquire it, process_2 will spin around and keep the processor core busy until it acquires the lock. On a single core, process_2 effectively creates a deadlock: it does not allow any other process to execute because the CPU core is kept busy in a tight loop by the spinning process.
A couple of observations about the nature of spinlocks:
- Spinlocks are well suited to interrupt (atomic) context because they do not put the process/thread to sleep.
- In a uni-processor environment, acquiring a spin lock disables preemption first, and releasing the spin lock re-enables preemption. This avoids deadlock on a uni-processor system. E.g.: on a uni-processor system, thread_1 has acquired a spinlock; kernel preemption then takes place, thread_1 is scheduled out and thread_2 comes onto the CPU. Thread_2 tries to acquire the same spin-lock, which is not available, so thread_2 keeps the CPU busy in a tight loop. This does not allow any other thread to execute on the CPU and creates a deadlock.
- Spin-locks are not recursive.
- Special care must be taken when a spin-lock is shared between an interrupt handler and a thread: local interrupts must be disabled on the same CPU (core) before acquiring the spin-lock. If the interrupt occurs on a different processor and spins on the same lock, it does not cause a deadlock, because the processor that acquired the lock is still able to release it from the other core. E.g.: suppose an interrupt handler interrupts kernel code on the same CPU while the lock is held by a thread. The interrupt handler spins, waiting for the lock to become available, but the locking thread does not run until the interrupt handler completes. This causes a deadlock.
- When data is shared between two tasklets, there is no need to disable interrupts, because a running tasklet is never preempted by another tasklet on the same processor. You can find more details about the nature of tasklets here.
Like semaphores, spin-locks come in two flavors (a minimal usage sketch follows):
- Basic spin-lock
- Reader-Writer spin-lock
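The sketch below assumes the case described above, where data is shared between a thread and an interrupt handler; the names (stats_lock, irq_count, demo_irq_handler) are invented for illustration.
#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(stats_lock);
static unsigned long irq_count;

/* Process-context path: the data is also touched by the interrupt handler,
 * so disable local interrupts while holding the lock. */
static unsigned long read_irq_count(void)
{
    unsigned long flags, val;

    spin_lock_irqsave(&stats_lock, flags);
    val = irq_count;
    spin_unlock_irqrestore(&stats_lock, flags);
    return val;
}

/* Interrupt handler: already running with this interrupt line handled,
 * so a plain spin_lock() is sufficient here. */
static irqreturn_t demo_irq_handler(int irq, void *dev_id)
{
    spin_lock(&stats_lock);
    irq_count++;
    spin_unlock(&stats_lock);
    return IRQ_HANDLED;
}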
4. Sequence Lock: This is a very useful synchronization mechanism that provides a lightweight and scalable lock for scenarios with many readers and only a few writers. A sequence lock maintains a sequence counter. When the shared data is written, a lock is obtained and the sequence counter is incremented by 1: acquiring the write lock makes the sequence counter odd, and releasing it makes it even. On the read side, the sequence counter is read before and after reading the data. If the two values are the same, a write did not begin in the middle of the read; in addition, if the values are even, no write operation is in progress. A sequence lock gives writers higher priority than readers: an acquisition of the write lock always succeeds as long as no other writer is present. Pending writers continually cause the read loop to repeat until there are no longer any writers holding the lock, so a reader may sometimes be forced to read the same data several times until it gets a valid copy (the writer releases the lock). On the other side, a writer never waits unless another writer is active.
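A minimal sketch of this read/write pattern follows; the names (time_lock, jiffies_snapshot) are illustrative, not a real kernel interface.
#include <linux/seqlock.h>
#include <linux/types.h>

static DEFINE_SEQLOCK(time_lock);
static u64 jiffies_snapshot;

/* Writer: rare, never waits for readers. */
static void update_snapshot(u64 now)
{
    write_seqlock(&time_lock);                /* takes the lock, counter becomes odd */
    jiffies_snapshot = now;
    write_sequnlock(&time_lock);              /* releases, counter becomes even again */
}

/* Reader: retries if a write raced with the read. */
static u64 read_snapshot(void)
{
    unsigned int seq;
    u64 val;

    do {
        seq = read_seqbegin(&time_lock);      /* sample the sequence counter */
        val = jiffies_snapshot;
    } while (read_seqretry(&time_lock, seq)); /* repeat if the counter changed */

    return val;
}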
So, every synchronization mechanism has its own pros and cons. A kernel developer has to smartly choose the appropriate synchronization mechanism based on those pros and cons.