LUCOS: Live Updating Contemporary Operating Systems
Patches and upgrades are a part of everyday life for a contemporary operating system. They are frequently applied to plug security holes, add new features, and enhance performance. Unfortunately, this process usually requires stopping and restarting a running operating system, which can be a major source of lost availability. For some long-running, mission-critical systems, any such disruption is expensive and intolerable: they have to keep all tasks running all the time or risk dire consequences. Live update capability has therefore become increasingly important, because it can minimize both planned and unplanned downtime and thereby reduce the loss of availability.
Most modern operating systems are large and complex. To live update such operating systems safely, several requirements have been identified. First, updatable units in an operating system need to be easily defined. For an operating system built with an object-oriented approach, such as K42, an object is a natural updatable unit. Second, a quiescent state, or safe point, needs to be detected or enforced before a dynamic patch can be applied. Otherwise, the operating system may end up in an inconsistent state. This necessitates an efficient way to track the state of the operating system, for example, a reference counter that tracks the number of threads executing in an updatable unit. Finally, an effective approach is required to redirect invocations from the original unit to the newly updated unit after a dynamic patch is applied.
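The second and third requirements can be illustrated with a minimal user-level sketch. It is not how K42 or any kernel implements them: the class `UpdatableUnit` and its methods are hypothetical names introduced here, and Python threads stand in for kernel threads. A reference counter tracks how many threads are inside the unit, an update is accepted only when that counter is zero, and later calls are redirected through a single indirection slot.

```python
import threading


class UpdatableUnit:
    """A function reached through one level of indirection, with a reference count.

    Hypothetical illustration only; names are not taken from any real system.
    """

    def __init__(self, impl):
        self._impl = impl          # the indirection slot: current implementation
        self._active = 0           # number of threads currently executing in the unit
        self._lock = threading.Lock()

    def call(self, *args):
        with self._lock:
            self._active += 1      # a thread enters the unit
            impl = self._impl
        try:
            return impl(*args)
        finally:
            with self._lock:
                self._active -= 1  # the thread leaves the unit

    def try_update(self, new_impl):
        """Swap in new_impl only if the unit is quiescent; return whether it happened."""
        with self._lock:
            if self._active != 0:
                return False       # not a safe point: some thread is still inside
            self._impl = new_impl  # later calls are redirected to the new code
            return True
```

A caller that gets `False` back has to retry later, which is exactly the cost that becomes a problem for units that are almost never idle.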
However, most existing operating systems are not designed with a live update capability in mind. First, they are usually implemented using non-object-oriented approaches. Function calls are therefore often made directly rather than through an indirection table, which makes them difficult to redirect. Moreover, these systems often lack well-defined boundaries among their components, preventing component-level live updates. Second, they usually lack a mechanism that supports safe-point detection (e.g., reference counting), which makes detecting a quiescent state either very time consuming or simply impractical. Furthermore, hot spots in an operating system very rarely enter a quiescent state in which live updates can be safely applied. Examples include the network modules in a web server and a root file system module: a network module is always busy receiving and sending packets, and a root file system module cannot be unmounted while the operating system is still running. Under such circumstances, emergency patches and updates have to be postponed indefinitely, exposing the whole system to possible attacks or corruption. Finally, even if such a safe state can be reached and detected, the update process executes inside the operating system, so it may trigger execution of the very code being patched and result in a deadlock or an inconsistent state. For example, a live update to an interrupt handler may trigger the interrupt and bring the operating system into an undefined state.
Being aware of the above problems, we propose using virtualization as a way to support live updates on existing operating systems. We argue that system virtualization, recently a popular technique for many applications, provides the operating system with a seamless capability to support live updates, thus reducing downtime and improving availability.
As shown in Figure 1, when the operating system runs on a high-performance virtual machine, it is convenient and natural for the virtual machine monitor (VMM) to modify the state of the operating system without having to stop and reboot it. We apply live updates at the function level rather than at the component level because it is often impossible to unambiguously partition the whole system into disjoint components. Given that a quiescent state may not even exist for some functions, we eliminate this requirement and instead allow live updates at any time. If a live update changes data, we keep different versions of the data. It is the responsibility of the VMM to invoke the state transfer function that maintains the coherence of the different versions.
To support live updates to a running operating system, LUCOS follows several design principles:
OS-Transparency and OS-Neutrality: To avoid disrupting services on a running operating system, no change to the operating system should necessitate a reboot. Fortunately, most existing operating systems provide some means to extend their functionality on the fly (e.g., Linux loadable kernel modules), eliminating the need for reboots. LUCOS takes advantage of this capability by implementing both the update manager in the target operating system and the dynamic patches as loadable kernel modules, so no modification to the operating system kernel is required. Further, the support for live updates in the VMM is OS-neutral, allowing good portability and easy inclusion in a general-purpose VMM.
Flexibility: LUCOS allows live updating an operating system at the granularity of functions. It also permits updates to both code and data structures, including dynamically adding and removing a single instance or multiple instances of a data structure. Furthermore, a quiescent state is no longer required. Updates can be performed at any time, even when the code to be updated is still active.
Safety and Maintainability: Any update to the operating system should be transactional to avoid corrupting the whole system. If an error occurs during the update process, the system should be able to roll back any change already made to it. LUCOS also allows any previously committed update to be rolled back.
Correctness: For simplicity, LUCOS neither verifies nor validates the input patch files, but assumes that they are trustworthy and correct. The construction of a patch program is decoupled from the generation of the corresponding patch files, leaving the verification of the program to developers and testers. However, LUCOS allows problematic patches to be rolled back and the same update unit to be patched more than once.
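The idea of keeping several versions of updated data coherent through a state transfer function can be pictured with a small toy model. It is a sketch under stated assumptions, not the LUCOS mechanism: how the VMM detects a write is not described here, so writes are modeled as explicit method calls, and `VersionedData`, the `timeout_ms`/`timeout_us` fields, and the two conversion functions are all hypothetical.

```python
class VersionedData:
    """Keeps an old-layout and a new-layout copy of one record coherent.

    Hypothetical illustration: each write to one version immediately runs a
    state transfer function that regenerates the other version.
    """

    def __init__(self, old_record, old_to_new, new_to_old):
        self._old_to_new = old_to_new
        self._new_to_old = new_to_old
        self.old = dict(old_record)       # layout seen by unpatched code
        self.new = old_to_new(self.old)   # layout seen by patched code

    def write_old(self, field, value):
        # A write by old code: update the old copy, then transfer state forward.
        self.old[field] = value
        self.new = self._old_to_new(self.old)

    def write_new(self, field, value):
        # A write by new code: update the new copy, then transfer state back.
        self.new[field] = value
        self.old = self._new_to_old(self.new)


# Example patch (made up for illustration): the field changes from
# milliseconds to microseconds.
def ms_to_us(old):
    return {"timeout_us": old["timeout_ms"] * 1000}


def us_to_ms(new):
    return {"timeout_ms": new["timeout_us"] // 1000}
```

Because both copies stay valid at all times, code that is still running the old version and code that already runs the new version can coexist, which is what removes the need to wait for a quiescent state.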
To demonstrate the applicability of LUCOS, we selected several real-life kernel patches between Linux 2.6.10 and 2.6.11, applied them on the fly, and rolled them back. The following table shows the time taken to apply and to roll back each patch.
|Patch||Type||Functions||Apply Time (ns)||Rollback Time (ns)|
|Fixing the page reading bug||1||2||21,426||19,663|
|Removal of livelock avoidance||1||1||14,916||13,113|
|Upgrading the process scheduler||1||3||23,715||22,041|
|Reconstruction of the IRQ descriptors||2||13||215,921||217,479|
|Upgrading backend block device drivers (Xen-Linux)||2||1||42,900||36,330|
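Applying and rolling back a patch that touches several functions, as in the table above, is meant to be transactional. The following sketch shows one way to picture that all-or-nothing behavior. It is an illustration only: `FunctionTable`, `PatchError`, and the function names in the usage note are hypothetical, and for brevity the sketch only replaces functions that already exist, whereas LUCOS also supports adding and removing data structures.

```python
class PatchError(Exception):
    """Raised when a patch cannot be applied (hypothetical error type)."""


class FunctionTable:
    """Maps function names to implementations and applies patches atomically."""

    def __init__(self, functions):
        self.functions = dict(functions)
        self._history = []  # one undo record per committed patch, newest last

    def apply(self, patch):
        """Install every function in patch, or none of them."""
        undo = {}
        try:
            for name, new_fn in patch.items():
                if name not in self.functions:
                    raise PatchError(f"unknown function: {name}")
                undo[name] = self.functions[name]  # remember the old version
                self.functions[name] = new_fn
        except PatchError:
            self.functions.update(undo)  # abort: restore what was already replaced
            raise
        self._history.append(undo)       # commit: keep the undo record

    def rollback(self):
        """Undo the most recently committed patch."""
        undo = self._history.pop()
        self.functions.update(undo)
```

For example, `table.apply({"read_page": new_read_page})` followed by `table.rollback()` leaves the table exactly as it started, and a patch that names a missing function raises `PatchError` without leaving any of its other functions installed.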
To measure the overall performance overhead of LUCOS, we compare Xen-Linux running under LUCOS against native Linux and against the original Xen-Linux, a variant of Linux ported to run on the Xen VMM. Figure 3 shows that our implementation incurs negligible performance overhead: less than 1% performance degradation compared to the original Xen-Linux. The time to apply a patch is also minimal.
We propose using virtualization to live update a running operating system on demand, without requiring a quiescent state. The prototype we have implemented, named LUCOS, is able to live update Linux without disrupting its services and with minimal overhead during normal execution. We demonstrate this approach by applying several real-life Linux kernel patches on the fly. Performance measurements show that our implementation incurs negligible performance overhead compared to the original Xen-Linux.
- Live Updating Operating Systems Using Virtualization. Haibo Chen, Rong Chen, Fengzhe Zhang, Binyu Zang and Pen-chung Yew. Proceedings of the 2nd International Conference on Virtual Execution Environments (VEE '06), pp. 35-44, Ottawa, Ontario, Canada, June 2006.
We would especially like to acknowledge Hai Du, Pengcheng Liu, Jie Yu and Tong Sun for their hard work on building the prototype system and running the experiments. This work was funded by the China National 973 Plan under grant number 2005CB321905 and by an Intel University Research Grant.