Linux kernel hits a general protection fault if %ds is corrupt for 32-bit PVOPS.

When returning via iret, the Linux kernel assumes that the %ds segment is safe and uses it to reference various per-cpu fields. Unfortunately, the user can modify the LDT and provide a NULL one. Whenever an iret is executed, we end up in xen_iret, try to use the %ds segment, and cause a general protection fault.
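To make the role of the LDT concrete, here is a minimal C sketch of how a 16-bit x86 segment selector is split into its fields. The bit layout is the architectural one; the names `selector_fields`, `decode_selector` and `is_null_selector` are hypothetical helpers for illustration and are not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a 16-bit x86 segment selector (hypothetical helper type). */
struct selector_fields {
    unsigned index;    /* bits 15..3: descriptor index within the table */
    bool     uses_ldt; /* bit 2 (TI): 0 = GDT, 1 = LDT */
    unsigned rpl;      /* bits 1..0: requested privilege level */
};

/* Split a raw selector value into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index    = sel >> 3;
    f.uses_ldt = (sel & 0x4) != 0;
    f.rpl      = sel & 0x3;
    return f;
}

/* A null selector is index 0 in the GDT, i.e. any of the values 0..3. */
static bool is_null_selector(uint16_t sel)
{
    return (sel & 0xFFFC) == 0;
}
```

A selector with the TI bit set points into the LDT, which is the table user space can change. A null selector can be loaded into %ds without a fault, but any memory access through it raises a general protection fault.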

lack of check (invalid segment)

A guest user can modify the LDT and supply a NULL pointer; when iret runs, the guest Linux kernel accesses the NULL %ds, causing a general protection fault.



An unprivileged guest user in a 32-bit PV guest can use this to crash the guest with a panic.

The fix is to recognize that we can rely only on the registers that IRET restores. The two that are guaranteed are %cs and %ss, as they are always fixed GDT selectors. They are also inaccessible from user mode, so they cannot be altered. This is the approach taken in this patch: every memory access in xen_iret that previously defaulted to %ds now carries an explicit %ss segment override.
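The reasoning can be illustrated with a toy model in C. This is only a sketch under simplifying assumptions: `toy_cpu`, `toy_descriptor` and `toy_access_ok` are hypothetical names, the tables hold four entries each, and the model collapses the hardware's load-time and access-time checks into a single predicate. It is not how the kernel or the CPU implements segmentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_SIZE 4

/* Toy descriptor: only a "usable" flag matters in this model. */
struct toy_descriptor {
    bool present;
};

/* Toy CPU state: a kernel-owned GDT, a user-editable LDT, two selectors. */
struct toy_cpu {
    struct toy_descriptor gdt[TOY_TABLE_SIZE];
    struct toy_descriptor ldt[TOY_TABLE_SIZE];
    uint16_t ds; /* whatever user space managed to leave behind */
    uint16_t ss; /* fixed kernel GDT selector */
};

/* Returns true if a memory access through `sel` would succeed in the
 * model, false if it would raise a general protection fault. */
static bool toy_access_ok(const struct toy_cpu *cpu, uint16_t sel)
{
    size_t index = sel >> 3;
    bool uses_ldt = (sel & 0x4) != 0;

    if (!uses_ldt && index == 0)
        return false; /* null selector: any access faults */
    if (index >= TOY_TABLE_SIZE)
        return false; /* outside the toy table */

    const struct toy_descriptor *table = uses_ldt ? cpu->ldt : cpu->gdt;
    return table[index].present;
}
```

In this model, an access through `cpu->ds` fails whenever user space has left a null or emptied LDT selector there, while an access through `cpu->ss` depends only on the GDT, which user space cannot edit. That is the property the %ss override relies on.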

--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -89,11 +89,11 @@ ENTRY(xen_iret)
 #ifdef CONFIG_SMP
-   movl TI_cpu(%eax), %eax
-   movl __per_cpu_offset(,%eax,4), %eax
-   mov xen_vcpu(%eax), %eax
+   movl %ss:TI_cpu(%eax), %eax
+   movl %ss:__per_cpu_offset(,%eax,4), %eax
+   mov %ss:xen_vcpu(%eax), %eax
 #else
-   movl xen_vcpu, %eax
+   movl %ss:xen_vcpu, %eax
 #endif
    /* check IF state we're restoring */
@@ -106,11 +106,11 @@ ENTRY(xen_iret)
     * resuming the code, so we don't have to be worried about
     * being preempted to another CPU.
     */
-   setz XEN_vcpu_info_mask(%eax)
+   setz %ss:XEN_vcpu_info_mask(%eax)
    /* check for unmasked and pending */
-   cmpw $0x0001, XEN_vcpu_info_pending(%eax)
+   cmpw $0x0001, %ss:XEN_vcpu_info_pending(%eax)
    /*
     * If there's something pending, mask events again so we can
@@ -118,7 +118,7 @@ xen_iret_start_crit:
     * touch XEN_vcpu_info_mask.
     */
    jne 1f
-   movb $1, XEN_vcpu_info_mask(%eax)
+   movb $1, %ss:XEN_vcpu_info_mask(%eax)
 1: popl %eax
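For readers less used to the assembly, the three memory accesses patched above can be paraphrased in C. This is a rough sketch, not the kernel's code: `toy_vcpu_info` and `toy_iret_event_check` are hypothetical names, and the flag test that sets the condition for `setz` is not part of the excerpt, so the `interrupts_enabled` parameter is an assumption standing in for the IF bit of the flags being restored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the two adjacent bytes the patched code
 * touches: a pending flag followed directly by a mask flag. */
struct toy_vcpu_info {
    uint8_t pending; /* an event is waiting to be delivered */
    uint8_t mask;    /* 1 = event delivery is blocked */
};

/* Returns true when an event is pending and unmasked, in which case
 * events are masked again before returning. */
static bool toy_iret_event_check(struct toy_vcpu_info *v,
                                 bool interrupts_enabled)
{
    /* setz ...mask: block events exactly when IF is clear (assumed) */
    v->mask = interrupts_enabled ? 0 : 1;

    /* cmpw $0x0001: a single 16-bit compare tests
     * pending == 1 && mask == 0 at the same time */
    uint16_t word = (uint16_t)(v->pending | (v->mask << 8));
    if (word != 0x0001)
        return false; /* jne 1f: nothing to deliver */

    /* movb $1, ...mask: mask events again */
    v->mask = 1;
    return true;
}
```

The sketch builds the 16-bit word explicitly so it does not depend on host byte order; the assembly instead reads both bytes with one little-endian load. Each of these accesses goes through memory, which is why each one needed the %ss override.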


Malicious or buggy unprivileged user space can crash the guest kernel, cause it to operate erroneously, or escalate privileges within the guest.

guest DoS, privilege escalation