Message-ID: <1482885582.106950.29.camel@ranerica-desktop>
Subject: Re: [v2 5/7] x86: Add emulation code for UMIP instructions
From: Ricardo Neri
To: Andy Lutomirski
Cc: Ingo Molnar, Thomas Gleixner, Borislav Petkov, Andy Lutomirski,
    Peter Zijlstra, "linux-kernel@vger.kernel.org", X86 ML,
    linux-msdos@vger.kernel.org, wine-devel@winehq.org, Andrew Morton,
    "H . Peter Anvin", Brian Gerst, Chen Yucong, Chris Metcalf,
    Dave Hansen, Fenghua Yu, Huang Rui, Jiri Slaby, Jonathan Corbet,
    "Michael S . Tsirkin", Paul Gortmaker, "Ravi V . Shankar", Shuah Khan,
    Vlastimil Babka, Tony Luck, Paolo Bonzini, "Liang Z . Li",
    Alexandre Julliard, Stas Sergeev
Date: Tue, 27 Dec 2016 16:39:42 -0800
References: <20161224013745.108716-1-ricardo.neri-calderon@linux.intel.com>
    <20161224013745.108716-6-ricardo.neri-calderon@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2016-12-23 at 18:11 -0800, Andy Lutomirski wrote:
> On Fri, Dec 23, 2016 at 5:37 PM, Ricardo Neri wrote:
> > The feature User-Mode Instruction Prevention present in recent Intel
> > processor prevents a group of instructions from being executed with
> > CPL > 0. Otherwise, a general protection fault is issued.
> >
> > Rather than relaying this fault to the user space (in the form of a SIGSEGV
> > signal), the instructions protected by UMIP can be emulated to provide
> > dummy results. This allows to conserve the current kernel behavior and not
> > reveal the system resources that UMIP intends to protect (the global
> > descriptor and interrupt descriptor tables, the segment selectors of the
> > local descriptor table and the task state and the machine status word).
> >
> > This emulation is needed because certain applications (e.g., WineHQ) rely
> > on this subset of instructions to function.
> >
> > The instructions protected by UMIP can be split in two groups. Those who
> > return a kernel memory address (sgdt and sidt) and those who return a
> > value (sldt, str and smsw).
> >
> > For the instructions that return a kernel memory address, the result is
> > emulated as the location of a dummy variable in the kernel memory space.
> > This is needed as applications such as WineHQ rely on the result being
> > located in the kernel memory space function. The limit for the GDT and the
> > IDT are set to zero.
>
> Nak. This is a trivial KASLR bypass. Just give them hardcoded
> values. For x86_64, I would suggest 0xfffffffffffe0000 and
> 0xffffffffffff0000.

I see. I assume you are suggesting these values for x86_64 because they
lie in an unused hole. That makes sense to me. For the case of x86_32, I
have trouble finding a suitable place as there are not many available
holes. It could be put before VMALLOC_START or after VMALLOC_END but
this would reveal the position of the vmalloc area. Although, to my
knowledge, randomized memory is not available for x86_32. Without
randomization, does it hurt to make sidt/sgdt return the address of a
kernel static variable?

>
> >
> > The instructions sldt and str return a segment selector relative to the
> > base address of the global descriptor table.
> > Since the actual address of such table is not revealed, it makes sense
> > to emulate the result as zero.
>
> Hmm, now I wonder if anything uses SLDT to see if there is an LDT. If
> so, we could emulate it better, but I doubt this matters.

So you are saying that the emulated sldt should return a different value
based on the presence/absence of an LDT? This could reveal this very
fact.

>
> >
> > The instruction smsw is emulated to return zero.
>
> If you're going to emulate it, please return something plausible. The
> protected mode bit should be on, for example. 0x33 is probably
> reasonable.

Sure. Will do.

>
> > +static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst,
> > +			       unsigned char *data, int *data_size)
> > +{
> > +	unsigned long const *dummy_base_addr;
> > +	unsigned short dummy_limit = 0;
> > +	unsigned short dummy_value = 0;
> > +
> > +	switch (umip_inst) {
> > +	/*
> > +	 * These two instructions return the base address and limit of the
> > +	 * global and interrupt descriptor table. The base address can be
> > +	 * 32-bit or 64-bit. Limit is always 16-bit.
> > +	 */
> > +	case UMIP_SGDT:
> > +	case UMIP_SIDT:
> > +		if (umip_inst == UMIP_SGDT)
> > +			dummy_base_addr = &umip_dummy_gdt_base;
> > +		else
> > +			dummy_base_addr = &umip_dummy_idt_base;
> > +		if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > +			WARN_ONCE(1, "SGDT cannot take register as argument!\n");
>
> No warnings please.

I'll remove it.

>
> > +int fixup_umip_exception(struct pt_regs *regs)
> > +{
> > +	struct insn insn;
> > +	unsigned char buf[MAX_INSN_SIZE];
> > +	/* 10 bytes is the maximum size of the result of UMIP instructions */
> > +	unsigned char dummy_data[10] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
> > +	int x86_64 = !test_thread_flag(TIF_IA32);
>
> user_64bit_mode(regs)

I'll make this change.
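
The values suggested in the review so far (fixed base addresses for sgdt/sidt on x86_64, a zero limit, zero for sldt/str, and 0x33 for smsw) can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the patch itself: the function name `umip_dummy_result` and the `DUMMY_*` macros are hypothetical, only the x86_64 case is covered because no 32-bit value was settled in this thread, and the byte layout assumes a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the values are the ones suggested in the review. */
#define DUMMY_GDT_BASE 0xfffffffffffe0000ULL
#define DUMMY_IDT_BASE 0xffffffffffff0000ULL
#define DUMMY_MSW      0x33 /* bit 0 (protected mode enable) is set */
#define DUMMY_RESULT_MAX 10 /* 2-byte limit + 8-byte base */

enum umip_insn { UMIP_SGDT, UMIP_SIDT, UMIP_SLDT, UMIP_SMSW, UMIP_STR };

/*
 * Fill buf (at least DUMMY_RESULT_MAX bytes) with the emulated result for
 * a 64-bit task and return the number of valid bytes, or 0 if the
 * instruction is not recognized.
 */
size_t umip_dummy_result(enum umip_insn insn, unsigned char *buf)
{
	uint16_t limit = 0;
	uint16_t value = 0;
	uint64_t base;

	assert(buf != NULL);
	memset(buf, 0, DUMMY_RESULT_MAX);

	switch (insn) {
	case UMIP_SGDT:
	case UMIP_SIDT:
		/* Result layout: 16-bit limit first, then the 64-bit base. */
		base = (insn == UMIP_SGDT) ? DUMMY_GDT_BASE : DUMMY_IDT_BASE;
		memcpy(buf, &limit, sizeof(limit));
		memcpy(buf + sizeof(limit), &base, sizeof(base));
		return sizeof(limit) + sizeof(base);
	case UMIP_SMSW:
		value = DUMMY_MSW;
		memcpy(buf, &value, sizeof(value));
		return sizeof(value);
	case UMIP_SLDT:
	case UMIP_STR:
		/* Selector emulated as zero. */
		memcpy(buf, &value, sizeof(value));
		return sizeof(value);
	}
	return 0;
}
```

Because the bases are compile-time constants, the result is the same on every boot and reveals nothing about where the kernel was actually loaded.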
>
> > +	int not_copied, nr_copied, reg_offset, dummy_data_size;
> > +	void __user *uaddr;
> > +	unsigned long *reg_addr;
> > +	enum umip_insn umip_inst;
> > +
> > +	not_copied = copy_from_user(buf, (void __user *)regs->ip, sizeof(buf));
>
> This is slightly wrong due to PKRU. I doubt we care.

I see. If I am not mistaken, if the memory is protected by a protection
key this would cause a page fault. I'll make a note of it.

>
> > +	nr_copied = sizeof(buf) - not_copied;
> > +	/*
> > +	 * The decoder _should_ fail nicely if we pass it a short buffer.
> > +	 * But, let's not depend on that implementation detail. If we
> > +	 * did not get anything, just error out now.
> > +	 */
> > +	if (!nr_copied)
> > +		return -EFAULT;
>
> If the caller cares about EINVAL vs EFAULT, it cares because it is
> considering changing the signal to a fake page fault. If so, then
> this should be EINVAL -- failure to read the text should just prevent
> emulation.

I see. The caller in this case is do_general_protection, which will
issue a SIGSEGV to user space anyway. I don't think it cares about
EINVAL vs EFAULT. It does care about whether the emulation was
successful.

>
> > +	insn_init(&insn, buf, nr_copied, x86_64);
> > +	insn_get_length(&insn);
> > +	if (nr_copied < insn.length)
> > +		return -EFAULT;
>
> Ditto.

I will change to EINVAL.
>
> > +
> > +	umip_inst = __identify_insn(&insn);
> > +	/* Check if we found an instruction protected by UMIP */
> > +	if (umip_inst < 0)
> > +		return -EINVAL;
> > +
> > +	if (__emulate_umip_insn(&insn, umip_inst, dummy_data, &dummy_data_size))
> > +		return -EINVAL;
> > +
> > +	/* If operand is a register, write directly to it */
> > +	if (X86_MODRM_MOD(insn.modrm.value) == 3) {
> > +		reg_offset = get_reg_offset_rm(&insn, regs);
> > +		reg_addr = (unsigned long *)((unsigned long)regs + reg_offset);
> > +		memcpy(reg_addr, dummy_data, dummy_data_size);
> > +	} else {
> > +		uaddr = insn_get_addr_ref(&insn, regs);
> > +		nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
> > +		if (nr_copied > 0)
> > +			return -EFAULT;
>
> This should be the only EFAULT case.

Should this be EFAULT even if the caller cares only about successful
(return 0) vs failed (return non-0) emulation?

Thanks for your thorough review! I really appreciate it.

Thanks and BR,
Ricardo
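
The error-code convention requested in the review (EINVAL whenever emulation simply cannot proceed, EFAULT only when writing the result to user memory fails) can be summarized with a small user-space model. This is a sketch under assumptions, not kernel code: `struct fixup_steps` and `fixup_result` are hypothetical names that stand in for the outcome of each step in fixup_umip_exception.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical summary of what each emulation step observed. */
struct fixup_steps {
	size_t nr_copied;    /* bytes of instruction text read from user space */
	size_t insn_length;  /* instruction length reported by the decoder */
	int is_umip_insn;    /* non-zero if a UMIP-protected instruction was found */
	size_t not_written;  /* bytes the final copy to user memory failed to write */
};

/* Map the step outcomes to the return value discussed in the review. */
int fixup_result(const struct fixup_steps *s)
{
	/* Failure to read the instruction text just prevents emulation. */
	if (s->nr_copied == 0)
		return -EINVAL;
	/* A truncated instruction cannot be emulated either. */
	if (s->nr_copied < s->insn_length)
		return -EINVAL;
	/* Not one of the instructions protected by UMIP. */
	if (!s->is_umip_insn)
		return -EINVAL;
	/* The only EFAULT case: the result could not be written back. */
	if (s->not_written > 0)
		return -EFAULT;
	return 0;
}
```

A caller that only distinguishes success from failure can test for a non-zero return, while one that wants to turn a failed write into a fake page fault can check for -EFAULT specifically.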