MIME-Version: 1.0
In-Reply-To: <1482885582.106950.29.camel@ranerica-desktop>
References: <20161224013745.108716-1-ricardo.neri-calderon@linux.intel.com>
 <20161224013745.108716-6-ricardo.neri-calderon@linux.intel.com>
 <CALCETrWkL_RumwxcTKaQC=kBKZE4P5u95=OpcPGEjCJfaN9wVA@mail.gmail.com> <1482885582.106950.29.camel@ranerica-desktop>
From: Andy Lutomirski <luto@amacapital.net>
Date: Tue, 27 Dec 2016 16:48:33 -0800
Message-ID: <CALCETrUZsCGSyd3sf1O7-Sb90w7hewRWunwQXU1x2kZog_aFcA@mail.gmail.com>
Subject: Re: [v2 5/7] x86: Add emulation code for UMIP instructions
To: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>, Thomas Gleixner <tglx@linutronix.de>,
        Borislav Petkov <bp@suse.de>, Andy Lutomirski <luto@kernel.org>,
        Peter Zijlstra <peterz@infradead.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        X86 ML <x86@kernel.org>, linux-msdos@vger.kernel.org,
        wine-devel@winehq.org, Andrew Morton <akpm@linux-foundation.org>,
        "H . Peter Anvin" <hpa@zytor.com>, Brian Gerst <brgerst@gmail.com>,
        Chen Yucong <slaoub@gmail.com>, Chris Metcalf <cmetcalf@mellanox.com>,
        Dave Hansen <dave.hansen@linux.intel.com>,
        Fenghua Yu <fenghua.yu@intel.com>, Huang Rui <ray.huang@amd.com>,
        Jiri Slaby <jslaby@suse.cz>, Jonathan Corbet <corbet@lwn.net>,
        "Michael S . Tsirkin" <mst@redhat.com>,
        Paul Gortmaker <paul.gortmaker@windriver.com>,
        "Ravi V . Shankar" <ravi.v.shankar@intel.com>,
        Shuah Khan <shuah@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
        Tony Luck <tony.luck@intel.com>, Paolo Bonzini <pbonzini@redhat.com>,
        "Liang Z . Li" <liang.z.li@intel.com>,
        Alexandre Julliard <julliard@winehq.org>, Stas Sergeev <stsp@list.ru>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 6433
Lines: 145

On Tue, Dec 27, 2016 at 4:39 PM, Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
> On Fri, 2016-12-23 at 18:11 -0800, Andy Lutomirski wrote:
>> On Fri, Dec 23, 2016 at 5:37 PM, Ricardo Neri
>> <ricardo.neri-calderon@linux.intel.com> wrote:
>> > The feature User-Mode Instruction Prevention present in recent Intel
>> > processor prevents a group of instructions from being executed with
>> > CPL > 0. Otherwise, a general protection fault is issued.
>> >
>> > Rather than relaying this fault to the user space (in the form of a SIGSEGV
>> > signal), the instructions protected by UMIP can be emulated to provide
>> > dummy results. This allows to conserve the current kernel behavior and not
>> > reveal the system resources that UMIP intends to protect (the global
>> > descriptor and interrupt descriptor tables, the segment selectors of the
>> > local descriptor table and the task state and the machine status word).
>> >
>> > This emulation is needed because certain applications (e.g., WineHQ) rely
>> > on this subset of instructions to function.
>> >
>> > The instructions protected by UMIP can be split in two groups. Those who
>> > return a kernel memory address (sgdt and sidt) and those who return a
>> > value (sldt, str and smsw).
>> >
>> > For the instructions that return a kernel memory address, the result is
>> > emulated as the location of a dummy variable in the kernel memory space.
>> > This is needed as applications such as WineHQ rely on the result being
>> > located in the kernel memory space function. The limit for the GDT and the
>> > IDT are set to zero.
>>
>> Nak.  This is a trivial KASLR bypass.  Just give them hardcoded
>> values.  For x86_64, I would suggest 0xfffffffffffe0000 and
>> 0xffffffffffff0000.
>
> I see. I assume you are suggesting these values for x86_64 because they
> lie in an unused hole. That makes sense to me.
>
> For the case of x86_32, I have trouble finding a suitable place as there
> are not many available holes. It could be put before VMALLOC_START or
> after VMALLOC_END but this would reveal the position of the vmalloc
> area. Although, to my knowledge, randomized memory is not available for
> x86_32. Without randomization, does it hurt to make sidt/sgdt return the
> address of a kernel static variable?

I would just use the same addresses, truncated.  There's no reason
that the address needs to be truly not present -- it just needs to be
inaccessible to user code.  Anything near the top of the address space
should work.

>
>>
>> >
>> > The instructions sldt and str return a segment selector relative to the
>> > base address of the global descriptor table. Since the actual address of
>> > such table is not revealed, it makes sense to emulate the result as zero.
>>
>> Hmm, now I wonder if anything uses SLDT to see if there is an LDT.  If
>> so, we could emulate it better, but I doubt this matters.
>
> So you are saying that the emulated sldt should return a different value
> based on the presence/absence of a LDT? This could reveal this very
> fact.

User code knows whether the LDT exists because an LDT only exists if
the program called modify_ldt().  But I doubt this matters in
practice.

>> > +static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst,
>> > +                              unsigned char *data, int *data_size)
>> > +{
>> > +       unsigned long const *dummy_base_addr;
>> > +       unsigned short dummy_limit = 0;
>> > +       unsigned short dummy_value = 0;
>> > +
>> > +       switch (umip_inst) {
>> > +       /*
>> > +        * These two instructions return the base address and limit of the
>> > +        * global and interrupt descriptor table. The base address can be
>> > +        * 32-bit or 64-bit. Limit is always 16-bit.
>> > +        */
>> > +       case UMIP_SGDT:
>> > +       case UMIP_SIDT:
>> > +               if (umip_inst == UMIP_SGDT)
>> > +                       dummy_base_addr = &umip_dummy_gdt_base;
>> > +               else
>> > +                       dummy_base_addr = &umip_dummy_idt_base;
>> > +               if (X86_MODRM_MOD(insn->modrm.value) == 3) {
>> > +                       WARN_ONCE(1, "SGDT cannot take register as argument!\n");
>>
>> No warnings please.
>
> I'll. Remove it.

Thanks.  In general, WARN_ONCE, etc are supposed to indicate kernel
bugs, not user bugs.

>> > +       int not_copied, nr_copied, reg_offset, dummy_data_size;
>> > +       void __user *uaddr;
>> > +       unsigned long *reg_addr;
>> > +       enum umip_insn umip_inst;
>> > +
>> > +       not_copied = copy_from_user(buf, (void __user *)regs->ip, sizeof(buf));
>>
>> This is slightly wrong due to PKRU.  I doubt we care.
>
> I see. If I am not mistaken, if the memory is protected by a protection
> key this would cause a page fault. I'll make a note of it.

Exactly.  This is correct behavior unless the key happens to be set up
so it can be executed but not read, in which case emulation will fail.

>>
>> > +       nr_copied = sizeof(buf) - not_copied;
>> > +       /*
>> > +        * The decoder _should_ fail nicely if we pass it a short buffer.
>> > +        * But, let's not depend on that implementation detail.  If we
>> > +        * did not get anything, just error out now.
>> > +        */
>> > +       if (!nr_copied)
>> > +               return -EFAULT;
>>
>> If the caller cares about EINVAL vs EFAULT, it cares because it is
>> considering changing the signal to a fake page fault.  If so, then
>> this should be EINVAL -- failure to read the text should just prevent
>> emulation.
>
> I see. The caller in this case do_general_protection, which will issue a
> SIGSEGV to the user space anyways. I don't think it cares about the
> EINVAL vs EFAULT. It does care about whether the emulation was
> successful.

Maybe just make it return bool then?  But fixing up the return codes
would be fine, too.  I just think that, if it returns int, the value
should be meaningful.

>> > +               if (nr_copied  > 0)
>> > +                       return -EFAULT;
>>
>> This should be the only EFAULT case.
> Should this be EFAULT event if the caller cares only about successful
> (return 0) vs failed (return non-0) emulation?

In theory this particular error would be a page fault not a general
protection fault (in the UMIP off case).  If you were emulating it
extra carefully, you could change the signal accordingly.  But, as I
said, I really doubt this matters.