2015-07-27 21:38:43

by Willy Tarreau

Subject: Re: [PATCH 4/3] x86/ldt: allow to disable modify_ldt at runtime

On Mon, Jul 27, 2015 at 12:04:54PM -0700, Kees Cook wrote:
> On Sat, Jul 25, 2015 at 6:03 AM, Willy Tarreau <[email protected]> wrote:
> > On Sat, Jul 25, 2015 at 09:50:52AM +0200, Willy Tarreau wrote:
> >> On Fri, Jul 24, 2015 at 11:44:52PM -0700, Andy Lutomirski wrote:
> >> > I'm all for it, but I think it should be hard-disablable in config,
> >> > too, for the -tiny people.
> >>
> >> I totally agree.
> >>
> >> > If we add a runtime disable, let's do a
> >> > separate patch, and you and Kees can fight over how general it should
> >> > be.
> >>
> >> Initially I was thinking about changing it for a 3-state option but
> >> that would prevent X86_16BIT from being hard-disablable, so I'll do
> >> something completely separate.
> >
> > So here comes the proposed patch. It adds a default setting for the
> > sysctl when the option is not hard-disabled (eg: distros not wanting
> > to take risks with legacy apps). It suggests to leave the option off.
> > In case a syscall is blocked, a printk_ratelimited() is called with
> > relevant info (program name, pid, uid) so that the admin can decide
> > whether it's a legitimate call or not. Eg:
> >
> > Denied a call to modify_ldt() from a.out[1736] (uid: 100). Adjust sysctl if this was not an exploit attempt.
> >
> > I personally think it completes well your series, hence the 4/3 numbering.
> > Feel free to adopt it if you cycle another round and if you're OK with it
> > of course.
> >
> > CCing Kees as well.
>
> This patch looks reasonable, but I'd prefer a tri-state (enable,
> disable, hard-disable).

That was my first goal initially until I realized that the current two
options make it possible to also get rid of X86_16BIT as Andy did. I
don't see how to do this with the 3-state mode.

> I do something like this for Yama's ptrace
> zero to max_scope range (which "pins" to max_scope if set):
>
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/security/yama/yama_lsm.c#n361

I agree with this and initially I intended to do something approximately
like this when I realized that for this specific case it didn't match the
pattern. In fact here we have the opportunity to completely remove support
for LDT changes, not just the modify_ldt() syscall. Then it makes sense to
have the two options here.
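
To make the difference between the two approaches concrete, here is a
minimal user-space sketch, not the actual patch. It assumes a build-time
switch that removes the code entirely plus a boolean runtime switch, and
contrasts that with a Yama-style value that pins once it reaches its
maximum. Every name (SKETCH_MODIFY_LDT_BUILT_IN, sketch_modify_ldt_enabled,
sketch_scope_write) and both error codes are hypothetical and chosen only
for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the build-time option: 0 removes LDT support. */
#ifndef SKETCH_MODIFY_LDT_BUILT_IN
#define SKETCH_MODIFY_LDT_BUILT_IN 1
#endif

/* Hypothetical runtime switch: 0 = deny the syscall, 1 = allow it. */
int sketch_modify_ldt_enabled = 0;

/* Returns 0 if the call may proceed, a negative errno otherwise. */
int sketch_modify_ldt_check(void)
{
#if !SKETCH_MODIFY_LDT_BUILT_IN
	/* Hard-disabled at build time: nothing can re-enable it. */
	return -ENOSYS;
#else
	if (!sketch_modify_ldt_enabled)
		return -EPERM;	/* assumed error code, for illustration only */
	return 0;
#endif
}

/* Yama-style tri-state for comparison: the value pins at its maximum. */
#define SKETCH_MAX_SCOPE 2
int sketch_scope = 0;

int sketch_scope_write(int new_value)
{
	if (new_value < 0 || new_value > SKETCH_MAX_SCOPE)
		return -EINVAL;
	/* Once pinned, only the maximum itself is still accepted. */
	if (sketch_scope == SKETCH_MAX_SCOPE && new_value != SKETCH_MAX_SCOPE)
		return -EINVAL;
	sketch_scope = new_value;
	return 0;
}
```

In the pinned variant the code behind the switch still has to be built in,
which is why it cannot replace a config option that removes LDT support
altogether.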

Regards,
Willy

2015-07-28 02:21:13

by Andy Lutomirski

Subject: Re: [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option

On Mon, Jul 27, 2015 at 9:18 AM, Boris Ostrovsky
<[email protected]> wrote:
> On 07/27/2015 11:53 AM, Andy Lutomirski wrote:
>>
>> On Mon, Jul 27, 2015 at 8:36 AM, Boris Ostrovsky
>> <[email protected]> wrote:
>>>
>>> On 07/25/2015 01:36 AM, Andy Lutomirski wrote:
>>>>
>>>> Here's v3. It fixes the "dazed and confused" issue, I hope. It's also
>>>> probably a good general attack surface reduction, and it replaces some
>>>> scary code with IMO less scary code.
>>>>
>>>> Also, servers and embedded systems should probably turn off modify_ldt.
>>>> This makes that possible.
>>>>
>>>> Xen people, can you take a look at this?
>>>>
>>>> Willy and Kees: I left the config option alone. The -tiny people will
>>>> like it, and we can always add a sysctl of some sort later.
>>>>
>>>> Changes from v3:
>>>> - Hopefully fixed Xen.
>>>
>>>
>>> 32b-on-32b fails in the same manner. (but non-zero LDT is taken care of)
>>>
>>>> - Fixed 32-bit test case on 32-bit native kernel.
>>>
>>>
>>> I am not sure I see what changed.
>>
>> I misplaced the fix in the wrong git commit, so I failed to sent it.
>> Oops.
>>
>> I just sent v4.1 of patch 3. Can you try that?
>
>
>
> I am hitting BUG() in Xen code (returning from a hypercall) when freeing LDT
> in destroy_context(). Interestingly though when I run the test in the
> debugger I get SIGILL (just like before) but no BUG().
>
> Let me get back to you on that later today.
>
>

After forward-porting my virtio patches, I got this thing to run on
Xen. After several tries, I got:

[ 53.985707] ------------[ cut here ]------------
[ 53.986314] kernel BUG at arch/x86/xen/enlighten.c:496!
[ 53.986677] invalid opcode: 0000 [#1] SMP
[ 53.986677] Modules linked in:
[ 53.986677] CPU: 0 PID: 1400 Comm: bash Not tainted 4.2.0-rc4+ #4
[ 53.986677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 53.986677] task: c2376180 ti: c0874000 task.ti: c0874000
[ 53.986677] EIP: 0061:[<c10530f2>] EFLAGS: 00010282 CPU: 0
[ 53.986677] EIP is at set_aliased_prot+0xb2/0xc0
[ 53.986677] EAX: ffffffea EBX: cc3d1000 ECX: 0672e063 EDX: 80000000
[ 53.986677] ESI: 00000000 EDI: 80000000 EBP: c0875e94 ESP: c0875e74
[ 53.986677] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[ 53.986677] CR0: 80050033 CR2: b77404d4 CR3: 020b6000 CR4: 00042660
[ 53.986677] Stack:
[ 53.986677] 80000000 0672e063 000021c0 cc3d1000 00000001 cc3d2000 00000b4a 00000200
[ 53.986677] c0875ea8 c105312d c2317940 c2373a80 00000000 c0875eb4 c1062310 c01861c0
[ 53.986677] c0875ec0 c1062735 c01861c0 c0875ed4 c10a764e c7007a00 c2373a80 00000000
[ 53.986677] Call Trace:
[ 53.986677] [<c105312d>] xen_free_ldt+0x2d/0x40
[ 53.986677] [<c1062310>] free_ldt_struct.part.1+0x10/0x40
[ 53.986677] [<c1062735>] destroy_context+0x25/0x40
[ 53.986677] [<c10a764e>] __mmdrop+0x1e/0xc0
[ 53.986677] [<c10c9858>] finish_task_switch+0xd8/0x1a0
[ 53.986677] [<c1863736>] __schedule+0x316/0x950
[ 53.986677] [<c1863d96>] schedule+0x26/0x70
[ 53.986677] [<c10ac613>] do_wait+0x1b3/0x200
[ 53.986677] [<c10ac9d7>] SyS_waitpid+0x67/0xd0
[ 53.986677] [<c10aa820>] ? task_stopped_code+0x50/0x50
[ 53.986677] [<c186717a>] syscall_call+0x7/0x7
[ 53.986677] Code: e8 c1 e3 0c 81 eb 00 00 00 40 39 5d ec 74 11 8b 4d e4 8b 55 e0 31 f6 e8 dd e0 fa ff 85 c0 75 0d 83 c4 14 5b 5e 5f 5d c3 90 0f 0b <0f> 0b 0f 0b 8d 76 00 8d bc 27 00 00 00 00 85 d2 74 31 55 89 e5
[ 53.986677] EIP: [<c10530f2>] set_aliased_prot+0xb2/0xc0 SS:ESP 0069:c0875e74
[ 54.010069] ---[ end trace 89ac35b29c1c59bb ]---

Is that the error you're seeing?

If I change xen_free_ldt to:

static void xen_free_ldt(struct desc_struct *ldt, unsigned entries)
{
	const unsigned entries_per_page = PAGE_SIZE / LDT_ENTRY_SIZE;
	int i;

	vm_unmap_aliases();
	xen_mc_flush();

	for (i = 0; i < entries; i += entries_per_page)
		set_aliased_prot(ldt + i, PAGE_KERNEL);
}

then it works. I don't know why this makes a difference.
(xen_mc_flush makes a little bit of sense to me. vm_unmap_aliases
doesn't.)
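
For reference, the loop above visits the LDT one page at a time. A small
user-space sketch of that stride arithmetic follows. It assumes 4 KiB pages
and 8-byte descriptors, and the SKETCH_* names and sketch_pages_touched()
are made up for the illustration; it only counts iterations and does not
model the hypercall.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed 4 KiB pages */
#define SKETCH_LDT_ENTRY_SIZE	8u	/* one x86 descriptor is 8 bytes */

/*
 * Count how many times a per-page loop like the one in xen_free_ldt()
 * would call set_aliased_prot() for an LDT with the given entry count.
 */
unsigned sketch_pages_touched(unsigned entries)
{
	const unsigned entries_per_page =
		SKETCH_PAGE_SIZE / SKETCH_LDT_ENTRY_SIZE;
	unsigned pages = 0;
	unsigned i;

	assert(entries_per_page == 512);

	/* Step one page worth of descriptors per iteration. */
	for (i = 0; i < entries; i += entries_per_page)
		pages++;

	return pages;
}
```

Under those assumptions a full 8192-entry LDT spans 16 pages, so each of the
16 iterations is a separate chance for the protection change to fail.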

It's *possible* that there's some bug in my code that causes a CPU to
retain a reference to a stale LDT, but I don't see it. Hmm. Is it
possible that, when a process exits, we kill the mm without
synchronously unlazying it everywhere else? Seems a bit hard to
imagine to me -- I don't see why this wouldn't blow up when the pgt
went away.

My best guess is that there's a silly race in which one CPU frees an
LDT before the other CPU flushes its hypercalls. But I don't really
believe this, because I got this trace:

[ 14.257546] Free LDT cb912000: CPU0 cb923000 CPU1 cb923000
[OK] All 5 iterations succeeded
root@(none):/# [ 15.824723] Free LDT cb923000: CPU0 (null) CPU1 (null)
[ 15.827404] ------------[ cut here ]------------
[ 15.828349] kernel BUG at arch/x86/xen/enlighten.c:497!
[ 15.828349] invalid opcode: 0000 [#1] SMP

with this patch applied:

@@ -537,7 +542,9 @@ static void xen_set_ldt(const void *addr, unsigned entries)

 	MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);

-	xen_mc_issue(PARAVIRT_LAZY_CPU);
+	xen_mc_flush();
+
+	this_cpu_write(cpu_ldt, addr);
 }

so both CPUs on my VM have definitely zeroed their LDTs before the
failing hypercall.

Hmm. Looking at the hypervisor code, I don't see why setting the LDT
to NULL is handled correctly. Am I missing something?

--Andy

2015-07-28 03:16:46

by Andy Lutomirski

Subject: Re: [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option

On Mon, Jul 27, 2015 at 7:20 PM, Andy Lutomirski <[email protected]> wrote:
> On Mon, Jul 27, 2015 at 9:18 AM, Boris Ostrovsky
> <[email protected]> wrote:
>> On 07/27/2015 11:53 AM, Andy Lutomirski wrote:
>>>
>>> On Mon, Jul 27, 2015 at 8:36 AM, Boris Ostrovsky
>>> <[email protected]> wrote:
>>>>
>>>> On 07/25/2015 01:36 AM, Andy Lutomirski wrote:
>>>>>
>>>>> Here's v3. It fixes the "dazed and confused" issue, I hope. It's also
>>>>> probably a good general attack surface reduction, and it replaces some
>>>>> scary code with IMO less scary code.
>>>>>
>>>>> Also, servers and embedded systems should probably turn off modify_ldt.
>>>>> This makes that possible.
>>>>>
>>>>> Xen people, can you take a look at this?
>>>>>
>>>>> Willy and Kees: I left the config option alone. The -tiny people will
>>>>> like it, and we can always add a sysctl of some sort later.
>>>>>
>>>>> Changes from v3:
>>>>> - Hopefully fixed Xen.
>>>>
>>>>
>>>> 32b-on-32b fails in the same manner. (but non-zero LDT is taken care of)
>>>>
>>>>> - Fixed 32-bit test case on 32-bit native kernel.
>>>>
>>>>
>>>> I am not sure I see what changed.
>>>
>>> I misplaced the fix in the wrong git commit, so I failed to sent it.
>>> Oops.
>>>
>>> I just sent v4.1 of patch 3. Can you try that?
>>
>>
>>
>> I am hitting BUG() in Xen code (returning from a hypercall) when freeing LDT
>> in destroy_context(). Interestingly though when I run the test in the
>> debugger I get SIGILL (just like before) but no BUG().
>>
>> Let me get back to you on that later today.
>>
>>
>
> After forward-porting my virtio patches, I got this thing to run on
> Xen. After several tries, I got:
>
> [ 53.985707] ------------[ cut here ]------------
> [ 53.986314] kernel BUG at arch/x86/xen/enlighten.c:496!
> [ 53.986677] invalid opcode: 0000 [#1] SMP
> [ 53.986677] Modules linked in:
> [ 53.986677] CPU: 0 PID: 1400 Comm: bash Not tainted 4.2.0-rc4+ #4
> [ 53.986677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org
> 04/01/2014
> [ 53.986677] task: c2376180 ti: c0874000 task.ti: c0874000
> [ 53.986677] EIP: 0061:[<c10530f2>] EFLAGS: 00010282 CPU: 0
> [ 53.986677] EIP is at set_aliased_prot+0xb2/0xc0
> [ 53.986677] EAX: ffffffea EBX: cc3d1000 ECX: 0672e063 EDX: 80000000
> [ 53.986677] ESI: 00000000 EDI: 80000000 EBP: c0875e94 ESP: c0875e74
> [ 53.986677] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [ 53.986677] CR0: 80050033 CR2: b77404d4 CR3: 020b6000 CR4: 00042660
> [ 53.986677] Stack:
> [ 53.986677] 80000000 0672e063 000021c0 cc3d1000 00000001 cc3d2000
> 00000b4a 00000200
> [ 53.986677] c0875ea8 c105312d c2317940 c2373a80 00000000 c0875eb4
> c1062310 c01861c0
> [ 53.986677] c0875ec0 c1062735 c01861c0 c0875ed4 c10a764e c7007a00
> c2373a80 00000000
> [ 53.986677] Call Trace:
> [ 53.986677] [<c105312d>] xen_free_ldt+0x2d/0x40
> [ 53.986677] [<c1062310>] free_ldt_struct.part.1+0x10/0x40
> [ 53.986677] [<c1062735>] destroy_context+0x25/0x40
> [ 53.986677] [<c10a764e>] __mmdrop+0x1e/0xc0
> [ 53.986677] [<c10c9858>] finish_task_switch+0xd8/0x1a0
> [ 53.986677] [<c1863736>] __schedule+0x316/0x950
> [ 53.986677] [<c1863d96>] schedule+0x26/0x70
> [ 53.986677] [<c10ac613>] do_wait+0x1b3/0x200
> [ 53.986677] [<c10ac9d7>] SyS_waitpid+0x67/0xd0
> [ 53.986677] [<c10aa820>] ? task_stopped_code+0x50/0x50
> [ 53.986677] [<c186717a>] syscall_call+0x7/0x7
> [ 53.986677] Code: e8 c1 e3 0c 81 eb 00 00 00 40 39 5d ec 74 11 8b
> 4d e4 8b 55 e0 31 f6 e8 dd e0 fa ff 85 c0 75 0d 83 c4 14 5b 5e 5f 5d
> c3 90 0f 0b <0f> 0b 0f 0b 8d 76 00 8d bc 27 00 00 00 00 85 d2 74 31 55
> 89 e5
> [ 53.986677] EIP: [<c10530f2>] set_aliased_prot+0xb2/0xc0 SS:ESP 0069:c0875e74
> [ 54.010069] ---[ end trace 89ac35b29c1c59bb ]---
>
> Is that the error you're seeing?
>
> If I change xen_free_ldt to:
>
> static void xen_free_ldt(struct desc_struct *ldt, unsigned entries)
> {
> const unsigned entries_per_page = PAGE_SIZE / LDT_ENTRY_SIZE;
> int i;
>
> vm_unmap_aliases();
> xen_mc_flush();
>
> for(i = 0; i < entries; i += entries_per_page)
> set_aliased_prot(ldt + i, PAGE_KERNEL);
> }
>
> then it works. I don't know why this makes a difference.
> (xen_mc_flush makes a little bit of sense to me. vm_unmap_aliases
> doesn't.)
>

That fix makes sense if there's some way that the vmalloc area we're
freeing has an extra alias somewhere, which is very much possible. On
the other hand, I don't see how this happens without first doing an
MMUEXT_SET_LDT with an unexpectedly aliased address, and I would have
expected that to blow up and/or result in test case failures.

But I'm still confused, because it seems like Xen will never populate
the actual (hidden) LDT mapping unless the pages backing it are
unaliased and well-formed, which makes me wonder why this stuff ever
worked. Wouldn't LDT access with pre-existing vmalloc aliases result
in segfaults?

The semantics seem to be very odd. xen_free_ldt with an aliased
address might fail (and OOPS), but actual access to the LDT with an
aliased address page faults.

Also, using kzalloc for everything fixes the problem, which suggests
that there really is something to my theory that the problem involves
unexpected aliases.
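
The alias theory can be pictured with a toy model. This is a sketch under
loud assumptions, not Xen's actual accounting: it assumes a frame can only
be used as LDT memory once no writable mapping of it remains, and that
lazily freed vmap aliases keep counting as writable mappings until they are
flushed. struct sketch_frame and all sketch_* functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy bookkeeping for one page frame (hypothetical, not a Xen structure). */
struct sketch_frame {
	int writable_mappings;	/* all writable mappings, stale ones included */
	int stale_aliases;	/* lazily freed aliases not yet torn down */
};

/* Stand-in for flushing lazy aliases, in the spirit of vm_unmap_aliases(). */
void sketch_unmap_aliases(struct sketch_frame *f)
{
	f->writable_mappings -= f->stale_aliases;
	f->stale_aliases = 0;
}

/* Stand-in for turning the primary mapping read-only. */
void sketch_make_primary_readonly(struct sketch_frame *f)
{
	assert(f->writable_mappings > f->stale_aliases);
	f->writable_mappings--;
}

/* Assumed rule: usable as LDT memory only with no writable mapping left. */
bool sketch_usable_as_ldt(const struct sketch_frame *f)
{
	return f->writable_mappings == 0;
}
```

With one stale alias, making the primary mapping read-only is not enough in
this model; the frame only becomes usable after the aliases are flushed,
which would also explain why kzalloc memory (no vmap aliases) avoids the
problem.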

--Andy

2015-07-28 03:23:25

On 07/30/2015 04:05 PM, Andy Lutomirski wrote:
> On Thu, Jul 30, 2015 at 1:01 PM, Boris Ostrovsky
> <[email protected]> wrote:
>> On 07/30/2015 02:54 PM, Andrew Cooper wrote:
>>> On 30/07/15 19:30, Andy Lutomirski wrote:
>>>> On Wed, Jul 29, 2015 at 5:29 PM, Andrew Cooper
>>>> <[email protected]> wrote:
>>>>> On 30/07/2015 00:13, Andy Lutomirski wrote:
>>>>>> On Wed, Jul 29, 2015 at 4:02 PM, Andrew Cooper
>>>>>> <[email protected]> wrote:
>>>>>>> On 29/07/2015 23:49, Boris Ostrovsky wrote:
>>>>>>>> On 07/29/2015 06:46 PM, David Vrabel wrote:
>>>>>>>>> On 29/07/2015 23:11, Andrew Cooper wrote:
>>>>>>>>>> On 29/07/2015 23:05, Andy Lutomirski wrote:
>>>>>>>>>>> On Wed, Jul 29, 2015 at 2:37 PM, Andrew Cooper
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> On 29/07/2015 22:26, Andy Lutomirski wrote:
>>>>>>>>>>>>> On Wed, Jul 29, 2015 at 2:23 PM, Boris Ostrovsky
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>> On 07/29/2015 03:03 PM, Andrew Cooper wrote:
>>>>>>>>>>>>>>> On 29/07/15 15:43, Boris Ostrovsky wrote:
>>>>>>>>>>>>>>>> FYI, I have got a repro now and am investigating.
>>>>>>>>>>>>>>> Good and bad news. This bug has nothing to do with LDTs
>>>>>>>>>>>>>>> themselves.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have worked out what is going on, but this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/arch/x86/xen/enlighten.c
>>>>>>>>>>>>>>> b/arch/x86/xen/enlighten.c
>>>>>>>>>>>>>>> index 5abeaac..7e1a82e 100644
>>>>>>>>>>>>>>> --- a/arch/x86/xen/enlighten.c
>>>>>>>>>>>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>>>>>>>>>>>> @@ -493,6 +493,7 @@ static void set_aliased_prot(void *v,
>>>>>>>>>>>>>>> pgprot_t prot)
>>>>>>>>>>>>>>> pte = pfn_pte(pfn, prot);
>>>>>>>>>>>>>>> + (void)*(volatile int*)v;
>>>>>>>>>>>>>>> if (HYPERVISOR_update_va_mapping((unsigned long)v,
>>>>>>>>>>>>>>> pte, 0)) {
>>>>>>>>>>>>>>> pr_err("set_aliased_prot va update failed
>>>>>>>>>>>>>>> w/
>>>>>>>>>>>>>>> lazy mode
>>>>>>>>>>>>>>> %u\n", paravirt_get_lazy_mode());
>>>>>>>>>>>>>>> BUG();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is perhaps not the fix we are looking for, and every use of
>>>>>>>>>>>>>>> HYPERVISOR_update_va_mapping() is susceptible to the same
>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>> I think in most cases we know that page is mapped so hopefully
>>>>>>>>>>>>>> this is the
>>>>>>>>>>>>>> only site that we need to be careful about.
>>>>>>>>>>>>> Is there any chance we can get some kind of quick-and-dirty fix
>>>>>>>>>>>>> that
>>>>>>>>>>>>> can go to x86/urgent in the next few days even if a clean fix
>>>>>>>>>>>>> isn't
>>>>>>>>>>>>> available yet?
>>>>>>>>>>>> Quick and dirty?
>>>>>>>>>>>>
>>>>>>>>>>>> Reading from v is the most obvious and quick way, for areas where
>>>>>>>>>>>> we are
>>>>>>>>>>>> certain v exists, is kernel memory and is expected to have a
>>>>>>>>>>>> backing
>>>>>>>>>>>> page. I don't know offhand how many of current
>>>>>>>>>>>> HYPERVISOR_update_va_mapping() callsites this applies to.
>>>>>>>>>>> __get_user((char *)v, tmp), perhaps, unless there's something
>>>>>>>>>>> better
>>>>>>>>>>> in the wings. Keep in mind that we need this for -stable, and
>>>>>>>>>>> it's
>>>>>>>>>>> likely to get backported quite quickly due to CVE-2015-5157.
>>>>>>>>>> Hmm - something like that tucked inside
>>>>>>>>>> HYPERVISOR_update_va_mapping()
>>>>>>>>>> would probably work, and certainly be minimal hassle for -stable.
>>>>>>>>>>
>>>>>>>>>> Altering the hypercall used is certainly not something to backport,
>>>>>>>>>> nor
>>>>>>>>>> are we sure it is a viable fix at this time.
>>>>>>>>> Changing this one use of update_va_mapping to use
>>>>>>>>> mmu_update_normal_pt
>>>>>>>>> is the correct fix to unblock this LDT series. I see no reason why
>>>>>>>>> this
>>>>>>>>> cannot be backported.
>>>>>>>> To properly fix it should include batching and that is not something
>>>>>>>> that I think we should target for stable.
>>>>>>> Batching is absolutely not necessary to alter update_va_mapping to
>>>>>>> mmu_update_normal_pt. After all, update_va_mapping isn't batched.
>>>>>>>
>>>>>>> However this isn't the first issue we have had lazy mmu
>>>>>>> faulting,
>>>>>>> and I doubt it is the last. There are not many callsites of
>>>>>>> update_va_mapping - I will audit them tomorrow and see if any similar
>>>>>>> issues are lurking elsewhere.
>>>>>> One thing I should add: nothing flushes old aliases in xen_alloc_ldt,
>>>>>> yet I haven't been able to get xen_alloc_ldt to fail or subsequent LDT
>>>>>> access to fault. Is this something we should be worried about?
>>>>> Yes. update_va_mapping() will function perfectly well taking one RW
>>>>> mapping to RO even if there is a second RW mapping. In such a case, the
>>>>> next LDT access will fault.
>>>> Which is a problem because that alias might still exist, and also
>>>> because Linux really doesn't expect that fault.
>>>>
>>>>> On closer inspection, Xen is rather unhelpful with the fault. Xen's
>>>>> lazy #PF will be bounced back to the guest with cr2 adjusted to appear
>>>>> in the range passed to set_ldt(). The error code however will be
>>>>> unmodified (and limited only by not-user and not-reserved), so will
>>>>> appear as a non-present read or write supervisor access to an address
>>>>> which the kernel has a valid read mapping of.
>>>> More yuck.
>>>>
>>>> I think I'm just going to stick an unconditional vm_flush_aliases in
>>>> alloc_ldt.
>>>>
>>>>> Therefore, set_ldt() needs to be confident that there are no writeable
>>>>> mappings to the frames used to make up the LDT. It could proactively
>>>>> fault them in by accessing one descriptor in each page inside the limit,
>>>>> but by the time a fault is received it is probably too late to work out
>>>>> where the other mapping is which prevented the typechange (or indeed,
>>>>> whether Xen objected to one of the descriptors instead).
>>>> This seems like overkill.
>>>>
>>>> I'm still a bit confused, though: the failure is in xen_free_ldt. How
>>>> do we make it all the way to xen_free_ldt without the vmapped page
>>>> existing in the guest's page tables? After all, we had to survive
>>>> xen_alloc_ldt first, and ISTM that should fail in exactly the same
>>>> way.
>>> (Summarising part of a discussion which has just occurred on IRC)
>>>
>>> I presume that xen_free_ldt() is called while in the context of an mm
>>> which doesn't have the particular area of the vmalloc() space faulted in.
>>
>> This is exactly what's happening --- the bug is only triggered during exit
>> and xen_free_ldt() is called from someone else's context, e.g.:
>>
>> [ 53.986677] Call Trace:
>> [ 53.986677] [<c105312d>] xen_free_ldt+0x2d/0x40
>> [ 53.986677] [<c1062310>] free_ldt_struct.part.1+0x10/0x40
>> [ 53.986677] [<c1062735>] destroy_context+0x25/0x40
>> [ 53.986677] [<c10a764e>] __mmdrop+0x1e/0xc0
>> [ 53.986677] [<c10c9858>] finish_task_switch+0xd8/0x1a0
>> [ 53.986677] [<c1863736>] __schedule+0x316/0x950
>> [ 53.986677] [<c1863d96>] schedule+0x26/0x70
>> [ 53.986677] [<c10ac613>] do_wait+0x1b3/0x200
>> [ 53.986677] [<c10ac9d7>] SyS_waitpid+0x67/0xd0
>> [ 53.986677] [<c10aa820>] ? task_stopped_code+0x50/0x50
>> [ 53.986677] [<c186717a>] syscall_call+0x7/0x7
>>
>> But that would imply that this other context has mm->context.ldt of
>> ldt_gdt_32. How is that possible?
>>
> It's freed via destroy_context, which destroys someone else's LDT, right?
>

Yes, that's what it appears to be.

-boris
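
The failure mode described above (xen_free_ldt() running in an mm that never
faulted in that part of the vmalloc area) can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions: each mm keeps
its own lazily synced copy of the kernel's vmalloc mappings, and the
protection update fails when the current mm has no entry for the address.
struct sketch_mm and the sketch_* functions are hypothetical. The -EINVAL
return is an assumption too, chosen because EAX in the oops (ffffffea) is
-22 when read as a signed 32-bit value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_SLOTS 4

/* One address space: which vmalloc slots are present in its page tables. */
struct sketch_mm {
	bool present[SKETCH_SLOTS];
};

/* Stand-in for the kernel's reference page tables. */
struct sketch_mm sketch_reference_mm;

/* vmalloc() only populates the reference tables. */
void sketch_vmalloc_map(int slot)
{
	sketch_reference_mm.present[slot] = true;
}

/*
 * Reading the address: if this mm lacks the entry but the reference
 * tables have it, the fault handler copies it over (lazy sync).
 */
void sketch_touch(struct sketch_mm *mm, int slot)
{
	if (!mm->present[slot] && sketch_reference_mm.present[slot])
		mm->present[slot] = true;
}

/* Assumed behaviour: the update needs the entry in the current mm. */
int sketch_update_va_mapping(const struct sketch_mm *mm, int slot)
{
	return mm->present[slot] ? 0 : -EINVAL;
}
```

In this model the owner of the LDT has touched the mapping and the update
succeeds, while a different mm that runs destroy_context() fails until it
reads the address once, which is what the (void)*(volatile int*)v probe
quoted earlier achieves.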