2013-04-08 22:45:24

by Kees Cook

[permalink] [raw]
Subject: [PATCH] x86: make IDT read-only

This makes the IDT unconditionally read-only. This primarily removes
the IDT from being a target for arbitrary memory write attacks. It has
an added benefit of also not leaking (via the "sidt" instruction) the
kernel base offset, if it has been relocated.

Signed-off-by: Kees Cook <[email protected]>
Cc: Eric Northup <[email protected]>
---
arch/x86/include/asm/fixmap.h | 4 +---
arch/x86/kernel/cpu/intel.c | 15 ---------------
arch/x86/kernel/traps.c | 8 ++++++++
arch/x86/xen/mmu.c | 4 +---
4 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index a09c285..51b9e32 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -104,9 +104,7 @@ enum fixed_addresses {
FIX_LI_PCIA, /* Lithium PCI Bridge A */
FIX_LI_PCIB, /* Lithium PCI Bridge B */
#endif
-#ifdef CONFIG_X86_F00F_BUG
- FIX_F00F_IDT, /* Virtual mapping for IDT */
-#endif
+ FIX_RO_IDT, /* Virtual mapping for read-only IDT */
#ifdef CONFIG_X86_CYCLONE_TIMER
FIX_CYCLONE_TIMER, /*cyclone timer register*/
#endif
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 1905ce9..76148a3 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -164,20 +164,6 @@ int __cpuinit ppro_with_ram_bug(void)
return 0;
}

-#ifdef CONFIG_X86_F00F_BUG
-static void __cpuinit trap_init_f00f_bug(void)
-{
- __set_fixmap(FIX_F00F_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);
-
- /*
- * Update the IDT descriptor and reload the IDT so that
- * it uses the read-only mapped virtual address.
- */
- idt_descr.address = fix_to_virt(FIX_F00F_IDT);
- load_idt(&idt_descr);
-}
-#endif
-
static void __cpuinit intel_smp_check(struct cpuinfo_x86 *c)
{
/* calling is from identify_secondary_cpu() ? */
@@ -215,7 +201,6 @@ static void __cpuinit intel_workarounds(struct cpuinfo_x86 *c)

c->f00f_bug = 1;
if (!f00f_workaround_enabled) {
- trap_init_f00f_bug();
printk(KERN_NOTICE "Intel Pentium with F0 0F bug - workaround enabled.\n");
f00f_workaround_enabled = 1;
}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 68bda7a..a2a9b78 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -753,6 +753,14 @@ void __init trap_init(void)
#endif

/*
+ * Set the IDT descriptor to a fixed read-only location, so that the
+ * "sidt" instruction will not leak the location of the kernel, and
+ * to defend the IDT against arbitrary memory write vulnerabilities.
+ * It will be reloaded in cpu_init() */
+ __set_fixmap(FIX_RO_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);
+ idt_descr.address = fix_to_virt(FIX_RO_IDT);
+
+ /*
* Should be a barrier for any external CPU state:
*/
cpu_init();
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 6afbb2c..8bc4dec 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2039,9 +2039,7 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)

switch (idx) {
case FIX_BTMAP_END ... FIX_BTMAP_BEGIN:
-#ifdef CONFIG_X86_F00F_BUG
- case FIX_F00F_IDT:
-#endif
+ case FIX_RO_IDT:
#ifdef CONFIG_X86_32
case FIX_WP_TEST:
case FIX_VDSO:
--
1.7.9.5


--
Kees Cook
Chrome OS Security


2013-04-08 22:50:22

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On 04/08/2013 03:43 PM, Kees Cook wrote:
> This makes the IDT unconditionally read-only. This primarily removes
> the IDT from being a target for arbitrary memory write attacks. It has
> an added benefit of also not leaking (via the "sidt" instruction) the
> kernel base offset, if it has been relocated.
>
> Signed-off-by: Kees Cook <[email protected]>
> Cc: Eric Northup <[email protected]>

This isn't quite what this patch does, though, right? There is still a
writable IDT mapping at all times, which is different from a true
readonly IDT, no?

-hpa

2013-04-08 22:51:08

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On 04/08/2013 03:43 PM, Kees Cook wrote:
> This makes the IDT unconditionally read-only. This primarily removes
> the IDT from being a target for arbitrary memory write attacks. It has
> an added benefit of also not leaking (via the "sidt" instruction) the
> kernel base offset, if it has been relocated.
>
> Signed-off-by: Kees Cook <[email protected]>
> Cc: Eric Northup <[email protected]>

Also, tglx: does this interfere with your per-cpu IDT efforts?

-hpa

2013-04-08 22:55:13

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Mon, Apr 8, 2013 at 3:47 PM, H. Peter Anvin <[email protected]> wrote:
> On 04/08/2013 03:43 PM, Kees Cook wrote:
>> This makes the IDT unconditionally read-only. This primarily removes
>> the IDT from being a target for arbitrary memory write attacks. It has
>> an added benefit of also not leaking (via the "sidt" instruction) the
>> kernel base offset, if it has been relocated.
>>
>> Signed-off-by: Kees Cook <[email protected]>
>> Cc: Eric Northup <[email protected]>
>
> This isn't quite what this patch does, though, right? There is still a
> writable IDT mapping at all times, which is different from a true
> readonly IDT, no?

Ah, I guess that's true. I suppose I should say it makes the memory
seen at the "sidt" location read-only. Can we make them both
read-only?

-Kees

--
Kees Cook
Chrome OS Security

2013-04-08 23:00:07

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Mon, Apr 8, 2013 at 3:56 PM, Maciej W. Rozycki <[email protected]> wrote:
> On Mon, 8 Apr 2013, Kees Cook wrote:
>
>> This makes the IDT unconditionally read-only. This primarily removes
>> the IDT from being a target for arbitrary memory write attacks. It has
>> an added benefit of also not leaking (via the "sidt" instruction) the
>> kernel base offset, if it has been relocated.
> [...]
>> --- a/arch/x86/kernel/cpu/intel.c
>> +++ b/arch/x86/kernel/cpu/intel.c
>> @@ -215,7 +201,6 @@ static void __cpuinit intel_workarounds(struct cpuinfo_x86 *c)
>>
>> c->f00f_bug = 1;
>> if (!f00f_workaround_enabled) {
>> - trap_init_f00f_bug();
>> printk(KERN_NOTICE "Intel Pentium with F0 0F bug - workaround enabled.\n");
>> f00f_workaround_enabled = 1;
>> }
>
> FWIW the change looks reasonable to me, however given that that it makes
> the arrangement unconditional and there is no longer a workaround to
> enable I think it would make sense to remove the conditional block quoted
> above altogether, along with the f00f_workaround_enabled variable itself
> (alternatively "Intel Pentium with F0 0F bug" alone could be printed
> instead and the name of the variable adjusted to make sense with the new
> meaning -- up to you to decide).

Ah, yes, I misread this and didn't see that the ifdef ended 2 lines
further down. :) I'll just remove the entire section of code.

-Kees

--
Kees Cook
Chrome OS Security

2013-04-08 23:01:44

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Mon, 8 Apr 2013, Kees Cook wrote:

> This makes the IDT unconditionally read-only. This primarily removes
> the IDT from being a target for arbitrary memory write attacks. It has
> an added benefit of also not leaking (via the "sidt" instruction) the
> kernel base offset, if it has been relocated.
[...]
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -215,7 +201,6 @@ static void __cpuinit intel_workarounds(struct cpuinfo_x86 *c)
>
> c->f00f_bug = 1;
> if (!f00f_workaround_enabled) {
> - trap_init_f00f_bug();
> printk(KERN_NOTICE "Intel Pentium with F0 0F bug - workaround enabled.\n");
> f00f_workaround_enabled = 1;
> }

FWIW the change looks reasonable to me, however given that that it makes
the arrangement unconditional and there is no longer a workaround to
enable I think it would make sense to remove the conditional block quoted
above altogether, along with the f00f_workaround_enabled variable itself
(alternatively "Intel Pentium with F0 0F bug" alone could be printed
instead and the name of the variable adjusted to make sense with the new
meaning -- up to you to decide).

Maciej

2013-04-08 23:05:57

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Mon, Apr 8, 2013 at 3:56 PM, Maciej W. Rozycki <[email protected]> wrote:
> On Mon, 8 Apr 2013, Kees Cook wrote:
>
>> This makes the IDT unconditionally read-only. This primarily removes
>> the IDT from being a target for arbitrary memory write attacks. It has
>> an added benefit of also not leaking (via the "sidt" instruction) the
>> kernel base offset, if it has been relocated.
> [...]
>> --- a/arch/x86/kernel/cpu/intel.c
>> +++ b/arch/x86/kernel/cpu/intel.c
>> @@ -215,7 +201,6 @@ static void __cpuinit intel_workarounds(struct cpuinfo_x86 *c)
>>
>> c->f00f_bug = 1;
>> if (!f00f_workaround_enabled) {
>> - trap_init_f00f_bug();
>> printk(KERN_NOTICE "Intel Pentium with F0 0F bug - workaround enabled.\n");
>> f00f_workaround_enabled = 1;
>> }
>
> FWIW the change looks reasonable to me, however given that that it makes
> the arrangement unconditional and there is no longer a workaround to
> enable I think it would make sense to remove the conditional block quoted
> above altogether, along with the f00f_workaround_enabled variable itself
> (alternatively "Intel Pentium with F0 0F bug" alone could be printed
> instead and the name of the variable adjusted to make sense with the new
> meaning -- up to you to decide).

Actually, I take it back. The other portion of the workaround is still
active (in mm/fault.c), and this chunk announces it, so I'm going to
leave it as-is.

-Kees

--
Kees Cook
Chrome OS Security

2013-04-08 23:42:53

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Mon, 8 Apr 2013, Kees Cook wrote:

> > FWIW the change looks reasonable to me, however given that that it makes
> > the arrangement unconditional and there is no longer a workaround to
> > enable I think it would make sense to remove the conditional block quoted
> > above altogether, along with the f00f_workaround_enabled variable itself
> > (alternatively "Intel Pentium with F0 0F bug" alone could be printed
> > instead and the name of the variable adjusted to make sense with the new
> > meaning -- up to you to decide).
>
> Actually, I take it back. The other portion of the workaround is still
> active (in mm/fault.c), and this chunk announces it, so I'm going to
> leave it as-is.

Ah, OK then. Thanks for double-checking.

Maciej

2013-04-09 09:23:28

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Mon, 8 Apr 2013, H. Peter Anvin wrote:

> On 04/08/2013 03:43 PM, Kees Cook wrote:
> > This makes the IDT unconditionally read-only. This primarily removes
> > the IDT from being a target for arbitrary memory write attacks. It has
> > an added benefit of also not leaking (via the "sidt" instruction) the
> > kernel base offset, if it has been relocated.
> >
> > Signed-off-by: Kees Cook <[email protected]>
> > Cc: Eric Northup <[email protected]>
>
> Also, tglx: does this interfere with your per-cpu IDT efforts?

I don't think so. And it's on the backburner at the moment.

Thanks,

tglx

2013-04-09 09:46:14

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

"H. Peter Anvin" <[email protected]> writes:

> On 04/08/2013 03:43 PM, Kees Cook wrote:
>> This makes the IDT unconditionally read-only. This primarily removes
>> the IDT from being a target for arbitrary memory write attacks. It has
>> an added benefit of also not leaking (via the "sidt" instruction) the
>> kernel base offset, if it has been relocated.
>>
>> Signed-off-by: Kees Cook <[email protected]>
>> Cc: Eric Northup <[email protected]>
>
> Also, tglx: does this interfere with your per-cpu IDT efforts?

Given that we don't change any IDT entries why would anyone want a
per-cpu IDT? The cache lines should easily be shared accross all
processors.

Or are there some giant NUMA machines that trigger cache misses when
accessing the IDT and the penalty for pulling the cache line across
the NUMA fabric is prohibitive?

Eric

2013-04-09 18:22:29

by Kees Cook

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On Tue, Apr 9, 2013 at 2:23 AM, Thomas Gleixner <[email protected]> wrote:
> On Mon, 8 Apr 2013, H. Peter Anvin wrote:
>
>> On 04/08/2013 03:43 PM, Kees Cook wrote:
>> > This makes the IDT unconditionally read-only. This primarily removes
>> > the IDT from being a target for arbitrary memory write attacks. It has
>> > an added benefit of also not leaking (via the "sidt" instruction) the
>> > kernel base offset, if it has been relocated.
>> >
>> > Signed-off-by: Kees Cook <[email protected]>
>> > Cc: Eric Northup <[email protected]>
>>
>> Also, tglx: does this interfere with your per-cpu IDT efforts?
>
> I don't think so. And it's on the backburner at the moment.

What would be a good way to do something similar for the GDT? sgdt
leaks GDT location as well, and even though it's percpu, it should be
trivial to figure out a kernel base address, IIUC.

$ ./sgdt
ffff88001fc04000
# cat /sys/kernel/debug/kernel_page_tables
...
---[ Low Kernel Mapping ]---
...
0xffff880001e00000-0xffff88001fe00000 480M RW PSE GLB NX pmd

With the IDT patch, things look good for sidt:

$ ./sidt
ffffffffff579000
# cat /sys/kernel/debug/kernel_page_tables
...
---[ End Modules ]---
0xffffffffff579000-0xffffffffff57a000 4K ro GLB NX pte

Can we create a RO fixed per-cpu area?

-Kees

--
Kees Cook
Chrome OS Security

2013-04-09 18:28:52

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On 04/09/2013 11:22 AM, Kees Cook wrote:
>
> $ ./sgdt
> ffff88001fc04000
> # cat /sys/kernel/debug/kernel_page_tables
> ...
> ---[ Low Kernel Mapping ]---
> ...
> 0xffff880001e00000-0xffff88001fe00000 480M RW PSE GLB NX pmd
>

That is the 1:1 memory map area...

-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2013-04-09 18:31:51

by Kees Cook

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On Tue, Apr 9, 2013 at 11:26 AM, H. Peter Anvin <[email protected]> wrote:
> On 04/09/2013 11:22 AM, Kees Cook wrote:
>>
>> $ ./sgdt
>> ffff88001fc04000
>> # cat /sys/kernel/debug/kernel_page_tables
>> ...
>> ---[ Low Kernel Mapping ]---
>> ...
>> 0xffff880001e00000-0xffff88001fe00000 480M RW PSE GLB NX pmd
>>
>
> That is the 1:1 memory map area...

Meaning what?

-Kees

--
Kees Cook
Chrome OS Security

2013-04-09 18:44:07

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On 04/09/2013 11:31 AM, Kees Cook wrote:
>>> ...
>>> 0xffff880001e00000-0xffff88001fe00000 480M RW PSE GLB NX pmd
>>>
>>
>> That is the 1:1 memory map area...
>
> Meaning what?
>
> -Kees
>

That's the area in which we just map 1:1 to memory. Anything allocated
with e.g. kmalloc() ends up with those addresses.

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2013-04-09 18:46:10

by Kees Cook

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On Tue, Apr 9, 2013 at 11:39 AM, H. Peter Anvin <[email protected]> wrote:
> On 04/09/2013 11:31 AM, Kees Cook wrote:
>>>> ...
>>>> 0xffff880001e00000-0xffff88001fe00000 480M RW PSE GLB NX pmd
>>>>
>>>
>>> That is the 1:1 memory map area...
>>
>> Meaning what?
>>
>> -Kees
>>
>
> That's the area in which we just map 1:1 to memory. Anything allocated
> with e.g. kmalloc() ends up with those addresses.

Ah-ha! Yes, I see now when comparing the debug/kernel_page_tables
reports. It's just the High Kernel Mapping that we care about.
Addresses outside that range are less of a leak. Excellent, then GDT
may not be a problem. Whew.

Does the v2 IDT patch look okay, BTW?

-Kees

--
Kees Cook
Chrome OS Security

2013-04-09 18:53:06

by Kees Cook

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On Tue, Apr 9, 2013 at 11:50 AM, H. Peter Anvin <[email protected]> wrote:
> On 04/09/2013 11:46 AM, Kees Cook wrote:
>>
>> Ah-ha! Yes, I see now when comparing the debug/kernel_page_tables
>> reports. It's just the High Kernel Mapping that we care about.
>> Addresses outside that range are less of a leak. Excellent, then GDT
>> may not be a problem. Whew.
>>
>
> It does beg the question if we need to randomize kmalloc... which could
> have issues by itself.

Agreed, but this should be a separate issue. As is the fact that GDT
is writable and a discoverable target.

-Kees

--
Kees Cook
Chrome OS Security

2013-04-09 18:54:18

by Eric Northup

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On Tue, Apr 9, 2013 at 11:46 AM, Kees Cook <[email protected]> wrote:
> On Tue, Apr 9, 2013 at 11:39 AM, H. Peter Anvin <[email protected]> wrote:
>> On 04/09/2013 11:31 AM, Kees Cook wrote:
>>>>> ...
>>>>> 0xffff880001e00000-0xffff88001fe00000 480M RW PSE GLB NX pmd
>>>>>
>>>>
>>>> That is the 1:1 memory map area...
>>>
>>> Meaning what?
>>>
>>> -Kees
>>>
>>
>> That's the area in which we just map 1:1 to memory. Anything allocated
>> with e.g. kmalloc() ends up with those addresses.
>
> Ah-ha! Yes, I see now when comparing the debug/kernel_page_tables
> reports. It's just the High Kernel Mapping that we care about.
> Addresses outside that range are less of a leak. Excellent, then GDT
> may not be a problem. Whew.

The GDT is a problem if the address returned by 'sgdt' is
kernel-writable - it doesn't necessarily reveal the random offset, but
I'm pretty sure that writing to the GDT could cause privilege
escalation.

>
> Does the v2 IDT patch look okay, BTW?
>
> -Kees
>
> --
> Kees Cook
> Chrome OS Security

2013-04-09 18:57:05

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On 04/09/2013 11:46 AM, Kees Cook wrote:
>
> Ah-ha! Yes, I see now when comparing the debug/kernel_page_tables
> reports. It's just the High Kernel Mapping that we care about.
> Addresses outside that range are less of a leak. Excellent, then GDT
> may not be a problem. Whew.
>

It does beg the question if we need to randomize kmalloc... which could
have issues by itself.

-hpa



--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2013-04-09 19:02:13

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On 04/09/2013 11:54 AM, Eric Northup wrote:
>
> The GDT is a problem if the address returned by 'sgdt' is
> kernel-writable - it doesn't necessarily reveal the random offset, but
> I'm pretty sure that writing to the GDT could cause privilege
> escalation.
>

That is a pretty safe assumption...

-hpa

2013-04-10 00:19:55

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only

On 04/09/2013 11:22 AM, Kees Cook wrote:
>
> Can we create a RO fixed per-cpu area?
>

"Fixed" and "percpu" are mutually exclusive...

-hpa

2013-04-10 00:44:18

by H. Peter Anvin

[permalink] [raw]
Subject: Readonly GDT

OK, thinking about the GDT here.

The GDT is quite small -- 256 bytes on i386, 128 bytes on x86-64. As
such, we probably don't want to allocate a full page to it for only
that. This means that in order to create a readonly mapping we have to
pack GDTs from different CPUs together in the same pages, *or* we
tolerate that other things on the same page gets reflected in the same
mapping.

However, the packing solution has the advantage of reducing address
space consumption which matters on 32 bits: even on i386 we can easily
burn a megabyte of address space for 4096 processors, but burning 16
megabytes starts to hurt.

It would be important to measure the performance impact on task switch,
though.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2013-04-10 00:53:58

by Steven Rostedt

[permalink] [raw]
Subject: Re: Readonly GDT

On Tue, 2013-04-09 at 17:43 -0700, H. Peter Anvin wrote:
> OK, thinking about the GDT here.
>
> The GDT is quite small -- 256 bytes on i386, 128 bytes on x86-64. As
> such, we probably don't want to allocate a full page to it for only
> that. This means that in order to create a readonly mapping we have to
> pack GDTs from different CPUs together in the same pages, *or* we
> tolerate that other things on the same page gets reflected in the same
> mapping.

What about grouping via nodes?

>
> However, the packing solution has the advantage of reducing address
> space consumption which matters on 32 bits: even on i386 we can easily
> burn a megabyte of address space for 4096 processors, but burning 16
> megabytes starts to hurt.

Having 4096 32 bit processors, you deserve what you get. ;-)

-- Steve

>
> It would be important to measure the performance impact on task switch,
> though.

2013-04-10 01:05:19

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Readonly GDT

On 04/09/2013 05:53 PM, Steven Rostedt wrote:
> On Tue, 2013-04-09 at 17:43 -0700, H. Peter Anvin wrote:
>> OK, thinking about the GDT here.
>>
>> The GDT is quite small -- 256 bytes on i386, 128 bytes on x86-64. As
>> such, we probably don't want to allocate a full page to it for only
>> that. This means that in order to create a readonly mapping we have to
>> pack GDTs from different CPUs together in the same pages, *or* we
>> tolerate that other things on the same page gets reflected in the same
>> mapping.
>
> What about grouping via nodes?
>

Would be nicer for locality, although probably adds [even] more complexity.

We don't really care about 32-bit NUMA anymore -- it keeps getting
suggested for deletion, even. For 64-bit it might make sense to just
reflect out of the percpu area even though it munches address space.

>>
>> However, the packing solution has the advantage of reducing address
>> space consumption which matters on 32 bits: even on i386 we can easily
>> burn a megabyte of address space for 4096 processors, but burning 16
>> megabytes starts to hurt.
>
> Having 4096 32 bit processors, you deserve what you get. ;-)
>

Well, the main problem is that it might get difficult to make this a
runtime thing; it more likely ends up being a compile-time bit.

-hpa

2013-04-10 09:41:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only


* Kees Cook <[email protected]> wrote:

> > That's the area in which we just map 1:1 to memory. Anything allocated with
> > e.g. kmalloc() ends up with those addresses.
>
> Ah-ha! Yes, I see now when comparing the debug/kernel_page_tables reports. It's
> just the High Kernel Mapping that we care about. Addresses outside that range
> are less of a leak. Excellent, then GDT may not be a problem. Whew.

It's still an infoleak to worry about: any function pointers nearby matter, and
the x86 GDT is obviously full of useful and highly privilege-relevant function
pointers ...

I have no objections against read-only mapping the GDT as well.

Thanks,

Ingo

2013-04-10 09:42:39

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] Readonly GDT

>>> On 10.04.13 at 02:43, "H. Peter Anvin" <[email protected]> wrote:
> OK, thinking about the GDT here.
>
> The GDT is quite small -- 256 bytes on i386, 128 bytes on x86-64. As
> such, we probably don't want to allocate a full page to it for only
> that. This means that in order to create a readonly mapping we have to
> pack GDTs from different CPUs together in the same pages, *or* we
> tolerate that other things on the same page gets reflected in the same
> mapping.

I think a read-only GDT is incompatible with exceptions delivered
through task gates (i.e. double fault on 32-bit), so I would assume
this needs to remain a 64-bit only thing.

> However, the packing solution has the advantage of reducing address
> space consumption which matters on 32 bits: even on i386 we can easily
> burn a megabyte of address space for 4096 processors, but burning 16
> megabytes starts to hurt.

Packing would have the additional benefit of Xen not needing to
become a special case in yet another area (because pages
containing live descriptor table entries need to be read-only for
PV guests, and need to consist of only descriptor table entries).

Jan

2013-04-10 09:52:18

by Ingo Molnar

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: [PATCH] x86: make IDT read-only


* H. Peter Anvin <[email protected]> wrote:

> On 04/09/2013 11:22 AM, Kees Cook wrote:
> >
> > Can we create a RO fixed per-cpu area?
> >
>
> "Fixed" and "percpu" are mutually exclusive...

There's a fixmap area that holds kmap_atomic() percpu mappings:

FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,

In a similar fashion we could use a per CPU high-mapped read-only alias as well
(assuming it fits, memory is pretty tight there).

Thanks,

Ingo

2013-04-10 09:57:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only


* Eric W. Biederman <[email protected]> wrote:

> "H. Peter Anvin" <[email protected]> writes:
>
> > On 04/08/2013 03:43 PM, Kees Cook wrote:
> >> This makes the IDT unconditionally read-only. This primarily removes
> >> the IDT from being a target for arbitrary memory write attacks. It has
> >> an added benefit of also not leaking (via the "sidt" instruction) the
> >> kernel base offset, if it has been relocated.
> >>
> >> Signed-off-by: Kees Cook <[email protected]>
> >> Cc: Eric Northup <[email protected]>
> >
> > Also, tglx: does this interfere with your per-cpu IDT efforts?
>
> Given that we don't change any IDT entries why would anyone want a
> per-cpu IDT? The cache lines should easily be shared accross all
> processors.

That's true iif they are cached.

If not then it's a remote DRAM access cache miss for all CPUs except the node that
holds that memory.

> Or are there some giant NUMA machines that trigger cache misses when accessing
> the IDT and the penalty for pulling the cache line across the NUMA fabric is
> prohibitive?

IDT accesses for pure userspace execution are pretty rare. So we are not just
talking about huge NUMA machines here but about ordinary NUMA machines taking a
remote cache miss hit for the first IRQ or other IDT-accessing operation they do
after some cache-intense user-space processing.

It's a small effect, but it exists and improving it would be legitimate.

Thanks,

Ingo

2013-04-10 10:40:54

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

Ingo Molnar <[email protected]> writes:

> * Eric W. Biederman <[email protected]> wrote:
>
>> "H. Peter Anvin" <[email protected]> writes:
>>
>> > On 04/08/2013 03:43 PM, Kees Cook wrote:
>> >> This makes the IDT unconditionally read-only. This primarily removes
>> >> the IDT from being a target for arbitrary memory write attacks. It has
>> >> an added benefit of also not leaking (via the "sidt" instruction) the
>> >> kernel base offset, if it has been relocated.
>> >>
>> >> Signed-off-by: Kees Cook <[email protected]>
>> >> Cc: Eric Northup <[email protected]>
>> >
>> > Also, tglx: does this interfere with your per-cpu IDT efforts?
>>
>> Given that we don't change any IDT entries why would anyone want a
>> per-cpu IDT? The cache lines should easily be shared accross all
>> processors.
>
> That's true iif they are cached.
>
> If not then it's a remote DRAM access cache miss for all CPUs except the node that
> holds that memory.
>
>> Or are there some giant NUMA machines that trigger cache misses when accessing
>> the IDT and the penalty for pulling the cache line across the NUMA fabric is
>> prohibitive?
>
> IDT accesses for pure userspace execution are pretty rare. So we are not just
> talking about huge NUMA machines here but about ordinary NUMA machines taking a
> remote cache miss hit for the first IRQ or other IDT-accessing operation they do
> after some cache-intense user-space processing.
>
> It's a small effect, but it exists and improving it would be
> legitimate.

If the effect is measurable I agree it is a legitimate optimization. At
one point there was a suggestion to make the code in the IDT vectors
differ based on the which interrupt was registed. While that can also
reduce cache misses that can get hairy very quickly, and of course that
would require read-write IDTs.

My only practical concern with duplicating the IDT tables per cpu is (a)
there are generic idt handlers that remain unduplicated reducing the
benefit and this is essentially the same optimization as making the
entire kernel text per cpu which last time it was examined was not an
optimization worth making. So I wonder if just a subset of the
optimization is worth making.

Eric

2013-04-10 14:18:35

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [Xen-devel] Readonly GDT

Right... the TSS does get written to during a task switch.

Jan Beulich <[email protected]> wrote:

>>>> On 10.04.13 at 02:43, "H. Peter Anvin" <[email protected]> wrote:
>> OK, thinking about the GDT here.
>>
>> The GDT is quite small -- 256 bytes on i386, 128 bytes on x86-64. As
>> such, we probably don't want to allocate a full page to it for only
>> that. This means that in order to create a readonly mapping we have
>to
>> pack GDTs from different CPUs together in the same pages, *or* we
>> tolerate that other things on the same page gets reflected in the
>same
>> mapping.
>
>I think a read-only GDT is incompatible with exceptions delivered
>through task gates (i.e. double fault on 32-bit), so I would assume
>this needs to remain a 64-bit only thing.
>
>> However, the packing solution has the advantage of reducing address
>> space consumption which matters on 32 bits: even on i386 we can
>easily
>> burn a megabyte of address space for 4096 processors, but burning 16
>> megabytes starts to hurt.
>
>Packing would have the additional benefit of Xen not needing to
>become a special case in yet another area (because pages
>containing live descriptor table entries need to be read-only for
>PV guests, and need to consist of only descriptor table entries).
>
>Jan

--
Sent from my mobile phone. Please excuse brevity and lack of formatting.

2013-04-10 16:31:45

by Eric Northup

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On Wed, Apr 10, 2013 at 3:40 AM, Eric W. Biederman
<[email protected]> wrote:
> Ingo Molnar <[email protected]> writes:
>
>> * Eric W. Biederman <[email protected]> wrote:
>>
>>> "H. Peter Anvin" <[email protected]> writes:
>>>
>>> > On 04/08/2013 03:43 PM, Kees Cook wrote:
>>> >> This makes the IDT unconditionally read-only. This primarily removes
>>> >> the IDT from being a target for arbitrary memory write attacks. It has
>>> >> an added benefit of also not leaking (via the "sidt" instruction) the
>>> >> kernel base offset, if it has been relocated.
>>> >>
>>> >> Signed-off-by: Kees Cook <[email protected]>
>>> >> Cc: Eric Northup <[email protected]>
>>> >
>>> > Also, tglx: does this interfere with your per-cpu IDT efforts?
>>>
>>> Given that we don't change any IDT entries why would anyone want a
>>> per-cpu IDT? The cache lines should easily be shared accross all
>>> processors.
>>
>> That's true iif they are cached.
>>
>> If not then it's a remote DRAM access cache miss for all CPUs except the node that
>> holds that memory.
>>
>>> Or are there some giant NUMA machines that trigger cache misses when accessing
>>> the IDT and the penalty for pulling the cache line across the NUMA fabric is
>>> prohibitive?
>>
>> IDT accesses for pure userspace execution are pretty rare. So we are not just
>> talking about huge NUMA machines here but about ordinary NUMA machines taking a
>> remote cache miss hit for the first IRQ or other IDT-accessing operation they do
>> after some cache-intense user-space processing.
>>
>> It's a small effect, but it exists and improving it would be
>> legitimate.
>
> If the effect is measurable I agree it is a legitimate optimization. At
> one point there was a suggestion to make the code in the IDT vectors
> differ based on the which interrupt was registed. While that can also
> reduce cache misses that can get hairy very quickly, and of course that
> would require read-write IDTs.

read-write IDT or GDT are fine: map them twice, once read+write, once
read-only. Point the GDTR and IDTR at the read-only alias.

>
> My only practical concern with duplicating the IDT tables per cpu is (a)
> there are generic idt handlers that remain unduplicated reducing the
> benefit and this is essentially the same optimization as making the
> entire kernel text per cpu which last time it was examined was not an
> optimization worth making. So I wonder if just a subset of the
> optimization is worth making.
>
> Eric

2013-04-10 16:50:42

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86: make IDT read-only

On 04/10/2013 09:31 AM, Eric Northup wrote:
>>
>> If the effect is measurable I agree it is a legitimate optimization. At
>> one point there was a suggestion to make the code in the IDT vectors
>> differ based on the which interrupt was registed. While that can also
>> reduce cache misses that can get hairy very quickly, and of course that
>> would require read-write IDTs.
>
> read-write IDT or GDT are fine: map them twice, once read+write, once
> read-only. Point the GDTR and IDTR at the read-only alias.
>

Well, it is weaker, because if you can discover the pointer to the
writable alias you win.

Now, as has been pointed out the GDT needs to be writable on 32 bits as
a matter of hardware requirement. However, doing it for 64 bits only is
probably enough of a win.

-hpa

2013-04-10 18:33:11

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [Xen-devel] Readonly GDT

On 04/10/2013 02:42 AM, Jan Beulich wrote:
>
>> However, the packing solution has the advantage of reducing address
>> space consumption which matters on 32 bits: even on i386 we can easily
>> burn a megabyte of address space for 4096 processors, but burning 16
>> megabytes starts to hurt.
>
> Packing would have the additional benefit of Xen not needing to
> become a special case in yet another area (because pages
> containing live descriptor table entries need to be read-only for
> PV guests, and need to consist of only descriptor table entries).
>

OK, so on 64 bits this sounds like a win all around, whereas it is not
feasible on 32 bits (which means we don't really need to worry about
burning address space.)

So, anyone interested in implementing this for 64 bits?

-hpa