2020-07-09 10:35:27

by Joerg Roedel

Subject: [PATCH] x86/idt: Make sure idt_table takes a whole page

From: Joerg Roedel <[email protected]>

On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.

As a result, symbols from other .bss sections may end up on the same
4k page as the idt_table, and will accidentally get mapped read-only
during boot, causing unexpected page-faults when the kernel writes to
them.

Avoid this by making the idt_table 4kb in size even on x86-32. On
x86-64 the idt_table is already 4kb large, so nothing changes there.

Fixes: 3e77abda65b1c ("x86/idt: Consolidate idt functionality")
Signed-off-by: Joerg Roedel <[email protected]>
---
arch/x86/kernel/idt.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index 0db21206f2f3..83e24f837127 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -157,8 +157,13 @@ static const __initconst struct idt_data apic_idts[] = {
#endif
};

-/* Must be page-aligned because the real IDT is used in the cpu entry area */
-static gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
+/*
+ * Must be page-aligned because the real IDT is used in the cpu entry area.
+ * Allocate more entries than needed so that idt_table takes a whole page, so it
+ * is safe to map the idt_table read-only and into the user-space page-table.
+ */
+#define IDT_ENTRIES_ALLOCATED (PAGE_SIZE / sizeof(gate_desc))
+static gate_desc idt_table[IDT_ENTRIES_ALLOCATED] __page_aligned_bss;

struct desc_ptr idt_descr __ro_after_init = {
.size = IDT_TABLE_SIZE - 1,
@@ -335,6 +340,9 @@ void __init idt_setup_apic_and_irq_gates(void)
idt_map_in_cea();
load_idt(&idt_descr);

+ BUILD_BUG_ON(IDT_ENTRIES_ALLOCATED < IDT_ENTRIES);
+ BUILD_BUG_ON(sizeof(idt_table) != PAGE_SIZE);
+
/* Make the IDT table read only */
set_memory_ro((unsigned long)&idt_table, 1);

--
2.27.0


2020-07-18 18:01:02

by H. Peter Anvin

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

On July 9, 2020 3:33:55 AM PDT, Joerg Roedel <[email protected]> wrote:
>From: Joerg Roedel <[email protected]>
>
>On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
>page-aligned, but the end of the .bss..page_aligned section is not
>guaranteed to be page-aligned.
>
>As a result, symbols from other .bss sections may end up on the same
>4k page as the idt_table, and will accidentially get mapped read-only
>during boot, causing unexpected page-faults when the kernel writes to
>them.
>
>Avoid this by making the idt_table 4kb in size even on x86-32. On
>x86-64 the idt_table is already 4kb large, so nothing changes there.
>
>Fixes: 3e77abda65b1c ("x86/idt: Consolidate idt functionality")
>Signed-off-by: Joerg Roedel <[email protected]>
>---
> arch/x86/kernel/idt.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
>index 0db21206f2f3..83e24f837127 100644
>--- a/arch/x86/kernel/idt.c
>+++ b/arch/x86/kernel/idt.c
>@@ -157,8 +157,13 @@ static const __initconst struct idt_data
>apic_idts[] = {
> #endif
> };
>
>-/* Must be page-aligned because the real IDT is used in the cpu entry
>area */
>-static gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
>+/*
>+ * Must be page-aligned because the real IDT is used in the cpu entry
>area.
>+ * Allocate more entries than needed so that idt_table takes a whole
>page, so it
>+ * is safe to map the idt_table read-only and into the user-space
>page-table.
>+ */
>+#define IDT_ENTRIES_ALLOCATED (PAGE_SIZE / sizeof(gate_desc))
>+static gate_desc idt_table[IDT_ENTRIES_ALLOCATED] __page_aligned_bss;
>
> struct desc_ptr idt_descr __ro_after_init = {
> .size = IDT_TABLE_SIZE - 1,
>@@ -335,6 +340,9 @@ void __init idt_setup_apic_and_irq_gates(void)
> idt_map_in_cea();
> load_idt(&idt_descr);
>
>+ BUILD_BUG_ON(IDT_ENTRIES_ALLOCATED < IDT_ENTRIES);
>+ BUILD_BUG_ON(sizeof(idt_table) != PAGE_SIZE);
>+
> /* Make the IDT table read only */
> set_memory_ro((unsigned long)&idt_table, 1);
>

NAK... this isn't the right way to fix this and just really kicks the can down the road. The reason is that you aren't fixing the module that actually has a problem.

The Right Way[TM] is to figure out which module(s) lack the proper alignment for this section. A script using objdump -h or readelf -SW running over the .o files looking for alignment less than 2**12 should spot the modules that are missing the proper .balign directives.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
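The scan described above can be sketched as a small shell script. The section names and the 2**12 threshold come from the thread; the handling of readelf's output columns is an assumption about the `-SW` layout.

```shell
#!/bin/sh
# Sketch of the proposed scan: flag object files whose
# .bss..page_aligned (or .data..page_aligned) input section is aligned
# to less than 2**12, i.e. files missing a .balign 4096 directive.
# Assumes binutils readelf; usage: sh chk-page-align.sh *.o

WANT=4096

# True if alignment $1 is below the required page alignment.
underaligned() {
	[ "$1" -lt "$WANT" ]
}

for obj in "$@"; do
	# With -SW, readelf prints one section per line with the "Al"
	# (alignment) column last.
	readelf -SW "$obj" 2>/dev/null |
	grep -E '\.(bss|data)\.\.page_aligned' |
	while read -r line; do
		al=$(printf '%s\n' "$line" | awk '{print $NF}')
		if underaligned "$al"; then
			echo "$obj: aligned to $al, want $WANT"
		fi
	done
done
```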

2020-07-18 19:27:22

by Andy Lutomirski

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page


> On Jul 18, 2020, at 10:57 AM, [email protected] wrote:
>
> On July 9, 2020 3:33:55 AM PDT, Joerg Roedel <[email protected]> wrote:
>> From: Joerg Roedel <[email protected]>
>>
>> On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
>> page-aligned, but the end of the .bss..page_aligned section is not
>> guaranteed to be page-aligned.
>>
>> As a result, symbols from other .bss sections may end up on the same
>> 4k page as the idt_table, and will accidentially get mapped read-only
>> during boot, causing unexpected page-faults when the kernel writes to
>> them.
>>
>> Avoid this by making the idt_table 4kb in size even on x86-32. On
>> x86-64 the idt_table is already 4kb large, so nothing changes there.
>>
>> Fixes: 3e77abda65b1c ("x86/idt: Consolidate idt functionality")
>> Signed-off-by: Joerg Roedel <[email protected]>
>> ---
>> arch/x86/kernel/idt.c | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
>> index 0db21206f2f3..83e24f837127 100644
>> --- a/arch/x86/kernel/idt.c
>> +++ b/arch/x86/kernel/idt.c
>> @@ -157,8 +157,13 @@ static const __initconst struct idt_data
>> apic_idts[] = {
>> #endif
>> };
>>
>> -/* Must be page-aligned because the real IDT is used in the cpu entry
>> area */
>> -static gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
>> +/*
>> + * Must be page-aligned because the real IDT is used in the cpu entry
>> area.
>> + * Allocate more entries than needed so that idt_table takes a whole
>> page, so it
>> + * is safe to map the idt_table read-only and into the user-space
>> page-table.
>> + */
>> +#define IDT_ENTRIES_ALLOCATED (PAGE_SIZE / sizeof(gate_desc))
>> +static gate_desc idt_table[IDT_ENTRIES_ALLOCATED] __page_aligned_bss;
>>
>> struct desc_ptr idt_descr __ro_after_init = {
>> .size = IDT_TABLE_SIZE - 1,
>> @@ -335,6 +340,9 @@ void __init idt_setup_apic_and_irq_gates(void)
>> idt_map_in_cea();
>> load_idt(&idt_descr);
>>
>> + BUILD_BUG_ON(IDT_ENTRIES_ALLOCATED < IDT_ENTRIES);
>> + BUILD_BUG_ON(sizeof(idt_table) != PAGE_SIZE);
>> +
>> /* Make the IDT table read only */
>> set_memory_ro((unsigned long)&idt_table, 1);
>>
>
> NAK... this isn't the right way to fix this and just really kicks the can down the road. The reason is that you aren't fixing the module that actually has a problem.
>
> The Right Way[TM] is to figure out which module(s) lack the proper alignment for this section. A script using objdump -h or readelf -SW running over the .o files looking for alignment less than 2**12 should spot the modules that are missing the proper .balign directives.

I don’t see the problem. If we are going to treat an object as though it’s 4096 bytes, making C think it’s 4096 bytes seems entirely reasonable to me.

> --

> Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-07-19 01:19:05

by H. Peter Anvin

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

On July 18, 2020 12:25:46 PM PDT, Andy Lutomirski <[email protected]> wrote:
>
>> On Jul 18, 2020, at 10:57 AM, [email protected] wrote:
>>
>> On July 9, 2020 3:33:55 AM PDT, Joerg Roedel <[email protected]>
>wrote:
>>> From: Joerg Roedel <[email protected]>
>>>
>>> On x86-32 the idt_table with 256 entries needs only 2048 bytes. It
>is
>>> page-aligned, but the end of the .bss..page_aligned section is not
>>> guaranteed to be page-aligned.
>>>
>>> As a result, symbols from other .bss sections may end up on the same
>>> 4k page as the idt_table, and will accidentially get mapped
>read-only
>>> during boot, causing unexpected page-faults when the kernel writes
>to
>>> them.
>>>
>>> Avoid this by making the idt_table 4kb in size even on x86-32. On
>>> x86-64 the idt_table is already 4kb large, so nothing changes there.
>>>
>>> Fixes: 3e77abda65b1c ("x86/idt: Consolidate idt functionality")
>>> Signed-off-by: Joerg Roedel <[email protected]>
>>> ---
>>> arch/x86/kernel/idt.c | 12 ++++++++++--
>>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
>>> index 0db21206f2f3..83e24f837127 100644
>>> --- a/arch/x86/kernel/idt.c
>>> +++ b/arch/x86/kernel/idt.c
>>> @@ -157,8 +157,13 @@ static const __initconst struct idt_data
>>> apic_idts[] = {
>>> #endif
>>> };
>>>
>>> -/* Must be page-aligned because the real IDT is used in the cpu
>entry
>>> area */
>>> -static gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
>>> +/*
>>> + * Must be page-aligned because the real IDT is used in the cpu
>entry
>>> area.
>>> + * Allocate more entries than needed so that idt_table takes a
>whole
>>> page, so it
>>> + * is safe to map the idt_table read-only and into the user-space
>>> page-table.
>>> + */
>>> +#define IDT_ENTRIES_ALLOCATED (PAGE_SIZE / sizeof(gate_desc))
>>> +static gate_desc idt_table[IDT_ENTRIES_ALLOCATED]
>__page_aligned_bss;
>>>
>>> struct desc_ptr idt_descr __ro_after_init = {
>>> .size = IDT_TABLE_SIZE - 1,
>>> @@ -335,6 +340,9 @@ void __init idt_setup_apic_and_irq_gates(void)
>>> idt_map_in_cea();
>>> load_idt(&idt_descr);
>>>
>>> + BUILD_BUG_ON(IDT_ENTRIES_ALLOCATED < IDT_ENTRIES);
>>> + BUILD_BUG_ON(sizeof(idt_table) != PAGE_SIZE);
>>> +
>>> /* Make the IDT table read only */
>>> set_memory_ro((unsigned long)&idt_table, 1);
>>>
>>
>> NAK... this isn't the right way to fix this and just really kicks the
>can down the road. The reason is that you aren't fixing the module that
>actually has a problem.
>>
>> The Right Way[TM] is to figure out which module(s) lack the proper
>alignment for this section. A script using objdump -h or readelf -SW
>running over the .o files looking for alignment less than 2**12 should
>spot the modules that are missing the proper .balign directives.
>
>I don’t see the problem. If we are going to treat an object as though
>it’s 4096 bytes, making C think it’s 4096 bytes seems entirely
>reasonable to me.
>
>> --
>
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.

It isn't the object, it is its alignment. You can have an object page-aligned so it doesn't cross page boundaries.

The bigger issue, though, is that this means there are other object files which don't have the correct alignment directives, which means that this error can crop up again at any point. The really important thing here is that we fix the real problem.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-07-19 02:46:23

by Arvind Sankar

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

On Sat, Jul 18, 2020 at 06:15:26PM -0700, [email protected] wrote:
> On July 18, 2020 12:25:46 PM PDT, Andy Lutomirski <[email protected]> wrote:
> >
> >> On Jul 18, 2020, at 10:57 AM, [email protected] wrote:
> >>
> >> On July 9, 2020 3:33:55 AM PDT, Joerg Roedel <[email protected]>
> >wrote:
> >>> From: Joerg Roedel <[email protected]>
> >>>
> >>> On x86-32 the idt_table with 256 entries needs only 2048 bytes. It
> >is
> >>> page-aligned, but the end of the .bss..page_aligned section is not
> >>> guaranteed to be page-aligned.
> >>>
> >>> As a result, symbols from other .bss sections may end up on the same
> >>> 4k page as the idt_table, and will accidentially get mapped
> >read-only
> >>> during boot, causing unexpected page-faults when the kernel writes
> >to
> >>> them.
> >>>
> >>> Avoid this by making the idt_table 4kb in size even on x86-32. On
> >>> x86-64 the idt_table is already 4kb large, so nothing changes there.
> >>>
> >>> Fixes: 3e77abda65b1c ("x86/idt: Consolidate idt functionality")
> >>> Signed-off-by: Joerg Roedel <[email protected]>
> >>
> >> NAK... this isn't the right way to fix this and just really kicks the
> >can down the road. The reason is that you aren't fixing the module that
> >actually has a problem.
> >>
> >> The Right Way[TM] is to figure out which module(s) lack the proper
> >alignment for this section. A script using objdump -h or readelf -SW
> >running over the .o files looking for alignment less than 2**12 should
> >spot the modules that are missing the proper .balign directives.
> >
> >I don’t see the problem. If we are going to treat an object as though
> >it’s 4096 bytes, making C think it’s 4096 bytes seems entirely
> >reasonable to me.
> >
> >> --
> >
> >> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
> It isn't the object, it is its alignment. You can have an object
> page-aligned so it doesn't cross page boundaries.
>
> The bigger issue, though, is that this means there are other object
> files which don't have the correct alignment directives, which means
> that this error can crop up again at any point. The really important
> thing here is that we fix the real problem. --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

To repeat the commit message, the problem is not misaligned
.bss..page_aligned objects, but symbols in _other_ .bss sections, which
can get allocated in the last page of .bss..page_aligned, because its
end isn't page-aligned (maybe it should be?).

.bss..page_aligned objects are unlikely to be misaligned, because the
section is used in C via a macro that includes the alignment attribute,
and its only use in x86 assembly is in head_{32,64}.S, which has the
correct alignment.

Given that this IDT's page is actually going to be mapped with different
page protections, it seems like allocating the full page isn't
unreasonable.

2020-07-19 10:40:20

by Thomas Gleixner

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

Arvind Sankar <[email protected]> writes:
> To repeat the commit message, the problem is not misaligned
> bss..page_aligned objects, but symbols in _other_ bss sections, which
> can get allocated in the last page of bss..page_aligned, because its end
> isn't page-aligned (maybe it should be?)

That's the real and underlying problem.

> Given that this IDT's page is actually going to be mapped with different
> page protections, it seems like allocating the full page isn't
> unreasonable.

Wrong. The expectation of bss page aligned is that each object in that
section starts at a page boundary independent of its size.

Having regular .bss objects, which have no alignment requirements,
start inside the page-aligned section just because the last object
there is not page-sized or a multiple of page size, is just hideous.

The right fix is trivial. See below.

Thanks,

tglx
----
arch/x86/kernel/vmlinux.lds.S | 1 +
include/asm-generic/vmlinux.lds.h | 1 +
2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -358,6 +358,7 @@ SECTIONS
.bss : AT(ADDR(.bss) - LOAD_OFFSET) {
__bss_start = .;
*(.bss..page_aligned)
+ . = ALIGN(PAGE_SIZE);
*(BSS_MAIN)
BSS_DECRYPTED
. = ALIGN(PAGE_SIZE);
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -738,6 +738,7 @@
.bss : AT(ADDR(.bss) - LOAD_OFFSET) { \
BSS_FIRST_SECTIONS \
*(.bss..page_aligned) \
+ . = ALIGN(PAGE_SIZE); \
*(.dynbss) \
*(BSS_MAIN) \
*(COMMON) \

2020-07-20 16:11:49

by Joerg Roedel

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

On Sun, Jul 19, 2020 at 12:39:44PM +0200, Thomas Gleixner wrote:
> The right fix is trivial. See below.
>
> Thanks,
>
> tglx
> ----
> arch/x86/kernel/vmlinux.lds.S | 1 +
> include/asm-generic/vmlinux.lds.h | 1 +
> 2 files changed, 2 insertions(+)
>
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -358,6 +358,7 @@ SECTIONS
> .bss : AT(ADDR(.bss) - LOAD_OFFSET) {
> __bss_start = .;
> *(.bss..page_aligned)
> + . = ALIGN(PAGE_SIZE);
> *(BSS_MAIN)
> BSS_DECRYPTED
> . = ALIGN(PAGE_SIZE);
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -738,6 +738,7 @@
> .bss : AT(ADDR(.bss) - LOAD_OFFSET) { \
> BSS_FIRST_SECTIONS \
> *(.bss..page_aligned) \
> + . = ALIGN(PAGE_SIZE); \
> *(.dynbss) \
> *(BSS_MAIN) \
> *(COMMON) \

I thought about that too (and doing the same for .data..page_aligned),
but decided that 'page_aligned' does not imply 'page_sized', so that
putting other variables on the same page is fine in general and saves
some memory. It breaks here only because x86 maps a variable which is
not page-sized read-only, so my thinking was that it should be fixed
right there, at the variable.

But if the above is fine too, I'll prepare a patch which also aligns
the end of .data..page_aligned.

Thanks,

Joerg

2020-07-20 16:50:10

by Thomas Gleixner

Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

Joerg Roedel <[email protected]> writes:
> On Sun, Jul 19, 2020 at 12:39:44PM +0200, Thomas Gleixner wrote:
>> *(.bss..page_aligned) \
>> + . = ALIGN(PAGE_SIZE); \
>> *(.dynbss) \
>> *(BSS_MAIN) \
>> *(COMMON) \
>
> I thougt about that too (and doing the same for .data..page_aligned),
> but decided that 'page_aligned' does not imply 'page_sized', so that
> putting other variables on the same page is fine in general and saves
> some memory. The problem why it breaks here is only because x86 maps a
> variabe which is not page-sized RO, so my thinking was that it should be
> fixed right there, at the variable.
>
> But if the above is fine too I prepare a patch which also aligns the end
> of .data..page_aligned.

If you do

struct foo foo __attribute__((aligned(PAGE_SIZE)));

then this ends up page aligned in the data section and the linker can
place another object right next to it.

But with explicit sections which store only page aligned objects there
is an implicit guarantee that the object is alone in the page in which
it is placed. That works for all objects except the last one. That's
inconsistent.

By enforcing page-sized objects for this section you might also wreck
memory sanitizers, because your object is artificially larger than it
should be and out-of-bounds access becomes legit.

Thanks,

tglx

2020-07-21 09:09:40

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH] x86/idt: Make sure idt_table takes a whole page

On Mon, Jul 20, 2020 at 06:48:03PM +0200, Thomas Gleixner wrote:
> But with explicit sections which store only page aligned objects there
> is an implicit guarantee that the object is alone in the page in which
> it is placed. That works for all objects except the last one. That's
> inconsistent.
>
> By enforcing page sized objects for this section you might also wreckage
> memory sanitizers, because your object is artificially larger than it
> should be and out of bound access becomes legit.

Okay, valid points about the consistency and the memory sanitizers. I'll
submit a patch for the linker scripts soon.

Regards,

Joerg