2009-10-13 01:03:57

by Siarhei Liakh

Subject: [PATCH V5] x86: NX protection for kernel data

This patch expands functionality of CONFIG_DEBUG_RODATA to set main
(static) kernel data area as NX.
The following steps are taken to achieve this:
1. Linker script is adjusted so .text always starts and ends on a page boundary
2. Linker script is adjusted so .rodata and .data always start and
end on a page boundary
3. void mark_nxdata_nx(void) added to arch/x86/mm/init.c with actual
functionality: NX is set for all pages from _etext through _end.
4. mark_nxdata_nx() called from free_initmem() (after init has been released)
5. free_init_pages() sets released memory NX in arch/x86/mm/init.c
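
As a rough illustration of step 3 (not part of the patch), the range arithmetic can be tried in user space. The sketch below assumes 4 KiB pages; the sketch_ names are hypothetical stand-ins for the kernel's PAGE_ALIGN and PAGE_SHIFT, and any addresses passed in are examples, not values taken from a real build.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on 32-bit x86. The kernel has its own macros. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Round addr up to the next page boundary, mirroring what PAGE_ALIGN does. */
static unsigned long sketch_page_align(unsigned long addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Number of pages between etext and end once both are rounded up to a
 * page boundary. This is the count that would be handed to set_memory_nx().
 */
static unsigned long sketch_nx_pages(unsigned long etext, unsigned long end)
{
    unsigned long start = sketch_page_align(etext);
    unsigned long stop = sketch_page_align(end);

    assert(stop >= start);
    return (stop - start) >> SKETCH_PAGE_SHIFT;
}
```

Once the linker script pads .text to a page boundary (step 1), rounding _etext up changes nothing, so the NX range starts exactly where the text ends.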

The patch has been developed for Linux 2.6.31-rc7 x86 by Siarhei Liakh
<[email protected]> and Xuxian Jiang <[email protected]>.

V1: initial patch for 2.6.30
V2: patch for 2.6.31-rc7
V3: moved all code into arch/x86, adjusted credits
V4: fixed ifdef, removed credits from CREDITS
V5: fixed an address calculation bug in mark_nxdata_nx()
---

Signed-off-by: Siarhei Liakh <[email protected]>
Signed-off-by: Xuxian Jiang <[email protected]>

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 78d185d..83ae734 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -43,14 +43,14 @@ jiffies_64 = jiffies;

PHDRS {
text PT_LOAD FLAGS(5); /* R_E */
- data PT_LOAD FLAGS(7); /* RWE */
+ data PT_LOAD FLAGS(6); /* RW_ */
#ifdef CONFIG_X86_64
- user PT_LOAD FLAGS(7); /* RWE */
- data.init PT_LOAD FLAGS(7); /* RWE */
+ user PT_LOAD FLAGS(6); /* RW_ */
+ data.init PT_LOAD FLAGS(6); /* RW_ */
#ifdef CONFIG_SMP
- percpu PT_LOAD FLAGS(7); /* RWE */
+ percpu PT_LOAD FLAGS(6); /* RW_ */
#endif
- data.init2 PT_LOAD FLAGS(7); /* RWE */
+ data.init2 PT_LOAD FLAGS(6); /* RW_ */
#endif
note PT_NOTE FLAGS(0); /* ___ */
}
@@ -89,6 +89,8 @@ SECTIONS
IRQENTRY_TEXT
*(.fixup)
*(.gnu.warning)
+ /* .text should occupy whole number of pages */
+ . = ALIGN(PAGE_SIZE);
/* End of text section */
_etext = .;
} :text = 0x9090
@@ -151,6 +153,8 @@ SECTIONS
.data.read_mostly : AT(ADDR(.data.read_mostly) - LOAD_OFFSET) {
*(.data.read_mostly)

+ /* .data should occupy whole number of pages */
+ . = ALIGN(PAGE_SIZE);
/* End of data section */
_edata = .;
}
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 0607119..7bfd411 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -423,9 +423,10 @@ void free_init_pages(char *what, unsigned long
begin, unsigned long end)
/*
* We just marked the kernel text read only above, now that
* we are going to free part of that, we need to make that
- * writeable first.
+ * writeable and non-executable first.
*/
set_memory_rw(begin, (end - begin) >> PAGE_SHIFT);
+ set_memory_nx(begin, (end - begin) >> PAGE_SHIFT);

printk(KERN_INFO "Freeing %s: %luk freed\n", what, (end - begin) >> 10);

@@ -440,11 +441,29 @@ void free_init_pages(char *what, unsigned long
begin, unsigned long end)
#endif
}

+void mark_nxdata_nx(void)
+{
+#ifdef CONFIG_DEBUG_RODATA
+ /*
+ * When this is called, init has already been executed and released,
+ * so everything past _etext should be NX.
+ */
+ unsigned long start = PAGE_ALIGN((unsigned long)(&_etext));
+ unsigned long size = PAGE_ALIGN((unsigned long)(&_end)) - start;
+
+ printk(KERN_INFO "NX-protecting the kernel data: %lx, %lu pages\n",
+ start, size >> PAGE_SHIFT);
+ set_memory_nx(start, size >> PAGE_SHIFT);
+#endif
+}
+
void free_initmem(void)
{
free_init_pages("unused kernel memory",
(unsigned long)(&__init_begin),
(unsigned long)(&__init_end));
+ /* Set kernel's data as NX */
+ mark_nxdata_nx();
}

#ifdef CONFIG_BLK_DEV_INITRD
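
For readers unfamiliar with the FLAGS() values in the PHDRS hunk above: they are the ELF program header permission bits (execute = 1, write = 2, read = 4). A minimal sketch of the decoding follows; the function and macro names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* ELF program header permission bits, as defined by the ELF specification. */
#define SKETCH_PF_X 0x1
#define SKETCH_PF_W 0x2
#define SKETCH_PF_R 0x4

/*
 * Fill out[] with a three-character string in the same "RWE" notation
 * the linker script comments use, with '_' for a cleared bit.
 */
static const char *sketch_phdr_flags(unsigned int flags, char out[4])
{
    assert(out != NULL);
    out[0] = (flags & SKETCH_PF_R) ? 'R' : '_';
    out[1] = (flags & SKETCH_PF_W) ? 'W' : '_';
    out[2] = (flags & SKETCH_PF_X) ? 'E' : '_';
    out[3] = '\0';
    return out;
}
```

Decoding 7 gives "RWE" and 6 gives "RW_", so the change from FLAGS(7) to FLAGS(6) only clears the execute bit on the data segments.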


2009-10-13 04:32:34

by Arjan van de Ven

Subject: Re: [PATCH V5] x86: NX protection for kernel data

On Mon, 12 Oct 2009 21:03:17 -0400
Siarhei Liakh <[email protected]> wrote:

> This patch expands functionality of CONFIG_DEBUG_RODATA to set main
> (static) kernel data area as NX.
> The following steps are taken to achieve this:
> 1. Linker script is adjusted so .text always starts and ends on a
> page boundary
> 2. Linker script is adjusted so .rodata and .data always start and
> end on a page boundary
> 3. void mark_nxdata_nx(void) added to arch/x86/mm/init.c with actual
> functionality: NX is set for all pages from _etext through _end.
> 4. mark_nxdata_nx() called from free_initmem() (after init has been
> released)
> 5. free_init_pages() sets released memory NX in arch/x86/mm/init.c
>
> The patch have been developed for Linux 2.6.31-rc7 x86 by Siarhei
> Liakh <[email protected]> and Xuxian Jiang <[email protected]>.
>

I like doing this, but... maybe it is useful to have a diff of the
pagetable dump (PT_DUMP config option) to show the effect, in the
changelog. That'd be like the proof on the pudding...


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2009-10-13 06:04:15

by Ingo Molnar

Subject: Re: [PATCH V5] x86: NX protection for kernel data


* Arjan van de Ven <[email protected]> wrote:

> On Mon, 12 Oct 2009 21:03:17 -0400
> Siarhei Liakh <[email protected]> wrote:
>
> > This patch expands functionality of CONFIG_DEBUG_RODATA to set main
> > (static) kernel data area as NX.
> > The following steps are taken to achieve this:
> > 1. Linker script is adjusted so .text always starts and ends on a
> > page boundary
> > 2. Linker script is adjusted so .rodata and .data always start and
> > end on a page boundary
> > 3. void mark_nxdata_nx(void) added to arch/x86/mm/init.c with actual
> > functionality: NX is set for all pages from _etext through _end.
> > 4. mark_nxdata_nx() called from free_initmem() (after init has been
> > released)
> > 5. free_init_pages() sets released memory NX in arch/x86/mm/init.c
> >
> > The patch have been developed for Linux 2.6.31-rc7 x86 by Siarhei
> > Liakh <[email protected]> and Xuxian Jiang <[email protected]>.
> >
>
> I like doing this, but... maybe it is useful to have a diff of the
> pagetable dump (PT_DUMP config option) to show the effect, in the
> changelog. That'd be like the proof on the pudding...

That's a good suggestion. Siarhei Liakh, mind doing that?

Thanks,

Ingo

2009-10-13 07:15:11

by David Howells

Subject: Re: [PATCH V5] x86: NX protection for kernel data

Siarhei Liakh <[email protected]> wrote:

> @@ -440,11 +441,29 @@ void free_init_pages(char *what, unsigned long
> begin, unsigned long end)

Your mail client is word wrapping your patches.

David

2009-10-13 07:49:47

by David Howells

Subject: Re: [PATCH V5] x86: NX protection for kernel data

Siarhei Liakh <[email protected]> wrote:

> This patch expands functionality of CONFIG_DEBUG_RODATA to set main
> (static) kernel data area as NX.
> The following steps are taken to achieve this:
> 1. Linker script is adjusted so .text always starts and ends on a page boundary
> 2. Linker script is adjusted so .rodata and .data always start and
> end on a page boundary
> 3. void mark_nxdata_nx(void) added to arch/x86/mm/init.c with actual
> functionality: NX is set for all pages from _etext through _end.
> 4. mark_nxdata_nx() called from free_initmem() (after init has been released)
> 5. free_init_pages() sets released memory NX in arch/x86/mm/init.c
>
> The patch have been developed for Linux 2.6.31-rc7 x86 by Siarhei Liakh
> <[email protected]> and Xuxian Jiang <[email protected]>.
>
> V1: initial patch for 2.6.30
> V2: patch for 2.6.31-rc7
> V3: moved all code into arch/x86, adjusted credits
> V4: fixed ifdef, removed credits from CREDITS
> V5: fixed an address calculation bug in mark_nxdata_nx()
> ---
>
> Signed-off-by: Siarhei Liakh <[email protected]>
> Signed-off-by: Xuxian Jiang <[email protected]>

That seems to fix the problem, thanks.

Acked-by: David Howells <[email protected]>

2009-10-13 11:36:10

by Siarhei Liakh

Subject: Re: [PATCH V5] x86: NX protection for kernel data

>> I like doing this, but... maybe it is useful to have a diff of the
>> pagetable dump (PT_DUMP config option) to show the effect, in the
>> changelog. That'd be like the proof on the pudding...
>
> That's a good suggestion. Siarhei Liakh, mind doing that?

Here you go:
===============================================
--- data_nx_pt_before.txt 2009-10-13 07:26:17.000000000 -0400
+++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400
@@ -2,12 +2,9 @@
0x00000000-0xc0000000 3G pmd
---[ Kernel Mapping ]---
0xc0000000-0xc0100000 1M RW GLB x pte
-0xc0100000-0xc048d000 3636K ro GLB x pte
-0xc048d000-0xc04d0000 268K RW GLB x pte
-0xc04d0000-0xc04d2000 8K RW GLB NX pte
-0xc04d2000-0xc04d3000 4K RW GLB x pte
-0xc04d3000-0xc0531000 376K RW GLB NX pte
-0xc0531000-0xc0600000 828K RW GLB x pte
+0xc0100000-0xc0381000 2564K ro GLB x pte
+0xc0381000-0xc048d000 1072K ro GLB NX pte
+0xc048d000-0xc0600000 1484K RW GLB NX pte
0xc0600000-0xf7800000 882M RW PSE GLB NX pmd
0xf7800000-0xf79fe000 2040K RW GLB NX pte
0xf79fe000-0xf7a00000 8K pte
===============================================
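
The size column of the dump can be cross-checked from the addresses. The helper below is a hypothetical user-space sketch (its name is not from the kernel) that assumes each row describes a half-open range.

```c
#include <assert.h>

/* Size in KiB of the half-open virtual address range [start, end). */
static unsigned long sketch_range_kib(unsigned long start, unsigned long end)
{
    assert(end >= start);
    return (end - start) >> 10;
}
```

For the rows above, 0xc0100000-0xc0381000 comes to 2564K and 0xc0381000-0xc048d000 to 1072K. Together they account for the 3636K that was a single "ro x" range before the patch, now split into executable text and NX read-only data.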

Would you like me to re-post whole patch with this addition?

Thanks.

2009-10-13 12:29:14

by Ingo Molnar

Subject: Re: [PATCH V5] x86: NX protection for kernel data


* Siarhei Liakh <[email protected]> wrote:

> >> I like doing this, but... maybe it is useful to have a diff of the
> >> pagetable dump (PT_DUMP config option) to show the effect, in the
> >> changelog. That'd be like the proof on the pudding...
> >
> > That's a good suggestion. Siarhei Liakh, mind doing that?
>
> Here you go:
> ===============================================
> --- data_nx_pt_before.txt 2009-10-13 07:26:17.000000000 -0400
> +++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400
> @@ -2,12 +2,9 @@
> 0x00000000-0xc0000000 3G pmd
> ---[ Kernel Mapping ]---
> 0xc0000000-0xc0100000 1M RW GLB x pte
> -0xc0100000-0xc048d000 3636K ro GLB x pte
> -0xc048d000-0xc04d0000 268K RW GLB x pte
> -0xc04d0000-0xc04d2000 8K RW GLB NX pte
> -0xc04d2000-0xc04d3000 4K RW GLB x pte
> -0xc04d3000-0xc0531000 376K RW GLB NX pte
> -0xc0531000-0xc0600000 828K RW GLB x pte
> +0xc0100000-0xc0381000 2564K ro GLB x pte
> +0xc0381000-0xc048d000 1072K ro GLB NX pte
> +0xc048d000-0xc0600000 1484K RW GLB NX pte
> 0xc0600000-0xf7800000 882M RW PSE GLB NX pmd
> 0xf7800000-0xf79fe000 2040K RW GLB NX pte
> 0xf79fe000-0xf7a00000 8K pte
> ===============================================
>
> Would you like me to re-post whole patch with this addition?

Yep, v6 with Arjan's ack (once he sends it) would be handy.

Ingo

2009-10-13 14:07:29

by Arjan van de Ven

Subject: Re: [PATCH V5] x86: NX protection for kernel data

On Tue, 13 Oct 2009 07:35:28 -0400
Siarhei Liakh <[email protected]> wrote:

> ---[ Kernel Mapping ]---
> 0xc0000000-0xc0100000 1M RW GLB x pte
> -0xc0100000-0xc048d000 3636K ro GLB x pte
> -0xc048d000-0xc04d0000 268K RW GLB x pte
> -0xc04d0000-0xc04d2000 8K RW GLB NX pte
> -0xc04d2000-0xc04d3000 4K RW GLB x pte
> -0xc04d3000-0xc0531000 376K RW GLB NX pte
> -0xc0531000-0xc0600000 828K RW GLB x pte
> +0xc0100000-0xc0381000 2564K ro GLB x pte
> +0xc0381000-0xc048d000 1072K ro GLB NX pte
> +0xc048d000-0xc0600000 1484K RW GLB NX pte
> 0xc0600000-0xf7800000 882M RW PSE GLB NX pmd
> 0xf7800000-0xf79fe000 2040K RW GLB NX pte
> 0xf79fe000-0xf7a00000 8K pte
> ===============================================
>

looks great to me; the result is
* kernel is ro + x
* rodata is ro + NX
* data is RW + NX
(and there is no "RW + x", other than the first megabyte... hmm. maybe
we need to look at that as well at some point)
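
The same check can be done mechanically. This is only a sketch: the rows are transcribed by hand from the dump quoted above, and the struct and function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_range {
    unsigned long start;
    unsigned long end;
    int writable;    /* 1 for "RW" in the dump, 0 for "ro" */
    int executable;  /* 1 for "x" in the dump, 0 for "NX" */
};

/* Kernel mapping rows below 0xc0600000 before the patch. */
static const struct sketch_range sketch_before[] = {
    { 0xc0000000UL, 0xc0100000UL, 1, 1 },
    { 0xc0100000UL, 0xc048d000UL, 0, 1 },
    { 0xc048d000UL, 0xc04d0000UL, 1, 1 },
    { 0xc04d0000UL, 0xc04d2000UL, 1, 0 },
    { 0xc04d2000UL, 0xc04d3000UL, 1, 1 },
    { 0xc04d3000UL, 0xc0531000UL, 1, 0 },
    { 0xc0531000UL, 0xc0600000UL, 1, 1 },
};

/* The same region after the patch. */
static const struct sketch_range sketch_after[] = {
    { 0xc0000000UL, 0xc0100000UL, 1, 1 },
    { 0xc0100000UL, 0xc0381000UL, 0, 1 },
    { 0xc0381000UL, 0xc048d000UL, 0, 0 },
    { 0xc048d000UL, 0xc0600000UL, 1, 0 },
};

/* Count the ranges that are both writable and executable. */
static size_t sketch_count_wx(const struct sketch_range *r, size_t n)
{
    size_t i, count = 0;

    assert(r != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (r[i].writable && r[i].executable)
            count++;
    return count;
}
```

Run over the two tables, the count drops from four writable and executable ranges to one, and the remaining one is the first megabyte.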

Acked-by: Arjan van de Ven <[email protected]>


2009-10-13 14:16:28

by Ingo Molnar

Subject: Re: [PATCH V5] x86: NX protection for kernel data


* Arjan van de Ven <[email protected]> wrote:

> On Tue, 13 Oct 2009 07:35:28 -0400
> Siarhei Liakh <[email protected]> wrote:
>
> > ---[ Kernel Mapping ]---
> > 0xc0000000-0xc0100000 1M RW GLB x pte
> > -0xc0100000-0xc048d000 3636K ro GLB x pte
> > -0xc048d000-0xc04d0000 268K RW GLB x pte
> > -0xc04d0000-0xc04d2000 8K RW GLB NX pte
> > -0xc04d2000-0xc04d3000 4K RW GLB x pte
> > -0xc04d3000-0xc0531000 376K RW GLB NX pte
> > -0xc0531000-0xc0600000 828K RW GLB x pte
> > +0xc0100000-0xc0381000 2564K ro GLB x pte
> > +0xc0381000-0xc048d000 1072K ro GLB NX pte
> > +0xc048d000-0xc0600000 1484K RW GLB NX pte
> > 0xc0600000-0xf7800000 882M RW PSE GLB NX pmd
> > 0xf7800000-0xf79fe000 2040K RW GLB NX pte
> > 0xf79fe000-0xf7a00000 8K pte
> > ===============================================
> >
>
> looks great to me; the result is
> * kernel is ro + x
> * rodata is ro + NX
> * data is RW + NX
>
> (and there is no "RW + x", other than the first megabyte... hmm. maybe
> we need to look at that as well at some point)

Could we cover the first megabyte too please via a (default-disabled)
option? Modern Xorg shouldn't mind about that anymore, right?

Ingo

2009-10-13 14:29:09

by Arjan van de Ven

Subject: Re: [PATCH V5] x86: NX protection for kernel data

On Tue, 13 Oct 2009 16:15:27 +0200
Ingo Molnar <[email protected]> wrote:

>
> * Arjan van de Ven <[email protected]> wrote:
>
> > On Tue, 13 Oct 2009 07:35:28 -0400
> > Siarhei Liakh <[email protected]> wrote:
> >
> > > ---[ Kernel Mapping ]---
> > > 0xc0000000-0xc0100000 1M RW GLB x pte
> > > -0xc0100000-0xc048d000 3636K ro GLB x pte
> > > -0xc048d000-0xc04d0000 268K RW GLB x pte
> > > -0xc04d0000-0xc04d2000 8K RW GLB NX pte
> > > -0xc04d2000-0xc04d3000 4K RW GLB x pte
> > > -0xc04d3000-0xc0531000 376K RW GLB NX pte
> > > -0xc0531000-0xc0600000 828K RW GLB x pte
> > > +0xc0100000-0xc0381000 2564K ro GLB x pte
> > > +0xc0381000-0xc048d000 1072K ro GLB NX pte
> > > +0xc048d000-0xc0600000 1484K RW GLB NX pte
> > > 0xc0600000-0xf7800000 882M RW PSE GLB NX pmd
> > > 0xf7800000-0xf79fe000 2040K RW GLB NX pte
> > > 0xf79fe000-0xf7a00000 8K pte
> > > ===============================================
> > >
> >
> > looks great to me; the result is
> > * kernel is ro + x
> > * rodata is ro + NX
> > * data is RW + NX
> >
> > (and there is no "RW + x", other than the first megabyte... hmm.
> > maybe we need to look at that as well at some point)
>
> Could we cover the first megabyte too please via a (default-disabled)
> option? Modern Xorg shouldnt mind about that anymore, right?


I'd be surprised if anything ever did; this is the *kernel* mapping of
the first megabyte, not some userspace mapping....




2009-10-13 14:35:14

by Arjan van de Ven

Subject: Re: [PATCH V5] x86: NX protection for kernel data

On Tue, 13 Oct 2009 16:15:27 +0200
Ingo Molnar <[email protected]> wrote:

>
> * Arjan van de Ven <[email protected]> wrote:
>
> > On Tue, 13 Oct 2009 07:35:28 -0400
> > Siarhei Liakh <[email protected]> wrote:
> >
> > > ---[ Kernel Mapping ]---
> > > 0xc0000000-0xc0100000 1M RW GLB x pte
> > > -0xc0100000-0xc048d000 3636K ro GLB x pte
> > > -0xc048d000-0xc04d0000 268K RW GLB x pte
> > > -0xc04d0000-0xc04d2000 8K RW GLB NX pte
> > > -0xc04d2000-0xc04d3000 4K RW GLB x pte
> > > -0xc04d3000-0xc0531000 376K RW GLB NX pte
> > > -0xc0531000-0xc0600000 828K RW GLB x pte
> > > +0xc0100000-0xc0381000 2564K ro GLB x pte
> > > +0xc0381000-0xc048d000 1072K ro GLB NX pte
> > > +0xc048d000-0xc0600000 1484K RW GLB NX pte
> > > 0xc0600000-0xf7800000 882M RW PSE GLB NX pmd
> > > 0xf7800000-0xf79fe000 2040K RW GLB NX pte
> > > 0xf79fe000-0xf7a00000 8K pte
> > > ===============================================
> > >
> >
> > looks great to me; the result is
> > * kernel is ro + x
> > * rodata is ro + NX
> > * data is RW + NX
> >
> > (and there is no "RW + x", other than the first megabyte... hmm.
> > maybe we need to look at that as well at some point)
>
> Could we cover the first megabyte too please via a (default-disabled)
> option? Modern Xorg shouldnt mind about that anymore, right?

just to be clear, for me this 1MB is a separate issue, and for a
separate patch.... the current patch is good as is.


2009-10-13 14:49:51

by Alan

Subject: Re: [PATCH V5] x86: NX protection for kernel data

> I'd be surprised if anything ever did; this is the *kernel* mapping of
> the first megabyte, not some userspace mapping....

APM, BIOS32, EDD, PnPBIOS ..

However except for APM (which isn't generally needed on NX capable
devices or found on them) none of them are usually on critical paths
because EDD is just grovelling around sort of stuff, and BIOS32 isn't
generally used by the kernel anyway so could probably cope with flipping
the permissions on the low 1 MB each call.

2009-10-13 15:35:07

by Siarhei Liakh

Subject: Re: [PATCH V5] x86: NX protection for kernel data

>> I'd be surprised if anything ever did; this is the *kernel* mapping of
>> the first megabyte, not some userspace mapping....
>
> APM, BIOS32, EDD, PnPBIOS ..
>
> However except for APM (which isn't generally needed on NX capable
> devices or found on them) none of them are usually on critical paths
> because EDD is just grovelling around sort of stuff, and BIOS32 isn't
> generally used by the kernel anyway so could probably cope with flipping
> the permissions on the low 1 MB each call.

Actually, I have posted a patch to fix RW+X problem with BIOS32 some
time ago. See my submission to LKML (and subsequent discussion) on Jul
19 2009 "[PATCH] x86: Reducing footprint of BIOS32 service mappings".

Nevertheless, that 1MB area is on my "to do" list, and I will be
patching it sooner or later (assuming I get my patches tested well
enough to get them accepted).