2010-01-31 23:22:39

by Siarhei Liakh

Subject: [PATCH v8] RO/NX protection for loadable kernel modules

This patch is a logical extension of the protection provided by
CONFIG_DEBUG_RODATA to LKMs. The protection is provided by splitting
module_core and module_init into three logical parts each and setting
appropriate page access permissions for each individual section:

1. Code: RO+X
2. RO data: RO+NX
3. RW data: RW+NX

In order to achieve proper protection, layout_sections() has been
modified to align each of the three parts mentioned above on page
boundaries. Next, the corresponding page access permissions are set
right before a successful exit from load_module(). Further, free_module()
and sys_init_module() have been modified to set module_core and
module_init to RW+NX right before calling module_free().

By default, the original section layout and access flags are preserved.
When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y, the patch
will page-align each group of sections to ensure that each page contains
only one type of content and will enforce RO/NX for each group of pages.
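
For illustration, with CONFIG_DEBUG_SET_MODULE_RONX=y a module's core
region ends up laid out roughly as follows (the three boundaries come
from layout_sections() and are page-aligned; sizes here are hypothetical):

  module_core
  |<--- text: RO+X --->|<-- rodata: RO+NX -->|<--- rw data: RW+NX --->|
  0            core_text_size        core_ro_size            core_size

The init region is laid out the same way, using init_text_size,
init_ro_size and init_size.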

v1: Initial proof-of-concept patch.

v2: The patch has been rewritten to reduce the number of #ifdefs and
to make it architecture-agnostic. Code formatting has also been corrected.

v3: Opportunistic RO/NX protection is now unconditional. Section
page-alignment is enabled when CONFIG_DEBUG_RODATA=y.

v4: Removed most macros and improved coding style.

v5: Changed page-alignment and RO/NX section size calculation.

v6: Fixed comments. Restricted RO/NX enforcement to x86 only.

v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added calls to
set_all_modules_text_rw() and set_all_modules_text_ro() in ftrace.

v8: Updated for compatibility with Linux 2.6.33-rc5.

The patch has been developed for Linux 2.6.33-rc5 by Siarhei Liakh
<[email protected]> and Xuxian Jiang <[email protected]>.

---

Signed-off-by: Siarhei Liakh <[email protected]>
Signed-off-by: Xuxian Jiang <[email protected]>
Acked-by: Arjan van de Ven <[email protected]>

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index bc01e3e..7c80e28 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -115,6 +115,17 @@ config DEBUG_RODATA_TEST
	  feature as well as for the change_page_attr() infrastructure.
	  If in doubt, say "N"

+config DEBUG_SET_MODULE_RONX
+	bool "Set loadable kernel module data as NX and text as RO"
+	default n
+	depends on X86 && MODULES
+	---help---
+	  This option helps to catch unintended modifications to loadable
+	  kernel module's text and read-only data. It also prevents execution
+	  of LKM's data. Such protection may interfere with run-time code
+	  patching and dynamic kernel tracing.
+	  If in doubt, say "N".
+
config DEBUG_NX_TEST
	tristate "Testcase for the NX non-executable stack feature"
	depends on DEBUG_KERNEL && m
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 3096892..b47a154 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -19,6 +19,7 @@
#include <linux/sched.h>
#include <linux/init.h>
#include <linux/list.h>
+#include <linux/module.h>

#include <trace/syscall.h>

@@ -29,16 +30,17 @@


#ifdef CONFIG_DYNAMIC_FTRACE
-
int ftrace_arch_code_modify_prepare(void)
{
	set_kernel_text_rw();
+	set_all_modules_text_rw();
	return 0;
}

int ftrace_arch_code_modify_post_process(void)
{
	set_kernel_text_ro();
+	set_all_modules_text_ro();
	return 0;
}

diff --git a/include/linux/module.h b/include/linux/module.h
index 6cb1a3c..c6040ce 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -300,6 +300,9 @@ struct module
	/* The size of the executable code in each section. */
	unsigned int init_text_size, core_text_size;

+	/* Size of RO sections of the module (text+rodata) */
+	unsigned int init_ro_size, core_ro_size;
+
	/* Arch-specific module values */
	struct mod_arch_specific arch;

@@ -542,6 +545,9 @@ extern void print_modules(void);
extern void module_update_tracepoints(void);
extern int module_get_iter_tracepoints(struct tracepoint_iter *iter);

+void set_all_modules_text_rw(void);
+void set_all_modules_text_ro(void);
+
#else /* !CONFIG_MODULES... */
#define EXPORT_SYMBOL(sym)
#define EXPORT_SYMBOL_GPL(sym)
@@ -662,6 +668,13 @@ static inline int
module_get_iter_tracepoints(struct tracepoint_iter *iter)
{
	return 0;
}

+static inline void set_all_modules_text_rw(void)
+{
+}
+
+static inline void set_all_modules_text_ro(void)
+{
+}
#endif /* CONFIG_MODULES */

struct device_driver;
diff --git a/kernel/module.c b/kernel/module.c
index f82386b..d382b59 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -55,6 +55,7 @@
#include <linux/async.h>
#include <linux/percpu.h>
#include <linux/kmemleak.h>
+#include <linux/pfn.h>

#define CREATE_TRACE_POINTS
#include <trace/events/module.h>
@@ -71,6 +72,26 @@ EXPORT_TRACEPOINT_SYMBOL(module_get);
#define ARCH_SHF_SMALL 0
#endif

+/*
+ * Modules' sections will be aligned on page boundaries
+ * to ensure complete separation of code and data, but
+ * only when CONFIG_DEBUG_SET_MODULE_RONX=y
+ */
+#ifdef CONFIG_DEBUG_SET_MODULE_RONX
+#define debug_align(X) ALIGN(X, PAGE_SIZE)
+#else
+#define debug_align(X) (X)
+#endif
+
+/*
+ * Given BASE and SIZE this macro calculates the number of pages the
+ * memory region occupies
+ */
+#define NUMBER_OF_PAGES(BASE, SIZE) ((SIZE > 0) ?		\
+	(PFN_DOWN((unsigned long)BASE + SIZE - 1) -	\
+	 PFN_DOWN((unsigned long)BASE) + 1)		\
+	: (0UL))
+
/* If this is set, the section belongs in the init part of the module */
#define INIT_OFFSET_MASK (1UL << (BITS_PER_LONG-1))

@@ -1372,6 +1393,134 @@ static int __unlink_module(void *_mod)
	return 0;
}

+#ifdef CONFIG_DEBUG_SET_MODULE_RONX
+/*
+ * LKM RO/NX protection: protect module's text/ro-data
+ * from modification and any data from execution.
+ */
+static void set_section_ro_nx(void *base,
+				unsigned long text_size,
+				unsigned long ro_size,
+				unsigned long total_size)
+{
+	/* begin and end PFNs of the current subsection */
+	unsigned long begin_pfn;
+	unsigned long end_pfn;
+
+	/*
+	 * Set RO for module text and RO-data:
+	 * - Always protect first page.
+	 * - Do not protect last partial page.
+	 */
+	if (ro_size > 0) {
+		begin_pfn = PFN_DOWN((unsigned long)base);
+		end_pfn = PFN_DOWN((unsigned long)base + ro_size);
+		if (end_pfn > begin_pfn)
+			set_memory_ro(begin_pfn << PAGE_SHIFT,
+					end_pfn - begin_pfn);
+	}
+
+	/*
+	 * Set NX permissions for module data:
+	 * - Do not protect first partial page.
+	 * - Always protect last page.
+	 */
+	if (total_size > text_size) {
+		begin_pfn = PFN_UP((unsigned long)base + text_size);
+		end_pfn = PFN_UP((unsigned long)base + total_size);
+		if (end_pfn > begin_pfn)
+			set_memory_nx(begin_pfn << PAGE_SHIFT,
+					end_pfn - begin_pfn);
+	}
+}
+
+/* Setting memory back to RW+NX before releasing it */
+void unset_section_ro_nx(struct module *mod, void *module_region)
+{
+	unsigned long total_pages;
+
+	if (mod->module_core == module_region) {
+		/* Set core as NX+RW */
+		total_pages = NUMBER_OF_PAGES(mod->module_core, mod->core_size);
+		set_memory_nx((unsigned long)mod->module_core, total_pages);
+		set_memory_rw((unsigned long)mod->module_core, total_pages);
+
+	} else if (mod->module_init == module_region) {
+		/* Set init as NX+RW */
+		total_pages = NUMBER_OF_PAGES(mod->module_init, mod->init_size);
+		set_memory_nx((unsigned long)mod->module_init, total_pages);
+		set_memory_rw((unsigned long)mod->module_init, total_pages);
+	}
+}
+
+/* Iterate through all modules and set each module's text as RW */
+void set_all_modules_text_rw(void)
+{
+	struct module *mod;
+	unsigned long begin_pfn;
+	unsigned long end_pfn;
+
+	mutex_lock(&module_mutex);
+	list_for_each_entry_rcu(mod, &modules, list) {
+		if ((mod->module_core) && (mod->core_text_size > 0)) {
+			begin_pfn = PFN_DOWN((unsigned long)mod->module_core);
+			end_pfn = PFN_DOWN((unsigned long)mod->module_core +
+						mod->core_text_size);
+			if (end_pfn > begin_pfn)
+				set_memory_rw(begin_pfn << PAGE_SHIFT,
+						end_pfn - begin_pfn);
+		}
+		if ((mod->module_init) && (mod->init_text_size > 0)) {
+			begin_pfn = PFN_DOWN((unsigned long)mod->module_init);
+			end_pfn = PFN_DOWN((unsigned long)mod->module_init +
+						mod->init_text_size);
+			if (end_pfn > begin_pfn)
+				set_memory_rw(begin_pfn << PAGE_SHIFT,
+						end_pfn - begin_pfn);
+		}
+	}
+	mutex_unlock(&module_mutex);
+}
+
+/* Iterate through all modules and set each module's text as RO */
+void set_all_modules_text_ro(void)
+{
+	struct module *mod;
+	unsigned long begin_pfn;
+	unsigned long end_pfn;
+
+	mutex_lock(&module_mutex);
+	list_for_each_entry_rcu(mod, &modules, list) {
+		if ((mod->module_core) && (mod->core_text_size > 0)) {
+			begin_pfn = PFN_DOWN((unsigned long)mod->module_core);
+			end_pfn = PFN_DOWN((unsigned long)mod->module_core +
+						mod->core_text_size);
+			if (end_pfn > begin_pfn)
+				set_memory_ro(begin_pfn << PAGE_SHIFT,
+						end_pfn - begin_pfn);
+		}
+		if ((mod->module_init) && (mod->init_text_size > 0)) {
+			begin_pfn = PFN_DOWN((unsigned long)mod->module_init);
+			end_pfn = PFN_DOWN((unsigned long)mod->module_init +
+						mod->init_text_size);
+			if (end_pfn > begin_pfn)
+				set_memory_ro(begin_pfn << PAGE_SHIFT,
+						end_pfn - begin_pfn);
+		}
+	}
+	mutex_unlock(&module_mutex);
+}
+#else
+static void set_section_ro_nx(void *base,
+				unsigned long text_size,
+				unsigned long ro_size,
+				unsigned long total_size) { }
+
+void unset_section_ro_nx(struct module *mod, void *module_region) { }
+void set_all_modules_text_rw(void) { }
+void set_all_modules_text_ro(void) { }
+#endif
+
/* Free a module, remove from lists, etc (must hold module_mutex). */
static void free_module(struct module *mod)
{
@@ -1393,6 +1542,7 @@ static void free_module(struct module *mod)
	destroy_params(mod->kp, mod->num_kp);

	/* This may be NULL, but that's OK */
+	unset_section_ro_nx(mod, mod->module_init);
	module_free(mod, mod->module_init);
	kfree(mod->args);
	if (mod->percpu)
@@ -1405,6 +1555,7 @@ static void free_module(struct module *mod)
	lockdep_free_key_range(mod->module_core, mod->core_size);

	/* Finally, free the core (containing the module structure) */
+	unset_section_ro_nx(mod, mod->module_core);
	module_free(mod, mod->module_core);

#ifdef CONFIG_MPU
@@ -1582,8 +1733,19 @@ static void layout_sections(struct module *mod,
			s->sh_entsize = get_offset(mod, &mod->core_size, s, i);
			DEBUGP("\t%s\n", secstrings + s->sh_name);
		}
-		if (m == 0)
+		switch (m) {
+		case 0: /* executable */
+			mod->core_size = debug_align(mod->core_size);
			mod->core_text_size = mod->core_size;
+			break;
+		case 1: /* RO: text and ro-data */
+			mod->core_size = debug_align(mod->core_size);
+			mod->core_ro_size = mod->core_size;
+			break;
+		case 3: /* whole core */
+			mod->core_size = debug_align(mod->core_size);
+			break;
+		}
	}

	DEBUGP("Init section allocation order:\n");
@@ -1600,8 +1762,19 @@ static void layout_sections(struct module *mod,
					 | INIT_OFFSET_MASK);
			DEBUGP("\t%s\n", secstrings + s->sh_name);
		}
-		if (m == 0)
+		switch (m) {
+		case 0: /* executable */
+			mod->init_size = debug_align(mod->init_size);
			mod->init_text_size = mod->init_size;
+			break;
+		case 1: /* RO: text and ro-data */
+			mod->init_size = debug_align(mod->init_size);
+			mod->init_ro_size = mod->init_size;
+			break;
+		case 3: /* whole init */
+			mod->init_size = debug_align(mod->init_size);
+			break;
+		}
	}
}

@@ -2382,6 +2555,18 @@ static noinline struct module *load_module(void __user *umod,

	trace_module_load(mod);

+	/* Set RO and NX regions for core */
+	set_section_ro_nx(mod->module_core,
+				mod->core_text_size,
+				mod->core_ro_size,
+				mod->core_size);
+
+	/* Set RO and NX regions for init */
+	set_section_ro_nx(mod->module_init,
+				mod->init_text_size,
+				mod->init_ro_size,
+				mod->init_size);
+
	/* Done! */
	return mod;

@@ -2504,6 +2689,7 @@ SYSCALL_DEFINE3(init_module, void __user *, umod,
	mod->symtab = mod->core_symtab;
	mod->strtab = mod->core_strtab;
#endif
+	unset_section_ro_nx(mod, mod->module_init);
	module_free(mod, mod->module_init);
	mod->module_init = NULL;
	mod->init_size = 0;


2010-02-01 01:39:20

by Rusty Russell

Subject: Re: [PATCH v8] RO/NX protection for loadable kernel modules

On Mon, 1 Feb 2010 09:52:35 am Siarhei Liakh wrote:
> +/*
> + * Given BASE and SIZE this macro calculates the number of pages the
> + * memory region occupies
> + */
> +#define NUMBER_OF_PAGES(BASE, SIZE) ((SIZE > 0) ?		\
> +	(PFN_DOWN((unsigned long)BASE + SIZE - 1) -	\
> +	 PFN_DOWN((unsigned long)BASE) + 1)		\
> +	: (0UL))

Needs more brackets around arguments, otherwise someone calling it with
a complex expression will get very upset.

Or just replace with a static inline function?
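
Something like this, perhaps (untested sketch, same semantics as the
macro):

static inline unsigned long number_of_pages(void *base, unsigned long size)
{
	if (size == 0)
		return 0;
	return PFN_DOWN((unsigned long)base + size - 1) -
		PFN_DOWN((unsigned long)base) + 1;
}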

> + if ((mod->module_core) && (mod->core_text_size > 0)) {

The core_text_size test should be enough here.

> +			begin_pfn = PFN_DOWN((unsigned long)mod->module_core);
> +			end_pfn = PFN_DOWN((unsigned long)mod->module_core +
> +						mod->core_text_size);
> +			if (end_pfn > begin_pfn)
> +				set_memory_rw(begin_pfn << PAGE_SHIFT,
> +						end_pfn - begin_pfn);

Much of this code might be neater if you created a helper:

void set_page_attributes(void *start, void *end,
			 void (*set)(unsigned long start, unsigned long num_pages))
{
	unsigned long begin_pfn = PFN_DOWN((unsigned long)start);
	unsigned long end_pfn = PFN_DOWN((unsigned long)end);

	if (end_pfn > begin_pfn)
		set(begin_pfn << PAGE_SHIFT, end_pfn - begin_pfn);
}
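
Then each of the loop bodies above collapses to a one-liner, something
like (untested; note that set_memory_rw()/set_memory_ro() actually
return int, so the callback type may need adjusting):

	set_page_attributes(mod->module_core,
			    mod->module_core + mod->core_text_size,
			    set_memory_rw);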

But these are minor: patch looks good!

Thanks,
Rusty.

2010-02-01 16:22:49

by Siarhei Liakh

Subject: Re: [PATCH v8] RO/NX protection for loadable kernel modules

> Needs more brackets around arguments, otherwise someone calling it with
> a complex expression will get very upset.
[...]

Done.

> The core_text_size test should be enough here.

Done.

>> +			begin_pfn = PFN_DOWN((unsigned long)mod->module_core);
>> +			end_pfn = PFN_DOWN((unsigned long)mod->module_core +
>> +						mod->core_text_size);
>> +			if (end_pfn > begin_pfn)
>> +				set_memory_rw(begin_pfn << PAGE_SHIFT,
>> +						end_pfn - begin_pfn);
>
> Much of this code might be neater if you created a helper:

Done.

> But these are minor: patch looks good!

Thanks for your feedback!

2010-02-02 23:05:44

by Andi Kleen

Subject: Re: [PATCH v8] RO/NX protection for loadable kernel modules

Siarhei Liakh <[email protected]> writes:

> This patch is a logical extension of the protection provided by
> CONFIG_DEBUG_RODATA to LKMs. The protection is provided by splitting
> module_core and module_init into three logical parts each and setting
> appropriate page access permissions for each individual section:

My current kernel has 52 modules loaded, most of them very small.
Assuming the additional alignment of the data section cost two more
pages on average (I think that's a good assumption), that's roughly
424KB of additional memory, plus associated runtime costs in increased
TLB usage.

What would I get for that if I applied the patch and enabled the option?

That information seems to be missing in this patch submission.

Did you find any bugs with this option?

-Andi

--
[email protected] -- Speaking for myself only.

2010-02-03 04:07:23

by Siarhei Liakh

Subject: Re: [PATCH v8] RO/NX protection for loadable kernel modules

On Tue, Feb 2, 2010 at 6:05 PM, Andi Kleen <[email protected]> wrote:

> My current kernel has 52 modules loaded, most of them very small.
> Assuming the additional alignment of the data section cost two more
> pages on average (I think that's a good assumption), that's roughly
> 424KB of additional memory, plus associated runtime costs in increased
> TLB usage.

You are right about the additional memory consumption with this option
enabled. In the worst-case scenario, a one-page module will get split into
3 pages. This might be a significant and unwelcome increase in some
cases (embedded systems come to mind), and that is why this is
implemented as a compile-time option.
However, on a general run-of-the-mill system the impact is not that
significant. Case in point: I run Ubuntu 9.04 x86_64 with a stock
kernel on a generic AMD box. There are 111 modules loaded for all the
hardware in the box and all the functions it provides (regular
consumer-grade hardware, but definitely not a typical home desktop
setup). In total, the 111 modules consume 15590884 bytes, with no module
smaller than 10KB. After eliminating the top 3 memory hogs (nvidia@8MB,
[email protected] and [email protected]), we are left with 5007532 bytes, 108
modules, and nfs leading the list at 300KB. Assuming the worst-case
scenario of 2 additional pages per module, we get
(5007532 + 108*8192)/5007532 = 1.18, i.e. only 18% overhead. In addition, most
modules already have some unused space left at the end of the last
page and some padding in the middle for page/cache-aligned data. With
that in mind, it looks to me like the typical overhead will be about 1
page per module (9% for the case above, although I have no solid
statistics to back this up).

> What would I get for that if I applied the patch and enabled the option?

With this patch you get strict separation of code, RO-data and RW-data
in LKMs. This will prevent all attempts (intentional or erroneous) to
modify code and RO-data, while keeping all data non-executable. While
not targeted to prevent any specific exploit, this patch does improve
general kernel security. When combined with the other two patches I
have submitted (RW+NX for first megabyte and NX for static kernel
data), this patch set completely eliminates kernel pages with
executable data and modifiable code.

> Did you find any bugs with this option?

Well, so far I have not found any bugs in the latest version of the patch,
but it definitely breaks any module that uses self-modifying code or
data execution.
An issue with ftrace was reported earlier and fixed in v7.

2010-02-08 01:45:37

by Rusty Russell

Subject: Re: [PATCH v8] RO/NX protection for loadable kernel modules

On Wed, 3 Feb 2010 09:35:39 am Andi Kleen wrote:
> Siarhei Liakh <[email protected]> writes:
>
> > This patch is a logical extension of the protection provided by
> > CONFIG_DEBUG_RODATA to LKMs. The protection is provided by splitting
> > module_core and module_init into three logical parts each and setting
> > appropriate page access permissions for each individual section:
>
> My current kernel has 52 modules loaded, most of them very small.
> Assuming the additional alignment of the data section cost two more
> pages on average (I think that's a good assumption), that's roughly
> 424KB of additional memory, plus associated runtime costs in increased
> TLB usage.
>
> What would I get for that if I applied the patch and enabled the option?

Strict RO/NX protection. But without the option enabled, the patch gives
best-effort protection, which is nice (for no additional space).

Cheers,
Rusty.

2010-02-08 01:58:23

by H. Peter Anvin

Subject: Re: [PATCH v8] RO/NX protection for loadable kernel modules

On 02/07/2010 05:45 PM, Rusty Russell wrote:
>
> Strict RO/NX protection. But without the option enabled, the patch gives
> best-effort protection, which is nice (for no additional space).
>

Since Linux kernel modules are actually .o's, not .so's, in theory we
could bundle the sections together by type. There could still be
external fragmentation, of course, but on most systems module unload is
relatively rare.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.