Hi,
Here's the second version of my patch to add better sanity checks for
binfmt_elf
The first patch can be found in the archives. Here's one online location :
http://lkml.org/lkml/2004/1/10/89 - that email contains a description of
the first version of the patch. Here follows a changelog for v1->v2 (this
patch includes everything v1 did)
Changes releative to v1:
-----
- Patch is against 2.6.1-mm2
- a test for e_version == e_ident[EI_VERSION] was added.
This is redundant as long as there is only one ELF version in
existance but as soon as two versions exist, checking consistency
between the two places where the binary can claim a version is
sensible. Redundant for now but a sane check none the less.
- A check has been added to verify that p_filesz is not larger than p_memsz.
This is not allowed by ELF, and would not make sense in any case
as we would then attempt to load more data into memory than we
have room for.
- I moved the out: label below the other out_* labels. Seems more logical.
- Reformatted the huge if() that tests the header a bit - for readabillity.
I've been running with this patch for the better part of a day, and no
binaries have failed so far, so it seems OK.
I've had some feedback from 3 users who tested v1 of the patch, and all of
them reported that it didn't seem to cause any trouble.
Things I've got planned for v3
---
- I'll probably make elf_check_arch into an inline function rather than
the current macro which evaluates its argument multiple times.
- I'll move most of the checks into a sepperate function to be
called from both load_elf_binary() and load_elf_interp().
- There are still some header fields that are not checked, most of those
will hopefully be tested in v3 - we'll see, I want to do this in nice
small incremental steps.
Here's v2 - Andrew, please consider merging it into -mm for testing :
diff -up linux-2.6.1-mm2-orig/fs/binfmt_elf.c linux-2.6.1-mm2-juhl/fs/binfmt_elf.c
--- linux-2.6.1-mm2-orig/fs/binfmt_elf.c 2004-01-09 07:59:27.000000000 +0100
+++ linux-2.6.1-mm2-juhl/fs/binfmt_elf.c 2004-01-13 02:28:21.000000000 +0100
@@ -7,6 +7,13 @@
* Tools".
*
* Copyright 1993, 1994: Eric Youngdale ([email protected]).
+ *
+ * More info on ELF can be found online at
+ * http://x86.ddj.com/ftp/manuals/tools/elf.pdf
+ *
+ * January 2004
+ * Implement stronger ELF sanity checks: Jesper Juhl <[email protected]>
+ *
*/
#include <linux/module.h>
@@ -301,7 +308,7 @@ static unsigned long load_elf_interp(str
unsigned long error = ~0UL;
int retval, i, size;
- /* First of all, some simple consistency checks */
+ /* First of all, ELF header consistency checks */
if (interp_elf_ex->e_type != ET_EXEC &&
interp_elf_ex->e_type != ET_DYN)
goto out;
@@ -479,15 +486,49 @@ static int load_elf_binary(struct linux_
elf_ex = *((struct elfhdr *) bprm->buf);
retval = -ENOEXEC;
- /* First of all, some simple consistency checks */
- if (memcmp(elf_ex.e_ident, ELFMAG, SELFMAG) != 0)
+
+ /* First of all, ELF header consistency checks */
+ /* first check things most likely to fail;
+ magic number, type and architecture, so we bail out
+ early in the common case.
+ */
+ if (((memcmp(elf_ex.e_ident, ELFMAG, SELFMAG) != 0)) ||
+ (elf_ex.e_type != ET_EXEC && elf_ex.e_type != ET_DYN) ||
+ (!elf_check_arch(&elf_ex)))
goto out;
- if (elf_ex.e_type != ET_EXEC && elf_ex.e_type != ET_DYN)
+ /* check all remaining entries in e_ident[] */
+ if ((elf_ex.e_ident[EI_CLASS] != ELF_CLASS) ||
+ (elf_ex.e_ident[EI_DATA] != ELF_DATA) ||
+ (elf_ex.e_ident[EI_VERSION] != EV_CURRENT)) /* see comment for e_version */
+ /* we don't check anything in e_ident[EI_PAD]
+ the ELF spec states that when reading object files, these
+ bytes should be ignored - reserved for future use.
+ */
goto out;
- if (!elf_check_arch(&elf_ex))
+
+ /* check remaining ELF header fields */
+ /*
+ The value 1 for e_version signifies the original file format;
+ extensions will create new versions with higher numbers. The
+ value of EV_CURRENT, though being 1 currently, will change as
+ necessary to reflect the current version number.
+ This needs to be kept in mind when new ELF versions come out and
+ we want to support both new and old versions.
+ */
+ if ((elf_ex.e_version != EV_CURRENT) ||
+ (elf_ex.e_version != elf_ex.e_ident[EI_VERSION]) ||
+ /* how can we check e_entry? any guarenteed invalid entry points? */
+ /* need to come up with valid checks for e_phoff & e_shoff */
+ /* e_flags is checked by elf_check_arch */
+ (elf_ex.e_ehsize != sizeof(Elf_Ehdr)) ||
+ /* e_phentsize checked below */
+ /* how can we check e_phnum, e_shentsize & e_shnum ? */
+ /* check for e_shstrndx needs to improve */
+ ((elf_ex.e_shstrndx == SHN_UNDEF) && (elf_ex.e_shnum != 0)))
goto out;
- if (!bprm->file->f_op||!bprm->file->f_op->mmap)
+
+ if (!bprm->file->f_op || !bprm->file->f_op->mmap)
goto out;
/* Now read in all of the header information */
@@ -506,6 +547,14 @@ static int load_elf_binary(struct linux_
if (retval < 0)
goto out_free_ph;
+ /* p_filesz must not exceed p_memsz.
+ if it does then the binary is corrupt, hence -ENOEXEC
+ */
+ if (elf_phdata->p_filesz > elf_phdata->p_memsz) {
+ retval = -ENOEXEC;
+ goto out_free_ph;
+ }
+
files = current->files; /* Refcounted so ok */
if(unshare_files() < 0)
goto out_free_ph;
@@ -857,9 +906,8 @@ static int load_elf_binary(struct linux_
send_sig(SIGTRAP, current, 0);
}
retval = 0;
-out:
- return retval;
-
+ goto out;
+
/* error cleanup */
out_free_dentry:
allow_write_access(interpreter);
@@ -876,7 +924,8 @@ out_free_fh:
}
out_free_ph:
kfree(elf_phdata);
- goto out;
+out:
+ return retval;
}
/* This is really simpleminded and specialized - we are loading an
diff -up linux-2.6.1-mm2-orig/include/asm-i386/elf.h linux-2.6.1-mm2-juhl/include/asm-i386/elf.h
--- linux-2.6.1-mm2-orig/include/asm-i386/elf.h 2004-01-09 07:59:46.000000000 +0100
+++ linux-2.6.1-mm2-juhl/include/asm-i386/elf.h 2004-01-11 00:29:33.000000000 +0100
@@ -35,9 +35,11 @@ typedef struct user_fxsr_struct elf_fpxr
/*
* This is used to ensure we don't load something for the wrong architecture.
+ *
+ * Include a check of e_flags - Jesper Juhl <[email protected]>
*/
#define elf_check_arch(x) \
- (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
+ ((((x)->e_machine == EM_386) || ((x)->e_machine == EM_486)) && ((x)->e_flags == 0))
/*
* These are used to set parameters in the core dumps.
Kind regards,
Jesper Juhl
On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
> Here's the second version of my patch to add better sanity checks for
> binfmt_elf
I assume this breaks Brian Raiter's tiny ELF executables[1]. Even
though these binaries are evil hacks that don't comply to standards
and serve no serious purpose, I'm not sure what the purpose of the
sanity checks is. Are there any risks associated with running
non-compliant ELF executables? (Now that I mention it, the
proof-of-concept exploit for the brk() hole comes to mind, but I don't
know offhand if that did anything against the spec.) I don't mean to
question the usefulness of your work, especially as I don't know much
about ELF, but I'm personally curious about why you think additional
sanity checks are worth a slight increase in code complexity.
1. http://www.muppetlabs.com/~breadbox/software/tiny/
Aaron Lehmann <[email protected]> writes:
> On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
> > Here's the second version of my patch to add better sanity checks for
> > binfmt_elf
>
> I assume this breaks Brian Raiter's tiny ELF executables[1].
Hmm. I would expect most of the to continue to work because they
are valid. The only problem I see is when he starts scrunching
things together by changing the value of fields that have a specified
meaning.
> Even
> though these binaries are evil hacks that don't comply to standards
> and serve no serious purpose, I'm not sure what the purpose of the
> sanity checks is. Are there any risks associated with running
> non-compliant ELF executables?
Sanity checks are always good for future compatibility so someone does
not come to rely on your bugs. This is less of a problem in linux than
in some systems but still. This is the primary reason cpus have undefined
opcode exceptions for example.
> (Now that I mention it, the
> proof-of-concept exploit for the brk() hole comes to mind, but I don't
> know offhand if that did anything against the spec.) I don't mean to
> question the usefulness of your work, especially as I don't know much
> about ELF, but I'm personally curious about why you think additional
> sanity checks are worth a slight increase in code complexity.
That was my impression as well. Increasing the complexity of the
if statements when goto's are already in use seems silly.
Eric
On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
>
> - if (elf_ex.e_type != ET_EXEC && elf_ex.e_type != ET_DYN)
> + /* check all remaining entries in e_ident[] */
> + if ((elf_ex.e_ident[EI_CLASS] != ELF_CLASS) ||
> + (elf_ex.e_ident[EI_DATA] != ELF_DATA) ||
> + (elf_ex.e_ident[EI_VERSION] != EV_CURRENT)) /* see comment for e_version */
> + /* we don't check anything in e_ident[EI_PAD]
> + the ELF spec states that when reading object files, these
> + bytes should be ignored - reserved for future use.
> + */
> goto out;
> - if (!elf_check_arch(&elf_ex))
> +
> + /* check remaining ELF header fields */
> + /*
> + The value 1 for e_version signifies the original file format;
> + extensions will create new versions with higher numbers. The
> + value of EV_CURRENT, though being 1 currently, will change as
> + necessary to reflect the current version number.
> + This needs to be kept in mind when new ELF versions come out and
> + we want to support both new and old versions.
> + */
> + if ((elf_ex.e_version != EV_CURRENT) ||
> + (elf_ex.e_version != elf_ex.e_ident[EI_VERSION]) ||
Why are you checking elf_ex.e_ident[EI_VERSION] again?
You previously checked that elf_ex.e_version == EV_CURRENT and
elf_ex.e_ident[EI_VERSION] == EV_CURRENT.
Also, comment about e_version increasing is not needed IMHO.
There is no ELF version bumping anywhere on the horizon.
> + /* how can we check e_entry? any guarenteed invalid entry points? */
> + /* need to come up with valid checks for e_phoff & e_shoff */
Why?
> + /* e_flags is checked by elf_check_arch */
> + (elf_ex.e_ehsize != sizeof(Elf_Ehdr)) ||
This belongs to elflint, not kernel.
> + /* e_phentsize checked below */
> + /* how can we check e_phnum, e_shentsize & e_shnum ? */
> + /* check for e_shstrndx needs to improve */
> + ((elf_ex.e_shstrndx == SHN_UNDEF) && (elf_ex.e_shnum != 0)))
Kernel has no business looking at sections. It doesn't use them,
why should it care?
> goto out;
> - if (!bprm->file->f_op||!bprm->file->f_op->mmap)
> +
> + if (!bprm->file->f_op || !bprm->file->f_op->mmap)
> goto out;
>
> /* Now read in all of the header information */
> @@ -506,6 +547,14 @@ static int load_elf_binary(struct linux_
> if (retval < 0)
> goto out_free_ph;
>
> + /* p_filesz must not exceed p_memsz.
> + if it does then the binary is corrupt, hence -ENOEXEC
> + */
> + if (elf_phdata->p_filesz > elf_phdata->p_memsz) {
> + retval = -ENOEXEC;
> + goto out_free_ph;
> + }
> +
> files = current->files; /* Refcounted so ok */
> if(unshare_files() < 0)
> goto out_free_ph;
Why special case first Phdr (which btw doesn't have to be PT_LOAD)?
> /* This is really simpleminded and specialized - we are loading an
> diff -up linux-2.6.1-mm2-orig/include/asm-i386/elf.h linux-2.6.1-mm2-juhl/include/asm-i386/elf.h
> --- linux-2.6.1-mm2-orig/include/asm-i386/elf.h 2004-01-09 07:59:46.000000000 +0100
> +++ linux-2.6.1-mm2-juhl/include/asm-i386/elf.h 2004-01-11 00:29:33.000000000 +0100
> @@ -35,9 +35,11 @@ typedef struct user_fxsr_struct elf_fpxr
>
> /*
> * This is used to ensure we don't load something for the wrong architecture.
> + *
> + * Include a check of e_flags - Jesper Juhl <[email protected]>
> */
> #define elf_check_arch(x) \
> - (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
> + ((((x)->e_machine == EM_386) || ((x)->e_machine == EM_486)) && ((x)->e_flags == 0))
Why? This seems unnecessary.
Jakub
On Tue, 13 Jan 2004, Jakub Jelinek wrote:
> On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
> >
> > + if ((elf_ex.e_version != EV_CURRENT) ||
> > + (elf_ex.e_version != elf_ex.e_ident[EI_VERSION]) ||
>
> Why are you checking elf_ex.e_ident[EI_VERSION] again?
> You previously checked that elf_ex.e_version == EV_CURRENT and
> elf_ex.e_ident[EI_VERSION] == EV_CURRENT.
>
Right, that's completely redundant. My brain must have been out for lunch
for a while there.
> Also, comment about e_version increasing is not needed IMHO.
> There is no ELF version bumping anywhere on the horizon.
>
Ok, the comment should probably go the way of the Dodo then.
> > + /* how can we check e_entry? any guarenteed invalid entry points? */
> > + /* need to come up with valid checks for e_phoff & e_shoff */
>
> Why?
>
For much the same reasons that I want to check everything else. If
all the information in the ELF header is not valid then I'd consider the
binary corrupt. If a file claims to be an ELF binary, then it should be a
/valid/ ELF binary, and the more we can do to ensure that the better.
As I see it, if a binary contains invalid information then either the
tools that created the binary are corrupt or something has
(incorrectly) modified the binary since its creation.
In both cases I see no reason why the kernel should not throw out the
binary as early as possible.
I tried to outline my reasons for doing this in a previous mail to lkml,
but I'll repeat them here for your convenience.
"
- Correctness. If it's invalid it /should/ fail, and as early as possible.
- Stability. Who knows when it'll crash and what damage it may have done
before it crashes?
- Least amount of surprise for the user. If a binary has become corrupted
the user is likely to want to be told it's bad when loading it (if
possible), rather than being left wondering about strange crashes during
runtime.
- Security 1. Is it not plausible that someone may try to play tricks on
the kernel with invalid binaries? Isn't it safer just rejecting them if we
*know* they are bad?
- Security 2. If a virus/worm/trojan/whatever attempts to infect a binary
and does not do a perfect job of fixing up the ELF header, section table
headers etc, then with the current code we would in some cases still run
the binary. If we enforce as many sanity checks as possible such an
infected binary will likely fail to run.
"
To that I'll add "defensive programming". In my experience it's never wise
to trust sources you don't have 100% control over, and I don't think that
there is any source as un trust worthy as userspace from the kernels
perspective. So, we are given a binary to load and execute from an
un-trusted source and I think it's prudent to be as thorough as possible
in validating that binary in as many ways as we possibly can before
executing it.
It doesn't matter if the kernel actually uses the information it
validates. If the kernel detects anything at all to be out of place that
it has in its power to properly validate then that binary should not be
deemed fit to execute. I think that's a fairly pragmatic and wise
approach.
> > + /* e_flags is checked by elf_check_arch */
> > + (elf_ex.e_ehsize != sizeof(Elf_Ehdr)) ||
>
> This belongs to elflint, not kernel.
>
> > + /* e_phentsize checked below */
> > + /* how can we check e_phnum, e_shentsize & e_shnum ? */
> > + /* check for e_shstrndx needs to improve */
> > + ((elf_ex.e_shstrndx == SHN_UNDEF) && (elf_ex.e_shnum != 0)))
>
> Kernel has no business looking at sections. It doesn't use them,
> why should it care?
>
It should care becourse if something (anything that the kernel can
validate) is wrong with the binary then it's suspicious and either the
binary is simply broken (and to be nice to the user we should fail early)
or someone is trying to load a binary that might be attempting to do
something it shouldn't and for safety reasons we should reject it to avoid
it causing trouble later on. The earlier you catch potential problems and
react to them the less chance they have of potentially damaging something.
> > /* Now read in all of the header information */
> > @@ -506,6 +547,14 @@ static int load_elf_binary(struct linux_
> > if (retval < 0)
> > goto out_free_ph;
> >
> > + /* p_filesz must not exceed p_memsz.
> > + if it does then the binary is corrupt, hence -ENOEXEC
> > + */
> > + if (elf_phdata->p_filesz > elf_phdata->p_memsz) {
> > + retval = -ENOEXEC;
> > + goto out_free_ph;
> > + }
> > +
> > files = current->files; /* Refcounted so ok */
> > if(unshare_files() < 0)
> > goto out_free_ph;
>
> Why special case first Phdr (which btw doesn't have to be PT_LOAD)?
>
I'm not trying to special case anything. I'm simply working slowly and
making small changes at a time. So what is a special case at the moment
will hopefully evolve into just the first of several similar checks in the
future.
Reason 1: The more changes I make at a time the harder it is to
track down a change which breaks something.
Reason 2: I don't know everything about ELF yet, nor everything that goes
on when loading a binary. I'm learning as I go along and I only want to
make changes that I feel fairly confident are correct.
> > #define elf_check_arch(x) \
> > - (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
> > + ((((x)->e_machine == EM_386) || ((x)->e_machine == EM_486)) && ((x)->e_flags == 0))
>
> Why? This seems unnecessary.
>
According to http://x86.ddj.com/ftp/manuals/tools/elf.pdf Book II section
1-2 :
"The ELF header's e_flags member holds bit flags associated with the file.
The Intel architecture defines no flags; so this member contains zero."
So, it's documented that for Intel binaries this flag should be zero, thus
any Intel binary where this flag does not contain zero has been corrupted
(or not generated properly), hence the check. All valid binaries should
pass the check fine, but invalid binaries may fail it - which is the
point.
-- Jesper Juhl
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Jesper Juhl wrote:
> - Correctness. If it's invalid it /should/ fail, and as early as possible.
> [...]
A _lot_ of tests are possible, but you don't want to perform them in the
kernel. The elflint tool I wrote and which Jakub mentioned gives you an
impression of what tests are possible. You do not want to run all these
tests in the kernel to fulfill your "as early as possible" rule.
The kernel should restrict itself to testing the values which affect the
execution. This was already (mostly) the case with the old loader which
is why I never bothered to send any changes. Every additional check is
just extra overhead for 99.999% of all cases.
ld.so performs itself some tests, supplementing the tests in the kernel.
This is enough to catch ill-formed programs which might cause problems.
If you want to have notification of changed files use tripwire or an
equivalent. If you want to find invalid ELF files, use elflint. The
right tool for the job.
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
iD8DBQFABkSo2ijCOnn/RHQRAqJlAJ48KnZ/O3xHgKc3bDOPMF8cV9DGdQCgrBqA
A9t3cBQeO7o4IMWHd2MQNqs=
=KYJ7
-----END PGP SIGNATURE-----
On Thu, 15 Jan 2004 08:50:12 +0100, you wrote in linux.kernel:
> ld.so performs itself some tests, supplementing the tests in the kernel.
> This is enough to catch ill-formed programs which might cause problems.
Not everybody is using glibc, ld.so implementations vary and it's
probable that not all do really check much.
I agree that the kernel should only check values that it really uses,
though. The other checks could be optional (CONFIG_WHATEVER) and/or
only lead to debugging messages if they fail.
--
Ciao,
Pascal
Hi!
> On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
> > Here's the second version of my patch to add better sanity checks for
> > binfmt_elf
>
> I assume this breaks Brian Raiter's tiny ELF executables[1]. Even
> though these binaries are evil hacks that don't comply to standards
> and serve no serious purpose, I'm not sure what the purpose of the
> sanity checks is. Are there any risks associated with running
> non-compliant ELF executables? (Now that I mention it, the
You get vy ugly behaviour. If you compile executable with huge static
data, it will compile okay, link okay, *launch okay* and die on
segfault. That's wrong, it should have died on -ENOMEM during exec.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
On Friday 16 January 2004 10:08, Pavel Machek wrote:
> Hi!
>
> > On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
> > > Here's the second version of my patch to add better sanity checks for
> > > binfmt_elf
> >
> > I assume this breaks Brian Raiter's tiny ELF executables[1]. Even
> > though these binaries are evil hacks that don't comply to standards
> > and serve no serious purpose, I'm not sure what the purpose of the
> > sanity checks is. Are there any risks associated with running
> > non-compliant ELF executables? (Now that I mention it, the
>
> You get vy ugly behaviour. If you compile executable with huge static
> data, it will compile okay, link okay, *launch okay* and die on
> segfault. That's wrong, it should have died on -ENOMEM during exec.
> Pave
Wouldn't that depend on the overcommit options?
With permitted overcommit -
compile/link ok
launch - segfault/-ENOMEM if heap/stack + static data UPDATES exceed
system capacity
Without overcommit:
-ENOMEM if the heap/stack can't be initialized; as in even the
first page of the heap/stack fails - and before actual
launch completes, as the situation you describe.
If static data is just referenced, it should page in, and get dropped.
Hi!
> > > On Tue, Jan 13, 2004 at 02:55:07AM +0100, Jesper Juhl wrote:
> > > > Here's the second version of my patch to add better sanity checks for
> > > > binfmt_elf
> > >
> > > I assume this breaks Brian Raiter's tiny ELF executables[1]. Even
> > > though these binaries are evil hacks that don't comply to standards
> > > and serve no serious purpose, I'm not sure what the purpose of the
> > > sanity checks is. Are there any risks associated with running
> > > non-compliant ELF executables? (Now that I mention it, the
> >
> > You get vy ugly behaviour. If you compile executable with huge static
> > data, it will compile okay, link okay, *launch okay* and die on
> > segfault. That's wrong, it should have died on -ENOMEM during exec.
>
> Wouldn't that depend on the overcommit options?
I believe in this case data were so big they did not even fit in
address space...
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]