All,
Let me start by saying that if functionality X is compiled into the
kernel and we then try to load X compiled as a module, we get _bad_
results, ranging from weird behaviour to outright crashes.
The question is: why does insmod not check for redefinition of
symbols and refuse to load the module in such cases?
For the record, the kernel version I'm using is some flavour of
2.6.9.
I understand that this is a very basic thing. The module subsystem
authors must have thought about it, and if it behaves this way it is
more likely a feature. I am keenly interested in knowing the rationale
behind it.
On my setup, the SCSI midlayer was compiled into the kernel proper,
and then the initrd also tried to load scsi_mod.ko as a module (it was
present in the initrd because I accidentally used the wrong initrd). I
would expect this to make insmod fail due to redefinition of the various
functions already exported by the built-in SCSI midlayer.
What actually happened is that the scsi_mod.ko module got loaded
and its init_module() function was called, which, among many other
things, called kmem_cache_create() to create a slab cache. Since a slab
cache with the same name was already present (the first one was created
when the SCSI midlayer init function ran as part of kernel proper
initialization), this triggered a BUG.
When I checked for the exported SCSI midlayer symbols in
/proc/kallsyms I saw duplicate symbols for all the SCSI midlayer symbols,
one in the kernel text segment 0xcXXXXXXX and the other in the module
text segment (this one was 0xeXXXXXXX).
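
The duplicate check described above can be automated. What follows is only a rough userspace sketch, assuming the usual /proc/kallsyms line layout of "address type name [module]". The function name count_duplicate_names and the fixed table sizes are illustrative choices, not anything from the kernel tree. Note that kallsyms also lists non-exported symbols, so a repeated name is a hint rather than proof of a clashing export.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMS 1024   /* illustrative limit, not a kernel constant */
#define NAME_LEN 128

/* Count symbol names that occur more than once in kallsyms-style text.
 * Each line is expected to look like "<address> <type> <name> [module]".
 * Lines that do not match this layout are ignored. */
static int count_duplicate_names(const char *text)
{
    static char names[MAX_SYMS][NAME_LEN];
    static int counts[MAX_SYMS];
    int nsyms = 0, dups = 0;
    const char *p = text;

    while (*p) {
        char line[256], name[NAME_LEN], type;
        unsigned long addr;
        size_t len = strcspn(p, "\n");
        size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

        /* Copy one line out so sscanf cannot run into the next one. */
        memcpy(line, p, n);
        line[n] = '\0';

        if (sscanf(line, "%lx %c %127s", &addr, &type, name) == 3) {
            int i;
            for (i = 0; i < nsyms; i++)
                if (strcmp(names[i], name) == 0)
                    break;
            if (i < nsyms) {
                counts[i]++;              /* name seen before */
            } else if (nsyms < MAX_SYMS) {
                strcpy(names[nsyms], name);
                counts[nsyms++] = 1;      /* first sighting */
            }
        }

        /* Advance to the start of the next line, if any. */
        p += len;
        if (*p == '\n')
            p++;
    }

    for (int i = 0; i < nsyms; i++)
        if (counts[i] > 1)
            dups++;
    return dups;
}
```

To use it, read the contents of /proc/kallsyms into a buffer and pass that buffer in; a real symbol table is far larger than MAX_SYMS, so the limit would have to be raised or replaced with a hash table.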
I tried this with other components (ext3, jbd, e1000, etc.) and the
results were the same: the module gets loaded on top of the built-in
functionality, resulting in multiple definitions of the EXPORTed symbols.
I've tried the same thing on a 2.4.20 kernel with the _same_ results.
Since we see the same behaviour with different kernels, it is not specific
to a particular kernel version.
Thanx,
Tomar
-- "Theory is when you know something, but it doesn't work.
Practice is when something works, but you don't know why.
Programmers combine theory and practice: Nothing works
and they don't know why ..."
Didn't get any response to this one, so sending it again.
Can any of the module subsystem authors tell me why they decided to
allow loading a kernel module that has an EXPORTed symbol with the same
name as an EXPORTed symbol in the kernel proper? The safest thing would
be to disallow module loading in this case, giving a "Symbol
redefinition" error.
Allowing the module load leads to overriding kernel functions,
which will affect modules loaded later that reference those
functions. Overall, it can have bad effects of varying severity.
Thanx,
Tomar
> Didn't get any response to this one, so sending it again.
It was discussed recently on LKML in the thread "Over-riding symbols in the Kernel causes Kernel Panic":
http://marc.theaimsgroup.com/?l=linux-kernel&m=113275593121320&w=2
Apparently people are waiting for a patch ;)
Parag
On Fri, 2005-11-25 at 10:45 +0530, Nagendra Singh Tomar wrote:
> Didn't get any response to this one, so sending it again.
>
> Can any of the module subsystem authors tell me why they decided to
> allow loading a kernel module that has an EXPORTed symbol with the same
> name as an EXPORTed symbol in the kernel proper? The safest thing would
> be to disallow module loading in this case, giving a "Symbol
> redefinition" error.
> Allowing the module load leads to overriding kernel functions,
> which will affect modules loaded later that reference those
> functions. Overall, it can have bad effects of varying severity.
Sure. It was due to minimalism. If you override a symbol, it's
undefined behavior. It should be fairly simple to add a check that
no one overrides a symbol. We didn't bother checking for it because it
wasn't clear that it was problematic.
Hope that clarifies,
Rusty.
--
A bad analogy is like a leaky screwdriver -- Richard Braakman
On Thu, 1 Dec 2005, Rusty Russell wrote:
> Sure. It was due to minimalism. If you override a symbol, it's
> undefined behavior. It should be fairly simple to add a check that
> no one overrides a symbol. We didn't bother checking for it because it
> wasn't clear that it was problematic.
Thanx.
Of all the problems (including kernel crashes, BUGs, etc.), one of the
more serious kinds is where someone writes a new module and
accidentally defines a function with the same name as a function
(say foo_export) already EXPORTed by either the kernel proper or some
loaded module (as the kernel grows, the chances of this happening
grow too). The module happily loads, and then some other module
that wants to use foo_export (obviously the one EXPORTed by the
kernel proper, not the one from the overriding module) is
loaded. It will also load happily but will get linked against the new
foo_export, which is definitely not what its author wants. And all this
happens without the kernel telling the user anything.
IMHO, this kind of silent error is very dangerous and should not
be acceptable.
As you rightly said, it should be fairly straightforward to check for
symbol redefinition. We need to do it only for the symbols EXPORTed by the
loadable module.
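
To make the proposal concrete, here is a minimal userspace sketch of such a check. It is not kernel code and does not use the module loader's real data structures: struct export_entry, find_export and first_redefined_export are hypothetical names invented for this illustration. The idea is simply to compare each name the new module wants to EXPORT against everything already exported, and refuse the load on the first clash.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one already-exported symbol. */
struct export_entry {
    const char *name;   /* symbol name, e.g. "foo_export" */
    const char *owner;  /* "kernel" or the exporting module's name */
};

/* Return the existing entry that already exports `name`, or NULL. */
static const struct export_entry *
find_export(const struct export_entry *table, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Check a new module's export list against the existing exports.
 * Returns NULL if no name clashes, or the first clashing name so the
 * caller can reject the load with a "symbol redefinition" error. */
static const char *
first_redefined_export(const struct export_entry *existing, size_t n_existing,
                       const char *const *new_exports, size_t n_new)
{
    for (size_t i = 0; i < n_new; i++)
        if (find_export(existing, n_existing, new_exports[i]))
            return new_exports[i];
    return NULL;
}
```

Only the module's exported names are checked, as suggested above; its static and non-exported functions can keep whatever names they like. A linear scan is used here for brevity; a real loader would presumably want something faster.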
Thanx,
Tomar
2005/12/1, Nagendra Singh Tomar <[email protected]>:
> Of all the problems (including kernel crashes, BUGs, etc.), one of the
> more serious kinds is where someone writes a new module and
> accidentally defines a function with the same name as a function
> (say foo_export) already EXPORTed by either the kernel proper or some
> loaded module (as the kernel grows, the chances of this happening
> grow too). The module happily loads, and then some other module
> that wants to use foo_export (obviously the one EXPORTed by the
> kernel proper, not the one from the overriding module) is
> loaded. It will also load happily but will get linked against the new
> foo_export, which is definitely not what its author wants. And all this
> happens without the kernel telling the user anything.
> IMHO, this kind of silent error is very dangerous and should not
> be acceptable.
> As you rightly said, it should be fairly straightforward to check for
> symbol redefinition. We need to do it only for the symbols EXPORTed by the
> loadable module.
This shouldn't happen if you only use in-tree modules, as you should.
Don't treat kernel modules like user-mode applications.
--
Coywolf Qi Hunt
http://sosdg.org/~coywolf/